Understanding Limits

In this guide we'll look at the limits you are subject to when building your PYLON for LinkedIn Engagement Insights solution, and at how you can monitor these limits to keep your application running in a production environment.

Limits are documented in the relevant locations across this site (for example, the platform allowances page). This guide, however, brings the limits for each aspect of the platform together in one place.

With PYLON for LinkedIn Engagement Insights you are subject to the following categories of limits:

  • Platform limits - these limits apply to all customers and are independent of your package
  • Account limits - these limits are applied based on the package you have purchased

Platform limits

Regardless of your account package, you will always need to keep the following platform limits in mind. The majority of platform limits are hard limits you need to be aware of when designing your solution.

Recorded data expiry

All analysis tasks are performed on a pre-recorded index containing the data from the last 30 days. Each day at 00:00 UTC, data which is over 30 days old expires from the recording.
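As a rough sketch, the earliest data still available in the index can be estimated from the daily 00:00 UTC expiry. The helper below is hypothetical (not part of the PYLON API) and approximates the cutoff:

```python
from datetime import datetime, timedelta, timezone

def earliest_recorded_date(now=None):
    """Approximate the oldest data still in the 30-day index.

    Data older than 30 days expires at the daily 00:00 UTC purge,
    so the cutoff is 30 days before the most recent UTC midnight.
    """
    now = now or datetime.now(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=30)

# Illustrative fixed timestamp:
fixed = datetime(2024, 3, 31, 12, 0, tzinfo=timezone.utc)
print(earliest_recorded_date(fixed))  # 2024-03-01 00:00:00+00:00
```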

Query filter complexity

Filters for analysis queries are limited to 30 conditions, each with a maximum of 100 arguments.

If you attempt to submit a query to the POST /pylon/{service}/task endpoint which exceeds these limits you'll receive an error response.
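Because the platform rejects over-complex filters with an error, it can help to validate complexity client-side before submitting. The sketch below assumes a simplified representation of a filter as a list of conditions, each a list of arguments; it is not the PYLON filter language itself:

```python
# Documented platform limits for filter complexity.
MAX_CONDITIONS = 30
MAX_ARGUMENTS = 100  # per condition

def filter_within_limits(conditions):
    """conditions: list of argument lists, one list per filter condition."""
    if len(conditions) > MAX_CONDITIONS:
        return False
    return all(len(args) <= MAX_ARGUMENTS for args in conditions)

# Two conditions with 3 and 100 arguments are within the limits:
print(filter_within_limits([["a", "b", "c"], ["x"] * 100]))  # True
# 31 conditions exceeds the 30-condition limit:
print(filter_within_limits([["a"]] * 31))  # False
```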

Frequency distribution results limit

A maximum of 200 elements can be returned in a report.

Nested frequency distribution results limits

Maximum nesting depth

The maximum depth of nesting is three levels - one parent and two children.

For each level of the analysis, the maximum number of results that can be returned is 200. The number of results to return for each level is specified using the threshold parameter.

Nested threshold product

Additionally, the overall 'threshold product' for a nested query is limited to 80,000, except for queries that analyze user skills, for which the limit is 1,000.

You can calculate the threshold product of a query by multiplying the threshold parameters together.

For example, the following nesting of targets and thresholds exceeds the overall threshold limit, because multiplying the thresholds together (200 x 27 x 18) gives 97,200:

  • li.user.member.country (threshold = 200)
    • li.user.member.functions (threshold = 27)
      • li.user.member.employer_industry_sectors (threshold = 18)

You can reduce the threshold values to stay within the limit:

  • li.user.member.country (threshold = 200)
    • li.user.member.functions (threshold = 27)
      • li.user.member.employer_industry_sectors (threshold = 14)

Here the threshold product is 75,600 and so the query is allowed by the platform.
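The calculation above can be sketched in a few lines; `threshold_product` is a hypothetical helper for pre-flight checks, not part of any SDK:

```python
from math import prod

# Documented limit on the overall threshold product of a nested query.
THRESHOLD_PRODUCT_LIMIT = 80_000

def threshold_product(thresholds):
    """Multiply a nested query's threshold parameters together."""
    return prod(thresholds)

# The example that exceeds the limit:
print(threshold_product([200, 27, 18]))  # 97200
# Reducing the innermost threshold brings the query within the limit:
print(threshold_product([200, 27, 14]))  # 75600
print(threshold_product([200, 27, 14]) <= THRESHOLD_PRODUCT_LIMIT)  # True
```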

Identifying nested analysis targets

Any analysis target may be used as the parent target, but only a subset of low-cardinality targets (fewer than 50 unique values) can be used as child targets. The PYLON Target Explorer tool lists all targets; the Properties section of each target includes information about where it may be used.

In the example, the li.user.company.industry_sector target may be used in Analysis and Child Analysis Queries, and also in Query Filters:

[Image: Target Explorer properties panel for li.user.company.industry_sector]

Time series results limit

Interval Limits

The maximum duration of an analysis query depends on the interval used.

  • Minute Interval
    • An analysis with a minute interval cannot cover a period greater than 60 minutes. The period can cross boundaries from one day to another.
  • Hour Interval
    • An hourly analysis cannot cover a period greater than 336 hours (2 weeks). The period can cross boundaries from one day to another.
  • Day Interval
    • A day analysis cannot cover a period greater than 32 days.
  • Week Interval
    • A weekly analysis cannot cover a period greater than 4 weeks.
  • Month Interval
    • A month interval cannot cover a period greater than 1 month.
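The interval limits above can be checked locally before submitting a query. This is a minimal sketch, assuming the maximum periods documented above; the month interval is omitted because a calendar month's length varies and needs date arithmetic rather than a fixed duration:

```python
from datetime import datetime, timedelta, timezone

# Maximum analysis period per interval, per the documented limits.
MAX_PERIOD = {
    "minute": timedelta(minutes=60),
    "hour": timedelta(hours=336),  # 2 weeks
    "day": timedelta(days=32),
    "week": timedelta(weeks=4),
}

def period_allowed(interval, start, end):
    """True if an analysis from start to end fits the interval's limit."""
    return (end - start) <= MAX_PERIOD[interval]

# A 45-minute window crossing midnight is fine at minute interval:
start = datetime(2024, 1, 1, 23, 30, tzinfo=timezone.utc)
end = datetime(2024, 1, 2, 0, 15, tzinfo=timezone.utc)
print(period_allowed("minute", start, end))  # True
# 33 days is too long for a day interval:
print(period_allowed("day", start, start + timedelta(days=33)))  # False
```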

Account limits

As a DataSift customer, your account has a PYLON for LinkedIn Engagement Insights package assigned. In this section we'll look at the limits your package applies, and show you how to monitor them via the API.

API rate limit

Your account is subject to an API rate limit, which is based on the number of credits you can spend in an hour.

Each call to the API has an associated cost in credits. The cost of each PYLON API call is listed on the platform allowances page.

You can monitor your API usage by inspecting the headers returned with each API response. For each request you make to the API, you will receive the following headers in the response:

  • X-RateLimit-Limit - Your account's assigned rate limit (in credits)
  • X-RateLimit-Remaining - Your current remaining credits
  • X-RateLimit-Cost - The cost of the call you just made
  • X-RateLimit-Reset-Ttl - The number of seconds until your rate limit resets
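A minimal sketch of reading these headers follows. The header names come from the list above; the numeric values (and the plain dict standing in for a real HTTP response's headers) are illustrative:

```python
def summarize_rate_limit(headers):
    """Summarize credit usage from a response's rate-limit headers."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    cost = int(headers["X-RateLimit-Cost"])
    reset_ttl = int(headers["X-RateLimit-Reset-Ttl"])
    used_pct = 100 * (limit - remaining) / limit
    return {"used_pct": used_pct, "cost": cost, "reset_in_s": reset_ttl}

# Illustrative header values:
example = {
    "X-RateLimit-Limit": "10000",
    "X-RateLimit-Remaining": "9975",
    "X-RateLimit-Cost": "25",
    "X-RateLimit-Reset-Ttl": "1800",
}
print(summarize_rate_limit(example))
# {'used_pct': 0.25, 'cost': 25, 'reset_in_s': 1800}
```

In a real application you would read these headers from the HTTP client's response object after each call and alert (or back off) when `used_pct` approaches 100.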

Analysis task rate limit

Your account will have a limit for the number of analysis tasks you can submit to the POST /pylon/{service}/task endpoint per hour. This limit is separate from your general API rate limit.

Calls to the endpoint cost 5 credits. Therefore, if you have 5,000 credits for calls to the endpoint, you can submit 1,000 analysis tasks per hour.

This limit governs the number of analysis tasks you can submit per hour, NOT the number of tasks that the platform will process in an hour. Processing of analysis tasks is carried out on a best-effort basis, currently of up to 160 tasks per hour.
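The distinction between submission and processing can be made concrete with a small worked example, assuming the figures quoted above (5 credits per submission, best-effort processing of roughly 160 tasks per hour):

```python
SUBMIT_COST = 5        # credits per call to the task endpoint
PROCESSING_RATE = 160  # tasks processed per hour (best effort)

def submittable_per_hour(hourly_credits):
    """How many tasks your credit allowance lets you submit in an hour."""
    return hourly_credits // SUBMIT_COST

submitted = submittable_per_hour(5_000)
print(submitted)  # 1000

# Submitting at your full allowance outpaces processing, so a
# backlog builds up over the hour:
backlog_after_one_hour = max(0, submitted - PROCESSING_RATE)
print(backlog_after_one_hour)  # 840
```

This is why pacing submissions to the processing rate, rather than the submission limit, is usually the better target for a steady workload.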

It's important you consider usage of this limit carefully. Your usage is most likely to be split between:

  • Your own exploration of data - Naturally you'll want to explore the available data from time to time, but this is unlikely to use up many of your queries.
  • Repeated sets of queries to populate dashboards, data stores and baselines - Here you can design your query sets to fit within your limits.
  • Ad-hoc queries made by end customers (if you choose to provide this) - This is more difficult to predict as it depends on the feature you provide to your users. If you have enough headroom you can go ahead and provide live exploration to your users. On the other hand you may want to cache results or only allow users to explore results you've previously stored.

However you choose to portion out your analysis query limit, we recommend caching analysis results in your data store so that you get the most out of the queries available to you.

You can monitor your consumption of this limit using the headers returned by each request. For each request you make to the API you will receive the following headers in the response:

  • X-RateLimit-Limit - Your account's assigned analysis query limit (in credits)
  • X-RateLimit-Remaining - Your current remaining credits
  • X-RateLimit-Cost - The cost of the call you just made (5 credits for this call)
  • X-RateLimit-Reset-Ttl - The number of seconds until your rate limit resets

As noted above this limit is distinct from your overall API rate limit and only applies to analysis requests. The returned headers reflect your analysis task limit when you call the POST /pylon/{service}/task endpoint.