Understanding Limits & Monitoring Usage

In this guide we'll take a look at the limits you are subject to when building your PYLON solution. We'll also look at how you can monitor these limits so you can keep your application running in a production environment.

Limits are documented in the relevant locations across this site (for example the platform allowances page). However, this guide pulls the limits for every aspect of the platform together in one place.

With PYLON you are subject to the following categories of limits:

  • Platform limits - these limits apply to all customers and are independent of your package
  • Account limits - these limits are applied based on the package you have purchased
  • Identity limits - you can apply limits to identities within your account to help ensure fair usage for your end customers

Platform limits

Regardless of your account package you will always need to keep in mind the following platform limits. The majority of platform limits are hard limits you need to be aware of when designing your solution.

Interaction filter complexity

Any interaction filter you create is limited to 1MB of text and/or 999 logical operators.

If you call the /pylon/compile endpoint with CSDL that exceeds these limits you will receive an error. If in doubt you can validate your CSDL using the /pylon/validate endpoint first.

If you hit this limit, consider using the stream keyword to split your filter into multiple definitions.

An interaction filter can include up to 10,000 tag or scoring rules, including those brought in using the stream keyword.

Recorded data expiry

Data you record into your index will remain for 32 days. This limit cannot be changed due to privacy constraints.

We recommend you regularly store the results of analysis queries to your data store to provide longer time periods to your end users.

Per recording limit = 1 million interactions, per day

Regardless of your account package, you can only record 1 million interactions per day in a recording.

You can monitor how much data has been stored by a recording using the /pylon/get endpoint.

{
    "volume":2000,
    "reached_capacity":false,
    "remaining_index_capacity":10000,
    ...
}

For each recording the endpoint returns the following:

  • reached_capacity - Whether your index has hit the daily limit
  • remaining_index_capacity - The remaining daily limit for the index
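
As a minimal sketch of how these two fields might be checked, the Python function below classifies a single recording from an already-decoded /pylon/get response. The function name and the warn_below threshold are illustrative assumptions, not part of the API; only the field names come from the sample response above.

```python
def recording_capacity_status(recording: dict, warn_below: int = 100_000) -> str:
    """Classify one recording from a decoded /pylon/get response body.

    warn_below is a hypothetical alerting threshold chosen for illustration.
    """
    if recording.get("reached_capacity"):
        return "full"  # the index has hit its daily limit
    remaining = recording.get("remaining_index_capacity")
    if remaining is not None and remaining < warn_below:
        return "low"   # still recording, but close to the daily limit
    return "ok"


# The sample response shown above, with the elided fields left out.
sample = {"volume": 2000, "reached_capacity": False, "remaining_index_capacity": 10000}
status = recording_capacity_status(sample)
```

Running a check like this on a schedule would give you time to react before a recording stops storing interactions for the day.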

If this limit constrains your application, consider splitting your recording across multiple indexes. See our Designing with Filters, Indexes and Queries guide for ideas.

Query filter complexity

Filters for analysis queries are limited to 30 conditions, each with a maximum of 100 arguments.

If you attempt to submit a query to the /pylon/analyze endpoint which exceeds these limits you'll receive an error response.

Frequency distribution results limit

The maximum number of items you can return from a frequency distribution analysis is 200.

If this limit constrains your application, consider using query filters to analyze different portions of your index.

Time series results limit

The maximum number of intervals you can return from a time series analysis depends on the interval parameter you specify:

  • minute - Cannot cover a period greater than 60 minutes and must fall within a single 24 hour day
  • hour - Cannot cover a period greater than 336 hours
  • day - Cannot cover a period greater than 32 days
  • week - Cannot cover a period greater than 4 weeks
  • month - Cannot cover a period greater than 1 month

If you'd like minute level analysis for more than 60 minutes you can submit multiple calls for different periods and combine the results yourself.
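
One way to plan those multiple calls is to split the period into windows of at most 60 minutes that never cross a day boundary. The Python sketch below does only the splitting; submitting each window and merging the results is left to your application. The function name is hypothetical, and the day boundary is taken from the timezone of the datetimes you pass in, which is an assumption since this guide does not state which timezone the platform uses for this rule.

```python
from datetime import datetime, timedelta, timezone


def minute_windows(start: datetime, end: datetime):
    """Split [start, end) into windows of at most 60 minutes within a single day."""
    windows = []
    cursor = start
    while cursor < end:
        # Midnight at the start of the day after the cursor's day.
        next_midnight = (cursor + timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0
        )
        window_end = min(cursor + timedelta(minutes=60), next_midnight, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


# Example period that crosses midnight (UTC is an assumption here).
windows = minute_windows(
    datetime(2015, 7, 5, 22, 30, tzinfo=timezone.utc),
    datetime(2015, 7, 6, 1, 0, tzinfo=timezone.utc),
)
```

Each window can then be used as the period for one minute-interval time series query.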

Account limits

As a DataSift customer your account has a PYLON package assigned. In this section we'll look at the limits your package applies and show you how to monitor them via the API.

Concurrent recordings limit

Your package will specify the maximum number of recordings you can run simultaneously.

You can monitor the number of recordings you have running in your account by hitting the /pylon/get endpoint. Use your account API key (not an identity API key) and do not specify a recording ID to get a full list of recordings.

{
    "count": 2,
    "page": 1,
    "pages": 1,
    "per_page": 25,
    "subscriptions": [
        {
            "volume": 12300,
            "start": 1436085514,
            "end": 1436089932,
            "status": "running",
            "name": "example1",
            ...
        }, ...
   ]
}

You can page through this list and count the recordings with status 'running' to find how many are currently running.

Note that the count value represents the number of recordings you have created which have been run in the last 32 days. It does not represent the number of recordings that are currently running.
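
The paging and counting described above could be sketched in Python as follows. The fetch_page callable is hypothetical: it stands in for whatever code you use to call /pylon/get with your account API key and return the decoded JSON for a given page number. Only the pages, subscriptions and status fields from the sample response are relied upon.

```python
from typing import Callable


def count_running(fetch_page: Callable[[int], dict]) -> int:
    """Count recordings with status 'running' across every page of results.

    fetch_page(page_number) is a hypothetical helper that returns one decoded
    /pylon/get response body.
    """
    running = 0
    page = 1
    while True:
        body = fetch_page(page)
        running += sum(
            1 for rec in body.get("subscriptions", []) if rec.get("status") == "running"
        )
        if page >= body.get("pages", 1):
            return running
        page += 1
```

Comparing the returned number with the concurrent recordings allowance in your package tells you how many more recordings you can start.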

Account recording limit

Your package specifies your allowed index capacity per month for your account. This translates to a daily, account-level interaction limit based on the following formula:

Daily account limit* = Index capacity / 30 days

The daily account limit is always rounded up to the nearest 1 million. For example, if your index capacity is 12 million, your daily account limit would be 1 million.
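
The formula and rounding rule can be expressed as a short Python function. This is a sketch of the arithmetic as described in this section, not code taken from the platform; the function name is illustrative.

```python
MILLION = 1_000_000


def daily_account_limit(index_capacity: int) -> int:
    """Index capacity / 30 days, rounded up to the nearest 1 million."""
    # Ceiling division using integers avoids floating point rounding issues.
    millions = -(-index_capacity // (30 * MILLION))
    return millions * MILLION


# The example from the text: 12 million / 30 = 400,000, rounded up to 1 million.
example = daily_account_limit(12 * MILLION)
```

By the same arithmetic, an index capacity of 45 million would give 1.5 million per day, which rounds up to a daily account limit of 2 million.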

If you hit your limit then no more interactions will be recorded to any of your indexes for the remainder of the day (until midnight PST).

You can monitor the volume you have recorded in the current day by hitting the /pylon/get endpoint using your account API key. You will receive data similar to the following:

{
    "count": 2,
    "page": 1,
    "pages": 1,
    "per_page": 25,
    "subscriptions": [
        {
            "status": "running",
            "name": "example1",
            "remaining_index_capacity":10000,
            "remaining_account_capacity":20000,
            ...
        }, ...
   ]
}

The remaining_account_capacity value given for every recording is how many more interactions you can record for the current day for your account.

You will be sent notifications when you reach 50%, 90% and 100% of your daily account limit. You can configure notification options for your account by visiting Notification Preferences in account settings.
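
If you want to mirror these notification thresholds inside your own monitoring, a sketch along the following lines may help. It assumes you already know your daily account limit (for example from your package details) and have read remaining_account_capacity from a /pylon/get response. The function name is hypothetical.

```python
def crossed_thresholds(daily_limit: int, remaining_account_capacity: int) -> list[int]:
    """Return which of the 50%, 90% and 100% usage thresholds have been reached."""
    used = daily_limit - remaining_account_capacity
    percent_used = 100 * used / daily_limit
    return [t for t in (50, 90, 100) if percent_used >= t]


# With a 1 million daily limit and 20,000 interactions remaining, 98% is used.
reached = crossed_thresholds(1_000_000, 20_000)
```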

API rate limit

Your account is subject to an API rate limit which is based on a number of credits you can spend in an hour.

Each call to the API has an associated cost in credits. The cost of each PYLON API call is listed on the platform allowances page.

You can monitor your API usage by inspecting returned headers from each API request. For each request you make to the API you will receive the following headers in the response:

  • X-RateLimit-Limit - Your account's assigned rate limit (in credits)
  • X-RateLimit-Remaining - Your current remaining credits
  • X-RateLimit-Cost - The cost of the call you just made
  • X-RateLimit-Reset-Ttl - The number of seconds until your rate limit resets
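
As an illustration, the headers listed above could be read into a small structure and used to decide whether to pause before the next call. This Python sketch works on any mapping of header names to values, so it is independent of the HTTP client you use. The type and function names are assumptions, and the header values in the example are invented for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimit:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    cost: int       # X-RateLimit-Cost
    reset_ttl: int  # X-RateLimit-Reset-Ttl, in seconds


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimit:
    """Read the rate limit headers; header names are matched case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimit(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        cost=int(lowered["x-ratelimit-cost"]),
        reset_ttl=int(lowered["x-ratelimit-reset-ttl"]),
    )


def seconds_to_wait(rate_limit: RateLimit, next_call_cost: int) -> int:
    """Return 0 if the next call is affordable, else the time until the reset."""
    return 0 if rate_limit.remaining >= next_call_cost else rate_limit.reset_ttl


# Invented header values, for illustration only.
rate_limit = parse_rate_limit({
    "X-RateLimit-Limit": "10000",
    "X-RateLimit-Remaining": "20",
    "X-RateLimit-Cost": "25",
    "X-RateLimit-Reset-Ttl": "1800",
})
wait = seconds_to_wait(rate_limit, next_call_cost=25)
```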

Analysis query rate limit

Your account will have a limit for the number of analysis queries you can make to the /pylon/analyze endpoint per hour. The limit depends on your package; for example, it may be 3,500 queries per hour. This limit is separate from your general API rate limit.

It's important you consider usage of this limit carefully. Your usage is most likely to be split between:

  • Your own exploration of recorded data - Naturally you'll want to explore the data you've recorded from time to time, but this is unlikely to use up many of your queries.
  • Repeated sets of queries to fill dashboards and data stores - Here you can design your query sets to fit within your limits.
  • Ad-hoc queries made by end customers (if you choose to provide this) - This is more difficult to predict as it depends on the feature you provide to your users. If you have enough headroom you can go ahead and provide live exploration to your users. On the other hand you may want to cache results or only allow users to explore results you've previously stored.

However you choose to apportion your analysis query limit, we recommend that you cache analysis results in your data store so you get the most from the queries available to you.
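
A minimal sketch of such caching is shown below in Python. It uses an in-memory dictionary as a stand-in for your data store and a time-to-live so that results are refreshed periodically. The class name, the ttl_seconds parameter and the run_query callable are all hypothetical; run_query represents whatever code you use to submit a query to /pylon/analyze.

```python
import json
import time


class AnalysisCache:
    """In-memory stand-in for a data store that holds analysis results."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # cache key -> (stored_at, result)

    def get_or_run(self, params: dict, run_query):
        """Return a cached result for params, calling run_query only when needed."""
        # Sorting keys makes equal parameter sets map to the same cache key.
        key = json.dumps(params, sort_keys=True)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        result = run_query(params)
        self._entries[key] = (now, result)
        return result
```

Serving repeated dashboard requests from a cache like this means only the first request in each time-to-live period counts against your analysis query limit.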

You can monitor your consumption of this limit using the headers returned by each request. For each request you make to the API you will receive the following headers in the response:

  • X-RateLimit-Limit - Your account's assigned analysis query limit (in credits)
  • X-RateLimit-Remaining - Your current remaining credits
  • X-RateLimit-Cost - The cost of the call you just made (25 credits for this call)
  • X-RateLimit-Reset-Ttl - The number of seconds until your rate limit resets

As noted above this limit is distinct from your overall API rate limit and only applies to analysis requests. The returned headers reflect your analysis query limit when you call the /pylon/analyze endpoint.

Identity limits

To help you manage your usage we recommend you use identity limits to spread your account limits fairly across your end customers.

Read our developer guide on Managing Identities for more details.

Currently we do not provide a mechanism to limit identities to a fixed number of API calls (outside of calls to the /pylon/analyze endpoint). If this is a concern you can look at two options:

  • Implement your own logging for API calls you make per customer
  • Calculate the number of API calls you will make per customer depending on the features you provide to each
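
The first option could start from something as simple as the Python sketch below, which counts calls per customer and reports who has exceeded an allowance you choose. The class name, the allowance and the customer identifiers are all illustrative assumptions; a production version would persist counts and reset them on whatever schedule suits your application.

```python
from collections import Counter


class CustomerCallLog:
    """Counts API calls made on behalf of each end customer."""

    def __init__(self, allowance_per_customer: int):
        self._allowance = allowance_per_customer  # hypothetical per-customer cap
        self._calls = Counter()

    def record(self, customer_id: str) -> None:
        """Call this each time you make an API request for a customer."""
        self._calls[customer_id] += 1

    def calls(self, customer_id: str) -> int:
        return self._calls[customer_id]

    def over_allowance(self) -> list[str]:
        """Customers whose call count exceeds the allowance, sorted by id."""
        return sorted(c for c, n in self._calls.items() if n > self._allowance)


log = CustomerCallLog(allowance_per_customer=2)
for customer in ["acme", "acme", "acme", "globex"]:
    log.record(customer)
```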