Billing details are explained in full on our Terms page. The cost of a stream depends on how many operators you include. Some operators are more expensive than others. All streams have a fixed cost; some have a variable cost too because some data suppliers charge for their content. Billing for Historics also works in the same way. The following page will provide you with detailed information on how our billing system works.
Don't forget to take a look at our Billing FAQ too.
You can compile and preview streams free of charge through the website UI.
There is a charge to use streams through our APIs. The cost of using a stream via the API is a function of two variables:
Data processing effort required to execute the rule
Each rule is assigned an hourly data processing effort, measured in data processing units (DPUs), according to an analysis of its complexity. The simplest rule incurs an hourly cost of 0.1 DPU. However, note that DataSift's minimum charge rate is 1 DPU per hour. Therefore, you can run ten 0.1 DPU streams simultaneously for the same overall DPU cost as one.
Interaction throughput of the rule
The interaction throughput of a rule is the number of data objects it delivers. The cost of accepting a data object depends on the object's source and the licensing agreement we have with the provider. For example, each accepted Tweet costs $0.0001*, so accepting 1,000 Tweets costs $0.10*. Note that in order to receive data objects, you must sign the license agreements for a number of data sources, including Twitter, on the license page of the website.
*Subject to change.
There are two types of payment plan which differ in the approach to charging for the data processing cost, allowing you to optimize according to your usage pattern:
Each DPU is charged at a fixed rate of $0.20 per hour* so, for example a rule rated at 1.5 DPU is charged at $0.30* per hour.
Note that DataSift's minimum charge is $0.20 per hour, so a 0.5 DPU rule would cost $0.20 per hour. If you use DataSift's multistream capability to run 10 streams simultaneously, each rated at 0.1 DPU, the total is 1 DPU, so the total cost to run all ten streams is $0.20 per hour.
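The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of DataSift's API: the function names are hypothetical, and the rates are the figures quoted on this page, which are subject to change.

```python
from decimal import Decimal

# Figures quoted on this page; both are subject to change.
DPU_RATE_PER_HOUR = Decimal("0.20")  # dollars per DPU per hour
MINIMUM_DPUS = Decimal("1")          # minimum charge rate of 1 DPU per hour


def hourly_processing_charge(stream_dpus):
    """Hourly data processing charge for streams that run simultaneously.

    The DPU ratings are summed first, then the 1 DPU minimum is applied.
    """
    total = sum((Decimal(str(d)) for d in stream_dpus), Decimal("0"))
    billable = max(total, MINIMUM_DPUS)
    return billable * DPU_RATE_PER_HOUR


def licensing_charge(object_count, price_per_object):
    """Throughput charge: delivered objects times the per-object price."""
    return Decimal(object_count) * Decimal(str(price_per_object))
```

For instance, `hourly_processing_charge([0.1] * 10)` and `hourly_processing_charge([0.5])` both come to $0.20, because each falls at or below the 1 DPU minimum.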
Whenever you want, you can buy credits in increments of $10 which allow you to run streams. As your streams run we continuously compute the combined DPU and throughput cost and reduce your credit balance. If your balance drops to zero, your streams stop until you top up your balance.
You agree to buy a fixed number of DPU hours per month for a fixed price. As your streams run they consume your fixed DPU allowance and, separately, incur a variable licensing fee. The licensing fees are calculated depending on the licensing agreement we have with the provider. Assuming you don't exhaust your DPU allowance, your monthly bill will be the fixed cost plus the licensing fees. If you do exceed your DPU allowance, the excess DPU hours are charged at the on-demand rate.
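As a rough sketch of how a subscription bill is assembled from these parts, the following Python function adds the fixed price, the licensing fees, and any excess DPU hours. The function name and all the figures in the usage note are hypothetical; the on-demand rate is the $0.20 per DPU hour quoted above, which is subject to change.

```python
from decimal import Decimal

ON_DEMAND_RATE = Decimal("0.20")  # dollars per DPU hour; subject to change


def monthly_bill(fixed_price, dpu_allowance, dpu_hours_used, licensing_fees):
    """Return (total bill, variable cost) for one month on a subscription.

    The variable cost is the licensing fees plus the cost of any DPU hours
    used beyond the allowance, charged at the on-demand rate.
    """
    used = Decimal(str(dpu_hours_used))
    allowance = Decimal(str(dpu_allowance))
    excess_hours = max(Decimal("0"), used - allowance)
    variable_cost = Decimal(str(licensing_fees)) + excess_hours * ON_DEMAND_RATE
    return Decimal(str(fixed_price)) + variable_cost, variable_cost
```

With a hypothetical plan of $1,000 for 5,000 DPU hours, using 5,200 DPU hours and incurring $150 in licensing fees gives a variable cost of $190 and a total bill of $1,190.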
You must also set a variable cost limit for your monthly subscription to DataSift. The limit applies to the sum of your licensing costs and any excess DPU costs.
As long as that combined total is less than your variable cost limit, you will be able to consume data normally.
But if you run over your variable cost limit, your streams will stop, and you will receive the following error message as part of your stream:
*Subject to change.
Whereas a rule's data processing rate is known as soon as the rule is defined, its throughput cannot be predicted; it can only be estimated. You might want to run some sample executions to get a feel for the throughput cost of a stream.
The DataSift billing system calculates the cost of using streams from the DPU rate and licensing costs. DataSift also allows you to monitor your usage by enabling notifications via email and the Dashboard. The notifications vary depending on the type of payment plan you are on.
If you choose the On Demand plan, you will receive notifications if your credit balance runs low or falls to zero.
In a monthly subscription, you can set a variable cost limit on your account. You will receive notifications when you are close to and if you reach your variable cost limit. You can set or change your variable cost limit any time during the billing cycle.
The first notification is triggered when you have used up 80 percent of your variable cost limit. For example, if you set your variable cost limit to $2,500, you will receive the first notification when you have used up $2,000 on your account. You will receive the second notification when you reach your variable cost limit, at which point we will stop your streams. It is good practice to monitor your usage and ensure that your variable cost limit is always high enough for you to be certain that you will not have any problems for the duration of the month.
Preview of notifications in Dashboard
Preview of notifications via email
If you notice that you are close to your variable cost limit and then you raise it, you might be below 80 percent of the new limit or you might be above 80 percent of the new limit; it all depends on where you set your new limit.
For example, if you set the variable cost limit to $2,000 on your account, you receive the first notification when you have used up $1,600. Suppose that you receive that notification and you raise the variable cost limit to $2,500. There are two possible scenarios to consider:
- If your usage is below 80 percent of the new variable cost limit (that is, below $2,000), you will receive both notifications: a warning when you reach 80 percent of the new limit, and then a notification when you reach the limit itself.
- If your usage is already above 80 percent of the new limit, you will receive only the notification sent when you reach the new limit.
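The two scenarios can be expressed as a small Python check. This is a sketch of the rule as described on this page, not DataSift's implementation; the function names and the notification labels are hypothetical.

```python
from decimal import Decimal

WARNING_FRACTION = Decimal("0.8")  # first notification at 80 percent of the limit


def warning_threshold(limit):
    """Usage level at which the first (warning) notification is sent."""
    return Decimal(str(limit)) * WARNING_FRACTION


def pending_notifications(usage, limit):
    """List the notifications still to come for the given usage and limit."""
    usage = Decimal(str(usage))
    limit = Decimal(str(limit))
    pending = []
    if usage < warning_threshold(limit):
        pending.append("warning")  # sent at 80 percent of the limit
    if usage < limit:
        pending.append("limit")    # sent when the limit is reached
    return pending
```

For a limit raised to $2,500, `pending_notifications(1700, 2500)` lists both notifications, while a usage figure above $2,000 leaves only the final one.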
Billing for Historics queries
You can use Historics queries if you are on a monthly subscription, subject to one-time activation by your account manager. The cost of running a Historics query depends on data processing usage plus licensing costs, and the original DPU complexity of the stream you are running the query on.
Data processing usage for Historics is calculated based on the duration and sample size of the output data. The duration of the query is determined using the timeframe of the query, that is the duration between the start date and time, and the end date and time of the query. The sample size of the output data can be either 100 percent or 10 percent. For all Historics queries, there is a premium on the DPU usage compared to usage for live streaming. DPU usage for the 100% sample size is 125% of what you would pay for live streaming of the same filter. Similarly, for the 10% sample size, the DPU usage is 40% of what you would pay for live streaming of the same filter.
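To make the sample-size multipliers concrete, here is a minimal Python sketch. It assumes that DPU usage scales linearly with the length of the query timeframe (the live hourly DPU rating times the number of hours), which this page implies but does not state as a formula. The function name is hypothetical.

```python
from datetime import datetime
from decimal import Decimal

# Multipliers relative to live streaming of the same filter, as stated above.
SAMPLE_MULTIPLIER = {100: Decimal("1.25"), 10: Decimal("0.40")}


def historics_dpu_usage(live_dpu_per_hour, start, end, sample_percent):
    """Estimate DPU usage for a Historics query.

    Assumption: usage = live hourly DPUs x hours in the timeframe x multiplier.
    """
    seconds = Decimal(str((end - start).total_seconds()))
    hours = seconds / Decimal(3600)
    return Decimal(str(live_dpu_per_hour)) * hours * SAMPLE_MULTIPLIER[sample_percent]
```

Under that assumption, a 1 DPU filter run over a 24-hour timeframe uses 30 DPUs at the 100 percent sample size and 9.6 DPUs at the 10 percent sample size. The historics/prepare endpoint remains the authoritative source for the actual figure.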
Hence, when you create a Historics query, DataSift is able to calculate the DPU usage before the query is executed. This DPU usage information is displayed on the Confirm New Historic Query page. When running a Historics query through the Historics API, you need to hit the historics/prepare endpoint to create a Historics query and get the total DPU breakdown for your Historics query before it is executed. DPU usage charges are deducted from the monthly DPU allowance.
On the other hand, licensing costs are calculated based on the volume of data retrieved for a particular Historics query. For a given CSDL filter, licensing costs for a Historics query of 100 percent sample size will be more than for a Historics query of 10 percent sample size.
You can view usage statistics for Historics queries on the Billing page, including total licensing costs and DPU usage. You can also view the volume of data retrieved by a Historics query and the number of Historics hours used. Alternatively, you can hit the usage endpoint in the DataSift API, which gives a more accurate figure for the number of objects processed.
Billing for Historics Preview
Historics Preview is available for all accounts, on any payment plan: Subscription or Pay As You Go. Each request has a fixed cost of 10 DPUs plus 2 DPUs per day. For example:
- 1 day = 10 + (2 x 1) = 12 DPUs
- 30 days = 10 + (2 x 30) = 70 DPUs
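The same calculation as a short Python function (the function name is hypothetical; the 10 DPU fixed cost and 2 DPUs per day come from the text above):

```python
def historics_preview_dpus(days):
    """DPU cost of one Historics Preview request covering `days` days."""
    if days < 1:
        raise ValueError("a Historics Preview must cover at least one day")
    fixed_cost = 10   # DPUs per request
    per_day_cost = 2  # DPUs per day covered
    return fixed_cost + per_day_cost * days
```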
There are no licensing fees charged for a Historics Preview since you do not actually receive any interactions matching your filter. You only receive aggregate statistics for your selected filter.
The DPUs are deducted from your account only after a complete and successful execution of your Historics Preview request. If your request gets interrupted while it is being processed, you won't get charged. You can only request a single Historics Preview per stream; if you request a new one, the previous request is overwritten.
Billing for Managed Sources
Billing for Managed Sources has two components:
- There is a charge for the complexity of your query, based on the number and type of operators.
- Each source is also billed as follows:
50 DPUs per Facebook page per month.
50 DPUs per search term per month. Search terms are:
|Google+||50 DPUs per Google+ page or keyword search per month.|
Find your DPU cost via the UI
In DataSift's UI you can check the DPU breakdown:
1. Select a stream.
2. Click View Definition.
The DPU breakdown appears below your CSDL code.
Find your DPU cost via the API
DataSift's REST API provides a dpu endpoint that gives the total DPU cost for a rule and the breakdown of its individual elements.
For Historics, DataSift's REST API provides a historics/prepare endpoint that gives the total DPU breakdown for a Historics query.
Find your throughput via the API
DataSift's REST API provides a usage endpoint that gives the number of objects processed.
Cost of operators
Some operators in CSDL have a fixed DPU cost while others have a variable cost.
For fixed-cost operators you simply multiply the number of times you use the operator in a stream by its DPU cost. For example, if you use the contains operator twice in a stream the cost is 0.2 DPUs.
|Operator or Keyword||DPUs|
|contains||variable - see below|
|contains_any||variable - see below|
|contains_all||variable - see below|
|in||variable - see below|
|comparisons (==, > and so on)||0.1|
|regular expressions||variable - see below|
|geo_polygon||variable - see below|
|tag||variable - see below|
|wildcard||variable - see below|
The DPU cost of a regular expression is calculated as:
cost = the number of characters in the expression divided by 100.
The minimum charge for one regular expression is 0.1 so, for example, a regular expression that includes 10 characters costs 0.1 DPUs while a regular expression that includes 100 characters costs 1.0 DPUs.
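As a sketch, the rule above translates to the following Python function. The function name is hypothetical, and the function counts every character in the pattern string, which is our reading of "the number of characters in the expression".

```python
from decimal import Decimal

MINIMUM_REGEX_DPUS = Decimal("0.1")


def regex_dpus(pattern):
    """DPU cost of one regular expression: length / 100, minimum 0.1."""
    cost = Decimal(len(pattern)) / Decimal(100)
    return max(cost, MINIMUM_REGEX_DPUS)
```

A 5-character pattern would compute to 0.05, so it is charged the 0.1 DPU minimum.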
The DPU cost of a geo_polygon depends on the number of vertices it has. To determine the DPU cost of any geo_polygon, divide the number of vertices by 30.
For example, a hexagon has 6 vertices so it has a DPU cost of 0.2. A triangle has 3 vertices so it has a DPU cost of 0.1.
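The vertex rule can be sketched in Python as follows. The function name is hypothetical, and rounding the result to the nearest 0.1 (half up) is an assumption borrowed from the rounding note at the end of this page.

```python
from decimal import Decimal, ROUND_HALF_UP


def geo_polygon_dpus(vertices):
    """DPU cost of a geo_polygon: vertices / 30, rounded to the nearest 0.1.

    Assumption: results are rounded half up to one decimal place.
    """
    if vertices < 3:
        raise ValueError("a polygon needs at least three vertices")
    cost = Decimal(vertices) / Decimal(30)
    return cost.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
```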
The DPU cost for the contains operator is based on the number of values you match against and the way you use the operator.
Using the contains operator to find a phrase
twitter.text contains "My dog ate my homework"
In this case, you can match against up to seven values for a cost of 0.1 DPU. The cost increases by 0.1 DPU as you add more words to the matching phrase. Here are the first few DPU cost bands.
|Maximum number of values||DPUs|
|and so on...|
For example this filter has just one word in the argument so it costs 0.1 DPU:
twitter.text contains "iPad"
This filter has eight words in the argument so it costs 0.2 DPU:
twitter.text contains "iPad is my favorite tablet device right now"
Using the contains operator to find individual words
twitter.text contains "xxx" and
twitter.text contains "yyy" and
twitter.text contains "zzz"
In this case, matching against up to three values costs 0.1 DPU. The cost increases by 0.1 DPU for every four extra values you add. Here are the first few DPU cost bands.
|Maximum number of values||DPUs|
|and so on...|
The DPU cost for the contains_any and contains_all operators is based on the number of values you match against. The following table shows the DPU cost for any filter that uses these operators.
For example, this filter matches against 10 values so it costs 0.2 DPUs.
twitter.text contains_any "apple, microsoft, hp, dell, oracle, google, yahoo, ebay, amazon, facebook"
|Maximum number of values||DPUs|
The exact cost is determined using a sliding scale, so if you have 99 values in the command, the cost will be slightly lower than 1 DPU. Note that the table shows how we calculate DPU costs for a list of single keywords. In practice, you will often write filters that use the contains_any keyword with a list of phrases of varying length. For example:
twitter.text contains_any "Yesterday, Yellow Submarine, The Long and Winding Road"
Since phrases take longer for DataSift to process than single keywords, the DPU cost is slightly higher. For example, a list of 30 single keywords with the contains_any operator incurs a DPU cost of 0.4. However, if you filter for 10 phrases, each of three words, the DPU cost is 0.5.
We recommend that you check the DPU cost before you run a filter. The /compile endpoint returns a JSON object that includes the DPU cost.
Wildcards are charged at double the cost of the
Operators used inside a tag statement are normally charged at 10% of their usual DPU cost.
For example, if the normal cost of a rule is 1 DPU, that same code inside a tag statement would cost 0.1 DPU.
If you use any of the newer features of tagging such as namespaces or scoring, the pricing is based on the combined cost of operators in the tagging logic and in the filter definition. We simply count how many times each operator appears, and calculate the overall cost. For example, if you use the contains operator nine times in your tagging and you use it twice in your filtering logic, you will be charged for 11 uses of that operator.
It doesn't matter whether you include tags from external files or define them locally; the cost is the same. Similarly, it doesn't matter whether you define your filter locally or include part of it using the filter or stream keyword; the cost is the same.
Chunking and Punctuation
Foreign-language chunking adds a surcharge to the DPUs for any filter it appears in. For example, suppose the cost of a filter is 2 DPUs. If you use Japanese chunking in that filter, it adds 20% to the overall cost. That is: 1.2 * 2 = 2.4 DPUs. This is a one-off fee for the filter, so it doesn't matter whether you use Japanese chunking once or 50 times in a filter; the cost is the same.
If you use Chinese chunking as well as Japanese chunking in a filter, you will be charged the 20% surcharge twice, so we multiply your DPU cost by 1.4 in this case.
Similarly, punctuation introduces a surcharge of 10% for each element in keep or drop. By element we mean:
- any single punctuation character
So, if you want to use the extended character set and drop commas, the surcharge is 10% + 10%.
The DPUs are rounded to the nearest 0.1.
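The surcharges can be combined in a short Python sketch. The function and parameter names are hypothetical. The code assumes that chunking and punctuation surcharges add together as percentages of the base cost (as the "10% + 10%" and "multiply by 1.4" examples suggest) and that rounding is half up; neither detail is confirmed on this page.

```python
from decimal import Decimal, ROUND_HALF_UP

CHUNKING_SURCHARGE = Decimal("0.20")     # per chunking language, once per filter
PUNCTUATION_SURCHARGE = Decimal("0.10")  # per element in keep or drop


def dpus_with_surcharges(base_dpus, chunking_languages=0, punctuation_elements=0):
    """Apply chunking and punctuation surcharges, then round to the nearest 0.1.

    Assumption: all surcharges are additive percentages of the base cost.
    """
    multiplier = (
        1
        + CHUNKING_SURCHARGE * chunking_languages
        + PUNCTUATION_SURCHARGE * punctuation_elements
    )
    total = Decimal(str(base_dpus)) * multiplier
    return total.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
```

For a 2 DPU filter, one chunking language gives 2.4 DPUs and two give 2.8 DPUs, matching the examples above.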