By default, DataSift filters against all the data from your chosen sources. For example, this filter looks at every input object that DataSift receives from the Twitter Firehose:
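The original filter is not reproduced here. As a minimal sketch, assuming the standard `interaction.type` target is used to select the source, a filter that matches every object from Twitter might look like this:

```
// Assumption: "twitter" is the interaction.type value for Firehose objects.
interaction.type == "twitter"
```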

When you are performing statistical analysis, you often do not need every object. In that case you can sample the data instead.

The interaction.sample target is an internally generated floating-point random number between 0 and 100.

This filter samples 5.25 percent of the incoming input objects and ignores the rest:
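The original filter is not reproduced here. A minimal sketch, assuming the random number is uniformly distributed so that a threshold of 5.25 passes roughly 5.25 percent of objects:

```
// Passes an object only when its random sample value falls below 5.25.
interaction.sample < 5.25
```

Because the value ranges from 0 to 100, the threshold reads directly as a percentage.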


Rate Limiting

Twitter limits you to 500,000 Tweets in a 24-hour period. You can use interaction.sample to reduce your data consumption and help stay within that limit.

1.  Filter for a sample of 1 percent of incoming Tweets:

2.  Filter for all the Tweets that mention "coffee" and for a 10-percent sample of the Retweets that mention "coffee":

3.  You can even nest the samples:
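
The filters for these three cases are not reproduced here. The sketches below show one way each might be written. They assume that `twitter.text` matches original Tweets and `twitter.retweet.text` matches Retweets; the thresholds in the nested example are illustrative and do not come from the original text.

A 1-percent sample of incoming Tweets:

```
// Assumption: interaction.type selects Twitter objects.
interaction.type == "twitter" AND interaction.sample < 1
```

All Tweets that mention "coffee", plus a 10-percent sample of the Retweets that mention it:

```
// Tweets pass unsampled; Retweets must also fall under the 10-percent threshold.
twitter.text contains "coffee"
OR (twitter.retweet.text contains "coffee" AND interaction.sample < 10)
```

A nested sample, where an outer sample wraps a condition that contains its own inner sample:

```
// Illustrative thresholds: a 50-percent outer sample and a 10-percent inner sample on Retweets.
interaction.sample < 50
AND (
    twitter.text contains "coffee"
    OR (twitter.retweet.text contains "coffee" AND interaction.sample < 10)
)
```

The effective rate of the inner sample depends on whether the random number is drawn once per object or once per condition, which the description above does not specify. Check the resulting volumes before relying on a nested filter for rate limiting.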