By default, DataSift filters against all the data from your chosen sources. For example, this filter looks at every input object that DataSift receives from the Twitter Firehose:
If you are performing statistical analysis on the data, you can work with a sample instead of the full stream.
The interaction.sample target is an internally generated floating-point random number between 0 and 100.
This filter samples 5.25 percent of the incoming input objects and ignores the rest:
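The effect of such a threshold can be approximated outside DataSift with a short simulation. The sketch below is written in Python rather than CSDL. It assumes that each input object receives one uniformly distributed value between 0 and 100, and that the filter keeps an object when that value is below 5.25 (in CSDL, presumably a condition along the lines of `interaction.sample < 5.25`). The function name `simulate_sample` and its parameters are illustrative and not part of DataSift.

```python
import random


def simulate_sample(num_objects, threshold, seed=0):
    """Return the fraction of simulated objects whose sample value is below threshold.

    Assumption: every object gets one uniform random value in [0, 100),
    mirroring the description of interaction.sample in the text.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    kept = 0
    for _ in range(num_objects):
        sample = rng.uniform(0.0, 100.0)
        if sample < threshold:
            kept += 1
    return kept / num_objects


if __name__ == "__main__":
    fraction = simulate_sample(200_000, 5.25)
    print(f"kept {fraction:.2%} of the simulated objects")
```

Because the value is random, the fraction kept is close to 5.25 percent but not exactly equal to it, and the gap narrows as the number of objects grows.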
Twitter limits you to 500,000 Tweets in a 24-hour period. You can use interaction.sample to reduce your data consumption.
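To choose a sampling percentage that fits within that limit, you can work backwards from the volume you expect your filter to match. The following Python sketch shows the arithmetic. The helper `max_sample_percent` and the figure of 50 million matching Tweets per day are hypothetical, chosen only for illustration.

```python
def max_sample_percent(expected_daily_matches, daily_limit=500_000):
    """Largest sampling percentage that keeps the expected volume within daily_limit.

    expected_daily_matches is your own estimate of how many Tweets the
    unsampled filter would match in 24 hours (a hypothetical input).
    """
    if expected_daily_matches <= 0:
        raise ValueError("expected_daily_matches must be positive")
    # Never return more than 100 percent, even when the stream is small.
    return min(100.0, 100.0 * daily_limit / expected_daily_matches)


if __name__ == "__main__":
    # Hypothetical stream: 50 million matching Tweets per day.
    print(max_sample_percent(50_000_000))
```

Sampling is random, so the delivered volume fluctuates around the expected value. Picking a percentage somewhat below the computed maximum leaves a safety margin.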
1. Filter for a sample of 1 percent of incoming Tweets:
2. Filter for all the Tweets that mention "coffee" and for a 10-percent sample of the Retweets that mention "coffee":
3. You can even nest the samples:
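As a rough illustration of the logic behind items 1 and 2, the Python sketch below expresses each filter as a predicate over a simplified input object. It is not CSDL and does not use a DataSift API. The dictionary keys (`type`, `text`, `is_retweet`, `sample`), the function names, and the case-insensitive match on "coffee" are all assumptions made for this example. Item 3 is not modeled, because the way nested samples combine is not described here.

```python
import random


def one_percent_of_tweets(obj):
    """Item 1: keep a Tweet only if its sample value is below 1."""
    return obj["type"] == "twitter" and obj["sample"] < 1.0


def coffee_with_sampled_retweets(obj):
    """Item 2: keep every Tweet mentioning coffee, but only 10 percent of such Retweets."""
    if "coffee" not in obj["text"].lower():  # assumed case-insensitive match
        return False
    if obj["is_retweet"]:
        return obj["sample"] < 10.0
    return True


def make_object(rng, text, is_retweet=False, source="twitter"):
    """Build a simplified input object with a random sample value in [0, 100)."""
    return {
        "type": source,
        "text": text,
        "is_retweet": is_retweet,
        "sample": rng.uniform(0.0, 100.0),
    }


if __name__ == "__main__":
    rng = random.Random(0)
    retweets = [make_object(rng, "RT I love coffee", is_retweet=True) for _ in range(10_000)]
    kept = sum(coffee_with_sampled_retweets(o) for o in retweets)
    print(f"kept {kept} of {len(retweets)} simulated Retweets")
```

In this model the sample condition only narrows the branch it is attached to, so original Tweets that mention "coffee" always pass while Retweets pass about one time in ten.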