Advanced Features

CSDL comes with a range of advanced features that support much more complex filtering. You can include parent streams in a filter, and you can use our data processing to 'tag' content so that you don't have to do it on the client side.

There's only one limit, and most developers are unlikely to run into it: the CSDL code in your filter must not exceed 1 MB. Fortunately, there's an easy workaround that uses the stream keyword. If your filter grows beyond 1 MB, you can have it call one or more other streams. In this way, you can distribute your code across several smaller chunks of CSDL.

Here are some additional keywords and techniques you can employ to optimize streams, reduce costs, and create even more complex and powerful filters.

Including one filter in another

Use the stream keyword to include an existing filter definition in another.
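
As a minimal sketch, assuming the stream keyword takes the quoted hash of a previously compiled filter (the hash below is a placeholder, not a real stream, and interaction.content is used only as an illustrative target), a filter that reuses an existing definition and narrows it further might look like this:

```csdl
// Placeholder hash: substitute the hash of your own compiled filter.
stream "0123456789abcdef0123456789abcdef"
AND interaction.content contains "football"
```

The same pattern should let you join several included streams with OR, which is how a very large definition can be spread across smaller pieces.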

Regular expressions

Use regular expressions to match patterns in content that plain keyword matching cannot express.
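
For illustration, assuming the regex_partial operator (which matches when the pattern occurs anywhere in the target's value) and the interaction.content target, a single pattern can cover spelling variants that would otherwise need separate keywords:

```csdl
// Intended to match both "color" and "colour" anywhere in the content.
interaction.content regex_partial "colou?r"
```

A companion operator, regex_exact, is understood to require the pattern to match the whole value instead of a part of it.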

Selecting data sources

You can select or exclude individual social media sites in a variety of ways.
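
One possible approach, sketched here under the assumption that the interaction.type target identifies the source and that "twitter" and "facebook" are valid values for it, is to list the sources you want to keep:

```csdl
// Keep matching content from two sources only.
interaction.type in "twitter,facebook"
AND interaction.content contains "football"
```

To exclude a source instead, the same target could be tested with != in place of in.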


Optimizing CSDL

You can optimize your CSDL code to run more efficiently and to minimize costs.

Sampling data

You can sample data rather than drinking from the entire firehose. For instance, you can create a filter that samples just a percentage of the input objects flowing into DataSift. This approach is particularly useful if you're performing statistical analysis where, for example, just 10 percent of the data might be enough to form a representative sample.
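
A minimal sketch of the 10 percent case, assuming the interaction.sample target (taken here to yield a random value between 0 and 100 for each object) and an illustrative keyword condition:

```csdl
// Keep roughly 10 percent of the objects that mention "football".
interaction.content contains "football"
AND interaction.sample < 10
```

Because the selection is random, the share of objects that pass will only approximate the requested percentage.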