Quick Start using CSDL

This quick start guide walks you through creating filters using the Curated Stream Definition Language (CSDL) to find relevant interactions and conversations that are taking place right now.

1. Enabling Sources

DataSift is a single platform with access to a large number of data sources. You need to enable one or more data sources before writing a filter. In this example, you will enable Tumblr™.

To enable Tumblr, log in to your account and click the Data Sources tab.

All available data sources are displayed. Find the Tumblr source and click the Activate button. Sources with Enquire buttons are restricted to accounts on premium subscriptions.

Add your signature in the license screen.

Select the check box agreeing to the License Agreement and click Agree.

2. Creating a Filter

A stream is the set of social media interactions, together with the extra data added by DataSift, that your filter returns.

To create a filter, click the Streams tab and click the Create Stream button.

Type in a name and description for your stream, select the CSDL Code Editor, and click the Start Editing button.

The DataSift Curated Stream Definition Language (CSDL) enables you to create large and powerful filters. Each filter can be up to 1 MB in size.

Filtering Condition Elements

Filters include one or more filter conditions. Each filter condition usually has a Target, Operator and Argument. In the editor, these are color coded blue, red and green.

The following example is a filtering condition with all three elements:

tumblr.body contains "starbucks"
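To make the three elements concrete, the following Python sketch splits the condition above into its target, operator, and argument. It is only an illustration for simple, single-operator conditions: `split_condition` is a hypothetical helper, not part of any DataSift library, and it relies on Python's standard `shlex` module to keep the quoted argument together.

```python
import shlex


def split_condition(condition):
    """Split a simple CSDL condition into (target, operator, argument).

    Hypothetical helper for illustration only; it handles just the
    'target operator argument' shape shown in this guide.
    """
    parts = shlex.split(condition)  # a quoted argument stays one token
    if len(parts) != 3:
        raise ValueError("expected: target operator argument")
    target, operator, argument = parts
    return target, operator, argument


target, operator, argument = split_condition('tumblr.body contains "starbucks"')
print(target)    # tumblr.body
print(operator)  # contains
print(argument)  # starbucks
```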


A single interaction contains many data source attributes and values. These attributes are called targets, and every filtering condition starts with the name of a target.

All targets are listed in the developer documentation in three groups: feeds, augmentations, and managed sources. The targets available for each source are described in the Public Sources documentation.

Add a condition to filter for Tumblr posts that contain the string “starbucks”. Use the target tumblr.body and the contains operator.

Note: The contains_any operator accepts a comma-separated list of words.

This simple filter matches every Tumblr post, from any author, that contains the word “starbucks”. The match is not case sensitive. Click Save and Close.
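The matching behavior described here can be approximated locally. The Python sketch below is a simplified model, not DataSift's implementation: it assumes matching is word based and case insensitive, and the functions `contains` and `contains_any` are hypothetical stand-ins named after the CSDL operators. The platform's handling of punctuation and partial words may differ.

```python
import re


def tokens(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains(text, phrase):
    """Approximate `contains`: the phrase's words appear consecutively."""
    words, wanted = tokens(text), tokens(phrase)
    n = len(wanted)
    return n > 0 and any(
        words[i:i + n] == wanted for i in range(len(words) - n + 1)
    )


def contains_any(text, comma_separated):
    """Approximate `contains_any`: any item of the list matches."""
    return any(contains(text, item) for item in comma_separated.split(","))


print(contains("I love Starbucks coffee", "starbucks"))        # True
print(contains("Home-brewed coffee today", "starbucks"))       # False
print(contains_any("Grabbing a LATTE now", "starbucks, latte"))  # True
```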

A summary of the configured stream is shown along with the cost (in Data Processing Units) and options to run the stream or edit it again.

3. Refining Filters

A filter which displays every Tumblr post about your brand or your competitor's brand may produce too much data for you to analyze. The filter can be improved to retrieve interactions which match your business needs.

In this example, we will look for positive comments about Starbucks. The comments are measured by an Augmentation which looks for positive or negative sentiment in a post.

To update a filter to include augmentations, first check that the Sentiment augmentation is enabled under Data Sources.

From the Streams tab, select your Starbucks stream and click Edit Stream.

Next, create a condition based on the sentiment of the interactions. You can look for sentiment in the content or the title; select content for this example. Add the target salience.content.sentiment and set the condition to return positive interactions. Integers are not enclosed in quotes.

The sentiment scale runs from -100 to +100, although it is rare to see sentiment at the extremes of the scale. To filter for any level of positivity, match values of 1 or greater.
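As a rough illustration of the quoting rule and the sentiment threshold, the Python sketch below renders conditions as CSDL text and checks a score locally. Both functions are hypothetical helpers. Two assumptions are made: that quotes inside a string argument are escaped with a backslash, and that an interaction carries its score at `salience` > `content` > `sentiment` when represented as a nested dictionary.

```python
def format_condition(target, operator, argument):
    """Render one CSDL condition as text (hypothetical helper)."""
    if isinstance(argument, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(argument, (int, float)):
        rendered = str(argument)  # numbers are written without quotes
    else:
        # Assumption: backslash-escape quotes and backslashes in strings.
        escaped = str(argument).replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    return f"{target} {operator} {rendered}"


def is_positive(interaction, threshold=1):
    """Local model of the sentiment condition on a dict-shaped interaction."""
    # Assumption: the score sits at salience -> content -> sentiment.
    score = interaction.get("salience", {}).get("content", {}).get("sentiment")
    return score is not None and score >= threshold


print(format_condition("salience.content.sentiment", ">=", 1))
# salience.content.sentiment >= 1
print(format_condition("tumblr.body", "contains", "starbucks"))
# tumblr.body contains "starbucks"
```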

Filtering Logic

Your filter now contains more than one condition. When additional conditions are added to a filter, you need to define how they work together. Use the following logical operators to combine them:

OR returns interactions if any of the filter conditions are matched

AND returns interactions if ALL filter conditions are matched

NOT returns all interactions except those that match the filter condition
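The behavior of the three operators can be modeled with ordinary boolean logic. The Python sketch below is only an analogy: the interactions are simplified dictionaries with made-up `body` and `sentiment` keys, and `any_of`, `all_of`, and `negate` are hypothetical names standing in for OR, AND, and NOT.

```python
def any_of(*conditions):
    """OR: true if any condition matches."""
    return lambda interaction: any(c(interaction) for c in conditions)


def all_of(*conditions):
    """AND: true only if every condition matches."""
    return lambda interaction: all(c(interaction) for c in conditions)


def negate(condition):
    """NOT: true for everything the condition does not match."""
    return lambda interaction: not condition(interaction)


def mentions_starbucks(interaction):
    return "starbucks" in interaction["body"].lower()


def is_positive(interaction):
    return interaction["sentiment"] >= 1


# Made-up sample data for illustration only.
posts = [
    {"body": "Starbucks made my morning", "sentiment": 40},
    {"body": "Starbucks queue was endless", "sentiment": -25},
    {"body": "Home-brewed coffee today", "sentiment": 10},
]

either = [p for p in posts if any_of(mentions_starbucks, is_positive)(p)]
both = [p for p in posts if all_of(mentions_starbucks, is_positive)(p)]
others = [p for p in posts if negate(mentions_starbucks)(p)]

print(len(either), len(both), len(others))  # 3 1 1
```

With this sample data, OR keeps all three posts, AND keeps only the first, and NOT keeps only the post that never mentions the brand.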

Example using OR logic.

It is also possible to configure more complex logic using brackets.

The example in this guide uses AND logic. It matches Tumblr posts in English that have positive sentiment and contain the string “starbucks”.
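As a sketch of how such a filter might read as text, the Python snippet below joins conditions into CSDL strings, once with plain AND logic and once with brackets. The `tumblr.body` and `salience.content.sentiment` conditions come from this guide. The `language.tag == "en"` condition is an assumption about how the English-language check is written, and the second search word is invented, so confirm target names in the developer documentation. `combine` and `brackets` are hypothetical helpers.

```python
def combine(logic, *conditions):
    """Join CSDL conditions with AND or OR (hypothetical helper)."""
    if logic not in ("AND", "OR"):
        raise ValueError("logic must be AND or OR")
    return f" {logic} ".join(conditions)


def brackets(expression):
    """Wrap an expression in brackets so it is evaluated as one unit."""
    return f"({expression})"


and_filter = combine(
    "AND",
    'tumblr.body contains "starbucks"',
    "salience.content.sentiment >= 1",
    'language.tag == "en"',  # assumption: target for the language check
)

nested_filter = combine(
    "AND",
    brackets(combine(
        "OR",
        'tumblr.body contains "starbucks"',
        'tumblr.body contains "latte"',  # invented second word
    )),
    "salience.content.sentiment >= 1",
)

print(and_filter)
print(nested_filter)
```

The brackets in the second filter group the two OR conditions so that the sentiment condition applies to both of them.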

Click Save and Close.

Use Live Preview to verify that your stream returns the interactions you expect.

4. Previewing Streams

The stream summary is displayed after you save the filter.

To start a preview, click Live Preview.

In the summary of sources, check that Tumblr is listed.

Click the Play button at the bottom of the screen to start the live preview.

Wait for a few interactions to appear, then click the Pause button.

5. Analyzing Interactions

Each interaction is displayed along with its augmentations. DataSift enhances the information in the interaction with metadata in the form of augmentations.

In this example, the icons under the text show that the content has positive sentiment and that the text is in English.

To display an interaction in more detail, move the mouse pointer over it until a debug symbol is displayed, then click the symbol to reveal more information about the interaction.

Use the debug window to view the Tumblr data and all the extra augmented data provided by the DataSift platform.
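One way to relate the detailed view back to CSDL targets is to flatten an interaction into dotted names. The Python sketch below does this for a made-up sample; the shape of the dictionary is an assumption for illustration and is not copied from real platform output.

```python
def flatten(data, prefix=""):
    """Yield (dotted_name, value) pairs for every leaf in a nested dict."""
    for key, value in data.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, name)
        else:
            yield name, value


# Hypothetical interaction: source data plus two augmentations.
sample = {
    "tumblr": {"body": "Loving my Starbucks latte"},
    "salience": {"content": {"sentiment": 35}},
    "language": {"tag": "en"},
}

for name, value in flatten(sample):
    print(f"{name} = {value!r}")
# tumblr.body = 'Loving my Starbucks latte'
# salience.content.sentiment = 35
# language.tag = 'en'
```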

Next Steps

Well done! You have successfully created a filter to provide you with a real-time stream of interactions using a Tumblr data source and Augmentations. To maximize the benefit of DataSift functionality, consider the following steps.

Consuming Data

So far, you have previewed the stream in your web browser. To consume the stream over an extended period of time or to process the data programmatically, you need to consider Streaming or Destinations.

Data Consumption Methods

Streaming: Using an Application Programming Interface (API), the data is sent in real time to a program as JavaScript Object Notation (JSON) objects.

Destinations: The stream of interactions is sent to a pre-configured destination such as an FTP server, Amazon Simple Storage Service (S3), or a database.
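To give a feel for processing delivered data programmatically, the Python sketch below reads JSON objects one per line and extracts the sentiment score from each. It makes several assumptions: that the data arrives as newline-delimited JSON, that blank lines may appear between objects, and that the field layout matches the made-up sample. A real client would read from a network connection or a delivered file instead of the in-memory `fake_stream` used here.

```python
import io
import json


def read_interactions(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        yield json.loads(line)


# Made-up data standing in for a real connection or file.
fake_stream = io.StringIO(
    '{"tumblr": {"body": "Starbucks run"}, "salience": {"content": {"sentiment": 20}}}\n'
    "\n"
    '{"tumblr": {"body": "Best latte ever"}, "salience": {"content": {"sentiment": 55}}}\n'
)

scores = [
    interaction["salience"]["content"]["sentiment"]
    for interaction in read_interactions(fake_stream)
]
print(scores)  # [20, 55]
```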

Querying Historic Interactions

Filters can be run against historic social media interactions over configurable periods.

This feature is not available to trial accounts.

Writing Advanced Filters

Use CSDL to create powerful filters that fine-tune which interactions are returned and enrich your selected data with tags and scores.

Note: For more information on CSDL, view the CSDL documentation.