This quick start guide walks you through creating filters using Query Builder to find relevant interactions and conversations that are taking place right now.
1. Enabling Sources
DataSift is a single platform with access to a large number of data sources. You need to enable one or more data sources before writing a filter. In this example, you will enable Tumblr™.
All available data sources are displayed. Find the Tumblr source and click the Activate button. Sources with Enquire buttons are restricted to accounts on premium subscriptions.
Add your signature in the license screen.
Select the check box agreeing to the License Agreement and click Agree.
2. Creating a Stream
A stream consists of all the social media interactions that match your filter, together with the extra data DataSift adds to them.
To create a stream, click the Streams tab and then click the Create Stream button.
Create a filter. Type in a name and description for your stream. Select the Query Builder editor and click the Start Editing button.
Query Builder is a browser-based graphical tool that allows users to create and edit filters without having to learn the DataSift Curated Stream Definition Language (CSDL).
Click the Create New Filter button.
From the row of sources, click the Tumblr logo. This opens the next level of options. From the many types of Tumblr information, select BODY.
BODY looks for content in the Tumblr post.
Type Starbucks and leave Contains words selected. This simple filter will match interactions from every Tumblr post that contains the word Starbucks. Click Save and Preview.
Note: The "Contains words" operator allows a comma-separated list of words.
The Starbucks stream now contains one filter which matches Starbucks in the Tumblr post. Click Save and Close.
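To make the matching behaviour concrete, the following Python sketch imitates a "Contains words" check locally. It is only an illustration, not DataSift's implementation: the function name `contains_words` and the sample posts are hypothetical, and it assumes that matching is case-insensitive, works on whole words, and succeeds when any word in the comma-separated list is present.

```python
import re


def contains_words(text, words):
    """Return True if any word in the comma-separated list occurs in text.

    Assumptions (not confirmed by DataSift): matching is case-insensitive,
    applies to whole words, and one matching word is enough.
    """
    # Split the comma-separated list and drop empty entries.
    wanted = {w.strip().lower() for w in words.split(",") if w.strip()}
    # Break the post into lower-case word tokens.
    tokens = set(re.findall(r"\w+", text.lower()))
    return bool(wanted & tokens)


# Hypothetical posts used only to exercise the function.
posts = [
    "Grabbed a Starbucks latte on the way to work",
    "Tea is better than coffee",
]
matches = [p for p in posts if contains_words(p, "Starbucks")]
print(matches)
```

Under these assumptions, only the first sample post is kept, and a list such as `"Starbucks, latte"` would also match a post that mentions a latte without naming the brand.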
3. Refining Filters
A filter which displays every Tumblr post about your brand or your competitor's brand may produce too much data for you to analyze. The filter can be improved to retrieve interactions which match your business needs.
In this example, we will look for positive comments about Starbucks. The comments are measured by an Augmentation which looks for positive or negative sentiment in a post.
To update a filter to include augmentations, first check that the Sentiment augmentation is enabled under Data Sources.
From the Streams tab, select your Starbucks stream and click Edit Stream.
Click the Create a New Filter button.
To filter for positive comments, add a sentiment value.
a. Select the AUGMENTATIONS source
b. Select SALIENCE
c. Select CONTENT
d. Select SENTIMENT
e. Select the greater than operator (>)
f. On the scale from the most negative sentiment to the most positive sentiment, select one.
Save and Preview your filter.
Two filters now define your stream.
To select the logic to be used between each filter, use the ALL of the following, ANY of the following and ADVANCED buttons. For this example, ensure ALL of the following is highlighted. Click Save and Close.
Use the preview to verify that the correct interactions are being provided in your stream.
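The difference between ALL of the following and ANY of the following can be sketched in Python. This is an illustration only: the predicate functions, the sample data, and the flat dictionary keys are hypothetical and do not reflect how DataSift evaluates filters internally.

```python
def body_mentions_starbucks(interaction):
    # Hypothetical stand-in for the first filter (Tumblr BODY contains "Starbucks").
    return "starbucks" in interaction["body"].lower()


def sentiment_is_positive(interaction):
    # Hypothetical stand-in for the second filter (sentiment above a threshold).
    return interaction["sentiment"] > 0


FILTERS = [body_mentions_starbucks, sentiment_is_positive]


def matches_all(interaction):
    # "ALL of the following": every filter must match.
    return all(f(interaction) for f in FILTERS)


def matches_any(interaction):
    # "ANY of the following": at least one filter must match.
    return any(f(interaction) for f in FILTERS)


# Invented sample interactions with made-up sentiment scores.
sample = [
    {"body": "Love my Starbucks mocha", "sentiment": 5},
    {"body": "Starbucks got my order wrong", "sentiment": -4},
    {"body": "Great weather today", "sentiment": 3},
]
print([matches_all(i) for i in sample])  # [True, False, False]
print([matches_any(i) for i in sample])  # [True, True, True]
```

With ALL selected, only the positive Starbucks post survives, which is the behaviour this example needs. With ANY selected, every sample interaction would pass because each one satisfies at least one filter.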
4. Previewing Streams
A summary of the configured stream is shown along with the cost (in Data Processing Units) and options to run the stream or edit it again.
To use Live Preview, click Live Preview.
In the summary of sources, check that Tumblr is listed.
Click the Play button at the bottom of the screen to start the live preview.
Wait for a few interactions to appear…
Then click the pause button.
5. Analyzing Interactions
Each interaction is displayed along with its augmentations. DataSift enhances the information in the interaction with metadata in the form of augmentations.
In this example, the icons under the text show that the content has positive sentiment and that the text is in English.
To display an interaction in more detail, move the mouse pointer over it; a debug symbol appears. Click the symbol to reveal more information about the interaction.
Use the debug window to view the Tumblr data and all the extra augmented data provided by the DataSift platform.
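As a rough picture of what the debug window shows, an interaction can be thought of as a nested record in which the source data and each augmentation sit under their own key. The sketch below uses an entirely hypothetical record (the field names and values are invented for illustration, loosely following the SALIENCE, CONTENT, SENTIMENT path selected earlier) and flattens it into dotted names.

```python
def flatten(record, prefix=""):
    """Flatten a nested dict into {dotted.path: value} pairs."""
    flat = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as an augmentation.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical interaction; real interactions contain many more fields.
interaction = {
    "tumblr": {"body": "Love my Starbucks mocha"},
    "salience": {"content": {"sentiment": 5}},
    "language": {"tag": "en"},
}

for path, value in flatten(interaction).items():
    print(f"{path} = {value}")
```

Reading the record this way makes it easier to see which values come from the Tumblr source and which were added as augmentations.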
Well done! You have successfully created a filter to provide you with a real-time stream of interactions using a Tumblr data source and Augmentations. To maximize the benefit of DataSift functionality, consider the following steps.
So far, you have previewed the stream in your web browser. To consume the stream over an extended period of time or to process the data programmatically, you need to consider Streaming or Destinations.
Data Consumption Method
The stream of interactions is sent to a pre-configured destination such as an FTP server, Amazon Simple Storage Service (S3), or a database.
Querying Historic Interactions
Filters can be run against historic social media interactions over configurable periods.
This feature is not available to trial accounts.
Writing Advanced Filters
The DataSift Curated Stream Definition Language (CSDL) is a programming language for defining filters. The previous Starbucks example created in Query Builder has an equivalent in CSDL.
To view the CSDL, click Share CSDL from the stream summary.
The language is displayed in a CSDL editor.
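For orientation, the sketch below composes a CSDL-style string for the two Starbucks filters in Python. Treat every name in it as an assumption: the targets `tumblr.body` and `salience.content.sentiment` are inferred from the Query Builder paths used earlier, the `contains` operator and the threshold value are placeholders, and `build_csdl` is a hypothetical helper. The text shown by Share CSDL is the authoritative version.

```python
def build_csdl(keyword, sentiment_threshold):
    """Compose a CSDL-style filter string from the two Query Builder filters."""
    # Assumed target and operator names; check them against the Share CSDL output.
    body_filter = f'tumblr.body contains "{keyword}"'
    sentiment_filter = f"salience.content.sentiment > {sentiment_threshold}"
    # "ALL of the following" is modelled here by joining the filters with AND.
    return f"{body_filter} AND {sentiment_filter}"


# The threshold is a placeholder; use the value chosen on the sentiment scale.
print(build_csdl("Starbucks", 0))
```

Comparing a string like this with the editor's contents is a quick way to see how each Query Builder choice maps to a clause in the filter.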