Using Managed Sources

Ed Stenson | 9th September 2013

I've noticed some questions from clients who are using Managed Sources for the first time. In this blog I'm going to go through the steps to run a DataSift filter on a Managed Source:

  1. Create a token
  2. Create a Managed Source
  3. Create a CSDL filter for that Managed Source
  4. Start recording the output of the filter
  5. Start the Managed Source

I'll use Facebook in my examples, but the process is similar for all the Managed Sources the platform offers.

Suppose that you have hundreds of Facebook pages about your brands, plus a body of content created by users or customers. DataSift can aggregate it all: your brand pages, campaign pages, competitors' pages, and pages from industry influencers.

In this blog I'm going to focus on our UI, but you can set up and manage everything via API calls instead; for production use, that's the way to go. To learn more about that process, read our step-by-step guide.

Just to set the scene, DataSift offers two types of data source:

  • Public
  • Managed

A public source (YouTube, for example) is one that anyone can access. A Managed Source is one that requires you to supply valid authentication credentials before you can use it.

Create a token

The first task is to create an OAuth token that DataSift will use for authentication. The good news is that you don't even need to know what an OAuth token is, because it's generated automatically:

  1. Log in and go to Data Sources -> Managed Sources.

  2. Click on the Facebook tile.

  3. Click Add Token.

    A popup box appears, inviting you to sign in to your Facebook account. If you look at the URL in the popup's address bar, you'll see that it's served by Facebook, not by us. That means you're giving your Facebook credentials to Facebook privately, just as you do any other time you sign in. You are not giving them to us and we cannot see them.

  4. Log in to Facebook in the popup box.

    The popup closes and you will now see that you have a token.

    From now on, any time you run a filter in DataSift against this Managed Source, DataSift will use the token to gain access. It's secure; if you want to stop using the token, you can delete it from DataSift by clicking the red X. Or, in your Apps settings in Facebook, you can revoke it. If you do that, the token becomes useless.

Create a Managed Source

Now you can specify what you want to filter for.

  1. In the Name field, specify a name for your Managed Source. Here, I've called it "Example".

  2. Type a search term in the Search box and click Search. Here I'm going to monitor Ferrari cars and merchandise.

    DataSift lists all the accounts that match your search term. Select which ones you want to include in your filtering. In this example, I've chosen the candidate with the greatest number of likes.

  3. Click Save.
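If you prefer to think of a Managed Source as data rather than as a form, the three choices above (a name, the selected pages, and the token) can be sketched as a small Python structure. This is only an illustration: the function name, the field names, and the "facebook_page" label are my own assumptions, not the schema used by the Managed Sources API, and the page id and token are placeholders.

```python
import json


def describe_source(name, page_ids, token):
    """Collect the choices made in the UI into one plain dict.

    All field names here are hypothetical and chosen for illustration only.
    """
    if not name:
        raise ValueError("a Managed Source needs a name")
    if not page_ids:
        raise ValueError("select at least one page to include")
    return {
        "name": name,
        "source_type": "facebook_page",  # assumed label, not confirmed by this post
        "resources": [{"id": page_id} for page_id in page_ids],
        "auth": token,
    }


# Placeholder values: the page id and token below are made up.
source = describe_source("Example", ["1234567890"], "TOKEN-PLACEHOLDER")
print(json.dumps(source, indent=2))
```

The validation mirrors the UI: you cannot save a source without a name or without selecting at least one account.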

Create a CSDL filter for that Managed Source

  1. Click the My Managed Sources tab. You will see the source you just defined. Notice that the Start button is orange whereas the other two sources, which I defined before I took this screenshot, have a Stop button. It's important that you don't click Start yet. The first time you click it, DataSift delivers a backlog of posts from the past seven days. You need to create a stream and start a recording to capture those posts; otherwise they'll be lost. The next few steps explain how to do that.

  2. Click on your Managed Source, "Example" in this case. DataSift displays the definition page for the source.

  3. Click How to Use. Now you can grab the CSDL code for this Managed Source. It's a simple one-line filter that uses the target and the unique id for the source you just defined.

  4. Copy the CSDL code to the clipboard: == "c07504cc3a324848ba1fb5905287799b"

  5. Create a filter with that CSDL. You're probably very familiar with this step already. Just click the Create Stream button, paste the CSDL code in from your clipboard, and save it.
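If you end up with many Managed Sources, you may want to generate these one-line filters rather than copy them by hand. Here is a minimal Python sketch. The helper name is hypothetical, and the target is passed in as a parameter; I've used "source.id" in the usage line as an assumption, so check the How to Use page for the exact target your source shows. The check that an id is 32 lowercase hex characters is also an assumption, based only on the example id above.

```python
import re


def build_source_filter(target, source_id):
    """Return a one-line CSDL filter of the form: <target> == "<source_id>"."""
    # Assumption: ids look like the example in this post (32 lowercase hex chars).
    if not re.fullmatch(r"[0-9a-f]{32}", source_id):
        raise ValueError(f"unexpected source id: {source_id!r}")
    return f'{target} == "{source_id}"'


# "source.id" is an assumed target name; the id is the one from the example.
csdl = build_source_filter("source.id", "c07504cc3a324848ba1fb5905287799b")
print(csdl)
```

Validating the id before building the string catches a truncated paste early, before the filter is saved.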

Start recording the output of the filter

Now you need to start recording the output of that filter. Recordings are under the Tasks tab in DataSift.

  1. Click Start a Recording.

  2. Choose the filter that you created in the previous section.

  3. Click Start Now and choose an end time for your recording. For this first test, I'd recommend that you don't choose a long duration.

  4. Click Continue and then click Start Task.

Start the Managed Source

Now go back to My Managed Sources and click Start.

Your filter will start to collect data from the source and DataSift will record it automatically.
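The one rule worth remembering from this walkthrough is the ordering: the recording must be running before the source starts, because the backlog is delivered on the first start. If you later script this process, a small guard like the following Python sketch can enforce that order. The class and method names are hypothetical and it makes no calls to DataSift; it only models the sequence.

```python
class ManagedSourceRun:
    """Tracks the order of operations for one Managed Source run."""

    def __init__(self):
        self.recording = False
        self.source_started = False

    def start_recording(self):
        # In a real script this is where the recording would be started.
        self.recording = True

    def start_source(self):
        # Refuse to start the source until something is capturing its output.
        if not self.recording:
            raise RuntimeError("start a recording first, or the backlog is lost")
        self.source_started = True


run = ManagedSourceRun()
run.start_recording()
run.start_source()
print(run.source_started)
```

Calling start_source() on a fresh ManagedSourceRun raises RuntimeError, which is the scripted equivalent of "don't click Start yet".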


That's all you need to know to use Managed Sources from the UI. Notice that you didn't even need to write a filter to get started; the platform provided the code for you. And by starting the recording before you ran the filter, you made sure that no data was lost.

For production use, there's a powerful Managed Sources API, plus that step-by-step guide that I mentioned at the beginning of this blog.
