API FAQ

Basics

What's an API?

The acronym "API" stands for "Application Programming Interface". An API is a defined way for a program to accomplish a task, usually by retrieving or modifying data. In DataSift's case, we provide API methods for all the core functionality. Programmers use the DataSift API to build applications that work with our platform. Programs talk to the API using HTTP or WebSockets.

How do I find my API key?

NOTE: Your API key and DataSift username are both case-sensitive.

To use the DataSift API, the first thing you need is your API key.

  1. Log in to DataSift.

  2. Go to the Dashboard or to the Settings page.

  3. Click the Copy to Clipboard icon under "Developer API Key".

Note that DataSift does not display the API key until you have purchased credits. If you have no credits yet, you cannot make an API call.

How do I find the hash for my stream?

  1. Log in to DataSift.

  2. Click on the Streams tab.

  3. Select a stream.

  4. Click the green "Use stream" button.

What's an interaction?

In DataSift, an interaction is one object from a source website. For example, an interaction from Twitter is a Tweet, including all the meta information such as the name of the author, the number of followers the author has, the number of people the author follows, and so on. It can also include augmentation information such as the language in which the content is written and the sentiment, positive or negative, conveyed in the message. An interaction is typically delivered to your client application as a JSON object.

What's a filter?

Filters sit at the very heart of DataSift's engine. You write them in our CSDL programming language. You can think of a filter as the logic that decides which input objects DataSift will deliver to you and which ones it will discard.

What's a stream?

A stream is the output of a filter. The terms are quite close. In fact, you might hear some developers refer to filters as streams.
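
As a rough mental model only, a filter behaves like a predicate applied to each incoming object, and the stream is whatever passes. The JavaScript sketch below is an analogy, not CSDL and not a description of how DataSift's engine is implemented; the sample objects and the keepInteraction function are hypothetical.

```javascript
// Conceptual analogy only: real filters are written in CSDL and run on DataSift.
// The object shape and the matching rule below are hypothetical.
const incoming = [
  { source: "twitter", text: "Loving the new phone" },
  { source: "blog", text: "Quarterly results" },
  { source: "twitter", text: "Phone battery died again" },
];

// The "filter": returns true to deliver an object, false to discard it.
function keepInteraction(interaction) {
  return interaction.text.toLowerCase().includes("phone");
}

// The "stream": whatever the filter lets through.
const delivered = incoming.filter(keepInteraction);
console.log(delivered.length); // 2 of the 3 objects are delivered
```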

What's a target?

A target is an individual field of information supplied by one of our social media partners such as Twitter, by a third-party augmentation such as Klout, or by additional processing performed by DataSift itself, such as language analysis or gender detection. For example:

  • twitter.text contains the text of a Tweet (up to 140 characters)
  • klout.score contains an author's overall score on Klout
  • language.tag contains the 2-character language code that identifies the language in which the post is written
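
One way to picture targets is as dotted paths into the JSON object for an interaction. The sketch below assumes that a target name such as klout.score maps directly onto nested keys. That mapping, the sample values, and the getTarget helper are illustrative assumptions, not a documented format.

```javascript
// Hypothetical interaction; the nesting simply mirrors the target names above.
const interaction = {
  twitter: { text: "Hello from London" },
  klout: { score: 52 },
  language: { tag: "en" },
};

// Walk a dotted target name through the object.
// Returns undefined as soon as any step of the path is missing.
function getTarget(obj, target) {
  return target.split(".").reduce(
    (node, key) =>
      node !== null && typeof node === "object" ? node[key] : undefined,
    obj
  );
}

console.log(getTarget(interaction, "klout.score"));     // 52
console.log(getTarget(interaction, "language.tag"));    // en
console.log(getTarget(interaction, "twitter.missing")); // undefined
```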

What's a post?

A post is a generic DataSift term for a message from one of our social media partners. Posts can be Tweets, blog entries, blog comments, Myspace content, and so on.

What's CSDL?

CSDL is our programming language, the Curated Stream Definition Language. Every DataSift developer learns CSDL because it is the language you use to write filters. It's a very simple, compact compiled language that runs exceptionally quickly and is easy to learn.

How many streams can I create?

There is currently no limit to the number of streams you can create through our Streaming or REST APIs. You can create a maximum of 1,000 streams through our GUI. For full details on our API usage policy, please see API Rate Limiting.

What's JSON?

JSON (JavaScript Object Notation) is the default format for the data that our APIs return. Wikipedia introduces JSON as "a lightweight text-based open standard designed for human-readable data interchange."

How do I switch from JSON to JSONP?

To use JSONP you need to understand how to:

  • Write a JavaScript callback function
  • Include that function in your API call
  • Make sure you specify JSONP in your API call

You define your JavaScript callback function on your web page and name it in the callback parameter when you call a DataSift API. The callback function needs to be able to process standard DataSift objects but also be able to handle the other types of message that it might receive, such as ticks or error messages. A tick simply indicates that a connection is open but receiving no data. Ticks look like this:

    {"tick":1336057708,"status":"initialised","message":"Waiting for data"}
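
A callback along the following lines could separate ticks from other messages. It is a minimal sketch: the tick field comes from the sample above, but treating a top-level error field as an error message, and treating every other object as an interaction, are assumptions. The names classifyMessage and xyz are hypothetical.

```javascript
// Decide what kind of message the callback received.
// "tick" is taken from the sample above; the "error" key is an assumed shape.
function classifyMessage(message) {
  if (message === null || typeof message !== "object") return "unknown";
  if ("tick" in message) return "tick";
  if ("error" in message) return "error";
  return "interaction";
}

const received = []; // interactions collected by the callback

// Hypothetical JSONP callback.
function xyz(message) {
  switch (classifyMessage(message)) {
    case "tick":
      break; // connection is open but idle; nothing to process
    case "error":
      console.error("DataSift error:", message.error);
      break;
    case "interaction":
      received.push(message);
      break;
    default:
      break; // ignore anything that is not an object
  }
}

xyz({ tick: 1336057708, status: "initialised", message: "Waiting for data" });
xyz({ interaction: { content: "sample" } }); // made-up payload
console.log(received.length); // 1
```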

Your API call needs to include the name of the callback function. For example, suppose that you're using this API call with JSON and, to keep things simple, suppose that your username is "me" and your api_key is 888:

    http://stream.datasift.com/usage?username=me&api_key=888

To move from JSON to JSONP, with a callback function called xyz, you would change this API call to this:

    http://stream.datasift.com/usage.jsonp?username=me&api_key=888&callback=xyz

Notice that we changed the endpoint from usage to usage.jsonp to instruct DataSift to use JSONP, and added the callback parameter to name the callback function.
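
The same transformation can be sketched in code. The toJsonpUrl helper below is hypothetical; it relies only on the standard URL class and on the two changes visible in the example above: the .jsonp suffix on the endpoint and the callback parameter.

```javascript
// Turn a JSON API URL into its JSONP form by appending ".jsonp" to the
// endpoint and naming the callback. toJsonpUrl is a hypothetical helper.
function toJsonpUrl(jsonUrl, callbackName) {
  const url = new URL(jsonUrl);
  if (!url.pathname.endsWith(".jsonp")) {
    url.pathname += ".jsonp";
  }
  url.searchParams.set("callback", callbackName);
  return url.toString();
}

console.log(
  toJsonpUrl("http://stream.datasift.com/usage?username=me&api_key=888", "xyz")
);
// http://stream.datasift.com/usage.jsonp?username=me&api_key=888&callback=xyz
```

In a browser, a JSONP URL like this is normally used as the src of a script element, so that the response calls the named function when it loads.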

I need something!

Where are the API Client Libraries?

They're held on GitHub, the code-sharing site. Here's what they say about their site: "GitHub has grown into an application used by over a million people to store over two million code repositories, making GitHub the largest code host in the world."

Our Client Libraries page has the up-to-date list of the libraries we offer.

How do I keep up with changes to the API?

We make all our development announcements on Twitter. Just follow @DataSiftDev. We'd love you to follow @DataSift, too.

Something isn't working!

Is your service down?

We are constantly striving to build the best platform to unlock the power and data of social media. From time to time it may be necessary to disrupt our services to perform maintenance and upgrades. You can check the status of DataSift, and any scheduled work, on the DataSift Status Dashboard. Any changes to the platform status are also announced on @DataSiftAPI.

What do I do when I hit the rate limit?

The rate limit is designed to ensure that everyone plays fair. Essentially, you can use the Streaming API and the /stream endpoint of the REST API without hitting limits. For other activities, such as compiling or validating CSDL code through the REST API, DataSift applies limits, and each REST API endpoint has its own rate limit cost. If you find yourself hitting the limits, you might have to wait for up to one hour.
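
If a client does hit the limit, spacing out retries avoids making things worse. The sketch below computes a capped exponential delay. The one-minute starting delay and the doubling schedule are arbitrary assumptions; only the one-hour ceiling comes from the text above. How a rate-limited response is signalled is not covered here, so detecting it is left to the caller.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;
const BASE_DELAY_MS = 60 * 1000; // assumed starting delay, not a DataSift value

// Delay before retry number `attempt` (0-based):
// doubles on each attempt and is capped at one hour.
function retryDelayMs(attempt) {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, ONE_HOUR_MS);
}

for (let attempt = 0; attempt < 8; attempt++) {
  console.log(`retry ${attempt}: wait ${retryDelayMs(attempt) / 60000} min`);
}
```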

Why did the hash for my stream change?

If you're working in the GUI, it's because you edited the stream. If you're compiling using the /compile endpoint of the REST API, it's because you sent new CSDL. The bottom line here is that if you change the CSDL, the hash will change.