Historics Access

Feel free to jump right in with the Historics API: Step-by-Step guide.

Note: The Historics archive is not currently available to trial customers or to Pay As You Go customers. If you would like to use Historics, the folks on our Sales Team are happy to help you.

The DataSift Historics archive is a large body of content gathered from a variety of social media sites. Historics is useful when you want to turn the clock back and filter against data from the past.

Historics uses the same CSDL filtering language that we use for live streaming, but it runs much faster than live streaming. It offers 100 percent coverage of the archive, and you can also choose to run it on a 10 percent sample.

Historics jobs run as 'batch' processes in our cluster. You specify the time range that you want to look at and submit 'jobs' to gather the data.
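To make the inputs to a job concrete, here is a minimal Python sketch of a container for them. The class name HistoricsJobSpec, its field names, and the example filter are illustrative assumptions, not part of the DataSift API; only the time range, the CSDL filter, and the 10 or 100 percent sample come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Sample percentages mentioned in the text: full coverage or a 10 percent sample.
ALLOWED_SAMPLES = (10, 100)


@dataclass(frozen=True)
class HistoricsJobSpec:
    """Hypothetical container for the inputs a Historics job needs."""

    csdl: str          # the CSDL filter to run against the archive
    start: datetime    # beginning of the time range
    end: datetime      # end of the time range
    sample: int = 100  # percentage of the archive to filter against

    def __post_init__(self):
        # Reject obviously invalid specs before anything is submitted.
        if not self.csdl.strip():
            raise ValueError("a CSDL filter is required")
        if self.start >= self.end:
            raise ValueError("start must be earlier than end")
        if self.sample not in ALLOWED_SAMPLES:
            raise ValueError("sample must be 10 or 100")


# Example: one day of data, filtered against a 10 percent sample.
spec = HistoricsJobSpec(
    csdl='interaction.content contains "olympics"',
    start=datetime(2012, 7, 1, tzinfo=timezone.utc),
    end=datetime(2012, 7, 2, tzinfo=timezone.utc),
    sample=10,
)
```

Validating the spec locally like this is only a convenience; the service itself remains the authority on what it accepts.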

When you query the Historics archive, we give you clear guidance on data availability. You'll see on screen whether we have coverage for the days you've selected. Here's a snapshot of the archive in our staging environment (not production) where data for July 17 has not been loaded.

Note that your Historics queries are run in the timezone you set on your profile in DataSift. You can change it at any time.

Note also that the end time for any Historics query must be at least one hour in the past.
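The one-hour rule is easy to check before you submit a job. The sketch below is a hypothetical client-side helper (the function name and the MIN_END_AGE constant are assumptions, not part of any DataSift library). It uses timezone-aware datetimes so the comparison stays correct whatever timezone is set on your profile.

```python
from datetime import datetime, timedelta, timezone

# The end of a Historics query must be at least this far in the past.
MIN_END_AGE = timedelta(hours=1)


def end_time_is_valid(end, now=None):
    """Return True if `end` is at least one hour before `now`.

    Both values must be timezone-aware datetimes. `now` defaults to the
    current UTC time and can be overridden for testing.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return end <= now - MIN_END_AGE


# Example with a fixed reference time.
reference = datetime(2012, 7, 20, 12, 0, tzinfo=timezone.utc)
ok = end_time_is_valid(datetime(2012, 7, 20, 10, 30, tzinfo=timezone.utc), now=reference)
too_recent = end_time_is_valid(datetime(2012, 7, 20, 11, 30, tzinfo=timezone.utc), now=reference)
```

Here `ok` is True because the end time is 90 minutes before the reference time, and `too_recent` is False because it is only 30 minutes before it.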

You can stop a Historics job partway through. You're billed for the work done so far, and for any data received.

The delivery mechanism for Historics is called Push. To learn how to configure and use Historics, take a look at our step-by-step guide to Historics.

What data is available in the archive?

At the time of writing, we have data from these sources and augmentations in the archive:

  • Bitly
  • Disqus
  • Tumblr
  • WordPress
  • Gender
  • Interaction
  • Language
  • Links
  • Salience Sentiment
  • Salience Entities
  • Salience Topics

For the latest information, and to find out how far back the archive data goes for each source, consult our Historics Archive Schema pages.

How does billing work with Historics?

For Historics, you pay the standard data cost for each interaction that you receive, plus a DPU cost. Call historics/prepare to determine the DPU cost.

Push costs nothing. You pay exactly the same whether you use Push or not.
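As a rough illustration of the billing rule above, the following Python sketch adds the data cost to the DPU cost. The function name, the parameter names, and every rate in the example are made-up placeholders; real prices are on the Billing page, and the DPU figure is the one you would obtain from historics/prepare. Push adds nothing to the total, so it does not appear in the calculation.

```python
from decimal import Decimal


def estimate_historics_cost(interactions, data_cost_per_interaction, dpus, price_per_dpu):
    """Estimate the total cost of a Historics job.

    interactions: number of interactions received
    data_cost_per_interaction: data cost per interaction (placeholder rate)
    dpus: DPU figure for the job, as reported by historics/prepare
    price_per_dpu: price of one DPU (placeholder rate)
    """
    data_cost = Decimal(interactions) * Decimal(data_cost_per_interaction)
    dpu_cost = Decimal(dpus) * Decimal(price_per_dpu)
    return data_cost + dpu_cost


# Example with invented numbers: 1,000 interactions and 2.5 DPUs.
total = estimate_historics_cost(
    interactions=1000,
    data_cost_per_interaction="0.0001",
    dpus="2.5",
    price_per_dpu="0.20",
)
```

Passing the rates as strings keeps the arithmetic in exact decimal form, which avoids floating-point rounding in money calculations.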

To learn more, please take a look at our Billing page.

Further Reading

Don't miss A Journey into Optimizing Hadoop Jobs by Lorenzo Alberton, DataSift Chief Technology Officer.