Push API: Step-by-Step
The Push endpoints in DataSift's REST API allow you to use our Push delivery system instead of live streaming. If you're using Push with live streaming, a call to the /push/create endpoint is enough to set everything going.
If you're using Historics, you'll need to make several API calls in the correct sequence.
Here's a summary of the API calls that you need to make. If you're using one of our client libraries, they'll make the appropriate calls for you.
How do I use the Push API?
Decide which data destination you want to use.
If you choose a destination that requires us to access your servers (such as FTP, SFTP, or HTTP), take a look at the IP addresses we use.
Create a filter in CSDL and call the /compile endpoint, which returns a JSON object containing a unique hash to identify your filter. You can run the stream live or you can run it as a Historics query against our archived data.
If you're using Historics, hit the /historics/prepare endpoint in the Historics API. Make a note of the Historics id that it returns. Take a look at our Historics documentation for more information.
Hit the /push/create endpoint in the Push API.
This API call generates a Subscription and returns a Subscription id which serves as a unique identifier for that Subscription. Make sure you keep a note of the Subscription id.
One of the parameters is called output_params. Here you can set the maximum amount of data that you want to receive in each HTTP request and the minimum time between HTTP requests. Think about these settings; if you choose a small value for the data size and a long time interval, and then you run a high-volume stream, you could lose data. We recommend that you test your streams in DataSift's UI first to get an idea of the data volume you can expect.
If you're using Historics, hit the /historics/start endpoint to set your Historics query running. DataSift starts to send data to the buffer in Push.
If you need to pause data delivery, hit the /push/pause endpoint. DataSift continues to buffer your data for up to an hour. In other words, data flows into Push but we do not deliver it to you. It's important that you don't pause delivery for more than an hour because you will lose data. Hit /push/resume to make the Subscription active again.
If you want to change the name or parameters you gave to the Subscription (when you originally called /push/create), call /push/update.
At any time, you can request simple statistics on any Subscription together with a status report. Just call the /push/log endpoint.
Hit /push/stop to stop a Subscription that is running. DataSift sets the status flag to finishing, attempts to deliver all the data in the buffer to you using HTTP requests, and then sets the status flag to finished.
At any time, you can delete a Subscription, even if it is running, by calling /push/delete.
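The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than part of any DataSift client library: it makes no network calls, the function names are hypothetical, and the two sizing arguments are stand-ins for whatever you set in output_params. The first function returns the endpoints in the order described above; the other two turn the warning about small data sizes and long intervals into a simple check, assuming at most one request of the maximum size is delivered per minimum interval.

```python
import math
from typing import List


def plan_push_calls(use_historics: bool) -> List[str]:
    """Return the endpoints to call, in order, to start Push delivery.

    Endpoint paths come from the steps above; this helper is hypothetical.
    """
    calls = ["/compile"]  # compile the CSDL filter to obtain its hash
    if use_historics:
        calls.append("/historics/prepare")  # returns the Historics id
    calls.append("/push/create")  # returns the Subscription id
    if use_historics:
        calls.append("/historics/start")  # set the Historics query running
    return calls


def max_delivery_rate(max_size_bytes: int, min_interval_seconds: float) -> float:
    """Upper bound, in bytes per second, on what one Subscription can deliver.

    Assumes one request of at most max_size_bytes every min_interval_seconds.
    Both argument names are hypothetical stand-ins for output_params settings.
    """
    if min_interval_seconds < 0:
        raise ValueError("min_interval_seconds must not be negative")
    if min_interval_seconds == 0:
        return math.inf  # no enforced gap between requests
    return max_size_bytes / min_interval_seconds


def may_lose_data(stream_bytes_per_second: float,
                  max_size_bytes: int,
                  min_interval_seconds: float) -> bool:
    """True if the stream produces data faster than Push can deliver it."""
    return stream_bytes_per_second > max_delivery_rate(
        max_size_bytes, min_interval_seconds
    )


if __name__ == "__main__":
    print(plan_push_calls(use_historics=True))
    print(may_lose_data(200_000, 1_000_000, 10))
```

If may_lose_data returns True for the volume you measured when testing the stream, raise the maximum data size or shorten the interval before you call /push/create.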