Important! This Connector has been deprecated.
Please use Splunk Storm REST instead.
Splunk Storm is a cloud-based data analysis and visualization service. DataSift can push interactions directly into your Splunk Storm account.
Configuring Splunk Storm for Push delivery
To use Splunk Storm with Push delivery, follow the instructions below, skipping any steps you have already completed. It does not matter which operating system you use, as long as you can connect to the internet:
Create a new Splunk Storm project. You need a Splunk Storm account to do this.
Set data delivery format to JSON (pre-defined timestamps).
Set the Input for your new project to Network Data.
Authorize DataSift's IP address for data delivery.
The easiest way is to use the automatic setup wizard.
- You are now ready to set up the Splunk Storm connector.
Configuring Push for delivery to Splunk Storm
To enable delivery, you will need to define a stream or a Historics query. Both return important details required for a Push subscription. A successful stream definition returns a hash; a Historics query returns an id. You will need one of them (but not both) to set the value of the hash or historic_id parameter in a call to /push/create. To obtain that information, make a call to /push/get or /historics/get, or use the DataSift dashboard.
Once you have the stream hash or the Historics id, you can pass it to /push/create. In the example below we make that call using curl, but you are free to use any programming language or tool.
curl -X POST 'https://api.datasift.com/v1.4/push/create' \
  -d 'name=connectorsplunkstorm' \
  -d 'hash=42d388f8b1db997faaf7dab487f11290' \
  -d 'output_type=splunkstorm' \
  -d 'output_params.host=splunkstorm.example.com' \
  -d 'output_params.port=20036' \
  -H 'Authorization: datasift-user:your-datasift-api-key'
For more information, read the step-by-step guide to the API to learn how to use Push with DataSift's APIs.
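As a rough Python equivalent of the curl call above, the following sketch builds the same POST request with the standard library and enforces the "either hash or historic_id, but not both" rule. The function name and its arguments are hypothetical and chosen for illustration; only the URL, the parameter names, and the Authorization header format are taken from the curl example.

```python
from urllib.parse import urlencode
from urllib.request import Request

PUSH_CREATE_URL = "https://api.datasift.com/v1.4/push/create"


def build_push_create_request(username, api_key, host, port,
                              stream_hash=None, historic_id=None,
                              name="connectorsplunkstorm"):
    """Build (but do not send) a /push/create request for Splunk Storm.

    Hypothetical helper: mirrors the parameters of the curl example.
    """
    # Exactly one of the two identifiers must be supplied.
    if (stream_hash is None) == (historic_id is None):
        raise ValueError("provide either stream_hash or historic_id, not both")

    params = {
        "name": name,
        "output_type": "splunkstorm",
        "output_params.host": host,
        "output_params.port": str(port),
    }
    if stream_hash is not None:
        params["hash"] = stream_hash
    else:
        params["historic_id"] = historic_id

    return Request(
        PUSH_CREATE_URL,
        data=urlencode(params).encode("ascii"),  # form-encoded, like curl -d
        headers={"Authorization": f"{username}:{api_key}"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request; the sketch stops short of that so the request can be inspected first.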
When a call to /push/create is successful, you will receive a response that contains a Push subscription id. You will need that information to make successful calls to all other Push API endpoints (/push/delete, /push/stop, and others). You can retrieve the list of your subscription ids with a call to /push/get.
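The sketch below shows one way to pull the subscription id out of a /push/create response and reuse it for a later call such as /push/stop. It assumes that the response is a JSON object whose subscription id is in a field named id, and that /push/stop accepts a form parameter also named id. Neither detail is stated on this page, so check both against the API reference. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def subscription_id_from_response(body):
    """Return the subscription id from a /push/create response body.

    Assumption: the response is a JSON object with an "id" field.
    """
    return json.loads(body)["id"]


def push_stop_body(subscription_id):
    """Form-encoded body for a /push/stop call.

    Assumption: the endpoint takes the subscription id as "id".
    """
    return urlencode({"id": subscription_id})
```

Storing the id as soon as /push/create returns avoids a later round trip to /push/get just to look it up.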
You should now check that the data is being delivered to your Splunk Storm project's input. Log in to your Splunk Storm account and search for data using the * wildcard character. It will match any document it can find. When the results are empty, you may have to wait for a while to let DataSift populate your project's database.
If there are no results, the most likely reason is that your DataSift stream has not yet produced any data, and you simply need to wait for a few seconds. When you are filtering for content that appears regularly, your stream will produce a high volume of data which will probably reach Splunk with minimal delay. However, a filter for content that appears rarely will result in a low-volume stream and might take several minutes or hours to find just one match. Therefore, it's a good idea to test a stream in the DataSift UI to get an idea of the throughput you should expect.
If there is a longer delay, this might be due to the fact that the stream has no data in it or there is a problem with your server's configuration. In the first case, preview your stream using the DataSift web console and in the second case, make a call to /push/log to find out if there are any clues in there.
Please make sure that you watch your usage and add funds to your account when it is running low. Also, stop any subscriptions that are no longer needed; otherwise, you will be charged for their usage. There is no need to delete them. You can have as many stopped subscriptions as you like without paying for them. Remember that any subscriptions that were paused automatically due to insufficient funds will resume when you add funds to your account.
- Familiarize yourself with the output parameters (for example, the host name and port) you'll need to know when you send data to a Splunk Storm project.
Twitter sends delete messages which identify Tweets that have been deleted. Under your licensing terms, you must process these delete messages and delete the corresponding Tweets from your storage.
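If you keep a copy of delivered interactions in a store you control, delete handling could look like the sketch below. It assumes that each delivered line is one JSON object, that a delete message carries "deleted": true, and that both normal interactions and delete messages identify the Tweet through interaction.id. These field names are assumptions made for illustration, so verify them against the delete messages you actually receive. The in-memory dict stands in for whatever storage you use.

```python
import json


def apply_interactions(store, lines):
    """Apply a batch of newline-delimited JSON interactions to `store`.

    `store` maps interaction id -> interaction dict. Delete messages
    (assumed to carry "deleted": true) remove the matching entry.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        item = json.loads(line)
        interaction_id = item.get("interaction", {}).get("id")
        if interaction_id is None:
            continue  # nothing to key on; ignore in this sketch
        if item.get("deleted"):
            # Delete message: drop the stored Tweet if we have it.
            store.pop(interaction_id, None)
        else:
            store[interaction_id] = item
    return store
```

A delete message for an id that was never stored is ignored, so replaying a batch is harmless.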
| Parameter | Description |
| Output format (default = json_new_line_timestamp_meta) | The output format for your data. If you omit this parameter or set it to json_new_line_timestamp_meta, your output consists of JSON metadata followed by a JSON array of interactions (wrapped in square brackets and separated by commas). If you select json_new_line_timestamp, DataSift omits the metadata and sends just the array of interactions. Take a look at our Sample Output for File-Based Connectors page. |
| output_params.host | The name of the Splunk Storm host that DataSift will connect to. |
| output_params.port | The port that you want DataSift to use on the target Splunk Storm host. |
Data format delivered:
JSON format. Each interaction is stored as one document.
One interaction per document.
DataSift cannot currently send more than 5MB of data every 10 seconds to Splunk Storm. This is due to the limitations of the Splunk Storm infrastructure, and it may change in the future.
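To get a feel for what this limit means for a given stream, the sketch below converts it into a sustained interaction rate. It assumes 5MB means 5 × 1024 × 1024 bytes and that the limit applies as an average over each 10-second window; the function names and the average interaction size used in the example are illustrative, not measured values.

```python
LIMIT_BYTES = 5 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
WINDOW_SECONDS = 10


def max_sustained_rate(avg_interaction_bytes):
    """Interactions per second that fit within the delivery limit."""
    if avg_interaction_bytes <= 0:
        raise ValueError("avg_interaction_bytes must be positive")
    return LIMIT_BYTES / WINDOW_SECONDS / avg_interaction_bytes


def fits_limit(interactions_per_second, avg_interaction_bytes):
    """True if the stream's volume per window stays within the limit."""
    window_bytes = interactions_per_second * avg_interaction_bytes * WINDOW_SECONDS
    return window_bytes <= LIMIT_BYTES
```

For example, with interactions averaging 2048 bytes, max_sustained_rate(2048) gives 256.0 interactions per second, so a stream that produces more than that on average would exceed the limit.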
Please refer to the Splunk Storm support page.