Splunk Enterprise

Updated on Thursday, 29 May, 2014 - 10:27

This connector is available only to customers on the Enterprise Edition.

 

Splunk Enterprise is a self-hosted data analysis and visualization service from Splunk. DataSift can push interactions directly into your Splunk Enterprise service as long as we can reach it over the public internet.
 

Configuring Splunk Enterprise for Push delivery

To use Splunk Enterprise with Push delivery, follow the instructions below, skipping any steps you have already completed. It does not matter which operating system you use as long as it can connect to the internet:
 

  1. Log into your Splunk account.

    You may need to create a new Splunk account if you haven't got one already.
     
  2. Download a Splunk Enterprise package for your operating system.

    You need to be logged into your Splunk account to be able to download Splunk Enterprise. In the examples on this page we assume that you are using a 64-bit version of Ubuntu.
     
  3. Unpack Splunk Enterprise to your local directory:

    $ tar zxvf splunk-5.0.2-149561-Linux-x86_64.tgz
    $ cd splunk
    $ cd bin

    Unpack Splunk Enterprise to a local directory on a machine that DataSift can reach over the public internet. An Amazon AWS EC2 micro instance will do just fine.
     
  4. Install Splunk Enterprise:

    $ ./splunk start

    Follow the instructions given by the installation script. Most of the time you will not have to modify anything, but if port 8000 is already used by another service, the script will ask you to choose another port for the Splunk Enterprise web interface.
     
  5. Log into the Splunk Enterprise Administration interface.

    You need to open the Splunk Enterprise Administration interface's URL in a web browser, for example http://splunkenterprise.example.com:8000/

    If you put your Splunk Enterprise server behind a firewall, set your firewall to open and forward the Administration port (8000 or the port of your choice) to connections from the IP address of your local machine or the range of IP addresses managed by your organization.
     
  6. Make a note of the Management port.

    Go to Manager -> System settings -> General settings page.

    If that port is used by another service, simply change it to something like 8090, or another unused port, 1024 or above. If you put your Splunk Enterprise server behind a firewall, set it to open and forward the Management port (8089 or the port of your choice) to connections from the DataSift Push IP addresses; see the sketch after this list.
     
  7. Save any changes you have made.
     
  8. You are now ready to set up the Splunk Enterprise connector.
     

Configuring Push for delivery to Splunk Enterprise

  1. To enable delivery, you will need to define a stream or a Historics query. Both return important details required for a Push subscription: a successful stream definition returns a hash, and a Historics query returns an id. You will need one or the other (but not both) to set the value of the hash or historic_id parameter in a call to /push/create. You can obtain that information with a call to /push/get or /historics/get, or from the DataSift dashboard.
     
  2. Once you have the stream hash or the Historics id, you can give that information to /push/create. In the example after this list we make that call using curl, but you are free to use any programming language or tool.
     
  3. For more information, read the step-by-step guide to the API to learn how to use Push with DataSift's APIs.
     
  4. When a call to /push/create is successful, you will receive a response that contains a Push subscription id. You will need that id to make successful calls to all other Push API endpoints (/push/delete, /push/stop, and others). You can retrieve the list of your subscription ids with a call to /push/get, as shown after this list.
     
  5. You should now check that data is being delivered to your Splunk Enterprise installation. Log in to your Splunk Enterprise web interface and click Launch search app. Then, search for data using the * wildcard character; it will match any document it can find. If the results are empty, you may have to wait a while to let DataSift populate your database.

    If there is a longer delay, either the stream has no data in it or there is a problem with your server's configuration. In the first case, preview your stream using the DataSift web console; in the second, make a call to /push/log (shown after this list) to look for clues.

    Please make sure that you watch your usage and add funds to your account when your balance is running low. Also, stop any subscriptions that are no longer needed, otherwise you will be charged for their usage. There is no need to delete them; you can have as many stopped subscriptions as you like without paying for them. Remember that any subscriptions that were paused automatically due to insufficient funds will resume when you add funds to your account.
     
  6. To stop delivery, call /push/stop. To remove your subscription completely, call /push/delete. Both calls are shown after this list.
     
  7. Familiarize yourself with the output parameters (for example, the host name) you'll need to know when you send data to a Splunk Enterprise server.
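
Here is the curl call promised in step 2, as a minimal sketch rather than a definitive recipe: the Authorization header format, the splunkenterprise output_type value, and all placeholder values are assumptions you should adapt to your own account. The output_params fields are described in the table below. For a Historics query, replace the hash parameter with the Historics id parameter from step 1.

    $ curl -X POST 'https://api.datasift.com/v1/push/create' \
        -d 'name=splunk-connector' \
        -d 'hash=YOUR_STREAM_HASH' \
        -d 'output_type=splunkenterprise' \
        -d 'output_params.host=splunkenterprise.example.com' \
        -d 'output_params.port=8089' \
        -d 'output_params.auth.username=admin' \
        -d 'output_params.auth.password=YOUR_PASSWORD' \
        -d 'output_params.format=json_new_line_timestamp_meta' \
        -H 'Authorization: YOUR_USERNAME:YOUR_API_KEY'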
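
To list your subscriptions and their ids (step 4), a call like this should work, with the same assumed header:

    $ curl 'https://api.datasift.com/v1/push/get' \
        -H 'Authorization: YOUR_USERNAME:YOUR_API_KEY'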
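
To look for delivery clues (step 5), query /push/log; narrowing it to one subscription with an id parameter mirrors the other Push endpoints and is an assumption here:

    $ curl 'https://api.datasift.com/v1/push/log?id=YOUR_SUBSCRIPTION_ID' \
        -H 'Authorization: YOUR_USERNAME:YOUR_API_KEY'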
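
Finally, to stop delivery or remove a subscription completely (step 6):

    $ curl -X POST 'https://api.datasift.com/v1/push/stop' \
        -d 'id=YOUR_SUBSCRIPTION_ID' \
        -H 'Authorization: YOUR_USERNAME:YOUR_API_KEY'

    $ curl -X POST 'https://api.datasift.com/v1/push/delete' \
        -d 'id=YOUR_SUBSCRIPTION_ID' \
        -H 'Authorization: YOUR_USERNAME:YOUR_API_KEY'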

Notes

Twitter sends delete messages which identify Tweets that have been deleted. Under your licensing terms, you must process these delete messages and delete the corresponding Tweets from your storage.
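
How you honor those delete messages depends on how you index the data. One option is Splunk's own delete search command, which marks matching events irretrievable (it requires a role with the can_delete capability). This is a hypothetical sketch: the index name and the interaction.id field path are assumptions to adjust to your setup.

    # Remove all indexed events for a Tweet named in a delete message.
    $ ./splunk search 'index=main "interaction.id"=DELETED_TWEET_ID | delete' \
        -auth admin:YOUR_PASSWORD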

 

Output parameters

output_params.host (required)
    The name or the IP address of the Splunk Enterprise host that DataSift will connect to.

output_params.port (required)
    The port that you want DataSift to use on the target Splunk Enterprise host.

output_params.auth.username (required)
    The name of the Splunk Enterprise admin user.

output_params.auth.password (required)
    The password of the Splunk Enterprise admin user.

output_params.format (optional; default = json_new_line_timestamp_meta)
    The output format for your data:
      • json_new_line_timestamp_meta - Each interaction is sent separately, framed with metadata and an extra timestamp field.
      • json_new_line_timestamp - Each interaction is sent separately, with an extra timestamp property and no metadata.

Take a look at our Sample Output for File-Based Connectors page.

Example values: json_new_line_timestamp_meta
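
For orientation only, a single json_new_line_timestamp line might look roughly like the hypothetical sketch below; the Sample Output page linked above is the authoritative reference.

    {"interaction": {"id": "...", "type": "twitter", "content": "..."}, "timestamp": 1401356820}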

 

Data format delivered: 

JSON format. Each interaction is stored as one document.

Storage type: 

One interaction per document.