Configuring Delivery Size & Frequency

For file-based Push connectors, you can choose how frequently DataSift delivers data and the maximum amount of data you are prepared to accept in each delivery. Two parameters control this:

  • output_params.delivery_frequency
  • output_params.max_size

Database-style connectors deliver data continuously, whereas file-based connectors offer you this additional flexibility.

Data volumes

For low-volume filters, you might find that a setting of 1 minute or 5 minutes for output_params.delivery_frequency suits you best.

If you expect to receive high volumes of data, do not yet know what volume to expect, or are running a Historics query, we recommend that you set output_params.delivery_frequency for your Push connector to continuous delivery or to the shortest interval it supports. This ensures that you receive your data as quickly as possible.

Backward compatibility

The maximum value that you can use for output_params.delivery_frequency is:

  • 5 minutes for live streaming
  • 30 seconds for a Historics query

In the past, DataSift allowed higher values. To avoid breaking existing code, we still accept higher values, but we automatically reduce them so that they do not exceed these limits.
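
The automatic adjustment described above behaves like a simple clamp. The following is a minimal sketch of that idea, not DataSift's actual implementation. It assumes that output_params.delivery_frequency is expressed in seconds; the function name and the "live" and "historics" labels are hypothetical and exist only for this illustration.

```python
# Documented maximums, converted to seconds (the unit is an assumption).
MAX_FREQUENCY_SECONDS = {
    "live": 5 * 60,    # 5 minutes for live streaming
    "historics": 30,   # 30 seconds for a Historics query
}


def effective_delivery_frequency(requested_seconds, source):
    """Return the requested frequency, reduced to the documented maximum.

    `source` is "live" or "historics" (hypothetical labels for this sketch).
    """
    limit = MAX_FREQUENCY_SECONDS[source]
    return min(requested_seconds, limit)


# A legacy live-stream setting of 10 minutes is reduced to 5 minutes.
print(effective_delivery_frequency(600, "live"))       # 300
# A Historics setting of 1 minute is reduced to 30 seconds.
print(effective_delivery_frequency(60, "historics"))   # 30
# Values already within the limit are left unchanged.
print(effective_delivery_frequency(10, "historics"))   # 10
```

If you maintain older code that still sends a higher value, a check like this can help you predict the frequency that will actually apply.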

Be careful to avoid data loss

Take care when you select these parameters, because together they cap the rate at which Push can deliver data. For example, if you set output_params.max_size to 100K and output_params.delivery_frequency to 1 minute, Push will deliver a maximum of 100K per minute. If your stream produces more data than this, some of it will be lost. If you believe you have experienced data loss, check these two parameters first.
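
The arithmetic behind this example can be checked with a few lines of code. This is a rough sketch under stated assumptions: it reads 100K as 100 * 1024 bytes, treats the delivery frequency as a number of seconds, and uses hypothetical function names. It does not cover continuous delivery, where there is no fixed interval to divide by.

```python
def delivery_cap_bytes_per_second(max_size_bytes, delivery_frequency_seconds):
    """Upper bound on throughput: one delivery of max_size per interval."""
    if delivery_frequency_seconds <= 0:
        raise ValueError("this sketch needs a positive delivery interval")
    return max_size_bytes / delivery_frequency_seconds


def exceeds_cap(stream_bytes_per_second, max_size_bytes, delivery_frequency_seconds):
    """True if the stream produces data faster than Push can deliver it."""
    cap = delivery_cap_bytes_per_second(max_size_bytes, delivery_frequency_seconds)
    return stream_bytes_per_second > cap


max_size = 100 * 1024   # "100K", assumed here to mean 102,400 bytes
frequency = 60          # 1 minute, assumed to be expressed in seconds

cap = delivery_cap_bytes_per_second(max_size, frequency)
print(round(cap))                              # 1707 bytes per second
print(exceeds_cap(2000, max_size, frequency))  # True: data would be lost
print(exceeds_cap(1000, max_size, frequency))  # False: within the cap
```

Comparing your stream's observed data rate against this cap is a quick way to judge whether the two parameters could explain missing data.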