Switching streams while they're live

You can change the CSDL of a stream while it is running; that is, while you are consuming it via the API. However, it's important to use the correct procedure to ensure that you do not lose any data:

  1. Keep the original stream running - don't stop it yet
  2. Make the changes to your CSDL
  3. Start consuming the new stream
  4. Stop consuming the original stream
  5. Use the interaction IDs for deduplication

Let's look at each of these steps.

Keep the original stream running

Keep the original stream running so that delivery of data is not interrupted. You can edit your CSDL source code and recompile it while a stream is running.

Make the changes to your CSDL

There are two ways to change the CSDL code. If you created it in the UI, you can edit it there. If you created it using an API call to the /compile endpoint, send your updated CSDL source code to that endpoint again. In either case, DataSift gives you a new hash for the stream because the CSDL has changed.

Start consuming the new stream

Subscribe to the new stream using the new hash.

Stop consuming the original stream

Disconnect from the original stream as soon as the new connection is active, because you might receive duplicate data while both are running.

Use the interaction IDs for deduplication

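While both streams were running, any interaction that matched both the original and the new CSDL might have been delivered twice. Once you have received the data, you can write a simple script that uses the interaction ID embedded in each object to remove the duplicates.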
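As a minimal sketch of such a script, the Python below keeps the first copy of each object and drops later copies with the same interaction ID. It assumes that each object has already been parsed from JSON into a dictionary and that the ID is found at interaction.id; the function name deduplicate and the sample data are hypothetical, so adjust them to match the objects you actually receive.

```python
from typing import Any, Dict, Iterable, Iterator, Set


def deduplicate(interactions: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield each interaction once, keyed on its interaction ID."""
    seen: Set[str] = set()
    for item in interactions:
        # Assumed location of the ID (interaction.id); adjust to your data.
        interaction_id = item.get("interaction", {}).get("id")
        if interaction_id is None:
            # No ID to compare against, so pass the object through unchanged.
            yield item
            continue
        if interaction_id in seen:
            # Already delivered by the other stream during the overlap.
            continue
        seen.add(interaction_id)
        yield item


# Hypothetical sample: "b2" arrived from both streams during the overlap.
old_stream = [{"interaction": {"id": "a1"}}, {"interaction": {"id": "b2"}}]
new_stream = [{"interaction": {"id": "b2"}}, {"interaction": {"id": "c3"}}]

unique_ids = [obj["interaction"]["id"] for obj in deduplicate(old_stream + new_stream)]
print(unique_ids)  # ['a1', 'b2', 'c3']
```

The set of seen IDs only needs to cover the period when both streams were running, so it can be discarded once the original stream has been stopped and its data processed.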

With multistreaming

Everything discussed above applies when you are running a stream in the normal way; for example, via the stream.datasift.com endpoint in the Streaming API.

However, DataSift also offers multistreaming, and if you are using this technique the process is slightly different. With multistreaming, you open a single connection to DataSift and then subscribe to and unsubscribe from streams individually over that connection.

The steps to open a connection and change the code of a running stream are:

  1. Open a connection
  2. Subscribe to a stream
  3. Keep the original stream running - don't stop it yet
  4. Make the changes to your CSDL
  5. Subscribe to the new stream
  6. Unsubscribe from the original stream
  7. Use the interaction IDs for deduplication
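
The ordering of steps 5 and 6 can be sketched in Python as follows. The sketch only builds the two control messages in the order they should be sent; it does not open a connection. The JSON message shape (an "action" field and a "hash" field) is an assumption about the multistreaming protocol, and the function name switch_messages and the placeholder hashes are hypothetical, so check them against the multistreaming documentation before use.

```python
import json
from typing import List


def switch_messages(old_hash: str, new_hash: str) -> List[str]:
    """Return the control messages for switching streams, in send order."""
    return [
        # Subscribe to the new stream first so that no data is missed.
        # Assumed message shape: {"action": ..., "hash": ...}.
        json.dumps({"action": "subscribe", "hash": new_hash}),
        # Only then unsubscribe from the original stream.
        json.dumps({"action": "unsubscribe", "hash": old_hash}),
    ]


# Placeholder hashes; use the hashes DataSift returned for your streams.
messages = switch_messages("ORIGINAL_STREAM_HASH", "NEW_STREAM_HASH")
for message in messages:
    print(message)
```

Sending the messages in this order means that both streams are briefly active on the same connection, which is why the final deduplication step is still needed.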