You can add comments to your CSDL code using C-style syntax: // for comments that run to the end of the line, and /* ... */ for comments that can sit anywhere on a line.

Comment out entire lines like this:

// Filter for comments about places to find a sandwich, coffee, or snack
// Some are national chains, some are regional

twitter.text contains_any "McDonalds, Burger King, Wendy's, 
Five Guys, Whataburger, KFC, Taco Bell, Qdoba, Panera Bread, 
Chick-Fil-A, Dunkin' Donuts, Dairy Queen, Starbucks, Carvel"

Comment out part of a line like this:

/* We could have used just one    */ twitter.text contains "Whataburger"
/* contains_any statement instead */ or twitter.text contains "Five Guys"

Or add a mid-line comment like this:

twitter.text /* maybe try interaction.content */ contains "Whataburger"

Note: in most cases a stream's hash changes whenever you edit the CSDL for that stream. However, adding comments to CSDL does not cause the hash to change because comments are normalized out when the stream is compiled.

CSDL accepts comments anywhere it accepts whitespace. Don't insert comments within quoted strings, because the CSDL parser disables whitespace skipping inside them and treats the comment as part of the string.
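The rule above can be pictured with a small Python sketch. This is not the DataSift compiler: strip_comments is a hypothetical helper written for illustration, and it assumes that strings are delimited by double quotes and that a backslash escapes the next character inside a string. It removes // and /* ... */ comments outside quoted strings and leaves anything inside quotes untouched.

```python
def strip_comments(csdl):
    """Remove // and /* */ comments that appear outside quoted strings.

    Illustrative sketch only; the real CSDL compiler may differ.
    """
    out = []
    i = 0
    n = len(csdl)
    in_string = False
    while i < n:
        ch = csdl[i]
        if in_string:
            out.append(ch)
            # Assumption: a backslash escapes the following character.
            if ch == "\\" and i + 1 < n:
                out.append(csdl[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif csdl.startswith("//", i):
            # Line comment: skip to the end of the line, keep the newline.
            end = csdl.find("\n", i)
            i = n if end == -1 else end
        elif csdl.startswith("/*", i):
            # Block comment: replace it with one space so that the
            # words on either side of it stay separated.
            end = csdl.find("*/", i + 2)
            i = n if end == -1 else end + 2
            out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


with_comment = 'twitter.text /* maybe try interaction.content */ contains "Whataburger"'
without_comment = 'twitter.text contains "Whataburger"'

# Ignoring whitespace differences, both versions reduce to the same filter.
print(strip_comments(with_comment).split() == without_comment.split())

# Comment markers inside a quoted string are part of the string.
print(strip_comments('twitter.text contains "a // b"'))
```

Because the commented and uncommented versions reduce to the same filter, it is plausible that they compile to the same stream hash, as the note above describes.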

Whitespace in the CSDL compiler

Our definition of whitespace is derived from the isspace() function in C. We use:

  • space
  • tab
  • newline
  • form feed
  • carriage return

Word Matching using Contains/Any

DataSift tokenizes each interaction target. It ignores whitespace but treats every punctuation symbol (as defined by the C function ispunct()) as a separate word.

For example, if the input is:

This is, a test

The output is:

<This> <is> <,> <a> <test>

Using this technique, we can match words without allowing punctuation to affect the boundary of what is considered a word. Because the punctuation is not stripped from the text, you can still include it in a filter when you want to.
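As a rough illustration of the tokenization described above, here is a minimal Python sketch. The tokenize function and both constants are hypothetical names, not part of DataSift. The sketch assumes the ASCII punctuation set that ispunct() uses in the default C locale (Python's string.punctuation contains the same characters) and the five whitespace characters listed earlier.

```python
import string

# The five whitespace characters listed in this document.
WHITESPACE = " \t\n\f\r"
# ASCII punctuation, matching ispunct() in the default C locale.
PUNCTUATION = string.punctuation


def tokenize(text):
    """Split text into words, emitting each punctuation symbol as its own token."""
    tokens = []
    word = []
    for ch in text:
        if ch in WHITESPACE:
            # Whitespace ends the current word and is discarded.
            if word:
                tokens.append("".join(word))
                word = []
        elif ch in PUNCTUATION:
            # Punctuation ends the current word and becomes a token itself.
            if word:
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
        else:
            word.append(ch)
    if word:
        tokens.append("".join(word))
    return tokens


print(tokenize("This is, a test"))  # ['This', 'is', ',', 'a', 'test']
print(tokenize("Wendy's"))          # ['Wendy', "'", 's']
```

Under these assumptions, the apostrophe in "Wendy's" becomes a token of its own, which shows how punctuation stays available for matching without being glued to the neighboring words.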