CSDL operators use certain characters to mark the beginning or the end of an argument, or to separate their arguments. This creates a problem when you want to create filters that match those reserved characters literally. The solution is a special notation known as "escaping". An escaped character is preceded by a backslash (\) that "turns off" its special meaning. This means that the backslash character itself has a special meaning, so it too must be escaped with a backslash whenever it is meant literally rather than as an escape for another special character.
Please note that while each operator may use different special characters, they all use the backslash (\) as the escape character.
Things become even more complex when regular expressions are used as arguments, because they too use the backslash as the escape character. In such a case, write your regular expression first, escaping any special characters in it according to regular expression conventions where necessary, and then escape backslashes (\) and double quotes (") for CSDL.
The following examples illustrate how escaping works in CSDL.
Escaping Special Characters In CSDL
CSDL reserves some characters because they have a special meaning. When you are not sure which characters are special, always check which operator you are trying to use, because each operator has a different set of special characters associated with it.
For example, the comma that you use in the contains_any statement separates items in a list; it is not treated as a literal comma. The CSDL code below filters for any of these three characters, a, b, or c:
twitter.text contains_any "a, b, c"
If you wanted to filter for the actual string "a, b, c" you would need to escape the commas like this:
twitter.text contains_any "a\, b\, c"
And if you wanted to filter for either the actual string "a, b, c" or the string "hippopotamus" you would write your CSDL in the following way:
twitter.text contains_any "a\, b\, c, hippopotamus"
Similarly, if you want to filter for a double quote ("), you need to escape it like this:
twitter.text contains_any "\""
And, if for some reason you would like to filter for a string made of two double quotes and a comma (",") your CSDL would look like this:
twitter.text contains_any "\"\,\""
Since the backslash is itself a special character, filtering for a string that contains a backslash (such as "a backslash looks like this \") requires escaping the backslash with another backslash. The following filter matches either the word "backslash" or a single literal backslash:
twitter.text contains_any "backslash, \\"
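The contains_any rules above can be applied mechanically when filters are generated by a program. The following Python sketch is illustrative only: the helper names `escape_contains_any_item` and `build_contains_any` are hypothetical and not part of CSDL or any client library, and it assumes that items are joined with a comma followed by a space, as in the examples above.

```python
def escape_contains_any_item(item: str) -> str:
    """Escape one literal item for use inside a contains_any argument.

    Hypothetical helper: it applies only the three rules shown above
    (backslash, comma, and double quote).
    """
    # Escape backslashes first so the backslashes added below are not doubled.
    escaped = item.replace("\\", "\\\\")
    # A literal comma must be escaped, otherwise it separates list items.
    escaped = escaped.replace(",", "\\,")
    # A literal double quote must be escaped, otherwise it ends the argument.
    escaped = escaped.replace('"', '\\"')
    return escaped


def build_contains_any(target: str, items: list) -> str:
    """Join escaped items with unescaped commas, which act as separators."""
    argument = ", ".join(escape_contains_any_item(item) for item in items)
    return f'{target} contains_any "{argument}"'


# Reproduces the earlier example:
# twitter.text contains_any "a\, b\, c, hippopotamus"
print(build_contains_any("twitter.text", ["a, b, c", "hippopotamus"]))
```

Escaping the backslash before the other characters matters: if commas or quotes were escaped first, the backslashes inserted for them would be doubled in the next step.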
The escaping rules shown in the examples above are for the contains_any operator. The table below shows what needs to be escaped when you are using other operators.
Summary of CSDL Escaping
|contains||\\||no escape||no escape||\"|
|substr||\\||no escape||no escape||\"|
|\\||no escape||no escape||\"|
Escaping Regular Expressions In CSDL (regex_exact, regex_partial)
When you want to write filters using regular expressions in CSDL, you will use either the regex_exact or the regex_partial operator. They allow you to use expressions compatible with the Google re2 regular expression engine. A full discussion of regular expressions is beyond the scope of this document, but we do have examples that should help you get started. Keep in mind that regular expressions are treated as strings in CSDL: you need to surround them with double quotes and escape any double quotes and backslashes they contain. Before you do that, make sure that you have written a regular expression compatible with the re2 syntax. Once you have it, replace every occurrence of a double quote (") with \" and every occurrence of a backslash (\) with \\.
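The two replacement steps described above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the function names `escape_regex_for_csdl` and `build_regex_filter` are hypothetical, and the sketch assumes the pattern passed in is already a valid re2 expression.

```python
def escape_regex_for_csdl(pattern: str) -> str:
    """Escape a finished regular expression so it can be quoted in CSDL."""
    # Double every backslash first, so the backslashes added for the
    # double quotes in the next step are not doubled again.
    escaped = pattern.replace("\\", "\\\\")
    # Then escape every double quote.
    escaped = escaped.replace('"', '\\"')
    return escaped


def build_regex_filter(target: str, operator: str, pattern: str) -> str:
    """Wrap the escaped pattern in double quotes after the chosen operator."""
    if operator not in ("regex_exact", "regex_partial"):
        raise ValueError("operator must be regex_exact or regex_partial")
    return f'{target} {operator} "{escape_regex_for_csdl(pattern)}"'


# The raw regular expression is: \bsaid "\w+"
pattern = r'\bsaid "\w+"'

# Prints: twitter.text regex_partial "\\bsaid \"\\w+\""
print(build_regex_filter("twitter.text", "regex_partial", pattern))
```

Writing the pattern as a Python raw string keeps the regular expression readable before the CSDL escaping is applied.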