Purpose: 

To filter for an exact match with a normalized URL.

Type: 

The url_in operator takes a string argument.

Case Sensitivity: 

Not case sensitive.

Syntax: 

url_in string

Examples: 

1.  Filter for Tweets that include links to apple.com, google.com, or finance.yahoo.com:

Notes: 

When you use the url_in operator, DataSift normalizes the argument in your CSDL code. It automatically:

  • strips off the protocol
  • removes www if it is present, so http://www.datasift.com becomes datasift.com
  • removes UTM codes
  • removes any anchors
  • converts any of the following strings to a forward slash:
    • /index.php
    • /index.html
    • /index.htm
    • /index.aspx
    • /index.asp
    • /default.html
    • /default.htm
    • /default.aspx
    • /default.asp
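
The normalization steps above can be approximated in a few lines of Python. This is only an illustrative sketch, not DataSift's implementation: the function name normalize_url is hypothetical, "UTM codes" are assumed to mean query parameters whose names start with utm_, and the index/default page names are assumed to be replaced only when they end the path.

```python
from urllib.parse import urlsplit

# Page names listed in the notes; each is replaced by a forward slash.
INDEX_PAGES = (
    "/index.php", "/index.html", "/index.htm", "/index.aspx", "/index.asp",
    "/default.html", "/default.htm", "/default.aspx", "/default.asp",
)


def normalize_url(url: str) -> str:
    """Hypothetical helper mirroring the documented normalization steps."""
    # urlsplit only recognizes the host when a scheme or "//" is present.
    if "://" not in url:
        url = "//" + url
    parts = urlsplit(url)

    # Strip the protocol (by ignoring parts.scheme) and a leading "www.".
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Assumption: an index/default page counts only at the end of the path.
    path = parts.path
    for page in INDEX_PAGES:
        if path.lower().endswith(page):
            path = path[: -len(page)] + "/"
            break

    # Assumption: "UTM codes" are query parameters named utm_*.
    kept = [p for p in parts.query.split("&")
            if p and not p.lower().startswith("utm_")]

    # The anchor (parts.fragment) is dropped by never adding it back.
    result = host + path
    if kept:
        result += "?" + "&".join(kept)
    return result


print(normalize_url("http://www.datasift.com"))  # datasift.com
print(normalize_url("https://www.datasift.com/blog/index.html?utm_source=x&id=7#top"))
# datasift.com/blog/?id=7
```

The first call reproduces the example given in the list: http://www.datasift.com becomes datasift.com.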

Then the url_in operator looks for an exact match between any of the normalized URLs in your argument and a normalized URL in an interaction. Since the operator requires an exact match, datasift.com/blog in the CSDL argument will match datasift.com/blog in an interaction, but it will not match datasift.com.

If you have this argument    And a target contains    The url_in operator
in your CSDL:                this URL:                will find a match:

datasift.com/blog            datasift.com/blog        Yes
datasift.com                 datasift.com             Yes
datasift.com/blog            datasift.com             No
datasift.com                 datasift.com/blog        No
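
The exact-match rule in the table can be sketched in Python as follows. The function name url_in_matches is hypothetical, both inputs are assumed to be normalized already, and the argument is assumed to be a comma-separated list of URLs; none of this is taken from DataSift's code.

```python
def url_in_matches(argument: str, target_url: str) -> bool:
    """Return True if target_url exactly equals one URL in the argument.

    Assumptions: both inputs are already normalized, and the argument
    is a comma-separated list. The comparison ignores case because the
    operator is documented as not case sensitive.
    """
    candidates = {item.strip().lower()
                  for item in argument.split(",") if item.strip()}
    return target_url.strip().lower() in candidates


# The four rows of the table:
print(url_in_matches("datasift.com/blog", "datasift.com/blog"))  # True
print(url_in_matches("datasift.com", "datasift.com"))            # True
print(url_in_matches("datasift.com/blog", "datasift.com"))       # False
print(url_in_matches("datasift.com", "datasift.com/blog"))       # False
```

Because the check is set membership rather than a prefix or substring test, a shorter URL in the argument never matches a longer URL in the interaction, or the reverse.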