The normalized version of the original URL. The normalizer performs the following actions on every link that it finds in every incoming interaction:

  • Removes "www."
  • Converts the URL to lower case
  • Removes any of the following:
    • /default.html
    • /default.htm
    • /default.aspx
    • /default.asp
    • /index.php
    • /index.html
    • /index.htm
    • /index.aspx
    • /index.asp
  • Removes the trailing slash from the end of the URL
  • Removes any trailing anchor (the part of the URL that begins with "#")
  • Removes Urchin Tracking Module (UTM) tags
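
The steps listed above can be approximated with a short Python sketch. This is not DataSift's implementation: the function name normalize_url, the order in which the steps are applied, the assumption that UTM tags are query parameters whose names begin with "utm_", and the example.com URLs are all illustrative assumptions. The sketch also assumes an absolute URL that includes a scheme.

```python
from urllib.parse import urlsplit, urlunsplit

# Default-document names from the list above; removed when they end the path.
DEFAULT_PAGES = (
    "/default.html", "/default.htm", "/default.aspx", "/default.asp",
    "/index.php", "/index.html", "/index.htm", "/index.aspx", "/index.asp",
)


def normalize_url(url: str) -> str:
    """Approximate the listed normalization steps (hypothetical helper)."""
    # Lower-case first so that every later comparison is case-insensitive.
    scheme, netloc, path, query, _fragment = urlsplit(url.lower())

    # Remove a leading "www." from the host name.
    if netloc.startswith("www."):
        netloc = netloc[len("www."):]

    # Remove a default-document name from the end of the path.
    for page in DEFAULT_PAGES:
        if path.endswith(page):
            path = path[: -len(page)]
            break

    # Remove trailing slashes.
    path = path.rstrip("/")

    # Assumption: UTM tags are query parameters named utm_*.
    kept = [p for p in query.split("&") if p and not p.startswith("utm_")]

    # Passing an empty fragment drops any trailing anchor.
    return urlunsplit((scheme, netloc, path, "&".join(kept), ""))


print(normalize_url("http://www.Example.com/news/"))
# http://example.com/news
print(normalize_url("http://WWW.Example.com/Index.html?utm_source=feed&id=7#top"))
# http://example.com?id=7
```

Running candidate URLs through a helper like this before writing a filter gives a rough idea of the form a link is likely to take in links.normalized_url, although the real normalizer may differ in details.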

Note that normalization is performed on interactions before they reach DataSift's filtering engine.

For example, if the original URL was this:

its normalized version would be:

Write your CSDL so that it does not filter for elements that are removed by the normalization process. For example, if you filter in the links.normalized_url target for this:

you will receive no data at all, because "www." is removed from every link in every interaction prior to filtering.
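
One way to catch this mistake early is to check a filter value for elements that normalization removes before putting it into CSDL. The sketch below is a hypothetical helper, not part of any DataSift tool; the name normalization_conflicts and the example.com values are assumptions, and the checks cover only some of the rules listed earlier.

```python
def normalization_conflicts(value: str) -> list[str]:
    """List reasons a filter value could never equal a normalized URL."""
    problems = []
    if "://www." in value or value.startswith("www."):
        problems.append('contains "www."')
    if value != value.lower():
        problems.append("contains upper-case characters")
    if value.endswith("/"):
        problems.append("ends with a trailing slash")
    if "#" in value:
        problems.append("contains an anchor")
    if "utm_" in value:
        problems.append("contains a UTM tag")
    return problems


print(normalization_conflicts("http://www.example.com/"))
# ['contains "www."', 'ends with a trailing slash']
print(normalization_conflicts("http://example.com/news"))
# []
```

An empty list does not guarantee a match; it only means that none of the checked conflicts was found.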


  1. Filter for posts that contain links that point to a specified page:

links.normalized_url == ""


Remember that some URLs may pass through redirect services such as Captcha. In that case, we recommend that you filter against the links.hops target as well as links.normalized_url. If there is a match, this target contains a record of the specified and redirected normalized URL.

links.normalized_url == ""
or links.hops url_in ""

Resource information

Target service: Augmentation Target: Links

Target object: Links: General

Type: array(string)

Array: Yes

Always exists: No