The normalized version of an original URL. The normalizer performs the following actions on every link that it finds in every incoming interaction:

  • Removes "www."
  • Converts the URL to lower case
  • Removes any of the following:
    • /default.html
    • /default.htm
    • /default.aspx
    • /default.asp
    • /index.php
    • /index.html
    • /index.htm
    • /index.aspx
    • /index.asp
  • Removes the trailing slash from the end of the URL
  • Removes any trailing anchor (the part of the URL that starts with #)
  • Removes Urchin Tracking Module (UTM) tags

Note that normalization is performed on interactions before they reach DataSift's filtering engine.
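
The rules listed above can be sketched in Python as follows. This is an illustration only, not DataSift's implementation. The function name normalize_url, the example.com URLs, and the order of the steps are assumptions. So are the choices to strip "www." only at the start of the host, to remove a default or index page only at the end of the path, and to treat UTM tags as query parameters whose names begin with utm_.

```python
from urllib.parse import urlsplit, urlunsplit

# Default and index pages named in the list above.
DEFAULT_PAGES = (
    "/default.html", "/default.htm", "/default.aspx", "/default.asp",
    "/index.php", "/index.html", "/index.htm", "/index.aspx", "/index.asp",
)


def normalize_url(url: str) -> str:
    """Approximate the documented normalization steps (hypothetical helper)."""
    # Convert to lower case first so the later comparisons are case-insensitive.
    parts = urlsplit(url.strip().lower())

    # Assumption: "www." is removed only when it starts the host name.
    host = parts.netloc
    if host.startswith("www."):
        host = host[len("www."):]

    # Assumption: a default or index page is removed only at the end of the path.
    path = parts.path
    for page in DEFAULT_PAGES:
        if path.endswith(page):
            path = path[: -len(page)]
            break

    # Remove a single trailing slash.
    if path.endswith("/"):
        path = path[:-1]

    # Assumption: UTM tags are query parameters whose names start with "utm_".
    kept = [p for p in parts.query.split("&") if p and not p.startswith("utm_")]

    # Passing an empty fragment drops any trailing anchor.
    return urlunsplit((parts.scheme, host, path, "&".join(kept), ""))


print(normalize_url("https://www.example.com/news/index.php#comments"))
# https://example.com/news
```

Under these assumptions, a filter value is safe to use only if passing it through such a function leaves it unchanged.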

For example, if the original URL was this:

its normalized version would be:

Write your CSDL so that it does not filter for elements that are removed by the normalization process. For example, if you filter in the links.normalized_url target for this:

you will receive no data at all, because "www." is removed from every link in every interaction prior to filtering.
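
One way to catch this mistake before deploying a filter is to check the filter value for elements that normalization removes. The sketch below is a hypothetical helper, not part of any DataSift tooling. The name removed_elements, the regular expressions, and the example.com URLs are assumptions based on the rules listed at the top of this page.

```python
import re

# Each check pairs a description with a pattern for something the
# normalizer is documented to remove. The patterns are assumptions.
CHECKS = [
    ("contains 'www.'", re.compile(r"(^|//)www\.", re.I)),
    ("contains upper-case characters", re.compile(r"[A-Z]")),
    ("ends with a default or index page",
     re.compile(r"/(default\.(html?|aspx?)|index\.(php|html?|aspx?))$", re.I)),
    ("ends with a trailing slash", re.compile(r"/$")),
    ("contains an anchor", re.compile(r"#")),
    ("contains a UTM tag", re.compile(r"[?&]utm_", re.I)),
]


def removed_elements(filter_value: str) -> list[str]:
    """Return descriptions of the parts of filter_value that normalization removes."""
    return [label for label, pattern in CHECKS if pattern.search(filter_value)]


print(removed_elements("http://www.example.com/"))
# ["contains 'www.'", 'ends with a trailing slash']
print(removed_elements("http://example.com/blog"))
# []
```

An empty list suggests that the value can match normalized links. Any other result names what to strip from the filter value first.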


  1. Filter for posts that contain links that point to a specified page:

    links.normalized_url == ""


Also see the Filtering by Shared Links example.

The links.* targets contain any link in an interaction. They are more frequently populated than the and targets.

Resource information

Target service: PYLON for Facebook Topic Data

Target object: Links

Type: array(string)

Array: Yes

Tokenized for query filters: Yes

Interaction filter: Yes

Analysis target: Yes

Query filter: Yes

Child analysis target: No