The normalized version of an original URL. The normalizer performs the following actions on every link that it finds in every incoming interaction:
- Removes "www."
- Converts the URL to lower case
- Removes any of the following:
- Removes the trailing slash from the end of the URL
- Removes any trailing anchor (the "#" fragment at the end of the URL)
- Removes Urchin Tracking Module (UTM) tags, such as utm_source, utm_medium, and utm_campaign
Note that normalization is performed on interactions before they reach DataSift's filtering engine, so filters only ever see the normalized form.
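The steps listed above can be approximated in a few lines of Python. This is a minimal sketch for reasoning about what a normalized URL is likely to look like; it is not DataSift's normalizer. The function name normalize_url is hypothetical, and the sketch assumes that UTM tags are query parameters whose names begin with "utm_" and that the trailing slash is taken from the end of the path.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Approximate the documented normalization steps (illustrative only)."""
    # Convert the whole URL to lower case.
    parts = urlsplit(url.lower())

    # Remove a leading "www." from the host.
    host = parts.netloc
    if host.startswith("www."):
        host = host[len("www."):]

    # Assumption: UTM tags are query parameters named utm_*.
    kept = [p for p in parts.query.split("&") if p and not p.startswith("utm_")]
    query = "&".join(kept)

    # Assumption: the trailing slash is removed from the end of the path.
    path = parts.path
    if path.endswith("/"):
        path = path[:-1]

    # Passing an empty fragment drops any trailing anchor.
    return urlunsplit((parts.scheme, host, path, query, ""))


print(normalize_url("http://www.Example.com/Some/Page/?utm_source=feed&id=7#top"))
```

The URL in the print call is a made-up example. Running candidate URLs through a sketch like this before writing a filter can help you check whether the value you plan to match could survive normalization.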
For example, if the original URL was this:
its normalized version would be:
Write your CSDL so that it does not filter for elements that are removed by the normalization process. For example, if you filter in the links.normalized_url target for this:
you will receive no data at all, because "www." is removed from every link in every interaction prior to filtering.
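One way to catch this mistake early is to check a filter value for the elements that normalization removes before you use it in CSDL. The following Python sketch is illustrative only: the helper name unmatchable_reasons and the example URLs are hypothetical, and the checks assume that UTM tags are query parameters whose names begin with "utm_".

```python
from urllib.parse import urlsplit


def unmatchable_reasons(filter_value: str) -> list:
    """List elements in a filter value that normalization would have removed."""
    reasons = []
    parts = urlsplit(filter_value)

    # "www." is stripped from every link before filtering.
    if parts.netloc.lower().startswith("www."):
        reasons.append('contains "www."')

    # Trailing anchors are stripped.
    if "#" in filter_value:
        reasons.append("contains an anchor")

    # Assumption: UTM tags are query parameters named utm_*.
    if any(p.lower().startswith("utm_") for p in parts.query.split("&")):
        reasons.append("contains UTM tags")

    # A trailing slash at the end of the URL is stripped.
    if filter_value.endswith("/"):
        reasons.append("ends with a trailing slash")

    return reasons


print(unmatchable_reasons("http://www.example.com/page/"))
print(unmatchable_reasons("http://example.com/page"))
```

An empty list means the value contains none of the removed elements checked here; it does not guarantee that the filter will match any interactions.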
Filter for posts that contain links that point to a specified page:
links.normalized_url == "http://nytimes.com/2013/05/01/dining/making-lunch-with-michael-pollan-and-michael-moss.html"
Also see the Filtering by Shared Links example.
Target service: PYLON for Facebook Topic Data
Target object: Links
Tokenized for query filters: Yes
Interaction filter: Yes
Analysis target: Yes
Query filter: Yes
Child analysis target: No