The normalized version of the original URL. The normalizer performs the following actions on every link that it finds in every incoming interaction:
- Removes "www."
- Converts the URL to lower case
- Removes any of the following:
- Removes trailing slash from the end of the URL
- Removes any trailing anchor hash tags
- Removes Urchin Tracking Module tags
Note that normalization is performed on interactions before they reach DataSift's filtering engine.
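The listed steps can be approximated in a few lines of Python. This is only a rough sketch for reasoning about what a normalized URL looks like, not DataSift's actual normalizer. The function name normalize_url is hypothetical, and the order of the steps, the "utm_" prefix test, and the handling of a slash that precedes a query string are all assumptions. The "Removes any of the following" step is left out because its items are not listed here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Approximate the listed normalization steps (hypothetical helper)."""
    # Convert the whole URL to lower case, then split it into components.
    parts = urlsplit(url.lower())

    # Remove a leading "www." from the host name.
    netloc = parts.netloc
    if netloc.startswith("www."):
        netloc = netloc[len("www."):]

    # Remove Urchin Tracking Module tags, assumed here to be any
    # query parameter whose name starts with "utm_".
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if not key.startswith("utm_")]
    query = urlencode(kept)

    # Remove one trailing slash from the path (assumption: this applies
    # even when a query string follows the path).
    path = parts.path[:-1] if parts.path.endswith("/") else parts.path

    # Passing an empty fragment drops any trailing anchor.
    return urlunsplit((parts.scheme, netloc, path, query, ""))


print(normalize_url("http://WWW.Example.com/MyPage/?utm_source=feed&xyz=42#top"))
# prints: http://example.com/mypage?xyz=42
```

Because urlencode re-encodes the remaining query parameters, unusual percent-encoding in the input may come out in a slightly different form; treat the result as an aid for writing filters rather than as an exact prediction.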
For example, if the original URL was this:
its normalized version would be:
Write your CSDL so that it does not filter for elements that are removed by the normalization process. For example, if you filter in the links.normalized_url target for this:
you will receive no data at all, because "www." is removed from every link in every interaction prior to filtering.
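One way to catch this mistake before deploying a filter is to check the filter URL for elements that normalization removes. The sketch below is a hypothetical helper, not part of any DataSift library. The name normalization_conflicts and its messages are assumptions, and it covers only the removal steps listed above.

```python
from typing import List
from urllib.parse import urlsplit


def normalization_conflicts(filter_url: str) -> List[str]:
    """Return reasons why filter_url could never equal a normalized URL."""
    problems = []
    parts = urlsplit(filter_url)

    if parts.netloc.lower().startswith("www."):
        problems.append('contains "www."')
    if filter_url != filter_url.lower():
        problems.append("contains upper-case characters")
    if parts.path.endswith("/"):
        problems.append("ends with a trailing slash")
    if parts.fragment:
        problems.append("contains an anchor")
    # Assumption: UTM tags are query parameters named "utm_...".
    if any(p.lower().startswith("utm_") for p in parts.query.split("&")):
        problems.append("contains UTM tags")

    return problems


for reason in normalization_conflicts("http://www.example.com/mypage"):
    print("Filter will never match:", reason)
```

An empty list means the URL contains none of the elements checked here; it does not guarantee that the filter will match any interaction.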
- Filter for posts that contain links that point to a specified page:
links.normalized_url == "http://nytimes.com/2013/05/01/dining/making-lunch-with-michael-pollan-and-michael-moss.html"
Remember that some URLs may be subject to redirect services such as Captcha. In that situation, we recommend that you filter against the links.hops target as well as links.normalized_url. If there is a match, links.hops contains a record of both the specified and the redirected normalized URL.
links.normalized_url == "http://example.com/mypage?xyz=42" or links.hops url_in "http://example.com/mypage?xyz=42"
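If you generate filters like this for many URLs, a small helper keeps the two targets in step. The following is a minimal sketch; the function name csdl_for_url is hypothetical, and it simply rejects URLs that contain a double quote rather than assuming any particular CSDL escaping rule.

```python
def csdl_for_url(normalized_url: str) -> str:
    """Build a CSDL filter that checks both the final URL and its hops."""
    # Refuse input that would break the quoted CSDL string; escaping
    # rules are deliberately not assumed in this sketch.
    if '"' in normalized_url:
        raise ValueError("URL must not contain a double quote")

    return (
        f'links.normalized_url == "{normalized_url}" '
        f'or links.hops url_in "{normalized_url}"'
    )


print(csdl_for_url("http://example.com/mypage?xyz=42"))
```

The URL passed in should already be in normalized form, since links.normalized_url is compared against the normalized value.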
Target service: Augmentation Target: Links
Target object: Links: General
Always exists: No