Browse the Links augmentation targets.
The Links augmentation looks at any links within the content of a message and resolves them to their final endpoint. At the same time, it retrieves content from the linked page so that you can filter against the content of the page that the link points to.
DataSift follows all types of shortened links (for example, bit.ly and Twitter's own t.co shortener), following each redirect until it reaches the final web page. The final resolved link is also exposed as links.url, so you can filter against it.
The Links augmentation works in near real time; only links that have not previously been discovered are taken out of the real-time flow, and they are re-inserted into the flow of data, normally in under two seconds.
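The resolution step described above can be pictured with a small Python sketch. It is an illustration only, not DataSift's implementation: the REDIRECTS table is a hypothetical stand-in for real HTTP lookups, and resolve_link, cache, and max_hops are names invented for this example. The cache plays the role of "links that have previously been discovered".

```python
# Hypothetical redirect table standing in for real HTTP redirect lookups.
REDIRECTS = {
    "http://t.co/abc": "http://bit.ly/xyz",
    "http://bit.ly/xyz": "http://example.com/article",
}


def resolve_link(url, redirects, cache, max_hops=10):
    """Follow redirects from url until a URL with no further redirect is reached.

    Links already present in cache are returned immediately, which mirrors
    the idea that only previously unseen links need to be resolved.
    """
    if url in cache:
        return cache[url]
    current = url
    hops = 0
    while current in redirects:
        if hops >= max_hops:
            # Guards against redirect loops and very long chains.
            raise ValueError("too many redirects for " + url)
        current = redirects[current]
        hops += 1
    cache[url] = current
    return current
```

Calling resolve_link("http://t.co/abc", REDIRECTS, {}) walks both hops and returns the final example.com URL; a second call with the same cache skips the walk entirely.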
How it Works
Here are the key points you need to know first:
- We resolve all links even if they are shortened.
- We follow all redirects through to the final URL.
- We do this in near real time, so new links are resolved almost immediately.
- We fetch the content (currently just the title) from the page that a link points to.
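As a rough illustration of the last point, the following Python sketch pulls the title out of an HTML page using only the standard library. It is an assumption about how such a step could look, not a description of DataSift's fetcher; TitleParser and extract_title are hypothetical names, and the HTML is passed in as a string so that no network access is needed.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so collect and join later.
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html):
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.title_parts).strip()
```

A value produced this way is the kind of string that a filter on links.title would be matched against.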
You can filter against the title of a linked page:
links.title contains "something"
You can filter against specific domains. We use the in operator here rather than contains because this target is an array of strings:
links.domain in "yahoo.com, nytimes.com"
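To make the behaviour of this filter concrete, here is a minimal Python sketch of matching an array of domains against a comma-separated list. It assumes exact string comparison after trimming whitespace; the real CSDL operator may differ in details such as case handling. The function name csdl_in is hypothetical.

```python
def csdl_in(values, argument):
    """Return True if any element of values appears in the comma-separated argument.

    values   -- list of strings, one per link (for example, the link domains)
    argument -- the quoted list from the filter, e.g. "yahoo.com, nytimes.com"
    """
    wanted = {item.strip() for item in argument.split(",") if item.strip()}
    return any(value in wanted for value in values)
```

With this sketch, an object whose links point at nytimes.com and example.org matches the filter above, because one of its domains is in the list.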
You can filter against the retweet count:
links.retweet_count > 1000
Note that this example is not meaningful if the object contains more than one link, because each link has its own retweet count.
Multiple Links in One Input Object
An input object might contain more than one link so the Links augmentation is designed to handle multiple links. The targets for the Links augmentation are arrays of strings or arrays of integers. There is one array element for each link. For example, for a Tweet that contains three links, there will be three array elements.
DataSift keeps the array elements in step automatically, so the element at a given position in each array (for example, in links.title and links.url) refers to the same link.
You perform operations on these arrays as if they were simple strings or integers. For example, the following filter succeeds if it finds a match on at least one element in the array.
links.title contains "Cincinnati Bengals"
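The "at least one element" rule, together with the way the arrays stay in step, can be sketched in Python as follows. This is an illustration under stated assumptions: it uses plain case-sensitive substring matching, which may not mirror CSDL's contains operator exactly, and the function names and sample data are hypothetical.

```python
def contains_any(values, needle):
    """Return True if needle occurs in at least one element of values."""
    return any(needle in value for value in values)


def matching_links(titles, urls, needle):
    """Return the URLs whose title at the same position contains needle.

    Relies on titles and urls being kept in step, one element per link.
    """
    return [url for title, url in zip(titles, urls) if needle in title]


# Hypothetical object with three links.
titles = [
    "Cincinnati Bengals sign new coach",
    "Weather forecast for Ohio",
    "Stock market update",
]
urls = [
    "http://example.com/bengals",
    "http://example.com/weather",
    "http://example.com/stocks",
]
```

Here contains_any(titles, "Cincinnati Bengals") is true because the first title matches, and matching_links uses the shared positions to recover which link produced the match.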