Managed Sources 2016+ Architecture: What's Changed

Here we explain the differences between the existing and 2016+ Managed Sources architectures in terms of what has been added, changed, and removed.

What's new?

Aggregated "Comment Count" and "Like Count" interactions

As a replacement for individual "Like" interactions, we have introduced aggregated "Comment Count" and "Like Count" interactions, which regularly return the number of Likes or Comments on a specific interaction at that point in time. These interaction types ensure that you receive accurate point-in-time engagement counts for the posts collected from the resources you are monitoring.

These interaction types provide an aggregated count only, and do not expose details of the individuals performing the Likes or Comments. The value of the "Like Count" interaction type for Facebook Pages is a sum of all Likes and Reactions; Reactions are not treated separately at present.

Aggregated "Share Count" interactions for Facebook Pages

These aggregated interactions have the same format as Like Count and Comment Count interactions, and show how many times a specific Facebook post has been shared.

This interaction type provides an aggregated count only, and does not expose details of the individuals performing the Share.

"Page Mention" interactions for Facebook

We now support Facebook "Page Mentions". This allows you to collect not only interactions posted from or to the Pages you are monitoring, but also all public posts in which the Page has been tagged.

Additional resource data available in each interaction

The source.resource_id and source.identity_id fields have been added to each Managed Source interaction, alongside source.id, to tell you which Managed Source resource (Page, Tag, user, etc.) and access token were responsible for returning each specific interaction:

{
  ...
  "source": {
    "identity_id": "a7c92ff18bf46cbe18c78479722b867c",
    "resource_id": "26549c827cc4e2fae35b3ba6b15a5a61",
    "id": "6be34bab475640d597b922312d75c2e1"
  }
}
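
As a minimal sketch of how these fields might be used on the receiving side, the Python below groups delivered interactions by source.resource_id. It assumes each interaction has already been parsed into a dict shaped like the sample above; the function name and the second resource ID are hypothetical.

```python
from collections import defaultdict


def group_by_resource(interactions):
    """Group interaction dicts by their source.resource_id value."""
    groups = defaultdict(list)
    for interaction in interactions:
        source = interaction.get("source", {})
        # Interactions with no resource_id are collected under the key None.
        groups[source.get("resource_id")].append(interaction)
    return dict(groups)


# The first resource_id is taken from the sample above; the second is made up.
sample = [
    {"source": {"resource_id": "26549c827cc4e2fae35b3ba6b15a5a61"}},
    {"source": {"resource_id": "26549c827cc4e2fae35b3ba6b15a5a61"}},
    {"source": {"resource_id": "hypothetical-second-resource"}},
]
grouped = group_by_resource(sample)
```

Grouping on identity_id instead would work the same way if you want to see which access token returned each interaction.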

New source.resource_id CSDL target

As well as adding the source.resource_id field to your output data, we are making this value available as a CSDL target, so that you can filter specifically for interactions returned by a given Managed Source resource. See the source.resource_id target documentation for full details.
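
The sketch below shows one way a filter on this target could be assembled in Python. It assumes the target accepts the standard CSDL == string comparison and that clauses can be combined with OR; check the target documentation for the operators it actually supports. The helper name is hypothetical.

```python
def resource_filter(resource_ids):
    """Build a CSDL expression matching any of the given resource IDs.

    Assumes source.resource_id supports the == operator; this is not
    confirmed by the text above.
    """
    if not resource_ids:
        raise ValueError("at least one resource ID is required")
    clauses = ['source.resource_id == "{}"'.format(rid) for rid in resource_ids]
    return " OR ".join(clauses)


csdl = resource_filter(["26549c827cc4e2fae35b3ba6b15a5a61"])
```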

What's changed?

Smarter back-off for Facebook Pages Managed Sources

Facebook's API returns notifications when API requests made with access tokens generated by a specific app are approaching their request rate limit. We now acknowledge these notifications, and back off the requests being made with these access tokens until the notifications go away.

This behaviour has allowed us to reduce cases where all requests from that app have been rate-limited by Facebook. See Facebook's Rate Limiting documentation for full details.
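
To illustrate the general idea, here is a minimal back-off sketch in Python. It is not DataSift's implementation: the doubling factor, the delay limits, and the reset once warnings stop are all assumptions chosen for the example, and the function name is hypothetical.

```python
def next_delay(current_delay, warned, base_delay=1.0, max_delay=300.0, factor=2.0):
    """Return the number of seconds to wait before the next request.

    While rate-limit warnings keep arriving, the delay grows geometrically
    up to max_delay; once they stop, it resets to base_delay.
    """
    if warned:
        return min(max(current_delay, base_delay) * factor, max_delay)
    return base_delay


# Three consecutive warnings followed by a clean response.
delay = 1.0
history = []
for warned in [True, True, True, False]:
    delay = next_delay(delay, warned)
    history.append(delay)
# history is now [2.0, 4.0, 8.0, 1.0]
```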

More helpful log messages across all Managed Sources

We have reviewed and improved error and warning messages returned in your Managed Source logs.

Interactions are no longer incorrectly deduplicated

DataSift performs interaction deduplication to prevent you from receiving interactions more than once. However, there is an edge case in the older Managed Sources architecture that could lead to interactions being deduplicated incorrectly.

As an example, we'll work through a case where we are filtering for two Instagram tags to monitor a surfing competition: #RipcurlPro and #BellsBeach. Some posts will contain both of these tags. One post, for example, may be from a fan and read "Enjoying the 2017 #RipcurlPro at #BellsBeach".

  • The older Managed Sources architecture will return this interaction the first time, and when it collects the interaction again by searching for the second tag, that interaction will be filtered out, and will not be delivered.
  • The newer Managed Sources architecture will deliver that interaction a second time, but will label it with the resource_id of the resource (Tag) responsible for returning that interaction (see the Additional resource data available in each interaction section for more information).
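
The difference between the two behaviours can be modelled with a small Python sketch. This only mimics the outcome described above and says nothing about how deduplication is implemented internally; the post_id key and the resource names are hypothetical, and it assumes that a post collected twice through the same resource is still delivered only once.

```python
def dedupe(interactions, per_resource):
    """Drop repeats, keyed on the post alone or on (post, resource)."""
    seen = set()
    delivered = []
    for item in interactions:
        if per_resource:
            key = (item["post_id"], item["resource_id"])
        else:
            key = item["post_id"]
        if key in seen:
            continue
        seen.add(key)
        delivered.append(item)
    return delivered


# The same post collected once through each of the two tags.
collected = [
    {"post_id": "p1", "resource_id": "tag-ripcurlpro"},
    {"post_id": "p1", "resource_id": "tag-bellsbeach"},
]
older = dedupe(collected, per_resource=False)  # one interaction delivered
newer = dedupe(collected, per_resource=True)   # both delivered, one per tag
```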

API changes

The API itself has not changed; you will still use the /source/* API endpoints to manage your Managed Sources. However, with the changes to the interaction subtypes you can request, we now provide a number of new parameters which can be requested for each source. The parameters objects used when creating Instagram and Facebook sources will now look like the following:

Facebook parameter options

{
  ...
  "parameters": {
    "posts_by_others": true,
    "comments": true,
    "comment_counts": true,
    "like_counts": true,
    "share_counts": true,
    "page_likes": true,
    "tagged": true
  }
}

Instagram parameter options

{
  ...
  "parameters": {
    "comments": true,
    "like_count": true,
    "comment_count": true
  }
}
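
Note that the Facebook count parameters are plural (comment_counts, like_counts) while the Instagram ones are singular (comment_count, like_count). A small validation helper such as the Python sketch below can catch a mix-up before a request is sent. The helper name and the "facebook" and "instagram" labels are hypothetical and local to this example; only the parameter names come from the objects above.

```python
ALLOWED_PARAMETERS = {
    "facebook": {
        "posts_by_others", "comments", "comment_counts", "like_counts",
        "share_counts", "page_likes", "tagged",
    },
    "instagram": {"comments", "like_count", "comment_count"},
}


def build_parameters(source, **flags):
    """Return a parameters dict, rejecting names the source does not list."""
    allowed = ALLOWED_PARAMETERS[source]
    unknown = sorted(set(flags) - allowed)
    if unknown:
        raise ValueError("unknown {} parameters: {}".format(source, unknown))
    return {name: bool(value) for name, value in flags.items()}


instagram_params = build_parameters("instagram", comments=True, like_count=True)
```

Calling build_parameters("instagram", like_counts=True) raises a ValueError, because like_counts is the Facebook spelling.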

What's been removed?

Individual "Like" interactions

To improve performance, we have removed individual Like interactions. In some cases, we saw up to 90% of our platform resources being used to fetch and process individual Like interactions.

Individual Like interactions were also misleading in some cases: neither Instagram nor Facebook provides a timestamp on an individual Like, so the only timestamp provided with these interactions was the time at which DataSift pulled the interaction. Individual Like interactions have been replaced with aggregate Like Count interactions, which provide far more accurate tracking of Like counts over time for each post.
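
As a rough illustration of tracking Like counts over time, the Python sketch below turns a series of point-in-time counts for one post into the change between consecutive readings. The (timestamp, count) tuple shape and the sample values are hypothetical; they do not reflect the actual Like Count interaction format.

```python
def like_deltas(snapshots):
    """Return (timestamp, change) pairs between consecutive count readings.

    snapshots is a list of (timestamp, like_count) tuples for a single post.
    """
    ordered = sorted(snapshots)
    return [
        (later_time, later_count - earlier_count)
        for (_, earlier_count), (later_time, later_count)
        in zip(ordered, ordered[1:])
    ]


# Made-up readings: 10 Likes, then 25, then 31.
readings = [(1000, 10), (1600, 25), (2200, 31)]
deltas = like_deltas(readings)
# deltas is now [(1600, 15), (2200, 6)]
```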