When should I use interaction.author.id and twitter.user.id?
Let's say you want to track a list of Twitter users by their Twitter user ID. DataSift offers several ways to do this, and they do not all return the same results.
Let's start with perhaps the most obvious method:
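A minimal sketch of such a filter, assuming a short list of tracked users. The user IDs shown are placeholders, not real accounts:

```
// Placeholder IDs (assumed for illustration); substitute the users you track
twitter.user.id in [11111111, 22222222, 33333333]
```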
twitter.user.id will filter on any Tweets originally sent by your list of users. This target will NOT return any retweets. For an explanation of why this is the case, take a look at some of our example Twitter data. In a regular Tweet the twitter.user.id field exists, but in a retweet it does not. In its place you will find twitter.retweet.user.id (the user who sent the retweet) and twitter.retweeted.user.id (the author of the original Tweet).
So, if you are only interested in original Tweets sent by the users you are tracking (and not retweets), the CSDL in the example above should do the trick. If you are also interested in any retweets sent by the users you are tracking, you might consider using the following CSDL:
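A sketch of what this could look like, again with placeholder user IDs. It assumes twitter.retweet.user.id is the target that matches the user who sent the retweet:

```
// Original Tweets sent by the tracked users (placeholder IDs)
twitter.user.id in [11111111, 22222222, 33333333]
// Retweets sent by the same users: the whole ID list has to be repeated
OR twitter.retweet.user.id in [11111111, 22222222, 33333333]
```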
Although this would work as expected, every user ID now appears twice in the filter, which will essentially double the cost of your stream. (See this post on CSDL Optimization Techniques for more details.) This is where interaction.author.id comes in.
interaction.author.id is the user ID of the user who sent the interaction, whether it be a Tweet, retweet, Facebook post or Wikipedia edit. This user ID will map to their user ID on the service from which they sent the interaction.
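As a rough sketch, the same list of tracked users can be written once against interaction.author.id. The user IDs are placeholders, and the interaction.type condition is an assumption about which data source you want to keep:

```
// Keep only Twitter content; interaction.* targets span all data sources
interaction.type == "twitter"
// One ID list (placeholder IDs) matches both Tweets and retweets by these users
AND interaction.author.id in [11111111, 22222222, 33333333]
```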
This final example is the most cost-efficient way to ensure you receive both Tweets and retweets from your list of tracked users. When using interaction.* targets, it is usually worth adding the interaction.type target so that the content you receive comes only from the data sources you are interested in.