The language a Retweet was written in, as identified by Twitter's machine language detection algorithms. The values are valid BCP 47 language identifiers, and may represent any of the languages listed on Twitter's advanced search page, or "und" if no language could be detected.
DataSift also has its own language detection mechanism, provided by our Language augmentation.
Remember that there is a third way to find out which language a user prefers: examine the language the author of a Tweet selected on their Settings page on Twitter. You can filter against this in twitter.user.lang, twitter.retweet.user.lang, or twitter.retweeted.user.lang. Take care, though. Users choose this language from a drop-down list, and it is usually their main language, but there is no guarantee that it is the language they actually write in or, most importantly, that it is the language of the current Tweet. Many users write messages in several languages, so the language of a Tweet can differ from the main language specified in its author's profile.
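To make the discrepancy concrete, here is a minimal Python sketch that compares the detected language of a Retweet with its author's profile language. It is an illustration only, not part of DataSift's API: the nested dictionary layout simply mirrors the dotted target names and is an assumption, as are the helper names get_path and language_mismatch. Comparing only the primary subtag (for example "en" in "en-gb") is a simplification.

```python
from typing import Optional


def get_path(doc: dict, path: str) -> Optional[str]:
    """Walk a dotted name such as 'twitter.retweet.lang' through nested dicts.

    Hypothetical helper: the nested layout is assumed, not documented.
    """
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, str) else None


def language_mismatch(interaction: dict) -> bool:
    """Return True when the detected and profile languages disagree."""
    detected = get_path(interaction, "twitter.retweet.lang")
    profile = get_path(interaction, "twitter.retweet.user.lang")
    # Nothing to compare if either value is missing or undetected ("und").
    if detected is None or profile is None or detected.lower() == "und":
        return False
    # Compare primary subtags only, e.g. "en" for both "en" and "en-gb".
    return detected.lower().split("-")[0] != profile.lower().split("-")[0]


# A Retweet detected as French from an author whose profile says English.
sample = {"twitter": {"retweet": {"lang": "fr", "user": {"lang": "en"}}}}
print(language_mismatch(sample))
```

A mismatch like this does not mean either value is wrong; it only shows why the profile language is a weak proxy for the language of an individual Tweet.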
- Filter for Retweets in English, French, German, Spanish, or Italian:
twitter.retweet.lang in "en, fr, de, es, it"
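To try out the logic of this filter locally, the following Python sketch models it as a membership test. It is not DataSift's implementation. It assumes that the in operator treats its argument as a comma-separated list, ignores whitespace around each item, and matches case-insensitively. The function name matches_in is hypothetical.

```python
from typing import Optional


def matches_in(value: Optional[str], csv_list: str) -> bool:
    """Model 'target in "a, b, c"' as a set membership test.

    Assumption: items are comma-separated, surrounding whitespace is
    ignored, and matching is case-insensitive.
    """
    allowed = {item.strip().lower() for item in csv_list.split(",") if item.strip()}
    # A missing target (the field does not always exist) never matches.
    return value is not None and value.lower() in allowed


LANGS = "en, fr, de, es, it"
print(matches_in("fr", LANGS))   # French is in the list
print(matches_in("und", LANGS))  # undetected language is not
```

Note that a Retweet whose language could not be detected carries "und", so it is excluded by this filter unless "und" is added to the list.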
Here are some sample values for filtering:
Target service: Twitter
Target object: Twitter: Retweet
Always exists: No