Language detection can be difficult, especially for microblogs, where abbreviation and creative use of language are the norm. In extreme cases, it can be very hard to determine which language an author is using.

We h8 to b wrong.

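The language.confidence target contains an estimate of the accuracy of our analysis; in other words, how much weight we place on the value in language.tag for this interaction. The estimate is an integer percentage, so a value of 75 means we are 75% confident in the detected language.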


  1. Filter for content from any data source where we are at least 75% confident that the language is Arabic.

    language.confidence >= 75 and
    language.tag == "ar"

  2. Filter for Tweets in any language where we are at least 90% confident that we've identified the language correctly.

    interaction.type == "Twitter" and
    language.confidence >= 90
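
As a rough illustration of what the two filters above select, the following Python sketch applies the same conditions to interactions held as plain dictionaries. The nested dictionary layout, the sample values, and the helper names are assumptions made for this example only; they are not part of the platform, and the case-insensitive match on the type is likewise an assumption.

```python
# Hypothetical sample interactions; the layout mirrors the target names
# (language.tag, language.confidence, interaction.type) but is assumed.
SAMPLE_INTERACTIONS = [
    {"interaction": {"type": "twitter"}, "language": {"tag": "ar", "confidence": 80}},
    {"interaction": {"type": "twitter"}, "language": {"tag": "en", "confidence": 95}},
    {"interaction": {"type": "facebook"}, "language": {"tag": "ar", "confidence": 60}},
    {"interaction": {"type": "twitter"}, "language": {"tag": "en", "confidence": 40}},
]


def is_confident_arabic(item, threshold=75):
    """Example 1: any source, language is Arabic, confidence >= threshold."""
    language = item.get("language", {})
    return language.get("confidence", 0) >= threshold and language.get("tag") == "ar"


def is_confident_tweet(item, threshold=90):
    """Example 2: Tweets in any language, confidence >= threshold."""
    source = item.get("interaction", {}).get("type", "")
    confidence = item.get("language", {}).get("confidence", 0)
    # Assumption: the type comparison ignores case.
    return source.lower() == "twitter" and confidence >= threshold


arabic_matches = [i for i in SAMPLE_INTERACTIONS if is_confident_arabic(i)]
tweet_matches = [i for i in SAMPLE_INTERACTIONS if is_confident_tweet(i)]
```

Raising the threshold trades volume for precision: fewer interactions pass, but the language tag on those that do is more likely to be right.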

Resource information

Target service: Augmentation Target: Language

Type: int

Array: No