Analyzing Scored Data

A key feature of PYLON is the ability to classify data with custom rules and then use those classifications to greatly expand your analysis options.

Scoring is the mechanism that allows you to apply machine learning to data. You can use the scores you've added to data both in your analysis query filters and as targets to be analyzed. Talk to your account manager to learn more.

Analysis Query Filters

Adding scores to data gives you many more ways to subset it, so that your analysis is tailored specifically to your use case.

When you submit an analysis query, you can provide an optional filter using the filter parameter. This parameter accepts CSDL, allowing you to subset the data in your index before the analysis is run.

For example, you could add the following scoring rules to the interaction filter for your recording to identify marketing SPAM (for the full example, see the library):

tag.individual_selling 0.336973434597112 {fb.content contains_any "sale" or fb.parent.content contains_any "sale"} 
tag.individual_selling 0.694614877883376 {fb.content contains_any "trade" or fb.parent.content contains_any "trade"} 
tag.individual_selling -0.016319634970781 {fb.content contains_any "buy" or fb.parent.content contains_any "buy"} 
tag.other -0.383526852468688 {fb.content contains_any "sale" or fb.parent.content contains_any "sale"} 
tag.other -1.169380923788089 {fb.content contains_any "trade" or fb.parent.content contains_any "trade"} 
tag.other -0.125093031952464 {fb.content contains_any "buy" or fb.parent.content contains_any "buy"} 
tag.marketing -0.406040918914267 {fb.content contains_any "deposit" or fb.parent.content contains_any "deposit"} 
tag.marketing -0.016257468216822 {fb.content contains_any "sell" or fb.parent.content contains_any "sell"} 
tag.marketing 0.045705906076719 {fb.content contains_any "chance" or fb.parent.content contains_any "chance"}

With these rules in place you can now subset data by the classes detected:

  • marketing: Automated or company generated marketing
  • individual_selling: Individuals selling items
  • other: Content which is not marketing or selling

When scoring rules are run in an interaction filter, the class with the highest total score for the interaction is considered the 'winning' class. The interaction.ml.categories target exposes the winning class for each interaction so that the class can be used in query filters and analysis queries.
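The 'highest total score wins' idea can be illustrated with a small local sketch. This is not how PYLON evaluates rules internally. The RULES list, the winning_class function, and the simple word matching that stands in for contains_any are all hypothetical, and the sketch assumes that only classes with at least one matching rule are compared.

```python
import re
from collections import defaultdict

# Hypothetical in-memory form of the scoring rules: (class, score, keyword).
RULES = [
    ("individual_selling", 0.336973434597112, "sale"),
    ("individual_selling", 0.694614877883376, "trade"),
    ("individual_selling", -0.016319634970781, "buy"),
    ("other", -0.383526852468688, "sale"),
    ("other", -1.169380923788089, "trade"),
    ("other", -0.125093031952464, "buy"),
    ("marketing", -0.406040918914267, "deposit"),
    ("marketing", -0.016257468216822, "sell"),
    ("marketing", 0.045705906076719, "chance"),
]


def winning_class(content, parent_content=""):
    """Sum the scores of matching rules per class and return the top class.

    Word matching here is a rough stand-in for CSDL's contains_any.
    Returns None when no rule matches (an assumption of this sketch).
    """
    text = f"{content} {parent_content}".lower()
    words = set(re.findall(r"[a-z0-9']+", text))

    totals = defaultdict(float)
    for class_name, score, keyword in RULES:
        if keyword in words:
            totals[class_name] += score

    if not totals:
        return None
    # The class with the highest total score is the 'winning' class.
    return max(totals, key=totals.get)
```

For instance, calling winning_class("Anyone want to trade bikes?") adds 0.69 to individual_selling and subtracts 1.17 from other, so individual_selling wins.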

For example, you could use the following filter to analyze only interactions that aren't considered SPAM:

interaction.ml.categories == "other"

Or, you could filter to multiple classes:

interaction.ml.categories IN "other, individual_selling"

If you're looking to analyze, for example, the top links shared by an audience, you now have the ability to do so while excluding SPAM:

{
    "analysis_type": "freqDist",
    "filter": "interaction.ml.categories == \"other\"",
    "parameters": {
        "target": "links.url",
        "threshold": 5
    }
}
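Because the filter is a CSDL string embedded in JSON, its inner double quotes must be escaped, as in the query above. One way to avoid escaping by hand is to build the query as a dictionary and let a JSON serializer do it. The sketch below uses only the Python standard library. The build_freqdist_query helper is hypothetical and not part of any PYLON client library, and submitting the query is left out.

```python
import json


def build_freqdist_query(target, threshold, csdl_filter=None):
    """Build a freqDist analysis query as a plain dictionary (hypothetical helper)."""
    query = {
        "analysis_type": "freqDist",
        "parameters": {"target": target, "threshold": threshold},
    }
    # The filter is optional; include it only when one is supplied.
    if csdl_filter is not None:
        query["filter"] = csdl_filter
    return query


# Write the CSDL with ordinary quotes; json.dumps escapes them when serializing.
query = build_freqdist_query(
    target="links.url",
    threshold=5,
    csdl_filter='interaction.ml.categories == "other"',
)
body = json.dumps(query, indent=4)
print(body)
```

The same helper covers queries without a filter, for example build_freqdist_query("interaction.ml.categories", 3).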

Analysis Targets

By adding scores to data you can perform frequency distribution analysis on the scored data.

When you submit an analysis query, you can specify interaction.ml.categories as your analysis target. This analyzes the 'winning' class assigned to each interaction.

For example:

{
    "analysis_type": "freqDist",
    "parameters": {
        "target": "interaction.ml.categories",
        "threshold": 3
    }
}