Tagging Topics - Automotive Features

One of the quickest ways to add value to data in PYLON is classification using tags.

In this simple example we'll take a look at building a classifier which picks out automotive features from conversations relating to cars.

Tagging lets you add labels to interactions as they are recorded. You can then use these labels later in your analysis.

Our developer guide gives you an introduction to the concepts of classification.

Identifying Classes

The first step in building a classifier is to decide which classes you want to pick out.

It can take several attempts to arrive at a final set, and the set will no doubt change once you start seeing results. It is still worth giving the classes a good amount of thought so that they make sense for your use case, are expressible from the data you are classifying, and are as distinct as possible.

For this classifier we'll use this set of classes, based upon analyzing car conversations in general:

  • Style - Design features and general look
  • Practicality - Practical features such as number of doors
  • Purchase Price - Price of purchasing the car, including deals and quotes
  • Environment - Environment factors such as fuel type and emissions
  • Reliability - Faults, breakdowns and recalls
  • Safety - Safety features such as airbags and ABS
  • Convenience Features - Non-essential features such as cruise control and audio systems
  • Performance - General performance such as speed and handling
  • Running Cost - Cost of running the vehicle such as MPG and tax
  • Mechanical Specification - Specific features such as engine size

This list would be significantly different if, for example, you wanted to analyze conversations around bikes or trucks, or second-hand car purchases.

Building the Classifier

This simple classifier will make use of keywords to identify classes. It will tag both stories and engagements on those stories.

Each of the tags will follow this syntax:

tag.automotive.feature "class" { fb.content contains_any "keywords" OR fb.parent.content contains_any "keywords" }

Looking at the syntax in detail:

  • tag - declares that this is a tag rule
  • .automotive.feature - states this tag is in the automotive > feature namespace
  • "class" - is the label for the class
  • fb.content contains_any - looks for keywords in Facebook stories
  • fb.parent.content contains_any - looks for keywords in stories that are engaged with

Taking the 'Style' class as an example, the following keywords and phrases pick out relevant conversations:

paint,matte,white,black,red,blue,silver,bumpers,trim,bumper,door mirror,door mirrors,alloys,alloy wheels,alloy wheel,cool,sexy,beautiful,stunning

So we can create a 'style' tag as follows:

tag.automotive.feature "Style" { 
    fb.content contains_any "paint,matte,white,black,red,blue,silver,bumpers,trim,bumper,door mirror,door mirrors,alloys,alloy wheels,alloy wheel,cool,sexy,beautiful,stunning" 
    OR fb.parent.content contains_any "paint,matte,white,black,red,blue,silver,bumpers,trim,bumper,door mirror,door mirrors,alloys,alloy wheels,alloy wheel,cool,sexy,beautiful,stunning" 
}
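Because each rule repeats the same keyword list for fb.content and fb.parent.content, it can be convenient to generate the rule text rather than type it twice. The following is a minimal Python sketch of that idea, not part of PYLON itself: the function name build_tag_rule and the shortened keyword lists are hypothetical, and it assumes keywords contain no commas or double quotes.

```python
def build_tag_rule(namespace, label, keywords):
    """Return one CSDL tag rule that checks stories and the stories engaged with."""
    # contains_any takes a single comma-separated string of keywords.
    keyword_list = ",".join(keywords)
    return (
        f'tag.{namespace} "{label}" {{ '
        f'fb.content contains_any "{keyword_list}" '
        f'OR fb.parent.content contains_any "{keyword_list}" }}'
    )


# Hypothetical, shortened keyword lists for two of the classes above.
classes = {
    "Style": ["paint", "matte", "alloy wheels"],
    "Reliability": ["recall", "fault", "broke down", "broken"],
}

# One rule per class, joined into a single block of CSDL text.
classifier = "\n".join(
    build_tag_rule("automotive.feature", label, keywords)
    for label, keywords in classes.items()
)
print(classifier)
```

Keeping the keywords in one place per class means a change to a list is automatically applied to both halves of the rule.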

Repeating the process for each class gives a full classifier:

tag.automotive.feature "Style" { fb.content any "paint,matte,white,black,red,blue,silver,bumpers,trim,bumper,door mirror,door mirrors,alloys,alloy wheels,alloy wheel,cool,sexy,beautiful,stunning" OR fb.parent.content any "paint,matte,white,black,red,blue,silver,bumpers,trim,bumper,door mirror,door mirrors,alloys,alloy wheels,alloy wheel,cool,sexy,beautiful,stunning"} 
tag.automotive.feature "Practicality" { fb.content any "door,doors,seat,seats,boot,capacity,towing" OR fb.parent.content any "door,doors,seat,seats,boot,capacity,towing"} 
tag.automotive.feature "Purchase Price" { fb.content any "cost,price,payments,on the road,quote,deal" OR fb.parent.content any "cost,price,payments,on the road,quote,deal"} 
tag.automotive.feature "Environment" { fb.content any "co2,emission,emissions,gasses,hydrogen,gas,fuel cell,hybrid,battery,green,greenest,g/km" OR fb.parent.content any "co2,emission,emissions,gasses,hydrogen,gas,fuel cell,hybrid,battery,green,greenest,g/km"} 
tag.automotive.feature "Reliability" { fb.content any "recall,fault,broke down,broken" OR fb.parent.content any "recall,fault,broke down,broken"} 
tag.automotive.feature "Safety" { fb.content any "safety,brake light,brakes,abs,fog lights,day running lights,day running light,airbag,seat belt,ebd,epas,child seat,child seats,tyre,tyres,tire,tires" OR fb.parent.content any "safety,brake light,brakes,abs,fog lights,day running lights,day running light,airbag,seat belt,ebd,epas,child seat,child seats,tyre,tyres,tire,tires"} 
tag.automotive.feature "Convenience Features" { fb.content any "cruise control,autocruise,auto cruise,trip computer,display,bluetooth,radio,cd,dvd,mp3,speaker,speakers,adjustable,adjust,cushion,remote control,remote controls,air conditioning,keyless,power socket,auto trans,auto transmission" OR fb.parent.content any "cruise control,autocruise,auto cruise,trip computer,display,bluetooth,radio,cd,dvd,mp3,speaker,speakers,adjustable,adjust,cushion,remote control,remote controls,air conditioning,keyless,power socket,auto trans,auto transmission"} 
tag.automotive.feature "Performance" { fb.content any "handles,handling,fast,fastest,quick,quickest,wheel drive,engine size,cc,engine power,bhp,top speed,mph,acceleration,performance,turning radius" OR fb.parent.content any "handles,handling,fast,fastest,quick,quickest,wheel drive,engine size,cc,engine power,bhp,top speed,mph,acceleration,performance,turning radius"} 
tag.automotive.feature "Running Cost" { fb.content any "costs,mpg,tax,service,serviced,servicing,insurance,mileage,fuel economy,economical,petrol,diesel" OR fb.parent.content any "costs,mpg,tax,service,serviced,servicing,insurance,mileage,fuel economy,economical,petrol,diesel"} 
tag.automotive.feature "Mechanical Specification" { fb.content any "turbo,cylinder,engine,4wd,awd,rear wheel drive,torque,exhaust,muffler,cylinder" OR fb.parent.content any "turbo,cylinder,engine,4wd,awd,rear wheel drive,torque,exhaust,muffler,cylinder"}

You can see the final classifier in the library.

Applying the Classifier to a Recording

It's easy to apply a set of tags to an interaction filter and create a recording.

For this example let's say we have an interaction filter already defined with the following CSDL:

( fb.parent.content contains_any "ford, BMW, Honda" OR fb.content contains_any "ford, BMW, Honda" ) 
AND fb.topics.category == "Cars" OR fb.parent.topics.category == "Cars"

First, make sure your filter conditions are wrapped in a return statement. A return statement is mandatory when using tags in a filter.

Then you can include your tags, either by adding them before the return statement:

tag.automotive.feature "Style" { fb.content any "paint,matte,white,black,red,blue,silver,bumpers,trim,bumper,door mirror,door mirrors,alloys,alloy wheels,alloy wheel,cool,sexy,beautiful,stunning" OR fb.parent.content any "paint,matte,white,black,red,blue,silver,bumpers,trim,bumper,door mirror,door mirrors,alloys,alloy wheels,alloy wheel,cool,sexy,beautiful,stunning"} 
tag.automotive.feature "Practicality" { fb.content any "door,doors,seat,seats,boot,capacity,towing" OR fb.parent.content any "door,doors,seat,seats,boot,capacity,towing"} 
tag.automotive.feature "Purchase Price" { fb.content any "cost,price,payments,on the road,quote,deal" OR fb.parent.content any "cost,price,payments,on the road,quote,deal"} 
tag.automotive.feature "Environment" { fb.content any "co2,emission,emissions,gasses,hydrogen,gas,fuel cell,hybrid,battery,green,greenest,g/km" OR fb.parent.content any "co2,emission,emissions,gasses,hydrogen,gas,fuel cell,hybrid,battery,green,greenest,g/km"} 
tag.automotive.feature "Reliability" { fb.content any "recall,fault,broke down,broken" OR fb.parent.content any "recall,fault,broke down,broken"} 
tag.automotive.feature "Safety" { fb.content any "safety,brake light,brakes,abs,fog lights,day running lights,day running light,airbag,seat belt,ebd,epas,child seat,child seats,tyre,tyres,tire,tires" OR fb.parent.content any "safety,brake light,brakes,abs,fog lights,day running lights,day running light,airbag,seat belt,ebd,epas,child seat,child seats,tyre,tyres,tire,tires"} 
tag.automotive.feature "Convenience Features" { fb.content any "cruise control,autocruise,auto cruise,trip computer,display,bluetooth,radio,cd,dvd,mp3,speaker,speakers,adjustable,adjust,cushion,remote control,remote controls,air conditioning,keyless,power socket,auto trans,auto transmission" OR fb.parent.content any "cruise control,autocruise,auto cruise,trip computer,display,bluetooth,radio,cd,dvd,mp3,speaker,speakers,adjustable,adjust,cushion,remote control,remote controls,air conditioning,keyless,power socket,auto trans,auto transmission"} 
tag.automotive.feature "Performance" { fb.content any "handles,handling,fast,fastest,quick,quickest,wheel drive,engine size,cc,engine power,bhp,top speed,mph,acceleration,performance,turning radius" OR fb.parent.content any "handles,handling,fast,fastest,quick,quickest,wheel drive,engine size,cc,engine power,bhp,top speed,mph,acceleration,performance,turning radius"} 
tag.automotive.feature "Running Cost" { fb.content any "costs,mpg,tax,service,serviced,servicing,insurance,mileage,fuel economy,economical,petrol,diesel" OR fb.parent.content any "costs,mpg,tax,service,serviced,servicing,insurance,mileage,fuel economy,economical,petrol,diesel"} 
tag.automotive.feature "Mechanical Specification" { fb.content any "turbo,cylinder,engine,4wd,awd,rear wheel drive,torque,exhaust,muffler,cylinder" OR fb.parent.content any "turbo,cylinder,engine,4wd,awd,rear wheel drive,torque,exhaust,muffler,cylinder"}

return {
    ( fb.parent.content contains_any "ford, BMW, Honda" OR fb.content contains_any "ford, BMW, Honda" )
    AND fb.topics.category == "Cars" OR fb.parent.topics.category == "Cars"
}

Or, to make your code more maintainable, we recommend saving your tag definitions as a separate filter and then including them in your filters using the tags keyword.
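As a rough illustration of this second approach, the Python sketch below assembles filter text that references saved tag definitions and wraps the conditions in a return statement. It assumes the tags keyword is followed by the quoted hash of the saved definition; the function name filter_with_tags and the placeholder hash are hypothetical, so check the exact form against the CSDL reference.

```python
def filter_with_tags(tags_hash, filter_conditions):
    """Build filter CSDL that includes saved tags and returns the conditions."""
    # Assumption: saved tag definitions are included as: tags "<hash>"
    return (
        f'tags "{tags_hash}"\n'
        "\n"
        "return {\n"
        f"    {filter_conditions}\n"
        "}"
    )


# Placeholder value; replace with the hash of your saved tag definitions.
csdl = filter_with_tags(
    "<hash of your saved tag definitions>",
    'fb.content contains_any "ford, BMW, Honda"',
)
print(csdl)
```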

Analyzing Classified Data

Once you've recorded data based upon a classifier you can make use of the tags in your analysis queries.

For instance you can ask for a frequency distribution of the classes across your index using the following analysis query:

{
    "analysis_type": "freqDist",
    "parameters": {
        "target": "interaction.tag_tree.automotive.feature",
        "threshold": 5
    }
}

Or you could filter to just conversations relating to style by specifying the following as your filter parameter:

interaction.tag_tree.automotive.feature == "Style"

Of course you can then add further filter conditions and dig down by demographic or other tags you choose to add to the data.
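The analysis target is the tag namespace prefixed with interaction.tag_tree, so both the query parameters and the filter can be derived from the namespace used in the tag rules. Here is a small Python sketch under that assumption. The helper names are hypothetical, and how the parameters and filter are submitted to PYLON is left out.

```python
import json

# Namespace as declared in the tag rules: tag.automotive.feature
TAG_NAMESPACE = "automotive.feature"
TARGET = "interaction.tag_tree." + TAG_NAMESPACE


def freq_dist_parameters(target, threshold):
    """Return analysis parameters for a frequency distribution over a target."""
    return {
        "analysis_type": "freqDist",
        "parameters": {"target": target, "threshold": threshold},
    }


def tag_filter(target, label):
    """Return a filter expression selecting interactions tagged with one class."""
    return f'{target} == "{label}"'


parameters = freq_dist_parameters(TARGET, 5)
style_filter = tag_filter(TARGET, "Style")

print(json.dumps(parameters, indent=4))
print(style_filter)
```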