Using Facebook Topic Data to Refine an Advertising Campaign

Richard Caudle | 13th January 2016

You might have seen our recent blog post where we discussed how an agency used Facebook topic data to carry out audience research and refine an upcoming ad campaign. In this post we'll take a look at how the research was carried out using our platform.

Using Facebook topic data to understand an audience

It naturally follows that the better you can understand your audience, the better you can plan your ad campaign, and therefore the more impact your campaign will have.

In this instance an agency was looking to get a better understanding of their audience before asking a drink brand to commit to a large ad campaign. Their hypothesis was that millennial women (aged between 18 and 34) would enjoy the combination of the brand and coffee. The agency used Facebook topic data to investigate.

To test their hypothesis the agency decided to:

  • see how the brand was perceived by the target group in relation to its competitors.
  • see how this group were engaging with types of hot drink in the different countries the campaign was to be run in.

Additionally they wanted to look at which media outlets and celebrities were influencing the target group so that they could make decisions on which channels to advertise in and which famous faces to approach to appear in the ad campaign.

From their investigations the agency established:

  • Only 31% of engagements with the brand came from females under 35, whereas competing brands were getting 47-58% of their engagement from millennials.
  • The client’s brand was actually being engaged with much more by women over 55.
  • Women in the USA were engaging most with stories about lattes, whereas in Germany the drink of choice was cappuccino and in the UK it was hot chocolate.
  • The most popular online magazines for each demographic group.
  • Jennifer Lawrence and Sharon Stone were the influencers most engaged with as a whole in the top women’s magazines during the period of analysis, whereas women between 18 and 35 showed more interest in Taylor Swift and Kylie Jenner.

Let's take a look at how the research was carried out.

Working with PYLON

Before we look at the detailed steps, here's a quick reminder of how PYLON works in practice.

[Figure: PYLON platform overview]

You work with PYLON by:

  • Filtering the stream of data from Facebook to stories and engagements (such as likes and comments) you'd like to analyze. Filtered data is recorded into an index.
  • Classifying the data using your own custom rules to add extra metadata for your use case.
  • Analyzing the data you have recorded to the index.

You can learn more about the platform in our What is PYLON? guide. Now let's look at these steps in the context of this specific use case.

Filtering stories and engagements

The first step of working with Facebook topic data is recording data from a target audience for your analysis.

Using the DataSift platform you can capture stories and engagements on stories by creating a filter in CSDL. The filter specifies what data you'd like to be recorded from the Facebook data source to your index for analysis. The rules in your filter operate against the values of targets (data fields) of the stories and engagements.

For this study the agency recorded two sets of data, creating two indexes: the first to investigate audiences discussing coffee, the second to investigate engagement with online magazines.

To record coffee-related engagement an example filter would be:

fb.all.content contains_any "cappuccino, flat white, macchiato, americano, caffè irlandese, hot chocolate, chocolat chaud, latte" 
OR fb.parent.topics.name in "cappuccino, flat white, macchiato, americano, hot chocolate" 
OR fb.topics.name in "cappuccino, flat white, macchiato, americano, hot chocolate"

Notice here we've used keywords and phrases in multiple languages, together with topics, which are inferred from the content of posts. You could of course extend this filter to include many other drinks and add individual brands if you choose. In this instance the agency added brands so they could perform share of voice analysis.

Based on this filter if someone posted the following:

Just popped into starbs for my morning macchiato!

This story and any likes, comments or reshares on the story will be recorded.
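Long keyword lists are easy to get wrong when typed by hand, so one option is to assemble the filter from Python lists. The sketch below is only a string builder: build_filter is a hypothetical helper, not part of the DataSift client library, and it assumes the comma-separated list syntax shown in the example filter above.

```python
def build_filter(keywords, topics):
    """Return a CSDL filter matching keywords in content or inferred topics.

    Hypothetical helper: it only mirrors the shape of the example filter.
    """
    keyword_list = ", ".join(keywords)
    topic_list = ", ".join(topics)
    clauses = [
        'fb.all.content contains_any "{}"'.format(keyword_list),
        'fb.parent.topics.name in "{}"'.format(topic_list),
        'fb.topics.name in "{}"'.format(topic_list),
    ]
    # One clause per line, joined with OR as in the example filter
    return "\nOR ".join(clauses)


drinks = ["cappuccino", "flat white", "macchiato"]
csdl = build_filter(keywords=drinks + ["chocolat chaud"], topics=drinks)
print(csdl)
```

Keeping the drink names in one list makes it harder for the keyword and topic clauses to drift apart as you add more drinks or brands.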

To record engagement with magazines an example filter would be:

links.domain contains_any "cosmopolitan.com, cosmopolitan.fr, marieclaire.com, marieclaire.fr, elle.com, elle.fr" 
OR fb.parent.topics.name in "Cosmopolitan, Elle, Marie Claire" 
OR fb.topics.name in "Cosmopolitan, Elle, Marie Claire"

Here we're using topics and domains of links that people share. Links are a great source for this kind of study as the majority of engagement with magazines is through sharing links.

Based on this filter, any posts that mention one of the magazine brands or share a link to one of the websites will be recorded.

The agency started recordings using filters similar to the two examples to record the two datasets.

Adding value through classification

Facebook topic data is already a rich data set but you can add additional value using classification rules. When you add classification rules to a filter, the platform records additional metadata for each story and engagement. You can use this additional metadata in your analysis.

In this case the agency was interested in analyzing the types of coffee drinks being discussed and the magazines being engaged with.

So for example to identify coffee drinks we could add the following tags to our first example filter above:

tag.drink "latte" { fb.all.content contains_any "latte, latté, caffè latte, melange, lattee, lattes" } 
tag.drink "cappuccino" { fb.all.content contains_any "cappuccino, melange, cappuccinos" } 
tag.drink "hot chocolate" { fb.all.content contains_any "hot chocolate, chocolat chaud, heiße schokolade" }

Notice here that the tags help us to group posts by drink type and normalize the data. For example we've grouped posts that mention a latte in different phrases and languages under one tag. This makes the data much easier to analyze later.

When a story or engagement matches the filter conditions the classification rules are applied before the data is recorded to your index. So in this case if the content of a post reads:

Ich liebe heiße schokolade!

The story will be tagged with "hot chocolate" when it is stored to the index.
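If you want to sanity-check a set of tag rules before starting a recording, you can approximate them locally. The sketch below is a rough stand-in, not the platform's classifier: it uses a subset of the phrases from the example tags and assumes case-insensitive whole-phrase matching, which may differ from how the platform tokenizes text. The names DRINK_TAGS and drink_tags are hypothetical.

```python
import re

# Subset of the example tag rules: tag value -> phrases to match
DRINK_TAGS = {
    "latte": ["latte", "latté", "caffè latte", "lattes"],
    "cappuccino": ["cappuccino", "cappuccinos"],
    "hot chocolate": ["hot chocolate", "chocolat chaud", "heiße schokolade"],
}


def drink_tags(content, rules=DRINK_TAGS):
    """Return the tag values whose phrases appear in the content.

    Approximation only: case-insensitive whole-phrase matching.
    """
    matched = []
    for tag, phrases in rules.items():
        pattern = "|".join(r"\b{}\b".format(re.escape(p)) for p in phrases)
        if re.search(pattern, content, flags=re.IGNORECASE):
            matched.append(tag)
    return matched


print(drink_tags("Ich liebe heiße schokolade!"))  # prints ['hot chocolate']
```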

The agency added tags for each type of drink and for each competitor brand to their filter definition.

The agency also took a similar approach when classifying magazines, using tags similar to the following to normalize links by domain:

tag.magazine "Cosmopolitan" { links.domain any "cosmopolitan.com, cosmopolitan.fr" } 
tag.magazine "Elle" { links.domain any "elle.com, elle.fr" } 
tag.magazine "Marie Claire" { links.domain any "marieclaire.com, marieclaire.fr" }

These are simple examples but you can see how you can use tags to normalize your recorded data and add extra metadata for use in your analysis.

Finding audience insights

Once you've recorded data to your index you can immediately perform initial analysis using analysis queries.

You can perform a time series analysis to see how an audience engaged over time. You can perform a frequency distribution analysis to quantify the engagement by segments of your audience. A more advanced form of analysis is nested queries where you can segment and quantify your audience by multiple dimensions.

You also have the option of using query filters to filter to a portion of your recorded data before performing analysis. So for example you could use the example tags above and filter to only stories and engagements relating to friends before performing a time series analysis.
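To make these ideas concrete, here is a purely local sketch of what a frequency distribution, a time series and a query filter each compute. The records are invented for illustration. On the platform the analysis runs against your recorded index and you only ever receive aggregate results, so treat this as a mental model rather than a description of how PYLON works internally.

```python
from collections import Counter

# Invented records for illustration; not real Facebook topic data
records = [
    {"day": "2016-01-04", "gender": "female", "drink": "latte"},
    {"day": "2016-01-04", "gender": "male", "drink": "latte"},
    {"day": "2016-01-05", "gender": "female", "drink": "cappuccino"},
    {"day": "2016-01-05", "gender": "female", "drink": "latte"},
]


def freq_dist(rows, target):
    """Count rows per value of the target field, most frequent first."""
    return Counter(row[target] for row in rows).most_common()


def time_series(rows):
    """Count rows per day, in date order."""
    return sorted(Counter(row["day"] for row in rows).items())


# A query filter narrows the data before the analysis runs
females = [row for row in records if row["gender"] == "female"]

print(freq_dist(females, "drink"))  # prints [('latte', 2), ('cappuccino', 1)]
print(time_series(females))  # prints [('2016-01-04', 1), ('2016-01-05', 2)]
```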

With the classified data recorded to the two indexes the agency submitted analysis queries to test their hypothesis.

Brand share of voice

Firstly the agency looked at how the share of voice for drink brands varied across demographic groups. To do so they analyzed the first index where they had recorded coffee discussions and had added tags for each brand.

To perform this analysis the agency used a nested query analyzing the brand tags and then the age groups within.

# Assumes `datasift` is an authenticated DataSift client object
analyze_parameters = {
    'analysis_type': 'freqDist',
    'parameters':
    {
        'threshold': 5,
        'target': 'interaction.tag_tree.brand'
    },
    'child':
    {
        'analysis_type': 'freqDist',
        'parameters':
        {
            'threshold': 5,
            'target': 'fb.author.age'
        }
    }
}

datasift.pylon.analyze('coffee index id', analyze_parameters, filter='fb.author.gender == "female"')

Note that the filter argument restricts the dataset to posts and engagements from females before the analysis is performed.
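Nested query parameters get repetitive to write by hand, so you may prefer to build them with a small function. The sketch below uses a hypothetical helper, freq_dist_query, which is not part of the DataSift client library. It simply produces a dictionary with the same shape as the query parameters used here.

```python
def freq_dist_query(target, threshold, child=None):
    """Build freqDist parameters, optionally nesting a child analysis."""
    query = {
        "analysis_type": "freqDist",
        "parameters": {"threshold": threshold, "target": target},
    }
    if child is not None:
        query["child"] = child
    return query


# Brands first, then the age groups within each brand
analyze_parameters = freq_dist_query(
    "interaction.tag_tree.brand",
    5,
    child=freq_dist_query("fb.author.age", 5),
)
print(analyze_parameters)
```

The resulting dictionary can be passed to the analyze call in the same way as a hand-written one.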

Plotting the results as pie charts revealed the insights:

[Figure: pie charts of engagement by age group for each brand]

The client brand is shown on the left. This showed the agency that 31% of engagement with the client brand came from females under 35, whereas competing brands were getting 47-58% of their engagement from millennials. Also, the client’s brand was actually being engaged with much more by women over 55.
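Percentages like these can be derived from the nested results with a few lines of Python. The sketch below rests on assumptions you should check against a real response: that each result carries a "key" and an "interactions" count, that nested results sit under "child", and that age groups are reported as buckets such as "18-24" and "25-34". The brand name and all counts are invented for illustration.

```python
# Assumed response shape with invented counts; verify against a real response
response = {
    "analysis": {
        "results": [
            {
                "key": "Brand A",
                "interactions": 1000,
                "child": {
                    "results": [
                        {"key": "18-24", "interactions": 100},
                        {"key": "25-34", "interactions": 200},
                        {"key": "55-64", "interactions": 700},
                    ]
                },
            },
        ]
    }
}

UNDER_35 = {"18-24", "25-34"}  # assumed age bucket labels


def under_35_share(brand_result):
    """Return the fraction of a brand's interactions from under-35s."""
    groups = brand_result["child"]["results"]
    total = sum(group["interactions"] for group in groups)
    young = sum(g["interactions"] for g in groups if g["key"] in UNDER_35)
    return young / total if total else 0.0


for brand in response["analysis"]["results"]:
    print(brand["key"], "{:.0%}".format(under_35_share(brand)))  # prints Brand A 30%
```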

Type of coffee drink

Next the agency looked at which types of hot drink females engaged with most in each market. To do so they analyzed the first index where they had recorded coffee discussions and had added tags for each coffee drink.

Again a nested query could be used analyzing the countries and then within each country the type of drink.

# Assumes `datasift` is an authenticated DataSift client object
analyze_parameters = {
    'analysis_type': 'freqDist',
    'parameters':
    {
        'threshold': 5,
        'target': 'fb.author.country'
    },
    'child':
    {
        'analysis_type': 'freqDist',
        'parameters':
        {
            'threshold': 5,
            'target': 'interaction.tag_tree.drink'
        }
    }
}

datasift.pylon.analyze('coffee index id', analyze_parameters, filter='fb.author.gender == "female"')

Plotting the results revealed the insights:

[Figure: hot drink engagement by country]

You can clearly see that women in the USA were engaging most with lattes, whereas in Germany the drink of choice was cappuccino and in the UK it was hot chocolate.
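If you would rather read the leading drink for each country straight from the results than from a chart, a small helper can pick it out. This sketch assumes nested results arrive as a list of entries with a "key", and a "child" holding its own list of results with "interactions" counts. That shape is an assumption to verify against a real response. The helper name, country labels and counts are all invented.

```python
def top_child_per_parent(results):
    """Map each parent key to its child key with the most interactions."""
    top = {}
    for parent in results:
        children = parent.get("child", {}).get("results", [])
        if children:
            best = max(children, key=lambda child: child["interactions"])
            top[parent["key"]] = best["key"]
    return top


# Invented counts, for illustration only
results = [
    {"key": "Country A", "child": {"results": [
        {"key": "latte", "interactions": 300},
        {"key": "cappuccino", "interactions": 200},
    ]}},
    {"key": "Country B", "child": {"results": [
        {"key": "latte", "interactions": 100},
        {"key": "hot chocolate", "interactions": 400},
    ]}},
]

print(top_child_per_parent(results))
```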

Influential magazines and celebrities

Finally the agency looked at which magazines and celebrities were most influential for target demographic groups. To perform this analysis the agency used the second index where they had recorded engagement around popular magazines.

The agency performed an age-gender analysis for each magazine:

# Assumes `datasift` is an authenticated DataSift client object
analyze_parameters = {
    'analysis_type': 'freqDist',
    'parameters':
    {
        'threshold': 2,
        'target': 'fb.author.gender'
    },
    'child':
    {
        'analysis_type': 'freqDist',
        'parameters':
        {
            'threshold': 5,
            'target': 'fb.author.age'
        }
    }
}

# Repeat for each magazine
datasift.pylon.analyze('magazine index id', analyze_parameters, filter='interaction.tag_tree.magazine == "Cosmopolitan"')

datasift.pylon.analyze('magazine index id', analyze_parameters, filter='interaction.tag_tree.magazine == "Elle"')

Note that here the magazine tag is used as the filter for each analysis request.
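Rather than copying the call for every magazine, you could loop over the tag values. In the sketch below, analyze_by_tag is a hypothetical helper and fake_analyze is a stand-in for datasift.pylon.analyze so that the example runs without credentials. The tag target is passed in as an argument, and the example assumes the magazine tag is exposed as interaction.tag_tree.magazine, in line with the brand and drink targets used in the earlier queries.

```python
def analyze_by_tag(analyze, index_id, parameters, tag_target, values):
    """Run the same analysis once per tag value; return results by value."""
    results = {}
    for value in values:
        query_filter = '{} == "{}"'.format(tag_target, value)
        results[value] = analyze(index_id, parameters, filter=query_filter)
    return results


# Stand-in for datasift.pylon.analyze; it just echoes the filter it was given
def fake_analyze(index_id, parameters, filter=None):
    return {"filter": filter}


magazines = ["Cosmopolitan", "Elle", "Marie Claire"]
by_magazine = analyze_by_tag(
    fake_analyze,
    "magazine index id",
    {},
    "interaction.tag_tree.magazine",  # assumed target name
    magazines,
)
print(by_magazine["Elle"]["filter"])
```

Passing the analyze function in as an argument also makes the loop easy to test before you point it at a live index.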

Plotting the results for each magazine revealed the insights:

[Figure: age and gender breakdown for each magazine]

Here you can clearly see which magazines are engaged with by each demographic group.

To investigate celebrities the agency took advantage of topics that are provided with posts in Facebook topic data. You can analyze the topics that appear most frequently in the posts being engaged with. For example here we perform a frequency distribution request, first filtering to only engagements on stories which mention directors and actors, then analyzing the top 20 topics that appear:

# Assumes `datasift` is an authenticated DataSift client object
analyze_parameters = {
    'analysis_type': 'freqDist',
    'parameters':
    {
        'threshold': 20,
        'target': 'fb.parent.topics.name'
    }
}

datasift.pylon.analyze('magazine index id', analyze_parameters, filter='fb.parent.topics.category == "Actor/Director"')

Displaying the results of this query gave the following:

[Figure: top topics in stories mentioning actors and directors]

This chart shows that Jennifer Lawrence and Sharon Stone are the celebrities that have seen the most engagement. During the period of analysis Jennifer Lawrence starred in an ad paying homage to Sharon Stone, which has skewed our results in this case, but you can see that we are identifying celebrities that are proving influential with the audience.

Note that in the results we see topics that are not actors or directors. Remember we filtered to posts that mentioned actors and directors, so these are topics that were mentioned alongside them.

Repeating the query but further restricting the analysis to the millennial age group gave the following result:

[Figure: top topics in stories mentioning actors and directors, millennial age group]

This revealed that females between 18 and 35 showed more interest in Taylor Swift and Kylie Jenner.
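A query filter for this kind of restricted analysis can be assembled by joining individual conditions with AND. The sketch below uses a hypothetical helper, all_of. The category and gender conditions are taken from the queries in this post, whereas the age condition is an assumption: it guesses that age groups are reported as buckets labelled "18-24" and "25-34", so check the values your index reports before relying on it.

```python
def all_of(*clauses):
    """Join CSDL clauses with AND, bracketing each to keep precedence clear."""
    return " AND ".join("({})".format(clause) for clause in clauses)


millennial_filter = all_of(
    'fb.parent.topics.category == "Actor/Director"',
    'fb.author.gender == "female"',
    # Assumed age bucket labels; check the values your index reports
    'fb.author.age in "18-24, 25-34"',
)
print(millennial_filter)
```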

Learn more…

PYLON for Facebook Topic Data gives analysts access to a vast new audience to test their assumptions and to inform better decisions.

To learn more about the platform take a look at our What is PYLON? guide.

Also, keep an eye on this blog for more Facebook topic data use cases which we'll be posting soon.

