Deep Neural Networks for Multi-Touch Attribution

e-book by: Express Analytics

A Deep Learning Approach To Multi-Touch Attribution

Attribution using LSTM-attention

Introduction

Online retail has become an important part of our society, and advertising is now an integral part of any marketing and sales strategy to attract new customers. For a retailer, it is crucial to know which touchpoints and advertisements are effective. Millions of customers are exposed to advertisements across a wide range of digital channels. Organizations spend heavily on advertising and marketing channels such as email, Google AdWords, Instagram, Facebook, and Twitter. They also bid on search engine results (e.g. Google and Bing) for certain high-ranking keywords.

The budget allocated to each channel should be commensurate with that channel's impact in your market. Marketers track every customer encounter with these channels, and these interactions leave behind patterns buried deep in the data, waiting to be uncovered. Deep learning has driven a paradigm shift in data science and has proved to be a reliable technique for extracting such patterns and predicting outcomes from large datasets. To run targeted campaigns and allocate the right budget, a marketing team must be able to confirm the influence of each marketing channel on its audience. Given the success of deep learning models on large datasets, implementing artificial neural networks seems to be the most promising approach.

In this e-book, we describe a novel attribution algorithm based on deep learning that Express Analytics implemented for a client in order to assess the impact of each advertising and marketing channel.


Disclaimer

We have been working with this client and have implemented this and other models using their customer data. For reasons of privacy and confidentiality, we are unable to disclose details such as the name of the client, the names of its customers, or other sensitive data. To help the reader get the picture, we have replaced real instances with mock data. However, the performance (AUC metric) quoted corresponds to the actual data.

There is a gap between the data any marketer has and the information needed to make a sound business decision. A wide variety of narrow, rule-based models and data-driven models (e.g. the Markov model) have been widely adopted, but none of them addresses channel interaction, time dependency, and user characteristics all at once.

A deep neural network single-handedly closes these gaps. The only assumption it makes is that the data speaks for itself. In fact, the model welcomes user context information: the browser used by the prospective customer, the device type, the device name, or any other channel or device attribute from which a pattern can emerge. By looking for patterns in the data, it gives us a fair way to assign credit for sales to the channels in conversion paths.

Why Neural Networks At All?

The illustration below shows how the data is “pre-processed” before being fed to the neural network.
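
As a rough, purely illustrative sketch of the kind of pre-processing meant here (the channel names, journey lengths, and padding length below are our own assumptions, not the client's data), each user journey can be encoded as a fixed-length sequence of channel ids:

import numpy as np

# Each user journey is an ordered list of channel touchpoints (mock data).
journeys = [
    ["Paid Search", "Facebook", "Online Video"],
    ["Instagram", "Instagram", "Online Display", "Facebook"],
]
converted = np.array([1, 0])  # 1 = the journey ended in a conversion

# Map each channel name to an integer id; 0 is reserved for padding.
channel_to_id = {c: i + 1 for i, c in enumerate(
    sorted({c for j in journeys for c in j}))}
encoded = [[channel_to_id[c] for c in j] for j in journeys]

# Left-pad every sequence to a common length so the network sees a fixed shape.
max_len = 20
X = np.array([[0] * (max_len - len(s)) + s for s in encoded])
print(X.shape)  # (2, 20)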

Modeling activity of unknown users

It is important to mention here that the data covers not only users who already have an account with the online store but also customers who visit its website anonymously. Since our neural network forms its opinion about channels from both converted and unconverted sequences, adding the activity of unknown users prevents the model from developing biases due to misrepresentation.

Data Preview

Foot traffic attribution

Foot Traffic Attribution connects ad exposure to real-world behavior to quantify the impact of digital and out-of-home advertising on in-store visits.

Several users may have bought products in an offline store, possibly prompted by their online activity. The data collected also includes most purchases made at the store. Thus, we can be confident that the extracted attribution values are based on all the data our client can possibly gather.

The premise is performance

A deep neural network, in a supervised learning fashion, learns to predict whether a series of encounters with touchpoints leads to a conversion. In doing so, the model inevitably ends up with a deep understanding of the effects of the dynamic interaction between channels.

How Does It Work?

How do we know whether the network has captured such patterns in the dataset?

Because conversions are far rarer than non-conversions, the data is imbalanced, so we chose the AUC (area under the ROC curve) metric.
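
As a minimal sketch of how this metric is computed (the labels and scores below are illustrative placeholders, not the client's data):

from sklearn.metrics import roc_auc_score

# 1 = converted sequence, 0 = unconverted; scores are predicted probabilities.
y_true  = [0, 0, 0, 0, 1, 0, 1, 0]
y_score = [0.10, 0.30, 0.05, 0.20, 0.90, 0.40, 0.70, 0.15]

# AUC is the probability that a randomly chosen converted sequence is scored
# higher than a randomly chosen unconverted one; closer to 1.0 is better.
print(roc_auc_score(y_true, y_score))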

What Does The Network Look Like?

The above is an illustration of the neural network for an example sequence of four touchpoints that a user encountered.

LSTM layer (blue layer)

LSTMs are well known and widely used for processing sequential data. The channel embeddings are fed into this layer. The output is a numerical representation of the effect of the impressions the user has come across. For the above sequence of four encounters between a user and our channels, we get four output vectors.
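
A minimal Keras sketch of this embedding-plus-LSTM stage is shown below; the vocabulary size, embedding width, and LSTM width are illustrative assumptions, not the settings used for the client.

from tensorflow.keras import Input, layers

n_channels = 6   # number of distinct marketing channels (id 0 is padding)
seq_len = 20     # padded journey length

channel_ids = Input(shape=(seq_len,), name="channel_ids")
embedded = layers.Embedding(input_dim=n_channels + 1, output_dim=8,
                            mask_zero=True)(channel_ids)

# return_sequences=True yields one output vector per touchpoint, e.g. four
# vectors for a journey with four encounters (padded positions are masked).
lstm_out = layers.LSTM(32, return_sequences=True)(embedded)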

Attention layer (yellow layer)

In a sequence of touchpoint observations, the same touchpoint may matter differently at different positions in time and at different frequencies of occurrence. The attention mechanism lets the model place the appropriate emphasis on individual touchpoints when constructing the representation of the customer path.

It is important that the attention scores (the amount of emphasis) are calculated during the forward pass through the network. The model therefore has the capacity to give channel 'A' a score of 50% in the sequence [A, B, C] but only 20% in the sequence [C, A, D], as sketched below.
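
The following is a minimal sketch of such an attention layer written as a custom Keras layer; it is our illustration of the idea, not the client's exact implementation.

import tensorflow as tf
from tensorflow.keras import layers

class AttentionPooling(layers.Layer):
    """Scores every touchpoint and returns a weighted sum of the LSTM outputs."""
    def build(self, input_shape):
        d = int(input_shape[-1])
        self.w = self.add_weight(name="w", shape=(d, 1),
                                 initializer="glorot_uniform")
        super().build(input_shape)

    def call(self, inputs):
        # inputs: (batch, timesteps, features) -- one LSTM vector per touchpoint
        scores = tf.squeeze(tf.matmul(inputs, self.w), axis=-1)  # (batch, timesteps)
        weights = tf.nn.softmax(scores, axis=-1)                 # attention scores, sum to 1
        context = tf.reduce_sum(inputs * tf.expand_dims(weights, -1), axis=1)
        return context, weights                                  # path representation + scores

# The scores are normalised over the whole sequence, so the same channel can
# receive different weights in different journeys.
context, w = AttentionPooling()(tf.random.normal([2, 4, 32]))
print(w.shape)  # (2, 4): one score per touchpoint in each of the two sequences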

User context information (maroon layer)

The neural network welcomes information specific to a user, such as device type, device name, and operating system. This information opens up further possibilities for finding patterns in the data, enabling a much more informed prediction about the sequence's conversion. It is captured numerically in the dense layer.

Output layer (green layer)

The probability of conversion is predicted using the output from the attention layer and the dense layer.
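
A minimal sketch of this final stage is below, with the attention context vector and the user-context features represented as placeholder inputs; the layer widths are illustrative assumptions.

from tensorflow.keras import Input, Model, layers

attention_context = Input(shape=(32,), name="attention_context")  # from the attention layer
user_context = Input(shape=(10,), name="user_context")            # encoded device type, OS, ...

dense_context = layers.Dense(16, activation="relu")(user_context)   # maroon layer
merged = layers.Concatenate()([attention_context, dense_context])
p_conversion = layers.Dense(1, activation="sigmoid")(merged)        # green layer: P(conversion)

head = Model([attention_context, user_context], p_conversion)
head.summary()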

Attention layer with time decay

This can be used instead of the default attention layer. It uses the time elapsed relative to the last channel in the sequence to penalize attention/attribution.

The decay parameter lambda is learnt from the training data, and it clearly must be positive. When everything else in a sequence is much the same, the network should give more attribution to the last few channels in the sequence, since they are the ones closest to the conversion.
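
Below is a minimal sketch of one way to implement this idea (our reading, not necessarily the exact client implementation): the attention logits are penalised by lambda times the time elapsed before the last touchpoint, with lambda kept positive via a softplus over a learnable scalar.

import tensorflow as tf
from tensorflow.keras import layers

class TimeDecayAttention(layers.Layer):
    """Attention pooling whose scores decay with time before the last touchpoint."""
    def build(self, input_shape):
        seq_shape, _ = input_shape                # [(batch, time, features), (batch, time)]
        d = int(seq_shape[-1])
        self.w = self.add_weight(name="w", shape=(d, 1),
                                 initializer="glorot_uniform")
        self.raw_lambda = self.add_weight(name="raw_lambda", shape=(),
                                          initializer="zeros")
        super().build(input_shape)

    def call(self, inputs):
        states, delta_t = inputs                  # delta_t: time before the last touchpoint
        lam = tf.nn.softplus(self.raw_lambda)     # guarantees lambda > 0
        scores = tf.squeeze(tf.matmul(states, self.w), axis=-1) - lam * delta_t
        weights = tf.nn.softmax(scores, axis=-1)  # earlier touchpoints are penalised more
        context = tf.reduce_sum(states * tf.expand_dims(weights, -1), axis=1)
        return context, weights

# Illustrative usage: delta_t in days before the final touchpoint of each journey.
states = tf.random.normal([2, 4, 32])
delta_t = tf.constant([[3.0, 2.0, 1.0, 0.0], [7.0, 4.0, 2.0, 0.0]])
context, w = TimeDecayAttention()([states, delta_t])
print(w.shape)  # (2, 4)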

Results

Area under ROC

This section tabulates the performance on, and the attribution values extracted from, the mock data. It is intended only for comparing the models.

Model                                       AUC
Attention Neural Network                    0.89
Attention Neural Network with time decay    0.96


This plot shows the distribution of the probabilities predicted by the neural network. Most of the predictions for unconverted sequences were below 0.2, whereas those for converted sequences were above 0.8. This illustrates the power of the network in separating converted from unconverted sequences.

This shows the ROC of the Time Decay Attention Neural Network.

Disclaimer: The dollar attribution values shown here are for presentation purposes only and are derived from the mock data, not from the client's actual figures.


Dollar attribution


Channel           Attention NN    Attention NN with Time decay
Facebook          44,260          43,220
Instagram         20,734          16,673
Online Display     9,430           6,479
Online Video      20,987          26,309
Paid Search       14,818          17,547



We Hope You Have Enjoyed This e-Book

About Express Analytics

Express Analytics offers a slew of services such as retail analytics, business intelligence and analytics solutions, and customer analytics. Our customer data platform "Oyster" is a world-class data unification platform for B2C and B2B businesses. To know more:

Get In Touch

(c) 2020