In an increasingly data-conscious world, your data team might worry that signing up for an experimentation platform like Optimizely Web could lead to GDPR violations or the mishandling of your customers' personal data.
The good news is that Optimizely Web can be configured to work with whatever internal data policies you may have. In this article, I will explain how Optimizely Web can be configured to only track anonymous data 📈📈📈
GDPR
A common concern for companies adding an experimentation platform to their technology stack is that it might violate GDPR. The concern is understandable, but experimentation (and the data it needs to track) is a different beast compared to other types of software like a CDP, an EMP, or even an analytics platform.
An experimentation platform is less bothered about identifying individual people and more interested in targeting groups of people who behave in similar ways. When bucketing, the decision on whether a user should be added to an experiment, and which variation that user should be shown, is based on either randomisation or criteria that apply to everyone equally.
Tracking these interactions does not require personally identifiable information (PII) to be collected. An experimentation tool only cares that user X clicked on button Y. It does not care in the slightest that Jon Jones clicked on button Y.
These two scenarios generate the same events, so they might sound similar, but in terms of GDPR and data protection they are very different.
To be clear, GDPR is aimed at protecting data stored about individuals. It specifically covers data about identifiable people, not companies or generic groups of users.
Unique IDs
Taking Optimizely out of the equation for a minute, if you WANTED to track personal data, the key to making that process work would be some form of unique identifier that maps a user to an account. Optimizely does need a way to track users, but this ID must NOT be PII.
By default, Optimizely Web uses a randomly generated ID as the basis for bucketing and event mapping. This ID can also be changed to an ID of the client's choosing.
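To make the idea of a random ID driving bucketing more concrete, here is a minimal Python sketch. It is not Optimizely's actual algorithm. The function names, the experiment ID, and the use of SHA-256 are all assumptions chosen for illustration. The point is only that a random ID is enough to put a visitor into the same variation every time, without knowing who they are.

```python
import hashlib
import uuid


def new_visitor_id() -> str:
    """Return a random identifier that carries no personal information."""
    return uuid.uuid4().hex


def assign_variation(visitor_id: str, experiment_id: str, variations: list[str]) -> str:
    """Deterministically map a visitor to one variation of an experiment.

    Hypothetical helper: SHA-256 is used here as a stand-in hash,
    not because Optimizely is known to use it.
    """
    key = f"{experiment_id}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Turn the first 8 bytes of the hash into a number and pick a slot.
    bucket = int.from_bytes(digest[:8], "big") % len(variations)
    return variations[bucket]


visitor_id = new_visitor_id()
variation = assign_variation(visitor_id, "exp_123", ["control", "variation_1"])
print(visitor_id, variation)
```

Because the variation is derived from the ID, the same visitor keeps seeing the same experience on every visit, and nothing in the process needs a name or an email address.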
In theory, it is possible for a client to change the Optimizely ID to something like a customer's email address. If a client configured Optimizely like this, Optimizely would start collecting PII.
The key point here is that it is against Optimizely's DPA (Section 2) and Privacy Webpage (https://www.optimizely.com/trust-center/privacy/) to use identifiable customer data as the value for the ID. In other words, if you used an email address as your Optimizely ID, for whatever reason, you would be in violation of your contract with Optimizely.
If you are reading this and you have this specific use case, Optimizely's implementation advice is to hash the email address before sending it to Optimizely. This way Optimizely can still tell visitors apart, but it cannot map its data back to a named individual on its own, which helps you avoid potential GDPR violations!
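As a rough illustration of the hashing advice, the sketch below turns an email address into an opaque ID before it goes anywhere near Optimizely. The article only says "hash". The choice of HMAC-SHA-256, the normalisation step, and the names hashed_visitor_id and secret_key are my own assumptions. A keyed hash is used because a plain hash of an email can be guessed by hashing a list of known addresses, and the key stays inside your own systems.

```python
import hashlib
import hmac


def hashed_visitor_id(email: str, secret_key: bytes) -> str:
    """Derive an opaque visitor ID from an email address.

    Hypothetical helper: the key is assumed to be a secret that only
    your own systems hold and that is never sent to a third party.
    """
    # Normalise first so " Jon@Example.com " and "jon@example.com" match.
    normalized = email.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Example key for illustration only; load a real one from a secret store.
visitor_id = hashed_visitor_id(" Jon@Example.com ", b"example-key-do-not-use")
print(visitor_id)
```

The resulting value is a 64-character hex string with no readable trace of the email address, and it is this value, not the email, that would be set as the Optimizely ID.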
We have now covered how the Optimizely ID does not violate GDPR when collecting visitor data. So what other data does Optimizely collect, and should you be worried about any of it?
- Visitor ID
- IP Address
- Timestamp
- Session ID
- Experiment ID
- Variation ID
More information on this collection process can be found here. In relation to GDPR, aside from IP address, none of these fields should raise any red flags. That is why the next section is solely focused on IP address and its relation to GDPR!
IP Address
When collecting data for experiments, Optimizely also collects the requesting device's IP address. Optimizely stores the IP address not for tracking purposes, but for bot mitigation purposes.
Regardless of the purpose, IP address is something of a grey area in terms of whether it gets classed as personal data under GDPR or not.
The law around IP and GDPR is very specific. Article 4 defines personal data. Recital 30 states that IP addresses count as online identifiers. There is no question that an IP address can be counted as PII. The grey area is the context in which it is used.
For example, a web hosting company tracks IP addresses, status codes and URLs within its server logs. In this context, IP address is not considered personal information. On the other hand, if that same company also recorded the individual's name, email address, or postal address within the same log entry, the IP address would definitely fall under GDPR, as it could be used to track an identifiable person.
In terms of Optimizely, your first consideration is the point made above. Optimizely does not want to track personal data. Optimizely only maps an IP address to a random ID. As long as you successfully anonymize data, then that data is not counted as personally identifiable data and GDPR does not apply.
Some companies understandably won't want to take the slightest risk around potential GDPR violations, grey area or not. In these instances, Optimizely can then be further configured to anonymise the IP address that it stores. This can be done in two ways:
Anonymize Visitor IP Address: Optimizely can be configured to anonymize your visitors' IP addresses before they are stored in the Optimizely event logs.
When IP anonymization mode is enabled, Optimizely will not store the full visitor IP address. Instead, for IPv4 addresses the last octet is zeroed out, and for IPv6 addresses the low 64 bits are zeroed out. More information about this process can be found here.
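To show what that zeroing looks like in practice, here is a small Python sketch that applies the rule exactly as described above, using the standard ipaddress module. It is an illustration of the masking, not Optimizely's own code, and the function name anonymize_ip is hypothetical.

```python
import ipaddress


def anonymize_ip(raw: str) -> str:
    """Zero the last octet (IPv4) or the low 64 bits (IPv6) of an address."""
    ip = ipaddress.ip_address(raw)
    if ip.version == 4:
        # Keep the first three octets, clear the last one.
        return str(ipaddress.IPv4Address(int(ip) & 0xFFFFFF00))
    # Keep the high 64 bits, clear the low 64 bits.
    high_64_mask = ((1 << 64) - 1) << 64
    return str(ipaddress.IPv6Address(int(ip) & high_64_mask))


print(anonymize_ip("203.0.113.57"))                          # 203.0.113.0
print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))  # 2001:db8:85a3:8d3::
```

After masking, the stored value points at a block of addresses rather than a single device.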
Proxy option: The second option is to implement a proxy between Optimizely and the client. Optimizely support can change the URL destination for the Optimizely Web snippet on request from any client.
When this happens, instead of requests being sent directly to Optimizely, they will be routed via a proxy. This could be either a proxy server or a serverless function.
When this setting is enabled, Optimizely will store the proxy's IP address in its database rather than the client's IP address. This means Optimizely will not hold an IP address that could be tied back to an individual visitor.
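The core job of such a proxy can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not a production proxy and not an official Optimizely integration. The header list, the function names, and the upstream_url parameter are all hypothetical, and the real destination URL would be whatever you agree with Optimizely support. Because the proxy opens its own connection, Optimizely sees the proxy's IP address. The one extra thing to watch is that the proxy does not pass the visitor's IP address along in a forwarding header.

```python
import urllib.request

# Headers that could reveal the visitor's IP address, plus hop-specific
# headers that must be rebuilt for the new request (assumed list).
DROPPED_HEADERS = {
    "x-forwarded-for",
    "x-real-ip",
    "forwarded",
    "true-client-ip",
    "host",
    "content-length",
}


def headers_for_upstream(incoming: dict[str, str]) -> dict[str, str]:
    """Copy request headers, leaving out anything in DROPPED_HEADERS."""
    return {
        name: value
        for name, value in incoming.items()
        if name.lower() not in DROPPED_HEADERS
    }


def forward_event(body: bytes, incoming: dict[str, str], upstream_url: str) -> int:
    """Send the event on to the upstream endpoint and return its status code."""
    request = urllib.request.Request(
        upstream_url,
        data=body,
        headers=headers_for_upstream(incoming),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

In a serverless function, forward_event would be called from the handler with the incoming body and headers.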
If you want to go this route, you will need to make sure that your intermediary service can scale. If the proxy or function hits a throttling or usage cap, event data could be lost. To make the requests more robust, you could also put the service behind an API gateway. This will allow requests to be versioned or rerouted at a later date without impacting the service.
The aim of this article was to spread some knowledge about what data Optimizely stores and the options available to help you mitigate any risks you may have concerning data protection and GDPR. As you will hopefully see, Optimizely can be configured so that it only tracks anonymous data, which keeps you on the right side of your data protection obligations.
Happy coding 🤘