Google the definition of a Data Clean Room (DCR), or ask ChatGPT (it’s 2023, after all), and you will get several different answers sprinkled with terms like secure, controlled, anonymized, privacy, and aggregated, but no clear definition of what a Data Clean Room actually is.
A Data Clean Room is not just another ETL (Extract, Transform, and Load) machine; a product or service needs specific capabilities to qualify as one. The phrase comes from industrial manufacturing, where a clean room is a controlled area designed to minimize contamination and maintain the integrity of the product.
In that sense, a Data Clean Room can be defined as an environment that minimizes exposure of personal data and maintains the integrity of the individual’s privacy. But that’s easier said than ‘defined or understood’.
Understanding Data Clean Rooms
Over the last two to three years, the industry has shown increasing interest in using first-party data sets for advertising, particularly for targeting audience segments. Data Clean Rooms have emerged as one of the solutions for putting first-party data to work across several marketing and advertising use cases.
Many providers offer Data Clean Room services, ranging from independent startups like Infosum, Habu, and Decentriq to established players like Amazon Web Services and Snowflake. Even large publishers and walled gardens offer clean room products on their own platforms.
Earlier this month, the IAB Tech Lab released its Data Clean Room Guidance and Recommended Practices, a definitive guide to understanding Data Clean Rooms. The guide highlights:
- What they are, and the common capabilities expected of a Data Clean Room product or service
- Their potential applications for addressability and activation, insights and enrichment, and measurement and attribution
- How they work, outlining the roles and the different operations performed in a Data Clean Room
- The data privacy, security, and governance controls you should expect from a Data Clean Room
- The constraints and limitations to expect when engaging in a Data Clean Room
- How to select the one that is right for your needs
Utilizing Data Clean Rooms
While it is exciting to see the industry adopt a new privacy technology, working with multiple providers is challenging because of the added cost and friction of preparing, managing, and extracting data and outputs from Data Clean Rooms that are each set up differently. And that is not all: using the outputs with different business partners poses similar challenges, for example when targeting an audience segment through different ad serving systems such as SSPs and DSPs.
In conjunction with the Data Clean Room Guidance and Recommended Practices, IAB Tech Lab launched the Open Private Join and Activation (OPJA) specification, the first in a series of upcoming Data Clean Room interoperability standards.
OPJA is an operation for finding overlapping audiences between buyer and seller data sets, so that the buyer can target those audiences on the seller’s digital properties via programmatic supply-side and demand-side ad serving systems.
OPJA deploys security and privacy technologies to accomplish three key design goals:
- Security of personal information
- Privacy of individual identity
- Privacy of audience membership
To achieve these goals, OPJA describes two components:
- The Matching System defines the input and output structures and formats, and the matching techniques, using well-established methods that leverage privacy technologies: Private Set Intersection or Trusted Execution Environment
- The Activation Protocol defines the encryption labeling techniques and encryption protocols, and how the publisher and advertiser should use the resulting outputs of the matching system
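To make the idea of Private Set Intersection a little more concrete, here is a minimal Python sketch of one classic approach, double blinding with commutative exponentiation. It is an illustration under loud assumptions, not the matching technique OPJA specifies: the function names, the email normalization, and the toy 127-bit prime are all hypothetical choices made for this example, and the parameters are far too small to be secure. For simplicity a single function plays both parties; in a real deployment each side would keep its own secret, and who gets to learn the result is a protocol design decision this sketch does not model.

```python
import hashlib
import math
import secrets

# ASSUMPTION: toy modulus (the Mersenne prime 2**127 - 1). Illustration only;
# a real system would use a vetted group and a vetted PSI library.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Normalize an identifier and map it to an integer modulo P."""
    normalized = identifier.strip().lower()  # hypothetical normalization rule
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P


def new_secret() -> int:
    """Pick a random exponent coprime to P - 1, so blinding is one-to-one."""
    while True:
        k = secrets.randbelow(P - 3) + 2  # value in the range 2 .. P - 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values, secret):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def overlap(buyer_ids, seller_ids):
    """Return the buyer identifiers that also appear in the seller list."""
    buyer_secret = new_secret()   # would be known only to the buyer
    seller_secret = new_secret()  # would be known only to the seller

    # Each side blinds its own hashed identifiers with its own secret.
    buyer_once = blind([hash_to_group(x) for x in buyer_ids], buyer_secret)
    seller_once = blind([hash_to_group(x) for x in seller_ids], seller_secret)

    # Each side then blinds the other side's values a second time.
    # Exponentiation commutes, so matching identifiers end up equal.
    buyer_twice = blind(buyer_once, seller_secret)
    seller_twice = set(blind(seller_once, buyer_secret))

    return [x for x, t in zip(buyer_ids, buyer_twice) if t in seller_twice]
```

Calling `overlap(["ann@example.com", "bob@example.com"], ["bob@example.com", "cat@example.com"])` returns `["bob@example.com"]`. The point of the double blinding is that the comparison happens on values neither party could have produced alone, so the raw identifiers outside the overlap are never exchanged in the clear.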
Security and privacy requirements do not end with the Data Clean Room components: the design goals must also be maintained when the outputs are used in activation systems. OPJA lists potential privacy and security threat scenarios, along with design requirements that activation systems must meet to preserve the three design goals.
Looking Into The Future
This is just the beginning of IAB Tech Lab’s work on Data Clean Rooms. We will develop more interoperability specifications for other use cases, e.g., measurement and attribution.
IAB Tech Lab will also develop extensions to other standards so that Data Clean Room outputs can be deployed while preserving the privacy and security design goals.
As the use of Data Clean Rooms matures, the industry needs to come to a consensus about how Data Clean Rooms operate and develop a collection of canonical use cases and standards for interoperability among the participants engaging in a Data Clean Room.
To learn more about the Data Clean Room Guidance & Recommended Practices and the Open Private Join & Activation (OPJA) Specification, please click here. Both releases are available for public comment until April 17, 2023.