Secure Data Collaboration with Databricks Clean Rooms: Unlocking Insights Safely

Databricks

Date : 04/22/2025

Explore how Databricks Clean Rooms enable privacy-first data sharing across industries. Learn about FY26 updates, use cases, and the future of secure collaboration.

Sourav Roy

Senior Manager, Data Engineering

Table of contents

Secure Data Collaboration with Databricks Clean Rooms: Unlocking Insights Safely

  • Here’s why secure data collaboration is essential in today’s landscape:
  • The Clean Room Solution
  • The Journey of Databricks Clean Rooms to FY26: A Privacy-First Evolution
  • Clean Room — Use Cases:
  • Clean Room — Technical Deep Dive
  • Clean room Unity Catalog Privileges
  •  Clean Room — Cost Structure:
  • Upcoming and New releases in Clean Room:
  • The Takeaway
  • The Future of Data Collaboration: Privacy-First, Insight-Driven

In an era where data drives every decision, every strategy, and every innovation, businesses face a critical challenge — how to collaborate on data without compromising privacy, security, or compliance. Traditional data-sharing methods often require duplicating data, exposing sensitive information, and increasing security risks. But with growing regulatory scrutiny and escalating cyber threats, organizations need a better way to share insights without sharing raw data.

This is where secure data collaboration comes in. Industries such as finance, healthcare, and retail are already embracing technologies like Databricks Clean Rooms to securely analyze shared datasets without ever moving or exposing underlying data. Whether it is banks assessing credit risk, advertisers measuring campaigns, or pharmaceutical companies conducting clinical research, the ability to collaborate on data while maintaining strict privacy controls is becoming a business necessity rather than a luxury.

With the rise of AI, multi-cloud strategies, and data-driven innovation, secure data collaboration is not just about protection — it’s about unlocking new opportunities while staying compliant and secure. In this blog, we’ll explore the latest FY26 enhancements to Databricks Clean Rooms, how businesses are leveraging these advancements, and why privacy-first data sharing is shaping the future.

Here’s why secure data collaboration is essential in today’s landscape:

1. Rising Data Privacy Regulations

With regulations like GDPR, CCPA, and HIPAA, companies face strict requirements on how data is handled. Secure data collaboration ensures compliance by allowing businesses to share insights without exposing raw data.

2. The Need for Cross-Company Insights Without Risk

Enterprises often need to collaborate with partners, vendors, and research institutions to gain deeper insights.

3. Growing Cybersecurity Threats

Data breaches are costly, with the average cost of a breach reaching $4.45 million in 2023 (IBM report). Secure collaboration methods minimize data movement, reducing the attack surface and keeping sensitive data within controlled environments.

4. AI and Data-Driven Innovation

AI-driven insights are revolutionizing industries, but models need diverse datasets. Secure data collaboration allows multiple organizations to train AI models on combined datasets without direct data exposure, ensuring innovation while maintaining security.

5. Competitive Advantage Through Data Sharing

Companies that harness external data for better decision-making gain a competitive edge. Example: Banks and fintech firms use clean rooms to assess customer creditworthiness without violating privacy laws.

The Clean Room Solution

Databricks Clean Rooms eliminate the risks of traditional data sharing by enabling secure, compliant, and efficient collaboration without moving or exposing raw data. They provide privacy-preserving collaboration in which multiple parties can:

  1. Run queries on shared datasets without ever seeing or accessing raw data, collaborating securely with no direct access to the underlying data.
  2. Maintain full control over permissions to restrict what partners can compute, limiting analysis to pre-approved code and use cases.
  3. Ensure regulatory compliance by keeping data within secure environments, with isolated compute infrastructure for added data security.

The Journey of Databricks Clean Rooms to FY26: A Privacy-First Evolution

Databricks introduced Clean Rooms as a native, scalable, and fully managed solution built on Unity Catalog, allowing multiple parties to collaborate on shared datasets securely without moving or exposing raw data.

FY24–FY25: Laying the Foundation for Secure Collaboration

  • Initial Launch (AWS & Azure) — Clean Rooms became available on AWS and Azure, allowing businesses to securely analyze shared datasets.
  • Query-Based Collaboration — Instead of exchanging raw data, partners could now run SQL queries while data owners retained full control.
  • Fine-Grained Access Controls — Organizations could set permissions at the column level, ensuring partners only accessed approved datasets.
  • Industry Adoption — Early adopters included financial institutions, healthcare providers, and retail brands leveraging Clean Rooms for fraud detection, clinical trials, and marketing attribution.

FY26: Major Enhancements for Scalable, Enterprise-Grade Adoption

With increasing adoption across industries, Databricks Clean Rooms in FY26 introduced new features to improve security, compliance, and ease of use:

  • General Availability on Azure & AWS — Ensuring robust, enterprise-wide scalability.
  • HIPAA Compliance: Expanding use cases in healthcare and life sciences, making Clean Rooms an ideal solution for processing sensitive patient data.
  • Federated Querying (in Private Preview): Use Lakehouse Federation to collaborate with partners across clouds and data platforms without replicating or moving all the data. Data in BigQuery, Snowflake, Redshift, and similar systems can be connected to a clean room, and Delta Sharing shares it to the central clean room.
  • Output Tables: Temporary, read-only tables that partners can analyze while preserving data privacy. Approved analytical outputs are generated and shared directly in Unity Catalog across AWS and Azure, where they can feed further analysis. The outputs of a notebook run are Delta-shared back to the individual collaborator's workspace.
  • Self-Collaboration: Organizations can create clean rooms within their own metastore so that internal teams can collaborate securely. This is useful for internal clean room use cases and proofs of concept (POCs).
  • Automated APIs: Enterprises can programmatically manage and scale their clean room implementations, for example by automating clean room setup, monitoring, and orchestration (such as adding data assets and notebooks).
  • Cross-cloud and cross-region collaboration, Python and SQL notebooks, structured and unstructured data, and private libraries are also GA for Clean Rooms.

Clean Room — Use Cases:

Databricks Clean Rooms are used for everything from cross-joins to building AI models on sensitive data. One early use case was segmenting audiences across multiple data sources to plan a campaign. Clean rooms are also used to calculate campaign performance through measurement and attribution.

Mastercard, a financial services customer, is using clean rooms to give its partners access to some of its data assets.

Similarly, customers in other domains such as healthcare and manufacturing use Databricks Clean Rooms for clinical trials, predictive maintenance, and more.

Clean Room — Technical Deep Dive

  1. Databricks Clean Rooms are built on the foundation of Unity Catalog and Delta Sharing.
  2. One of the collaborators creates a clean room and invites the other party to collaborate. In the background, a central clean room environment is created in Databricks, isolated from both parties. The collaborators then use Delta Sharing to share assets from their own Unity Catalog to the central clean room environment.
  3. Once the data is available, analysis runs via notebooks on the sensitive data assets in the clean room. The notebook code is mutually approved by both parties beforehand.

Clean room Unity Catalog Privileges

CREATE_CLEAN_ROOM allows creating a clean room. It is available to a metastore admin, or to a user to whom a metastore admin has granted this specific privilege

BROWSE access enables browsing (discovering and viewing) the clean rooms

MODIFY_CLEAN_ROOM allows modifying the clean room's data assets

EXECUTE_CLEAN_ROOM_TASK grants permission to run notebooks in the clean room

Notebook Approvals in Clean room:

  1. Clean Rooms currently support two-party collaboration. (Multi-party collaboration is on its way.)
  2. Approval currently happens through the Implicit Approval Model.

What is Implicit Approval Model?

When Collaborator A uploads a notebook to the clean room, Collaborator B is granted access to review and run it. By running it, Collaborator B implicitly approves the notebook uploaded by Collaborator A. You cannot run your own notebook.
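The rule can be modelled in a few lines of Python. This is only an illustrative sketch, not Databricks code: the `Notebook` class, the `can_run` function, and the collaborator names are hypothetical, and the only rule encoded is the one described above (the other collaborator may run an uploaded notebook, the uploader may not).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notebook:
    name: str
    uploaded_by: str  # the collaborator who uploaded the notebook


def can_run(notebook: Notebook, runner: str, collaborators: set) -> bool:
    """Implicit approval: a collaborator may run a notebook only if
    they belong to the clean room and did not upload it themselves."""
    return runner in collaborators and runner != notebook.uploaded_by


# Hypothetical two-party clean room
collaborators = {"collaborator_a", "collaborator_b"}
nb = Notebook(name="overlap_analysis", uploaded_by="collaborator_a")

print(can_run(nb, "collaborator_b", collaborators))  # True
print(can_run(nb, "collaborator_a", collaborators))  # False
```

The uploader is excluded by construction, which is the restriction the explicit approval model is meant to lift.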

What is Egress Control?

  1. It lets you switch internet access on or off (restricted mode in clean rooms).
  2. Delta-shared assets are implicitly allow-listed.
  3. It lets you specify custom internet destinations in restricted mode. Once configured, this setting is immutable and cannot be edited.
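The three egress rules above can be sketched as a small immutable policy object. This is a conceptual model only, assuming nothing about how Databricks implements egress control: `EgressPolicy`, its fields, and the example hostnames are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen mirrors "immutable once configured"
class EgressPolicy:
    restricted: bool  # True means internet access is switched off
    allowed_destinations: frozenset = frozenset()  # custom destinations

    def allows(self, destination: str, is_delta_shared_asset: bool = False) -> bool:
        if is_delta_shared_asset:
            return True  # Delta-shared assets are implicitly allow-listed
        if not self.restricted:
            return True  # unrestricted mode: internet is on
        return destination in self.allowed_destinations


# Hypothetical restricted clean room with one custom destination
policy = EgressPolicy(restricted=True,
                      allowed_destinations=frozenset({"api.example.com"}))

print(policy.allows("api.example.com"))    # True
print(policy.allows("other.example.org"))  # False

try:
    policy.restricted = False  # editing after creation is rejected
except FrozenInstanceError:
    print("policy is immutable")
```

A frozen dataclass is used here purely to make the "no going back" property visible: any attempt to change the policy after creation raises an error.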

 Output tables deep dive:

  1. Customers often use notebook output in subsequent workflows. The output is stored temporarily in an output table for 30 days, with no storage limit, and is Delta-shared back to the runner. After 30 days the data is deleted from the central clean room environment. If longer retention is needed, create a local copy.
  2. Output tables support workflows: a clean room task can serve as an intermediate step whose output tables are consumed by subsequent steps.
  3. Output tables are GA on AWS and Azure; at the time of writing, GCP support is in private preview.
  4. The code that creates an output table lives in the notebook itself. The catalog and schema of the three-part namespace are supplied as parameters. Each notebook run produces a new schema, which is Delta-shared back to the notebook runner's workspace.
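To make the 30-day retention window concrete, here is a small date calculation in Python. It is a planning aid under stated assumptions, not a Databricks API: the function names are hypothetical, and treating deletion as happening exactly 30 calendar days after creation is an assumption, since the exact cutoff is not specified here.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention window for output tables, as described above


def deletion_date(created_on: date) -> date:
    """Date after which the output table is assumed to be deleted."""
    return created_on + timedelta(days=RETENTION_DAYS)


def needs_local_copy(created_on: date, needed_until: date) -> bool:
    """True if the data is needed beyond the retention window."""
    return needed_until > deletion_date(created_on)


created = date(2025, 4, 22)
print(deletion_date(created))                        # 2025-05-22
print(needs_local_copy(created, date(2025, 6, 30)))  # True
print(needs_local_copy(created, date(2025, 5, 1)))   # False
```

If `needs_local_copy` returns True for a downstream workflow, the output should be copied to a local table before the window closes.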

 Clean Room — Cost Structure:

  1. For two-party collaboration, the charge is $50 per day for each collaborator that is set up. For multi-party collaboration, the charge is n × $50 per day, where n is the number of collaborators set up.
  2. The number of users does not affect the clean room cost. Any number of users can be added to create data assets, run jobs, and so on, at no additional charge.
  3. Databricks also charges for clean room usage of serverless compute, storage, and transfer services.
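As a quick worked example of the base charge, the n × $50 per day formula can be written as a small Python helper. The function name is hypothetical, and the figure covers only the per-collaborator charge; serverless compute, storage, and transfer usage are billed separately and are not modelled here.

```python
DAILY_RATE_USD = 50  # per collaborator per day, as stated above


def base_cost_usd(collaborators: int, days: int) -> int:
    """Base clean room charge: collaborators x $50 x days.
    The number of users does not enter the formula."""
    if collaborators < 1 or days < 0:
        raise ValueError("collaborators must be >= 1 and days >= 0")
    return collaborators * DAILY_RATE_USD * days


print(base_cost_usd(collaborators=2, days=1))   # 100
print(base_cost_usd(collaborators=2, days=30))  # 3000
print(base_cost_usd(collaborators=4, days=30))  # 6000
```

So a two-party clean room kept running for 30 days carries a base charge of $3,000 before any usage-based costs.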

Upcoming and New releases in Clean Room:

We covered the implicit approval model, which is currently GA in Clean Rooms. The following new and upcoming models differ from it:

  1. Explicit Approval Model: In this model, a collaborator can upload a notebook and seek approvals from other collaborators to get permission to run its own notebook. The collaborator can specify who will be notebook runner including themselves. The notebook status can be Pending Approval, Approved, Rejected, etc.
  2. Auto Approval Rules: For trusted partners, rules can be set up in advance so that whenever a collaborator uploads a notebook, the notebook runner (who can be the uploader) is pre-approved to run it. This streamlines approval, removing manual effort and avoiding delays in granting permissions.
  3. Multi-party collaboration: Databricks is also launching multi-party collaboration, which will allow 4 collaborators in a clean room. This suits advanced use cases where multiple partners need to come together in a single clean room.

Source: Databricks TKO, FY 26
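The explicit approval model can be pictured as a small state machine. The sketch below is a hypothetical Python model, not the Databricks implementation: the class and method names are invented, and the rule that every approver must approve (with a single rejection rejecting the request) is an assumption made for illustration. Only the status names and the fact that the uploader may name themselves as runner come from the description above.

```python
from enum import Enum


class Status(Enum):
    PENDING_APPROVAL = "Pending Approval"
    APPROVED = "Approved"
    REJECTED = "Rejected"


class NotebookRequest:
    def __init__(self, uploader: str, runner: str, approvers: set):
        self.uploader = uploader
        self.runner = runner  # may be the uploader in this model
        self.decisions = {a: None for a in approvers}  # None = undecided

    def decide(self, approver: str, approve: bool) -> None:
        if approver not in self.decisions:
            raise KeyError(f"{approver} is not an approver")
        self.decisions[approver] = approve

    @property
    def status(self) -> Status:
        values = list(self.decisions.values())
        if any(v is False for v in values):
            return Status.REJECTED  # assumption: one rejection is final
        if all(v is True for v in values):
            return Status.APPROVED  # assumption: all must approve
        return Status.PENDING_APPROVAL

    def can_run(self, who: str) -> bool:
        return who == self.runner and self.status is Status.APPROVED


# Collaborator A uploads a notebook and names themselves as runner
req = NotebookRequest(uploader="a", runner="a", approvers={"b"})
print(req.status.value)  # Pending Approval
req.decide("b", approve=True)
print(req.status.value)  # Approved
print(req.can_run("a"))  # True
```

Compared with the implicit model, the runner here can be the uploader, but only once the other collaborator has approved.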

The Takeaway

From a secure SQL-based solution to a fully managed, compliance-ready platform, Databricks Clean Rooms have transformed the way organizations collaborate on data. The FY26 enhancements are a major leap forward, empowering businesses to maximize data-driven insights while ensuring privacy and security.

The Future of Data Collaboration: Privacy-First, Insight-Driven

As businesses navigate an increasingly data-driven world, the balance between collaboration and privacy has never been more critical. Traditional data-sharing models are no longer sustainable — they expose organizations to security risks, compliance challenges, and inefficiencies.

Databricks Clean Rooms redefine what it means to share data securely, empowering organizations to unlock insights without exposure. With the FY26 enhancements, we’re seeing a shift toward federated, AI-driven, and multi-cloud collaboration, where businesses can extract value from data without ever transferring ownership.

But this is just the beginning. The future of Clean Rooms will likely extend beyond structured datasets into real-time analytics, AI model training, and automated compliance enforcement. As regulatory landscapes evolve, technologies like these will become the gold standard for privacy-first data collaboration.

The question is no longer whether businesses should embrace Clean Rooms, but how quickly they can integrate them to stay ahead. Those who do will not only safeguard their data but also unlock unprecedented opportunities for innovation, partnerships, and smarter decision-making.

The future belongs to organizations that collaborate securely — without compromise. Are you ready?
