As a fast-scaling retail brand grew its D2C and marketplace presence across owned sites and major marketplaces, increasing customer interactions introduced a familiar yet complex challenge: fragmented customer data.
The marketing and analytics teams lacked a unified view of customers, making it difficult to deliver personalized experiences, measure campaign effectiveness, or even recognize the same customer across platforms.
Leadership wanted to modernize the data stack with a Composable Customer Data Platform (CDP), one that leveraged existing cloud infrastructure, supported advanced analytics, and provided flexibility to evolve with new tools.
Before implementation, the company faced four recurring problems:
- Customer transactions, website sessions, and marketing interactions were siloed across e-commerce platforms, CRMs, and analytics tools.
- The same customer appeared as multiple identities across systems because each system used different identifiers (emails, device IDs, marketplace usernames).
- Campaigns were channel-specific and could not draw on holistic behavioral data.
- Creating audience segments or running retargeting campaigns required extensive coordination between data and marketing teams.
These issues led to inefficient marketing spend, slow decision-making, and a lack of personalization at scale.
The goal was to build a Composable CDP on Databricks with in-built identity resolution and activation capabilities, connecting all customer data sources into one cohesive, actionable system.
We designed and deployed a Composable Customer Data Platform (CDP) built entirely in-house, integrating ingestion, identity resolution, segmentation, and activation, all modular and fully controlled by the client's data team.
We built scalable data pipelines to ingest and harmonize data from key sources:
- Transactional and order-level data
- Customer behavior, traffic source, and conversion metrics
- Campaign data, audience lists, and UTM parameters
All data flowed into a unified Delta Lake architecture with metadata consistency and schema enforcement. This enabled high-performance querying and downstream ML applications.
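
To make the harmonization and schema-enforcement idea concrete, here is a minimal pure-Python sketch. It is a simplified stand-in, not Delta Lake's actual rules or the client's pipeline: the `UNIFIED_SCHEMA` columns, the raw order field names (`email`, `ordered_at`, `total`), and both function names are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical unified schema: column name -> required Python type.
UNIFIED_SCHEMA = {
    "source": str,
    "customer_key": str,
    "event_type": str,
    "event_time": datetime,
    "amount": float,
}


def enforce_schema(record: dict) -> dict:
    """Reject records whose columns or types differ from UNIFIED_SCHEMA."""
    if set(record) != set(UNIFIED_SCHEMA):
        mismatched = sorted(set(record) ^ set(UNIFIED_SCHEMA))
        raise ValueError(f"column mismatch: {mismatched}")
    for column, expected in UNIFIED_SCHEMA.items():
        if not isinstance(record[column], expected):
            raise TypeError(f"{column} must be {expected.__name__}")
    return record


def harmonize_order(raw: dict) -> dict:
    """Map a hypothetical e-commerce order payload onto the unified schema."""
    return enforce_schema({
        "source": "ecommerce",
        "customer_key": raw["email"].strip().lower(),
        "event_type": "order",
        "event_time": datetime.fromisoformat(raw["ordered_at"]),
        "amount": float(raw["total"]),
    })
```

Each source would get its own `harmonize_*` mapper, so that every record passes the same check before it lands in the shared table.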
At the heart of the CDP, we implemented a robust identity resolution system combining deterministic and probabilistic techniques.
| Approach | Description | Examples |
|---|---|---|
| Deterministic Matching | Direct one-to-one mapping based on exact identifiers | Email, Phone number |
| Probabilistic Matching | Pattern-based similarity matching using fuzzy logic and behavioral overlaps | Address, First name, Last name |
This hybrid approach stitched together fragmented customer records across e-commerce, analytics, and marketing data sources, producing a "Golden Customer Record": a single unified profile per user.
Identity resolution results were stored in a graph-based structure within Databricks, allowing flexible linkages and lineage tracking for compliance and audits.
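
The hybrid matching described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the deployed system: the record fields, the `0.85` similarity threshold, the use of `difflib.SequenceMatcher` for fuzzy comparison, and the union-find clustering (a simple way to treat matches as edges in a graph) are all choices made for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace."""
    return " ".join(value.lower().split())


def deterministic_match(a: dict, b: dict) -> bool:
    """Exact match on email or phone, when both records carry the field."""
    for field in ("email", "phone"):
        if a.get(field) and b.get(field) and normalize(a[field]) == normalize(b[field]):
            return True
    return False


def profile_text(record: dict) -> str:
    """Name plus address as one string; empty unless all three are present."""
    fields = ("first_name", "last_name", "address")
    if not all(record.get(f) for f in fields):
        return ""
    return normalize(" ".join(record[f] for f in fields))


def probabilistic_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Fuzzy match on name and address; the threshold is illustrative."""
    left, right = profile_text(a), profile_text(b)
    if not left or not right:
        return False
    return SequenceMatcher(None, left, right).ratio() >= threshold


def resolve_identities(records: list) -> list:
    """Cluster record ids with union-find over all matched pairs."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if deterministic_match(a, b) or probabilistic_match(a, b):
            parent[find(a["id"])] = find(b["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), set()).add(r["id"])
    return list(clusters.values())
```

Each resulting cluster corresponds to one candidate golden record. Pairwise comparison is quadratic in the number of records, so a production pipeline would normally add a blocking step to limit which pairs are compared.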
The unified customer profiles powered an activation engine for downstream use cases:
- Real-time segmentation based on attributes (RFM, engagement, purchase recency).
- Feeding into recommendation systems and retention models.
- Seamless sync with Meta and Google Ads via activation pipelines.
- Empowering non-technical teams to self-serve customer lists and campaigns.
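
As one illustration of attribute-based segmentation, the sketch below scores a single unified profile on recency, frequency, and monetary value (RFM). The cutoffs, the 1 to 3 scale, and the segment labels are hypothetical values chosen for this example, and the code runs as a simple batch function rather than in real time.

```python
from datetime import date


def score(value: float, cutoffs: tuple, reverse: bool = False) -> int:
    """Return a 1-3 score: one plus the number of cutoffs the value meets.

    With reverse=True, smaller values score higher (used for recency).
    """
    points = 1 + sum(value >= c for c in cutoffs)
    return 4 - points if reverse else points


def rfm_segment(orders: list, today: date) -> dict:
    """Score one profile; orders is a list of (order_date, amount) tuples."""
    recency_days = (today - max(d for d, _ in orders)).days
    frequency = len(orders)
    monetary = sum(amount for _, amount in orders)

    r = score(recency_days, (30, 90), reverse=True)  # under 30 days -> 3
    f = score(frequency, (2, 5))                     # 5 or more orders -> 3
    m = score(monetary, (100, 500))                  # 500 or more spent -> 3

    if r == 3 and f >= 2:
        label = "active_loyal"
    elif r == 1 and f >= 2:
        label = "at_risk"
    elif r == 1:
        label = "lapsed"
    else:
        label = "developing"
    return {"r": r, "f": f, "m": m, "segment": label}
```

Because the scores are computed from the unified profile, orders placed on different channels count toward the same customer's frequency and spend.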
This composable setup gave the client the flexibility of a CDP without vendor lock-in: every component could be modified or replaced as business needs evolved.
The CDP now serves as the data backbone for marketing, analytics, and growth teams.
| Use Case | Description |
|---|---|
| 360° Customer View | Unified profiles across web, marketplace, and CRM data |
| Personalized Campaigns | Targeted offers based on complete behavior history |
| Marketing Automation | Faster audience building and campaign activation |
| Churn Prediction | Identification of at-risk segments using unified event data |
| Ad Platform Integration | Export of high-value audiences to Meta and Google Ads for lookalike modeling |
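
For the ad platform export, customer-list uploads to Meta and Google Ads generally expect identifiers such as email addresses to be normalized and SHA-256 hashed before they leave the warehouse. The sketch below shows only that preparation step. The profile fields (`email`, `lifetime_value`) and the value cutoff are assumptions for illustration, no platform API is called, and the exact normalization rules should be checked against each platform's current documentation.

```python
import hashlib


def hash_email(email: str) -> str:
    """Trim, lowercase, and SHA-256 hash an email address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_audience(profiles: list, min_lifetime_value: float) -> list:
    """Return de-duplicated hashed emails for profiles at or above a cutoff."""
    hashed = {
        hash_email(p["email"])
        for p in profiles
        if p.get("email") and p.get("lifetime_value", 0) >= min_lifetime_value
    }
    return sorted(hashed)
```

Hashing after identity resolution means duplicate records for the same customer collapse into a single audience entry.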
The Composable CDP enabled the retail brand to turn its data into action with measurable business outcomes.
| Impact Area | Before | After |
|---|---|---|
| Data Access Time | Manual pulls, high dependency on engineering | Automated and self-service segmentation |
| Campaign ROAS | Channel-level targeting only | Significant increase in ROAS through unified audiences |
| Marketing Data Ops Time | Days to build segments | Minutes with automated pipelines |
| Personalization Quality | Limited by data silos | Enhanced recommendations and reactivation rates |
| Scalability | Dependent on vendor tools | Fully composable, owned, and extensible architecture |
- Marketing and analytics teams gained trust in data quality and consistency.
- Personalized campaigns now run across all channels with unified customer context.
- Operational dependency on third-party CDP tools was reduced, saving long-term costs.
"With the composable CDP, our marketing team no longer waits for engineering to build lists. We can see, segment, and activate our customers across every channel in one place."
By implementing a Composable CDP with identity resolution, the client achieved what traditional packaged CDPs often fail to deliver: control, flexibility, and scalability.
The solution unified fragmented data, established a single source of truth for customer intelligence, and powered high-impact marketing automation.
Today, the client operates with real-time customer insight, reduced manual effort, and a personalization engine that drives measurable revenue growth.