Customer.io Data Pipelines is the CDP layer that lets you instrument once and fan out to many tools. For SaaS teams paying Segment $1K-10K/mo, this typically consolidates the spend while keeping the same instrumentation pattern.
Who this is for
SaaS teams on Customer.io Premium+ that currently use Segment, Rudderstack, or another CDP, or that are instrumenting from scratch and want the cleanest path. It suits engineering-led teams that value consolidating tooling and reducing per-event costs.
What you'll need
Step 1
Before touching anything, document the existing data flow. Sources, Destinations, transformations, identify/track call patterns.
Open a doc: list every Source currently feeding your CDP (web app, mobile app, backend, server-side webhooks, third-party tools via Segment).
List every Destination receiving data: Customer.io, Mixpanel, Amplitude, Snowflake/BigQuery, marketing tools (Facebook Conversions API, Google Ads), webhooks.
For each Destination, note: which events get sent, which user properties, any transformations applied.
List your current monthly CDP spend, event volume (MTUs or events/mo), and pain points (high cost, missing destinations, schema enforcement gaps).
Target architecture: Customer.io as Source layer (your app sends to one endpoint), fan out to Destinations. Decide upfront which Destinations stay (Mixpanel, Amplitude, warehouse) vs which get deprecated.
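The inventory from this step can live in code as well as in a doc, which makes the keep-or-deprecate decision easy to review. The following is a minimal sketch; the `Destination` class, its fields, and the sample entries are hypothetical and not part of any Customer.io tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Destination:
    name: str
    events: list[str] = field(default_factory=list)           # event names it receives
    transformations: list[str] = field(default_factory=list)  # notes on applied transforms
    keep: bool = True                                          # False = deprecate during migration


def migration_plan(destinations: list[Destination]) -> dict[str, list[str]]:
    """Split the inventory into Destinations to reconnect and ones to retire."""
    plan: dict[str, list[str]] = {"keep": [], "deprecate": []}
    for dest in destinations:
        plan["keep" if dest.keep else "deprecate"].append(dest.name)
    return plan


# Hypothetical inventory entries, for illustration only.
inventory = [
    Destination("Customer.io", events=["*"]),
    Destination("Mixpanel", events=["Subscription Started", "Subscription Cancelled"]),
    Destination("Legacy webhook", events=["*"], keep=False),
]
print(migration_plan(inventory))
```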
Step 2
Data Pipelines → Sources → + Add Source. Pick Source type (HTTP API, JavaScript, mobile SDK, server-side library).
Customer.io → Data Pipelines → Sources → + Add Source.
Pick Source type: HTTP API for backend, JavaScript for web, iOS/Android SDK for mobile, Segment-compatible API if migrating from Segment.
Name the Source clearly: 'Production Web App,' 'Staging Backend,' 'Mobile iOS.' One Source per logical instrumentation surface.
Customer.io generates a Write Key per Source. Store this in your env vars — same hygiene as any API key.
For Segment migration: Customer.io provides a Segment-compatible Source endpoint. Same method names (identify, track, page, screen, group), same payload shape. Point your existing Segment client at the Customer.io endpoint and existing code works.
Verify: send a test event from your app → Customer.io → Data Pipelines → Sources → click your Source → Live tail. Event should appear within seconds.
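To make the Segment-compatible payload shape concrete, here is a sketch of a backend `track` call that is built but not sent. The endpoint URL and the Basic-auth scheme (write key as username, empty password) are assumptions modeled on the Segment HTTP API; confirm both on your Source's settings page before relying on them.

```python
import base64
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: check the real endpoint shown for your Source in the Customer.io UI.
CDP_TRACK_URL = "https://cdp.customer.io/v1/track"


def build_track_request(write_key: str, user_id: str, event: str,
                        properties: dict) -> urllib.request.Request:
    """Build (but do not send) a Segment-style track call."""
    payload = {
        "type": "track",
        "userId": user_id,
        "event": event,
        "properties": properties,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Assumed Segment-style auth: write key as Basic-auth username, empty password.
    token = base64.b64encode(f"{write_key}:".encode()).decode()
    return urllib.request.Request(
        CDP_TRACK_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Basic {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


req = build_track_request("test-write-key", "user_123",
                          "Subscription Started", {"plan_name": "pro"})
print(req.full_url, json.loads(req.data)["event"])
# To actually send the event: urllib.request.urlopen(req, timeout=5)
```

In real code the write key would come from an environment variable (for example a hypothetical `CIO_WRITE_KEY`) rather than a string literal.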
Step 3
Connect your Source to the Customer.io workspace as a Destination. This is the baseline — every other Destination is additive.
Data Pipelines → Destinations → + Add Destination → search "Customer.io" → connect.
Pick the workspace to receive events (your Production workspace).
Configure event filtering: which events from this Source should land in this Destination. Default = all. Most teams stay at all for the Customer.io destination.
Activate Destination.
Verify: send an identify call from your app → check Customer.io → People → confirm profile created/updated. Check track event → confirm event in Activity timeline.
This is the baseline. From here, every additional Destination (Mixpanel, Amplitude, warehouse) is "send a copy to also."
Step 4
For SaaS teams running Mixpanel or Amplitude for product analytics, connect them as Destinations. Same instrumentation feeds both.
Data Pipelines → Destinations → + Add Destination → search "Mixpanel" → connect.
Paste Mixpanel Project Token (from Mixpanel → Project Settings → Project Token).
Configure event mapping: by default, all track events flow to Mixpanel. You can filter by event name if some events are Customer.io-only.
For identify calls, Customer.io maps to Mixpanel's `people.set` automatically. User properties on Customer.io profile → Mixpanel user properties.
Repeat for Amplitude: Data Pipelines → Destinations → + Amplitude → paste Amplitude API key. Same event flow.
Verify: send a test event from your app. Confirm it appears in Customer.io People timeline AND Mixpanel Live View AND Amplitude User Look-Up. Triple-check.
Step 5
For SaaS teams with a warehouse, sync raw event data for analysis. This replaces Segment Warehouses or Rudderstack Warehouse Actions.
Data Pipelines → Destinations → + Add Destination → search "Snowflake" (or BigQuery, Redshift, Postgres).
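Provide warehouse credentials: hostname, database, schema, username, password. Use a dedicated service account scoped to the events schema only; because Customer.io creates a table per event type, that account needs to create tables and insert rows there, and nothing beyond that.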
Pick sync frequency: usually hourly is fine. Real-time sync is available on Enterprise but rarely needed for analytics use cases.
Customer.io creates tables per event type. E.g., `subscription_started` table with columns for user_id, timestamp, plan_name, etc.
Common downstream use: build dbt models on top of Customer.io event tables to produce a unified events table for analytics.
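Verify after first sync: query the warehouse, e.g. `SELECT COUNT(*) FROM customer_io.account_created WHERE timestamp > NOW() - INTERVAL '1 day'` (Postgres-style syntax; adjust the date arithmetic for your warehouse). Compare the result to the event count in the Customer.io UI for the same time window.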
Step 6
Some Destinations need event filtering or property renaming. Customer.io Data Pipelines supports per-Destination transformations.
Open a Destination → Transformations tab.
Add a filter: e.g., "only send `Subscription Started` and `Subscription Cancelled` events to Facebook Conversions API." Keeps marketing-tool event counts manageable.
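Add property mapping: e.g., your event property is `plan_name` but Facebook expects `predicted_ltv`. A plan name is not itself an LTV figure, so this takes a value transform as well as a rename: map each plan to the number the Destination expects.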
Add user-property forwarding: Facebook needs hashed email + phone. Configure Customer.io to hash + send these as match keys.
For complex transformations beyond filtering/mapping (e.g., enriching events with computed fields), use Webhook Destination → write a transformation in your backend that consumes the webhook → re-emits to the final Destination. Heavier but more flexible.
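If you go the webhook route, the filter, rename, and hashing logic described in this step could look roughly like the sketch below. The event allowlist comes from the example above; the rename map, the `traits` field, and the `em`/`ph` match-key names are assumptions for illustration, so check the target Destination's documentation for the exact keys and normalization rules it requires.

```python
import hashlib
import re
from typing import Optional

ALLOWED_EVENTS = {"Subscription Started", "Subscription Cancelled"}
RENAME = {"plan_name": "content_name"}  # illustrative mapping only


def sha256_hex(value: str) -> str:
    """Hex-encoded SHA-256 of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def transform_for_ads(event: dict) -> Optional[dict]:
    """Return the reshaped event, or None if it should not be forwarded."""
    if event.get("event") not in ALLOWED_EVENTS:
        return None
    # Rename mapped properties; pass everything else through unchanged.
    props = {RENAME.get(k, k): v for k, v in event.get("properties", {}).items()}
    traits = event.get("traits", {})
    match_keys = {}
    if traits.get("email"):
        # Trim and lowercase first so the same address always yields the same hash.
        match_keys["em"] = sha256_hex(traits["email"].strip().lower())
    if traits.get("phone"):
        # Keep digits only before hashing.
        match_keys["ph"] = sha256_hex(re.sub(r"\D", "", traits["phone"]))
    return {"event": event["event"], "properties": props, "match_keys": match_keys}


sample = {
    "event": "Subscription Started",
    "properties": {"plan_name": "pro"},
    "traits": {"email": " Jane@Example.com ", "phone": "+1 (555) 010-0000"},
}
print(transform_for_ads(sample)["properties"])
```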
Step 7
For 2-4 weeks, send events to BOTH Segment AND Customer.io. Compare counts daily. Cut Segment off only when parity is verified.
In your instrumentation code, send each event to BOTH the existing Segment endpoint AND the new Customer.io Source. Yes, double cost during migration — non-negotiable for safety.
Set up a daily comparison: total events in Segment vs Customer.io for each event type. Acceptable variance: ±2%. Anything more is a bug.
Compare key destinations: Mixpanel event counts via Segment vs via Customer.io. Amplitude same. Warehouse same.
After 14 days of clean parity, pick a single Destination to cut from Segment to Customer.io (start with the lowest-risk one — usually warehouse, since downstream reports are cached).
Each week, cut one more Destination. By week 6-8, all Destinations are on Customer.io, Segment can be paused.
Keep Segment account dormant (not cancelled) for 30 days after final cutover — in case you need to roll back. Then cancel.
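The dual-write itself is usually a thin wrapper around your existing tracking call. Below is a minimal sketch, assuming each sink is just a callable that takes the event dict; the sink names and the in-memory lists standing in for real clients are hypothetical.

```python
from typing import Callable

Sender = Callable[[dict], None]


def dual_write(event: dict, sinks: dict[str, Sender]) -> dict[str, str]:
    """Send one event to every sink; a failure in one must not block the others."""
    results = {}
    for name, send in sinks.items():
        try:
            send(event)
            results[name] = "ok"
        except Exception as exc:
            # Record the failure and continue so the other sink still gets the event.
            results[name] = f"error: {exc}"
    return results


# Stand-ins for the real Segment and Customer.io clients.
segment_log: list[dict] = []
cio_log: list[dict] = []
status = dual_write(
    {"event": "Subscription Started", "userId": "user_123"},
    {"segment": segment_log.append, "customerio": cio_log.append},
)
print(status)
```

Catching errors per sink matters during the parallel run: a misconfigured new Source should show up in the parity numbers, not take down delivery to the system you still depend on.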
Common mistakes
Cutover without parallel run
What goes wrong: Migration goes live, an event mapping breaks silently for one Destination. Mixpanel funnels look wrong for 2 weeks before anyone notices. Product decisions made on bad data during that window. Recovery: 3-5 days of investigation + reconciliation.
How to avoid: Dual-write for 2-4 weeks. Daily count comparison. Cut over one Destination at a time. Slow is fast.
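The daily count comparison is simple enough to automate. This sketch assumes you have already pulled per-event-type counts from each system into plain dicts (how you fetch them is up to you); the function name and the sample numbers are illustrative, and the 2% tolerance matches the threshold from Step 7.

```python
def parity_report(segment: dict[str, int], customerio: dict[str, int],
                  tolerance: float = 0.02) -> dict[str, float]:
    """Return {event_name: relative_variance} for event types outside tolerance."""
    flagged = {}
    for name in sorted(set(segment) | set(customerio)):
        old, new = segment.get(name, 0), customerio.get(name, 0)
        if old == 0:
            # An event seen only in the new system has no baseline to compare against.
            variance = 0.0 if new == 0 else float("inf")
        else:
            variance = abs(new - old) / old
        if variance > tolerance:
            flagged[name] = variance
    return flagged


# Illustrative counts for one day.
segment_counts = {"Subscription Started": 1000, "Account Created": 1000, "Invite Sent": 10}
cio_counts = {"Subscription Started": 1015, "Account Created": 950}
for event_name, variance in parity_report(segment_counts, cio_counts).items():
    print(f"{event_name}: {variance:.1%} off")
```

An event type that is missing entirely from one side is flagged too, which is how a silently broken mapping tends to show up.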
Migrating event names during the migration (rename + move at same time)
What goes wrong: Mixpanel reports break because event names changed. Funnels that referenced 'subscription_started' (snake_case) now need 'Subscription Started' (Title Case). Every saved query, every dashboard breaks.
How to avoid: Migrate first with EXACT same event names + property shapes. After migration is complete and stable, do a separate renaming pass if needed (with team coordination on all dependent queries).
No backfill plan for warehouse Destination
What goes wrong: Warehouse only gets events from the day Customer.io Destination was activated. Historical Segment data still in old warehouse tables. Cross-period analysis breaks. dbt models error.
How to avoid: Either (a) bulk-export Segment historical data to the Customer.io warehouse Destination via REST API on Day 1 of cutover, or (b) keep both old + new warehouse tables and union them in dbt for transition period.
Hardcoded Source write key in code (leaked to client-side)
What goes wrong: Write key exposed in client JS. Bad actor uses it to fire fake events into your analytics. Mixpanel/Amplitude reports contaminated. Customer.io profiles created for fake users.
How to avoid: Server-side instrumentation uses backend Source write key (server-only env var). Client-side instrumentation uses a separate Source with client-safe write key (limited scopes). Same hygiene as Segment.
No event schema enforcement, drift between Sources
What goes wrong: The web app fires `Subscription Started` with a `plan` property. The mobile app fires the same event with `plan_name`. Reports break. Workflows trigger inconsistently. It took Segment Protocols years to mature, so don't treat schema enforcement as optional.
How to avoid: Use Customer.io Data Pipelines schema enforcement (or external schema-as-code via TypeScript event types). Reject events that don't match schema. Catches drift before it pollutes Destinations.
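If you go the schema-as-code route, the check can be a small validator that runs before an event leaves your app. The sketch below is one way to do it, not Customer.io's built-in enforcement; the `SCHEMA` contents (including the `mrr_cents` property) are hypothetical.

```python
# Hypothetical schema: event name -> required property -> expected type.
SCHEMA = {
    "Subscription Started": {"plan_name": str, "mrr_cents": int},
}


def validate(event: dict) -> list[str]:
    """Return a list of schema violations; an empty list means the event is valid."""
    name = event.get("event")
    spec = SCHEMA.get(name)
    if spec is None:
        return [f"unknown event: {name!r}"]
    props = event.get("properties", {})
    errors = []
    for key, expected in spec.items():
        if key not in props:
            errors.append(f"missing property: {key}")
        elif not isinstance(props[key], expected):
            errors.append(f"wrong type for {key}: expected {expected.__name__}")
    # Properties outside the schema are usually drift (e.g. `plan` vs `plan_name`).
    for key in sorted(props.keys() - spec.keys()):
        errors.append(f"unexpected property: {key}")
    return errors


drifted = {"event": "Subscription Started",
           "properties": {"plan": "pro", "mrr_cents": 4900}}
print(validate(drifted))
```

Rejecting (or at least logging) any event with a non-empty error list catches the `plan` versus `plan_name` drift at the Source instead of in a broken report.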
Forgetting to disable Segment after parallel run
What goes wrong: You pay double CDP cost for months. At $5K/mo for Segment on top of $3K/mo for Customer.io Premium, the redundant Segment bill alone comes to $30K for every six months of delay, or $60K over a full year.
How to avoid: Calendar reminder for 30 days after final Destination cutover: cancel Segment subscription. Verify no events are still being sent to the old endpoint via Segment dashboard first.
Hand it off
Data Pipelines migration is the kind of project where engineering specialists pay for themselves on Day 1. A specialist who's done 10+ Segment-to-Customer.io migrations has the runbook, the dual-write pattern, the rollback plan. Typical 4-week engagement is $4K-8K at $14-16/hr — usually pays back in 2-4 months of saved CDP spend, before counting risk-reduction.
For 80-90% of SaaS use cases, yes. Gaps: Segment Protocols (advanced schema enforcement) is more mature in Segment. Some niche Destinations (specific MMPs, certain BI tools) aren't yet in Customer.io's Destination library. For most SaaS lifecycle + product analytics use cases, Customer.io Data Pipelines covers Sources, Destinations, transformations, and warehouses.
Highly variable. Segment Business plan is typically $1-10K/mo for SaaS teams. Customer.io Premium with Data Pipelines is $300-1,500/mo for the same volume. Most teams save 50-70% on CDP cost. The savings widen at higher MTU/event volume.
Rudderstack is the open-source/self-hostable alternative to Segment. Customer.io Data Pipelines is fully managed and tightly integrated with Customer.io messaging. If you need self-hosted/data-residency, Rudderstack is the right tool. If you're already on Customer.io for messaging, Data Pipelines consolidates the stack.
Yes — that's the standard setup. Your app sends events to a Customer.io Source. Customer.io Source fans out to Customer.io workspace (for messaging) + Mixpanel + Amplitude + warehouse. One instrumentation, many outputs.
Configure per-Destination filters to strip sensitive properties. E.g., email + phone go to Customer.io for messaging, but get stripped before forwarding to Mixpanel (which doesn't need PII for analytics). Reduces compliance surface area. Customer.io Data Pipelines supports this in Transformations.
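As an illustration of the idea (not of the Transformations UI itself), a per-Destination blocklist might look like the sketch below. The destination keys and the choice to strip PII for any unlisted Destination are assumptions.

```python
import copy

SENSITIVE = {"email", "phone"}

# Properties to strip per Destination; names here are illustrative.
PII_BLOCKLIST = {
    "customerio": set(),    # messaging needs email and phone
    "mixpanel": SENSITIVE,  # analytics does not
}


def strip_pii(event: dict, destination: str) -> dict:
    """Return a copy of the event with blocked fields removed for this Destination."""
    # Fail closed: Destinations not listed above get the full blocklist.
    blocked = PII_BLOCKLIST.get(destination, SENSITIVE)
    cleaned = copy.deepcopy(event)
    for section in ("properties", "traits"):
        if section in cleaned:
            cleaned[section] = {k: v for k, v in cleaned[section].items()
                                if k not in blocked}
    return cleaned


identify = {"type": "identify", "userId": "user_123",
            "traits": {"email": "jane@example.com", "plan_name": "pro"}}
print(strip_pii(identify, "mixpanel")["traits"])
```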
Two options: (1) Backfill — export Segment events via Segment's Replay feature (Business plan) and replay into Customer.io. Expensive but complete. (2) Forward-only — start fresh from migration date, keep Segment archive for historical lookups. Most teams pick (2) unless historical data is critical for retention/cohort analysis.
Customer.io
Customer.io is only as smart as the events you send it. Sloppy instrumentation — events fired client-side, inconsistent naming, missing properties — silently sabotages every workflow you build on top. This is the schema that scales.
Customer.io
Segments are how Customer.io decides who. Workflows are how it decides when and what. Most SaaS teams get one or the other right but rarely both — and the gap shows up as activation campaigns missing 40% of eligible users.
Mixpanel
Mixpanel doesn't fail because events break — it fails because event names drift. Three engineers, three opinions, three versions of 'signup' over a year. Here's how to ship instrumentation that holds up.
Amplitude
Bad event tracking is the most common reason Amplitude projects fail. Here is the naming convention, the SDK code, and the Data Guard rules that keep your taxonomy clean for years — not weeks.
Customer.io
When a Customer.io workflow isn't firing, 80% of the time it's an event-delivery problem upstream, not the workflow itself. This is the diagnostic path that finds the root cause in minutes instead of days.
Customer.io
DIY Customer.io is the right call — until it isn't. In healthy SaaS, lifecycle email + in-app should drive 20-35% of activation and 10-20% of retention. If yours is at 5-10%, the gap is the program isn't being worked. Here's the honest framework for when to hire.