Transport · Montréal
Canadian operator: 12 data silos → lakehouse, real-time KPIs.
Inconsistent KPIs, dashboards 48h late. Databricks lakehouse, medallion, dbt, self-service BI.
KPI
60%
autonomous analysts
Duration
9 months
Team
6 engineers
Hub(s)
Montréal
When every department gives you a different number for the same KPI, you're not making decisions: you're arbitrating between opinions.
The context
Canadian transport operator, 11,000 employees, Montreal hub. 12 historical data silos (operations, HR, finance, ticketing, maintenance, etc.), aging Oracle data warehouse, Excel dashboards sent by email.
The problem
- 12 data silos with no shared governance
- Inconsistent KPIs across departments (up to 18% gap on the same indicator)
- Dashboards running 48h late, updated manually
- No data catalog, duplicates and ambiguous definitions
- Analysts stuck on SQL extracts, low business autonomy
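To make the "18% gap" concrete, here is a minimal sketch of one way such a gap could be measured. The source does not say how the gap was computed, so the formula (spread relative to the lowest reported value), the function name `kpi_gap`, and the figures are all assumptions for illustration.

```python
def kpi_gap(values_by_department: dict[str, float]) -> float:
    """Relative spread (max - min) / min across departments reporting the same KPI.

    Assumed definition: the case study does not specify its own formula.
    """
    values = list(values_by_department.values())
    lowest, highest = min(values), max(values)
    if lowest <= 0:
        raise ValueError("relative gap is undefined for non-positive values")
    return (highest - lowest) / lowest


# Hypothetical figures: three departments reporting the same on-time rate
on_time_rate = {"operations": 0.85, "finance": 0.80, "ticketing": 0.72}
gap = kpi_gap(on_time_rate)
print(f"Gap on the same indicator: {gap:.1%}")
```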
The approach
Databricks data lakehouse with medallion architecture (bronze/silver/gold), Unity Catalog governance, versioned dbt transformations, self-service Tableau BI with semantic layer.
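The bronze/silver/gold flow can be sketched in plain Python. This is a stand-in for the idea, not the project's Databricks or dbt code: the record fields, the helper names `to_silver` and `to_gold`, and the data are hypothetical.

```python
from collections import defaultdict

# Bronze: raw events as they land, duplicates and bad rows included (hypothetical data)
bronze = [
    {"trip_id": "T1", "line": "A", "delay_min": 4},
    {"trip_id": "T1", "line": "A", "delay_min": 4},     # duplicate delivery
    {"trip_id": "T2", "line": "A", "delay_min": None},  # missing measurement
    {"trip_id": "T3", "line": "B", "delay_min": 10},
]


def to_silver(rows):
    """Silver: keep the first row per trip_id and drop rows with no delay value."""
    seen, clean = set(), []
    for row in rows:
        if row["delay_min"] is None or row["trip_id"] in seen:
            continue
        seen.add(row["trip_id"])
        clean.append(row)
    return clean


def to_gold(rows):
    """Gold: one business-facing KPI, the average delay per line."""
    delays = defaultdict(list)
    for row in rows:
        delays[row["line"]].append(row["delay_min"])
    return {line: sum(values) / len(values) for line, values in delays.items()}


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Because gold is derived only from silver, any cleaning rule changed in silver propagates to every KPI built on top of it, which is what keeps departments from computing the same indicator in different ways.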
The pillars
- Real-time ingestion: Kafka streams via Structured Streaming, files via Auto Loader
- dbt dimensional modeling, mandatory quality tests
- Unity Catalog for governance, lineage, RBAC
- Semantic layer exposed to Tableau (centralized business definitions)
- Enablement program: 60% of analysts trained in 4 months
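The "mandatory quality tests" pillar can be illustrated with a plain-Python analogue of two common dbt generic tests, `not_null` and `unique`. As in dbt, each check returns the failing rows and passes when it returns none. The table, column names, and values below are hypothetical, and this is a sketch of the idea rather than the project's actual test suite.

```python
from collections import Counter


def not_null(rows, column):
    """Return the rows where `column` is missing; an empty list means the test passes."""
    return [r for r in rows if r.get(column) is None]


def unique(rows, column):
    """Return the rows whose `column` value appears more than once."""
    counts = Counter(r.get(column) for r in rows)
    return [r for r in rows if counts[r.get(column)] > 1]


# Hypothetical silver-layer rows
tickets = [
    {"ticket_id": 1, "fare": 3.75},
    {"ticket_id": 2, "fare": None},
    {"ticket_id": 2, "fare": 3.75},
]

failures = {
    "not_null_fare": not_null(tickets, "fare"),
    "unique_ticket_id": unique(tickets, "ticket_id"),
    "not_null_ticket_id": not_null(tickets, "ticket_id"),
}
# A test is green only when it found no failing rows
results = {name: len(rows) == 0 for name, rows in failures.items()}
print(results)
```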
The stack
- Databricks Lakehouse Platform on Azure
- dbt Cloud for versioned transformations
- Unity Catalog for governance and lineage
- Tableau Cloud with semantic layer
- Apache Airflow for ingestion orchestration
The results
- Single source of truth: 100% of operational KPIs reconciled
- Dashboard latency: 48h to real-time (sub-minute on 80% of KPIs)
- Autonomous analysts: 60% in 4 months (vs 12 targeted)
- Data costs: -22% despite a 3x increase in processed volume
- Data quality issues: -76% in 9 months
« For the first time in 15 years, my operations and finance teams argue about action levers, no longer about numbers. That's the ROI of a data platform. »
What we learned
Unity Catalog, not the Spark engine, is Databricks' real differentiator. dbt scales very well up to 800 models; beyond that, you need to invest in modularization. Our mistake: we shipped the gold layer before consolidating silver, which led to expensive rollbacks. What we would do differently: never open analyst access before 90% of dbt tests are green. Otherwise you lose trust, and you don't get it back.
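The 90% rule could be enforced as an automated gate before access is granted. The sketch below assumes test outcomes are available as a name-to-boolean mapping; the function name `can_open_analyst_access` and the sample results are hypothetical, and how the outcomes are collected from dbt is left out.

```python
def can_open_analyst_access(test_results: dict[str, bool], threshold_pct: int = 90) -> bool:
    """Return True only when at least `threshold_pct` percent of tests are green.

    An empty result set counts as "not ready": no evidence is not a pass.
    """
    if not test_results:
        return False
    passed = sum(test_results.values())
    # Integer comparison avoids floating-point edge cases right at the threshold
    return passed * 100 >= threshold_pct * len(test_results)


# Hypothetical run: 10 tests, 9 green
nine_of_ten = {f"test_{i}": i != 0 for i in range(10)}
# Hypothetical run: 10 tests, 8 green
eight_of_ten = {f"test_{i}": i > 1 for i in range(10)}

print(can_open_analyst_access(nine_of_ten))
print(can_open_analyst_access(eight_of_ten))
```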
