Urban mobility · Paris + Montréal
Mobility scale-up: −30% cloud bill, same SLOs.
AWS bill doubled in 18 months without matching traffic growth. GreenOps audit, refactor, Karpenter, ARM64. Measured outcome.
KPI: −30% cloud bill
Duration: 9 months
Team: 4 engineers
Hub(s): Paris + Montréal
An AWS bill that doubles in 18 months while traffic fails to keep pace is rarely growth: it is architecture debt being paid in cash.
The context
Urban mobility scale-up, 180 people, hubs in Paris and Montréal, 4 million active users. The microservices platform had been in production for four years, run by a platform team of 12 engineers. The CFO raised the alarm: the annual cloud bill had passed 2.1M USD, while traffic had grown only 22% over the same period.
The problem
- AWS spend ×2 in 18 months for +22% traffic
- No per-team budgets, no FinOps practice in place
- Systematic over-provisioning of EKS nodes (14% average CPU utilization)
- MTTR at 47 minutes, noisy alerting, no distributed tracing
- Engineers unable to attribute a cost to a service
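The gap between spend and traffic in the first bullet can be restated as a change in cost per unit of traffic. The sketch below uses only the two ratios quoted above (spend ×2, traffic +22%); the function name is illustrative, not part of any tool used in the project.

```python
def unit_cost_multiplier(spend_multiplier: float, traffic_growth: float) -> float:
    """Return how much the cost per unit of traffic changed.

    spend_multiplier: new spend / old spend (2.0 means the bill doubled)
    traffic_growth: fractional traffic growth (0.22 means +22%)
    """
    if traffic_growth <= -1.0:
        raise ValueError("traffic_growth must be greater than -100%")
    return spend_multiplier / (1.0 + traffic_growth)


# Figures from the case study: bill x2, traffic +22%.
multiplier = unit_cost_multiplier(2.0, 0.22)
print(f"Cost per unit of traffic: x{multiplier:.2f}")  # x1.64
```

In other words, each unit of traffic cost roughly 1.64 times what it did 18 months earlier, which is the signal that pointed to architecture rather than growth.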
The approach
Six weeks of GreenOps audit, then seven months of incremental remediation. No big bang, no re-platforming. We started by measuring, then cut obvious fat, then rethought what needed rethinking.
The four workstreams
- Full observability: Prometheus, Grafana, Tempo, cost attribution per namespace via Kubecost
- Migration from Cluster Autoscaler to Karpenter: tight packing, spot first, aggressive consolidation
- ARM64 (Graviton) for 60% of stateless workloads, after benchmarking
- Smart scheduling: night batches on spot capacity, taints and tolerations reworked
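To give an intuition for what "tight packing" means, here is a minimal first-fit-decreasing sketch that places pod CPU requests onto equal-size nodes. It is a textbook bin-packing heuristic, not Karpenter's actual scheduling logic, and the pod sizes and node capacity are made-up values for illustration.

```python
from typing import List


def first_fit_decreasing(pod_requests_m: List[int], node_capacity_m: int) -> List[List[int]]:
    """Pack pod CPU requests (millicores) onto equal-size nodes.

    Pods are sorted largest first, and each pod goes on the first node
    with enough free capacity; a new node is opened only when none fits.
    """
    nodes: List[List[int]] = []  # pods placed on each node
    free: List[int] = []         # remaining capacity per node
    for request in sorted(pod_requests_m, reverse=True):
        if request > node_capacity_m:
            raise ValueError(f"pod requesting {request}m does not fit on any node")
        for i, remaining in enumerate(free):
            if request <= remaining:
                nodes[i].append(request)
                free[i] -= request
                break
        else:
            # No existing node had room: open a new one.
            nodes.append([request])
            free.append(node_capacity_m - request)
    return nodes


# Hypothetical workload: 7 pods, 4-vCPU (4000m) nodes.
pods = [500, 1500, 2000, 250, 750, 3000, 1000]
packed = first_fit_decreasing(pods, 4000)
print(len(packed), packed)
```

With one pod per node this hypothetical workload would need seven nodes; packed, it fits on three, which is the kind of effect that raises average CPU utilization.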
The stack
- Go 1.22, Kubernetes 1.29 on EKS
- Karpenter 0.34, Graviton3 (c7g, m7g)
- Prometheus, Grafana, Tempo, OpenTelemetry SDK
- Kubecost for attribution, Terraform for IaC
The results
- Cloud bill: -30% at unchanged SLOs over 9 months (-630k USD/year)
- MTTR: 47 min to 11 min (roughly 4.3x faster)
- Average cluster CPU: 14% to 51%
- Traffic absorbed: +35% with no added capacity
- Estimated carbon footprint: -38% (client Scope 3 report)
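The headline figures can be cross-checked with simple arithmetic. The sketch below assumes the baseline is the roughly 2.1M USD annual bill quoted in the context; the helper names are illustrative.

```python
def annual_saving_usd(baseline_usd: int, reduction_pct: int) -> int:
    """Annual saving for a given percentage reduction, in whole dollars."""
    return baseline_usd * reduction_pct // 100


def improvement_factor(before: float, after: float) -> float:
    """How many times smaller `after` is than `before`."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


# Figures taken from the case study.
print(annual_saving_usd(2_100_000, 30))      # 630000
print(round(improvement_factor(47, 11), 1))  # 4.3
```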
"Abbeal taught us to look at our bill as an engineering signal, not as fate. We recovered budget to reinvest in the product."
What we learned
Karpenter is a game changer, but it demands rigor on pod disruption budgets. ARM64 works for 60% of workloads, not 100%: some third-party C++ binaries held out against us for two months. The real sustainable lever is FinOps embedded in the team: we trained two internal champions so the savings hold after we leave.
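Why pod disruption budgets matter for consolidation can be shown with the arithmetic Kubernetes applies to a budget expressed as an integer `minAvailable`: the number of voluntary evictions allowed is the count of healthy pods minus the minimum that must stay up. This is a simplified sketch (it ignores percentage values and `maxUnavailable`), and the replica counts are hypothetical.

```python
def disruptions_allowed(current_healthy: int, min_available: int) -> int:
    """Voluntary evictions permitted right now under an integer minAvailable budget."""
    if current_healthy < 0 or min_available < 0:
        raise ValueError("counts must be non-negative")
    return max(0, current_healthy - min_available)


# Hypothetical service with 3 healthy replicas and minAvailable=2:
# one pod can be evicted at a time, so nodes can be drained gradually.
print(disruptions_allowed(3, 2))  # 1

# Hypothetical single-replica service with minAvailable=1:
# no eviction is ever allowed, so its node cannot be consolidated voluntarily.
print(disruptions_allowed(1, 1))  # 0
```

A budget that is too strict blocks consolidation entirely, while a missing or overly loose one lets aggressive consolidation take down too many replicas at once.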
