What is the difference between edge computing and cloud computing?

The cloud centralizes compute in remote datacenters. Edge computing runs processing as close as possible to the data source, on the machine, gateway or sensor. You choose the edge when latency, bandwidth, sovereignty or offline availability become constraining. The two are complementary: the edge filters and decides locally, the cloud aggregates, trains and supervises.

What is edge AI?

Edge AI means running a model's inference directly on the device, without a round-trip to the cloud. You deploy compact models (INT8 quantization, ONNX or TensorRT formats) on embedded accelerators such as Jetson or Edge TPU. This delivers decisions in a few milliseconds and sharply reduces bandwidth consumption.

When is the edge NOT justified?

When latency is not critical, connectivity is reliable and volumes stay reasonable. If you can tolerate a 200 ms round-trip and your sites are well connected, the cloud stays simpler to operate, update and secure. The edge adds real operational complexity: only pay for it when a measurable gain justifies it.

What is the main trap of an edge project?

Distributed operations. Updating, monitoring and securing a fleet of hundreds or thousands of heterogeneous devices is an engineering problem in its own right. Without a robust OTA deployment pipeline, per-device observability and a rollback strategy, the edge turns into technical debt spread across the entire field.


Infrastructure

Edge computing: when offloading computation to the edge becomes profitable

Latency, bandwidth, offline: edge computing promises a lot. But it also adds real operational complexity. Where do you draw the line?

Published 16 June 2026 · Updated 22 June 2026 · 9 min

Edge computing has been coming up in every infra discussion for two years now. The promise: bring computation closer to the data to reduce latency, cut bandwidth, work offline. Reality on the ground: it's true, but it comes with an operational cost that many underestimate. Deploying code to a fleet of dispersed devices, monitoring them, patching them, managing network failures… it's a job in itself.

We've supported several edge projects over the past three years—from retail with real-time video analysis to isolated industrial sites. The pattern repeats: the technical architecture holds up quickly, but it's the ops part that hurts. This article lays out the real decision criteria and shares the pitfalls we've seen in prod.

No marketing bullshit: if your use case works with 200 ms latency and a stable connection, stay in the cloud. But if you're in one of the cases below, edge becomes a real option.

The four triggers that make edge profitable

Edge computing isn't justified by hype. We consider it when one of these four constraints becomes blocking in production.

1. Incompressible latency

When every millisecond counts. Typical cases: video surveillance with real-time intrusion detection, quality control on production lines, driver assistance. If your SLA requires a decision in under 50 ms, a cloud round-trip alone can exhaust the budget before inference even starts. Local inference becomes mandatory.

Concrete example: a retail client analyzes in-store flows to detect suspicious behavior. The model runs on Jetson Nano (€99/device), inference at 30 FPS, decision in 15-20 ms. Impossible with a round-trip to AWS, even in a nearby region.
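To make the budget argument concrete, here is a minimal sketch that adds up the stages of a decision path and compares the total with the SLA. Only the 50 ms budget and the 20 ms upper bound for local inference come from the text; the stage names and every other timing are hypothetical values chosen for illustration, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    millis: float


def total_latency(stages: list[Stage]) -> float:
    """Sum the latency of every stage on the decision path."""
    return sum(stage.millis for stage in stages)


def fits_budget(stages: list[Stage], budget_ms: float) -> bool:
    """True if the whole path stays within the SLA budget."""
    return total_latency(stages) <= budget_ms


SLA_BUDGET_MS = 50.0  # the SLA example from the text

# Hypothetical timings, for illustration only.
edge_path = [
    Stage("capture + preprocessing", 5.0),
    Stage("local inference", 20.0),  # upper bound of the 15-20 ms quoted above
]
cloud_path = [
    Stage("capture + encoding", 10.0),
    Stage("uplink to region", 30.0),
    Stage("cloud inference", 20.0),
    Stage("downlink", 30.0),
]
```

With these assumed numbers, `fits_budget(edge_path, SLA_BUDGET_MS)` holds and `fits_budget(cloud_path, SLA_BUDGET_MS)` does not. The useful part is the exercise itself: measure each stage on your own network before deciding.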

2. Prohibitive bandwidth

Sending 4K video streams, LiDAR point clouds, or high-frequency IoT flows to the cloud is expensive—in euros and network latency. Edge allows you to filter, aggregate, and only push back anomalies or metadata.

We saw an industrial project go from 12 TB/day pushed to the cloud to 400 GB/day by deploying edge processing that only pushes qualified events, a reduction of roughly 97%. Direct savings on the AWS bill (~$1,000/month) and on network load.
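The "only push qualified events" idea can be sketched as a filter that runs on the device and forwards compact metadata instead of raw payloads. The `Detection` fields and the score threshold are assumptions made for this example; the project described above may qualify events quite differently.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Detection:
    camera_id: str
    timestamp: float
    score: float        # anomaly score in [0, 1] from the local model (assumed)
    payload_bytes: int  # size of the raw frame or clip, which stays on the device


def qualified_events(detections: Iterable[Detection], threshold: float) -> Iterator[dict]:
    """Yield compact metadata for detections at or above the threshold; drop the rest."""
    for d in detections:
        if d.score >= threshold:
            yield {"camera_id": d.camera_id, "timestamp": d.timestamp, "score": d.score}


def reduction_ratio(before_gb: float, after_gb: float) -> float:
    """Fraction of the daily volume that no longer crosses the WAN."""
    return 1.0 - after_gb / before_gb
```

Applied to the figures above (12 TB, taken as 12,000 GB, down to 400 GB), `reduction_ratio(12_000, 400)` comes out at about 0.967, so only around 3% of the original volume still leaves the site.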

3. Mandatory offline availability

Isolated sites, oil platforms, autonomous vehicles, warehouses with spotty network coverage. If your service must keep running when the WAN link drops, you need local autonomy. Edge becomes your fallback by design.
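Local autonomy usually comes with a store-and-forward buffer: the device keeps deciding locally and queues what it owes the cloud until the link returns. Below is a minimal in-memory sketch, assuming a hypothetical `send` callable that returns `True` on success. A real device would persist the queue to disk so that a reboot during an outage loses nothing.

```python
from collections import deque
from typing import Callable


class StoreAndForward:
    """Bounded local queue that buffers messages while the uplink is down."""

    def __init__(self, send: Callable[[dict], bool], max_items: int = 10_000) -> None:
        self._send = send
        # When the queue is full, the oldest message is dropped first.
        self._queue = deque(maxlen=max_items)

    def publish(self, message: dict) -> None:
        """Queue the message, then try to drain the queue."""
        self._queue.append(message)
        self.flush()

    def flush(self) -> int:
        """Send queued messages in order, stopping at the first failure.

        Returns the number of messages delivered.
        """
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break
            self._queue.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of messages still waiting for the link."""
        return len(self._queue)
```

Messages are removed only after `send` reports success, so delivery is at-least-once and the cloud side should tolerate duplicates. The bounded queue is a deliberate trade-off: on a long outage you lose the oldest data instead of filling the disk.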

4. Data sovereignty

GDPR, sector regulations, contractual clauses: some data simply cannot leave the site. Video analysis in hospitals, HR data, confidential industrial processes. Edge computing allows processing on-site and only pushing back anonymized or aggregated data.

Edge vs cloud: complementary, not exclusive

Edge doesn't replace the cloud. Both work together in a well-designed hybrid architecture. The cloud centralizes model training, metric aggregation, long-term storage, and global supervision. Edge executes inference, filters data, and makes critical decisions locally.

Classic pattern we deploy for our clients:

Edge: real-time inference (TensorRT, ONNX Runtime), local decisions, critical data cache, keeps running offline.
Cloud: model training and fine-tuning, cross-site aggregation, global dashboards, S3 storage, OTA update distribution.
Bidirectional sync: edge pushes anomalies and metrics, cloud pushes new models and configs. MQTT, gRPC, or HTTP/2 depending on network constraints.

This decoupling maintains local resilience while keeping the cloud's power for everything that isn't time-critical.

Edge AI: inference closest to the sensor

Deploying ML models directly on the device is edge AI. Concretely: you train in the cloud (A100 GPUs, large datasets), then export an optimized model for embedded (INT8 quantization, pruning, distillation) and deploy it on a local accelerator.

Typical stack we use in prod:

Hardware: NVIDIA Jetson (Nano, Xavier, Orin depending on budget), Google Coral Edge TPU, Intel NCS2, or even Raspberry Pi 4 for light workloads.
Formats: ONNX (maximum interop), TensorRT (max perf on Jetson), TFLite (mobile/edge).
Optimization: INT8 post-training quantization (weights roughly 4x smaller than FP32, with speedups of up to about 4x on hardware with native INT8 support), pruning if the model tolerates it.

Example with numbers: a classic ResNet-50 does ~25 FPS on Jetson Nano. Quantized INT8 + TensorRT, we get to 90 FPS. That changes everything for multi-camera video analysis.

The trap: not all models quantize well. Test post-quantization accuracy on your real data. We've seen cases where accuracy dropped from 92% to 78% after INT8, making the model unusable. In that case, either keep FP16 or fall back to the cloud.
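One way to make this check systematic is an accuracy gate in the export pipeline: evaluate the baseline and the quantized model on the same real-world validation set, and refuse the INT8 build if the drop exceeds a tolerance. The sketch below is framework-agnostic and works on plain lists of predictions. The 2-point default tolerance is an arbitrary assumption to tune per use case.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Share of predictions that match the labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def quantization_verdict(baseline_acc: float, quantized_acc: float,
                         max_drop: float = 0.02) -> str:
    """Accept the INT8 build only if the accuracy drop stays within max_drop."""
    drop = baseline_acc - quantized_acc
    return "deploy-int8" if drop <= max_drop else "keep-fp16-or-cloud"
```

With the figures from the case above, `quantization_verdict(0.92, 0.78)` returns `"keep-fp16-or-cloud"`, which is the fallback described in the text.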

The real challenge: distributed operations

Getting code to prod on 10 devices is doable with SSH and a bash script. Managing 500 heterogeneous devices spread across 50 sites is an engineering problem in itself. It's the friction point we systematically see underestimated in the design phase.

OTA (Over-The-Air) deployment

You must be able to deploy a new version of the model or runtime across the entire fleet without physical intervention. That requires:

A centralized registry (Harbor, ECR, Artifactory) for artifacts (containers, models, configs).
A local agent on each device that polls or receives updates (balenaOS, AWS IoT Greengrass, Azure IoT Edge, or custom with MQTT).
A progressive rollout strategy: canary on 5% of the fleet, validate metrics, then full deployment. If it breaks, automatic rollback.
A robust rollback mechanism. Devices must be able to revert to version N-1 without manual intervention.
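The canary-then-full rollout with automatic rollback can be sketched as follows. The `deploy`, `is_healthy` and `rollback` callables are hypothetical hooks standing in for whatever your agent exposes (balenaOS, Greengrass, a custom MQTT agent). This is not the API of any of those tools.

```python
import random
from typing import Callable, Sequence


def pick_canaries(device_ids: Sequence[str], fraction: float = 0.05,
                  seed: int = 0) -> list[str]:
    """Pick a reproducible random sample of the fleet (at least one device)."""
    count = min(len(device_ids), max(1, round(len(device_ids) * fraction)))
    return random.Random(seed).sample(list(device_ids), count)


def staged_rollout(
    device_ids: Sequence[str],
    version: str,
    deploy: Callable[[str, str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    fraction: float = 0.05,
) -> bool:
    """Deploy to canaries first; roll them back and stop if any is unhealthy."""
    canaries = pick_canaries(device_ids, fraction)
    for device in canaries:
        deploy(device, version)

    if any(not is_healthy(device) for device in canaries):
        for device in canaries:
            rollback(device)  # revert to version N-1
        return False

    canary_set = set(canaries)
    for device in device_ids:
        if device not in canary_set:
            deploy(device, version)
    return True
```

On a 500-device fleet the default 5% gives 25 canaries. In practice the health check would wait for a soak period and compare metrics against the previous version, and the full deployment would itself go out in waves; both are left out here to keep the control flow visible.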

We saw a project stall for 3 months because a defective OTA update bricked 40% of the fleet, and technicians had to be sent on-site to reflash devices manually. Cost: tens of thousands of euros in on-site interventions, plus lost service.

Per-device observability

You need to know the state of each device in real time: deployed version, CPU/RAM/disk, application errors, network connectivity. Without that, you're debugging blind.

Typical stack:

System metrics: Telegraf or Prometheus node_exporter, shipped to a central VictoriaMetrics or Prometheus instance.
Application logs: Fluent Bit locally, aggregated to Loki or CloudWatch, with sampling to avoid saturating the WAN link.
Alerting: based on critical metrics (device offline > 5 min, inference rate < threshold, disk > 90%).
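The three alert conditions listed above translate into a handful of comparisons per device. In a real stack they would more likely live in your monitoring system's rule language; the Python below only sketches the logic. The `DeviceSnapshot` fields and alert names are hypothetical, and the inference-rate threshold is left as a parameter because the text does not fix a value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSnapshot:
    device_id: str
    seconds_since_heartbeat: float
    inferences_per_second: float
    disk_used_percent: float


def evaluate_alerts(snapshot: DeviceSnapshot, min_inference_rate: float) -> list[str]:
    """Return the names of the alerts this device currently triggers."""
    alerts = []
    if snapshot.seconds_since_heartbeat > 5 * 60:  # device offline > 5 min
        alerts.append("device_offline")
    if snapshot.inferences_per_second < min_inference_rate:
        alerts.append("inference_rate_low")
    if snapshot.disk_used_percent > 90.0:  # disk > 90%
        alerts.append("disk_almost_full")
    return alerts
```

Note that the disk check is a system-layer signal, the kind that gets missed when only the application is monitored.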

The classic mistake: monitoring only the application and forgetting the system layer. Result: you discover the device is constantly swapping or storage is full only when everything crashes.

Distributed security

Each edge device is an attack surface. Especially if they're physically accessible (stores, warehouses, industrial sites). Minimum checklist:

Encryption at rest: full disk encryption (LUKS, dm-crypt).
Mutual authentication: mutual TLS (mTLS) with per-device certificates for device ↔ cloud communication.
Automatic patching: OS vulnerabilities are disclosed regularly, so you must be able to patch without on-site intervention.
Principle of least privilege: devices should only have access to strictly necessary cloud resources (IAM policies or equivalent).
Signed artifacts: verify signatures of images/models before deployment to avoid malicious code injection.
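For the signed-artifacts item, the device-side check boils down to "recompute, compare, refuse on mismatch". The sketch below uses HMAC-SHA256 from the Python standard library purely as a stand-in so the example stays self-contained. A production fleet would normally use asymmetric signatures (for example Ed25519 or an image-signing tool), so that devices hold only a public key and a compromised device cannot sign anything.

```python
import hashlib
import hmac


def sign_artifact(artifact: bytes, key: bytes) -> str:
    """Build-side: compute a hex tag over the artifact bytes (HMAC stand-in)."""
    return hmac.new(key, artifact, hashlib.sha256).hexdigest()


def verify_artifact(artifact: bytes, signature_hex: str, key: bytes) -> bool:
    """Device-side: accept the artifact only if the tag matches.

    compare_digest avoids leaking information through comparison timing.
    """
    expected = sign_artifact(artifact, key)
    return hmac.compare_digest(expected, signature_hex)
```

The agent should call `verify_artifact` before unpacking or loading anything, and keep running version N-1 if the check fails.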

We audited a project where devices all used the same SSH key hardcoded in the image. One compromised device = entire fleet compromised. Don't do that.

When NOT to do edge

Because it's not always the right answer. A few signals that should keep you in the cloud:

Tolerable latency: if 200-500 ms round-trip is fine, the cloud is simpler.
Reliable connectivity: if your sites have fiber or stable 4G/5G, the offline argument no longer holds.
Small fleets: a handful of devices (< 20) doesn't justify setting up a complete OTA infrastructure.
Limited ops team: managing distributed edge requires DevOps/SRE skills. If you don't have the bandwidth, the cloud stays safer.
Constrained budget: initial cost (hardware + OTA infra dev) is non-negligible. If ROI isn't clear, validate a cloud PoC first.

Edge adds complexity. Only pay for it if a measurable gain—in latency, network costs, availability—justifies it. Otherwise, you're accumulating debt for nothing.
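As a first-pass screen, the four triggers can be written down as explicit checks: if none fires, stay in the cloud. This is a rough sketch. The `UseCase` fields are our own naming, and the inputs (in particular what counts as an affordable uplink volume) are assumptions you have to fill in from your own measurements and contracts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    latency_budget_ms: float
    cloud_round_trip_ms: float
    daily_uplink_gb: float
    affordable_uplink_gb: float
    must_run_offline: bool
    data_must_stay_on_site: bool


def edge_triggers(uc: UseCase) -> list[str]:
    """List which of the four triggers apply; an empty list means stay in the cloud."""
    triggers = []
    if uc.cloud_round_trip_ms > uc.latency_budget_ms:
        triggers.append("latency")
    if uc.daily_uplink_gb > uc.affordable_uplink_gb:
        triggers.append("bandwidth")
    if uc.must_run_offline:
        triggers.append("offline")
    if uc.data_must_stay_on_site:
        triggers.append("sovereignty")
    return triggers
```

A non-empty result only says edge is worth evaluating. Fleet size, ops capacity and budget still decide whether you can afford to run it.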

Checklist before launching

If after all this you're still convinced edge is the right option, here are the milestones to set before scaling:

PoC on 3-5 devices: validate the technical stack (hardware, runtime, model) and measure real performance.
Automate OTA deployment: don't scale without it. Test rollout + rollback on the PoC.
Set up observability: metrics, logs, alerting. Validate that you see what's happening in prod.
Simulate a network outage: cut the cloud link and verify the device keeps working, then resyncs correctly.
Test security: basic pen test, verify encryption, certificates, IAM policies.
Document runbooks: what do we do if a device is offline? If a version causes problems? If storage is full?

Once these six points are validated on the PoC, you can scale progressively (10, 50, 200 devices). But not before.

What we take away from the field

Edge computing is a real technical answer when latency, bandwidth, or offline availability become constraining. But it's also a real operational challenge that shouldn't be underestimated. Projects that succeed are those that invest from the start in OTA tooling, observability, and security—not those that tell themselves they'll optimize "later".

If you're evaluating an edge project and want to challenge the architecture or ops roadmap with people who've done it in prod, [we're here for that](https://abbeal.com/contact). We prefer asking the right questions upfront rather than debugging a bricked fleet six months after go-live.
