A Quick Intro to Distributed Systems + CAP/ACID/BASE: First Steps Toward “Exactly-Once”

What happens when a single machine hits its limits? Why isn’t the network “perfect”? In a partition, do you pick C or A? A short, punchy primer.

Reading time: ~7–8 min

What Are Distributed Systems and Why Use Them?

A distributed system is made of components running on different servers/devices that coordinate by exchanging messages. Instead of one big box, many machines work together, which gives you:

Horizontal scale: add nodes to increase capacity.
Fault tolerance: if one node fails, others keep serving.

This lets you handle workloads beyond a single machine and reduce single points of failure. The price: networks, disks, software, and timing can (and will) fail. Design with failure as the default (timeouts, retries, jitter, backpressure, circuit breakers, observability, etc.).
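
To make "timeouts, retries, jitter" concrete, here is a minimal Python sketch of retrying a flaky call with capped exponential backoff and full jitter. The names call_with_retries and backoff_delays, and the default delays, are invented for this example and are not taken from any library; treat it as a starting point, not a finished client.

```python
import random
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(count: int, base: float = 0.1, cap: float = 5.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Full jitter: delay i is uniform in [0, min(cap, base * 2**i)]."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap, base * 2 ** i)) for i in range(count)]


def call_with_retries(op: Callable[[], T], attempts: int = 4,
                      base: float = 0.1, cap: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run op; on TimeoutError/ConnectionError wait a jittered delay and retry."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    delays = backoff_delays(attempts - 1, base, cap)
    for i in range(attempts):
        try:
            return op()
        except (TimeoutError, ConnectionError):
            if i == attempts - 1:
                raise  # retry budget exhausted: surface the failure
            sleep(delays[i])
    raise AssertionError("unreachable")
```

The jitter spreads retries out so that many clients failing at the same moment do not all hammer the recovering server in lockstep. Retrying is only safe when the operation is idempotent, which is why idempotency comes up again later.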

Core Challenges

Network and hardware failures are normal: servers crash, disks die, links drop, latency spikes. The famous fallacies of distributed computing (e.g., “the network is reliable,” “latency is zero,” “bandwidth is infinite”) are traps. These uncertainties cause partial failure: some components fail while others keep running, and a caller often cannot tell a slow node from a dead one. Developers must plan timeouts, retries, backpressure, and compensation from the start.
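
Backpressure is the least intuitive item on that list, so here is a small Python sketch of one common form: a bounded inbox that refuses new work when full instead of queueing without limit. The class name BoundedInbox and its methods are hypothetical, chosen for this example only.

```python
import queue


class BoundedInbox:
    """A fixed-capacity inbox that sheds load rather than growing forever."""

    def __init__(self, capacity: int):
        self._q = queue.Queue(maxsize=capacity)
        self.rejected = 0  # how many items were turned away

    def offer(self, item) -> bool:
        """Try to enqueue; return False (and count it) when the inbox is full."""
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def take(self):
        """Dequeue the oldest item; raises queue.Empty when nothing is waiting."""
        return self._q.get_nowait()
```

A False from offer is the signal the producer needs: slow down, retry later, or fail fast. Without that signal an overloaded consumer just accumulates latency until it falls over.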

CAP Theorem: In a Partition, C or A?

CAP (Brewer’s) Theorem says that under a network partition, you cannot simultaneously guarantee both Consistency (C: every read sees the latest write) and Availability (A: every request to a live node gets a response); Partition tolerance (P) is a given in real systems. During a partition you must choose:

Preserve C → reject/block some requests, sacrificing A.
Preserve A → keep responding, accepting brief inconsistency.

Note: Without a partition, you can often enjoy both C and A just fine. CAP mainly clarifies what you do when the link breaks.
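
The C-or-A choice can be shown with a toy replicated register in Python. Everything here (ToyRegister, the mode switch, the reachable set that stands in for "which replicas this client can talk to") is a made-up model for illustration, not how any particular database implements quorums.

```python
class Unavailable(Exception):
    """The request was refused because too few replicas are reachable."""


class ToyRegister:
    """One value replicated on N nodes; each replica holds (version, value)."""

    def __init__(self, replicas: int = 3, mode: str = "CP"):
        self.mode = mode  # "CP" demands a majority; "AP" accepts any replica
        self.state = [(0, None)] * replicas

    def _require_quorum(self, reachable):
        majority = len(self.state) // 2 + 1
        if not reachable or (self.mode == "CP" and len(reachable) < majority):
            raise Unavailable("not enough reachable replicas")

    def write(self, value, reachable):
        self._require_quorum(reachable)
        version = max(self.state[i][0] for i in reachable) + 1
        for i in reachable:
            self.state[i] = (version, value)

    def read(self, reachable):
        self._require_quorum(reachable)
        # Return the value with the highest version among reachable replicas.
        return max((self.state[i] for i in reachable), key=lambda s: s[0])[1]


# Partition: replicas {0, 1} on one side, replica {2} on the other.
cp = ToyRegister(mode="CP")
cp.write("v1", {0, 1, 2})
cp.write("v2", {0, 1})       # majority side keeps working
# cp.read({2}) would raise Unavailable: the minority side gives up A.

ap = ToyRegister(mode="AP")
ap.write("v1", {0, 1, 2})
ap.write("v2", {0, 1})
ap.write("v3", {2})          # minority side also accepts writes
```

In AP mode both sides of the partition keep answering and end up holding different values, which is the inconsistency you then have to reconcile once the link heals.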

Consistency Models: ACID vs BASE

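ACID (Atomicity, Consistency, Isolation, Durability): transactions are all-or-nothing and preserve invariants; when spread across nodes, commits may block or be rejected under partitions (depends on the commit/replication protocol). Note that the “C” here means a valid state, not CAP’s “every read sees the latest write.”
BASE (“Basically Available, Soft state, Eventually consistent”): replicas converge over time; favors availability/scale, but needs conflict detection and resolution (e.g., vector clocks to spot concurrent writes, last-writer-wins to pick a winner).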
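
The two conflict-handling tools named for BASE do different jobs, which a short Python sketch can show. The function names compare and lww_merge, and the (timestamp, node_id, value) record shape, are assumptions made for this example.

```python
def compare(a: dict, b: dict) -> str:
    """Compare vector clocks a and b (node -> counter).

    Returns 'before', 'after', 'equal', or 'concurrent'.
    """
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither happened-before the other: a real conflict


def lww_merge(x: tuple, y: tuple) -> tuple:
    """Last-writer-wins on (timestamp, node_id, value); node_id breaks ties."""
    return max(x, y, key=lambda r: (r[0], r[1]))
```

Vector clocks tell you whether two writes conflict at all; last-writer-wins settles a conflict by silently discarding one side. That is fine for a "last seen" status, and a poor fit for anything you bill on.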

How to choose?

By domain: Finance leans ACID; massive social feeds lean BASE.

Pick ACID when errors are expensive (money movement, strict inventory, double-spend risk).
Pick BASE when you need global reach, extreme read throughput, and brief staleness is acceptable.

Mini Scenario: EV Charging Network with Grid-Aware Sessions

Context: Nationwide EV chargers. When the grid is constrained, the operator pushes dynamic prices and power throttling.

User flow: reserve → authorize → start charging → interim meter reports → stop → billing.

A) Discovery & Offers (AP + BASE)

Station availability (free/busy, wait time) and dynamic price signals must be highly available; a few seconds of staleness are acceptable.

Choice: AP-leaning + BASE (caches/replicas with TTL; tolerate small drift).
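
A TTL cache is the simplest way to serve "a few seconds stale is fine" data. The sketch below is a minimal in-process version in Python; TTLCache and the station key are hypothetical, and a real deployment would more likely sit on a shared cache or read replicas.

```python
import time


class TTLCache:
    """Serve cached values until they are older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so tests can control time
        self._items = {}            # key -> (expires_at, value)

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]    # expired: force a refresh from the source
            return default
        return value
```

The TTL is the staleness budget: a 5-second TTL means a driver may see "free" for up to 5 seconds after someone else plugged in, which is acceptable for discovery but not for the reservation itself.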

B) Session Lifecycle (CP + ACID + SAGA)

kWh accounting, payments, reservation locks must be correct—no wrong totals.

Choice: CP-leaning + ACID; on failures use SAGA compensations. Orchestrators like Temporal or AWS Step Functions persist workflow state, so retries and compensating steps survive crashes.
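
The compensation idea behind a saga fits in a few lines of Python. This is an in-memory sketch only: run_saga and the step names are invented for the example, and nothing here is durable, which is precisely the part a real orchestrator adds.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If a step raises, undo the completed steps in reverse order.
    Returns (succeeded, log).
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for prev_name, prev_compensate in reversed(completed):
                prev_compensate()
                log.append(f"compensated:{prev_name}")
            return False, log
        completed.append((name, compensate))
        log.append(f"done:{name}")
    return True, log


state = {"reserved": False, "authorized": False}


def reserve():
    state["reserved"] = True


def release():
    state["reserved"] = False


def authorize():
    state["authorized"] = True


def void_authorization():
    state["authorized"] = False


def start_charging():
    raise RuntimeError("charger offline")


ok, log = run_saga([
    ("reserve", reserve, release),
    ("authorize", authorize, void_authorization),
    ("start", start_charging, lambda: None),
])
```

Here the charger fails to start, so the payment authorization is voided and the reservation released, in that order. Compensations are new forward actions, not database rollbacks, so they should themselves be idempotent and safe to retry.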

C) Telemetry and the “Exactly-Once Effect”

Use at-least-once delivery + idempotent consumers: don’t lose meter data; if duplicated, apply it once.
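
An idempotent consumer can be as simple as remembering which message ids it has already applied. The Python sketch below assumes each meter report carries a unique id; the MeterLedger class and the message keys (id, session, delta_kwh) are hypothetical.

```python
class MeterLedger:
    """Apply each meter report once, even if the broker redelivers it."""

    def __init__(self):
        self.seen = set()           # processed ids; durable storage in practice
        self.kwh_by_session = {}

    def handle(self, msg: dict) -> bool:
        """Return True if the report was applied, False if it was a duplicate."""
        if msg["id"] in self.seen:
            return False
        session = msg["session"]
        total = self.kwh_by_session.get(session, 0.0)
        self.kwh_by_session[session] = total + msg["delta_kwh"]
        self.seen.add(msg["id"])
        return True
```

In a real system the "seen" check and the total update must commit together (same transaction or a unique-key constraint); otherwise a crash between the two reintroduces the duplicate.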

Transactional Outbox + CDC (Debezium): the service writes its business data and an outbox row in the same database transaction; CDC tails the database log and publishes each outbox row to the broker at least once.
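
Here is a minimal Python sketch of the outbox pattern using the standard-library sqlite3 module. Two loud assumptions: the table and function names are invented, and the relay function polls the outbox table as a stand-in for CDC, since running Debezium needs real infrastructure.

```python
import json
import sqlite3


def init(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE sessions (id TEXT PRIMARY KEY, kwh REAL NOT NULL);
        CREATE TABLE outbox (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def stop_session(conn, session_id: str, kwh: float, fail: bool = False) -> None:
    """Write the business row and the event row in ONE transaction."""
    event = {"type": "session_stopped", "session": session_id, "kwh": kwh}
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO sessions (id, kwh) VALUES (?, ?)",
                     (session_id, kwh))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps(event),))
        if fail:
            raise RuntimeError("simulated crash before commit")


def relay(conn, publish) -> int:
    """Publish unpublished outbox rows in order; returns how many were sent."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))  # a crash after this line causes a resend
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

Either both rows commit or neither does, so there is never a stopped session without its event. The relay can still publish the same event twice if it crashes between publishing and marking the row, which is why the consumer side has to be idempotent.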

Product Support (2025)

Kafka: Idempotent producers + transactions enable exactly-once semantics (EOS) for consume-process-produce pipelines that stay inside Kafka (e.g., Kafka Streams); side effects in external systems still need idempotency.
Apache Pulsar: Transactions unify consume+produce in a single atomic context.
Google Cloud Pub/Sub: Exactly-once delivery on pull subscriptions (mind the constraints, such as regional scope).

Closing

Sound distributed design requires a clear CAP stance for partitions and per-flow ACID/BASE choices. In EV charging, keep reads on AP/BASE for great UX, and enforce CP/ACID for critical accounting and payments. The practical path toward “exactly-once” is paved with idempotency and patterns like outbox/inbox + CDC.

Sources

Apache Kafka: Exactly-once semantics / transactions — Apache Kafka
Apache Pulsar: Transactions & end-to-end exactly-once goals — Apache Pulsar
Google Cloud Pub/Sub: Exactly-once delivery — Google Cloud Documentation
Debezium: Outbox Event Router / CDC — Debezium
SAGA Orchestration: Temporal docs; AWS Step Functions guides — Temporal
DLQ/Replay: Azure Service Bus DLQ — Microsoft Learn
CAP Theorem — Wikipedia
Fallacies of Distributed Computing — Wikipedia
