Conflict-free Replicated Data Types (CRDTs)

Background

Modern distributed systems frequently replicate data across multiple machines, regions, or user devices. Replication is a fundamental design choice that improves system behavior and user experience.

Why replication matters:

  • High availability – the system continues working even if some nodes fail
  • Low latency – users interact with nearby replicas
  • Offline support – devices can operate while disconnected
  • Fault tolerance – redundancy prevents data loss

The Fundamental Challenge

Replication introduces a critical question:

What happens when multiple replicas modify the same data concurrently?

In distributed environments, concurrent updates are not an edge case — they are the norm.

The Core Problem in Distributed Systems

Distributed systems inherently operate under imperfect conditions:

  • Nodes maintain independent copies of data
  • Network partitions and disconnections occur
  • Updates may happen at the same time
  • Messages can be delayed or reordered

Without careful design, these realities can cause:

  • Conflicts between updates
  • Lost updates
  • Diverging replicas
  • Inconsistent system state

Traditional Approaches: Coordination

Classic distributed system designs rely on coordination mechanisms to preserve correctness:

  • Locks
  • Leader-based systems
  • Consensus protocols (e.g., Paxos, Raft)

While effective, coordination introduces trade-offs:

  • Increased latency
  • Reduced availability during failures
  • Higher system complexity

Correctness is preserved, but performance and resilience may suffer.

A Different Perspective: CRDTs

Conflict-free Replicated Data Types (CRDTs) take a fundamentally different approach.

Instead of preventing conflicts through coordination, CRDTs are designed so that:

  • Concurrent updates are expected
  • Conflicts are mathematically impossible or automatically resolved
  • Replicas always converge to the same state

This enables systems that remain:

  • Highly available
  • Low latency
  • Partition tolerant

CRDTs shift the burden from runtime coordination to data structure design.

What is a CRDT?

A Conflict-free Replicated Data Type (CRDT) is a data structure specifically designed for distributed systems where multiple replicas may update data independently.

A CRDT ensures that:

  • Replicas can update data independently
  • Replicas can merge safely without coordination
  • Conflicts do not occur (by design)
  • All replicas eventually converge to the same state
  • No central coordinator or locking mechanism is required

CRDTs provide strong eventual consistency through deterministic merge rules.

Why CRDTs Work

CRDTs rely on mathematically defined merge operations with three critical properties:

1. Commutative

The order of merging does not matter.

merge(A, B) = merge(B, A)

2. Associative

The grouping of merges does not matter.

merge(A, merge(B, C)) = merge(merge(A, B), C)

3. Idempotent

Repeating a merge is safe: applying it again changes nothing.

merge(A, A) = A

Because CRDT merge operations satisfy these properties, replicas always converge, regardless of:

  • Message delays
  • Network partitions
  • Duplicate updates
  • Out-of-order delivery
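The three properties are easy to check concretely. Set union is a classic CRDT merge (it underlies the grow-only set discussed later), and a few lines of Python verify that it is commutative, associative, and idempotent; this is an illustrative sketch, not part of any particular CRDT library.

```python
# Set union as a merge function: commutative, associative, and
# idempotent, so replicas converge regardless of how (or how
# often) states are exchanged.

def merge(a: set, b: set) -> set:
    return a | b

A, B, C = {"x"}, {"y"}, {"x", "z"}

# Commutative: the order of merging does not matter
assert merge(A, B) == merge(B, A)

# Associative: the grouping of merges does not matter
assert merge(A, merge(B, C)) == merge(merge(A, B), C)

# Idempotent: repeating a merge changes nothing
assert merge(A, A) == A
```

Any merge function with these three properties gives the same guarantee; union is simply the easiest one to inspect.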

Two Main Types of CRDTs

State-Based CRDTs (Convergent Replicated Data Types)

Replicas exchange their entire state during synchronization.

How they work:

  1. Each replica updates its local state independently
  2. Replicas periodically share their full state
  3. A deterministic merge function combines states

Key characteristics:

  • Simple to reason about
  • Naturally resilient to message duplication
  • Robust under unreliable networks
  • Larger messages due to full-state transfer

Operation-Based CRDTs (Commutative Replicated Data Types)

Replicas exchange operations instead of full state.

How they work:

  1. Replicas generate operations (add, remove, insert, etc.)
  2. Operations are broadcast to other replicas
  3. Operations are designed to commute safely

Key characteristics:

  • Smaller messages, so less bandwidth used
  • Requires reliable delivery assumptions
  • More complex to design correctly
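A minimal sketch of the operation-based idea, using a counter: replicas broadcast "inc"/"dec" operations, and because addition commutes, every replica reaches the same value no matter the delivery order. Note that these operations are not idempotent, which is exactly why op-based CRDTs need the exactly-once (reliable) delivery assumption listed above. The names here are illustrative, not from any library.

```python
import random

# Operations generated across replicas while disconnected.
ops = ["inc", "inc", "inc", "dec"]

def apply_ops(state: int, delivered: list) -> int:
    # Apply each delivered operation exactly once.
    for op in delivered:
        state += 1 if op == "inc" else -1
    return state

# Two replicas receive the same operations in different orders.
replica_a = apply_ops(0, ops)

shuffled = ops[:]
random.shuffle(shuffled)
replica_b = apply_ops(0, shuffled)

# Because the operations commute, both replicas converge.
assert replica_a == replica_b == 2
```

If an operation were delivered twice, the counters would diverge; state-based designs avoid that failure mode at the cost of larger messages.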

Example 1 — Distributed Counter

Assume two replicas start with the same value:

Value = 0

Both replicas go offline and update independently:

  • Replica A increments → +1
  • Replica B increments → +1

After synchronization, the correct final value should be:

Value = 2

How a CRDT Counter Solves This

Instead of storing a single integer, each replica maintains per-replica state.

Replica A → { A: 1, B: 0 }
Replica B → { A: 0, B: 1 }

Merge rule:

Take the maximum value for each replica slot

Merged result:

{ A: 1, B: 1 } → Value = 2

No updates are lost, even without coordination.
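The per-replica-slot counter above (a G-Counter) is small enough to sketch in full. This is an illustrative implementation, assuming a fixed, known set of replica ids; the merge takes the per-slot maximum exactly as described.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merge by per-slot max."""

    def __init__(self, replica_id: str, replica_ids: list):
        self.id = replica_id
        self.slots = {r: 0 for r in replica_ids}

    def increment(self) -> None:
        # A replica only ever increments its own slot.
        self.slots[self.id] += 1

    def value(self) -> int:
        return sum(self.slots.values())

    def merge(self, other: "GCounter") -> None:
        # Take the maximum value for each replica slot.
        for r in self.slots:
            self.slots[r] = max(self.slots[r], other.slots[r])

a = GCounter("A", ["A", "B"])
b = GCounter("B", ["A", "B"])

a.increment()   # A -> {A: 1, B: 0}
b.increment()   # B -> {A: 0, B: 1}

a.merge(b)
b.merge(a)

assert a.value() == b.value() == 2   # no update lost
```

Because max is commutative, associative, and idempotent, the replicas converge no matter how many times, or in what order, they synchronize.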

PN-Counter (Supports Decrements)

A PN-Counter extends the basic counter to support decrements.

It internally maintains two counters:

  • One for increments (P = Positive)
  • One for decrements (N = Negative)

Final value calculation:

value = increments − decrements

This preserves convergence while allowing both operations.
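A PN-Counter can be sketched as two of the grow-only counters above, merged slot-by-slot. Again, this assumes a fixed set of replica ids and is meant as an illustration rather than a production implementation.

```python
class PNCounter:
    """Counter supporting decrements: two grow-only maps, P and N."""

    def __init__(self, replica_id: str, replica_ids: list):
        self.id = replica_id
        self.p = {r: 0 for r in replica_ids}  # increments
        self.n = {r: 0 for r in replica_ids}  # decrements

    def increment(self) -> None:
        self.p[self.id] += 1

    def decrement(self) -> None:
        self.n[self.id] += 1

    def value(self) -> int:
        # value = increments - decrements
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other: "PNCounter") -> None:
        # Merge each grow-only half by per-slot maximum.
        for r in self.p:
            self.p[r] = max(self.p[r], other.p[r])
            self.n[r] = max(self.n[r], other.n[r])

a = PNCounter("A", ["A", "B"])
b = PNCounter("B", ["A", "B"])

a.increment()
a.increment()
b.decrement()

a.merge(b)
b.merge(a)

assert a.value() == b.value() == 1   # 2 increments - 1 decrement
```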

Example 2 — Concurrent Text Editing

Initial text:

Hello World

Two users edit concurrently at the same logical position:

  • User A inserts “vikas” after “Hello ”
  • User B inserts “nannu” at the same place

Why Traditional Systems Struggle

If edits rely purely on numeric indexes:

  • Both target index 6
  • Order of arrival affects result
  • One update may overwrite the other
  • Replicas may diverge

How CRDTs Fix This

CRDT-based editors avoid fragile positional indexes.

Instead:

  • Every character is assigned a unique identifier
  • Insertions occur relative to identifiers, not indexes
  • Concurrent inserts are preserved by design

Possible merged results:

Hello vikasnannu World

or

Hello nannuvikas World

The exact order depends on deterministic rules, but all replicas agree on the same result.
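To make the identifier idea concrete, here is a deliberately simplified sketch (real sequence CRDTs such as RGA or Logoot are considerably more involved). Each character carries a unique identifier of the form (gap position, replica id, sequence number); sorting by identifier is deterministic, so every replica renders the same text and both concurrent inserts survive.

```python
def render(chars: list) -> str:
    # Sorting by identifier gives every replica the same order.
    return "".join(ch for _, ch in sorted(chars))

# "Hello World": character k gets position k + 1, so the space
# after "Hello" sits at position 6 and 'W' at position 7.
base = [((i + 1, "init", 0), ch) for i, ch in enumerate("Hello World")]

# Both users concurrently insert into the gap between positions
# 6 and 7, here modeled as fractional position 6.5. The replica id
# ("A" < "B") breaks the tie deterministically.
ins_a = [((6.5, "A", k), ch) for k, ch in enumerate("vikas")]
ins_b = [((6.5, "B", k), ch) for k, ch in enumerate("nannu")]

# Delivery order does not matter: both replicas render identically.
replica_1 = render(base + ins_a + ins_b)   # A's insert arrives first
replica_2 = render(base + ins_b + ins_a)   # B's insert arrives first

assert replica_1 == replica_2 == "Hello vikasnannuWorld"
```

Here replica "A" deterministically sorts before "B"; a different tie-breaking rule would yield the other ordering, but always the same one on every replica.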

CRDT Data Structure Categories

CRDTs are not limited to a single data model. They exist for many common data structures, enabling safe replication across a wide range of application needs.

Registers

Registers store a single value.

Example: Last-Write-Wins (LWW) Register
Merge rule: choose the value with the latest timestamp.

Use cases:

  • Configuration values
  • User profile fields
  • Simple shared state
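An LWW register is one of the simplest CRDTs to sketch: a (value, timestamp, replica id) triple where merge keeps the entry with the latest timestamp, breaking ties by replica id so the outcome is deterministic. In this illustrative sketch the caller supplies timestamps; real systems use synchronized or logical clocks.

```python
class LWWRegister:
    """Last-Write-Wins register: latest timestamp wins on merge."""

    def __init__(self):
        self.value, self.ts, self.rid = None, 0, ""

    def set(self, value, ts: int, rid: str) -> None:
        # Keep the write with the latest (timestamp, replica id).
        if (ts, rid) > (self.ts, self.rid):
            self.value, self.ts, self.rid = value, ts, rid

    def merge(self, other: "LWWRegister") -> None:
        # Merging is just "set" with the other replica's entry.
        self.set(other.value, other.ts, other.rid)

a = LWWRegister()
b = LWWRegister()

a.set("dark-theme", ts=1, rid="A")
b.set("light-theme", ts=2, rid="B")

a.merge(b)
b.merge(a)

assert a.value == b.value == "light-theme"   # latest write wins
```

The simplicity comes at a price: with an LWW rule, one of two truly concurrent writes is silently discarded, which is acceptable for configuration values but not for data where every update must survive.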

Counters

Counters track numeric updates under concurrency.

Examples:

  • G-Counter (Grow-only) – supports increments only
  • PN-Counter (Positive-Negative) – supports increments and decrements

Use cases:

  • Likes / views / reactions
  • Distributed metrics
  • Rate tracking

Sets

Sets maintain collections of elements with safe concurrent modifications.

Examples:

  • G-Set (Grow-only Set) – elements can only be added
  • OR-Set (Observed-Remove Set) – supports add and remove safely

Use cases:

  • Tags / labels
  • Membership tracking
  • Feature flags
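The OR-Set deserves a sketch because its trick is subtle: every add creates a unique tag, and a remove only deletes the tags it has observed, so a concurrent re-add survives a remove ("add wins"). This is an illustrative implementation, not a library API.

```python
import uuid

class ORSet:
    """Observed-Remove Set: removes only affect observed add-tags."""

    def __init__(self):
        self.adds = set()      # (element, unique_tag) pairs
        self.removes = set()   # observed (element, tag) pairs

    def add(self, element) -> None:
        # Each add gets a fresh globally unique tag.
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element) -> None:
        # Remove only the tags this replica has observed.
        self.removes |= {(e, t) for (e, t) in self.adds if e == element}

    def contains(self, element) -> bool:
        return any(e == element for (e, t) in self.adds - self.removes)

    def merge(self, other: "ORSet") -> None:
        self.adds |= other.adds
        self.removes |= other.removes

a, b = ORSet(), ORSet()

a.add("urgent")
b.merge(a)             # both replicas now observe the tag
b.remove("urgent")     # B removes the tag it observed
a.add("urgent")        # A concurrently re-adds with a NEW tag

a.merge(b)
b.merge(a)

assert a.contains("urgent") and b.contains("urgent")   # add wins
```

The unobserved second tag is what keeps the element alive; a plain set of element names could not distinguish the re-add from the removed add.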

Maps / JSON Structures

Complex objects can be built by composing smaller CRDTs.

Idea: Each field is itself a CRDT.

Use cases:

  • Shared documents
  • Application state
  • Nested data models
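The composition idea can be sketched with a map whose fields are each a small last-write-wins entry: merging proceeds field by field, so independently edited fields never conflict with each other. This is a simplified illustration; tie-breaking on equal timestamps (e.g., by replica id) is omitted for brevity, and real CRDT maps handle field deletion as well.

```python
def merge_maps(a: dict, b: dict) -> dict:
    """Field-wise merge of maps whose values are (value, timestamp)."""
    merged = dict(a)
    for key, (value, ts) in b.items():
        # Unknown field, or a newer write for a known field: take b's.
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

# Two replicas edited different fields of the same profile.
profile_a = {"name": ("Ada", 1), "bio": ("systems person", 3)}
profile_b = {"name": ("Ada", 1), "city": ("London", 2)}

merged = merge_maps(profile_a, profile_b)

# Independent edits to different fields all survive the merge.
assert merged == {
    "name": ("Ada", 1),
    "bio": ("systems person", 3),
    "city": ("London", 2),
}
```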

Sequences

Sequences maintain ordered collections, essential for collaborative editing.

Use cases:

  • Text editors
  • Real-time collaboration tools
  • Ordered shared logs

Handling Deletions

Deletion is fundamentally harder than insertion in distributed systems.

A common CRDT technique is the use of tombstones:

  • Elements are marked as deleted instead of removed
  • Metadata is preserved for correct merging

Trade-off:

  • Increased storage / metadata overhead
  • Guaranteed convergence and correctness
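The tombstone technique can be sketched with a Two-Phase Set: removed elements are kept in a tombstone set rather than physically deleted, so a late-arriving add of an already-removed element cannot resurrect it. The cost is exactly the metadata overhead noted above: tombstones accumulate forever in this simple form.

```python
class TwoPhaseSet:
    """Set with tombstones: once removed, an element stays removed."""

    def __init__(self):
        self.added = set()
        self.tombstones = set()   # elements marked as deleted

    def add(self, element) -> None:
        self.added.add(element)

    def remove(self, element) -> None:
        # Mark as deleted instead of physically removing.
        if element in self.added:
            self.tombstones.add(element)

    def contains(self, element) -> bool:
        return element in self.added and element not in self.tombstones

    def merge(self, other: "TwoPhaseSet") -> None:
        self.added |= other.added
        self.tombstones |= other.tombstones

a, b = TwoPhaseSet(), TwoPhaseSet()

a.add("draft")
b.merge(a)
b.remove("draft")        # B tombstones the element
a.merge(b)

assert not a.contains("draft")   # the tombstone wins everywhere

a.add("draft")                   # a late re-add cannot revive it
assert not a.contains("draft")
```

This "remove wins forever" behavior is why the OR-Set described earlier tags each add uniquely: it trades extra metadata for the ability to re-add elements.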

What CRDTs Guarantee

CRDT-based systems provide strong distributed safety properties:

  • No lost updates
  • No manual conflict resolution
  • Eventual convergence across replicas
  • High availability under failures
  • Partition tolerance by design
  • No locks, leaders, or coordination required

Advantages of CRDTs

CRDTs are powerful because they naturally align with distributed environments:

  • Allow independent replica updates
  • Operate correctly under offline conditions
  • Eliminate complex conflict resolution logic
  • Scale efficiently across regions
  • Reduce coordination overhead

Limitations of CRDTs

CRDTs are not universally applicable. Practical challenges include:

  • Metadata growth over time
  • Memory and storage overhead
  • Non-intuitive ordering behavior
  • Difficulty enforcing strict invariants

Poor fit for systems requiring:

  • Strong consistency guarantees
  • Global ordering constraints
  • Complex transactional invariants

Examples:

  • Banking systems
  • Financial ledgers
  • Strictly serialized workflows

CRDTs vs Strong Consistency Systems

Two contrasting design philosophies exist in distributed systems.

Strong Consistency Systems:

  • Use consensus protocols
  • Enforce global ordering
  • Provide immediate consistency
  • Typically incur higher latency

CRDT-Based Systems:

  • Avoid coordination
  • Accept eventual consistency
  • Prioritize availability and latency

The correct choice depends entirely on application requirements.

Ideal Use Cases for CRDTs

CRDTs work best in environments where:

  • Concurrent updates are common
  • Offline operation is expected
  • Low latency is critical
  • Eventual consistency is acceptable

Examples:

  • Collaborative editors
  • Offline-first applications
  • Distributed counters
  • Edge / multi-device systems
  • Shared state applications

Final Thoughts

CRDTs do not resolve conflicts after they occur. They prevent conflicts by design. Every update is structured so merging is always deterministic and safe.

A helpful way to reason about CRDTs:

Replicas never fight over updates.
They record changes independently and merge deterministically.

CRDTs represent an elegant shift in distributed system design:

Instead of coordinating every update, replicas evolve independently while still guaranteeing convergence.

They are especially valuable in modern systems where:

  • Offline usage is normal
  • Latency directly impacts user experience
  • Global coordination is expensive

Used appropriately, CRDTs dramatically simplify distributed data management while improving system resilience.
