Beyond Kafka and Redis: A Practical Guide to NATS as Your Unified Cloud-Native Backbone

Most platform engineers know this architecture all too well.

You’re running a Kafka cluster with Zookeeper (or the newer KRaft mode that still requires careful tuning). Redis, fronted by Sentinel for failover, handles your sessions and rate limiting. S3 or MinIO stores your model weights and configuration blobs. Each system has its own operational playbook, monitoring stack, and failure modes you’ve learned the hard way.

When the 3 AM pages come, you’re not just debugging your application—you’re debugging the infrastructure that supports it.

What if I told you there’s a single binary that can replace all of that?

NATS isn’t new. It’s been quietly powering some of the world’s largest distributed systems for over a decade. But with JetStream, Key-Value Store, and Object Store now mature and production-ready, NATS has evolved from a lightweight pub/sub system into a genuine unified messaging platform.

Let me show you what that actually looks like in practice.

What We’re Replacing (And Why)

Before we dive in, let’s be clear about what NATS brings to the table:

Feature                Traditional Stack             NATS Equivalent
---------------------  ----------------------------  -----------------------
Pub/Sub messaging      Redis Pub/Sub, Kafka topics   Core NATS subjects
Persistent streaming   Kafka, Pulsar, AWS Kinesis    JetStream streams
Key-Value storage      Redis, etcd, Consul KV        NATS KV Store
Object/blob storage    S3, MinIO                     NATS Object Store
Request-Reply          Custom HTTP, gRPC             Built-in NATS pattern
Service discovery      Consul, Eureka, K8s DNS       NATS Services framework

The operational difference is stark:

Kafka deployment:

  • Zookeeper cluster (3+ nodes) OR KRaft coordination
  • Broker nodes with careful partition planning
  • Schema Registry for Avro/Protobuf
  • Kafka Connect for data pipelines
  • Separate monitoring (JMX, Prometheus exporters)

NATS deployment:

nats-server -js

That’s it. One binary. No external dependencies. JetStream enabled.

Section 1: Foundation — Core NATS

Core NATS is beautifully simple. Messages flow through subjects using publish-subscribe and request-reply patterns. There’s no persistence at this layer—it’s pure, fire-and-forget messaging with sub-millisecond latency.

Starting the Server

# Download and run (that's really it)
nats-server

# Or with Docker
docker run -p 4222:4222 nats:latest

# Enable JetStream for persistence features
nats-server -js

Publish-Subscribe

import { connect } from "@nats-io/transport-node";

const nc = await connect({ servers: "localhost:4222" });

// Publisher (publish is fire-and-forget, so there is nothing to await)
nc.publish("events.user.signup", JSON.stringify({
  userId: "user-123",
  email: "dev@example.com",
  timestamp: Date.now()
}));

// Subscriber (in another process)
const sub = nc.subscribe("events.user.*");
for await (const msg of sub) {
  console.log(`Received on ${msg.subject}:`, msg.string());
}

Compare this to Kafka, where you’d need to:

  1. Create a topic with partitions and replication factor
  2. Configure producer serializers
  3. Set up consumer groups
  4. Handle partition assignment and rebalancing

With NATS, you publish. You subscribe. Done.

Request-Reply Pattern

This is where NATS shines for synchronous communication—no need for HTTP or gRPC for internal services:

// Service: respond to requests
const sub = nc.subscribe("api.users.get");
(async () => {
  for await (const msg of sub) {
    const request = JSON.parse(msg.string());
    const user = await db.findUser(request.userId);

    // Reply directly to the requester
    msg.respond(JSON.stringify(user));
  }
})();

// Client: make synchronous request
const response = await nc.request(
  "api.users.get",
  JSON.stringify({ userId: "user-123" }),
  { timeout: 5000 }
);
console.log("User:", JSON.parse(response.string()));

For AI Workloads:
Request-Reply is perfect for synchronous model inference. Your application sends a prompt to inference.gpt4.query, a model server picks it up, processes it, and returns the result, all without HTTP overhead or load balancer configuration. Subscribe your inference workers with the same queue group name and NATS automatically distributes requests across them.
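
Here’s a minimal sketch of such a worker using a queue group (runInference is a stand-in for your actual model call):

// Stand-in for your actual model call
async function runInference(prompt: string): Promise<string> {
  return `echo: ${prompt}`;
}

// Every worker subscribes with the same queue group name; NATS delivers
// each request to exactly one member, so adding workers scales horizontally
const workers = nc.subscribe("inference.gpt4.query", { queue: "inference-workers" });

for await (const msg of workers) {
  const { prompt } = JSON.parse(msg.string());
  const completion = await runInference(prompt);
  msg.respond(JSON.stringify({ completion }));
}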

Section 2: Persistent Streaming — JetStream

JetStream is NATS’s answer to Kafka. It provides durable message storage, exactly-once semantics (via publish deduplication and double acking), and replay capabilities. Unlike Kafka, there’s no separate cluster to manage: JetStream runs inside the same NATS server.

Creating a Stream

# Using the NATS CLI
nats stream add ORDERS \
  --subjects "orders.*" \
  --retention limits \
  --max-msgs 1000000 \
  --max-bytes 1GB \
  --max-age 7d

Or programmatically:

import { connect } from "@nats-io/transport-node";
import { jetstreamManager, AckPolicy, DeliverPolicy, RetentionPolicy } from "@nats-io/jetstream";

const nc = await connect({ servers: "localhost:4222" });
const jsm = await jetstreamManager(nc);

// Create stream
await jsm.streams.add({
  name: "ORDERS",
  subjects: ["orders.*"],
  retention: RetentionPolicy.Limits,
  max_msgs: 1_000_000,
  max_bytes: 1024 * 1024 * 1024, // 1GB
  max_age: 7 * 24 * 60 * 60 * 1_000_000_000, // 7 days in nanoseconds
});

Publishing to JetStream

import { jetstream } from "@nats-io/jetstream";

const js = jetstream(nc);

// Publish with acknowledgment
const ack = await js.publish("orders.created", JSON.stringify({
  orderId: "ORD-12345",
  customerId: "CUST-789",
  items: [{ sku: "WIDGET-01", quantity: 2 }],
  total: 49.99
}));

console.log(`Stored in stream: ${ack.stream}, sequence: ${ack.seq}`);

// Publish with deduplication (exactly-once semantics)
await js.publish("orders.created", JSON.stringify({ orderId: "ORD-12346" }), {
  msgID: "ORD-12346"  // Duplicate publishes with same ID are ignored
});

Consuming with Durable Consumers

// Create a durable consumer
await jsm.consumers.add("ORDERS", {
  durable_name: "order-processor",
  ack_policy: AckPolicy.Explicit,
  deliver_policy: DeliverPolicy.All,  // Start from beginning
  filter_subject: "orders.created",
  max_ack_pending: 100,
});

// Consume messages
const consumer = await js.consumers.get("ORDERS", "order-processor");
const messages = await consumer.consume();

for await (const msg of messages) {
  const order = JSON.parse(msg.string());
  console.log(`Processing order: ${order.orderId}`);

  await processOrder(order);

  // Explicit acknowledgment
  msg.ack();
}

Check stream status anytime:

$ nats stream info ORDERS
...
State:
  Messages: 15,234
  Bytes: 2.1 MB
  First Sequence: 1 @ 2024-01-15 10:30:00
  Last Sequence: 15,234 @ 2024-01-15 14:22:31

For AI Workloads:
JetStream is ideal for durable logging of agent actions and chain-of-thought traces. Every decision your AI agent makes can be published to an agent.{agent-id}.actions stream, creating an auditable, replayable history. When debugging why an agent made a particular decision, you can replay the exact sequence of events. For training pipelines, JetStream can durably queue inference requests, ensuring none are lost even during model server restarts.
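
As a sketch, replaying one session’s history might look like this, reusing the js and jsm handles from above (the AGENT_ACTIONS stream covering agent.> and the session ID are assumptions for illustration):

// Ephemeral consumer that replays every action one agent session logged.
// Assumes a stream AGENT_ACTIONS with subjects ["agent.>"] already exists.
const ci = await jsm.consumers.add("AGENT_ACTIONS", {
  ack_policy: AckPolicy.None,               // replay only; no redelivery tracking
  deliver_policy: DeliverPolicy.All,        // start from the first recorded action
  filter_subject: "agent.sess-42.actions",  // hypothetical session ID
});

const replay = await js.consumers.get("AGENT_ACTIONS", ci.name);
for await (const msg of await replay.consume()) {
  console.log(`[seq ${msg.seq}]`, msg.string());  // step through the agent's history
}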

Section 3: Stateful Data — Key-Value & Object Store

This is where NATS truly becomes a Redis and S3 replacement.

Key-Value Store

The KV Store is built on JetStream but provides a familiar key-value interface with some powerful additions: history, watchers, and TTL.

import { Kvm } from "@nats-io/kv";

const kvm = new Kvm(nc);

// Create a KV bucket
const kv = await kvm.create("sessions", {
  history: 5,      // Keep last 5 values per key
  ttl: 3_600_000,  // 1 hour TTL (the JS client takes milliseconds)
});

// Basic operations (KV keys may contain A-Z, a-z, 0-9, -, /, _, =, and .,
// so use dots rather than colons as separators)
await kv.put("session.user-123", JSON.stringify({
  userId: "user-123",
  role: "admin",
  lastActivity: Date.now()
}));

const entry = await kv.get("session.user-123");
console.log(`Session data:`, entry?.string());
console.log(`Revision:`, entry?.revision);  // Every update increments revision

// Optimistic locking with revision check
try {
  await kv.put("session.user-123", newData, {
    previousSeq: entry.revision
  });
} catch (err) {
  console.log("Concurrent modification detected!");
}

// Watch for changes (real-time updates)
const watch = await kv.watch();
for await (const update of watch) {
  console.log(`Key ${update.key} changed: ${update.operation}`);
}

Compare to Redis:

  • Redis requires separate Sentinel/Cluster for HA
  • No built-in history per key
  • Watchers require Pub/Sub subscriptions + careful coordination
  • TTL works, but no revision tracking

# CLI operations
$ nats kv put sessions "session.user-456" '{"userId":"user-456"}'
$ nats kv get sessions "session.user-456"
$ nats kv history sessions "session.user-456"

For AI Workloads:
The KV Store is perfect for agent conversation state. Store the current context window, tool results, and memory for each conversation under keys like agent.{session-id}.state. The history feature lets you track how state evolved, and watchers enable real-time coordination between agent components. With TTL, abandoned sessions automatically clean up.
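
A sketch of that pattern, reusing the kvm handle from above (the bucket name and session ID are illustrative):

// Per-session agent state with history and real-time watchers
const agentState = await kvm.create("agent-state", { history: 10 });

// Persist the context after each agent step
await agentState.put("agent.sess-42.state", JSON.stringify({
  turn: 3,
  toolResults: ["search: 5 hits"],
  summary: "User is asking about invoice #1042",
}));

// Another component reacts to state changes across all sessions
const updates = await agentState.watch({ key: "agent.*.state" });
for await (const update of updates) {
  console.log(`${update.key} -> revision ${update.revision}`);
}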

Object Store

For larger objects—model weights, configuration files, training data—the Object Store handles arbitrarily large files by chunking them across JetStream.

import { promises as fs } from "node:fs";
import { StorageType } from "@nats-io/jetstream";
import { Objm } from "@nats-io/obj";

const objm = new Objm(nc);

// Create object store bucket
const store = await objm.create("models", {
  storage: StorageType.File,  // Or Memory
  description: "ML model storage"
});

// Upload a model file
const modelData = await fs.readFile("./model-v2.3.bin");
const info = await store.putBlob({
  name: "gpt-classifier/v2.3",
  description: "GPT-based text classifier, trained 2024-01",
}, modelData);

console.log(`Stored: ${info.name}, ${info.size} bytes, ${info.chunks} chunks`);
console.log(`Digest: ${info.digest}`);  // SHA-256 for integrity verification

// Download a model (getBlob buffers the chunks into a single Uint8Array)
const data = await store.getBlob("gpt-classifier/v2.3");

// List all models
const models = await store.list();
for (const model of models) {
  console.log(`${model.name}: ${model.size} bytes`);
}

# CLI operations
$ nats object put models ./model-v2.3.bin --name "gpt-classifier/v2.3"
$ nats object get models "gpt-classifier/v2.3" --output ./downloaded-model.bin
$ nats object ls models

For AI Workloads:
Object Store handles model distribution elegantly. Store your model weights, LoRA adapters, or configuration files in NATS, and inference servers can pull the latest versions on startup or when notified via KV watchers. Combined with a KV bucket for metadata (e.g., a gpt-classifier.current-version key), you get a complete model registry without S3 or a separate artifact store.
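
Here’s a hedged sketch of that registry pattern, reusing the store handle from above (the model-registry bucket and key names are illustrative):

import { Kvm } from "@nats-io/kv";

// KV holds the version pointer; the Object Store holds the weights
const registry = await new Kvm(nc).create("model-registry");

// An inference server resolves the current version on startup...
const current = await registry.get("gpt-classifier.current-version");
if (current) {
  const weights = await store.getBlob(`gpt-classifier/${current.string()}`);
}

// ...and hot-reloads whenever the pointer changes
const versions = await registry.watch({ key: "gpt-classifier.current-version" });
for await (const v of versions) {
  console.log(`New model version published: ${v.string()}`);
  const weights = await store.getBlob(`gpt-classifier/${v.string()}`);
}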

Section 4: Security & Multi-Tenancy

NATS has a sophisticated security model built around accounts, users, and JWTs. This isn’t an afterthought—it’s designed for multi-tenant SaaS and zero-trust environments.

Basic Configuration

# nats-server.conf
authorization {
  users = [
    {
      user: "order-service"
      password: "$2a$11$..."
      permissions: {
        publish: ["orders.>"]
        subscribe: ["orders.*.status"]
      }
    },
    {
      user: "analytics-service"  
      password: "$2a$11$..."
      permissions: {
        publish: []  # Cannot publish
        subscribe: ["orders.>"]  # Read-only access to all orders
      }
    }
  ]
}

Advanced: Account-Based Multi-Tenancy

For proper isolation, NATS Accounts provide complete separation:

accounts {
  TEAM_A: {
    users: [
      { user: team_a_admin, password: "..." }
    ]
    jetstream: enabled
  }
  TEAM_B: {
    users: [
      { user: team_b_admin, password: "..." }
    ]
    jetstream: enabled
  }
}

Each account gets its own isolated namespace—Team A’s orders stream is completely separate from Team B’s, even though they use the same NATS cluster.

For production deployments, NATS supports JWT-based authentication with the nsc tool for key management, enabling decentralized, zero-trust security without a central auth database.
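
A minimal sketch of that setup with nsc (entity names are illustrative):

# Create an operator, an account, and a user; each gets a signed JWT
# and an NKey seed, so the server can verify the trust chain offline
nsc add operator my-operator
nsc add account TEAM_A
nsc add user --account TEAM_A team_a_admin

# Inspect the account's JWT, limits, and signing keys
nsc describe account TEAM_A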

Section 5: Bringing It All Together for AI

Let’s paint the complete picture. You’re building an AI agent system:

┌─────────────────────────────────────────────────────────────────┐
│                          NATS Cluster                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐          │
│  │  JetStream  │    │   KV Store  │    │Object Store │          │
│  │   Streams   │    │   Buckets   │    │   Buckets   │          │
│  ├─────────────┤    ├─────────────┤    ├─────────────┤          │
│  │ agent.*.    │    │ sessions    │    │ models      │          │
│  │   actions   │    │ agent-state │    │ embeddings  │          │
│  │ inference.  │    │ config      │    │ documents   │          │
│  │   results   │    │             │    │             │          │
│  └─────────────┘    └─────────────┘    └─────────────┘          │
│                                                                 │
│  Request-Reply: inference.gpt4.query ←→ Model Servers           │
│  Pub-Sub: events.* → Multiple Subscribers                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

The agent workflow:

  1. User sends message → Published to chat.{session-id}.input

  2. Agent retrieves state → KV get agent-state:{session-id}

  3. Agent queries model → Request-Reply to inference.gpt4.query

  4. Agent logs reasoning → JetStream publish to agent.{session-id}.actions

  5. Agent updates state → KV put agent-state:{session-id} with new context

  6. Agent loads tools → Object Store get tools/web-search/config.json

  7. Response sent → Published to chat.{session-id}.output

All of this happens within a single NATS cluster. No Redis for state. No Kafka for event logs. No S3 for artifacts. One system. One operational playbook. One set of metrics.
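
Here’s a condensed sketch of that loop, reusing the nc, js, and agentState handles from earlier sections (stream, bucket, and subject names follow the diagram; the inference payload shape is a simplified assumption, and the tool-loading step is omitted):

// Assumes: a stream covering agent.>, a KV bucket "agent-state",
// and inference workers answering on inference.gpt4.query
const chat = nc.subscribe("chat.*.input");
for await (const msg of chat) {
  const sessionId = msg.subject.split(".")[1];

  // 2. Retrieve conversation state
  const entry = await agentState.get(`agent.${sessionId}.state`);
  const context = entry ? JSON.parse(entry.string()) : { history: [] };

  // 3. Query the model via request-reply
  const reply = await nc.request(
    "inference.gpt4.query",
    JSON.stringify({ prompt: msg.string(), context }),
    { timeout: 30_000 }
  );
  const { completion } = JSON.parse(reply.string());

  // 4. Log the step durably for audit and replay
  await js.publish(`agent.${sessionId}.actions`,
    JSON.stringify({ input: msg.string(), completion, at: Date.now() }));

  // 5. Persist updated state
  context.history.push({ input: msg.string(), completion });
  await agentState.put(`agent.${sessionId}.state`, JSON.stringify(context));

  // 7. Send the response back to the user
  nc.publish(`chat.${sessionId}.output`, completion);
}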

Getting Started

  1. Install NATS Server:

# macOS
brew install nats-server

# Or download from https://nats.io/download/

  2. Install the NATS CLI:

brew install nats-io/nats-tools/nats

  3. Start with JetStream:

nats-server -js

  4. Explore:

# Check server status
nats server info

# Create your first stream
nats stream add TEST --subjects "test.*" --defaults

# Publish a message
nats pub test.hello "Hello NATS!"

# Create a KV bucket
nats kv add config
nats kv put config app.version "1.0.0"
nats kv get config app.version

The Shift

Here’s what I want you to take away:

Stop thinking of your messaging infrastructure as three separate concerns (queuing, caching, storage) that require three separate systems. NATS unifies them under a single, operationally simple platform.

When you’re designing a system and you think “I need Kafka for event streaming, Redis for caching, and S3 for blobs,” ask yourself: “Can NATS just handle all of this?”

Often the answer is yes, and your architecture becomes dramatically simpler as a result.

NATS won’t replace PostgreSQL for your transactional data. But it can replace the sprawling collection of infrastructure that sits between your services—and that’s where most operational pain lives.

Give it a shot. Start with Core NATS, add JetStream when you need persistence, and see how far you can go with a single nats-server -js before reaching for anything else.

What’s your experience with NATS? Are you considering it as a Kafka/Redis replacement? I’d love to hear what you’re building.
