Most platform engineers know this architecture all too well.
You’re running a Kafka cluster with Zookeeper (or the newer KRaft mode that still requires careful tuning). Redis Sentinel is handling your sessions and rate limiting. S3 or MinIO stores your model weights and configuration blobs. Each system has its own operational playbook, monitoring stack, and failure modes you’ve learned the hard way.
When the 3 AM pages come, you’re not just debugging your application—you’re debugging the infrastructure that supports it.
What if I told you there’s a single binary that can replace all of that?
NATS isn’t new. It’s been quietly powering some of the world’s largest distributed systems for over a decade. But with JetStream, Key-Value Store, and Object Store now mature and production-ready, NATS has evolved from a lightweight pub/sub system into a genuine unified messaging platform.
Let me show you what that actually looks like in practice.
What We’re Replacing (And Why)
Before we dive in, let’s be clear about what NATS brings to the table:
| Feature | Traditional Stack | NATS Equivalent |
|---|---|---|
| Pub/Sub messaging | Redis Pub/Sub, Kafka topics | Core NATS subjects |
| Persistent streaming | Kafka, Pulsar, AWS Kinesis | JetStream streams |
| Key-Value storage | Redis, etcd, Consul KV | NATS KV Store |
| Object/blob storage | S3, MinIO, Redis Streams | NATS Object Store |
| Request-Reply | Custom HTTP, gRPC | Built-in NATS pattern |
| Service discovery | Consul, Eureka, K8s DNS | NATS Services framework |
The operational difference is stark:
Kafka deployment:
- Zookeeper cluster (3+ nodes) OR KRaft coordination
- Broker nodes with careful partition planning
- Schema Registry for Avro/Protobuf
- Kafka Connect for data pipelines
- Separate monitoring (JMX, Prometheus exporters)
NATS deployment:
nats-server -js
That’s it. One binary. No external dependencies. JetStream enabled.
Section 1: Foundation — Core NATS
Core NATS is beautifully simple. Messages flow through subjects using publish-subscribe and request-reply patterns. There’s no persistence at this layer—it’s pure, fire-and-forget messaging with sub-millisecond latency.
Starting the Server
# Download and run (that's really it)
nats-server
# Or with Docker
docker run -p 4222:4222 nats:latest
# Enable JetStream for persistence features
nats-server -js
Publish-Subscribe
import { connect } from "@nats-io/transport-node";
const nc = await connect({ servers: "localhost:4222" });
// Publisher (publish is fire-and-forget at this layer; it returns immediately)
nc.publish("events.user.signup", JSON.stringify({
userId: "user-123",
email: "dev@example.com",
timestamp: Date.now()
}));
// Subscriber (in another process)
const sub = nc.subscribe("events.user.*");
for await (const msg of sub) {
console.log(`Received on ${msg.subject}:`, msg.string());
}
Compare this to Kafka, where you’d need to:
- Create a topic with partitions and replication factor
- Configure producer serializers
- Set up consumer groups
- Handle partition assignment and rebalancing
With NATS, you publish. You subscribe. Done.
Request-Reply Pattern
This is where NATS shines for synchronous communication—no need for HTTP or gRPC for internal services:
// Service: respond to requests
const sub = nc.subscribe("api.users.get");
(async () => {
for await (const msg of sub) {
const request = JSON.parse(msg.string());
const user = await db.findUser(request.userId);
// Reply directly to the requester
msg.respond(JSON.stringify(user));
}
})();
// Client: make synchronous request
const response = await nc.request(
"api.users.get",
JSON.stringify({ userId: "user-123" }),
{ timeout: 5000 }
);
console.log("User:", JSON.parse(response.string()));
For AI Workloads:
Request-Reply is perfect for synchronous model inference. Your application sends a prompt to inference.gpt4.query, a model server picks it up, processes it, and returns the result—all without HTTP overhead or load balancer configuration. With NATS’s built-in load balancing across subscribers, you get automatic distribution across multiple inference workers.
Section 2: Persistent Streaming — JetStream
JetStream is NATS’s answer to Kafka. It provides durable message storage, exactly-once publish semantics via deduplication, and replay capabilities. Unlike Kafka, there’s no separate cluster to manage—JetStream runs inside the same NATS server.
Creating a Stream
# Using the NATS CLI
nats stream add ORDERS \
  --subjects "orders.*" \
  --retention limits \
  --max-msgs 1000000 \
  --max-bytes 1GB \
  --max-age 7d
Or programmatically:
import { connect } from "@nats-io/transport-node";
import { jetstreamManager, AckPolicy, DeliverPolicy } from "@nats-io/jetstream";
const nc = await connect({ servers: "localhost:4222" });
const jsm = await jetstreamManager(nc);
// Create stream
await jsm.streams.add({
name: "ORDERS",
subjects: ["orders.*"],
retention: "limits",
max_msgs: 1_000_000,
max_bytes: 1024 * 1024 * 1024, // 1GB
max_age: 7 * 24 * 60 * 60 * 1_000_000_000, // 7 days in nanoseconds
});
Publishing to JetStream
import { jetstream } from "@nats-io/jetstream";
const js = jetstream(nc);
// Publish with acknowledgment
const ack = await js.publish("orders.created", JSON.stringify({
orderId: "ORD-12345",
customerId: "CUST-789",
items: [{ sku: "WIDGET-01", quantity: 2 }],
total: 49.99
}));
console.log(`Stored in stream: ${ack.stream}, sequence: ${ack.seq}`);
// Publish with deduplication (exactly-once publish semantics)
await js.publish("orders.created", JSON.stringify({ orderId: "ORD-12346" }), {
msgID: "ORD-12346" // Duplicates with the same ID are dropped within the dedupe window (2 minutes by default)
});
Consuming with Durable Consumers
// Create a durable consumer
await jsm.consumers.add("ORDERS", {
durable_name: "order-processor",
ack_policy: AckPolicy.Explicit,
deliver_policy: DeliverPolicy.All, // Start from beginning
filter_subject: "orders.created",
max_ack_pending: 100,
});
// Consume messages
const consumer = await js.consumers.get("ORDERS", "order-processor");
const messages = await consumer.consume();
for await (const msg of messages) {
const order = JSON.parse(msg.string());
console.log(`Processing order: ${order.orderId}`);
await processOrder(order);
// Explicit acknowledgment
msg.ack();
}
Check stream status anytime:
$ nats stream info ORDERS
...
State:
Messages: 15,234
Bytes: 2.1 MB
First Sequence: 1 @ 2024-01-15 10:30:00
Last Sequence: 15,234 @ 2024-01-15 14:22:31
For AI Workloads:
JetStream is ideal for durable logging of agent actions and chain-of-thought traces. Every decision your AI agent makes can be published to an agent.{agent-id}.actions stream, creating an auditable, replayable history. When debugging why an agent made a particular decision, you can replay the exact sequence of events. For training pipelines, JetStream can durably queue inference requests, ensuring none are lost even during model server restarts.
Section 3: Stateful Data — Key-Value & Object Store
This is where NATS truly becomes a Redis and S3 replacement.
Key-Value Store
The KV Store is built on JetStream but provides a familiar key-value interface with some powerful additions: history, watchers, and TTL.
import { Kvm } from "@nats-io/kv";
const kvm = new Kvm(nc);
// Create a KV bucket
const kv = await kvm.create("sessions", {
history: 5, // Keep last 5 values per key
ttl: 3600_000, // 1 hour TTL (the client takes milliseconds here)
});
// Basic operations (KV keys are dot-delimited, like subjects)
await kv.put("session.user-123", JSON.stringify({
userId: "user-123",
role: "admin",
lastActivity: Date.now()
}));
const entry = await kv.get("session.user-123");
console.log(`Session data:`, entry?.string());
console.log(`Revision:`, entry?.revision); // Every update increments revision
// Optimistic locking with a revision check
try {
await kv.put("session.user-123", newData, {
previousSeq: entry.revision
});
} catch (err) {
console.log("Concurrent modification detected!");
}
// Watch for changes (real-time updates)
const watch = await kv.watch();
for await (const update of watch) {
console.log(`Key ${update.key} changed: ${update.operation}`);
}
Compare to Redis:
- Redis requires separate Sentinel/Cluster for HA
- No built-in history per key
- Watchers require Pub/Sub subscriptions + careful coordination
- TTL works, but no revision tracking
# CLI operations
$ nats kv put sessions "session.user-456" '{"userId":"user-456"}'
$ nats kv get sessions "session.user-456"
$ nats kv history sessions "session.user-456"
For AI Workloads:
The KV Store is perfect for agent conversation state. Store the current context window, tool results, and memory for each conversation under keys like agent.{session-id}.state. The history feature lets you track how state evolved, and watchers enable real-time coordination between agent components. With TTL, abandoned sessions automatically clean up.
Object Store
For larger objects—model weights, configuration files, training data—the Object Store handles arbitrarily large files by chunking them across JetStream.
import { Objm, StorageType } from "@nats-io/obj";
const objm = new Objm(nc);
// Create object store bucket
const store = await objm.create("models", {
storage: StorageType.File, // Or Memory
description: "ML model storage"
});
// Upload a model file (readFile comes from "node:fs/promises")
const modelData = await readFile("./model-v2.3.bin");
const info = await store.putBlob({
name: "gpt-classifier/v2.3",
description: "GPT-based text classifier, trained 2024-01",
}, modelData);
console.log(`Stored: ${info.name}, ${info.size} bytes, ${info.chunks} chunks`);
console.log(`Digest: ${info.digest}`); // SHA-256 for integrity verification
// Download a model (getBlob buffers the entire object in memory)
const data = await store.getBlob("gpt-classifier/v2.3");
// List all models
const models = await store.list();
for (const model of models) {
console.log(`${model.name}: ${model.size} bytes`);
}
# CLI operations
$ nats object put models ./model-v2.3.bin --name "gpt-classifier/v2.3"
$ nats object get models "gpt-classifier/v2.3" --output ./downloaded-model.bin
$ nats object ls models
For AI Workloads:
Object Store handles model distribution elegantly. Store your model weights, LoRA adapters, or configuration files in NATS, and inference servers can pull the latest versions on startup or when notified via KV watchers. Combined with KV for metadata (models.gpt-classifier.current-version), you get a complete model registry without S3 or a separate artifact store.
Section 4: Security & Multi-Tenancy
NATS has a sophisticated security model built around accounts, users, and JWTs. This isn’t an afterthought—it’s designed for multi-tenant SaaS and zero-trust environments.
Basic Configuration
# nats-server.conf
authorization {
users = [
{
user: "order-service"
password: "$2a$11$..."
permissions: {
publish: ["orders.>"]
subscribe: ["orders.*.status"]
}
},
{
user: "analytics-service"
password: "$2a$11$..."
permissions: {
publish: [] # Cannot publish
subscribe: ["orders.>"] # Read-only access to all orders
}
}
]
}
Advanced: Account-Based Multi-Tenancy
For proper isolation, NATS Accounts provide complete separation:
accounts {
TEAM_A: {
users: [
{ user: team_a_admin, password: "..." }
]
jetstream: enabled
}
TEAM_B: {
users: [
{ user: team_b_admin, password: "..." }
]
jetstream: enabled
}
}
Each account gets its own isolated namespace—Team A’s orders stream is completely separate from Team B’s, even though they use the same NATS cluster.
For production deployments, NATS supports JWT-based authentication with the nsc tool for key management, enabling decentralized, zero-trust security without a central auth database.
Section 5: Bringing It All Together for AI
Let’s paint the complete picture. You’re building an AI agent system:
┌─────────────────────────────────────────────────────────────────┐
│ NATS Cluster │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ JetStream │ │ KV Store │ │Object Store │ │
│ │ Streams │ │ Buckets │ │ Buckets │ │
│ ├─────────────┤ ├─────────────┤ ├─────────────┤ │
│ │ agent.*. │ │ sessions │ │ models │ │
│ │ actions │ │ agent-state │ │ embeddings │ │
│ │ inference. │ │ config │ │ documents │ │
│ │ results │ │ │ │ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ Request-Reply: inference.gpt4.query ←→ Model Servers │
│ Pub-Sub: events.* → Multiple Subscribers │
│ │
└─────────────────────────────────────────────────────────────────┘
The agent workflow:
- User sends message → Published to chat.{session-id}.input
- Agent retrieves state → KV get agent-state.{session-id}
- Agent queries model → Request-Reply to inference.gpt4.query
- Agent logs reasoning → JetStream publish to agent.{session-id}.actions
- Agent updates state → KV put agent-state.{session-id} with new context
- Agent loads tools → Object Store get tools/web-search/config.json
- Response sent → Published to chat.{session-id}.output
All of this happens within a single NATS cluster. No Redis for state. No Kafka for event logs. No S3 for artifacts. One system. One operational playbook. One set of metrics.
Getting Started
- Install NATS Server:
# macOS
brew install nats-server
# Or download from https://nats.io/download/
- Install the NATS CLI:
brew install nats-io/nats-tools/nats
- Start with JetStream:
nats-server -js
- Explore:
# Check server status
nats server info
# Create your first stream
nats stream add TEST --subjects "test.*" --defaults
# Publish a message
nats pub test.hello "Hello NATS!"
# Create a KV bucket
nats kv add config
nats kv put config app.version "1.0.0"
nats kv get config app.version
The Shift
Here’s what I want you to take away:
Stop thinking of your messaging infrastructure as three separate concerns (queuing, caching, storage) that require three separate systems. NATS unifies them under a single, operationally simple platform.
When you’re designing a system and you think “I need Kafka for event streaming, Redis for caching, and S3 for blobs,” ask yourself: “Can NATS just handle all of this?”
Often the answer is yes, and your architecture becomes dramatically simpler as a result.
NATS won’t replace PostgreSQL for your transactional data. But it can replace the sprawling collection of infrastructure that sits between your services—and that’s where most operational pain lives.
Give it a shot. Start with Core NATS, add JetStream when you need persistence, and see how far you can go with a single nats-server -js before reaching for anything else.
Resources
- NATS Documentation
- NATS CLI Reference
- NATS.js Client
- NATS .NET Client
- JetStream Deep Dive
- NATS by Example
What’s your experience with NATS? Are you considering it as a Kafka/Redis replacement? I’d love to hear what you’re building.
