async-nats: NATS Protocol & Wire Format
Protocol Overview
NATS uses a simple, text-based protocol over TCP. Every control line and payload is terminated with \r\n. Client and server each send their own set of operations; only PING and PONG flow in both directions.
Client → Server Operations (ClientOp)
pub(crate) enum ClientOp {
    Publish { subject, payload, respond, headers },
    Subscribe { sid, subject, queue_group },
    Unsubscribe { sid, max },
    Ping,
    Pong,
    Connect(ConnectInfo),
}
Server → Client Operations (ServerOp)
pub(crate) enum ServerOp {
    Ok,
    Info(Box<ServerInfo>),
    Ping,
    Pong,
    Error(ServerError),
    Message { sid, subject, reply, payload, headers, status, description, length },
}
Wire Format: Client Operations
CONNECT
Sent immediately after receiving the first INFO from the server:
CONNECT {"verbose":false,"pedantic":false,...}\r\n
The JSON payload is ConnectInfo serialized inline on the same line.
PUB (Publish without headers)
PUB <subject> [reply-to] <payload-size>\r\n
<payload>\r\n
Example:
PUB events.data INBOX.67 11\r\n
Hello World\r\n
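The PUB layout above can be sketched as a small encoder. This is a minimal illustration, not the library's serializer; `encode_pub` is a hypothetical helper name.

```rust
/// Encode a PUB frame: control line, payload, trailing CRLF.
/// `encode_pub` is a hypothetical helper, not part of async-nats.
fn encode_pub(subject: &str, reply: Option<&str>, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(subject.len() + payload.len() + 32);
    out.extend_from_slice(b"PUB ");
    out.extend_from_slice(subject.as_bytes());
    if let Some(reply) = reply {
        out.push(b' ');
        out.extend_from_slice(reply.as_bytes());
    }
    // The size field counts payload bytes only, not the trailing CRLF.
    out.extend_from_slice(format!(" {}\r\n", payload.len()).as_bytes());
    out.extend_from_slice(payload);
    out.extend_from_slice(b"\r\n");
    out
}
```

Calling `encode_pub("events.data", Some("INBOX.67"), b"Hello World")` yields the bytes of the example frame shown above.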
HPUB (Publish with headers)
When headers are present and non-empty:
HPUB <subject> [reply-to] <header-size> <total-size>\r\n
<headers>\r\n
<payload>\r\n
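<total-size> = <header-size> + <payload-size>, where <header-size> counts the entire header block, including the blank line (\r\n) that terminates it.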
Header block format:
NATS/1.0\r\n
Header-Name: Header-Value\r\n
Another-Header: Another-Value\r\n
\r\n
The version line (NATS/1.0) may include a status code and description:
NATS/1.0 404 No Messages\r\n
\r\n
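To make the two size fields concrete, here is a hedged sketch of HPUB encoding. The function names are hypothetical, and header values are assumed to be single-line strings.

```rust
/// Serialize headers into a NATS/1.0 block, including the terminating blank line.
/// Hypothetical helper for illustration only.
fn encode_header_block(headers: &[(&str, &str)]) -> Vec<u8> {
    let mut block = b"NATS/1.0\r\n".to_vec();
    for (name, value) in headers {
        block.extend_from_slice(format!("{}: {}\r\n", name, value).as_bytes());
    }
    block.extend_from_slice(b"\r\n");
    block
}

/// Encode an HPUB frame. header-size covers the whole header block;
/// total-size adds the payload length on top of it.
fn encode_hpub(
    subject: &str,
    reply: Option<&str>,
    headers: &[(&str, &str)],
    payload: &[u8],
) -> Vec<u8> {
    let block = encode_header_block(headers);
    let header_size = block.len();
    let total_size = header_size + payload.len();

    let mut out = format!("HPUB {} ", subject).into_bytes();
    if let Some(reply) = reply {
        out.extend_from_slice(format!("{} ", reply).as_bytes());
    }
    out.extend_from_slice(format!("{} {}\r\n", header_size, total_size).as_bytes());
    out.extend_from_slice(&block);
    out.extend_from_slice(payload);
    out.extend_from_slice(b"\r\n");
    out
}
```

With one header `Bar: Baz` and the payload `Hello NATS!`, the header block is 22 bytes and the total is 33, so the control line reads `HPUB FOO 22 33`.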
SUB (Subscribe)
SUB <subject> [queue-group] <sid>\r\n
The sid (subscription ID) is a client-assigned u64, unique per connection.
UNSUB (Unsubscribe)
UNSUB <sid> [max]\r\n
The optional max tells the server to auto-unsubscribe after max messages are delivered.
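Both control lines are simple enough to build with `format!`. The helper names below are hypothetical and only illustrate where the optional fields go.

```rust
/// Build a SUB control line. The queue group, when present, sits between
/// the subject and the sid. Hypothetical helper, not the library's API.
fn encode_sub(subject: &str, queue_group: Option<&str>, sid: u64) -> String {
    match queue_group {
        Some(group) => format!("SUB {} {} {}\r\n", subject, group, sid),
        None => format!("SUB {} {}\r\n", subject, sid),
    }
}

/// Build an UNSUB control line with an optional auto-unsubscribe limit.
fn encode_unsub(sid: u64, max: Option<u64>) -> String {
    match max {
        Some(max) => format!("UNSUB {} {}\r\n", sid, max),
        None => format!("UNSUB {}\r\n", sid),
    }
}
```

For example, `encode_unsub(7, Some(5))` asks the server to drop subscription 7 after five messages have been delivered.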
PING / PONG
PING\r\n
PONG\r\n
Client sends PING periodically (default every 60s). If more than 2 pings are pending without a PONG, the connection is considered dead.
Wire Format: Server Operations
INFO
First message sent by the server on connection:
INFO {"server_id":"NATSxxx","version":"2.10"...}\r\n
Also sent asynchronously when cluster topology changes.
MSG (Message without headers)
MSG <subject> <sid> [reply-to] <payload-size>\r\n
<payload>\r\n
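Because reply-to is optional, a MSG control line has either four or five tokens. A minimal sketch of telling them apart follows; `MsgArgs` and `parse_msg_args` are hypothetical names, and the input is assumed to be the control line without its trailing \r\n.

```rust
/// Fields of a MSG control line. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
struct MsgArgs<'a> {
    subject: &'a str,
    sid: u64,
    reply: Option<&'a str>,
    size: usize,
}

/// Parse "MSG <subject> <sid> [reply-to] <payload-size>".
/// Returns None for any other shape or for non-numeric sid/size.
fn parse_msg_args(line: &str) -> Option<MsgArgs<'_>> {
    let tokens: Vec<&str> = line.split_whitespace().collect();
    match tokens.as_slice() {
        ["MSG", subject, sid, size] => Some(MsgArgs {
            subject: *subject,
            sid: sid.parse().ok()?,
            reply: None,
            size: size.parse().ok()?,
        }),
        ["MSG", subject, sid, reply, size] => Some(MsgArgs {
            subject: *subject,
            sid: sid.parse().ok()?,
            reply: Some(*reply),
            size: size.parse().ok()?,
        }),
        _ => None,
    }
}
```

The parsed `size` tells the reader how many payload bytes to expect after the control line.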
HMSG (Message with headers)
HMSG <subject> <sid> [reply-to] <header-size> <total-size>\r\n
<headers + payload>\r\n
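The header portion of an HMSG can be split off using the header size and then parsed line by line. The sketch below is a simplified, std-only illustration: `ParsedHeaders` and `parse_header_block` are hypothetical names, and folded (multi-line) header values are deliberately not handled.

```rust
/// Result of parsing a header block. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
struct ParsedHeaders {
    status: Option<u16>,
    description: Option<String>,
    fields: Vec<(String, String)>,
}

/// Parse "NATS/1.0[ <status>[ <description>]]" followed by "Name: Value" lines.
/// Simplification: folded header values are not supported here.
fn parse_header_block(block: &str) -> Option<ParsedHeaders> {
    let mut lines = block.split("\r\n");
    let rest = lines.next()?.strip_prefix("NATS/1.0")?.trim();

    let (status, description) = if rest.is_empty() {
        (None, None)
    } else {
        let (code, desc) = match rest.split_once(' ') {
            Some((code, desc)) => (code, Some(desc.trim().to_string())),
            None => (rest, None),
        };
        (Some(code.parse::<u16>().ok()?), desc)
    };

    let mut fields = Vec::new();
    // The block ends at the first empty line.
    for line in lines.take_while(|l| !l.is_empty()) {
        let (name, value) = line.split_once(':')?;
        fields.push((name.to_string(), value.trim().to_string()));
    }
    Some(ParsedHeaders { status, description, fields })
}
```

A status-only block such as `NATS/1.0 404 No Messages` parses to a status of 404, a description of "No Messages", and no fields.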
+OK / -ERR
+OK\r\n
-ERR <description>\r\n
+OK is sent only when verbose=true in CONNECT. The client always sets verbose=false, so +OK is not expected. -ERR is sent regardless of the verbose setting whenever the server reports an error.
Protocol Parser
The Connection struct handles all protocol parsing and serialization:
Read Path (try_read_op)
- Search for \r\n in read_buf using memchr::memmem::find
- Inspect the first bytes to determine the operation type:
  - +OK → ServerOp::Ok
  - PING → ServerOp::Ping
  - PONG → ServerOp::Pong
  - -ERR → ServerOp::Error(...) (description is trimmed with trim_matches('\''))
  - INFO → ServerOp::Info(...) (serde_json deserialization)
  - MSG → Parse subject/sid/reply/size, then read payload
  - HMSG → Parse subject/sid/reply/header_len/total_len, then read headers + payload
- For MSG/HMSG: if the full message body hasn't been read yet, return None (wait for more data)
- For HMSG: parse the header block: extract version line (NATS/1.0[ <status>[ <description>]]), then key-value pairs (supports folded/multi-line header values)
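The framing steps above can be sketched with the standard library alone. This is a simplified stand-in, not the library's `try_read_op`: the `Op` enum is hypothetical, INFO and HMSG are omitted, `find_crlf` replaces `memchr::memmem::find` with a plain scan, and malformed input is reported as `None` where real code would return an error.

```rust
/// Reduced set of decoded operations. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum Op {
    Ok,
    Ping,
    Pong,
    Err(String),
    Msg { control: String, payload: Vec<u8> },
}

/// Find the first CRLF. A std-only substitute for memchr::memmem::find.
fn find_crlf(buf: &[u8]) -> Option<usize> {
    buf.windows(2).position(|w| w == b"\r\n")
}

/// Try to decode one operation from the front of `buf`, removing the bytes it
/// consumed. Returns None when more data is needed (or the line is unknown).
fn try_read_op(buf: &mut Vec<u8>) -> Option<Op> {
    let line_end = find_crlf(buf)?;
    let line = String::from_utf8_lossy(&buf[..line_end]).into_owned();

    let op = if line == "+OK" {
        Op::Ok
    } else if line == "PING" {
        Op::Ping
    } else if line == "PONG" {
        Op::Pong
    } else if let Some(desc) = line.strip_prefix("-ERR") {
        Op::Err(desc.trim().trim_matches('\'').to_string())
    } else if line.starts_with("MSG ") {
        // The payload size is always the last token of the control line.
        let size: usize = line.rsplit(' ').next()?.parse().ok()?;
        let body_start = line_end + 2;
        let body_end = body_start + size;
        if buf.len() < body_end + 2 {
            return None; // body not fully buffered yet; leave buf untouched
        }
        let payload = buf[body_start..body_end].to_vec();
        buf.drain(..body_end + 2);
        return Some(Op::Msg { control: line, payload });
    } else {
        return None; // unknown operation; real code would surface an error
    };

    buf.drain(..line_end + 2);
    Some(op)
}
```

The important property is that a partial MSG leaves the buffer unchanged, so the same call succeeds once the remaining bytes have been appended.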
Write Path (enqueue_write_op)
Writes are buffered using one of two strategies, chosen by size:
- Small writes (< 4096 bytes): flattened into flattened_writes: BytesMut
- Large writes (≥ 4096 bytes): appended as separate Bytes chunks in write_buf: VecDeque<Bytes>
This enables efficient vectored I/O when the underlying stream supports it.
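A rough model of this size-based split, using `Vec<u8>` in place of `BytesMut`/`Bytes`. The `WriteQueue` type, the `FLATTEN_THRESHOLD` constant name, and the step that seals pending small writes before queuing a large one (to preserve ordering) are assumptions made for this sketch.

```rust
use std::collections::VecDeque;

/// Threshold taken from the text above; the constant name is hypothetical.
const FLATTEN_THRESHOLD: usize = 4096;

/// Simplified model of the write buffers, with Vec<u8> standing in for
/// BytesMut and Bytes.
#[derive(Default)]
struct WriteQueue {
    /// Small writes coalesced into one contiguous buffer.
    flattened_writes: Vec<u8>,
    /// Chunks ready to be written, in order.
    write_buf: VecDeque<Vec<u8>>,
}

impl WriteQueue {
    fn enqueue(&mut self, data: Vec<u8>) {
        if data.len() < FLATTEN_THRESHOLD {
            self.flattened_writes.extend_from_slice(&data);
        } else {
            // Assumption: seal pending small writes first so byte order is kept.
            self.seal_flattened();
            self.write_buf.push_back(data);
        }
    }

    /// Move the coalesced small writes into the chunk queue.
    fn seal_flattened(&mut self) {
        if !self.flattened_writes.is_empty() {
            let chunk = std::mem::take(&mut self.flattened_writes);
            self.write_buf.push_back(chunk);
        }
    }
}
```

Many small publishes end up as one contiguous chunk, while a large payload is queued as its own chunk without being copied into the flattened buffer.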
Write Flush Strategy
The should_flush() method returns:
- Yes: buffers empty but haven't flushed yet
- May: buffers not empty and haven't flushed
- No: already flushed or nothing to flush
The ConnectionHandler calls poll_flush() after processing commands, ensuring data is actually sent to the server.
Vectored I/O
When stream.is_write_vectored() returns true, the connection uses poll_write_vectored() to write up to 64 IoSlices at once. Submitting many buffered chunks in one call reduces the number of writes needed, which matters most for bursty publish patterns.
const WRITE_VECTORED_CHUNKS: usize = 64;
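The same idea can be shown with the synchronous `std::io::Write::write_vectored`. This is an analogy to the async `poll_write_vectored()` path rather than the library's code; `write_some` is a hypothetical function that also handles partial writes by trimming the queue.

```rust
use std::collections::VecDeque;
use std::io::{self, IoSlice, Write};

const WRITE_VECTORED_CHUNKS: usize = 64;

/// Submit up to WRITE_VECTORED_CHUNKS queued chunks in one vectored write,
/// then remove the bytes the writer accepted. Returns the bytes written.
fn write_some<W: Write>(writer: &mut W, queue: &mut VecDeque<Vec<u8>>) -> io::Result<usize> {
    let slices: Vec<IoSlice<'_>> = queue
        .iter()
        .take(WRITE_VECTORED_CHUNKS)
        .map(|chunk| IoSlice::new(chunk))
        .collect();
    let written = writer.write_vectored(&slices)?;

    // A vectored write may be partial: drop whole chunks, trim the last one.
    let mut remaining = written;
    while remaining > 0 {
        let front_len = match queue.front() {
            Some(front) => front.len(),
            None => break,
        };
        if front_len <= remaining {
            remaining -= front_len;
            queue.pop_front();
        } else {
            if let Some(front) = queue.front_mut() {
                front.drain(..remaining);
            }
            remaining = 0;
        }
    }
    Ok(written)
}
```

With 100 one-byte chunks queued and a `Vec<u8>` as the writer, the first call submits 64 chunks and the second call drains the remaining 36.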
WebSocket Transport
When the websockets feature is enabled, WebSocketAdapter<T> wraps tokio_websockets::WebSocketStream<T> to implement AsyncRead + AsyncWrite, making WebSocket connections transparent to the protocol layer.
#[cfg(feature = "websockets")]
pub(crate) struct WebSocketAdapter<T> {
    pub(crate) inner: WebSocketStream<T>,
    pub(crate) read_buf: BytesMut,
}
WebSocket connections use ws:// or wss:// scheme in the server URL. TLS for wss:// is handled by the WebSocket library's built-in TLS support.
Connection Lifecycle
Initial Connection Flow
Client Server
│ │
│──── TCP connect ────────────────────▶ │
│◀──── INFO {server_id, nonce, ...} ─── │
│──── CONNECT {auth, ...} ──────────▶ │
│──── PING ─────────────────────────▶ │
│◀──── PONG (or -ERR) ─────────────── │
│ │
│ [connected, ConnectionHandler runs] │
If tls_first is enabled, TLS is established before reading INFO:
Client Server
│ │
│──── TCP connect ────────────────────▶ │
│──── TLS handshake ─────────────────▶ │
│◀──── TLS handshake ──────────────── │
│◀──── INFO {...} ──────────────────── │
│──── CONNECT + PING ────────────────▶ │
│◀──── PONG ────────────────────────── │
Ping/Pong Keepalive
- Client sends PING every ping_interval (default 60s)
- Server responds with PONG
- If pending_pings > MAX_PENDING_PINGS (2), connection is considered dead
- Any server operation resets the ping interval timer
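The counter logic behind these rules can be modeled as a tiny state machine. The `Keepalive` and `PingAction` types are hypothetical, the timer itself is not modeled, and decrementing the counter on each PONG is an assumption of this sketch.

```rust
const MAX_PENDING_PINGS: usize = 2;

/// What the handler should do when the ping timer fires. Hypothetical type.
#[derive(Debug, PartialEq)]
enum PingAction {
    SendPing,
    Disconnect,
}

/// Tracks PINGs that have not yet been answered.
#[derive(Default)]
struct Keepalive {
    pending_pings: usize,
}

impl Keepalive {
    /// Called each time the ping interval elapses.
    fn on_tick(&mut self) -> PingAction {
        self.pending_pings += 1;
        if self.pending_pings > MAX_PENDING_PINGS {
            PingAction::Disconnect
        } else {
            PingAction::SendPing
        }
    }

    /// Called when a PONG arrives (assumed to clear one pending ping).
    fn on_pong(&mut self) {
        self.pending_pings = self.pending_pings.saturating_sub(1);
    }
}
```

With no PONGs at all, the first two ticks send a PING and the third tick declares the connection dead.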
Reconnection Flow
On disconnect:
- handle_disconnect() sends Event::Disconnected and sets state to Disconnected
- handle_reconnect() calls connector.connect() which:
  - Shuffles servers (unless retain_servers_order)
  - Sorts by failed_attempts (ascending)
  - Iterates through servers with exponential backoff delay
  - On each server: DNS resolve → TCP connect → INFO → TLS (if needed) → CONNECT+PING → PONG
- On success:
  - Sends Event::Connected, sets state to Connected
  - Removes closed subscriptions
  - Re-subscribes all active subscriptions (with adjusted max = max - delivered)
  - Re-subscribes the multiplexer (if active)
- On failure with MaxReconnects reached, the handler loop exits
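Two pieces of this flow are easy to illustrate in isolation: ordering servers by failure count and computing the adjusted `max` on re-subscribe. `ServerEntry`, `order_servers`, and `adjusted_max` are hypothetical names; shuffling is left out because it would need a random number generator.

```rust
/// A known server plus its failure count. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
struct ServerEntry {
    addr: String,
    failed_attempts: usize,
}

/// Sort ascending by failed_attempts. The sort is stable, so servers with
/// equal counts keep their existing (shuffled or retained) relative order.
fn order_servers(servers: &mut [ServerEntry]) {
    servers.sort_by_key(|s| s.failed_attempts);
}

/// Remaining auto-unsubscribe budget to send when re-subscribing:
/// max - delivered, clamped at zero. None means no limit was set.
fn adjusted_max(max: Option<u64>, delivered: u64) -> Option<u64> {
    max.map(|m| m.saturating_sub(delivered))
}
```

A subscription created with `max = 10` that had already received 3 messages would be re-subscribed with a limit of 7.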
Default Reconnect Delay
Exponential backoff capped at 4 seconds:
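use std::{cmp, time::Duration};

fn reconnect_delay_callback_default(attempts: usize) -> Duration {
    if attempts <= 1 {
        // The first attempt reconnects immediately.
        Duration::from_millis(0)
    } else {
        // 2^(attempts - 1) milliseconds, saturating on overflow, capped at 4s.
        let exp: u32 = (attempts - 1).try_into().unwrap_or(u32::MAX);
        cmp::min(Duration::from_millis(2_u64.saturating_pow(exp)), Duration::from_secs(4))
    }
}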
| Attempt | Delay |
|---|---|
| 1 | 0ms |
| 2 | 2ms |
| 3 | 4ms |
| 4 | 8ms |
| 5 | 16ms |
| 6 | 32ms |
| 7 | 64ms |
| 8 | 128ms |
| 9 | 256ms |
| 10 | 512ms |
| 11 | 1024ms |
| 12 | 2048ms |
| 13+ | 4000ms (cap) |
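The delay for any attempt number can be checked by evaluating the default callback directly. The sketch below repeats the function so the block is self-contained; `delay_table` is a hypothetical helper added for illustration.

```rust
use std::cmp;
use std::convert::TryInto;
use std::time::Duration;

/// Copy of the default callback shown above, repeated for self-containment.
fn reconnect_delay_callback_default(attempts: usize) -> Duration {
    if attempts <= 1 {
        Duration::from_millis(0)
    } else {
        let exp: u32 = (attempts - 1).try_into().unwrap_or(u32::MAX);
        cmp::min(Duration::from_millis(2_u64.saturating_pow(exp)), Duration::from_secs(4))
    }
}

/// Hypothetical helper: (attempt, delay in milliseconds) for attempts 1..=max_attempt.
fn delay_table(max_attempt: usize) -> Vec<(usize, u128)> {
    (1..=max_attempt)
        .map(|attempt| (attempt, reconnect_delay_callback_default(attempt).as_millis()))
        .collect()
}
```

From the second attempt onward the delay doubles with each attempt until it reaches the 4-second cap, and `saturating_pow` keeps very large attempt counts from overflowing.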