# russh-sftp: Wire Protocol and Codec

## SFTP v3 Wire Format

The SFTP protocol (draft-ietf-secsh-filexfer-02) transmits packets over the SSH channel as:

```
┌────────────┬──────────┬─────────────────┐
│   length   │   type   │     payload     │
│  (u32 BE)  │   (u8)   │   (variable)    │
│  4 bytes   │  1 byte  │ length-1 bytes  │
└────────────┴──────────┴─────────────────┘
```

- `length` includes the type byte but not itself
- All multi-byte integers are **big-endian** (network byte order)
- Strings are encoded as `u32 length + UTF-8 bytes`
- Byte arrays are encoded as `u32 length + raw bytes`

### Packet Type Constants

Defined in `protocol/mod.rs`:

| Constant | Value | Direction | Description |
|----------|-------|-----------|-------------|
| `SSH_FXP_INIT` | 1 | C→S | Client initialization |
| `SSH_FXP_VERSION` | 2 | S→C | Server version response |
| `SSH_FXP_OPEN` | 3 | C→S | Open a file |
| `SSH_FXP_CLOSE` | 4 | C→S | Close a handle |
| `SSH_FXP_READ` | 5 | C→S | Read from a handle |
| `SSH_FXP_WRITE` | 6 | C→S | Write to a handle |
| `SSH_FXP_LSTAT` | 7 | C→S | Stat a path (no follow) |
| `SSH_FXP_FSTAT` | 8 | C→S | Stat an open handle |
| `SSH_FXP_SETSTAT` | 9 | C→S | Set file attributes by path |
| `SSH_FXP_FSETSTAT` | 10 | C→S | Set file attributes by handle |
| `SSH_FXP_OPENDIR` | 11 | C→S | Open a directory |
| `SSH_FXP_READDIR` | 12 | C→S | Read directory entries |
| `SSH_FXP_REMOVE` | 13 | C→S | Remove a file |
| `SSH_FXP_MKDIR` | 14 | C→S | Create a directory |
| `SSH_FXP_RMDIR` | 15 | C→S | Remove a directory |
| `SSH_FXP_REALPATH` | 16 | C→S | Canonicalize a path |
| `SSH_FXP_STAT` | 17 | C→S | Stat a path (follow symlinks) |
| `SSH_FXP_RENAME` | 18 | C→S | Rename a file |
| `SSH_FXP_READLINK` | 19 | C→S | Read a symbolic link |
| `SSH_FXP_SYMLINK` | 20 | C→S | Create a symbolic link |
| `SSH_FXP_STATUS` | 101 | S→C / C→S | Status response |
| `SSH_FXP_HANDLE` | 102 | S→C | Handle response |
| `SSH_FXP_DATA` | 103 | S→C | Data response |
| `SSH_FXP_NAME` | 104 | S→C | Name list response |
| `SSH_FXP_ATTRS` | 105 | S→C | File attributes response |
| `SSH_FXP_EXTENDED` | 200 | C→S | Extended request |
| `SSH_FXP_EXTENDED_REPLY` | 201 | S→C | Extended reply |

## Packet Reading

Wire I/O is handled by `utils::read_packet()`:

```rust
use bytes::Bytes;
use tokio::io::{AsyncRead, AsyncReadExt};

pub(crate) async fn read_packet<S: AsyncRead + Unpin>(
    stream: &mut S,
    max_length: u32,
) -> Result<Bytes, Error> {
    // The 4-byte big-endian length prefix counts the type byte plus the payload.
    let length = stream.read_u32().await?;
    // Reject oversized packets before allocating the buffer.
    if length > max_length {
        return Err(Error::BadMessage("packet length limit exceeded".to_owned()));
    }
    let mut buf = vec![0; length as usize];
    stream.read_exact(&mut buf).await?;
    Ok(Bytes::from(buf))
}
```

The returned buffer excludes the 4-byte length prefix and **includes the type byte** as its first byte, followed by the payload, so the caller can dispatch on the packet type before deserializing the rest.

## Packet Enum and Dispatch

All packets are unified into a single `Packet` enum:

```rust
pub enum Packet {
    Init(Init),
    Version(Version),
    Open(Open),
    Close(Close),
    Read(Read),
    Write(Write),
    Lstat(Lstat),
    Fstat(Fstat),
    SetStat(SetStat),
    FSetStat(FSetStat),
    OpenDir(OpenDir),
    ReadDir(ReadDir),
    Remove(Remove),
    MkDir(MkDir),
    RmDir(RmDir),
    RealPath(RealPath),
    Stat(Stat),
    Rename(Rename),
    ReadLink(ReadLink),
    Symlink(Symlink),
    Status(Status),
    Handle(Handle),
    Data(Data),
    Name(Name),
    Attrs(Attrs),
    Extended(Extended),
    ExtendedReply(ExtendedReply),
}
```

### Deserialization (`TryFrom<&mut Bytes> for Packet`)

Reads the type byte first, then delegates to the custom serde deserializer:

```rust
fn try_from(bytes: &mut Bytes) -> Result<Self, Self::Error> {
    // The first byte selects the variant; the remaining bytes are its payload.
    let r#type = bytes.try_get_u8()?;
    let packet = match r#type {
        SSH_FXP_INIT => Self::Init(de::from_bytes(bytes)?),
        SSH_FXP_OPEN => Self::Open(de::from_bytes(bytes)?),
        // ... one arm per variant, 27 in total
        _ => return Err(Error::BadMessage("unknown type".to_owned())),
    };
    Ok(packet)
}
```

### Serialization (`TryFrom<Packet> for Bytes`)

Converts each variant to bytes via `ser::to_bytes()`, prepends the type byte, and prefixes the result with the 4-byte length:

```rust
fn try_from(packet: Packet) -> Result<Self, Self::Error> {
    let (r#type, payload): (u8, Bytes) = match packet {
        Packet::Init(init) => (SSH_FXP_INIT, ser::to_bytes(&init)?),
        Packet::Open(open) => (SSH_FXP_OPEN, ser::to_bytes(&open)?),
        // ... all variants
    };
    // The length covers the type byte plus the payload, but not itself.
    let length = payload.len() as u32 + 1;
    let mut bytes = BytesMut::new();
    bytes.put_u32(length);
    bytes.put_u8(r#type);
    bytes.put_slice(&payload);
    Ok(bytes.freeze())
}
```

## Custom Serde Wire Codec

The crate implements a **custom serde `Serializer` and `Deserializer`** that map Rust types directly to the SFTP binary format. This is not JSON, Bincode, or any other standard serde format; it is a bespoke binary encoding matching the SFTP v3 wire specification.

### Serializer (`ser.rs`)

The `Serializer` writes directly into a `BytesMut` buffer:

| Rust Type | Wire Encoding |
|-----------|---------------|
| `u8` | 1 byte raw |
| `u32` | 4 bytes big-endian |
| `u64` | 8 bytes big-endian |
| `str` / `String` | `u32 length` + UTF-8 bytes |
| `bytes` | `u32 length` + raw bytes |
| `struct` | Fields concatenated in order (no field names) |
| `seq` | `u32 count` + elements |
| `map` | Key-value pairs (no length prefix) |
| `enum` | Variant index as `u32` + variant content |
| `None` | Nothing (zero bytes) |
| `Some(T)` | Serialized as `T` |
| `bool`, `i8`–`i64`, `u16`, `f32`/`f64`, `char` | **Not supported**: returns `BadMessage` error |

Key detail: `struct` serialization goes through `serialize_struct`, which delegates to `serialize_tuple`, so fields are written in declaration order with **no field names or tags**. This matches SFTP's positional binary layout.

The `data_serialize` helper serializes `Vec<u8>` as a raw byte sequence **without** a length prefix (used for `Extended.data` and `ExtendedReply.data`).
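The framing and serializer rules above can be traced by hand for a single packet. The following is a minimal sketch using only the standard library, not the crate's `Serializer`: the `put_u32` and `put_string` helpers, the trimmed-down `RealPath` struct, and `encode_realpath` are hypothetical names introduced for illustration.

```rust
// Hypothetical helpers; the crate itself writes into a `BytesMut` via serde.

/// Appends a u32 in big-endian (network) byte order.
fn put_u32(buf: &mut Vec<u8>, value: u32) {
    buf.extend_from_slice(&value.to_be_bytes());
}

/// Appends a string as `u32 length` + UTF-8 bytes.
fn put_string(buf: &mut Vec<u8>, value: &str) {
    put_u32(buf, value.len() as u32);
    buf.extend_from_slice(value.as_bytes());
}

/// Hypothetical stand-in for a request struct with positional fields.
struct RealPath {
    id: u32,
    path: String,
}

const SSH_FXP_REALPATH: u8 = 16;

/// Encodes the payload field by field, then wraps it as length + type + payload.
fn encode_realpath(packet: &RealPath) -> Vec<u8> {
    // Struct fields are written in declaration order, with no names or tags.
    let mut payload = Vec::new();
    put_u32(&mut payload, packet.id);
    put_string(&mut payload, &packet.path);

    // The length prefix counts the type byte plus the payload, but not itself.
    let mut frame = Vec::with_capacity(payload.len() + 5);
    put_u32(&mut frame, payload.len() as u32 + 1);
    frame.push(SSH_FXP_REALPATH);
    frame.extend_from_slice(&payload);
    frame
}

fn main() {
    let frame = encode_realpath(&RealPath { id: 7, path: ".".to_owned() });
    // length = 10, type = 16, id = 7, string length = 1, then the byte for "."
    assert_eq!(frame, [0u8, 0, 0, 10, 16, 0, 0, 0, 7, 0, 0, 0, 1, b'.']);
    println!("{:02x?}", frame);
}
```

Because nothing in the payload names its fields, a decoder has to know the struct layout for each packet type in advance, which is why dispatch on the type byte happens before any field is read.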
### Deserializer (`de.rs`)

The `Deserializer` reads from a `&mut Bytes` buffer, consuming bytes as it goes:

| Wire Pattern | Rust Deserialize Target |
|--------------|------------------------|
| 1 byte | `u8` |
| 4 bytes BE | `u32` |
| 8 bytes BE | `u64` |
| `u32 len` + bytes | `String` / `str` |
| `u32 len` + bytes | `Vec<u8>` / byte buf |
| `u32 count` + elements | `Vec<T>` / seq |
| Positional fields | struct (tuple-like) |
| `u32 variant` + content | enum |
| Key-value pairs | `HashMap<K, V>` |

The `data_deserialize` helper reads all remaining bytes into a `Vec<u8>` (no length prefix). It is used for `Extended.data` and `ExtendedReply.data`.

### TryBuf Helper (`buf.rs`)

A small extension trait on `bytes::Buf`:

```rust
pub trait TryBuf: Buf {
    fn try_get_bytes(&mut self) -> Result<Vec<u8>, Error>; // u32-length-prefixed
    fn try_get_string(&mut self) -> Result<String, Error>; // u32-length-prefixed UTF-8
}
```

The deserializer uses these internally to read SFTP's length-prefixed byte and string fields.

## FileAttributes Serialization

`FileAttributes` has a custom `Serialize`/`Deserialize` implementation because the SFTP wire format uses a **flags bitmask** to indicate which optional fields are present. This differs from serde's typical self-describing formats, where the presence of a field is encoded alongside the field itself.

### Serialization Flow

1. Compute the `FileAttr` flags bitmask based on which `Option` fields are `Some`:
   - `SIZE` (0x1): `size` is present
   - `UIDGID` (0x2): `uid`/`gid` are present
   - `PERMISSIONS` (0x4): `permissions` is present
   - `ACMODTIME` (0x8): `atime`/`mtime` are present
   - `EXTENDED` (0x80000000): extended fields (not yet implemented)
2. Write the flags as `u32`
3. Write each field only if its flag is set

### Deserialization Flow

1. Read the `u32` flags bitmask
2. Conditionally read fields based on which bits are set:
   - If `SIZE`: read `u64` for `size`
   - If `UIDGID`: read `u32` for `uid`, `u32` for `gid`
   - If `PERMISSIONS`: read `u32` for `permissions`
   - If `ACMODTIME`: read `u32` for `atime`, `u32` for `mtime`

Fields whose flag bit is clear are left as `None` in the `FileAttributes` struct.

## Request ID Tracking

All request packets (except `Init`) carry a `u32 id` field used as a request identifier. The `RequestId` trait and macro provide uniform access:

```rust
pub(crate) trait RequestId: Sized {
    fn get_request_id(&self) -> u32;
}

// Implements `RequestId` for any packet struct that has an `id: u32` field.
macro_rules! impl_request_id {
    ($packet:ty) => {
        impl RequestId for $packet {
            fn get_request_id(&self) -> u32 {
                self.id
            }
        }
    };
}
```

The server uses this to extract the request ID when constructing status responses on error, and the client uses it to demultiplex responses.
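Client-side demultiplexing by request ID could look roughly like the sketch below. It reuses the trait and macro shape shown above, but everything else is an assumption made for illustration: the trimmed-down `Status` and `Handle` structs, the `Pending` table, and its `register`/`complete` methods are hypothetical and are not the crate's client implementation.

```rust
use std::collections::HashMap;

/// Same shape as the trait shown above.
trait RequestId {
    fn get_request_id(&self) -> u32;
}

macro_rules! impl_request_id {
    ($packet:ty) => {
        impl RequestId for $packet {
            fn get_request_id(&self) -> u32 {
                self.id
            }
        }
    };
}

// Hypothetical, trimmed-down response structs for illustration only.
struct Status {
    id: u32,
    status_code: u32,
}

struct Handle {
    id: u32,
    handle: String,
}

impl_request_id!(Status);
impl_request_id!(Handle);

/// Hypothetical table of in-flight requests, keyed by request ID.
struct Pending {
    next_id: u32,
    in_flight: HashMap<u32, String>,
}

impl Pending {
    fn new() -> Self {
        Pending { next_id: 0, in_flight: HashMap::new() }
    }

    /// Allocates a fresh ID and remembers which request it belongs to.
    fn register(&mut self, description: &str) -> u32 {
        let id = self.next_id;
        self.next_id = self.next_id.wrapping_add(1);
        self.in_flight.insert(id, description.to_owned());
        id
    }

    /// Matches a reply to its request and removes it from the table.
    fn complete<P: RequestId>(&mut self, reply: &P) -> Option<String> {
        self.in_flight.remove(&reply.get_request_id())
    }
}

fn main() {
    let mut pending = Pending::new();
    let open_id = pending.register("open /tmp/a");
    let close_id = pending.register("close handle");

    // Replies may arrive in a different order than the requests were sent.
    let status = Status { id: close_id, status_code: 0 };
    let handle = Handle { id: open_id, handle: "h1".to_owned() };

    assert_eq!(pending.complete(&status).as_deref(), Some("close handle"));
    assert_eq!(pending.complete(&handle).as_deref(), Some("open /tmp/a"));
    // A reply with an unknown or already-completed ID matches nothing.
    assert_eq!(pending.complete(&status), None);
    println!("status code {}, handle {}", status.status_code, handle.handle);
}
```

Keying the table on the ID alone works because every response type exposes it through the same trait, so the lookup does not need to know which response variant arrived.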