Add Phase 2 definitions, terminology disambiguation, and reference research docs

- definitions.md: formal term disambiguation for overloaded concepts
  (service, interface, token, identity, domain) with cross-domain mapping
  tables (alknet ↔ Keystone, distributed git, rustfs) and 8 open questions
- references/rustfs/: research on rustfs S3 store, Keystone/OIDC integration,
  and credential mapping to CredentialSet
- references/gitserver/: research on gitserver library architecture and
  integration paths as HTTP MessageInterface and SSH adapter
- references/openstack-keystone/: research on Keystone identity concepts
  (tokens, scoping, service catalog, RBAC, trust delegation, federation)
  and what alknet should adopt vs skip
- references/distributed-identity/: research on decentralized git, smart
  contract ACL, on-chain identity, and Radicle comparison
2026-06-08 14:59:56 +00:00
parent a107aebeb7
commit f620a94705
5 changed files with 3731 additions and 0 deletions


@@ -0,0 +1,771 @@
# Research: Distributed Identity, Smart Contract ACL, and Decentralized Git
> Status: Research Reference
> Created: 2026-06-08
> Scope: Decentralized git hosting, distributed identity, smart contract-based access control, and their relevance to alknet
## Table of Contents
1. [Executive Summary](#1-executive-summary)
2. [Source Concept: NFT-Based Decentralized Git](#2-source-concept-nft-based-decentralized-git)
3. [Existing Projects](#3-existing-projects)
4. [Identity on the Blockchain](#4-identity-on-the-blockchain)
5. [Access Control Models for Distributed Git](#5-access-control-models-for-distributed-git)
6. [Cryptographic Identity Mapping](#6-cryptographic-identity-mapping)
7. [Gossip Protocols for Repo Synchronization](#7-gossip-protocols-for-repo-synchronization)
8. [Relevance to Alknet](#8-relevance-to-alknet)
9. [References](#9-references)
---
## 1. Executive Summary
This document researches distributed identity systems, smart contract-based access control, and decentralized git platforms to inform alknet's architecture. The source concept, a decentralized, censorship-resistant git hosting platform using NFTs (ERC-721) for identity and smart contracts for ACL, directly inspired some of alknet's cryptographic identity and key derivation ideas.
**Key Findings:**
1. **Radicle is the most mature decentralized git system** and provides the closest production reference for alknet's architecture, particularly in Ed25519 identity, gossip-based replication, and self-certifying repositories. However, Radicle lacks the smart contract/on-chain ACL layer that the source concept envisions.
2. **Smart contract ACL is feasible but introduces latency trade-offs.** On-chain identity verification costs 0.5-5 seconds per look-up on L2s, making it unsuitable as a hot path. The correct pattern is on-chain registration + local cache, which aligns with alknet's `StorageIdentityProvider` approach.
3. **alknet's BIP39/SLIP-0010 key derivation already spans both worlds.** The `m/74'/0'/0'/0'` path for Ed25519 identity and `m/44'/60'/0'/0/0` for Ethereum signing means the same seed phrase that governs alknet authentication can also sign on-chain transactions — no separate wallet needed.
4. **The Identity + IdentityProvider model maps directly to decentralized identity.** `ConfigIdentityProvider` is the local-only mode (Radicle-like); `StorageIdentityProvider` is the cached mode (on-chain ACL mirrored to SQLite); a future `OnChainIdentityProvider` could verify against smart contracts.
5. **The distinction between domain events and integration events (from alknet's event sourcing research) is the correct pattern** for synchronizing on-chain state to local nodes. On-chain events are the source of truth; honker streams carry the projected local state.
---
## 2. Source Concept: NFT-Based Decentralized Git
The originating concept for this research is a decentralized, censorship-resistant git hosting platform built on the following principles:
### 2.1 Core Architecture
| Component | Mechanism | Purpose |
|-----------|----------|---------|
| **Org/User Identity** | Transferable ERC-721 tokens | Organizations and users are NFTs; ownership is on-chain and transferable |
| **Repository Identity** | ERC-721 tokens owned by org/user tokens | Repos are NFTs with a `mapping(address => Role)` ACL |
| **Replicators** | User/org nodes listing replicated repos + public endpoints | Decentralized hosting; replicators choose what to mirror |
| **Gossip Protocol** | Push/pull notifications about repo updates | Replicators learn about new commits from tracked repos |
| **Push Authorization** | Identity's on-chain ACL verified by replicator | No central authority can ban; replicators individually verify write privileges |
| **Funding Model** | After-the-fact Patreon-like contributions | Replicators receive donations; no paywall for access |
### 2.2 Key Design Properties
- **No central authority**: No single entity can ban an org, user, or repo
- **Individual replicator choice**: Each replicator independently decides what to replicate and whose pushes to accept
- **Transferable identity**: Selling the org NFT transfers all repos and access permissions
- **Self-certifying data**: Git content addresses + on-chain identity = verifiable data provenance
### 2.3 Critical Gaps in the Source Concept
| Gap | Issue | Solution Pattern |
|-----|-------|-----------------|
| **Hot path latency** | On-chain ACL look-up per push is too slow | Cache ACL locally; sync from chain events |
| **Key loss and rotation** | If the private key controlling the NFT is lost, the identity is lost | Multi-delegate thresholds (like Radicle) + social recovery |
| **Fork/namespace collisions** | Multiple repos with same name under different orgs | Use on-chain IDs (token IDs) not human-readable names as the authoritative identifier |
| **Gas costs** | Every ACL change costs gas | Batch updates; use L2s (Base, Arbitrum); delegate to replicator-level local ACL |
| **Revocation propagation** | Revoking write access must propagate to all replicators | Event-driven: on-chain Revoked event → gossip notification → local ACL update |
---
## 3. Existing Projects
### 3.1 Radicle (radicle.xyz)
**Overview**: Radicle is an open-source, peer-to-peer code collaboration stack built on Git. It is the most mature decentralized git system currently in production (v1.x, Heartwood release).
#### Identity System
| Feature | Implementation |
|---------|---------------|
| **Node ID (NID)** | Ed25519 public key encoded as a DID (`did:key:z6Mk...`) |
| **Key format** | Ed25519 (same curve as alknet) |
| **Storage** | SSH-format key files; `MemorySigner` holds decrypted key in RAM |
| **Multi-device** | Currently one key per device (per RIP-0002); multi-device via threshold delegates is in development |
| **Identity Document** | JSON document stored in Git, listing delegates (DIDs) and a threshold for canonical updates |
**Relevance to alknet**: Radicle's NID system is architecturally very close to alknet's Ed25519-based identity. Both use:
- Ed25519 as the primary key type
- A single seed/identity as the root of trust
- DID-like identifiers for inter-node communication
- Cryptographic signatures for data verification
**Key difference**: Radicle uses pure Ed25519 keypairs directly (no hierarchical derivation), while alknet derives Ed25519 keys from a BIP39 seed phrase via SLIP-0010. This gives alknet the ability to derive multiple keys from a single root and to derive Ethereum signing keys from the same seed.
#### Gossip Protocol
Radicle uses a custom gossip protocol with three message types:
| Message Type | Purpose | Content |
|-------------|---------|---------|
| **Node Announcement** | Peer discovery | Node ID, alias, addresses, capabilities, timestamp |
| **Inventory Announcement** | Repo discovery | List of RepoIDs being seeded, timestamp |
| **Reference Announcement** | Repo update notification | RepoID + updated signed refs, timestamp |
Each announcement includes a cryptographic signature and timestamp, enabling verification before relay. Messages are dropped on re-encounter (epidemic-style deduplication). Bootstrap nodes seed peer discovery.
**Comparison with alknet's call protocol**: Radicle's gossip is metadata-only; actual data transfer uses Git protocol. alknet's approach uses a call protocol (`EventEnvelope`) for both metadata and operation invocation. The gossip pattern could be layered on top of alknet's call protocol as a subscription-based integration event mechanism.
#### Self-Certifying Repositories
Radicle repositories are **self-certifying**:
- The Repository ID (RID) is derived from the initial identity document hash
- All actions (commits, issue comments, patches) are cryptographically signed
- **Delegates** are public keys authorized to update the identity document
- A **threshold** defines how many delegates must sign for an update to be canonical
- Canonical branches are established dynamically based on signature thresholds
This eliminates the need for a central authority to determine "which version is correct."
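The threshold rule above can be sketched in a few lines. This is a minimal illustration, not Radicle's implementation: `IdentityDoc` and `is_canonical` are hypothetical names, delegates are plain strings, and signature verification is assumed to have happened before the check.

```rust
use std::collections::HashSet;

/// Hypothetical identity document: delegate keys plus a signing threshold.
pub struct IdentityDoc {
    pub delegates: HashSet<String>, // delegate DIDs or public keys
    pub threshold: usize,
}

impl IdentityDoc {
    /// Returns true when at least `threshold` distinct delegates appear among
    /// `signers`. The signatures themselves are assumed to be verified already.
    pub fn is_canonical(&self, signers: &[&str]) -> bool {
        let distinct: HashSet<&str> = signers
            .iter()
            .copied()
            .filter(|s| self.delegates.contains(*s))
            .collect();
        distinct.len() >= self.threshold
    }
}
```

Collecting into a set means a delegate who signs twice still counts once, and signers outside the delegate list are ignored.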
**Relevance**: alknet's on-chain ACL concept (from the source) can use this threshold model. Instead of a single NFT owner dictating the canonical branch, a threshold of delegates can be required — this mirrors the `narrowed_scopes` / `DelegatesEdge` model in alknet's ACL graph.
#### Collaborative Objects (COBs)
COBs are Radicle's mechanism for distributed social artifacts (issues, patches, code review):
- Stored as Git objects in `refs/cobs/<type>/<object-id>` namespace
- Use CRDT DAG (Directed Acyclic Graph) for conflict-free merging
- All operations are Ed25519-signed by their author
- SQLite cache (`cobs.db`) provides indexed queries without traversing Git history
**Relevance**: COBs demonstrate that complex social data can be stored in Git with CRDT semantics. alknet's `alknet-storage` metagraph + honker streams could serve a similar role for distributed state, with the key difference being that alknet's state store is SQLite-backed rather than Git-backed, making it more efficient for real-time operations.
#### Summary Assessment
| Dimension | Radicle | alknet (proposed) |
|-----------|---------|-------------------|
| **Identity** | Ed25519 keypair (DID) | Ed25519 from SLIP-0010 + Ethereum key from same seed |
| **Naming** | No global naming; NID is identifier | On-chain NFT ID + human-readable name (via ENS or custom) |
| **Access Control** | Threshold delegates in identity doc | Smart contract ACL + local graph cache |
| **Replication** | Gossip for metadata, Git for data | Call protocol + (future) gossip subscriptions |
| **Data Storage** | Git objects + SQLite cache | SQLite (metagraph/honker) + Git-compatible |
| **Censorship Resistance** | P2P, no authority | P2P + on-chain identity (uncensorable registration) |
| **Funding Model** | Community-funded seed nodes | After-the-fact contributions (replicators) |
### 3.2 ForgeFed (Forgejo Federation)
**Overview**: ForgeFed is an ActivityPub-based federation protocol for software forges. It enables Gitea/Forgejo instances to interoperate — users on one instance can open issues and submit PRs on another without creating separate accounts.
| Feature | Details |
|---------|---------|
| **Protocol** | ActivityPub (same as Mastodon, PeerTube) |
| **Identity** | Web-based (user@example.com format, like email) |
| **ACL** | Per-instance ACL; no on-chain verification |
| **Censorship Resistance** | Limited; instances can block each other |
| **Status** | Forgejo implementing; Vervis is reference implementation |
**Relevance to alknet**: ForgeFed shows how federation works without blockchain. It uses ActivityPub for cross-instance communication, which is analogous to alknet's call protocol for cross-node communication. However, ForgeFed relies on instance-level trust (each Forgejo admin controls their instance), while alknet's concept uses on-chain identity for trust.
**Key takeaway**: ForgeFed's federation model is complementary, not competitive, with blockchain identity. An alknet node could expose a ForgeFed-compatible interface for interop with existing forges while using on-chain identity for internal trust decisions.
### 3.3 Git-Based Smart Contract Projects
| Project | Chain | Approach | Status |
|---------|-------|----------|--------|
| **GitBross** | Solana/Arbitrum + IPFS | Repos backed up to IPFS; smart contracts for metadata | Active |
| **GitLike** | Ethereum + IPFS | Browser-based decentralized VCS | Experimental |
| **Statik** | IPFS | Version control on IPFS with content-addressed storage | Experimental |
| **PineSU** | Ethereum | Git repos + blockchain for integrity/timestamping | Research paper |
**Common patterns**:
- IPFS for content-addressed storage of git objects
- Smart contracts for metadata (ownership, ACL, provenance)
- Ethereum or L2 for on-chain verification
- Git bridge tools that push to both IPFS and traditional remotes
**Key insight**: None of these projects have achieved widespread adoption. The main challenges are:
1. **Performance**: IPFS retrieval is slower than centralized git hosting
2. **UX**: Browser-based git clients lack feature parity with CLI tools
3. **Incentives**: No sustainable funding model for replicators
alknet's approach of using traditional git remotes with a smart contract ACL overlay avoids the IPFS performance trap while still providing censorship resistance.
### 3.4 NFT-Based Access Control Systems
Several projects use NFTs (ERC-721) for access gating:
| Pattern | Mechanism | Example |
|---------|-----------|---------|
| **Token-gated content** | Wallet verification proves NFT ownership before granting access | NFT-gated websites, Discord roles |
| **Role-based ACL via NFT** | NFTs represent roles; smart contract checks `balanceOf(address) > 0` | Token-gated DAOs, access-controlled channels |
| **Namespace NFTs** | Each NFT represents a namespace/org; sub-rights derive from ownership | ENS domains, NFT-based guild systems |
**Solidity Pattern for Repository ACL**:
```solidity
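// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import "@openzeppelin/contracts/token/ERC721/ERC721.sol";

// Simplified example: NFT-based org/repo with on-chain ACL
contract OrgToken is ERC721 {
    // Declared weakest-first so that `>=` means "at least this level"
    enum Role { None, Member, Admin }
    enum Permission { None, Read, Write }

    struct Org {
        address owner;
        mapping(address => Role) members; // ACL mapping
    }

    struct Repo {
        uint256 orgTokenId; // Owning org
        mapping(address => Permission) collaborators;
    }

    mapping(uint256 => Org) private orgs;   // org token ID => org
    mapping(uint256 => Repo) private repos; // repo ID => repo

    constructor() ERC721("OrgToken", "ORG") {}

    function canPush(uint256 repoId, address user) external view returns (bool) {
        Repo storage repo = repos[repoId];
        // Check direct permission
        if (repo.collaborators[user] >= Permission.Write) return true;
        // Check org membership
        Org storage org = orgs[repo.orgTokenId];
        if (org.members[user] >= Role.Member) return true;
        return false;
    }
}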
```
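**Performance considerations**: A `canPush()` check on L2 (Base, Arbitrum) costs ~0.001-0.01 USD and takes 0.5-2 seconds. Because `canPush()` is a `view` function, the gas cost applies only when the check runs inside a transaction; an off-chain read through `eth_call` consumes no gas but still pays the RPC round trip. This is acceptable for occasional operations (repo creation, ACL changes) but not for per-push verification. Caching is essential.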
**Relevance to alknet**: The mapping from on-chain ACL to alknet's local ACL graph is direct:
- ERC-721 token ID → `PrincipalNode` in alknet's ACL metagraph
- `collaborators` mapping → `DelegatesEdge` with `narrowed_scopes`
- `canPush()` → alknet's `check_access()` function
---
## 4. Identity on the Blockchain
### 4.1 ERC-721 as Identity/Namespace Tokens
**How it works**: Each unique identity (org, user, namespace) is an ERC-721 NFT. The token ID is the on-chain identifier; metadata (display name, avatar, public key) is stored off-chain (IPFS or DNS).
**Advantages**:
- Inherent transferability (sell/gift an org identity)
- On-chain ownership verification
- Metadata can include cryptographic public keys for off-chain verification
- Composable with other on-chain protocols (DAO governance, treasury)
**Disadvantages**:
- Gas costs for every state change
- Key rotation requires a transaction (can't just change a local file)
- Metadata availability depends on off-chain storage
- Privacy: all ACL changes are public on-chain
**Resolution pattern**: Use on-chain registration as the root of trust, but resolve identity locally via cached data. This is exactly how DNS works — the zone file is authoritative, but resolvers cache it.
### 4.2 ENS (Ethereum Name Service) as a Naming Layer
**Overview**: ENS maps human-readable names (e.g., `alice.eth`) to machine-readable identifiers (Ethereum addresses, content hashes, text records).
| Feature | Implementation |
|---------|---------------|
| **Name resolution** | `alice.eth` → Ethereum address (NFT owner) |
| **Text records** | Store arbitrary key-value data (avatar, email, public key, SSH key) |
| **Subdomains** | `git.alice.eth` can point to a replicator endpoint |
| **Resolver** | Smart contract that returns records for a name |
| **Off-chain look-up** | CCIP-read (EIP-3668) allows resolving names via external data |
**Relevance to alknet**: ENS text records can store alknet node identifiers:
- `alk.id` text record → alknet Node ID (Ed25519 public key fingerprint)
- `alk.pubkey` text record → Ed25519 public key (for SSH authentication)
- `alk.replicator` text record → endpoint URL (for repo discovery)
This creates a human-friendly naming overlay on top of alknet's cryptographic identifiers. Combined with DNS TXT records (alknet's planned DNS naming layer), it provides multiple resolution paths.
**Limitation**: ENS resolution requires an Ethereum RPC call, which adds latency. For production use, ENS data should be cached locally and refreshed periodically, similar to DNS TTLs.
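A TTL-style cache for resolved records might look like the following sketch. `RecordCache` and its key format are illustrative assumptions, not part of alknet or any ENS library; the actual RPC resolution is left to the caller.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical TTL cache for resolved ENS text records (key -> value).
pub struct RecordCache {
    ttl: Duration,
    entries: HashMap<String, (String, Instant)>,
}

impl RecordCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Store a freshly resolved record, stamped with the time it was fetched.
    pub fn insert(&mut self, key: &str, value: &str, fetched_at: Instant) {
        self.entries.insert(key.to_string(), (value.to_string(), fetched_at));
    }

    /// Return the cached value only while it is younger than the TTL;
    /// `None` tells the caller to resolve again over RPC.
    pub fn get(&self, key: &str, now: Instant) -> Option<&str> {
        let (value, fetched_at) = self.entries.get(key)?;
        if now.duration_since(*fetched_at) < self.ttl {
            Some(value.as_str())
        } else {
            None
        }
    }
}
```

Passing `now` explicitly keeps the cache deterministic under test; a production version would likely also bound the number of entries.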
### 4.3 Smart Contracts as ACL/Naming Services
**Pattern**: A smart contract stores the ACL mapping and provides a view function for verification. This is the "source of truth" that local caches sync from.
```
On-chain ACL contract (source of truth)
        │
        │ events: RoleGranted, RoleRevoked, RepoCreated, etc.
        ▼
alknet-storage (local cache)
  ├── ACL metagraph (PrincipalNode + DelegatesEdge)
  ├── Synced from on-chain events
  └── Used for hot-path access checks
```
**Event-driven sync pattern** (critical for alknet):
1. Smart contract emits `RoleGranted(address, repoId, role)` event
2. alknet head node listens to these events (via Ethereum log subscription)
3. Event is projected into the ACL metagraph as a `DelegatesEdge` with `narrowed_scopes`
4. Local access checks use the metagraph (fast, SQLite)
5. Periodic consistency check ensures local cache matches on-chain state
This maps directly to alknet's **event boundary discipline**:
- On-chain events = external source of truth (like domain events from another service)
- ACL metagraph = local projection (like an integration event or read model)
- Honker stream `acl:updated` = notification that the local cache changed (integration event)
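The projection step of the sync pattern can be sketched as follows. `ChainEvent` and `AclCache` are hypothetical stand-ins: a flat map replaces the ACL metagraph, and event decoding from Ethereum logs is assumed to happen elsewhere.

```rust
use std::collections::HashMap;

/// Hypothetical mirror of the contract's events (names follow the text above).
pub enum ChainEvent {
    RoleGranted { address: String, repo_id: u64, role: String },
    RoleRevoked { address: String, repo_id: u64 },
}

/// Stand-in for the ACL metagraph: (address, repo) -> granted role.
#[derive(Default)]
pub struct AclCache {
    edges: HashMap<(String, u64), String>,
}

impl AclCache {
    /// Apply one event in log order. Returns true when the cache changed,
    /// which is the point where an `acl:updated` notification would be sent.
    pub fn apply(&mut self, event: ChainEvent) -> bool {
        match event {
            ChainEvent::RoleGranted { address, repo_id, role } => {
                self.edges.insert((address, repo_id), role.clone()) != Some(role)
            }
            ChainEvent::RoleRevoked { address, repo_id } => {
                self.edges.remove(&(address, repo_id)).is_some()
            }
        }
    }

    /// Hot-path lookup against the local projection only.
    pub fn role_of(&self, address: &str, repo_id: u64) -> Option<&str> {
        self.edges.get(&(address.to_string(), repo_id)).map(String::as_str)
    }
}
```

Because re-applying an event leaves the cache unchanged, replaying an overlapping log range after a reconnect is harmless as long as events are applied in order.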
### 4.4 Decentralized Identity Standards
#### W3C DIDs (Decentralized Identifiers)
**Overview**: DIDs are a W3C standard for verifiable, self-sovereign digital identifiers. A DID is a URI that resolves to a DID Document describing how to interact with the identity holder.
| DID Method | Resolution | Key Type | Use Case |
|-----------|-----------|----------|----------|
| `did:key` | Static (no registry) | Ed25519, secp256k1, etc. | Radicle uses this; self-certifying |
| `did:ethr` | Ethereum registry | secp256k1 | Blockchain-verifiable identity |
| `did:web` | DNS/web server | Any | Traditional web PKI bridge |
| `did:ion` | Bitcoin Sidetree | secp256k1 | Microsoft's DID system |
**Relevance**: Radicle uses `did:key` with Ed25519 keys. alknet could use `did:key` for local identity (same key type!) and extend to `did:ethr` for on-chain identity, using the same seed phrase to derive both keys.
#### Verifiable Credentials (VCs)
**Overview**: VCs are tamper-evident, cryptographically secure attestations issued by a trusted authority. Think of them as digital certificates (driver's license, degree) that the holder presents to a verifier.
**Application to git access**: A VC could attest that "this Ed25519 public key has write access to repo X." The issuer is the org's NFT contract (or a delegate). VCs can be verified off-chain, reducing on-chain transaction costs.
**alknet mapping**: VCs are analogous to alknet's `Identity` struct with `scopes` and `resources`. A VC issuance maps to the creation of a `DelegatesEdge` in the ACL graph. The key difference is that VCs are bearer tokens (anyone who holds one can present it), while alknet's ACL is graph-based (the principal must be connected to the resource via edges).
---
## 5. Access Control Models for Distributed Git
### 5.1 Git's Own ACL Model
Git has limited built-in ACL. Access control is typically enforced at the transport layer:
| Mechanism | Layer | Scope |
|-----------|-------|-------|
| **`pre-receive` hook** | Server-side | Reject pushes based on branch, author, file patterns |
| **`update` hook** | Server-side | Per-ref checks (branch-level protection) |
| **`post-receive` hook** | Server-side | Post-push actions (notifications, CI triggers) |
| **SSH key mapping** | Transport | `authorized_keys` → system user → filesystem permissions |
| **HTTP basic auth** | Transport | Username/password → Git smart HTTP |
| **Gitolite** | Server-side | Config-file-based ACL mapping SSH keys to repos and permissions |
**Gitolite pattern** (most relevant for distributed git):
- `~/.ssh/authorized_keys` maps SSH keys to Gitolite users
- `~/.gitolite/conf/gitolite.conf` defines repos and permissions
- Permission levels: `R` (read), `RW` (read+write), `RW+` (read+write+force-push)
- Wildcard repos: `CREATOR/..*` — users can create repos matching patterns
**alknet mapping**: Gitolite's config file is the analog of alknet's ACL metagraph. The key difference is that Gitolite is centralized (one config file), while alknet's ACL can be distributed (synced from on-chain events).
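As a small illustration of the three permission levels listed above, they can be modeled as an ordered enum. This is a sketch covering only `R`, `RW`, and `RW+`; the `Perm` type is hypothetical and is not taken from Gitolite or alknet.

```rust
/// Permission levels from the list above, ordered from weakest to strongest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Perm {
    Read,      // "R"
    Write,     // "RW"
    ForcePush, // "RW+"
}

impl Perm {
    /// Parse the config-file spelling of a permission level.
    pub fn parse(s: &str) -> Option<Perm> {
        match s {
            "R" => Some(Perm::Read),
            "RW" => Some(Perm::Write),
            "RW+" => Some(Perm::ForcePush),
            _ => None,
        }
    }

    /// A granted level satisfies any requirement at or below it.
    pub fn allows(self, required: Perm) -> bool {
        self >= required
    }
}
```

The derived ordering follows declaration order, so `RW+` implies `RW`, which implies `R`.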
### 5.2 Decentralized Write Permission Without Central Authority
In a truly decentralized system, no single node controls access. Several patterns exist:
#### Pattern 1: Self-Certifying Repositories (Radicle)
- The repo creator defines an identity document listing delegates
- Delegates are Ed25519 public keys with a threshold
- Only delegate signatures on refs are considered canonical
- Replicators accept any push but only replicate refs signed by sufficient delegates
**Trade-off**: Simple, no on-chain costs, but no mechanism for human-readable names or transferable ownership.
#### Pattern 2: On-Chain ACL (Source Concept)
- Smart contract stores `mapping(address => Role)` for each repo
- Replicators verify pusher's address against the contract before accepting
- Ownership is transferable (the NFT can be sold)
- Gas costs for setup and ACL changes
**Trade-off**: Transferable ownership and verifiable ACL, but requires Ethereum interaction and introduces latency.
#### Pattern 3: Hybrid — On-Chain Root + Local Cache
- On-chain contract defines who owns each org/repo NFT
- Local ACL graph caches on-chain state and adds local rules
- Hot-path checks use local cache (SQLite, fast)
- Cold-path operations (ACL changes, ownership transfers) go on-chain
- Local cache is periodically verified against on-chain state
**This is the recommended pattern for alknet.** It combines:
- On-chain censorship resistance (no single authority can revoke identity)
- Local performance (ACL checks are SQLite-fast)
- Transferable ownership (NFT can be sold/transferred on-chain)
- Graceful degradation (local ACL still works when chain is unavailable)
### 5.3 Radicle's Approach to Identity and Verification
Radicle's identity model has specific properties worth detailed comparison:
| Property | Radicle | alknet (proposed) |
|----------|---------|-------------------|
| **Identity root** | Ed25519 keypair (generated locally) | BIP39 seed phrase → SLIP-0010 derivation |
| **Identity document** | JSON in Git, signed by delegates | On-chain NFT + local ACL metagraph |
| **Delegate model** | Threshold of N public keys | Threshold of N delegates (on-chain or local) |
| **Key rotation** | Add/remove delegates via identity doc update | Transfer NFT to new address; update local keys |
| **Multi-device** | One key per device (RIP-0002) | One key per device derived from same seed (`m/74'/0'/0'/{n}'`) |
| **Namespace collision** | RID is content-hash, collision-free | NFT token ID is unique; human names via ENS |
| **Revocation** | Remove delegate from identity doc | On-chain ACL change + local cache update |
| **Verification** | Signature verification against delegate list | Signature verification + on-chain ACL check |
**alknet advantage**: Deriving multiple keys from one seed means:
- Multi-device support is built-in (derive a key per device)
- No "one key per identity" limitation
- The same seed provides identity keys, encryption keys, SSH keys, and Ethereum signing keys
- Key rotation for a single device: derive a new key at the next index and update the local key material
**alknet challenge**: If the seed phrase is lost, all derived keys are lost. Mitigation strategies:
- Social recovery (N-of-M threshold: trusted contacts hold shards)
- Hardware security module (HSM) protection for the seed
- Multi-sig on key operations (require threshold of devices to authorize)
---
## 6. Cryptographic Identity Mapping
### 6.1 Ed25519 Keys (alknet's Key Type)
alknet uses Ed25519 as the primary key type for:
- SSH authentication (fingerprint-based verification)
- Node identity (Node IDs are Ed25519 public keys)
- Channel signing (call protocol event signatures)
**Relevant properties of Ed25519**:
- 32-byte public key; the private key is a 32-byte seed, commonly stored as a 64-byte value (seed followed by the public key)
- Deterministic signatures (same message, same key → same signature)
- Fast verification (~3x faster than secp256k1)
- Used in SSH (since OpenSSH 6.5), Tor onion services, Signal
**SLIP-0010 derivation** (what alknet uses):
- SLIP-0010 generalizes BIP-32 to non-secp256k1 curves
- Ed25519 derivation uses **hardened keys only** (cannot derive child public keys from parent public key)
- This means: the master seed must be available to derive any child key
- alknet's secret service holds the seed in RAM and derives keys on demand
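The hardened-only constraint is easy to check mechanically. The sketch below parses a derivation path into raw child indices and tests whether every component is hardened; the function names are illustrative, and no key material is derived here.

```rust
/// Hardened indices have the top bit set (index + 2^31).
const HARDENED: u32 = 0x8000_0000;

/// Parse a path such as `m/74'/0'/0'/0'` into raw child indices.
/// A trailing `'` marks a hardened component.
pub fn parse_path(path: &str) -> Option<Vec<u32>> {
    let mut parts = path.split('/');
    if parts.next()? != "m" {
        return None;
    }
    parts
        .map(|p| match p.strip_suffix('\'') {
            Some(n) => n.parse::<u32>().ok().filter(|i| *i < HARDENED).map(|i| i | HARDENED),
            None => p.parse::<u32>().ok().filter(|i| *i < HARDENED),
        })
        .collect()
}

/// SLIP-0010 Ed25519 derivation accepts hardened components only.
pub fn valid_for_ed25519(path: &str) -> bool {
    parse_path(path).map_or(false, |idx| idx.iter().all(|i| (*i & HARDENED) != 0))
}
```

For example, `m/74'/0'/0'/0'` passes the check, while the Ethereum path `m/44'/60'/0'/0/0` does not, because its last two components are non-hardened.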
### 6.2 Blockchain Private Keys vs SSH Keys
The key question for mapping blockchain identity to git access is: **how does an Ed25519 SSH key relate to a secp256k1 Ethereum key?**
| Key Type | Curve / Algorithm | Use Case | alknet Derivation Path |
|----------|-------|----------|----------------------|
| Identity key | Ed25519 | SSH auth, node identity | `m/74'/0'/0'/0'` |
| Device key | Ed25519 | Per-device identity | `m/74'/0'/0'/{n}'` |
| SSH host key | Ed25519 | Server identity | `m/74'/0'/1'/0'` |
| Encryption key | AES-256-GCM | External credential encryption | `m/74'/2'/0'/0'` |
| Ethereum key | secp256k1 | Smart contract signing | `m/44'/60'/0'/0/0` |
**The bridge**: Both keys derive from the **same BIP39 seed phrase**. The secret service can sign an Ethereum transaction using the secp256k1 key and also authenticate SSH using the Ed25519 key. This creates a cryptographically linked identity pair:
- On-chain identity (Ethereum address derived from `m/44'/60'/0'/0/0`)
- Off-chain identity (Ed25519 key derived from `m/74'/0'/0'/0'`)
**Binding them**: To prove that the Ed25519 key and the Ethereum key belong to the same entity:
1. Sign a message with the Ed25519 key: `"I, <Ed25519-pubkey>, attest that my on-chain identity is <Ethereum-address>"`
2. Store this attestation on-chain (in the org/user NFT metadata)
3. Anyone can verify: the on-chain address owns the NFT, and the attestation links the SSH key to that address
This is the **key binding mechanism** that connects alknet's SSH-based authentication to on-chain identity.
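A minimal sketch of the binding check, assuming hex-encoded keys and a pluggable signature verifier. `attestation_message` and `binding_holds` are hypothetical helpers, and `verify_sig` stands in for a real Ed25519 verification routine that is not shown.

```rust
/// Build the attestation text from step 1. The encodings of the public key
/// and the address are assumptions made for illustration.
pub fn attestation_message(ed25519_pubkey_hex: &str, eth_address: &str) -> String {
    format!(
        "I, {}, attest that my on-chain identity is {}",
        ed25519_pubkey_hex, eth_address
    )
}

/// Step 3: the attestation must name the address that owns the NFT and carry
/// a valid signature by the Ed25519 key. `verify_sig(pubkey, message, sig)`
/// is a placeholder for a real Ed25519 verifier.
pub fn binding_holds<F>(pubkey_hex: &str, nft_owner: &str, signature: &[u8], verify_sig: F) -> bool
where
    F: Fn(&str, &[u8], &[u8]) -> bool,
{
    // Rebuild the message from the on-chain owner so that a signature over a
    // different address cannot be replayed.
    let msg = attestation_message(pubkey_hex, nft_owner);
    verify_sig(pubkey_hex, msg.as_bytes(), signature)
}
```

Reconstructing the message from the on-chain owner, rather than trusting a message supplied by the client, is what ties the SSH key to that specific address.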
### 6.3 Deriving Repository Access from On-Chain Identity
The complete flow for a push operation in a decentralized git system with on-chain ACL:
```
1. Client connects to replicator via SSH
2. SSH auth succeeds (Ed25519 key verified by alknet IdentityProvider)
3. Client pushes to repo X
4. Replicator checks:
   a. Local ACL metagraph: does this Ed25519 key have write access to repo X?
   b. If local ACL is stale, re-verify against on-chain contract
5. If authorized: accept push, gossip update to other replicators
6. If not: reject with "access denied"
```
**Optimization**: Step 4b is rarely needed if the local ACL cache is kept fresh via event subscriptions. The on-chain contract emits events on ACL changes, and the head node's sync process projects these into the local ACL metagraph.
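Steps 4a and 4b can be sketched as a single decision function. The types and the staleness rule here are assumptions for illustration; `on_chain_check` stands in for a contract call and returns `None` when the chain is unreachable.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
pub enum Decision {
    Accept,
    Reject,
}

/// Hypothetical cached ACL entry for one (key, repo) pair.
pub struct CachedAcl {
    pub can_write: bool,
    pub synced_at: Instant,
}

/// Trust the cache while it is fresh; otherwise ask the chain. If the chain
/// cannot be reached, fall back to the cached answer.
pub fn authorize_push<F>(
    cached: &CachedAcl,
    now: Instant,
    max_age: Duration,
    on_chain_check: F,
) -> Decision
where
    F: FnOnce() -> Option<bool>,
{
    let stale = now.duration_since(cached.synced_at) > max_age;
    let allowed = if stale {
        on_chain_check().unwrap_or(cached.can_write)
    } else {
        cached.can_write
    };
    if allowed { Decision::Accept } else { Decision::Reject }
}
```

Falling back to the cache when the chain is unavailable favors availability; a stricter deployment could reject instead, at the cost of blocking pushes during an RPC outage.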
**alknet's existing support for this flow**:
| Component | Role |
|-----------|------|
| `IdentityProvider` trait | Resolves Ed25519 fingerprint → `Identity` with scopes/resources |
| `ConfigIdentityProvider` | Local-only: reads from `authorized_keys` config |
| `StorageIdentityProvider` | SQLite-backed: queries `peer_credentials` + ACL metagraph |
| `OnChainIdentityProvider` (future) | Verifies against on-chain ACL, falls back to local cache |
| `AuthProtocol` (irpc) | `VerifyPubkey` → `Identity` resolution |
| `CheckAccess` (irpc) | `Identity` + operation → access verification using ACL graph |
| `OperationSpec.access_control` | Declarative access requirements per operation |
---
## 7. Gossip Protocols for Repo Synchronization
### 7.1 Epidemic/Gossip Protocol Fundamentals
Gossip protocols are decentralized dissemination mechanisms inspired by how rumors spread in social networks. Key properties:
- **Eventual consistency**: All nodes eventually receive all updates
- **Fault tolerance**: Works even when nodes join/leave randomly
- **Scalability**: An update reaches all N nodes in O(log N) gossip rounds
- **No single point of failure**: No coordinator node
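The logarithmic bound can be made concrete with an idealized model in which every informed node reaches `fanout` previously uninformed nodes per round. This best-case sketch ignores duplicate deliveries and message loss, so real networks need somewhat more rounds.

```rust
/// Rounds needed for an idealized push gossip to reach `n` nodes when every
/// informed node informs `fanout` new nodes per round (best case).
pub fn rounds_to_reach(n: u64, fanout: u64) -> u32 {
    assert!(fanout > 0, "fanout must be at least 1");
    let mut informed: u64 = 1; // the originating node
    let mut rounds = 0;
    while informed < n {
        // Each informed node recruits `fanout` others, so the informed set
        // grows by a factor of (1 + fanout) per round.
        informed = informed.saturating_mul(fanout.saturating_add(1));
        rounds += 1;
    }
    rounds
}
```

With a fanout of 3, the informed set quadruples each round, so 10,000 nodes are covered in 7 rounds and 1,000,000 nodes in 10.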
### 7.2 Radicle's Gossip Protocol
Radicle uses three message types (detailed in Section 3.1):
- **Node Announcements**: Peer discovery (who's online, where to reach them)
- **Inventory Announcements**: Repo discovery (what repos each node seeds)
- **Reference Announcements**: Update notifications (new commits, new COB operations)
**Anti-entropy mechanism**: Nodes periodically exchange state summaries to ensure they haven't missed any updates. This is similar to Merkle tree-based reconciliation in distributed databases.
**Relevance to alknet**: alknet's call protocol subscription model (`call.requested` with `OperationType::Subscription`) can serve as the transport for gossip messages. The key difference is that alknet's call protocol is request-response oriented, while gossip is push-based. A gossip layer on top of the call protocol would work as follows:
```
alknet gossip layer:
1. Subscribe to `/{node}/gossip/announce` on known peers
2. Receive NodeAnnouncement, InventoryAnnouncement, RefAnnouncement events
3. Forward announcements to other connected peers (with deduplication)
4. For RefAnnouncements of tracked repos, trigger git fetch
```
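Steps 2 to 4 of the layer above can be sketched as a pure state machine. All types here are hypothetical; signature verification and the call-protocol transport are assumed to happen outside this function.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Announcement {
    Node { node_id: String, timestamp: u64 },
    Inventory { node_id: String, timestamp: u64 },
    Ref { node_id: String, repo_id: String, timestamp: u64 },
}

#[derive(Debug, PartialEq)]
pub enum Action {
    Forward,                 // relay to other connected peers
    FetchAndForward(String), // relay and trigger a git fetch for this repo
    Drop,                    // already seen; do not relay again
}

#[derive(Default)]
pub struct GossipState {
    seen: HashSet<Announcement>,
    pub tracked: HashSet<String>, // repo IDs this node replicates
}

impl GossipState {
    /// Decide what to do with an incoming, already-verified announcement.
    pub fn handle(&mut self, ann: Announcement) -> Action {
        if !self.seen.insert(ann.clone()) {
            return Action::Drop;
        }
        match ann {
            Announcement::Ref { repo_id, .. } if self.tracked.contains(&repo_id) => {
                Action::FetchAndForward(repo_id)
            }
            _ => Action::Forward,
        }
    }
}
```

The `seen` set grows without bound in this sketch; a real node would expire entries, for example by timestamp.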
### 7.3 Alternative: CRDT-Based Sync
Instead of gossip + git fetch, some systems use CRDTs for repository synchronization:
- **Advantages**: No merge conflicts, automatic convergence
- **Disadvantages**: Large metadata overhead, complex implementation, doesn't map directly to git's object model
**Recommendation for alknet**: Start with gossip + git fetch (as Radicle does) and consider CRDT-based sync for specific metadata (e.g., ACL state, org metadata) while keeping git data as-is. The ACL metagraph changes can propagate via honker streams (which play a role loosely similar to a CRDT merge).
---
## 8. Relevance to Alknet
### 8.1 Identity + IdentityProvider Model
alknet's existing `Identity` struct and `IdentityProvider` trait are already designed for this use case:
```rust
use std::collections::HashMap;

pub struct Identity {
    pub id: String,                                      // Fingerprint or UUID
    pub scopes: Vec<String>,                             // Permission scopes
    pub resources: Option<HashMap<String, Vec<String>>>, // Resource-level access
}
```
The `id` field serves a dual purpose:
- **Config-based auth**: SSH fingerprint (e.g., `SHA256:abc123...`)
- **Storage-based auth**: Account UUID (e.g., `acc_0123456789`)
**Extended for on-chain identity**, the `id` field could also be:
- **On-chain auth**: Ethereum address (e.g., `0x1234...`) or NFT token ID (e.g., `token_42`)
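If the `id` field carries several kinds of identifier, callers need a way to tell them apart. The sketch below classifies by prefix, using only the example values shown above; the prefix convention is inferred from those examples and is an assumption, not a defined alknet format.

```rust
#[derive(Debug, PartialEq)]
pub enum IdKind {
    SshFingerprint, // e.g. "SHA256:abc123..."
    AccountUuid,    // e.g. "acc_0123456789"
    EthAddress,     // e.g. "0x1234..."
    NftToken,       // e.g. "token_42"
    Unknown,
}

/// Classify an `Identity::id` string by its prefix (hypothetical convention).
pub fn classify_id(id: &str) -> IdKind {
    if id.starts_with("SHA256:") {
        IdKind::SshFingerprint
    } else if id.starts_with("acc_") {
        IdKind::AccountUuid
    } else if id.starts_with("0x") {
        IdKind::EthAddress
    } else if id.starts_with("token_") {
        IdKind::NftToken
    } else {
        IdKind::Unknown
    }
}
```

A typed enum in place of a bare `String` would make this distinction explicit, at the cost of changing the existing struct.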
The `IdentityProvider` trait naturally extends:
```rust
trait IdentityProvider: Send + Sync {
fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
fn resolve_from_token(&self, token: &[u8]) -> Option<Identity>;
}
// Future extension:
// OnChainIdentityProvider resolves Ethereum address + Ed25519 binding
// from on-chain ACL contract, with local metagraph cache
```
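One way the extension could look is sketched below, assuming the on-chain binding has already been synced into a local map. `CachedOnChainProvider`, `ChainedProvider`, and the `git:read` scope are hypothetical; the `Identity` struct and trait are repeated only so the example compiles on its own.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: Option<HashMap<String, Vec<String>>>,
}

trait IdentityProvider: Send + Sync {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &[u8]) -> Option<Identity>;
}

/// Hypothetical stand-in for an on-chain provider: a local cache that maps
/// an Ed25519 fingerprint to the Ethereum address bound to it.
struct CachedOnChainProvider {
    bindings: HashMap<String, String>,
}

impl IdentityProvider for CachedOnChainProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.bindings.get(fingerprint).map(|address| Identity {
            id: address.clone(),
            scopes: vec!["git:read".to_string()], // assumed scope name
            resources: None,
        })
    }

    fn resolve_from_token(&self, _token: &[u8]) -> Option<Identity> {
        None // this sketch only resolves SSH fingerprints
    }
}

/// Tries each backend in order and returns the first match.
struct ChainedProvider {
    backends: Vec<Box<dyn IdentityProvider>>,
}

impl IdentityProvider for ChainedProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.backends.iter().find_map(|b| b.resolve_from_fingerprint(fingerprint))
    }

    fn resolve_from_token(&self, token: &[u8]) -> Option<Identity> {
        self.backends.iter().find_map(|b| b.resolve_from_token(token))
    }
}
```

Chaining keeps config-based and storage-based providers untouched: an on-chain backend is appended to the list rather than merged into an existing implementation.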
### 8.2 OperationRegistry Extension with On-Chain Verification
alknet's `OperationSpec` includes `access_control` fields:
```rust
pub struct AccessControl {
pub required_scopes: Vec<String>,
pub required_scopes_any: Option<Vec<String>>,
pub resource_type: Option<String>,
pub resource_action: Option<String>,
}
```
For on-chain verification, a new `access_control` mode could be added:
```rust
pub enum AccessControlMode {
Local, // Check against local ACL metagraph (current)
OnChain, // Verify against on-chain contract (future)
CachedOnChain, // Check local cache first, verify on-chain on miss/stale (recommended)
}
```
The `AccessControl` struct gains a `mode` field defaulting to `Local`. This is additive and doesn't change existing behavior.
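The three modes could dispatch roughly as follows. This is a sketch, not alknet code: `ChainAcl`, `AccessChecker`, and `CountingChain` are hypothetical, the local view is a plain map instead of the ACL metagraph, and time is passed in as a number of seconds so that staleness is easy to reason about.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AccessControlMode {
    Local,
    OnChain,
    CachedOnChain,
}

/// Hypothetical abstraction over the on-chain ACL contract.
trait ChainAcl {
    fn has_access(&mut self, identity: &str, resource: &str) -> bool;
}

/// Stub chain that grants everything and counts how often it is queried.
struct CountingChain {
    calls: u32,
}

impl ChainAcl for CountingChain {
    fn has_access(&mut self, _identity: &str, _resource: &str) -> bool {
        self.calls += 1;
        true
    }
}

struct CacheEntry {
    allowed: bool,
    checked_at: u64,
}

struct AccessChecker<C: ChainAcl> {
    chain: C,
    /// Local view keyed by (identity, resource).
    local: HashMap<(String, String), CacheEntry>,
    /// Entries older than this many seconds count as stale.
    max_age: u64,
}

impl<C: ChainAcl> AccessChecker<C> {
    fn check(&mut self, mode: AccessControlMode, identity: &str, resource: &str, now: u64) -> bool {
        let key = (identity.to_string(), resource.to_string());
        match mode {
            // Local: trust the local view only; unknown pairs are denied.
            AccessControlMode::Local => self.local.get(&key).map_or(false, |e| e.allowed),
            // OnChain: always ask the contract.
            AccessControlMode::OnChain => self.chain.has_access(identity, resource),
            // CachedOnChain: use a fresh local entry, otherwise verify and store.
            AccessControlMode::CachedOnChain => {
                if let Some(entry) = self.local.get(&key) {
                    if now.saturating_sub(entry.checked_at) <= self.max_age {
                        return entry.allowed;
                    }
                }
                let allowed = self.chain.has_access(identity, resource);
                self.local.insert(key, CacheEntry { allowed, checked_at: now });
                allowed
            }
        }
    }
}
```

With `max_age` set to 60, a check at time 100 queries the chain, a check at 130 is served from the local entry, and a check at 200 queries the chain again because the entry has gone stale.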
### 8.3 Git Service Adapter for Decentralized Replication
alknet's application service pattern (from services.md) can accommodate a `GitService`:
```rust
#[rpc_requests(message = GitMessage)]
enum GitProtocol {
#[rpc(tx=oneshot::Sender<RepoInfo>)]
#[wrap(GetRepo)]
GetRepo { repo_id: String },
#[rpc(tx=oneshot::Sender<Vec<RepoInfo>>)]
#[wrap(ListRepos)]
ListRepos { org: Option<String> },
#[rpc(tx=oneshot::Sender<bool>)]
#[wrap(CanPush)]
CanPush { repo_id: String, identity: Identity },
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(UpdateMirror)]
UpdateMirror { repo_id: String, refs: Vec<RefUpdate> },
#[rpc(tx=mpsc::Sender<RefAnnouncement>)]
#[wrap(SubscribeRefs)]
SubscribeRefs { repo_ids: Vec<String> },
}
```
This service:
- **Registers with the call protocol** as `/head/git/*`
- **Uses `StorageIdentityProvider`** for `CanPush` checks (with ACL metagraph)
- **Manages git mirrors** (git bare repos on the local filesystem)
- **Propagates updates** via `SubscribeRefs` (which maps to honker stream subscriptions → call protocol integration events)
### 8.4 CredentialProvider Role
The existing `CredentialProvider` pattern in alknet (used for outbound authentication TO external services) maps to:
| Use Case | CredentialProvider Implementation |
|----------|----------------------------------|
| Push to GitHub/GitLab | SSH key from alknet identity, or OAuth token from external source |
| Push to on-chain repo | Ed25519 key derived from seed (signs the push) + Ethereum key (signs on-chain attestation) |
| Authenticate to replicator | Ed25519 key (SSH auth via `IdentityProvider`) |
| Decrypt stored credentials | AES-256-GCM key derived from seed via `SecretProtocol` |
### 8.5 Domain Events vs. Integration Events (Distributed Git Context)
alknet's event boundary discipline (from event sourcing research and ADR-032) is critical for the distributed git scenario:
| Event Type | Source | Consumer | Boundary | Git Analog |
|-----------|--------|----------|----------|------------|
| **Domain events** (honker) | Local service | Same service | Internal | Git object creation/update in local repo |
| **Integration events** (call protocol) | Projected from domain events | Other nodes/services | Cross-node | Push notification, gossip announcement |
| **On-chain events** (smart contract) | Ethereum log | Head node sync process | External source | ACL change on blockchain |
| **Notifications** (honker) | Service | Any subscriber | Cross-service | "Repo X was updated" (thin, ID-only) |
**The flow for a decentralized git push**:
```
1. Client pushes to replicator
2. Replicator's GitService receives push
3. GitService publishes domain event: "repo:refs-updated" (honker stream)
4. Integration event projected: "call.responded" with repo update (call protocol)
5. Replicator gossips "RefAnnouncement" to tracked peers (call protocol subscription)
6. On-chain: if this push creates a new branch, optionally emit on-chain attestation
7. Peer replicators fetch updated refs (git protocol) and update their mirrors
```
**The flow for an ACL change**:
```
1. Org admin calls smart contract: grantWrite(repoId, newUserAddress)
2. Smart contract emits RoleGranted event
3. Head node's sync process detects the event (Ethereum log subscription)
4. Sync process calls StorageService: add DelegatesEdge to ACL metagraph
5. StorageService publishes domain event: "acl:updated" (honker stream)
6. Integration event projected: notify replicators of ACL change (call protocol)
7. Replicators update their local ACL cache
```
This cleanly separates:
- **On-chain events** (smart contract logs) = external source of truth
- **Local projections** (ACL metagraph) = cached view for fast access checks
- **Integration events** (call protocol) = cross-node notification mechanism
- **Domain events** (honker streams) = internal state management
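The projection step between domain events and integration events can be sketched as a pure function. The event names and fields below are hypothetical; the point is only that internal detail stays behind the boundary and that some domain events are never projected at all.

```rust
/// Internal events, as a service might publish them on a honker stream.
#[derive(Debug, Clone, PartialEq)]
enum DomainEvent {
    RefsUpdated { repo_id: String, ref_name: String, old_oid: String, new_oid: String },
    AclUpdated { repo_id: String, grantee: String },
    /// Purely internal bookkeeping with no cross-node meaning.
    PackIndexed { repo_id: String, object_count: usize },
}

/// Cross-node events carried by the call protocol. Deliberately thinner
/// than the domain events they are projected from.
#[derive(Debug, Clone, PartialEq)]
enum IntegrationEvent {
    RefAnnouncement { repo_id: String, ref_name: String, oid: String },
    AclChanged { repo_id: String },
}

/// Project a domain event to an integration event, or to nothing when the
/// event should not leave the service.
fn project(event: &DomainEvent) -> Option<IntegrationEvent> {
    match event {
        DomainEvent::RefsUpdated { repo_id, ref_name, new_oid, .. } => {
            Some(IntegrationEvent::RefAnnouncement {
                repo_id: repo_id.clone(),
                ref_name: ref_name.clone(),
                oid: new_oid.clone(),
            })
        }
        // The grantee is dropped: peers re-read the ACL instead of trusting the event.
        DomainEvent::AclUpdated { repo_id, .. } => {
            Some(IntegrationEvent::AclChanged { repo_id: repo_id.clone() })
        }
        DomainEvent::PackIndexed { .. } => None,
    }
}
```

Keeping `project` free of side effects means the boundary can be tested without a running node or a honker stream.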
### 8.6 Practical Integration Path
For alknet to support the decentralized git concept, the integration path is:
#### Phase 1: Foundation (Current Architecture)
- `IdentityProvider` trait supports multiple backends ✓
- `StorageIdentityProvider` queries `peer_credentials` + ACL graph ✓
- `SecretProtocol` derives Ed25519 and secp256k1 keys from same seed ✓
- `OperationSpec.access_control` supports scope-based checks ✓
#### Phase 2: Git Service (Additive)
- Add `GitProtocol` irpc service for repo management
- Implement `GitService` as an application service (like DockerService, NodeService)
- Map `CanPush` to ACL metagraph traversal
- Implement `pre-receive` hook that calls alknet's `CheckAccess` irpc
#### Phase 3: On-Chain ACL (Additive, Requires External Dependencies)
- Add `OnChainIdentityProvider` that:
1. Resolves Ed25519 fingerprint → Ethereum address (via attestation stored in NFT metadata)
2. Checks on-chain ACL contract for access rights
3. Caches results in local ACL metagraph
4. Subscribes to on-chain events for ACL changes
- Add `AccessControlMode::CachedOnChain` to `OperationSpec`
- Add `WalletProtocol` irpc service for signing on-chain transactions
#### Phase 4: Gossip and Replication (Additive)
- Add gossip message types to call protocol (`NodeAnnouncement`, `InventoryAnnouncement`, `RefAnnouncement`)
- Implement `SubscribeRefs` streaming operation for repo update subscriptions
- Add replicator service that seeds repos and responds to gossip
Each phase is additive and doesn't require changes to earlier phases. The architecture supports this incremental extension because:
1. `IdentityProvider` is a trait — new implementations are additive
2. `OperationSpec.access_control` is a struct — new fields are additive
3. Application services register with the call protocol — new services don't change core
4. Honker streams are internal — new streams are additive
---
## 9. References
### Decentralized Git Platforms
- **Radicle Protocol Guide**: https://radicle.dev/guides/protocol — Comprehensive documentation of Radicle's identity system, gossip protocol, replication, and self-certifying repositories
- **Radicle Heartwood (source)**: https://github.com/radicle-dev/heartwood — Reference implementation in Rust
- **RIP-0002 Identity**: Radicle Improvement Proposal for identity documents and delegate thresholds
- **radicle-crypto crate**: Ed25519 key types, SSH encoding, keystore (DeepWiki analysis: https://deepwiki.com/radicle-dev/heartwood/7.1-radicle-crypto)
- **ForgeFed**: https://forgefed.org/ — ActivityPub-based federation protocol for forges (Forgejo, Gitea integration)
- **GitLike**: https://gitlike.dev/ — Browser-based decentralized VCS using IPFS and Ethereum
- **GitBross**: https://gitbross.com/ — Decentralized Git platform using Solana, Arbitrum, and IPFS
- **PineSU**: IEEE paper on Git + Ethereum integration for trusted information sharing
### Blockchain Identity and Naming
- **ERC-721 Standard**: https://ethereum.org/developers/docs/standards/tokens/erc-721 — Non-fungible token standard
- **ENS (Ethereum Name Service)**: https://docs.ens.domains/ — Decentralized naming on Ethereum
- **W3C DID Primer**: https://w3c-ccg.github.io/did-primer/ — Decentralized Identifiers overview
- **W3C Verifiable Credentials**: https://www.w3.org/TR/vc-data-model/ — VC specification
- **EIP-3668 (CCIP-Read)**: Off-chain data lookup for ENS, enabling smart contracts to verify off-chain data
### Access Control
- **Git Hooks**: https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks — Server-side hooks for git access control
- **Gitolite**: Config-file-based SSH key → repo permission mapping
- **Token-Gated Access Control**: https://chainscorelabs.com/guides/ — Patterns for ERC-721/ERC-1155 token-gated access
- **ChainGuard**: Blockchain-based authentication and access control (academic paper)
### Cryptographic Key Management
- **SLIP-0010**: https://slips.readthedocs.io/en/latest/slip-0010/ — Universal private key derivation from master private key (Ed25519, secp256k1, NIST P-256)
- **BIP-0032**: Hierarchical Deterministic Wallets
- **BIP-0039**: Mnemonic code for generating deterministic keys
- **SLIP-0044**: Registered coin types for BIP-0044 (alknet uses unallocated `74'`)
- **Ed25519**: Bernstein's Edwards-curve Digital Signature Algorithm
### Gossip Protocols
- **Gossip Protocol Fundamentals**: https://www.geeksforgeeks.org/distributed-systems/gossip-protocol-in-disrtibuted-systems/ — Epidemic-style information dissemination
- **libgossip**: C++17 implementation for decentralized node discovery and metadata propagation
- **Bitcoin Gossip**: Used in Bitcoin for transaction and block propagation
- **Secure Scuttlebutt (SSB)**: Inspiration for Radicle's gossip model
### Alknet Architecture Documents (Internal)
- **core.md**: Transport, call protocol, auth, services, DNS
- **services.md**: irpc service architecture, OperationEnv, Identity, auth/secret/config protocols
- **storage.md**: Metagraph data model, ACL as metagraph, identity tables, honker integration
- **integration-plan.md**: Phase 0-4 integration plan, ADRs 026-034
- **ADR-029**: Identity as core type (`Identity { id, scopes, resources }` + `IdentityProvider` trait)
- **ADR-032**: Event boundary discipline (domain events vs. integration events vs. service calls)
### Radicle-Specific Documentation
- **Radicle COBs (Collaborative Objects)**: CRDT-based distributed issues/patches stored as Git objects — https://deepwiki.com/radicle-dev/heartwood/6.1-collaborative-objects-(cobs)
- **Radicle Identity Documents**: Delegates, thresholds, and self-certifying repo identity — RIP-0002
- **Radicle Signed Refs**: Vulnerability disclosure (2026-03) on replay attacks in signed references


@@ -0,0 +1,716 @@
# Gitserver Reference Document
> **Source**: <https://github.com/WJQSERVER/gitserver> (cloned at `/workspace/gitserver/`)
> **Version**: 0.0.3 (workspace Cargo.toml)
> **License**: MPL-2.0 (primary); upstream portions MIT (preserved in UPSTREAM-LICENSE)
> **Upstream origin**: <https://github.com/ggueret/git-server>
> **Date researched**: 2026-06-08
> **Purpose**: Evaluate gitserver as a basis for a git service adapter within alknet
---
## 1. Architecture Overview
### 1.1 What is gitserver?
Gitserver is a **Rust-native Git Smart HTTP server** that does not require an installed `git` binary at runtime. All Git operations (ref advertisement, pack generation, receive-pack) are implemented via the [gitoxide](https://github.com/GitoxideLabs/gitoxide) (`gix`) crate. It supports both Git protocol v1 and v2, including shallow clones and multi-ack negotiation.
The project follows a **library-first design**: `gitserver-core` and `gitserver-http` are reusable libraries, while the `gitserver` binary is a thin CLI wrapper for standalone deployment.
### 1.2 Crate Structure
```
crates/
├── gitserver-core/ # Git protocol operations (no HTTP dependency)
│ ├── backend.rs # GitBackend: unified interface for refs/pack/receive-pack
│ ├── discovery.rs # RepoStore: filesystem-based repo discovery
│ ├── dynamic_registry.rs # DynamicRepoRegistry, RepoResolver, MutableRepoRegistry traits
│ ├── error.rs # Error types (RepoNotFound, PathTraversal, Protocol, Git, Io)
│ ├── pack.rs # UploadPackRequest parsing, pack generation with side-band-64k
│ ├── path.rs # Path safety: resolve_repo_path (normalize + canonicalize)
│ ├── pktline.rs # pkt-line encoding/decoding utilities
│ ├── protocol_v2.rs # Git protocol v2: ls-refs, fetch, shallow, stateless-rpc
│ ├── receive_pack.rs # receive-pack: ref advertisement, pack reception, fast-forward validation
│ └── refs.rs # Protocol v1 ref advertisement
├── gitserver-http/ # Axum HTTP layer
│ ├── error.rs # AppError enum → HTTP status codes
│ ├── handlers.rs # Route handlers: info_refs, upload_pack, receive_pack, healthz, list
│ ├── lib.rs # router() function + public re-exports
│ └── state.rs # SharedState (RepoMode, AuthConfig, ServicePolicy, draining flag)
├── gitserver/ # CLI binary (thin wrapper)
│ └── main.rs # CLI args, RepoStore discovery, Axum server, graceful shutdown
└── gitserver-bench/ # Performance benchmarks (not published)
```
### 1.3 Key Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| `gix` | 0.80.0 | Native Git repository operations (open refs, object store, rev-walk) |
| `gix-pack` | 0.67.0 | Pack file writing (receive-pack) |
| `axum` | 0.8.8 | HTTP routing and handlers |
| `tokio` | 1.50.0 | Async runtime, channels, IO |
| `miniz_oxide` | 0.8 | Zlib compression for pack objects |
| `sha1` | 0.10 | Pack checksum |
| `flate2` | 1 | Gzip response compression |
| `zstd` | 0.13 | Zstd response compression |
| `base64` | 0.22 | HTTP Basic auth decoding |
| `subtle` | 2 | Constant-time comparison (auth) |
| `clap` | 4.6.0 | CLI argument parsing |
### 1.4 Request Flow
#### Clone/Fetch (Protocol v1)
```
Client → GET /{repo}/info/refs?service=git-upload-pack
→ Server: resolve repo, verify auth, advertise_refs()
← Ref advertisement response
Client → POST /{repo}/git-upload-pack
→ Server: parse UploadPackRequest, generate_pack()
← Streamed side-band-64k pack response
```
#### Clone/Fetch (Protocol v2)
```
Client → GET /{repo}/info/refs (git-protocol: version=2)
← Capabilities advertisement
Client → POST /{repo}/git-upload-pack (git-protocol: version=2)
→ Server: parse_command_request() → ls-refs or fetch
← ls-refs result or streamed packfile
```
#### Push (receive-pack, must be enabled)
```
Client → GET /{repo}/info/refs?service=git-receive-pack
← Ref advertisement
Client → POST /{repo}/git-receive-pack
→ Server: parse commands, write pack, validate fast-forward, update refs
← Status report (ok/ng per ref)
```
---
## 2. Protocol Support
### 2.1 Smart HTTP Git Protocol
Gitserver implements the **Git Smart HTTP protocol**. It is not defined by an RFC; the de facto specification is Git's own protocol documentation. It is the standard protocol used by `git clone http://...`, `git fetch`, and `git push` over HTTP.
**Supported endpoints:**
| Method | Endpoint | Protocol Version | Description |
|---|---|---|---|
| GET | `/healthz` | — | Health check (no auth) |
| GET | `/` | — | JSON repository listing (auth required if configured) |
| GET | `/{repo}/info/refs?service=git-upload-pack` | v1 | Ref advertisement for clone/fetch |
| GET | `/{repo}/info/refs?service=git-receive-pack` | v1 | Ref advertisement for push (disabled by default) |
| POST | `/{repo}/git-upload-pack` | v1 | Pack negotiation and transfer |
| POST | `/{repo}/git-receive-pack` | v1 | Push operations (disabled by default) |
| GET | `/{repo}/info/refs` with `git-protocol: version=2` | v2 | Capabilities advertisement |
| POST | `/{repo}/git-upload-pack` with `git-protocol: version=2` | v2 | `ls-refs` and `fetch` commands |
### 2.2 Git Operations
| Operation | Supported | Notes |
|---|---|---|
| `git clone` | ✓ | Both v1 and v2 |
| `git fetch` | ✓ | Multi-ack, multi-ack-detailed negotiation |
| `git push` | ✓ (opt-in) | Via `--enable-receive-pack` or `ServicePolicy.receive_pack: true` |
| Shallow clone | ✓ | Protocol v2 `fetch` with `deepen` |
| OFS_DELTA | ✓ | Offset delta compression in packs |
| Side-band-64k | ✓ | Multiplexed progress/pack data |
| Response compression | ✓ | Gzip and Zstd on ref advertisement |
### 2.3 Push Restrictions
When receive-pack is enabled, the following restrictions apply:
- **Fast-forward only**: Branch updates under `refs/heads/*` must be fast-forward (old commit is ancestor of new)
- **No ref deletion**: New OID cannot be the zero OID
- **No tag overwrite**: Updating an existing tag is rejected
- **Commits only**: Branch tips must point to commit objects
- **Timeouts**: 300s total, 30s idle
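The ref-level rules above (everything except the timeouts) can be expressed as a single validation function. This is a sketch, not gitserver's code: the `Rejection` names and the order of checks are assumptions, and the two facts that require reading the object database are passed in as booleans.

```rust
#[derive(Debug, PartialEq)]
enum Rejection {
    Deletion,
    NonFastForward,
    TagOverwrite,
    NotACommit,
}

/// One command from a push: ref name plus old and new object IDs (hex).
struct RefUpdate<'a> {
    name: &'a str,
    old_oid: &'a str,
    new_oid: &'a str,
}

/// Facts a real server would derive from the object database.
struct RepoFacts {
    old_is_ancestor_of_new: bool,
    new_is_commit: bool,
}

/// An all-zero object ID means "no object" in a receive-pack command.
fn is_zero_oid(oid: &str) -> bool {
    !oid.is_empty() && oid.bytes().all(|b| b == b'0')
}

fn validate(update: &RefUpdate<'_>, facts: &RepoFacts) -> Result<(), Rejection> {
    // No ref deletion: the new OID must not be the zero OID.
    if is_zero_oid(update.new_oid) {
        return Err(Rejection::Deletion);
    }
    // A zero old OID means the ref is being created.
    let creating = is_zero_oid(update.old_oid);
    // No tag overwrite: existing tags are immutable.
    if update.name.starts_with("refs/tags/") && !creating {
        return Err(Rejection::TagOverwrite);
    }
    if update.name.starts_with("refs/heads/") {
        // Commits only: branch tips must point at commit objects.
        if !facts.new_is_commit {
            return Err(Rejection::NotACommit);
        }
        // Fast-forward only: the old tip must be an ancestor of the new tip.
        if !creating && !facts.old_is_ancestor_of_new {
            return Err(Rejection::NonFastForward);
        }
    }
    Ok(())
}
```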
### 2.4 SSH Git Protocol
Gitserver does **not** support SSH Git protocol. It is HTTP-only. SSH git access would require a separate implementation or integration layer (see Section 6).
---
## 3. Interface Pattern Analysis
### 3.1 HTTP Handler Architecture
Gitserver's HTTP layer follows a clean handler pattern:
```rust
// gitserver-http/src/lib.rs
pub fn router(state: SharedState) -> Router {
Router::new()
.route("/healthz", get(handlers::healthz))
.route("/", get(handlers::list_repos))
.route("/{*path}", get(handlers::info_refs_dispatch))
.route("/{*path}", post(handlers::rpc_dispatch))
.with_state(state)
}
```
The `SharedState` is an Axum state object containing:
- `RepoMode` — either `Discovered(Arc<RwLock<RepoStore>>)` or `Dynamic { resolver, registry }`
- `AuthConfig` — optional Basic and/or Bearer authentication
- `ServicePolicy` — toggle for upload_pack, upload_pack_v2, receive_pack
- `draining: Arc<AtomicBool>` — graceful shutdown flag
Each handler follows this pattern:
1. Check `draining` flag → 503 if shutting down
2. Check `ServicePolicy` → 404 if service disabled
3. Authenticate request via `require_auth()` → 401 if credentials missing/invalid
4. Resolve repository via `SharedState::resolve()` → 404 if not found
5. Execute git operation via `GitBackend`
6. Return streaming or buffered response
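The ordering of the first four steps matters, because it decides which status code a client sees when several conditions fail at once. A minimal sketch, assuming the checks reduce to four flags (`RequestContext` and `Preflight` are hypothetical names, not gitserver types):

```rust
/// Inputs to the pre-flight checks. In the real handlers these would come
/// from `SharedState` and the request; here they are plain flags.
struct RequestContext {
    draining: bool,
    service_enabled: bool,
    authenticated: bool,
    repo_exists: bool,
}

#[derive(Debug, PartialEq)]
enum Preflight {
    /// All checks passed: continue with the git operation.
    Proceed,
    /// Stop and answer with this HTTP status code.
    Reject(u16),
}

fn preflight(ctx: &RequestContext) -> Preflight {
    if ctx.draining {
        return Preflight::Reject(503); // shutting down
    }
    if !ctx.service_enabled {
        return Preflight::Reject(404); // service disabled by policy
    }
    if !ctx.authenticated {
        return Preflight::Reject(401); // credentials missing or invalid
    }
    if !ctx.repo_exists {
        return Preflight::Reject(404); // repository not found
    }
    Preflight::Proceed
}
```

Because authentication runs before repository resolution, an unauthenticated client gets 401 for both existing and missing repositories and cannot probe which names exist.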
### 3.2 Mapping to alknet's MessageInterface
Gitserver's `SharedState` + handler pattern maps closely to alknet's proposed `MessageInterface` trait:
```rust
// alknet's proposed MessageInterface
async fn handle_request(&self, request: InterfaceRequest) -> Result<InterfaceResponse>;
```
Gitserver's handler flow is essentially:
1. Receive HTTP request (analogous to `InterfaceRequest`)
2. Extract operation path, auth, and body
3. Dispatch to the appropriate Git operation
4. Return HTTP response (analogous to `InterfaceResponse`)
### 3.3 Low-Level Handler API
Gitserver also exposes handler functions that can be called directly without going through the Axum router:
```rust
use gitserver_http::handlers::{info_refs_endpoint, ServiceKind};
let response = info_refs_endpoint(
&state,
"my-project.git",
ServiceKind::UploadPack,
HeaderMap::new(),
).await?;
```
This is significant for alknet integration — it means the git logic can be invoked programmatically without HTTP routing.
---
## 4. Authentication
### 4.1 Current Auth Model
Gitserver supports two HTTP authentication mechanisms, both optional:
```rust
pub struct AuthConfig {
pub basic: Option<BasicAuthConfig>,
pub bearer_token: Option<String>,
}
pub struct BasicAuthConfig {
pub username: String,
pub password: String,
}
```
**Key characteristics:**
- Both can be configured simultaneously; **either one passing is sufficient**
- Basic auth uses **constant-time comparison** (`subtle` crate) to prevent timing attacks
- Bearer token is compared directly (suitable for generated tokens)
- Failed auth returns `401 Unauthorized` with `WWW-Authenticate: Basic realm="gitserver", Bearer`
- `GET /healthz` is **unauthenticated** (always accessible)
- Auth is **global** (same credentials for all repositories) — no per-repo or per-user ACL
### 4.2 Auth Flow in Handlers
```rust
fn require_auth(store: &SharedState, headers: &HeaderMap) -> Result<(), AppError> {
let auth = store.auth();
if auth.basic.is_none() && auth.bearer_token.is_none() {
return Ok(()); // No auth configured → allow all
}
let value = headers.get(AUTHORIZATION)...;
// Try Bearer first, then Basic
// Constant-time comparison for Basic
}
```
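To illustrate what constant-time comparison means here, the sketch below hand-rolls the idea using only the standard library. It is not gitserver's code and not a replacement for the `subtle` crate: an optimizing compiler is free to transform this loop, which is why a vetted crate is used in practice. `bearer_matches` is a hypothetical helper that applies the same comparison to a Bearer token.

```rust
/// Compare two byte strings without returning early on the first mismatch.
/// The length check does return early, so only the length can leak.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate every differing bit; never branch on the data.
        diff |= x ^ y;
    }
    diff == 0
}

/// Check an `Authorization` header value against the expected token.
fn bearer_matches(header_value: &str, expected: &str) -> bool {
    match header_value.strip_prefix("Bearer ") {
        Some(presented) => constant_time_eq(presented.as_bytes(), expected.as_bytes()),
        None => false,
    }
}
```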
### 4.3 Mapping to alknet Identity
alknet's `IdentityProvider` resolves credentials to an `Identity`. The mapping would be:
| gitserver auth | alknet equivalent | Resolution path |
|---|---|---|
| No auth | `Identity::anonymous()` or reject | Configurable policy |
| Basic auth (username/password) | `IdentityProvider::resolve_from_token()` | Map to AuthToken or direct lookup |
| Bearer token | `IdentityProvider::resolve_from_token()` | Token is already in the right format |
The key gap is that gitserver's auth is **single-credential, global**, while alknet needs **per-identity, per-repository** access control. Integration would require:
1. Replacing `AuthConfig` with alknet's `IdentityProvider`
2. Extracting identity from the `Authorization` header
3. Checking per-repo ACL based on resolved `Identity`
---
## 5. Storage
### 5.1 Filesystem-Based Storage
Gitserver currently stores repositories as **bare Git repositories on the local filesystem**. The storage model is:
```
ROOT/
├── project-a.git/ # bare repository
│ ├── HEAD
│ ├── objects/
│ ├── refs/
│ └── description
├── org/
│ └── project-b.git/ # nested repository (up to max_depth)
└── ...
```
The `RepoStore::discover(root, max_depth)` function:
1. Canonicalizes the root path
2. Recursively walks subdirectories up to `max_depth`
3. Attempts `gix::open(path)` on each directory
4. If `repo.is_bare()`, adds it as a `RepoInfo`
5. Path traversal protection via lexical normalization + `canonicalize()` double-check
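The lexical half of the traversal protection in step 5 could look like the sketch below. It is an assumption about the approach, not gitserver's `resolve_repo_path`: this version rejects any `..` component outright instead of resolving it, and it leaves the `canonicalize()` double-check (which needs a real filesystem) to the caller.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically normalize a requested repository path, relative to the root.
/// Returns `None` for anything that could escape the root directory.
fn normalize_repo_path(requested: &str) -> Option<PathBuf> {
    let mut out = PathBuf::new();
    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => out.push(part),
            // A leading "." is harmless and simply skipped.
            Component::CurDir => {}
            // "..", a root ("/"), or a Windows prefix: refuse the request.
            _ => return None,
        }
    }
    if out.as_os_str().is_empty() {
        None
    } else {
        Some(out)
    }
}
```

The result is meant to be joined onto the canonical root and then canonicalized again, so that symlinks pointing outside the root are caught by the second check.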
The `DynamicRepoRegistry` allows programmatic registration/unregistration of repos at runtime, validated by `gix::open()` confirming the path is a bare repo.
### 5.2 Storage Abstraction Points
The key storage interaction points in the codebase are:
| Component | Storage Pattern |
|---|---|
| `RepoStore::discover()` | Filesystem scan (local directory tree) |
| `DynamicRepoRegistry` | In-memory registry with filesystem-backed paths |
| `GitBackend::new(repo_path)` | Opens a local bare repo via `gix::open()` |
| `receive_pack::write_pack()` | Writes pack to `objects/pack/` via `gix_pack::Bundle::write_to_directory()` |
| `path::resolve_repo_path()` | Canonical path resolution + traversal protection |
**All storage operations assume a local filesystem path.** There is no abstraction for remote or object storage backends.
### 5.3 Rustfs (S3-Compatible) Integration Feasibility
Git operations fundamentally require **a local filesystem**: `gix::open()` expects a directory with the standard `.git` layout (objects, refs, HEAD, etc.). Rustfs (S3-compatible) cannot serve as a **direct** storage backend for gitoxide's repository operations because:
1. `gix::open()` requires a local path — it reads `HEAD`, refs, and object packs from the filesystem
2. Pack generation (`generate_pack()`) streams objects from the local ODB
3. Receive-pack writes pack files to the local `objects/pack/` directory
4. Reference updates use `gix::Repository::edit_references()` which operates on the local refstore
However, rustfs **could** be used in several supporting roles:
| Integration Approach | Description | Feasibility |
|---|---|---|
| **Repo sync backend** | Store bare repo tarballs in rustfs; sync to local disk on demand | High — sync from S3 to local FS before serving |
| **Backup/archive** | Push repo backups to rustfs buckets | High — out-of-band backup |
| **Git LFS storage** | Store large file objects in rustfs via Git LFS | Medium — requires LFS server implementation |
| **Object store proxy** | Cache layer: serve from local FS, sync to/from rustfs | Medium — needs repo lifecycle management |
| **Direct S3 repo** | Custom `gix` object backend reading from S3 | Low — would require deep gitoxide customization |
The most practical approach: **use rustfs as a backing store for repository synchronization**. Gitserver would always operate on local filesystem paths, but a separate component would manage syncing repos to/from rustfs buckets.
---
## 6. SSH Support
### 6.1 Current State
Gitserver has **no SSH transport capability**. It only implements the Git Smart HTTP protocol. Adding SSH support would require implementing Git's SSH transport, which carries the same pkt-line data but differs in transport and session setup:
| Aspect | Smart HTTP | SSH |
|---|---|---|
| Transport | HTTP (request/response) | Persistent SSH channel |
| Service discovery | `GET /info/refs?service=git-upload-pack` | SSH exec request: `git-upload-pack 'repo'` |
| Protocol framing | pkt-line over HTTP | pkt-line over SSH channel |
| Authentication | HTTP Authorization header | SSH key-based |
| Multiplexing | HTTP/2 or separate connections | Multiple SSH channels |
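Both transports carry the same pkt-line framing: a four-digit hexadecimal length that counts itself, followed by the payload, with `0000` as the flush packet. The sketch below shows that framing in isolation. It is independent of gitserver's `pktline.rs`, whose API is not shown in this document, and it does not handle the protocol v2 special packets `0001` and `0002`.

```rust
/// Largest pkt-line, including the 4-byte length header.
const MAX_PKT_LEN: usize = 65520;
/// Flush packet: marks the end of a message section.
const FLUSH_PKT: &[u8] = b"0000";

#[derive(Debug, PartialEq)]
enum Pkt<'a> {
    Flush,
    Data(&'a [u8]),
}

/// Prefix the payload with its total length as four hex digits.
fn encode_pkt_line(payload: &[u8]) -> Option<Vec<u8>> {
    let total = payload.len() + 4;
    if total > MAX_PKT_LEN {
        return None;
    }
    let mut out = format!("{:04x}", total).into_bytes();
    out.extend_from_slice(payload);
    Some(out)
}

/// Decode one pkt-line from the front of `input`.
/// Returns the packet and the remaining bytes, or `None` if malformed.
fn decode_pkt_line(input: &[u8]) -> Option<(Pkt<'_>, &[u8])> {
    let header = input.get(..4)?;
    if !header.iter().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let total = usize::from_str_radix(std::str::from_utf8(header).ok()?, 16).ok()?;
    match total {
        0 => Some((Pkt::Flush, &input[4..])),
        // 0001 and 0002 are protocol v2 control packets, not handled here;
        // 0003 is never valid.
        1..=3 => None,
        _ => {
            let data = input.get(4..total)?;
            Some((Pkt::Data(data), &input[total..]))
        }
    }
}
```

Because the framing is identical over HTTP and SSH, a pkt-line codec written once can serve both an HTTP handler and an SSH channel handler.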
### 6.2 How Git over SSH Works
The Git SSH protocol uses SSH as a transport for the same `git-upload-pack` and `git-receive-pack` commands:
```
Client connects via SSH → server executes git-upload-pack or git-receive-pack
Client ← SSH channel → Server (bidirectional pkt-line stream)
```
### 6.3 Integration with alknet's SSH Interface
alknet's SSH interface (`SshInterface`) is a `StreamInterface` — it accepts a persistent byte stream and multiplexes it into channels. This maps naturally to Git over SSH:
**Approach: Git as an alknet operation over SSH**
```
alknet SSH session
├─ Channel: call protocol (operations)
└─ Channel: git-upload-pack
OR git-receive-pack
gitserver-core protocol logic
(ref advertisement, pack generation, receive-pack)
```
This would work by:
1. The SSH interface receives a connection with a request like `git-upload-pack '/repos/project.git'`
2. alknet resolves the identity from the SSH key fingerprint
3. Checks ACL: does this identity have read/write access to this repo?
4. Invokes `gitserver-core` functions directly (no HTTP needed):
- `refs::advertise_refs()` → send over SSH channel
- `pack::generate_pack()` → stream over SSH channel
- `receive_pack::receive_pack()` → read/write over SSH channel
**Key advantage**: Since `gitserver-core` has no HTTP dependency, it can be used directly over SSH channels without the HTTP overhead. The `GitBackend` API is transport-agnostic.
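Step 1 of this flow needs to turn the SSH exec request into a service and a repository path before any identity or ACL check runs. A minimal sketch, assuming the single-quoted form shown above; `GitService` and `parse_exec_command` are hypothetical names, and real clients may send other quoting or the `git upload-pack` spelling, which this sketch does not handle.

```rust
#[derive(Debug, PartialEq)]
enum GitService {
    UploadPack,
    ReceivePack,
}

/// Parse an SSH exec request such as `git-upload-pack '/repos/project.git'`.
/// Anything other than the two known git services is rejected, so the SSH
/// interface never executes arbitrary commands.
fn parse_exec_command(command: &str) -> Option<(GitService, &str)> {
    let (program, arg) = command.trim().split_once(' ')?;
    let service = match program {
        "git-upload-pack" => GitService::UploadPack,
        "git-receive-pack" => GitService::ReceivePack,
        _ => return None,
    };
    let arg = arg.trim();
    // Strip one pair of surrounding single quotes if present.
    let repo = arg
        .strip_prefix('\'')
        .and_then(|rest| rest.strip_suffix('\''))
        .unwrap_or(arg);
    if repo.is_empty() {
        None
    } else {
        Some((service, repo))
    }
}
```

The returned service maps directly onto the access type for the ACL check in step 3: `UploadPack` needs read access, `ReceivePack` needs write access.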
### 6.4 Alternative: Dedicated Git SSH Adapter
A simpler approach that doesn't require modifying the SSH channel multiplexing:
```
alknet SSH session → call protocol → operation "git/upload-pack" →
→ GitAdapter::upload_pack(repo, wants, haves) → streaming response
```
This treats Git operations as alknet call operations, where the SSH interface is the transport but Git operations are invoked via the call protocol rather than raw SSH channels. This is more aligned with alknet's architecture but requires adapting the Git protocol to the call protocol's request/response model (potentially with streaming).
---
## 7. Relevance to alknet
### 7.1 Mapping to alknet's Interface Model
Gitserver is a textbook **`MessageInterface`** implementation:
| alknet MessageInterface | Gitserver Equivalent |
|---|---|
| `handle_request(InterfaceRequest)` | `info_refs_dispatch()` / `rpc_dispatch()` |
| `InterfaceRequest.operation_path` | URL path (`/{repo}/info/refs`, `/{repo}/git-upload-pack`) |
| `InterfaceRequest.auth_token` | `Authorization` header → `require_auth()` |
| `InterfaceRequest.input` | Request body (pack negotiation data) |
| `InterfaceResponse.result` | HTTP response body (ref advertisement, pack data) |
| `InterfaceResponse.status` | HTTP status code |
| `InterfaceResponse.headers` | Content-Type, Cache-Control, etc. |
Gitserver also **manages its own transport** (an Axum HTTP server), which is exactly the `MessageInterface` pattern described in alknet's interface model: "MessageInterface implementations manage their own transport. They don't need the Transport trait because they're not wrapping a generic byte stream — they ARE the transport+interface combined."
### 7.2 Git as an alknet Operation
Git operations could be mapped to alknet's call protocol namespace:
```
Namespace: "git"
Operations:
- git/list → List available repositories
- git/info-refs → Get ref advertisement for a repo
- git/upload-pack → Clone/fetch (streaming response)
- git/receive-pack → Push (streaming request+response)
- git/ls-refs → Protocol v2 ls-refs
- git/fetch → Protocol v2 fetch
```
**Challenge**: Git operations are **streaming and bidirectional** (especially fetch negotiation and receive-pack), while alknet's call protocol is currently defined as request→response. This needs design consideration:
| Operation | Direction | Stream Duration | alknet Fit |
|---|---|---|---|
| `git/list` | Request → Response | Short | Direct fit |
| `git/info-refs` | Request → Response | Short | Direct fit |
| `git/upload-pack` | Request → Streaming Response | Long | Needs streaming response support |
| `git/receive-pack` | Streaming Request → Streaming Response | Long | Needs bidirectional streaming |
### 7.3 Proposed GitAdapter Architecture
```
┌─────────────────────────────────────────────────────────┐
│ alknet node │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ HttpInterface│ │ SshInterface │ │ DNS/other │ │
│ │ (Message) │ │ (Stream) │ │ (Message) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ OperationRegistry │ │
│ │ "git/list" → GitAdapter::list_repos() │ │
│ │ "git/upload-pack" → GitAdapter::upload_pack() │ │
│ │ "git/receive-pack" → GitAdapter::receive_pack() │ │
│ └──────────────┬───────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────┐ │
│ │ GitAdapter │ │
│ │ - SharedState (repos, auth) │ │
│ │ - GitBackend (protocol ops) │ │
│ │ - IdentityProvider (auth) │ │
│ │ - RepoResolver (filesystem) │ │
│ └──────────────┬───────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────┐ ┌────────────────┐ │
│ │ Local filesystem │ │ Rustfs sync │ │
│ │ (bare git repos) │ │ (S3 backend) │ │
│ └──────────────────────────────┘ └────────────────┘ │
└─────────────────────────────────────────────────────────┘
```
### 7.4 Auth Integration: alknet Identity → Gitserver Auth
**Current gitserver auth** (single global credential):
```rust
AuthConfig {
basic: Option<BasicAuthConfig>, // one username/password
bearer_token: Option<String>, // one token
}
```
**Proposed alknet integration** (per-identity, per-repo):
```rust
struct GitAdapter {
identity_provider: Arc<dyn IdentityProvider>,
repo_resolver: Arc<dyn RepoResolver>,
backend_factory: Arc<dyn GitBackendFactory>,
acl: Arc<dyn GitAcl>,
}
impl GitAdapter {
async fn handle_request(
&self,
request: InterfaceRequest,
) -> Result<InterfaceResponse> {
// 1. Resolve identity from auth token
let identity = self.identity_provider
.resolve_from_token(request.auth_token)?;
// 2. Parse git operation from path
let operation = parse_git_operation(&request.operation_path)?;
// 3. Check ACL
self.acl.check_access(&identity, &operation.repo, operation.access_type)?;
// 4. Dispatch to gitserver-core logic
// ...
}
}
```
**ACL design** (per-repo, per-operation):
```rust
enum GitAccess {
Read, // clone, fetch
Write, // push
}
trait GitAcl: Send + Sync {
fn check_access(
&self,
identity: &Identity,
repo: &str,
access: GitAccess,
) -> Result<()>;
}
```
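For testing, the `GitAcl` trait could be backed by an in-memory grant set before the ACL metagraph is wired in. The sketch below is hypothetical throughout: `InMemoryAcl` and `AccessDenied` are illustrative names, `Identity` is cut down to its `id` field, and the rule that a write grant implies read access is an assumption, not something the source specifies.

```rust
use std::collections::HashSet;

/// Reduced stand-in for alknet's `Identity`; only the `id` is needed here.
struct Identity {
    id: String,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum GitAccess {
    Read,  // clone, fetch
    Write, // push
}

#[derive(Debug, PartialEq)]
struct AccessDenied;

trait GitAcl: Send + Sync {
    fn check_access(
        &self,
        identity: &Identity,
        repo: &str,
        access: GitAccess,
    ) -> Result<(), AccessDenied>;
}

/// Grants stored as (identity id, repo, access) triples.
struct InMemoryAcl {
    grants: HashSet<(String, String, GitAccess)>,
}

impl InMemoryAcl {
    fn grant(&mut self, id: &str, repo: &str, access: GitAccess) {
        self.grants.insert((id.to_string(), repo.to_string(), access));
    }
}

impl GitAcl for InMemoryAcl {
    fn check_access(
        &self,
        identity: &Identity,
        repo: &str,
        access: GitAccess,
    ) -> Result<(), AccessDenied> {
        let has = |a: GitAccess| {
            self.grants.contains(&(identity.id.clone(), repo.to_string(), a))
        };
        let allowed = match access {
            // Assumption: anyone who may push may also clone and fetch.
            GitAccess::Read => has(GitAccess::Read) || has(GitAccess::Write),
            GitAccess::Write => has(GitAccess::Write),
        };
        if allowed { Ok(()) } else { Err(AccessDenied) }
    }
}
```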
### 7.5 Storage Integration with Rustfs
**Recommended approach**: Rustfs as a sync backend:
```rust
trait RepoStorage: Send + Sync {
/// Ensure a local bare copy exists for the given repo.
/// May involve syncing from S3 (rustfs) to local disk.
async fn ensure_local(&self, repo: &str) -> Result<PathBuf>;
/// Sync local changes back to S3 (rustfs) after a push.
async fn sync_to_remote(&self, repo: &str) -> Result<()>;
/// List available repos (may consult S3 bucket listing).
async fn list_repos(&self) -> Result<Vec<RepoInfo>>;
}
```
The flow would be:
1. `GitAdapter` receives a request for repo `X`
2. `RepoStorage::ensure_local("X")` checks if the repo exists on local disk; if not, syncs from rustfs
3. Git operations run on the local filesystem (using `gitserver-core` directly)
4. After push operations, `RepoStorage::sync_to_remote("X")` pushes updates to rustfs
This maintains gitserver's requirement for a local filesystem while leveraging rustfs for durability and distribution.
### 7.6 Operation Mapping
| Git Operation | alknet Namespace | alknet Op | Input | Output | Stream? |
|---|---|---|---|---|---|
| List repos | `git` | `list` | `{}` | `[RepoInfo]` | No |
| Ref advertisement (v1) | `git` | `info-refs` | `{repo, service: "upload-pack" \| "receive-pack"}` | Binary ref advertisement | No |
| Ref capabilities (v2) | `git` | `capabilities` | `{repo}` | Binary capabilities | No |
| Ls-refs (v2) | `git` | `ls-refs` | `{repo, peel, symrefs, ref_prefixes}` | Binary ref listing | No |
| Clone/Fetch | `git` | `upload-pack` | `{repo, wants, haves, done, ...}` | Streamed pack data | Yes (response) |
| Push | `git` | `receive-pack` | `{repo, commands, pack_data}` | Status report | Yes (both) |
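
The table above could be encoded as a small lookup type so that the adapter derives the required access level and streaming mode from the operation name. This is only a sketch: the `GitOp` and `Streaming` names are hypothetical, and treating every operation other than `receive-pack` as read-only is a simplification (an `info-refs` request for the `receive-pack` service would arguably need write access).

```rust
/// Access level needed by an operation (mirrors the `GitAccess` enum in 7.4).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitAccess {
    Read,
    Write,
}

/// Streaming mode from the "Stream?" column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Streaming {
    Off,
    Response,
    Both,
}

/// One variant per row of the operation mapping table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitOp {
    List,
    InfoRefs,
    Capabilities,
    LsRefs,
    UploadPack,
    ReceivePack,
}

impl GitOp {
    /// Parse the alknet op name; unknown names are rejected.
    pub fn from_name(op: &str) -> Option<Self> {
        match op {
            "list" => Some(Self::List),
            "info-refs" => Some(Self::InfoRefs),
            "capabilities" => Some(Self::Capabilities),
            "ls-refs" => Some(Self::LsRefs),
            "upload-pack" => Some(Self::UploadPack),
            "receive-pack" => Some(Self::ReceivePack),
            _ => None,
        }
    }

    pub fn streaming(self) -> Streaming {
        match self {
            Self::UploadPack => Streaming::Response,
            Self::ReceivePack => Streaming::Both,
            _ => Streaming::Off,
        }
    }

    /// Simplification: only push needs write access.
    pub fn required_access(self) -> GitAccess {
        match self {
            Self::ReceivePack => GitAccess::Write,
            _ => GitAccess::Read,
        }
    }
}
```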
### 7.7 What gitserver-core Provides Directly
The most valuable integration point is `gitserver-core` — the HTTP-free protocol library:
```rust
// Direct usage without HTTP
use gitserver_core::backend::GitBackend;
use gitserver_core::discovery::RepoStore;
use gitserver_core::pack::{UploadPackRequest, UploadPackCapabilities, ShallowRequest};
use gitserver_core::protocol_v2;
// Repository discovery
let store = RepoStore::discover("./repos".into(), 3)?;
let repo = store.resolve("my-project.git")?;
// Protocol v1 ref advertisement
let backend = GitBackend::new(repo.absolute_path.clone());
let refs = backend.advertise_refs()?;
// Pack generation (streaming)
let request = UploadPackRequest { wants, haves, done, ... };
let pack_stream = backend.upload_pack(&request).await?;
// Receive-pack (push)
let result = backend.receive_pack(request_stream).await?;
// Protocol v2
let capabilities = protocol_v2::advertise_capabilities();
let ls_refs_output = protocol_v2::ls_refs(&repo_path, &ls_refs_request)?;
let fetch_output = backend.upload_pack(&fetch_request.upload_request).await?;
```
These functions can be called from any async context — SSH channel handler, alknet operation handler, HTTP handler — without going through the Axum HTTP layer.
---
## 8. Integration Recommendations
### 8.1 Recommended Integration Strategy
**Phase 1: HTTP Gateway (MessageInterface)**
Embed gitserver-http's Axum router into alknet's HTTP interface. This provides immediate Git-over-HTTP capability:
```rust
// In alknet's HttpInterface::handle_request()
// Route: /git/* → gitserver router
let git_app = gitserver_http::router(git_state);
let app = Router::new()
.nest("/git", git_app) // Mount git under /git
.route("/v1/{namespace}/{op}", post(operation_handler));
```
This works because gitserver is designed to be nested into existing Axum apps. Auth integration would replace `AuthConfig` with alknet's `IdentityProvider`.
**Phase 2: SSH Git Adapter (StreamInterface)**
Use `gitserver-core` directly within alknet's SSH interface for Git-over-SSH:
```rust
// In alknet's SshInterface channel handler
// SSH channel request: "git-upload-pack '/repos/project.git'"
let backend = GitBackend::new(repo_path);
let refs = backend.advertise_refs()?;
// Send refs over SSH channel
// Stream pack data over SSH channel
```
**Phase 3: Call Protocol Operations (OperationRegistry)**
Register Git operations in the operation registry for access via any interface:
```rust
registry.register(GitListRepos::new(adapter.clone()));
registry.register(GitUploadPack::new(adapter.clone()));
registry.register(GitReceivePack::new(adapter.clone()));
```
### 8.2 Key Modifications Needed
1. **Auth replacement**: Replace `AuthConfig` with `IdentityProvider`-based auth in `handlers.rs`'s `require_auth()` function
2. **ACL addition**: Add per-repo, per-identity access control (gitserver currently has none)
3. **RepoResolver abstraction**: Replace `RepoStore`/`DynamicRepoRegistry` with alknet's `RepoResolver` that integrates with rustfs sync
4. **Streaming response support**: Adapt alknet's call protocol for streaming (large pack files)
5. **Bidirectional streaming**: For receive-pack, the call protocol needs to support bidirectional streaming
### 8.3 Risks and Mitigations
| Risk | Mitigation |
|---|---|
| gitserver requires local filesystem | Use rustfs as sync backend; maintain local working copies |
| Auth is global (single credential) | Fork/modify `require_auth()` to use `IdentityProvider` |
| No per-repo ACL | Add `GitAcl` trait in the adapter layer |
| MPL-2.0 license requires modifications to be under MPL-2.0 | Acceptable for alknet (MPL-2.0 is file-level copyleft) |
| Large pack files may not fit alknet's message size limits | Implement streaming response in the call protocol |
| gitoxide version coupling | Pin `gix = "0.80.0"` as gitserver does |
### 8.4 License Considerations
- **Primary license**: MPL-2.0 (file-level copyleft)
- **Upstream portions**: MIT (preserved in UPSTREAM-LICENSE)
- **Implication**: Modifications to gitserver's `.rs` files must remain under MPL-2.0. Linking from alknet code is unrestricted.
- **Recommendation**: Use gitserver as a library dependency. If alknet-specific auth/ACL modifications are needed, contribute them upstream or maintain them as separate files under MPL-2.0.
---
## 9. Summary
### 9.1 Key Findings
1. **gitserver is a well-structured, library-first Rust Git Smart HTTP server** with clean separation between protocol logic (`gitserver-core`) and HTTP transport (`gitserver-http`).
2. **Protocol support is comprehensive**: Git Smart HTTP v1 and v2, clone, fetch, push (opt-in), shallow clones, delta compression, streaming pack generation.
3. **No SSH support exists**, but `gitserver-core` is transport-agnostic and can serve Git operations over any channel.
4. **Auth is simple but limited**: single global Basic/Bearer credential, no per-repo or per-user ACL.
5. **Storage is local-filesystem only**: `gix::open()` requires a local path. S3/rustfs integration requires a sync-to-local approach.
6. **The library design enables direct integration**: `GitBackend` and protocol functions can be called without HTTP.
### 9.2 Recommendation
**Use `gitserver-core` as alknet's Git protocol engine.** The core crate provides all Git protocol operations (ref advertisement, pack generation, receive-pack, protocol v2) without any HTTP dependency. This allows alknet to expose Git services through any interface (HTTP, SSH, call protocol) while maintaining a single protocol implementation.
**Use `gitserver-http` as alknet's Git HTTP interface** by nesting its Axum router under alknet's HTTP interface, with auth replaced by `IdentityProvider`.
**Design a `GitAdapter`** that wraps `gitserver-core` and integrates with alknet's `OperationRegistry`, `IdentityProvider`, and rustfs-backed storage.
### 9.3 Next Steps
1. Fork or vendor `gitserver-core` and `gitserver-http` into alknet's dependency tree
2. Design the `GitAdapter` struct with `IdentityProvider` auth and `GitAcl` access control
3. Implement Phase 1: HTTP gateway with nested Axum router and `IdentityProvider` auth
4. Implement `RepoStorage` trait with rustfs sync-to-local strategy
5. Design streaming extensions to alknet's call protocol for pack file transfer
6. Evaluate Phase 2: SSH Git adapter using `gitserver-core` directly over SSH channels
---
## References
- [gitserver README](https://github.com/WJQSERVER/gitserver) — project overview, quick start, CLI usage
- [gitserver Architecture docs](docs/en/architecture.md) — crate responsibilities, request flows
- [gitserver Library docs](docs/en/library.md) — embedding, dynamic registration, auth config
- [gitserver API Reference](docs/en/api.md) — REST endpoints, protocol details, error codes
- [alknet Interface Model](../../phase2/interface-model.md) — StreamInterface/MessageInterface design
- [gitoxide](https://github.com/GitoxideLabs/gitoxide) — underlying Git implementation library

View File

@@ -0,0 +1,963 @@
# OpenStack Keystone Identity Service — Reference Document
> Status: Research reference
> Created: 2026-06-08
> Context: alknet auth/identity system design; rustfs S3-compatible store with Keystone auth
## 1. Overview
OpenStack Keystone is the identity service for the OpenStack cloud platform. It
provides authentication, authorization, and service discovery via a RESTful HTTP
API. Every other OpenStack service (Nova, Neutron, Cinder, Swift, etc.) depends
on Keystone for token validation and access control.
Key responsibilities:
| Responsibility | Description |
|---|---|
| **Authentication** | Verify identity via passwords, tokens, TOTP, SAML, OIDC, application credentials |
| **Authorization** | Role-based access control (RBAC) across projects, domains, and system scope |
| **Service Catalog** | Registry of available services and their endpoint URLs |
| **Token Management** | Issue, validate, and revoke bearer tokens with scoped authorization |
| **Federation** | Accept identity assertions from external IdPs (SAML, OIDC) |
| **Trust Delegation** | Allow users to delegate limited authority to other users |
---
## 2. Core Concepts
### 2.1 Domains
A **domain** is a top-level namespace that contains users, groups, and projects.
Domains provide administrative isolation: a domain administrator can manage
users and projects within their domain but not across domains.
- Domains were introduced in the Identity API v3 (the "v3" API).
- Before domains, OpenStack used "tenants" (v2 API) — projects are the v3
equivalent, but domains add a containment boundary.
- Every user, group, and project belongs to exactly one domain.
- The `Default` domain is created automatically and holds all v2-compatible
resources.
**Key property**: Domains are the unit of administrative delegation. A domain
admin can create/delete users, groups, and projects within their domain.
### 2.2 Projects
A **project** is a container for resources — compute instances, storage volumes,
networks, etc. Projects are the primary scope for authorization in OpenStack.
- Projects group resources: "who can see/use these VMs and volumes?"
- Projects belong to a domain.
- Projects are the primary unit for role assignment and token scoping.
- Projects can be hierarchical (parent/child) with inherited role assignments.
**Key property**: A project-scoped token lets you operate on resources within
that project. You cannot use a project-scoped token to access resources in a
different project.
### 2.3 Users
A **user** represents a digital identity — a person, system account, or service
account that can authenticate and be authorized.
- Users belong to a domain.
- Users can have multiple authentication methods (password, TOTP, application
credentials, federated identity).
- Users can be members of groups.
- Users receive role assignments on projects, domains, or system scope.
### 2.4 Groups
A **group** is a named collection of users. Groups simplify role management: you
assign a role to a group on a project, and every user in the group inherits that
role.
- Groups belong to a domain.
- Groups are used for role assignment: `group:X → role:member → project:Y`.
- Federation mappings often resolve external IdP groups to local Keystone groups.
### 2.5 Roles
A **role** is a named label used in authorization decisions. A role by itself
does not define which operations are allowed; policy files map roles to API
operations.
- Roles are assigned by binding an actor (user or group) to a target (project,
domain, or system) with a role.
- Assignment format: `{actor, role, target}` — e.g., `{user:alice, member,
project:engineering}`.
- OpenStack defines default roles: `admin`, `member`, `reader`.
- Custom roles can be created. Policy files (policy.yaml) map roles to API
operations.
- **Implied roles**: one role can imply another (e.g., `admin` implies `member`
implies `reader`).
- **Inherited roles**: a role assigned on a domain with `inherited_to_projects`
flag propagates to all projects within that domain.
### 2.6 Endpoints
An **endpoint** is a network-accessible URL for an OpenStack service. Each
service registers one or more endpoints in Keystone's service catalog.
- Endpoints have an **interface** type:
- `public` — for end users (public network)
- `internal` — for service-to-service communication (internal network)
- `admin` — for administrative operations (restricted network)
- Endpoints have a **region** attribute for multi-region deployments.
- Endpoint URLs can contain template variables like `$(project_id)s` that are
resolved at token time.
### 2.7 Service Catalog
The **service catalog** is a registry of all services available in the
deployment and their endpoints. It is included in token responses and is
available via `GET /v3/auth/catalog`.
- A service has a `type` (e.g., `identity`, `compute`, `object-store`) and a
`name` (e.g., `keystone`, `nova`, `swift`).
- The `type` follows the [service-types authority][] — it identifies the API
contract, not the implementation version.
- The service catalog in a token is filtered by scope: a project-scoped token
shows only endpoints relevant to that project.
- Endpoint filtering allows administrators to restrict which endpoints are
visible to specific projects via project-endpoint associations or endpoint
groups.
[service-types authority]: https://service-types.openstack.org/
**Example service catalog entry:**
```json
{
"catalog": [
{
"name": "Keystone",
"type": "identity",
"endpoints": [
{
"interface": "public",
"url": "https://identity.example.com:5000/"
},
{
"interface": "internal",
"url": "https://identity.internal:5000/"
},
{
"interface": "admin",
"url": "https://identity.admin:5000/"
}
]
}
]
}
```
---
## 3. Token Lifecycle
### 3.1 Token Types by Scope
| Token Type | Scope | Contains | Use Case |
|---|---|---|---|
| **Unscoped** | None | User identity only, no roles, no catalog | Prove identity for subsequent scoped auth |
| **Project-scoped** | Project | Roles, catalog, project info | Operate on project resources (VMs, volumes) |
| **Domain-scoped** | Domain | Roles, catalog, domain info | Manage users/projects within a domain |
| **System-scoped** | System | Roles, catalog, system info | Cloud-wide admin operations |
| **Trust-scoped** | Trust | Delegated roles, trust metadata | Act on behalf of another user |
### 3.2 Authentication Flow
```
1. Client → POST /v3/auth/tokens (with credentials)
2. Keystone validates credentials
3. Keystone issues token:
- Token ID returned in X-Subject-Token header
- Token body (JSON) returned in response body
4. Client uses token: X-Auth-Token: <token_id> on subsequent requests
5. Services validate token:
- Option A: Fernet/JWS (self-contained): validated from the token itself, no token database lookup
- Option B: UUID (legacy): Keystone looks the token up in its database
```
### 3.3 Token Providers
| Provider | Format | Persistence | Size | Security |
|---|---|---|---|---|
| **Fernet** (default) | AES-128-CBC ciphertext + SHA256 HMAC | None (self-contained) | ~200 bytes | Symmetric keys; only Keystone can decrypt |
| **JWS** | JSON Web Signature (ES256) | None (self-contained) | ~800 bytes | Asymmetric keys; anyone can verify signature, payload is readable |
| **UUID** (legacy) | Random UUID string | Database (must be stored) | ~32 bytes | Requires database lookup for validation |
**Fernet tokens** are the recommended default. They are:
- Self-contained: no database persistence needed.
- Encrypted: the token payload is opaque to clients.
- Compact: much smaller than JWS tokens.
- Key rotation: Fernet keys are rotated using `keystone-manage fernet_rotate`.
**JWS tokens** are appropriate when:
- You want asymmetric key verification (services can validate without sharing
symmetric keys).
- You're comfortable with the payload being readable by anyone who has the token.
### 3.4 Token Contents
A project-scoped token contains:
```json
{
"token": {
"methods": ["password"],
"user": {
"id": "aaa...",
"name": "alice",
"domain": { "id": "default", "name": "Default" }
},
"project": {
"id": "bbb...",
"name": "engineering",
"domain": { "id": "default", "name": "Default" }
},
"roles": [
{ "id": "ccc...", "name": "member" },
{ "id": "ddd...", "name": "reader" }
],
"catalog": [ ... ],
"expires_at": "2026-06-08T12:00:00.000000Z",
"issued_at": "2026-06-08T11:00:00.000000Z",
"audit_ids": ["eeee..."],
"is_domain": false
}
}
```
Key fields:
- `methods`: Authentication methods used (e.g., `["password"]` or
`["password", "totp"]` for MFA).
- `user`: Who the token belongs to.
- `project` / `domain` / `system`: The authorization scope.
- `roles`: The roles assigned to the user within the scope.
- `catalog`: Service catalog (absent in unscoped tokens).
- `expires_at` / `issued_at`: Token validity window.
- `audit_ids`: Chain of audit IDs for tracking token derivation.
### 3.5 Token Validation
When a service receives a request with a token:
1. Extract `X-Auth-Token` header.
2. For Fernet tokens: the service passes the token to Keystone, which decrypts
it with its Fernet keys, parses the payload, verifies expiration, and checks
revocation events. Services do not hold the Fernet keys themselves.
3. For JWS tokens: verify signature with public key, parse payload, verify
expiration. Check revocation events.
4. For UUID tokens: call Keystone to validate. (Legacy: the UUID provider was
removed in the Rocky release.)
Keystone middleware (`keystonemiddleware`) handles this automatically for
OpenStack services.
### 3.6 Token Revocation
Tokens can be revoked explicitly (`DELETE /v3/auth/tokens`) or implicitly via
revocation events triggered by:
- User account disabled
- Domain disabled
- Project disabled
- Password changed (invalidates all tokens for that user)
- Role assignment changed (invalidates tokens for the affected scope)
Revocation events use pattern matching for efficiency — a single event can
invalidate many tokens (e.g., all tokens for a user, or all tokens for a project).
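
The pattern-matching idea can be sketched as follows. This is not Keystone's implementation: the `TokenInfo` and `RevocationEvent` field sets are reduced to a hypothetical minimum, and the comparison against `issued_before` is an assumption about how the cutoff is applied.

```rust
/// Minimal view of a token for revocation checks (illustrative fields only).
pub struct TokenInfo<'a> {
    pub user_id: &'a str,
    pub project_id: Option<&'a str>,
    pub issued_at: u64, // seconds since the Unix epoch
}

/// A revocation event. Every `Some` field must match the token;
/// a `None` field acts as a wildcard.
pub struct RevocationEvent {
    pub user_id: Option<String>,
    pub project_id: Option<String>,
    pub issued_before: u64,
}

impl RevocationEvent {
    pub fn matches(&self, token: &TokenInfo) -> bool {
        let user_ok = self
            .user_id
            .as_deref()
            .map_or(true, |u| u == token.user_id);
        let project_ok = self
            .project_id
            .as_deref()
            .map_or(true, |p| token.project_id == Some(p));
        // Only tokens issued at or before the cutoff are affected, so tokens
        // obtained after (for example) a password change remain valid.
        user_ok && project_ok && token.issued_at <= self.issued_before
    }
}

/// A token is revoked if any stored event matches it.
pub fn is_revoked(events: &[RevocationEvent], token: &TokenInfo) -> bool {
    events.iter().any(|e| e.matches(token))
}
```

One event with only `user_id` set covers every token that user holds, which is why the event list stays short compared with a per-token revocation list.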
---
## 4. Scoping
### 4.1 Unscoped → Scoped Flow
The typical authentication flow is two-step:
1. **Authenticate** → receive an **unscoped token** (proves identity, no
authorization).
2. **Re-authenticate with scope** → receive a **scoped token** (proves identity
+ authorization).
```bash
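# Step 1: Get unscoped token (-i prints the X-Subject-Token response header).
# The host below is a placeholder; substitute your Keystone endpoint.
curl -i -X POST https://identity.example.com:5000/v3/auth/tokens \
  -H "Content-Type: application/json" -d '{
  "auth": {
    "identity": {
      "methods": ["password"],
      "password": {
        "user": {
          "name": "alice",
          "domain": { "name": "Default" },
          "password": "..."
        }
      }
    }
  }
}'
# Step 2: Get project-scoped token using unscoped token
curl -i -X POST https://identity.example.com:5000/v3/auth/tokens \
  -H "Content-Type: application/json" -d '{
  "auth": {
    "identity": {
      "methods": ["token"],
      "token": { "id": "<unscoped_token>" }
    },
    "scope": {
      "project": { "name": "engineering", "domain": { "name": "Default" } }
    }
  }
}'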
```
### 4.2 Scope Types and Authorization
| Scope | Token Can Do | Token Cannot Do |
|---|---|---|
| **Project** | Operate on project resources (VMs, storage, networks) | Manage domain users, system-wide operations |
| **Domain** | Manage users/projects within that domain | Operate on project resources (without project scope) |
| **System** | Cloud-wide admin: manage endpoints, services, hypervisor info | Project-specific resource operations |
| **None (unscoped)** | Prove identity to Keystone | Access any service resources |
A project-scoped token **cannot** be reused in a different project. Each scope
is a separate token. This is a deliberate security design: token scope limits
the blast radius of a compromised token.
### 4.3 Design Rationale
The scoping model exists because:
1. **Principle of least privilege**: Users authenticate once (expensive), then
get narrowly scoped tokens (cheap) for each operation context.
2. **Multi-tenancy**: A cloud serves many organizations; project scoping
prevents cross-tenant access.
3. **Administrative separation**: Domain admins manage users; system admins
manage infrastructure. Different scopes for different jobs.
---
## 5. Role-Based Access Control (RBAC)
### 5.1 Role Assignments
A role assignment binds an **actor** (user or group) to a **role** on a
**target** (project, domain, or system).
The four assignment types:
| Assignment | Actor | Target | Example |
|---|---|---|---|
| User → Project | User | Project | Alice is `member` of `engineering` |
| Group → Project | Group | Project | `dev-team` group is `member` of `engineering` |
| User → Domain | User | Domain | Alice is `admin` of `acme-domain` |
| Group → Domain | Group | Domain | `ops-team` group is `admin` of `acme-domain` |
Plus **system** role assignments for cloud-wide operations.
### 5.2 Effective Role Assignments
When querying role assignments with `effective=True`, Keystone resolves:
1. **Direct assignments**: Roles explicitly granted.
2. **Group memberships**: Roles inherited from groups the user belongs to.
3. **Inherited roles**: Roles from parent projects or domains (via
`inherited_to_projects` flag).
4. **Implied roles**: Roles implied by other roles (e.g., `admin` → `member`
→ `reader`).
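
The resolution steps above can be sketched as a set union followed by a transitive expansion of implied roles. The function names and the representation of the implication table are assumptions for illustration; Keystone's own resolution works against its assignment backend.

```rust
use std::collections::{BTreeSet, HashMap};

/// Expand a list of held roles with everything they imply, transitively.
pub fn expand_implied<'a>(
    held: &[&'a str],
    implies: &HashMap<&'a str, Vec<&'a str>>,
) -> BTreeSet<String> {
    let mut effective = BTreeSet::new();
    let mut stack: Vec<&'a str> = held.to_vec();
    while let Some(role) = stack.pop() {
        // `insert` returns false for roles already seen, which also
        // protects against cycles in the implication table.
        if effective.insert(role.to_string()) {
            if let Some(next) = implies.get(role) {
                stack.extend(next.iter().copied());
            }
        }
    }
    effective
}

/// Effective roles = direct + group-derived + inherited, then implied expansion.
pub fn effective_roles<'a>(
    direct: &[&'a str],
    via_groups: &[&'a str],
    inherited: &[&'a str],
    implies: &HashMap<&'a str, Vec<&'a str>>,
) -> BTreeSet<String> {
    let all: Vec<&'a str> = direct
        .iter()
        .chain(via_groups)
        .chain(inherited)
        .copied()
        .collect();
    expand_implied(&all, implies)
}
```

With the default chain (`admin` implies `member`, `member` implies `reader`), a user holding only `admin` directly ends up with all three roles.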
### 5.3 Policy Enforcement
Keystone uses `oslo.policy` for policy enforcement. Each OpenStack service
defines policy rules in `policy.yaml` files. A rule maps an API operation to a
check string:
```yaml
"identity:create_project": "role:admin and domain_id:%(target.domain.id)s"
"identity:list_projects": "role:reader"
"identity:update_project": "role:admin or project_id:%(target.project.id)s"
```
Policy rules can check:
- Role membership (`role:admin`)
- Scope type (`system_scope:all`, `domain_id:...`)
- Resource ownership (`user_id:%(target.user.id)s`)
- Arbitrary target attributes
### 5.4 Scope Enforcement in Policy
Since the Queens release, which introduced system scope, policies can require specific token scopes:
```yaml
# System-scoped token required
"identity:list_projects": "role:reader and system_scope:all"
# Project-scoped token required
"nova:create_server": "role:member and project_id:%(target.project.id)s"
```
This prevents:
- Using a project-scoped token for system operations.
- Using a system-scoped token for project operations (without a project context).
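
A reduced model of this check, written as a sketch rather than as `oslo.policy` behaviour: the rule names a required role and a required scope, and both must be satisfied. The `Scope`, `Token`, and `RequiredScope` types are hypothetical, and role matching is exact (implied roles are assumed to be expanded beforehand).

```rust
/// The scope carried by a token.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Scope {
    Unscoped,
    Project(String),
    Domain(String),
    System,
}

pub struct Token {
    pub roles: Vec<String>,
    pub scope: Scope,
}

/// What a policy rule demands of the token's scope.
pub enum RequiredScope<'a> {
    System,
    Project(&'a str),
}

/// Allow only if the token holds the role AND carries the required scope.
pub fn authorize(token: &Token, role: &str, required: &RequiredScope) -> bool {
    let has_role = token.roles.iter().any(|r| r == role);
    let scope_ok = match (required, &token.scope) {
        (RequiredScope::System, Scope::System) => true,
        (RequiredScope::Project(want), Scope::Project(have)) => *want == have.as_str(),
        // Any other combination is a scope mismatch, whatever the roles are.
        _ => false,
    };
    has_role && scope_ok
}
```

The final match arm is what rules out both cases listed above: a matching role never compensates for the wrong scope type.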
---
## 6. Trust Delegation (OS-TRUST)
### 6.1 Overview
Trusts allow one user (**trustor**) to delegate a subset of their authority to
another user (**trustee**) for a limited scope and duration, without sharing
credentials.
**Key properties of a trust:**
| Property | Description |
|---|---|
| `trustor_user_id` | User creating the trust (delegating authority) |
| `trustee_user_id` | User receiving the delegation |
| `project_id` | Project scope for the delegated authority |
| `roles` | Subset of trustor's roles being delegated |
| `impersonation` | If `true`, tokens appear to come from the trustor |
| `expires_at` | Optional expiration timestamp |
| `remaining_uses` | Optional limit on how many tokens can be created from this trust |
| `allow_redelegation` | Whether the trustee can create sub-trusts |
| `redelegation_count` | Maximum depth of redelegation chain |
### 6.2 Trust-Scoped Tokens
When a trustee authenticates using a trust:
1. The trustee authenticates with their own credentials.
2. They specify `trust_id` in the auth request.
3. Keystone issues a **trust-scoped token** with:
- Roles: the intersection of the trust's roles and the trustor's current
roles (if trustor lost a role, the trust is invalidated).
- `OS-TRUST:trust` section in the token body containing trust metadata.
If `impersonation=true`, the token's `user` field shows the trustor — the
trustee acts as the trustor. If `impersonation=false`, the token's `user`
field shows the trustee.
### 6.3 Trust Delegation Chains
Trusts support **redelegation**: a trustee can create a new trust delegating to
a third party. This creates a trust chain:
```
Trustor → Trust(A) → Trustee1
Trustee1 → Trust(B) → Trustee2 (redelegation)
```
Delegation depth is controlled by:
- `allow_redelegation: true/false`
- `redelegation_count: N` (decremented on each redelegation; default max is 3)
**Security constraints:**
- The redelegated trust's roles must be a subset of the original trustor's
roles (not the intermediate trustee's).
- If `impersonation=false` in the source trust, the redelegated trust cannot
set `impersonation=true`.
- Restricted application credentials (the default) cannot create or delete
trusts (prevents automated escalation chains).
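
The redelegation constraints can be expressed as a validation function. This is a sketch under simplifying assumptions: the `Trust` struct keeps only the fields needed for the checks, roles are compared against the source trust (which is itself assumed to hold a subset of the original trustor's roles), and the error names are hypothetical.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone)]
pub struct Trust {
    pub roles: BTreeSet<String>,
    pub impersonation: bool,
    pub allow_redelegation: bool,
    pub redelegation_count: u32, // remaining redelegation depth
}

#[derive(Debug, PartialEq, Eq)]
pub enum RedelegationError {
    NotAllowed,
    DepthExhausted,
    RolesNotSubset,
    ImpersonationEscalation,
}

/// Small helper for building role sets from string literals.
pub fn role_set(names: &[&str]) -> BTreeSet<String> {
    names.iter().map(|s| s.to_string()).collect()
}

/// Derive a new trust from `source`, enforcing the constraints listed above.
pub fn redelegate(
    source: &Trust,
    roles: BTreeSet<String>,
    impersonation: bool,
) -> Result<Trust, RedelegationError> {
    if !source.allow_redelegation {
        return Err(RedelegationError::NotAllowed);
    }
    if source.redelegation_count == 0 {
        return Err(RedelegationError::DepthExhausted);
    }
    if !roles.is_subset(&source.roles) {
        return Err(RedelegationError::RolesNotSubset);
    }
    // Impersonation may be dropped along the chain but never added.
    if impersonation && !source.impersonation {
        return Err(RedelegationError::ImpersonationEscalation);
    }
    Ok(Trust {
        roles,
        impersonation,
        allow_redelegation: true,
        redelegation_count: source.redelegation_count - 1,
    })
}
```

Because the count is decremented on each hop and checked before use, a chain started with `redelegation_count: N` can extend at most `N` further steps.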
### 6.4 Automatic Trust Revocation
Trusts are automatically revoked (soft-deleted) when:
- The trustor is deleted.
- The trustee is deleted.
- The project is deleted.
- The trust expires (`expires_at`).
- The remaining uses are exhausted (`remaining_uses` reaches 0).
- The trustor loses a role that was delegated in the trust.
---
## 7. Application Credentials
### 7.1 Overview
Application credentials allow users to create long-lived, restricted credentials
for applications without exposing their password. This is especially important
for users whose identity comes from LDAP or SSO — applications can't use their
password.
**Key properties:**
| Property | Description |
|---|---|
| `name` | Unique name within the user's application credentials |
| `secret` | Auto-generated or user-provided secret (hashed on storage, shown once) |
| `project_id` | Project scope (always the user's current project) |
| `roles` | Subset of the user's roles on the project (cannot exceed user's roles) |
| `expires_at` | Optional expiration timestamp |
| `unrestricted` | `false` by default — restricted from creating/deleting other app creds and trusts |
### 7.2 Authentication with Application Credentials
```bash
# Auth with application credential ID + secret
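# The host is a placeholder; substitute your Keystone endpoint.
curl -i -X POST https://identity.example.com:5000/v3/auth/tokens \
  -H "Content-Type: application/json" -d '{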
"auth": {
"identity": {
"methods": ["application_credential"],
"application_credential": {
"id": "aa809205ed614a0e854bac92c0768bb9",
"secret": "oKce6DOC_WcZoE13l3eX..."
}
}
}
}'
```
Or by name + user:
```json
"application_credential": {
"name": "monitoring",
"user": { "name": "glance", "domain": { "name": "Default" } },
"secret": "securesecret"
}
```
### 7.3 Restriction Model
By default (`unrestricted=false`), application credentials **cannot**:
- Create or delete other application credentials.
- Create or delete trusts.
- List other application credentials.
This prevents a compromised app credential from regenerating itself or escalating
privileges. Setting `unrestricted=true` removes these restrictions, but adds
risk.
### 7.4 Rotation
Application credentials support **zero-downtime rotation**:
1. Create a new application credential (names must be unique per user).
2. Update the application configuration with the new ID/secret.
3. Delete the old application credential.
Multiple application credentials can coexist for the same user+project,
enabling seamless transitions.
### 7.5 Invalidation
Application credentials are automatically invalidated when:
- The user is deleted or disabled.
- The user's role assignment on the project changes (roles are checked at
auth time against the user's current roles).
- The project is deleted or disabled.
- The credential expires (`expires_at`).
- The credential is explicitly deleted.
---
## 8. Federation
### 8.1 Overview
Keystone's federation module allows external Identity Providers (IdPs) to
authenticate users, with Keystone acting as a Service Provider (SP). Keystone
maps the external identity to local users, groups, and roles.
**Supported protocols:**
| Protocol | Module | Use Case |
|---|---|---|
| **SAML 2.0** | mod_shib / mod_auth_mellon | Enterprise SSO |
| **OpenID Connect** | mod_auth_openidc | OAuth2/OIDC providers (Google, Keycloak, Okta) |
| **Mapped** | Custom auth module | Any HTTP auth module |
| **K2K** | Keystone-to-Keystone | Multi-cloud federation between OpenStack deployments |
### 8.2 Federation Architecture
```
                                      ┌──────────────────┐
                                      │   External IdP   │
                                      │ (SAML/OIDC/...)  │
                                      └────────┬─────────┘
                                               │ SAML assertion or
                                               │ OIDC claims
                                               ▼
┌──────────┐     HTTPD auth module    ┌──────────────────┐
│ Browser  │ ───────────────────────▶ │ Apache/Nginx     │
│ or CLI   │     (mod_shib /          │ + auth module    │
└──────────┘      mod_auth_openidc)   └────────┬─────────┘
                                               │ REMOTE_USER header
                                               │ + other attributes
                                               ▼
                                      ┌──────────────────┐
                                      │ Keystone (SP)    │
                                      │                  │
                                      │ 1. Lookup IdP    │
                                      │ 2. Apply mapping │
                                      │    remote attrs  │
                                      │    → local user, │
                                      │      groups,     │
                                      │      roles       │
                                      │ 3. Issue token   │
                                      └──────────────────┘
```
### 8.3 Key Federation Components
1. **Identity Provider** object — represents the external IdP in Keystone.
Has `remote_ids` (entity IDs) that Keystone uses to match incoming
requests.
2. **Mapping** — a set of rules that transform attributes from the external IdP
into Keystone-local user properties and group memberships. Mappings can:
- Map remote users to local users (by name, email, or other attributes).
- Assign users to local groups (inherit group role assignments).
- Dynamically create projects based on remote attributes.
- Support complex condition logic.
3. **Protocol** — links an Identity Provider to a Mapping. Supported values:
`saml2`, `openid`, `mapped`, or custom.
4. **Mapping rule example:**
```json
[{
"local": [{
"user": { "name": "{0}" },
"group": { "domain": { "name": "Default" }, "name": "federated_users" }
}],
"remote": [{ "type": "REMOTE_USER" }]
}]
```
This maps all authenticated external users to a local user (named by the
`REMOTE_USER` attribute) and adds them to the `federated_users` group.
### 8.4 Federation Token Flow
1. User authenticates with the external IdP.
2. The HTTPD auth module (Apache/Nginx) validates the assertion and sets
`REMOTE_USER` and other headers.
3. Keystone receives the request at `/v3/OS-FEDERATION/identity_providers/{idp}/protocols/{protocol}/auth`.
4. Keystone applies the mapping rules to produce a local user + groups + roles.
5. Keystone issues a **federated unscoped token**.
6. The user can then exchange it for a scoped token (project, domain, or
system) just like any other unscoped token.
### 8.5 Identity Provider (Keystone as IdP)
Keystone can also act as an **Identity Provider** (SAML IdP), allowing it to
authenticate users from other OpenStack deployments (K2K federation) or other
SAML SPs.
---
## 9. Service Catalog Deep Dive
### 9.1 Service Registration
Services are registered with Keystone via the API:
```bash
openstack service create --name nova --description "Compute" compute
openstack endpoint create --region RegionOne compute public https://nova.example.com:8774/
openstack endpoint create --region RegionOne compute internal https://nova.internal:8774/
openstack endpoint create --region RegionOne compute admin https://nova.admin:8774/
```
### 9.2 Catalog Filtering
The catalog returned in a token is filtered by:
1. **Scope**: A project-scoped token includes endpoints filtered by
project-endpoint associations.
2. **Endpoint groups**: Admins can define endpoint groups (filtered by service
type, region, or interface) and associate them with projects.
3. **Enabled/disabled**: Disabled services and endpoints don't appear in the
catalog.
4. **Interface visibility**: `public`, `internal`, and `admin` endpoints serve
different audiences.
### 9.3 URL Templating
Endpoint URLs support template variables:
- `$(project_id)s` — replaced with the token's project ID
- `$(user_id)s` — replaced with the token's user ID
Example:
```
https://object-store.example.com/v1/KEY_$(project_id)s
```
When a project-scoped token is issued, the catalog resolves this to:
```
https://object-store.example.com/v1/KEY_d12af07f4e2c4390a21acc31517ebec9
```
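
The substitution step can be sketched as a plain string replacement. The function name is hypothetical, and returning `None` when a placeholder has no value is an assumption about how an unresolvable endpoint might be handled, not a description of Keystone's code.

```rust
/// Resolve `$(project_id)s` and `$(user_id)s` in an endpoint URL template.
/// Returns `None` if the template needs a value the token does not carry.
pub fn render_endpoint_url(
    template: &str,
    project_id: Option<&str>,
    user_id: Option<&str>,
) -> Option<String> {
    let mut url = template.to_string();
    for (placeholder, value) in [("$(project_id)s", project_id), ("$(user_id)s", user_id)] {
        if url.contains(placeholder) {
            // `?` bails out with `None` when the value is missing, for
            // example a project placeholder with an unscoped token.
            url = url.replace(placeholder, value?);
        }
    }
    Some(url)
}
```

Templates without placeholders pass through unchanged, so the same function can be applied to every endpoint in the catalog.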
### 9.4 Client Discovery
An OpenStack client authenticates with Keystone, receives a token (which
includes the service catalog), and then uses the catalog to discover the URL
for any service it needs:
```python
# After authentication, the catalog is in the token response body.
# `token` is the object under the top-level "token" key of that body.
nova_url = None
for service in token['catalog']:
    if service['type'] != 'compute':
        continue
    for endpoint in service['endpoints']:
        if endpoint['interface'] == 'public':
            nova_url = endpoint['url']
            break
```
This is how every OpenStack client discovers service endpoints — they never
hardcode URLs. They authenticate once, get the catalog, and dynamically route
to the correct endpoint.
---
## 10. Mapping to alknet Concepts
### 10.1 Concept Comparison Table
| Keystone Concept | alknet Concept | Notes |
|---|---|---|
| Domain | (Not directly mapped) | alknet is single-tenant/small-team focused; no need for domain-level admin boundaries yet |
| Project | `Identity.resources` | Projects scope resources; alknet's `resources: HashMap<String, Vec<String>>` serves a similar scoping purpose |
| User | `Identity.id` | Keystone users ↔ alknet identities (fingerprint or UUID) |
| Group | (Not directly mapped) | Could be added via `Identity.scopes` patterns or a groups concept in alknet-storage |
| Role | `Identity.scopes` | Keystone roles map to alknet scopes: `["relay:connect", "service:gitea:read"]` ≈ role assignments |
| Token (scoped) | `AuthToken` + scoped permissions | alknet's AuthToken proves identity + timestamp; scopes come from IdentityProvider lookup |
| Service Catalog | `OperationRegistry` + OpenAPI spec generation | Both solve service discovery; Keystone's catalog is a runtime API, while alknet generates its registry from OpenAPI specs |
| Trust Delegation | (Potential future model) | alknet doesn't have delegation yet; trust model could inspire future `DelegationToken` |
| Application Credentials | API keys in `api_keys` table | alknet's `api_keys` table parallels app creds: long-lived, scoped, user-bound |
| Federation (SAML/OIDC) | Phase D OIDC provider aspiration | alknet wants to *be* an OIDC provider; Keystone consumes external IdPs |
| Service Endpoint | (Implicit in OperationEnv) | alknet operations are discovered via registry, not external endpoint lookup |
| Policy (policy.yaml) | `ForwardingPolicy` + call protocol ACL | Both enforce "who can do what where"; alknet is code-based, not YAML-configured |
### 10.2 What to Adopt from Keystone
#### 10.2.1 Scoped Tokens (Strong Adopt)
**Keystone pattern**: Unscoped → project/domain/system scoped token flow.
**alknet application**: Currently, `AuthToken` proves identity with a timestamp.
`Identity.scopes` and `Identity.resources` are resolved *after* token
verification by `IdentityProvider`. This is analogous to Keystone's flow:
| Keystone | alknet |
|---|---|
| Unscoped token (identity only) | AuthToken (proves key possession + timestamp) |
| Scoped token (identity + roles + catalog) | Identity (resolved by IdentityProvider with scopes + resources) |
| Re-auth with scope | Not needed — alknet scopes come from the `IdentityProvider` lookup |
**Recommendation**: alknet's current model is already similar to Keystone's, but
more streamlined. alknet doesn't need a separate "re-auth with scope" step
because the `IdentityProvider` resolution *is* the scoping step. However,
consider adding explicit scope fields to the token in the future for
multi-tenant deployments.
#### 10.2.2 Service Catalog Pattern (Strong Adopt)
**Keystone pattern**: Services register endpoints; clients discover them from
the token/catalog.
**alknet application**: The `OperationRegistry` + `OpenAPIServiceRegistry`
serves a similar purpose:
- Keystone: `POST /v3/auth/tokens` → response includes catalog of services
and URLs.
- alknet: `OperationRegistry` knows all available operations; `FromOpenAPI`
generates them from specs.
**Key difference**: In Keystone, the catalog is returned *with the token* and
is dynamic (filtered by project scope). In alknet, the registry is built at
startup from configuration, and access control is enforced per-operation in the
call protocol.
**Recommendation**: Consider adding a "service discovery" operation to the
call protocol — a way for clients to ask "what operations are available to me?"
This would be analogous to Keystone's `GET /v3/auth/catalog`.
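A rough sketch of what such a discovery operation could compute is shown here, in Python for brevity rather than alknet's Rust. Everything in it is an assumption made for illustration: the registry is modeled as a plain mapping from operation name to required scope, the operation names are invented, and `*` wildcards in granted scopes are matched with `fnmatchcase`.

```python
from fnmatch import fnmatchcase


def list_available_operations(registry: dict, granted_scopes: list) -> list:
    """Return the operations whose required scope matches a granted scope."""
    return sorted(
        operation
        for operation, required in registry.items()
        if any(fnmatchcase(required, pattern) for pattern in granted_scopes)
    )


# Hypothetical registry: operation name -> scope required to call it.
registry = {
    "relay.connect": "relay:connect",
    "gitea.read_repo": "service:gitea:read",
    "gitea.push": "service:gitea:write",
}

visible = list_available_operations(
    registry, ["relay:connect", "service:gitea:read"]
)
print(visible)
```

Filtering by the caller's scopes gives the same effect as Keystone's project-filtered catalog: a client only learns about operations it could actually invoke.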
#### 10.2.3 Role Hierarchies and Implied Roles (Moderate Adopt)
**Keystone pattern**: Roles can imply other roles (`admin` → `member` →
`reader`). Role assignments on domains propagate to projects via inheritance.
**alknet application**: Currently, alknet's scopes are flat strings. Consider:
```
admin:service:* → implies → member:service:* → implies → reader:service:*
```
This would simplify scope assignment in the `IdentityProvider`: grant `admin:service:*`
and automatically get `member` and `reader` permissions.
**Recommendation**: Implement implied scopes as a Phase 2+ feature when
alknet-storage adds the ACL graph. Don't over-engineer in Phase 1.
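As a minimal sketch of the idea, implied scopes can be expanded transitively before any access check. The example is in Python for brevity and is not alknet code: the `IMPLIES` table and the `role:rest` scope layout are assumptions taken from the chain shown above.

```python
# Assumed implication table, mirroring admin -> member -> reader.
IMPLIES = {"admin": ["member"], "member": ["reader"]}


def expand_scopes(granted: list) -> set:
    """Return the granted scopes plus every scope they imply."""
    expanded = set()
    stack = list(granted)
    while stack:
        scope = stack.pop()
        if scope in expanded:
            continue
        expanded.add(scope)
        role, separator, rest = scope.partition(":")
        if not separator:
            continue  # not in role:rest form, nothing to imply
        for implied_role in IMPLIES.get(role, []):
            stack.append(f"{implied_role}:{rest}")
    return expanded


print(sorted(expand_scopes(["admin:service:*"])))
```

Expanding at resolution time keeps the stored assignments small while leaving the per-operation check a simple set lookup.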
#### 10.2.4 Application Credentials (Strong Adopt, already parallels)
**Keystone pattern**: Password-less auth with restricted capabilities, tied to a
user and project, with expiration and rotation support.
**alknet application**: The `api_keys` table in alknet-storage is exactly this:
| Keystone App Credential | alknet API Key |
|---|---|
| `id` + `secret` | `key_prefix` + `key_hash` |
| `roles` (subset of user's roles) | `scopes` (subset of account's scopes) |
| `project_id` (scope) | Account-scoped |
| `expires_at` | `expires_at` |
| `unrestricted` | (not yet implemented) |
| Rotation via create-new-then-delete | (not yet implemented) |
**Recommendation**: Add the `unrestricted` concept to API keys — by default,
API keys should NOT be able to create or delete other API keys or modify
account settings. Also add rotation support (create new key, update config,
delete old key).
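One possible shape for the `unrestricted` check is sketched below in Python. The field names follow the table above, but the `ApiKey` class, the `SENSITIVE_OPERATIONS` set, and the scope strings are hypothetical and do not reflect the actual alknet-storage schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumed set of operations that only unrestricted keys may perform.
SENSITIVE_OPERATIONS = frozenset(
    {"api_keys:create", "api_keys:delete", "account:update"}
)


@dataclass
class ApiKey:
    scopes: frozenset
    expires_at: Optional[datetime] = None
    unrestricted: bool = False  # restricted by default


def is_allowed(key: ApiKey, operation: str, now: datetime) -> bool:
    """Reject expired keys and sensitive operations on restricted keys."""
    if key.expires_at is not None and now >= key.expires_at:
        return False
    if operation in SENSITIVE_OPERATIONS and not key.unrestricted:
        return False
    return operation in key.scopes


now = datetime(2026, 6, 8, tzinfo=timezone.utc)
ci_key = ApiKey(scopes=frozenset({"service:gitea:read", "api_keys:create"}))
print(is_allowed(ci_key, "service:gitea:read", now))
print(is_allowed(ci_key, "api_keys:create", now))
```

Even when a sensitive scope is present on the key, the restricted default blocks it, which is the behavior the recommendation asks for.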
#### 10.2.5 Trust Delegation (Future Consideration)
**Keystone pattern**: Trustor delegates limited authority to trustee with
impersonation, expiration, usage limits, and redelegation chains.
**alknet application**: alknet doesn't have this yet, but it could be useful
for:
- **Service-to-service auth**: An alknet node delegates limited authority to a
service wrapper (e.g., "let the rustfs wrapper access S3 on my behalf for 1
hour").
- **Temporary access grants**: "Give Alice access to the `engineering` scope
for 24 hours."
- **Impersonation for audit**: Trusted services acting on behalf of a user,
with the user's identity appearing in audit logs.
**Recommendation**: Design a `DelegationToken` or `Trust` model when
alknet-storage is built. The trust model — trustor, trustee, roles, expiration,
remaining_uses — is a good template.
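To make the template concrete, here is a hedged Python sketch of a delegation record with expiration and a usage counter. `DelegationToken` does not exist in alknet yet; the class, its `consume` method, and the scope strings are assumptions for illustration, and impersonation and redelegation are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class DelegationToken:
    trustor: str
    trustee: str
    scopes: frozenset
    expires_at: datetime
    remaining_uses: Optional[int] = None  # None means unlimited

    def consume(self, caller: str, scope: str, now: datetime) -> bool:
        """Authorize one use by the trustee, decrementing the counter."""
        if caller != self.trustee or now >= self.expires_at:
            return False
        if scope not in self.scopes:
            return False
        if self.remaining_uses is not None:
            if self.remaining_uses <= 0:
                return False
            self.remaining_uses -= 1
        return True


now = datetime(2026, 6, 8, 12, 0, tzinfo=timezone.utc)
grant = DelegationToken(
    trustor="node-a",
    trustee="rustfs-wrapper",
    scopes=frozenset({"service:rustfs:read"}),
    expires_at=now + timedelta(hours=1),
    remaining_uses=2,
)
print(grant.consume("rustfs-wrapper", "service:rustfs:read", now))
```

Rejected calls do not consume a use, so a misbehaving third party cannot exhaust the trustee's budget.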
#### 10.2.6 Federation (Phase D Alignment)
**Keystone pattern**: External IdPs (SAML, OIDC) authenticate users; Keystone
maps them to local identities via mapping rules.
**alknet application**: Phase D of `credential-provider.md` envisions alknet
*as* an OIDC provider for self-hosted services. This is the **inverse** of
Keystone's federation model:
- Keystone: external IdP → Keystone (SP) → local identity
- alknet Phase D: alknet (IdP) → rustfs/gitea (SP) → local identity on self-hosted service
**Key learning from Keystone's federation model**:
1. **Mapping rules** are critical. Keystone's mapping engine (`local` ← `remote`)
is how IdP attributes become local roles. alknet will need the inverse:
`Identity.scopes` → OIDC claims → rustfs/gitea policies.
2. **Group membership from federation** is temporary by default (valid for
token lifetime). alknet should consider whether federated identities are
permanent or session-scoped.
3. **Multiple IdP support**: Keystone can consume from multiple external IdPs.
alknet Phase D should support multiple SPs (multiple self-hosted services)
consuming from one alknet IdP.
**Recommendation**: When building Phase D, study Keystone's mapping rule
format. alknet will need a similar concept: `alknet.scope → oidc.claim →
service.policy`. This could be part of the `CredentialProvider` or a new
`IdentityMappingProvider`.
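The `scope → claim` half of that mapping might look like the following Python sketch. The rule format is invented for illustration and is not Keystone's mapping syntax; the scope names are hypothetical, and the claim value `RustFS.ConsoleAdmin` is only an example of a role a downstream policy could match on.

```python
# Hypothetical mapping rules: alknet scope -> OIDC claim name and value.
MAPPING_RULES = [
    {"scope": "service:rustfs:admin", "claim": "roles", "value": "RustFS.ConsoleAdmin"},
    {"scope": "service:rustfs:read", "claim": "groups", "value": "readonly"},
    {"scope": "service:gitea:read", "claim": "groups", "value": "gitea-readers"},
]


def scopes_to_claims(subject: str, scopes: set) -> dict:
    """Build the claims to embed in an ID token for this identity."""
    claims = {"sub": subject}
    for rule in MAPPING_RULES:
        if rule["scope"] in scopes:
            claims.setdefault(rule["claim"], []).append(rule["value"])
    return claims


print(scopes_to_claims("alice", {"service:rustfs:admin", "service:gitea:read"}))
```

Keeping the rules as data rather than code would let an operator adjust the mapping per service without rebuilding alknet.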
### 10.3 What NOT to Adopt from Keystone
#### 10.3.1 Domains (Not Needed)
Keystone's domain model is designed for multi-tenant cloud hosting where
different organizations share the same OpenStack deployment. alknet is designed
for self-hosted, single-organization or small-team deployments. The domain
concept adds complexity that doesn't justify itself in alknet's use case.
alknet's `Identity.resources` already provides a lightweight scoping mechanism
that covers the "which resources does this identity have access to" use case
without the overhead of a domain hierarchy.
#### 10.3.2 Separate Policy Engine (Over-Engineering)
Keystone's `oslo.policy` is a full YAML-based policy engine with complex rule
combinations (`role:admin AND domain_id:%(target.domain.id)s OR
project_id:%(target.project.id)s`). alknet's authorization model is
programmatic (Rust code in `ForwardingPolicy` and call protocol handlers), not
configured via YAML. This is appropriate for alknet's size and complexity.
**If** alknet needs configurable policies in the future (e.g., admin-editable
ACL rules stored in the database), a simple rule engine would suffice — not the
full oslo.policy model.
#### 10.3.3 Multiple Token/Scope Types (Unnecessary Complexity)
Keystone has separate token types for project/domain/system scope. alknet's
`AuthToken` is already simpler: it proves identity + timestamp, and the
`IdentityProvider` resolves scopes. There's no need for alknet to issue
different token types for different scopes.
If multi-tenancy is added in the future, the `Identity.resources` map can
encode project equivalents without needing a separate token type.
#### 10.3.4 Service Endpoint Registration (Unnecessary)
Keystone requires every service to register its endpoints in the catalog
before it can be discovered. alknet services are registered programmatically
(via `OperationRegistry::register()`) at startup, not via a central API. The
`OperationRegistry` is built from configuration and OpenAPI specs, not from a
catalog service.
This is appropriate for alknet's architecture: services are known at deploy
time, not dynamically registered. If dynamic service discovery is needed later,
a simple registry operation in the call protocol would suffice.
---
## 11. Summary of Recommendations
| Keystone Concept | Adoption Level | alknet Implementation |
|---|---|---|
| **Scoped tokens** | ✅ Strong Adopt | Already present in IdentityProvider resolution (AuthToken → Identity with scopes/resources) |
| **Service catalog** | ✅ Strong Adopt | `OperationRegistry` + `FromOpenAPI`; consider adding "list operations" discovery |
| **Application credentials** | ✅ Strong Adopt | `api_keys` table parallels exactly; add `unrestricted` flag and rotation support |
| **Role hierarchies / implied roles** | ⚡ Moderate | Implied scope hierarchies in Phase 2+ when ACL graph is built |
| **Trust delegation** | ⚡ Moderate | Design `DelegationToken` model for service-to-service and temporary access in Phase 2+ |
| **Federation mapping** | ⚡ Moderate | Phase D: adopt `scope → claim → policy` mapping pattern for OIDC provider |
| **Token revocation events** | ⚡ Moderate | Consider pattern-matching revocation for efficiency when alknet-storage supports it |
| **Domains** | ❌ Skip | alknet is self-hosted/small-team; `Identity.resources` provides lightweight scoping |
| **oslo.policy (YAML-based)** | ❌ Skip | alknet uses programmatic auth (Rust code); add simple rule engine only if needed |
| **Multiple token types** | ❌ Skip | One token type with scope resolution via `IdentityProvider` is sufficient |
| **Endpoint registration API** | ❌ Skip | `OperationRegistry` is configured at startup, not via a catalog API |
---
## 12. References
- [Keystone Architecture — OpenStack Docs](https://docs.openstack.org/keystone/2024.2/getting-started/architecture.html)
- [Keystone Tokens Overview](https://docs.openstack.org/keystone/latest/admin/tokens-overview.html)
- [Keystone Service Catalog Overview](https://docs.openstack.org/keystone/latest/contributor/service-catalog.html)
- [Keystone Trusts Documentation](https://docs.openstack.org/keystone/latest/user/trusts.html)
- [Keystone Application Credentials](https://docs.openstack.org/keystone/queens/user/application_credentials.html)
- [Keystone Federation Configuration](https://docs.openstack.org/keystone/latest/admin/federation/configure_federation.html)
- [Keystone RBAC and Authorization — DeepWiki](https://deepwiki.com/openstack/keystone/4-authorization-and-access-control)
- [Keystone Authentication and Token Management — DeepWiki](https://deepwiki.com/openstack/keystone/3-authentication-and-token-management)
- [Keystone Trust Delegation — DeepWiki](https://deepwiki.com/openstack/keystone/4.4-trust-delegation)
- [Keystone Service Catalog — DeepWiki](https://deepwiki.com/openstack/keystone/5.4-service-catalog)
- [Keystone Token Revocation — DeepWiki](https://deepwiki.com/openstack/keystone/3.4-token-revocation)
- [Understanding OpenStack Keystone: Scoped vs. Unscoped Tokens](https://osie.io/blog/understanding-openstack-keystone-scoped-vs-unscoped-tokens)
- [Trust Delegation in OpenStack Using Keystone Trusts](https://blog.zhaw.ch/icclab/trust-delegation-in-openstack-using-keystone-trusts/)
- [OpenStack Knowledge: Keystone Federation](https://github.com/stackers-network/openstack-knowledge/blob/main/core/identity/federation.md)
- [alknet identity.md](../../architecture/identity.md)
- [alknet auth.md](../../architecture/auth.md)
- [alknet credential-provider.md](../phase2/credential-provider.md)


@@ -0,0 +1,732 @@
# RustFS Reference Document
> Status: Research Complete
> Last updated: 2026-06-08
> Source: /workspace/rustfs/ (cloned repository, v1.0.0-beta.7)
> Context: alknet internal service integration research
---
## 1. Architecture Overview
### What is RustFS?
RustFS is a high-performance, distributed, S3-compatible object storage system written in Rust. It is an Apache 2.0-licensed alternative to MinIO that combines S3 API compatibility with OpenStack Swift/Keystone support, designed for data lake, AI, and big data workloads.
**Key characteristics:**
- Language: Rust (edition 2024, MSRV 1.95.0)
- License: Apache 2.0 (no AGPL restrictions)
- Workspace: 57 crates in a flat `crates/` layout
- Main binary: `rustfs/` (75K lines); core engine: `crates/ecstore/` (87K lines)
- Version: 1.0.0-beta.7
### Ports and Endpoints
| Port | Purpose |
|------|---------|
| 9000 | S3 API (primary data path) + Admin API (`/minio/` prefix) |
| 9001 | Web Console UI |
### Request Flow
```
HTTP request
→ server (TLS, auth, routing, compression)
→ app/object_usecase (validation, policy, lifecycle)
→ storage/ecfs (erasure coding, encryption, checksums)
→ ecstore (disk pool selection, data distribution)
→ rio (reader pipeline: encrypt → compress → hash → write)
→ io-core (zero-copy I/O, buffer pool, direct I/O)
→ local disk / remote disk via RPC
```
### Key Crate Map (Security & Auth Focus)
| Crate | Lines | Purpose |
|-------|-------|---------|
| `credentials` | 713 | Credential types (access key / secret key), global credentials |
| `signer` | 1.4K | AWS Signature V4 request signing |
| `iam` | 9.0K | Identity and Access Management (users, groups, policies, OIDC) |
| `policy` | 8.8K | S3 bucket/IAM policy engine |
| `keystone` | 1.9K | OpenStack Keystone auth integration |
| `appauth` | 143 | Application-level auth tokens |
| `crypto` | 1.6K | Encryption primitives |
| `kms` | 8.1K | Key management service integration |
| `protocols` | 18K | FTP/FTPS, WebDAV, Swift API support |
| `s3-ops` | — | S3 operation definitions and mapping |
| `s3-types` | — | S3 event type definitions |
### Startup Sequence (Auth-Relevant Steps)
1. Environment variable compatibility (`MINIO_*` → `RUSTFS_*`)
2. Tokio runtime construction
3. CLI argument parsing
4. Config parsing, credentials/endpoints initialization
5. HTTP server start (S3 API + optional console)
6. ECStore initialization
7. **Steps 13: Bucket metadata, IAM, Keystone, OIDC** initialization
8. FullReady → serving requests
---
## 2. S3 API Compatibility
### Supported S3 Operations
RustFS implements a substantial subset of the S3 API via the `s3s` crate (a fork/custom build at `https://github.com/rustfs/s3s`). Based on the feature status table and crate structure:
| Category | Status | Details |
|----------|--------|---------|
| Core Object Ops (GET/PUT/DELETE/HEAD) | ✅ Available | Primary data path |
| Multipart Upload | ✅ Available | Upload, download, multipart |
| Versioning | ✅ Available | Object versioning |
| Bucket Operations | ✅ Available | Create, list, delete, metadata |
| Logging | ✅ Available | Access logging |
| Event Notifications | ✅ Available | Webhook, Kafka, AMQP, MQTT, NATS targets |
| Bitrot Protection | ✅ Available | Checksums at storage layer |
| Single Node Mode | ✅ Available | Single-node deployment |
| Bucket Replication | ✅ Available | Cross-region replication |
| KMS | 🚧 Under Testing | Key management service |
| Lifecycle Management | 🚧 Under Testing | Object lifecycle rules |
| Distributed Mode | 🚧 Under Testing | Multi-node erasure coding |
| Admin API | ✅ Available | `/minio/` prefix, 30+ handler modules |
| Console | ✅ Available | Web UI on port 9001 |
| S3 Select | ✅ Available | `s3select-api` + `s3select-query` crates |
| WebDAV | ✅ Available | `protocols` crate, `dav-server` |
| FTP/FTPS | ✅ Available | `libunftp`, `suppaftp` |
| SFTP | — | `russh` + `russh-sftp` crate deps |
### Authentication Methods
RustFS supports multiple authentication methods (derived from `auth.rs`):
| Auth Type | Constant | Detection |
|-----------|----------|-----------|
| AWS Signature V4 (header) | `Signed` | `Authorization: AWS4-HMAC-SHA256 ...` |
| AWS Signature V4 (query) | `Presigned` | `X-Amz-Credential` in query |
| AWS Signature V2 (header) | `SignedV2` | `Authorization: AWS ...` |
| AWS Signature V2 (query) | `PresignedV2` | `AWSAccessKeyId` in query |
| Streaming V4 | `StreamingSigned` | `x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD` |
| Streaming V4 (trailer) | `StreamingSignedTrailer` | `STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER` |
| Unsigned payload (trailer) | `StreamingUnsignedTrailer` | `STREAMING-UNSIGNED-PAYLOAD-TRAILER` |
| POST policy | `PostPolicy` | `multipart/form-data` content type |
| Bearer JWT | `JWT` | `Authorization: Bearer ...` |
| STS | `STS` | `Action` header presence |
| Anonymous | `Anonymous` | No `Authorization` header |
| Keystone token | — | `X-Auth-Token` header (via middleware) |
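The detection rules in the table can be approximated with a small classifier. This Python sketch covers only a subset of the rows (POST policy, STS, and the unsigned-trailer variant are omitted), the order of checks is an assumption, and the `KeystoneToken` and `Unknown` labels are ours since the table lists no constant for them. It is not a transcription of RustFS's `auth.rs`.

```python
def detect_auth_type(headers: dict, query: dict) -> str:
    """Classify a request by its auth headers and query parameters."""
    h = {name.lower(): value for name, value in headers.items()}
    if "x-auth-token" in h:
        return "KeystoneToken"  # handled by middleware in RustFS
    auth = h.get("authorization", "")
    payload = h.get("x-amz-content-sha256", "")
    if auth.startswith("AWS4-HMAC-SHA256"):
        if payload == "STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER":
            return "StreamingSignedTrailer"
        if payload == "STREAMING-AWS4-HMAC-SHA256-PAYLOAD":
            return "StreamingSigned"
        return "Signed"
    if auth.startswith("AWS "):
        return "SignedV2"
    if auth.startswith("Bearer "):
        return "JWT"
    if "X-Amz-Credential" in query:
        return "Presigned"
    if "AWSAccessKeyId" in query:
        return "PresignedV2"
    if not auth:
        return "Anonymous"
    return "Unknown"


print(detect_auth_type({"Authorization": "AWS4-HMAC-SHA256 Credential=..."}, {}))
```

A classifier like this is useful on the alknet side mainly for logging and routing decisions; the actual verification stays in RustFS.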
### S3 Request Signing
The `rustfs-signer` crate implements AWS Signature V4. The general flow:
1. Client computes a canonical request (method + path + query + headers + payload hash)
2. Client creates a string to sign (algorithm + timestamp + credential scope + canonical request hash)
3. Client computes HMAC-SHA256 signature using the secret key
4. Client sends the `Authorization` header with the signature
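The four steps can be sketched with Python's standard library. This is a simplified illustration of AWS Signature V4, not the `rustfs-signer` code: it assumes the path and query string are already canonically encoded and signs every header it is given. Note that in step 3 the HMAC key is not the raw secret but a signing key derived from it through the date, region, and service. The endpoint, region, and credentials in the usage lines are placeholders.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def canonical_request(method, path, query, headers, payload: bytes):
    """Step 1: method, path, query, sorted lowercase headers, payload hash."""
    items = sorted((k.lower(), v.strip()) for k, v in headers.items())
    canonical_headers = "".join(f"{k}:{v}\n" for k, v in items)
    signed_headers = ";".join(k for k, _ in items)
    payload_hash = hashlib.sha256(payload).hexdigest()
    request = "\n".join(
        [method, path, query, canonical_headers, signed_headers, payload_hash]
    )
    return request, signed_headers


def sign(secret_key, amz_date, region, service, request):
    """Steps 2 and 3: build the string to sign, derive the key, sign."""
    date = amz_date[:8]
    scope = f"{date}/{region}/{service}/aws4_request"
    request_hash = hashlib.sha256(request.encode("utf-8")).hexdigest()
    string_to_sign = "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, request_hash])
    key = ("AWS4" + secret_key).encode("utf-8")
    for part in (date, region, service, "aws4_request"):
        key = _hmac_sha256(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256)
    return scope, signature.hexdigest()


# Placeholder request against an internal RustFS endpoint.
amz_date = "20260608T120000Z"
headers = {"Host": "rustfs:9000", "X-Amz-Date": amz_date}
request, signed_headers = canonical_request("GET", "/mybucket", "", headers, b"")
scope, signature = sign("rustfsadmin", amz_date, "us-east-1", "s3", request)

# Step 4: the Authorization header sent with the request.
authorization = (
    f"AWS4-HMAC-SHA256 Credential=rustfsadmin/{scope}, "
    f"SignedHeaders={signed_headers}, Signature={signature}"
)
print(authorization)
```

Real S3 clients also send the payload hash in an `x-amz-content-sha256` header and sign it; that is left out here to keep the sketch short.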
---
## 3. OpenStack Swift and Keystone Integration
### Swift API
RustFS provides an **OpenStack Swift-compatible API** as an opt-in feature (behind the `swift` cargo feature flag). This is implemented in `crates/protocols/src/swift/`.
**Swift API endpoint pattern:** `/v1/AUTH_{project_id}/...`
**Supported Swift operations:**
- Container CRUD (create, list, delete, metadata)
- Object CRUD with streaming downloads
- Keystone token authentication
- Multi-tenant isolation with SHA256-based bucket prefixing
- Server-side object copy (COPY method)
- HTTP Range requests (206/416 responses)
- Custom metadata (X-Object-Meta-*, X-Container-Meta-*)
**Not yet implemented:** Account-level ops, large object support (>5GB), object versioning, container ACLs/CORS, TempURL, XML/plain-text response formats.
**Tenant isolation:** Swift containers are mapped to S3 buckets with a secure hash prefix:
```
Swift: /v1/AUTH_abc123/mycontainer
→ S3 Bucket: {sha256(abc123)[0:16]}-mycontainer
```
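A minimal Python sketch of this mapping follows. It assumes the prefix is the first 16 characters of the hex-encoded SHA-256 digest of the project ID; the exact encoding RustFS uses is not stated here, so treat the output format as illustrative. `swift_container_to_bucket` is a hypothetical name.

```python
import hashlib


def swift_container_to_bucket(project_id: str, container: str) -> str:
    """Map a Swift container to a tenant-prefixed S3 bucket name."""
    digest = hashlib.sha256(project_id.encode("utf-8")).hexdigest()
    return f"{digest[:16]}-{container}"


bucket = swift_container_to_bucket("abc123", "mycontainer")
print(bucket)
```

Hashing the project ID keeps bucket names a fixed, predictable length and avoids leaking the raw tenant identifier into the S3 namespace.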
### Keystone Authentication — Complete Flow
This is the most auth-relevant subsystem for alknet integration.
#### Configuration (Environment Variables)
| Variable | Description | Default |
|----------|-------------|---------|
| `RUSTFS_KEYSTONE_ENABLE` | Enable Keystone auth | `false` |
| `RUSTFS_KEYSTONE_AUTH_URL` | Keystone endpoint URL | (required) |
| `RUSTFS_KEYSTONE_VERSION` | API version (`v3` or `v2.0`) | `v3` |
| `RUSTFS_KEYSTONE_ADMIN_USER` | Admin username | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PASSWORD` | Admin password | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PROJECT` | Admin project/tenant | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_DOMAIN` | Admin domain | `Default` |
| `RUSTFS_KEYSTONE_VERIFY_SSL` | Verify TLS certificates | `true` |
| `RUSTFS_KEYSTONE_ENABLE_CACHE` | Enable token caching | `true` |
| `RUSTFS_KEYSTONE_CACHE_SIZE` | Token cache capacity | `10000` |
| `RUSTFS_KEYSTONE_CACHE_TTL` | Token cache TTL (seconds) | `300` |
| `RUSTFS_KEYSTONE_TENANT_PREFIX` | Enable tenant project prefixing | `true` |
| `RUSTFS_KEYSTONE_IMPLICIT_TENANTS` | Auto-create tenants | `true` |
| `RUSTFS_KEYSTONE_TIMEOUT` | Request timeout (seconds) | `30` |
#### Architecture: Component Stack
```
KeystoneClient (HTTP calls to Keystone v3 API)
KeystoneAuthProvider (Authentication + Caching via moka::future::Cache)
KeystoneAuthMiddleware (Tower layer, intercepts HTTP requests)
↓ (task-local: KEYSTONE_CREDENTIALS)
IAMAuth → check_key_valid (Authorization)
RustFS Credentials (access_key starts with "keystone:")
```
#### Authentication Flow
**Request with `X-Auth-Token` header:**
1. **Middleware intercepts:** `KeystoneAuthMiddleware` extracts `X-Auth-Token` header
2. **Cache check:** Token cache hit → return cached credentials (~1-2ms)
3. **Token validation:** Cache miss → `KeystoneClient.validate_token()` → `GET /v3/auth/tokens` with `X-Auth-Token` and `X-Subject-Token` headers
4. **Token parsing:** Parse `KeystoneToken` (user_id, username, project_id, project_name, domain, roles, expires_at)
5. **Credential mapping:** Convert to `Credentials` struct:
   - `access_key`: `keystone:<user_id>` (special prefix identifies Keystone users)
   - `secret_key`: `""` (empty; bypasses AWS SigV4 verification)
   - `session_token`: the Keystone token string
   - `parent_user`: Keystone username
   - `groups`: roles list
   - `claims`: JSON map with `keystone_user_id`, `keystone_project_id`, `keystone_roles`, `auth_source: "keystone"`
6. **Task-local storage:** Store credentials in `KEYSTONE_CREDENTIALS` task-local (async-scoped to request)
7. **Auth bypass:** IAMAuth detects `keystone:` prefix → returns empty secret key, bypassing SigV4
8. **Authorization:** `check_key_valid()` retrieves credentials from task-local storage
9. **Role check:** `admin` or `reseller_admin` roles → `is_owner=true`; other roles → `is_owner=false`
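Steps 5 and 9 of the flow above can be sketched as two small functions. This is a Python illustration of the described mapping, not RustFS code: the validated token is modeled as a plain dict with the field names from step 4, and the function names are hypothetical.

```python
OWNER_ROLES = {"admin", "reseller_admin"}


def keystone_token_to_credentials(token: dict, raw_token: str) -> dict:
    """Step 5: map a validated Keystone token to RustFS-style credentials."""
    return {
        "access_key": f"keystone:{token['user_id']}",
        "secret_key": "",  # empty, so SigV4 verification is bypassed
        "session_token": raw_token,
        "parent_user": token["username"],
        "groups": list(token["roles"]),
        "claims": {
            "keystone_user_id": token["user_id"],
            "keystone_project_id": token["project_id"],
            "keystone_roles": list(token["roles"]),
            "auth_source": "keystone",
        },
    }


def is_owner(credentials: dict) -> bool:
    """Step 9: admin or reseller_admin roles grant owner status."""
    return any(role in OWNER_ROLES for role in credentials["groups"])


token = {
    "user_id": "u-42",
    "username": "alice",
    "project_id": "p-7",
    "roles": ["member"],
}
credentials = keystone_token_to_credentials(token, "gAAAA-example-token")
print(credentials["access_key"], is_owner(credentials))
```

The `keystone:` prefix is what later lets the auth layer recognize these credentials and skip signature verification.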
**Request without `X-Auth-Token`:**
1. Middleware passes through unchanged
2. Standard AWS SigV4 authentication proceeds
3. IAM validation as normal
**Invalid token:**
1. Middleware returns `401 Unauthorized` immediately with XML error body
2. **No fallback** to standard S3 auth
#### EC2 Credentials
RustFS also supports Keystone EC2 credentials for S3 API compatibility:
- `POST /v3/ec2tokens` with `{access, signature, data}` validates EC2-style credentials
- `GET /v3/users/{user_id}/credentials/OS-EC2` lists EC2 credentials for a user
- Access key format: `user_id:project_id` or `user_id`
#### Role Mapping (Keystone → RustFS)
| Keystone Role | RustFS Policy | Permissions |
|---------------|---------------|-------------|
| `admin` | AdminPolicy | Full access (`s3:*`) |
| `Admin` | AdminPolicy | Full access |
| `Member` | ReadWritePolicy | Read/write |
| `_member_` | ReadOnlyPolicy | Read-only |
| `ResellerAdmin` | AdminPolicy | Full access |
| `SwiftOperator` | ReadWritePolicy | Read/write |
| `objectstore:admin` | AdminPolicy | Full access |
| `objectstore:creator` | ReadWritePolicy | Read/write |
Custom role mappings can be added programmatically via `KeystoneIdentityMapper::add_role_mapping()`.
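The table amounts to a lookup from role name to policy. The Python sketch below encodes it directly; how RustFS combines several roles on one token is not stated here, so picking the most privileged policy is an assumption, as are the names `ROLE_TO_POLICY` and `resolve_policy`.

```python
# Role-to-policy table as listed above.
ROLE_TO_POLICY = {
    "admin": "AdminPolicy",
    "Admin": "AdminPolicy",
    "Member": "ReadWritePolicy",
    "_member_": "ReadOnlyPolicy",
    "ResellerAdmin": "AdminPolicy",
    "SwiftOperator": "ReadWritePolicy",
    "objectstore:admin": "AdminPolicy",
    "objectstore:creator": "ReadWritePolicy",
}

# Assumed ordering, least to most privileged.
PRIVILEGE_ORDER = ["ReadOnlyPolicy", "ReadWritePolicy", "AdminPolicy"]


def resolve_policy(roles: list):
    """Return the most privileged policy among the mapped roles, or None."""
    policies = [ROLE_TO_POLICY[role] for role in roles if role in ROLE_TO_POLICY]
    if not policies:
        return None
    return max(policies, key=PRIVILEGE_ORDER.index)


print(resolve_policy(["_member_", "Member"]))
```

Unmapped roles are ignored rather than rejected, so a token that carries only unknown roles resolves to no policy at all.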
#### Multi-Tenancy
When `RUSTFS_KEYSTONE_TENANT_PREFIX=true`:
- Bucket creation: `mybucket` → stored as `project_id:mybucket`
- Bucket listing: filtered by project_id
- Access control: users can only access their project's buckets
---
## 4. Authentication Model — Complete Reference
### Credentials Struct
The core `Credentials` struct (in `rustfs-credentials`):
```rust
pub struct Credentials {
    pub access_key: String,                     // S3 access key (or "keystone:<user_id>")
    pub secret_key: String,                     // S3 secret key (empty for Keystone)
    pub session_token: String,                  // STS session token / Keystone token
    pub expiration: Option<OffsetDateTime>,     // Token expiration
    pub status: String,                         // "active" or "off"
    pub parent_user: String,                    // Parent user for STS/service accounts
    pub groups: Option<Vec<String>>,            // Group membership
    pub claims: Option<HashMap<String, Value>>, // JWT/Keystone claims
    pub name: Option<String>,                   // Human-readable name
    pub description: Option<String>,
}
```
Key methods:
- `is_expired()` — checks if the credential's expiration has passed
- `is_temp()` — true if `session_token` is non-empty and not expired
- `is_service_account()` — true if claims contain `sa-policy` key and `parent_user` is non-empty
- `is_valid()` — access_key >= 3 chars, secret_key >= 8 chars, not expired, status != "off"
- Default credentials: `rustfsadmin` / `rustfsadmin` (env vars: `RUSTFS_ACCESS_KEY` / `RUSTFS_SECRET_KEY`)
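The predicates listed above are simple enough to mirror in a short Python sketch. This models only the rules as described in this section and is not a port of the `rustfs-credentials` crate; the current time is passed in explicitly so the behavior is easy to test.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Credentials:
    access_key: str
    secret_key: str
    session_token: str = ""
    expiration: Optional[datetime] = None
    status: str = "active"
    parent_user: str = ""
    claims: dict = field(default_factory=dict)

    def is_expired(self, now: datetime) -> bool:
        return self.expiration is not None and now >= self.expiration

    def is_temp(self, now: datetime) -> bool:
        return bool(self.session_token) and not self.is_expired(now)

    def is_service_account(self) -> bool:
        return "sa-policy" in self.claims and bool(self.parent_user)

    def is_valid(self, now: datetime) -> bool:
        return (
            len(self.access_key) >= 3
            and len(self.secret_key) >= 8
            and not self.is_expired(now)
            and self.status != "off"
        )


now = datetime(2026, 6, 8, tzinfo=timezone.utc)
default = Credentials("rustfsadmin", "rustfsadmin")
print(default.is_valid(now), default.is_temp(now))
```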
### IAM System
The IAM system (`rustfs-iam`) manages:
- **Users and groups** with RBAC
- **Service accounts** and API key authentication
- **Policy engine** with fine-grained S3-style permissions
- **LDAP/Active Directory** integration
- **Session management** and token validation
- **OIDC integration** (full OpenID Connect with PKCE)
The IAM system is initialized as a singleton (`IAM_SYS`) backed by an `ObjectStore` (persisted in the S3 storage itself). Lookups go through `IamSys::check_key(access_key)` which loads from cache or disk.
### OIDC Support
RustFS has comprehensive OIDC support (`rustfs-iam` → `oidc.rs`):
**Configuration (environment variables):**
- `RUSTFS_IDENTITY_OPENID_ENABLE=on`
- `RUSTFS_IDENTITY_OPENID_CONFIG_URL` — OIDC discovery URL
- `RUSTFS_IDENTITY_OPENID_CLIENT_ID` — OAuth2 client ID
- `RUSTFS_IDENTITY_OPENID_CLIENT_SECRET` — OAuth2 client secret
- `RUSTFS_IDENTITY_OPENID_SCOPES` — comma-separated scopes (default: `openid,profile,email`)
- `RUSTFS_IDENTITY_OPENID_GROUPS_CLAIM` — claim for group membership
- `RUSTFS_IDENTITY_OPENID_ROLES_CLAIM` — claim for role mapping (Microsoft Entra ID app roles)
- `RUSTFS_IDENTITY_OPENID_CLAIM_NAME` — primary claim for policy mapping
- `RUSTFS_IDENTITY_OPENID_CLAIM_PREFIX` — prefix for claim-to-policy mapping
- `RUSTFS_IDENTITY_OPENID_REDIRECT_URI` — callback URL
- `RUSTFS_IDENTITY_OPENID_REDIRECT_URI_DYNAMIC` — allow dynamic redirect URIs
**Features:**
- Authorization Code flow with PKCE
- OIDC discovery and JWKS auto-refresh
- Multiple OIDC providers (suffixed env vars like `_PRIMARY`, `_SECONDARY`)
- ID token verification (signature, issuer, audience, expiry)
- `AssumeRoleWithWebIdentity` flow (JWT directly, no browser)
- Roles and groups claim mapping to RustFS IAM policies
- Provider-specific configuration (Microsoft Entra ID roles claim support)
**OIDC Claims → RustFS Policy Mapping:**
```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["admin:*"],
    "Resource": ["arn:aws:s3:::*"],
    "Condition": {
      "ForAnyValue:StringEquals": {
        "jwt:roles": ["RustFS.ConsoleAdmin"]
      }
    }
  }]
}
```
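To show how the `Condition` in this policy relates to token claims, here is a hedged Python sketch of the `ForAnyValue:StringEquals` operator: the condition holds when at least one value of the named claim equals one of the listed values. The function is illustrative and assumes condition keys carry a `jwt:` prefix that maps directly to a claim name; it is not RustFS's policy engine.

```python
def for_any_value_string_equals(condition: dict, claims: dict) -> bool:
    """Evaluate a ForAnyValue:StringEquals block against JWT claims."""
    for key, allowed in condition.items():
        claim_name = key.removeprefix("jwt:")
        actual = claims.get(claim_name, [])
        if isinstance(actual, str):
            actual = [actual]  # single-valued claim
        if not set(actual) & set(allowed):
            return False
    return True


condition = {"jwt:roles": ["RustFS.ConsoleAdmin"]}
claims = {"sub": "alice", "roles": ["RustFS.ConsoleAdmin", "Other.Role"]}
print(for_any_value_string_equals(condition, claims))
```

For an alknet-issued token this means the `roles` claim must contain `RustFS.ConsoleAdmin` for the statement to apply.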
### RPC Authentication
RustFS uses a derived RPC secret for inter-node communication:
- Environment variable: `RUSTFS_RPC_SECRET` (explicit) or derived from `access_key + secret_key` via HMAC-SHA256
- Uses a `0xFFFFFFFFFFFFFFFF` mask for the signing context
- Base64url-encoded (no padding) output
---
## 5. Docker Deployment
### Simple Deployment
```yaml
# docker-compose-simple.yml
services:
  rustfs:
    image: rustfs/rustfs:latest
    ports:
      - "9000:9000"  # S3 API
      - "9001:9001"  # Console
    environment:
      - RUSTFS_VOLUMES=/data/rustfs{0...3}
      - RUSTFS_ADDRESS=0.0.0.0:9000
      - RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001
      - RUSTFS_ACCESS_KEY=rustfsadmin
      - RUSTFS_SECRET_KEY=rustfsadmin
      - RUSTFS_OBS_LOGGER_LEVEL=info
    volumes:
      - rustfs_data_0:/data/rustfs0
      - rustfs_data_1:/data/rustfs1
      - rustfs_data_2:/data/rustfs2
      - rustfs_data_3:/data/rustfs3
```
### Full Deployment (with Observability)
```yaml
# docker-compose.yml (with --profile observability)
services:
  rustfs:
    # ... same as above, plus one more environment entry:
    environment:
      - RUSTFS_OBS_ENDPOINT=http://otel-collector:4318
  otel-collector:  # OpenTelemetry collector
  tempo:           # Distributed tracing
  jaeger:          # Jaeger UI
  prometheus:      # Metrics
  loki:            # Logs
  grafana:         # Dashboards
  nginx:           # Reverse proxy (optional, --profile proxy)
```
### Dockerfile
- Base: Alpine 3.23.4
- Runs as non-root user `rustfs` (UID/GID 10001:10001)
- Single binary: `/usr/bin/rustfs`
- Entrypoint: `/entrypoint.sh` (processes volumes, log dirs, default credential warnings)
- Health check: HTTP/HTTPS `/health` on port 9000, `/rustfs/console/health` on 9001
- Supports TLS via `RUSTFS_TLS_PATH=/opt/tls` with `rustfs_cert.pem` + `rustfs_key.pem` + optional `ca.crt`
### Keystone-Enabled Deployment
```bash
docker run -d \
-p 9000:9000 -p 9001:9001 \
-e RUSTFS_ACCESS_KEY=admin \
-e RUSTFS_SECRET_KEY=adminsecret \
-e RUSTFS_KEYSTONE_ENABLE=true \
-e RUSTFS_KEYSTONE_AUTH_URL=http://keystone:5000 \
-e RUSTFS_KEYSTONE_VERSION=v3 \
-e RUSTFS_KEYSTONE_ADMIN_USER=admin \
-e RUSTFS_KEYSTONE_ADMIN_PASSWORD=secret \
-e RUSTFS_KEYSTONE_ADMIN_PROJECT=admin \
-e RUSTFS_KEYSTONE_ADMIN_DOMAIN=Default \
-v /data:/data \
rustfs/rustfs:latest
```
### Webhook Notification
```bash
docker run -d --name rustfs -p 9000:9000 \
-e RUSTFS_NOTIFY_ENABLE=true \
-e RUSTFS_NOTIFY_WEBHOOK_ENABLE_PRIMARY=on \
-e RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY=http://host:3020/webhook \
-e RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR_PRIMARY=/tmp/rustfs-events \
rustfs/rustfs:latest
```
---
## 6. SDK/Client Libraries — Rust S3 Clients
### aws-sdk-s3 (Official AWS SDK for Rust)
RustFS itself uses `aws-sdk-s3` (v1.135.0) as a dependency — this is the most mature Rust S3 client:
```toml
aws-sdk-s3 = { version = "1.135.0", default-features = false, features = ["sigv4a", "default-https-client", "rt-tokio"] }
aws-config = { version = "1.8.18" }
aws-credential-types = { version = "1.2.14" }
```
**Pros:** Full S3 API coverage, SigV4/SigV4a signing, async, production-tested
**Cons:** Heavy dependency (pulls in significant AWS SDK surface area), AWS-centric abstractions
### s3s (RustFS's own S3 framework)
RustFS uses a custom `s3s` crate (`https://github.com/rustfs/s3s`, with `minio` feature):
```toml
s3s = { git = "https://github.com/rustfs/s3s", rev = "507e1312b211c3ddc214b03875d6fabd15d22ed5", features = ["minio"] }
```
This provides S3 request/response types, routing, and the `S3Auth` trait used by RustFS's `IAMAuth`.
### rust-s3 (Community)
Not used by RustFS, but worth noting as an alternative:
- Crate: `rust-s3` / `s3`
- Simpler API than aws-sdk-s3
- Supports MinIO-compatible endpoints
- Less complete S3 operation coverage
### Recommendation for alknet
For alknet's S3 adapter:
- **Internal use**: aws-sdk-s3, configured with custom endpoint pointing to rustfs
- **Request signing**: If building a lightweight adapter, extract just the signing logic from `rustfs-signer` or use `aws-smithy-runtime` directly
- **The CredentialSet::S3AccessKey variant** (from alknet's credential-provider.md) maps directly to RustFS's `access_key + secret_key` pair; no additional transformation needed
---
## 7. Relevance to Alknet
### 7.1 RustFS as an Internal Object Store Behind Alknet's HTTP Interface
**Architecture:**
```
Client (any S3 SDK)
→ Alknet HTTP adapter (port 443/80 with HTTPS termination)
→ RustFS (port 9000, Docker network, not exposed externally)
→ Disk storage (/data volumes)
```
**Deployment pattern:** RustFS runs as a Docker container on the same Docker network as alknet, listening only on the internal network. Alknet's HTTP interface reverse-proxies S3 API calls to rustfs.
**Reverse proxy considerations:**
- Alknet would forward `Host`, `Authorization`, `X-Auth-Token`, `X-Amz-*` headers unchanged
- RustFS needs the real client IP for S3 policy `SourceIp` conditions; alknet should set `X-Forwarded-For` and configure `RUSTFS_TRUSTED_PROXIES` or use rustfs's `trusted-proxies` crate
- Health check: Alknet proxies `/health` → rustfs:9000
- RustFS supports `X-Forwarded-Proto` for TLS offloading via its `trusted-proxies` crate
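The header handling described in these bullets could be sketched as follows. This is a Python illustration of the proxy logic only, not alknet's HTTP adapter: `build_upstream_headers` and the `PASS_THROUGH` list are hypothetical, and header names are lowercased for simplicity.

```python
# Headers forwarded unchanged, in addition to any X-Amz-* header.
PASS_THROUGH = ("host", "authorization", "x-auth-token")


def build_upstream_headers(incoming: dict, client_ip: str, scheme: str) -> dict:
    """Select pass-through headers and add the forwarding headers."""
    lowered = {name.lower(): value for name, value in incoming.items()}
    upstream = {
        name: value
        for name, value in lowered.items()
        if name in PASS_THROUGH or name.startswith("x-amz-")
    }
    # Append this hop's client address to any existing chain.
    prior = lowered.get("x-forwarded-for")
    upstream["x-forwarded-for"] = f"{prior}, {client_ip}" if prior else client_ip
    upstream["x-forwarded-proto"] = scheme
    return upstream


incoming = {
    "Host": "s3.example.com",
    "Authorization": "AWS4-HMAC-SHA256 Credential=...",
    "X-Amz-Date": "20260608T120000Z",
    "Cookie": "session=1",
}
print(build_upstream_headers(incoming, "198.51.100.4", "https"))
```

An existing `X-Forwarded-For` value should only be preserved when alknet itself sits behind a proxy it trusts; otherwise a client could spoof the address that `SourceIp` conditions evaluate.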
**Why behind alknet rather than standalone:**
1. Unified TLS termination at alknet
2. alknet can inject auth headers (e.g., OIDC tokens) before forwarding
3. alknet can enforce rate limiting and access control
4. Network isolation — rustfs only accessible via alknet
**Webhook integration:** RustFS can POST events to alknet via its notification system:
```bash
RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY=http://alknet:3020/webhook
```
### 7.2 Mapping S3 Auth to Alknet's CredentialProvider/CredentialSet
The alknet `CredentialSet` enum directly models the S3 auth pattern:
| RustFS Auth Method | Alknet CredentialSet Variant | Mapping |
|---|---|---|
| Access key + secret key (SigV4) | `S3AccessKey { access_key, secret_key, session_token }` | Direct 1:1 mapping; access_key and secret_key are the S3 credential pair |
| Keystone X-Auth-Token | `OidcToken { access_token, ... }` | Keystone token → OIDC access_token; expires_at maps to Keystone token expiration |
| STS AssumeRole session | `S3AccessKey { ..., session_token: Some(...) }` | STS temporary credentials with session token |
| OIDC (browser flow) | `OidcToken { access_token, refresh_token, expires_at }` | Direct mapping |
| Admin default credentials | `S3AccessKey { access_key: "rustfsadmin", secret_key: "rustfsadmin" }` | Service-level credential |
**S3 Request Signing (Phase C in credential-provider.md):**
The `S3AccessKey` variant contains the raw credential data. The signing computation itself is separate — it's a utility function `s3_sign(credential: &S3AccessKey, request: &HttpRequest) -> SignedRequest` that should live in a shared `alknet-s3` utility crate, not in `CredentialSet`. This matches OpenQ-04 in the credential-provider doc.
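To make the scope of such a signing utility concrete, the sketch below assembles the SigV4 canonical request, which is the first step of signing. It is illustrative only: `canonical_request` is a hypothetical helper, it assumes query parameters are already URI-encoded, it does not collapse repeated spaces inside header values, and the hashing and HMAC-SHA256 steps that follow are omitted because they need a crypto crate.

```rust
/// Builds the SigV4 canonical request string.
/// `query` holds already URI-encoded (name, value) pairs, `headers` holds
/// the (name, value) pairs to be signed, and `payload_hash` is the hex
/// SHA-256 of the body (or the literal "UNSIGNED-PAYLOAD").
pub fn canonical_request(
    method: &str,
    canonical_uri: &str,
    query: &[(&str, &str)],
    headers: &[(&str, &str)],
    payload_hash: &str,
) -> String {
    // Query parameters are sorted by name, then joined as name=value.
    let mut q: Vec<(&str, &str)> = query.to_vec();
    q.sort();
    let canonical_query = q
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect::<Vec<_>>()
        .join("&");

    // Header names are lowercased, values trimmed, entries sorted by name.
    let mut h: Vec<(String, String)> = headers
        .iter()
        .map(|(k, v)| (k.to_ascii_lowercase(), v.trim().to_string()))
        .collect();
    h.sort();
    // Each canonical header line ends with a newline.
    let canonical_headers: String = h.iter().map(|(k, v)| format!("{k}:{v}\n")).collect();
    let signed_headers = h
        .iter()
        .map(|(k, _)| k.as_str())
        .collect::<Vec<_>>()
        .join(";");

    format!(
        "{method}\n{canonical_uri}\n{canonical_query}\n{canonical_headers}\n{signed_headers}\n{payload_hash}"
    )
}
```

Keeping this step as a pure string function makes it easy to test against published SigV4 examples before any key material is involved.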
**For alknet's `S3CredentialManager`:**
```rust
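// Sketch only: `CredentialManager`, `Identity`, `check_sts_expiration`, and
// `now` are not shown here and are assumed to be defined elsewhere in alknet.
impl CredentialManager for S3CredentialManager {
    fn refresh(&self, current: &CredentialSet) -> Option<CredentialSet> {
        // If we have an STS session token, check expiration
        // and re-AssumeRole if needed
        todo!("re-AssumeRole when the session token is near expiry")
    }

    fn is_expired(&self, current: &CredentialSet) -> bool {
        match current {
            // Temporary STS credentials carry a non-empty session token
            CredentialSet::S3AccessKey { session_token: Some(t), .. }
                if !t.is_empty() => check_sts_expiration(t),
            CredentialSet::OidcToken { expires_at: Some(ts), .. }
                => *ts < now(),
            _ => false, // Static keys don't expire
        }
    }

    fn provision(&self, identity: &Identity) -> Option<CredentialSet> {
        // Create a rustfs IAM access key for this alknet identity
        // via the rustfs admin API
        todo!("call the rustfs admin API to create a per-identity key")
    }
}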
```
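One caveat worth noting: an STS session token is opaque, so its expiry generally comes from the `Expiration` value returned with the temporary credentials rather than from the token itself. The sketch below assumes that timestamp is stored next to the token. The enum is a simplified, hypothetical stand-in for alknet's `CredentialSet` (the `expires_at` field on `S3AccessKey` is an assumption), and `is_expired_at` takes the clock as a parameter so it can be tested deterministically.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical, simplified stand-in for alknet's `CredentialSet`.
pub enum CredentialSet {
    S3AccessKey {
        access_key: String,
        secret_key: String,
        session_token: Option<String>,
        /// Unix seconds; `None` for static keys (assumed field).
        expires_at: Option<u64>,
    },
    OidcToken {
        access_token: String,
        /// Unix seconds; `None` if the token carries no expiry.
        expires_at: Option<u64>,
    },
}

/// Current time as Unix seconds (0 if the clock is before the epoch).
pub fn unix_now() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}

/// Returns true if the credential expires within `skew` seconds of `now`.
/// Credentials without an expiry are treated as never expiring.
pub fn is_expired_at(cred: &CredentialSet, now: u64, skew: u64) -> bool {
    let expires_at = match cred {
        CredentialSet::S3AccessKey { expires_at, .. } => *expires_at,
        CredentialSet::OidcToken { expires_at, .. } => *expires_at,
    };
    match expires_at {
        Some(ts) => ts <= now.saturating_add(skew),
        None => false,
    }
}
```

In production code the call would look like `is_expired_at(&cred, unix_now(), 30)`, where the 30-second skew is an arbitrary illustrative margin that triggers refresh slightly before the credential actually lapses.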
### 7.3 Alknet as an OIDC Provider for RustFS (Phase D)
This is the most strategically important integration point. RustFS already has complete OIDC support — it just needs an OIDC provider to trust.
**How it would work:**
1. **alknet exposes OIDC endpoints** (via the call protocol's HTTP adapter or a dedicated `/oidc/` path):
- `GET /.well-known/openid-configuration` — discovery document
- `GET /oidc/authorize` — authorization endpoint
- `POST /oidc/token` — token exchange
- `GET /oidc/userinfo` — user info
- `GET /oidc/jwks` — JSON Web Key Set
- `GET /oidc/logout` — RP-initiated logout
2. **alknet's Identity maps to OIDC claims:**
- `sub` → `Identity.id` (SSH fingerprint or account UUID)
- `email` → from account metadata (if available)
- `username` → display name or `Identity.id`
- `groups` → `Identity.scopes` (e.g., `["s3:admin", "s3:readwrite"]`)
- `roles` → derived from scopes (e.g., `scope "s3:admin"` → role `"admin"`)
3. **RustFS configuration** (pointing at alknet):
```bash
RUSTFS_IDENTITY_OPENID_ENABLE=on
RUSTFS_IDENTITY_OPENID_CONFIG_URL=https://alknet:443/.well-known/openid-configuration
RUSTFS_IDENTITY_OPENID_CLIENT_ID=alknet-rustfs-client
RUSTFS_IDENTITY_OPENID_CLIENT_SECRET=<auto-generated>
RUSTFS_IDENTITY_OPENID_SCOPES=openid,profile,email,groups
RUSTFS_IDENTITY_OPENID_GROUPS_CLAIM=groups
RUSTFS_IDENTITY_OPENID_ROLES_CLAIM=roles
```
4. **Authentication flow:**
- User connects to alknet (via SSH/WebTransport/HTTP)
- alknet resolves identity → `Identity { id, scopes, resources }`
- User requests access to rustfs console
- Browser redirects to alknet's OIDC authorize endpoint
- alknet issues authorization code → token exchange → ID token
- RustFS verifies the ID token using alknet's JWKS endpoint
- RustFS maps `groups` and `roles` claims to IAM policies
5. **For `AssumeRoleWithWebIdentity` (programmatic access):**
- alknet issues a JWT directly to the client
- Client presents JWT to RustFS via `Action=AssumeRoleWithWebIdentity`
- RustFS calls `OidcSys::verify_web_identity_token()` which:
- Decodes JWT payload to get `iss` claim
- Finds matching OIDC provider (alknet)
- Verifies signature, issuer, audience, expiry
- Extracts claims → maps to RustFS policies
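**This eliminates long-lived stored credentials:** alknet identities authenticate to rustfs via OIDC, so no static `S3AccessKey` has to be stored. Programmatic clients still receive short-lived STS credentials from `AssumeRoleWithWebIdentity`, but these are issued on demand and expire on their own.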
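The identity-to-claims mapping from step 2 can be sketched as follows. This is a sketch under stated assumptions, not alknet's implementation: the `Identity` struct here is a trimmed-down hypothetical (the `display_name` and `email` fields are assumed), `OidcClaims` and `claims_for` are illustrative names, and the scope-to-role rule simply takes the part after the colon, as in the `"s3:admin"` → `"admin"` example.

```rust
/// Hypothetical, trimmed-down view of alknet's `Identity`.
pub struct Identity {
    pub id: String,
    pub display_name: Option<String>,
    pub email: Option<String>,
    pub scopes: Vec<String>,
}

/// Claims that would be placed in the ID token (illustrative subset).
pub struct OidcClaims {
    pub sub: String,
    pub email: Option<String>,
    pub username: String,
    pub groups: Vec<String>,
    pub roles: Vec<String>,
}

/// Derives a role from a scope of the form "<service>:<role>".
fn role_from_scope(scope: &str) -> Option<&str> {
    scope
        .split_once(':')
        .map(|(_, role)| role)
        .filter(|role| !role.is_empty())
}

pub fn claims_for(identity: &Identity) -> OidcClaims {
    // Roles are derived from scopes, then sorted and de-duplicated.
    let mut roles: Vec<String> = identity
        .scopes
        .iter()
        .filter_map(|s| role_from_scope(s))
        .map(str::to_owned)
        .collect();
    roles.sort();
    roles.dedup();

    OidcClaims {
        sub: identity.id.clone(),
        email: identity.email.clone(),
        // Fall back to the identity id when no display name is set.
        username: identity
            .display_name
            .clone()
            .unwrap_or_else(|| identity.id.clone()),
        groups: identity.scopes.clone(),
        roles,
    }
}
```

Scopes without a colon produce no role, so they appear in `groups` only. Whether that is the right behaviour depends on how rustfs policies are named.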
### 7.4 Alknet RustFS Adapter Architecture
An alknet HTTP/HTTPS adapter for the S3 API would look like:
```
alknet HTTP adapter
├── Route: /s3/* → reverse proxy to rustfs:9000
│ ├── Preserve all S3 headers (Authorization, X-Amz-*, X-Auth-Token, Content-*)
│ ├── Set X-Forwarded-For, X-Forwarded-Proto
│ ├── Optionally inject X-Auth-Token from alknet Identity
│ └── Response streaming (for large object downloads)
├── Route: /s3/health → rustfs:9000/health (health check)
└── Route: /s3/admin/* → rustfs:9000/minio/* (admin API)
```
**Key considerations:**
- S3 requests can be very large (multipart uploads, 5TB+ objects). The adapter must support streaming both request and response bodies without buffering.
- `X-Forwarded-For` must be set so rustfs can evaluate `SourceIp` condition keys in bucket policies.
- RustFS already handles `X-Forwarded-Proto` for HTTPS offloading via its `trusted-proxies` crate.
- For OIDC integration, the adapter doesn't need to modify auth headers — rustfs handles OIDC token validation itself when pointed at alknet's OIDC endpoint.
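The route table above can be sketched as a pure path-rewriting function. The name `upstream_path` is hypothetical, and the mapping mirrors the three routes in the diagram rather than any existing alknet code.

```rust
/// Maps an inbound alknet path to the path forwarded to rustfs:9000.
/// Returns `None` when the request is not under the `/s3` prefix.
pub fn upstream_path(path: &str) -> Option<String> {
    let rest = path.strip_prefix("/s3")?;
    // Reject look-alike prefixes such as "/s3foo".
    if !(rest.is_empty() || rest.starts_with('/')) {
        return None;
    }
    // /s3/health -> /health
    if rest == "/health" {
        return Some("/health".to_string());
    }
    // /s3/admin/* -> /minio/*
    if let Some(admin) = rest.strip_prefix("/admin/") {
        return Some(format!("/minio/{admin}"));
    }
    // Everything else is an S3 API call: strip the prefix only.
    Some(if rest.is_empty() {
        "/".to_string()
    } else {
        rest.to_string()
    })
}
```

Two design points are worth verifying before adopting this layout. First, SigV4 signs the request path, so stripping the `/s3` prefix would invalidate signatures computed by clients against the external URL unless they sign the upstream path. Second, the `/s3/health` and `/s3/admin/` routes shadow buckets named `health` and `admin`.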
**Alknet's `OpenAPIServiceRegistry` integration:**
Since rustfs exposes an S3 API, alknet could auto-register S3 operations via an OpenAPI spec or hardcoded operation specs:
```rust
// In alknet's service registry:
let s3_ops = FromOpenAPI(s3_openapi_spec, config);
// Where config.auth = CredentialSet::S3AccessKey { access_key, secret_key, session_token: None }
// Or: config.auth = CredentialSet::OidcToken { access_token, refresh_token, expires_at }
```
---
## 8. Key RustFS Source Files for Reference
| File | Purpose |
|------|---------|
| `crates/credentials/src/credentials.rs` | `Credentials` struct, global credentials, key generation |
| `crates/credentials/src/constants.rs` | Default access/secret keys, IAM policy constants |
| `crates/signer/` | AWS Signature V4 implementation |
| `crates/keystone/src/config.rs` | Keystone configuration from env vars |
| `crates/keystone/src/client.rs` | Keystone v3 API client (token validation, EC2 creds, admin auth) |
| `crates/keystone/src/auth.rs` | `KeystoneAuthProvider` (token → `Credentials` mapping) |
| `crates/keystone/src/middleware.rs` | Tower middleware extracting `X-Auth-Token`, task-local storage |
| `crates/keystone/src/identity.rs` | `KeystoneIdentityMapper` (role → policy, tenant prefix) |
| `crates/iam/src/oidc.rs` | Complete OIDC system (discovery, PKCE, token exchange, JWT verification) |
| `crates/iam/src/sys.rs` | `IamSys` (IAM singleton, user/key management) |
| `crates/policy/` | S3 bucket/IAM policy evaluation engine |
| `rustfs/src/auth.rs` | `IAMAuth`, `check_key_valid`, auth type detection, condition values |
| `rustfs/src/server/` | HTTP server, TLS, routing, middleware stack |
| `crates/protocols/src/swift/` | OpenStack Swift API implementation |
| `Dockerfile` / `docker-compose-simple.yml` | Deployment configuration |
---
## 9. Configuration Quick Reference
### RustFS Docker Environment Variables (Auth-Relevant)
| Variable | Description | Default |
|----------|-------------|---------|
| `RUSTFS_ACCESS_KEY` | Root access key | `rustfsadmin` |
| `RUSTFS_SECRET_KEY` | Root secret key | `rustfsadmin` |
| `RUSTFS_ADDRESS` | S3 API listen address | `0.0.0.0:9000` |
| `RUSTFS_CONSOLE_ADDRESS` | Console listen address | `0.0.0.0:9001` |
| `RUSTFS_CONSOLE_ENABLE` | Enable web console | `true` |
| `RUSTFS_TLS_PATH` | TLS certificate directory | (none, HTTP) |
| `RUSTFS_KEYSTONE_ENABLE` | Enable Keystone auth | `false` |
| `RUSTFS_KEYSTONE_AUTH_URL` | Keystone v3 endpoint | (required if enabled) |
| `RUSTFS_KEYSTONE_VERSION` | Keystone API version | `v3` |
| `RUSTFS_KEYSTONE_ADMIN_USER` | Keystone admin user | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PASSWORD` | Keystone admin password | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PROJECT` | Keystone admin project | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_DOMAIN` | Keystone admin domain | `Default` |
| `RUSTFS_KEYSTONE_VERIFY_SSL` | Verify Keystone TLS | `true` |
| `RUSTFS_KEYSTONE_CACHE_SIZE` | Token cache size | `10000` |
| `RUSTFS_KEYSTONE_CACHE_TTL` | Token cache TTL (sec) | `300` |
| `RUSTFS_KEYSTONE_TENANT_PREFIX` | Enable tenant prefixing | `true` |
| `RUSTFS_IDENTITY_OPENID_ENABLE` | Enable OIDC | `off` |
| `RUSTFS_IDENTITY_OPENID_CONFIG_URL` | OIDC discovery URL | (required) |
| `RUSTFS_IDENTITY_OPENID_CLIENT_ID` | OIDC client ID | (required) |
| `RUSTFS_IDENTITY_OPENID_CLIENT_SECRET` | OIDC client secret | (optional) |
| `RUSTFS_IDENTITY_OPENID_SCOPES` | OIDC scopes | `openid,profile,email` |
| `RUSTFS_IDENTITY_OPENID_GROUPS_CLAIM` | Groups claim name | `groups` |
| `RUSTFS_IDENTITY_OPENID_ROLES_CLAIM` | Roles claim name | (empty, opt-in) |
| `RUSTFS_RPC_SECRET` | Inter-node RPC auth secret | (derived from keys) |
| `RUSTFS_NOTIFY_WEBHOOK_ENABLE_PRIMARY` | Enable webhook notifications | `off` |
| `RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY` | Webhook URL | (required) |
---
## 10. Summary of Integration Paths
### Phase A (Immediate): Static S3 Credentials
- Deploy rustfs as a Docker service next to alknet
- Configure `RUSTFS_ACCESS_KEY` and `RUSTFS_SECRET_KEY`
- alknet stores these as `CredentialSet::S3AccessKey`
- alknet's HTTP adapter reverse-proxies S3 calls to rustfs
- Use `aws-sdk-s3` or `rust-s3` as the client library
**Effort:** Low. No auth changes in either system.
### Phase B: OIDC via External Provider
- Configure rustfs `RUSTFS_IDENTITY_OPENID_*` to point at an external OIDC provider (e.g., Keycloak, Authentik, Microsoft Entra ID)
- alknet can still manage its own auth independently
- Both systems trust the same OIDC provider
**Effort:** Low. Configuration-only change in rustfs.
### Phase C: Managed Credentials
- alknet provisions rustfs access keys via admin API (`/minio/` endpoints)
- `S3CredentialManager` handles session token rotation
- Identity-bound credentials: alknet creates per-user access keys in rustfs IAM
**Effort:** Medium. Requires admin API client, credential lifecycle management.
### Phase D: Alknet as OIDC Provider (Target State)
- alknet exposes OIDC endpoints (`.well-known/openid-configuration`, `/oidc/authorize`, `/oidc/token`, `/oidc/jwks`)
- rustfs trusts alknet as its OIDC provider
- `Identity.scopes` maps to rustfs IAM policies (e.g., `s3:admin` → admin policy)
- No stored S3 credentials — users authenticate directly via alknet identity
- `AssumeRoleWithWebIdentity` for programmatic access
**Effort:** High. Requires building OIDC authorization server in alknet. This is the most elegant but most complex path.
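Any token alknet issues in this phase has to pass the issuer, audience, and expiry checks on the rustfs side. The sketch below shows those checks on already-decoded claims, which may be useful as a test fixture while building the authorization server. It is not rustfs's code: `IdTokenClaims`, `ClaimError`, and `check_claims` are hypothetical names, and signature verification against the JWKS is assumed to have happened before this point.

```rust
/// Already-decoded JWT claims (illustrative subset).
pub struct IdTokenClaims {
    pub iss: String,
    pub aud: Vec<String>,
    /// Expiry as Unix seconds.
    pub exp: u64,
}

#[derive(Debug, PartialEq)]
pub enum ClaimError {
    WrongIssuer,
    WrongAudience,
    Expired,
}

/// Checks issuer, audience, and expiry in that order.
pub fn check_claims(
    claims: &IdTokenClaims,
    expected_issuer: &str,
    client_id: &str,
    now: u64,
) -> Result<(), ClaimError> {
    if claims.iss != expected_issuer {
        return Err(ClaimError::WrongIssuer);
    }
    // The relying party's client id must be among the audiences.
    if !claims.aud.iter().any(|a| a == client_id) {
        return Err(ClaimError::WrongAudience);
    }
    // A token whose expiry is at or before `now` is rejected.
    if claims.exp <= now {
        return Err(ClaimError::Expired);
    }
    Ok(())
}
```

With the configuration shown earlier, `client_id` would be `alknet-rustfs-client` and `expected_issuer` would be whatever issuer URL alknet publishes in its discovery document.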
---
## References
- [RustFS GitHub](https://github.com/rustfs/rustfs) — v1.0.0-beta.7
- [RustFS Documentation](https://docs.rustfs.com)
- [RustFS Keystone README](file:///workspace/rustfs/crates/keystone/README.md) — comprehensive Keystone integration docs
- [RustFS OIDC implementation](file:///workspace/rustfs/crates/iam/src/oidc.rs) — full OIDC client with PKCE, discovery, JWKS refresh
- [RustFS auth.rs](file:///workspace/rustfs/rustfs/src/auth.rs) — IAMAuth, check_key_valid, auth type detection
- [alknet credential-provider.md](file:///workspace/@alkdev/alknet/docs/research/phase2/credential-provider.md) — alknet's outbound auth design
- [alknet identity.md](file:///workspace/@alkdev/alknet/docs/architecture/identity.md) — alknet's inbound auth design