alknet-firewall/docs/research/modern-python-project-setup.md
glm-5.1 cf464c2296 feat: initial architecture specification and research
Phase 0→1 setup for alknet-firewall — a behavioral signal detection
library that screens untrusted LLM inputs using small model activations.

Architecture docs (5 specs, 10 ADRs, 7 open questions):
- overview: vision, scope, dependencies, package structure
- firewall: core API, alarm protocol, score composition, error handling
- codebook: SVD basis, spline distributions, calibration, tensor format
- model: activation extraction, model-agnostic interface, lazy loading
- configuration: thresholds, model selection, detection tuning

Research reports:
- modern-python-project-setup: uv, pyproject.toml, src layout, ruff, CI
- python-ml-packaging: optional PyTorch, HF Hub download, safetensors
- llm-input-safety-landscape: threat taxonomy, defenses, academic evidence

Agent role adaptations for Python project (replaced Rust conventions).
2026-06-13 05:17:40 +00:00

# Research: Modern Python Project Setup (2026)
**Project context**: Python library for LLM input safety/firewall. Uses PyTorch (inference only), transformers, and sklearn. Distributed as a pip-installable package.
**Date**: June 2026
---
## Table of Contents
1. [uv Project Setup](#1-uv-project-setup)
2. [pyproject.toml Best Practices](#2-pyprojecttoml-best-practices)
3. [Source Layout](#3-source-layout)
4. [Testing Setup](#4-testing-setup)
5. [Linting and Formatting](#5-linting-and-formatting)
6. [CI/CD Basics](#6-cicd-basics)
7. [Python Version Targeting](#7-python-version-targeting)
8. [Recommended Configuration for alknet-firewall](#8-recommended-configuration-for-alknet-firewall)
---
## 1. uv Project Setup
### Overview
uv (by Astral, the company behind Ruff) is the 2026 consensus Python package manager. Written in Rust, it replaces pip, venv, virtualenv, pip-tools, pyenv (for Python version management), and the project-management layer of Poetry, all in a single binary that is 10-100x faster than legacy tools. As of June 2026, uv is at v0.9.26 and is the default choice for new Python projects.
**Key capabilities**: Python installation, project initialization, dependency management, virtual environments, lockfiles, building, and publishing — all from one tool.
### `uv init` vs Manual `pyproject.toml` Creation
| Approach | When to Use | Pros | Cons |
|----------|-------------|------|------|
| `uv init --lib` | New projects | Scaffolds src layout, creates .python-version, README, py.typed marker, build system, git init | Generated `requires-python` may be too narrow (defaults to latest Python on system) |
| Manual `pyproject.toml` | Existing projects, migrating from Poetry/setuptools | Full control over structure | More boilerplate, risk of missing required fields |
**Recommendation for this project**: Use `uv init --lib` and then customize. It generates the correct src layout and a complete `pyproject.toml` with a build system. After init, widen `requires-python` to your actual target range (e.g., `>=3.10`).
```bash
# Initialize a library project
uv init --lib alknet-firewall
# This creates:
# alknet-firewall/
# ├── .python-version
# ├── README.md
# ├── pyproject.toml
# └── src/
#     └── alknet_firewall/
#         ├── py.typed
#         └── __init__.py
```
The `--build-backend` flag lets you choose an alternative backend: hatchling (`hatch`), flit-core (`flit`), pdm-backend (`pdm`), `setuptools`, `maturin`, or scikit-build-core (`scikit`). The default is uv_build (`uv`).
### Core uv Commands
| Command | Purpose | Key Flags |
|---------|---------|-----------|
| `uv add <pkg>` | Add a dependency | `--dev` (dev group), `--group <name>`, `--optional <extra>` |
| `uv remove <pkg>` | Remove a dependency | Same flags as add |
| `uv sync` | Install all dependencies from lockfile | `--locked` (CI: fail if lockfile stale), `--extra <name>`, `--dev` / `--no-dev` |
| `uv run <cmd>` | Run command in project venv | Automatically activates the right environment |
| `uv lock` | Resolve and lock dependencies | Creates/updates `uv.lock` |
| `uv build` | Build sdist + wheel | Outputs to `dist/`; use `--no-sources` before publishing |
| `uv publish` | Upload to PyPI | `--token`, `--index <name>`; supports OIDC trusted publishing |
| `uv version` | Bump project version | `--bump minor`, `--bump patch`, `1.0.0` (exact) |
**Important**: `uv sync --locked` is the CI-safe variant. It fails if `uv.lock` is out of date, ensuring reproducible builds. Always commit `uv.lock` to version control.
### Virtual Environment Management
uv manages virtual environments automatically. You never need to run `source .venv/bin/activate`. Instead, use `uv run <command>` which automatically uses the correct environment. The venv is created at `.venv/` on first `uv sync` or `uv add`.
uv also uses a global cache with hardlinks/Copy-on-Write, so packages like PyTorch (2+ GB) are only stored once on disk even across multiple projects.
---
## 2. pyproject.toml Best Practices
### Build System Selection
For a pure-Python library in 2026, the options are:
| Build Backend | Status | Best For | Our Recommendation |
|---------------|--------|----------|-------------------|
| **uv_build** | Production/Stable (since June 2026) | Pure Python libraries; zero-config | **Recommended** — default for `uv init`, fastest builds, tightest uv integration |
| hatchling | Stable, mature | Projects needing build hooks, VCS-derived versions, complex layouts | Good alternative if you need hatch-vcs or custom build hooks |
| setuptools | Legacy standard | Maintaining existing projects, C extensions | Avoid for new projects |
| flit-core | Minimal | Very simple single-module packages | Too minimal for our needs |
**Recommendation**: Use `uv_build`. It is now marked Production/Stable, is the default for `uv init --lib`, auto-discovers src layout, and is 10-35x faster than setuptools/hatchling at build time. Our project is pure Python with ML dependencies and no C extensions, so uv_build is the right fit.
```toml
[build-system]
requires = ["uv_build>=0.11,<0.12"]
build-backend = "uv_build"
```
> The upper bound on `uv_build` version follows Astral's recommendation — it ensures your package continues to build correctly as new versions are released, since the build backend follows the same versioning policy as uv itself.
### Structure of the `[project]` Section
Follow PEP 621. Here is the recommended structure:
```toml
[project]
name = "alknet-firewall"
version = "0.1.0"
description = "LLM input safety/firewall library"
readme = "README.md"
license = "MIT" # Or { file = "LICENSE" }
requires-python = ">=3.10"
authors = [
{ name = "Your Name", email = "you@example.com" },
]
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: Security",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
]
dependencies = [
"scikit-learn>=1.5",
"transformers>=4.40",
]
[project.urls]
Homepage = "https://github.com/your-org/alknet-firewall"
Repository = "https://github.com/your-org/alknet-firewall"
Issues = "https://github.com/your-org/alknet-firewall/issues"
```
### Dependency Groups vs Extras vs Optional Dependencies
This is a critical distinction for our project, especially for handling PyTorch.
| Concept | Table | Published? | Use Case |
|---------|-------|------------|----------|
| **Core dependencies** | `[project].dependencies` | Yes | Always required at runtime |
| **Optional dependencies (extras)** | `[project.optional-dependencies]` | Yes | User-installable feature groups (`pip install alknet-firewall[torch]`) |
| **Dependency groups** | `[dependency-groups]` | No | Dev/test/docs dependencies; local to development |
**PEP 735** (accepted October 2024) standardized Dependency Groups. They are:
- NOT published in built distributions (unlike extras)
- NOT installable by end users (they don't appear in package metadata)
- Used for dev/test/lint dependencies that only developers need
- Installable via `uv sync --group <name>` or `uv add --dev/--group <name>`
#### How to Handle PyTorch
PyTorch is large (2+ GB for CPU, 3+ GB for GPU) and has different install sources for CPU vs GPU variants. **Do not put PyTorch in `[project].dependencies`**. Instead, use `[project.optional-dependencies]` with extras, combined with `[tool.uv.sources]` and `[tool.uv.index]` to handle CPU/GPU variants.
**Strategy**:
```toml
[project.optional-dependencies]
torch = ["torch>=2.2"] # Generic: pip install alknet-firewall[torch]
torch-cpu = ["torch>=2.2"] # CPU-specific
torch-gpu = ["torch>=2.2"] # GPU-specific
[tool.uv]
conflicts = [[{ extra = "torch-cpu" }, { extra = "torch-gpu" }]]
[tool.uv.sources]
torch = [
# macOS: CPU from PyPI
{ index = "pytorch-cpu-mac", extra = "torch-cpu", marker = "platform_system == 'Darwin'" },
# Linux CPU: from PyTorch CPU index
{ index = "pytorch-cpu", extra = "torch-cpu", marker = "platform_system != 'Darwin'" },
# GPU: from PyTorch CUDA index
{ index = "pytorch-gpu", extra = "torch-gpu" },
# Generic `torch` extra: same CPU sources as torch-cpu (PyPI on macOS, PyTorch CPU index elsewhere)
{ index = "pytorch-cpu-mac", extra = "torch", marker = "platform_system == 'Darwin'" },
{ index = "pytorch-cpu", extra = "torch", marker = "platform_system != 'Darwin'" },
]
[[tool.uv.index]]
name = "pytorch-cpu-mac"
url = "https://pypi.org/simple"
explicit = true
[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true
[[tool.uv.index]]
name = "pytorch-gpu"
url = "https://download.pytorch.org/whl/cu126" # Adjust for your CUDA version
explicit = true
```
**Installation commands for end users**:
```bash
pip install alknet-firewall # Core only (sklearn + transformers)
pip install "alknet-firewall[torch]" # With PyTorch from PyPI; pip ignores [tool.uv.sources], so you get PyPI's default wheel for the platform
uv sync --extra torch-cpu # Dev: explicit CPU variant
uv sync --extra torch-gpu # Dev: explicit GPU variant
```
> **Important**: `explicit = true` on index definitions ensures uv only uses those indexes for packages that explicitly reference them (via `[tool.uv.sources]`), not as a general package source.
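Because PyTorch lives in an extra, library code cannot assume `import torch` succeeds. A minimal sketch of a guarded import follows; the helper names `require_optional` and `load_torch` are hypothetical and not part of any existing alknet-firewall API.

```python
import importlib
import importlib.util
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with an install hint."""
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install 'alknet-firewall[{extra}]'"
        )
    return importlib.import_module(module_name)


def load_torch() -> ModuleType:
    """Hypothetical entry point: resolve torch only when inference is requested."""
    return require_optional("torch", extra="torch")
```

Calling `load_torch()` from inside the inference path, rather than importing torch at module top level, keeps `import alknet_firewall` working for users who installed only the core package.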
#### Dev Dependencies
Use `[dependency-groups]` (PEP 735 standard, supported by uv) for development-only dependencies:
```toml
[dependency-groups]
dev = [
"ruff>=0.11",
"pytest>=8.0",
"pytest-cov>=5.0",
"mypy>=1.10",
"pre-commit>=3.7",
]
test = [
"pytest>=8.0",
"pytest-cov>=5.0",
{ include-group = "dev" }, # Include dev group
]
```
**Adding dev dependencies with uv**:
```bash
uv add --dev ruff pytest pytest-cov mypy pre-commit
```
This automatically populates `[dependency-groups].dev`.
**Key difference from extras**: Dependency groups are never published. Users installing your package from PyPI will never see them. They exist only for developers working on the project.
### Summary: Where Each Dependency Goes
| Dependency | Location | Why |
|-----------|----------|-----|
| scikit-learn | `[project].dependencies` | Always required at runtime |
| transformers | `[project].dependencies` | Always required at runtime |
| torch | `[project.optional-dependencies]` | Large; only needed for model inference |
| ruff, pytest, mypy | `[dependency-groups].dev` | Development only; not published |
---
## 3. Source Layout
### `src/` Layout vs Flat Layout
**The modern consensus for libraries is the `src/` layout.** The Python Packaging User Guide, uv's `--lib` template, and most major projects now use it.
#### Flat Layout (Avoid for Libraries)
```
alknet_firewall/
├── __init__.py
├── classifier.py
pyproject.toml
tests/
```
#### `src/` Layout (Recommended)
```
src/
└── alknet_firewall/
    ├── __init__.py
    ├── py.typed
    ├── classifier.py
pyproject.toml
tests/
```
### Why `src/` Layout Wins
1. **Prevents accidental imports**: Running Python from the project root usually puts that directory on `sys.path` (the script's directory, or the current directory for `python -m` and the REPL). With flat layout, `import alknet_firewall` then picks up the local directory instead of the installed package. This masks packaging bugs (missing files, wrong `__init__.py`) that only surface after `pip install`.
2. **Forces proper editable installs**: With `src/`, you must install the package (via `uv sync`) before you can import it. This catches packaging issues early — if it imports in development, it'll import after install.
3. **Better test isolation**: Tests run against the installed package, not the source tree. This matches what users experience.
4. **Type checker friendliness**: Type checkers like mypy and ty need explicit root configuration. With `src/`, the configuration is unambiguous.
5. **uv_build default**: The uv build backend auto-discovers packages under `src/` by default. Zero configuration needed.
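
The accidental-import problem in point 1 can be checked directly. The sketch below reports where a package would be imported from and whether that location sits inside a given directory tree; `import_origin` and `imported_from_tree` are hypothetical helper names used for illustration.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def import_origin(package: str) -> Optional[Path]:
    """Return the file a package would be imported from, if it can be found."""
    spec = importlib.util.find_spec(package)
    # Namespace packages and missing packages have no single origin file.
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def imported_from_tree(package: str, tree: str) -> bool:
    """True if the package resolves to a file somewhere under `tree`."""
    origin = import_origin(package)
    return origin is not None and Path(tree).resolve() in origin.parents
```

In a flat layout, `imported_from_tree("alknet_firewall", ".")` run from the project root would be true even when nothing is installed. With `src/`, the package resolves only after `uv sync` has installed it.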
### Namespace Packages with `src/` Layout
If you later want a namespace package (e.g., `alknet.firewall`), uv_build supports this via the `module-name` configuration:
```toml
[tool.uv.build-backend]
module-name = "alknet.firewall"
```
With the directory structure:
```
src/
└── alknet/
    └── firewall/
        ├── __init__.py
```
> **Note**: For namespace packages, the `__init__.py` is omitted from the `alknet/` directory (the shared namespace), but included in `alknet/firewall/`.
### Recommendation for This Project
Use `src/alknet_firewall/` layout. It's what `uv init --lib` generates, it's the modern standard, and it prevents the class of packaging bugs that flat layout allows.
---
## 4. Testing Setup
### pytest Configuration
pytest remains the standard testing framework in 2026. Configure it in `pyproject.toml`:
```toml
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-v --tb=short"
filterwarnings = [
"error",
"ignore::DeprecationWarning:transformers",
"ignore::FutureWarning:sklearn",
]
```
### Test Directory Structure
```
tests/
├── conftest.py # Shared fixtures
├── test_classifier.py # Unit tests for classifier module
├── test_firewall.py # Unit tests for firewall logic
├── test_integration/ # Integration tests (slower, may need models)
│   ├── __init__.py
│   ├── test_model_loading.py
│   └── test_end_to_end.py
└── fixtures/ # Test data / mock models
    ├── sample_inputs.json
    └── mock_tokenizer/ # Small tokenizer for fast tests
```
### Coverage Configuration
Use `pytest-cov` (which wraps coverage.py). Configure in `pyproject.toml`:
```toml
[tool.coverage.run]
source_pkgs = ["alknet_firewall"]
[tool.coverage.report]
exclude_lines = [
"pragma: no cover",
"if TYPE_CHECKING",
"raise NotImplementedError",
"if __name__ == .__main__.",
]
fail_under = 80 # Enforce minimum coverage
show_missing = true
```
**Run with coverage**:
```bash
uv run pytest --cov --cov-report=term-missing
```
### Testing with ML Model Dependencies
This is a key challenge. ML models are large and can't be committed to the repo. Strategies:
1. **Separate unit tests from integration tests**:
- Unit tests mock model loading and inference. Fast, no model files needed.
- Integration tests load actual models. Mark with `@pytest.mark.slow` or `@pytest.mark.integration`.
- Use `pytest.mark` to skip integration tests in CI by default:
```toml
[tool.pytest.ini_options]
markers = [
"slow: marks tests as slow (deselect with '-m \"not slow\"')",
"integration: marks tests that require model files",
]
```
2. **Use small/dummy models for testing**:
- For sklearn: Train tiny models on synthetic data in fixtures.
- For transformers: Use `distilbert-base-uncased` or `prajjwal1/bert-tiny` — small models that download in seconds.
- Cache model files locally in `.cache/` (add to `.gitignore`).
3. **conftest.py fixtures**:
```python
# tests/conftest.py
from unittest.mock import MagicMock

import pytest


@pytest.fixture
def mock_classifier():
    """Fast mock classifier for unit tests, with no model loading."""
    clf = MagicMock()
    clf.predict.return_value = [0]  # Class 0 = safe
    # Probabilities must agree with the predicted class: P(class 0) = 0.9
    clf.predict_proba.return_value = [[0.9, 0.1]]
    return clf


@pytest.fixture(scope="session")
def tiny_model():
    """Load a real tiny model once per session for integration tests."""
    # Imported lazily so unit tests never pay the transformers import cost
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_name = "prajjwal1/bert-tiny"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    return model, tokenizer
```
4. **Conditional model download**:
- Use `pytest.mark.skipif` to skip tests that need models when they're not available.
- Or download models in CI setup step and cache them across runs.
5. **Offline CI for unit tests**:
```bash
uv run pytest -m "not integration" # Fast, no downloads
uv run pytest -m integration # Requires model download
```
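For the conditional-download strategy, one option is to check the local Hugging Face cache before deciding whether to skip. The sketch below assumes the hub's usual cache layout (`<cache>/models--<org>--<name>/snapshots/...`) and the `HF_HUB_CACHE` / `HF_HOME` environment variables; the function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def hub_cache_dir() -> Path:
    """Resolve the hub cache directory from the environment (assumed layout)."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def model_is_cached(repo_id: str, cache_dir: Optional[str] = None) -> bool:
    """True if at least one snapshot of `repo_id` exists in the local cache."""
    cache = Path(cache_dir) if cache_dir else hub_cache_dir()
    snapshots = cache / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())
```

In `conftest.py` this could back a reusable marker such as `pytest.mark.skipif(not model_is_cached("prajjwal1/bert-tiny"), reason="model not cached")`, so offline runs skip integration tests instead of failing on a network error.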
---
## 5. Linting and Formatting
### The 2026 Standard Toolchain
| Concern | Tool | What It Replaces | Status |
|---------|------|------------------|--------|
| Linting + Formatting | **Ruff** | flake8, black, isort, pyupgrade, bandit | Industry standard |
| Type Checking | **mypy** (strict) or **ty** (beta) | — | mypy is stable default; ty is emerging fast alternative |
**Ruff** is the undisputed 2026 standard for linting and formatting. It replaces 6+ tools with one Rust binary that processes large codebases in milliseconds. Used by FastAPI, Hugging Face, LangChain, and most major Python projects.
### Type Checking: mypy vs ty vs Pyright
| Tool | Status | Speed | Spec Conformance | IDE Integration | Recommendation |
|------|--------|-------|-------------------|-----------------|---------------|
| **mypy** | Stable, mature | Baseline | Reference implementation | Good (via mypy daemon or LSP) | **Safe default** for production |
| **ty** | Beta (Astral) | 10-60x faster than mypy | ~53% of test suite (growing) | Built-in language server | **Adopt if willing to tolerate beta**; excellent for new projects |
| **Pyright/Pylance** | Stable | 5x faster than mypy | 98% spec conformance | Best-in-class (VS Code native) | Best for VS Code users; less CLI-friendly |
**Practical recommendation**: Use **mypy** for CI stability today. Add **ty** as a secondary check if you want faster local feedback. If the team uses VS Code, Pylance (which wraps Pyright) provides the best editor experience regardless of which CLI checker you use.
### Ruff Configuration
```toml
[tool.ruff]
line-length = 100
target-version = "py310"
[tool.ruff.lint]
select = [
    "E", # pycodestyle errors
    "W", # pycodestyle warnings
    "F", # pyflakes
    "I", # isort (import sorting)
    "B", # flake8-bugbear (common Python gotchas)
    "UP", # pyupgrade (auto-modernize syntax)
    "S", # flake8-bandit (security checks), relevant for a security library
    "C4", # flake8-comprehensions
    "SIM", # flake8-simplify
    "TC", # flake8-type-checking (optimize TYPE_CHECKING blocks); prefix was TCH before Ruff 0.8
    "RUF", # Ruff-specific rules
]
ignore = [
"E501", # Line too long (handled by formatter)
"S101", # Use of assert (fine in tests)
]
[tool.ruff.lint.per-file-ignores]
"tests/**" = ["S101", "S311"] # Allow assert and random in tests
[tool.ruff.format]
docstring-code-format = true
quote-style = "double"
```
### mypy Configuration
```toml
[tool.mypy]
python_version = "3.10"
strict = true
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
# Third-party libraries without stubs
[[tool.mypy.overrides]]
module = ["sklearn.*", "transformers.*", "torch.*"]
ignore_missing_imports = true
```
> `ignore_missing_imports` for sklearn/transformers/torch is necessary because these packages don't always ship complete type stubs. As they improve, you can tighten this.
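
One pattern that works well with strict type checking and an optional torch dependency is to import torch only for the type checker. The sketch below is illustrative: `mean_activation` is a hypothetical helper, and the quoted annotation keeps the module importable when torch is not installed.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated by type checkers only; never executed at runtime.
    import torch


def mean_activation(activations: "torch.Tensor") -> float:
    """Hypothetical helper: reduce an activation tensor to a single float."""
    # Relies only on the .mean() method, so no runtime torch import is needed here.
    return float(activations.mean())
```

Ruff's flake8-type-checking rules push imports that are used only in annotations into this kind of block automatically.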
### Pre-commit Hooks
Use pre-commit to catch issues before they reach CI:
```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.11.0
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        additional_dependencies: [types-requests]
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-toml
      - id: check-added-large-files
        args: [--maxkb=1024] # Prevent committing large model files
```
**Install and run**:
```bash
uv run pre-commit install # Install hooks
uv run pre-commit run --all-files # Run on all files
```
### Daily Workflow
```bash
# Before committing — run all quality checks
uv run ruff check --fix .
uv run ruff format .
uv run mypy src/
uv run pytest
# Or rely on pre-commit hooks to catch issues automatically
```
---
## 6. CI/CD Basics
### GitHub Actions: Modern Python CI
The 2026 standard CI pipeline for a Python package has four stages: **lint → type-check → test → build/publish**. All using uv.
#### CI Workflow (`.github/workflows/ci.yml`)
```yaml
name: CI
on:
  push:
    branches: [main]
  pull_request:
jobs:
  check:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v6
      - uses: astral-sh/setup-uv@v8
        with:
          python-version: ${{ matrix.python-version }}
          enable-cache: true
      - run: uv sync --locked --dev
      - run: uv run ruff check .
      - run: uv run ruff format --check .
      - run: uv run mypy src/
      - run: uv run pytest -m "not integration" --cov
  integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: astral-sh/setup-uv@v8
        with:
          enable-cache: true
      - run: uv sync --locked --dev --extra torch-cpu
      - run: uv run pytest -m integration
```
**Key points**:
- `uv sync --locked` ensures CI uses exact versions from `uv.lock`. Fails if lockfile is stale.
- `enable-cache: true` caches uv's global package cache across runs, dramatically speeding up PyTorch installs.
- Matrix strategy tests across all supported Python versions.
- Integration tests run separately with `torch-cpu` extra, using PyTorch's CPU-only index.
- `ruff format --check` verifies formatting without modifying files.
#### Publish Workflow (`.github/workflows/publish.yml`)
```yaml
name: Publish
on:
  release:
    types: [published]
jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # Required for OIDC trusted publishing
      contents: read
    steps:
      - uses: actions/checkout@v6
      - uses: astral-sh/setup-uv@v8
      - run: uv build --no-sources # Build without uv.sources (use PyPI indexes)
      - run: uv publish # OIDC trusted publishing, no secrets needed
```
**Trusted Publishing** (OIDC) is the recommended approach. No API tokens stored in GitHub. The workflow authenticates via a short-lived OIDC token that GitHub provides. Configure the trusted publisher on PyPI's publishing settings page.
**Setup steps on PyPI**:
1. Go to your PyPI project → Publishing settings
2. Add a trusted publisher: your GitHub org, repo, workflow filename (`publish.yml`), optional environment name
3. No secrets needed — the OIDC token is automatically available in GitHub Actions
---
## 7. Python Version Targeting
### Current EOL Schedule (June 2026)
| Version | Status | EOL Date |
|---------|--------|----------|
| 3.9 | End of Life | October 2025 (already passed) |
| 3.10 | Security fixes only | October 2026 |
| 3.11 | Security fixes only | October 2027 |
| 3.12 | Security fixes only | October 2028 |
| 3.13 | Bug fixes | October 2029 |
| 3.14 | Bug fixes (latest stable) | October 2030 |
### Recommendation: `requires-python = ">=3.10"`
**Rationale**:
- **3.10 reaches EOL in October 2026**, so it will be EOL by the end of this year. However, many enterprise users and CI environments still run 3.10. Supporting it costs us little (no special syntax to avoid) and maximizes adoption.
- **3.11 receives security fixes until October 2027.** It brings faster CPython performance and better error messages, but from a packaging perspective, supporting 3.10+ automatically includes 3.11.
- **3.12 is the current "safe floor"** for new projects that don't need maximum compatibility; it receives security fixes until October 2028.
- **3.13 and 3.14** are cutting edge. Test against them in CI but don't require them.
**Our recommendation**: Target `>=3.10` to maximize compatibility. Test against 3.10, 3.11, 3.12, and 3.13 in CI. Revisit dropping 3.10 support in Q4 2026 after its EOL.
**Features we get from 3.10+ baseline**:
- `match` statements (structural pattern matching)
- `X | Y` union type syntax (PEP 604)
- Parameter specification variables (PEP 612)
- `from __future__ import annotations` works well
- `zip(strict=True)` for strict iteration
**Note on Python 3.14**: PEP 649/749 makes deferred evaluation of annotations the default, eliminating the need for `from __future__ import annotations`. This is nice but not a reason to require 3.14.
---
## 8. Recommended Configuration for alknet-firewall
### Complete `pyproject.toml`
```toml
[project]
name = "alknet-firewall"
version = "0.1.0"
description = "LLM input safety/firewall library for content classification and filtering"
readme = "README.md"
license = "MIT"
requires-python = ">=3.10"
authors = [
{ name = "AlkDev", email = "dev@alknet.dev" },
]
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: Security",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
"Typing :: Typed",
]
dependencies = [
"scikit-learn>=1.5",
"transformers>=4.40",
]
[project.optional-dependencies]
torch = ["torch>=2.2"]
torch-cpu = ["torch>=2.2"]
torch-gpu = ["torch>=2.2"]
[project.urls]
Homepage = "https://github.com/alkdev/alknet-firewall"
Repository = "https://github.com/alkdev/alknet-firewall"
Issues = "https://github.com/alkdev/alknet-firewall/issues"
[build-system]
requires = ["uv_build>=0.11,<0.12"]
build-backend = "uv_build"
# --- uv configuration ---
[tool.uv]
conflicts = [[{ extra = "torch-cpu" }, { extra = "torch-gpu" }]]
[tool.uv.sources]
torch = [
{ index = "pytorch-cpu-mac", extra = "torch-cpu", marker = "platform_system == 'Darwin'" },
{ index = "pytorch-cpu", extra = "torch-cpu", marker = "platform_system != 'Darwin'" },
{ index = "pytorch-gpu", extra = "torch-gpu" },
{ index = "pytorch-cpu-mac", extra = "torch", marker = "platform_system == 'Darwin'" },
{ index = "pytorch-cpu", extra = "torch", marker = "platform_system != 'Darwin'" },
]
[[tool.uv.index]]
name = "pytorch-cpu-mac"
url = "https://pypi.org/simple"
explicit = true
[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true
[[tool.uv.index]]
name = "pytorch-gpu"
url = "https://download.pytorch.org/whl/cu126"
explicit = true
# --- Dependency groups (dev only, not published) ---
[dependency-groups]
dev = [
"ruff>=0.11",
"pytest>=8.0",
"pytest-cov>=5.0",
"mypy>=1.10",
"pre-commit>=3.7",
]
# --- Ruff ---
[tool.ruff]
line-length = 100
target-version = "py310"
[tool.ruff.lint]
select = ["E", "W", "F", "I", "B", "UP", "S", "C4", "SIM", "TC", "RUF"]
ignore = ["E501", "S101"]
[tool.ruff.lint.per-file-ignores]
"tests/**" = ["S101", "S311"]
[tool.ruff.format]
docstring-code-format = true
quote-style = "double"
# --- mypy ---
[tool.mypy]
python_version = "3.10"
strict = true
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
[[tool.mypy.overrides]]
module = ["sklearn.*", "transformers.*", "torch.*"]
ignore_missing_imports = true
# --- pytest ---
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-v --tb=short"
markers = [
"slow: marks tests as slow",
"integration: marks tests that require model files",
]
filterwarnings = [
"error",
"ignore::DeprecationWarning:transformers",
"ignore::FutureWarning:sklearn",
]
# --- coverage ---
[tool.coverage.run]
source_pkgs = ["alknet_firewall"]
[tool.coverage.report]
exclude_lines = [
"pragma: no cover",
"if TYPE_CHECKING",
"raise NotImplementedError",
"if __name__ == .__main__.",
]
show_missing = true
```
### Recommended Project Structure
```
alknet-firewall/
├── .github/
│   └── workflows/
│       ├── ci.yml
│       └── publish.yml
├── .pre-commit-config.yaml
├── .python-version # 3.13 (pinned for dev)
├── .gitignore
├── LICENSE
├── README.md
├── pyproject.toml
├── uv.lock
├── src/
│   └── alknet_firewall/
│       ├── __init__.py
│       ├── py.typed
│       ├── classifier.py # Sklearn-based classifiers
│       ├── firewall.py # Core firewall logic
│       └── models.py # Model loading & inference
└── tests/
    ├── conftest.py
    ├── test_classifier.py
    ├── test_firewall.py
    ├── test_integration/
    │   ├── __init__.py
    │   └── test_model_loading.py
    └── fixtures/
        └── sample_inputs.json
```
### Getting Started Commands
```bash
# 1. Initialize project
uv init --lib alknet-firewall
cd alknet-firewall
# 2. Pin Python version for dev
uv python pin 3.13
# 3. Add core dependencies
uv add "scikit-learn>=1.5" "transformers>=4.40"
# 4. Add PyTorch as optional (uv add --optional creates extras)
uv add --optional torch "torch>=2.2"
# 5. Add dev tooling
uv add --dev ruff pytest pytest-cov mypy pre-commit
# 6. Set up pre-commit hooks
uv run pre-commit install
# 7. Verify everything works
uv sync
uv run ruff check .
uv run ruff format .
uv run mypy src/
uv run pytest
# 8. Build the package
uv build
# 9. Test install from built wheel
uv run --with dist/alknet_firewall-0.1.0-py3-none-any.whl --no-project -- \
python -c "import alknet_firewall; print('OK')"
```
---
## References
- [uv Official Documentation — Building and Publishing](https://docs.astral.sh/uv/guides/package/)
- [uv Official Documentation — Creating Projects](https://docs.astral.sh/uv/concepts/projects/init/)
- [uv Official Documentation — Build Backend](https://docs.astral.sh/uv/concepts/build-backend/)
- [uv Official Documentation — Managing Dependencies](https://docs.astral.sh/uv/concepts/projects/dependencies/)
- [PEP 735 — Dependency Groups in pyproject.toml](https://peps.python.org/pep-0735/)
- [Python Packaging User Guide — Writing pyproject.toml](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/)
- [Python Packaging User Guide — src layout vs flat layout](https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/)
- [Python Devguide — Status of Python Versions](https://devguide.python.org/versions/)
- [Simplifying PyTorch Environment Setup with uv (Zenn)](https://zenn.dev/haru256/articles/6ded722b409d13)
- [uv Build Backend Is Stable (ByteIota)](https://byteiota.com/uv-build-backend-stable-python-packaging/)
- [Build and Publish a Python Package with uv (pydevtools)](https://pydevtools.com/handbook/tutorial/build-and-publish-a-python-package/)
- [Python Project Setup 2026: uv + Ruff + Ty + Polars (KDnuggets)](https://www.kdnuggets.com/python-project-setup-2026-uv-ruff-ty-polars)
- [Modern Python Best Practices: The 2026 Definitive Guide (OneHorizon)](https://onehorizon.ai/blog/modern-python-best-practices-the-2026-definitive-guide)
- [Python Packaging Best Practices 2026: setuptools, Poetry, and Hatch (DasRoot)](https://dasroot.net/posts/2026/01/python-packaging-best-practices-setuptools-poetry-hatch/)