Honest pricing. Receipts on every line.

All prices USD. Billing starts only after a verified ReadyEvent. Designed to start at Ready.

Rates are placeholders pending captain ruling F2 and Sprint 005 bench data. See the rate-card log for effective-date history. Data Pack add-ons from +30% (1 pack) — see FAQ for multi-pack rates.

Model Lane — T1 / T2 / T3

Curated Qwen3.5 + Gemma 4 + DeepSeek V4 Flash recipes with LoRA adapter composition. Billing is per 1K input tokens + per 1K output tokens (rates split — docs/41 §2).

Tier	Description	Input / 1K tokens	Output / 1K tokens	Monthly base
T3 — Prompt-only	Base model only; no adapter VRAM overhead. Best for dev, small-batch, and latency-tolerant workloads. Shared K8s resource key: `nvidia.com/gpu.shared-<lane>`. Not for production workloads requiring fault isolation. Designed to start at Ready.	TBD (doc ref: $0.003)	TBD (doc ref: $0.009)	TBD
T2 — Elastic	Per-use inference with shared adapter amortisation. No standing reservation fee. Optional NVMe warm-pool fee while adapter is NVMe-resident but not VRAM-resident (captain decides blended vs. separate line item). Isolated K8s resource key: `gpodz.com/mig-<profile>`. Designed to start at Ready.	TBD (doc ref: $0.006)	TBD (doc ref: $0.018)	TBD
T1 — Pinned	Dedicated GPU slot reserved 24/7. Composed adapter stack (up to 5 layers: Base + Toolkit-Agent + Geo + Archetype + Behavioral). Full hardware VRAM + SM isolation. Dedicated K8s resource key: `nvidia.com/gpu: 1` + full-GPU pin. Idle-billing notice (LEGAL-8): Pinned reservation bills 24/7 while reserved. $X/hr base + $Y per 1K tokens served; billing accrues whether or not you send requests. To stop billing, release the reservation from your dashboard. Designed to start at Ready.	TBD (doc ref: $0.010)	TBD (doc ref: $0.030)	TBD
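
As a rough illustration of the split input/output billing described above, the sketch below estimates the token portion of a bill. It uses the "doc ref" figures from the table as placeholders only; they are not final prices. The function name `token_charge` is hypothetical, and the sketch assumes partial thousands are billed pro rata, which the rate card does not confirm. Monthly base fees and the T1 hourly reservation fee are left out because they are still TBD.

```python
from decimal import Decimal

# Placeholder per-1K-token rates (input, output), copied from the "doc ref"
# figures in the table. These are NOT final prices.
DOC_REF_RATES = {
    "T3": (Decimal("0.003"), Decimal("0.009")),
    "T2": (Decimal("0.006"), Decimal("0.018")),
    "T1": (Decimal("0.010"), Decimal("0.030")),
}


def token_charge(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the token-only charge in USD for one tier.

    Assumption: partial thousands are billed pro rata (no rounding up to the
    next 1K). Base fees and reservation fees are not included.
    """
    in_rate, out_rate = DOC_REF_RATES[tier]
    input_cost = Decimal(input_tokens) / 1000 * in_rate
    output_cost = Decimal(output_tokens) / 1000 * out_rate
    return input_cost + output_cost
```

With these placeholder rates, 200,000 input tokens and 50,000 output tokens on T2 would come to $2.10 in token charges.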

Raw Compute — Shared / Isolated / Dedicated

Direct GPU access without a model recipe. Billing is per GPU-hour. The Disclosed Match card shows backend GPU, delivered VRAM, isolation mode, and cache state before any payment authorisation. Same Terms of Service as the Model Lane (LEGAL-7).

Tier	Description	Hourly rate	Minimum
Shared	Time-sliced GPU; no hardware memory or fault isolation. Best for development and best-effort batch workloads. K8s key: `nvidia.com/gpu.shared-<lane>` provisioned by `gpodz-device-plugin`. Not for workloads requiring hardware isolation. Designed to start at Ready.	TBD	1 hr
Isolated	One tenant per discovered MIG profile; hardware memory + SM isolation. Best for production inference requiring memory guarantees. K8s key: `gpodz.com/mig-<profile>` provisioned by `node-agent`. Designed to start at Ready.	TBD	1 hr
Dedicated	Full physical GPU; one tenant per GPU. Manual sales path in Phase 1. Minimum commitment applies. K8s key: `nvidia.com/gpu: 1` + full-GPU pin. Idle-billing notice (LEGAL-8): Dedicated sessions bill $X/hr regardless of utilisation while the GPU is reserved. Designed to start at Ready.	TBD	4 hr
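
To make the per-GPU-hour billing and the minimums in the table concrete, here is a minimal sketch. The minimum hours come from the table; the hourly rate is passed in by the caller because rates are still TBD. The names `billable_hours` and `session_charge` are hypothetical, and the sketch assumes the minimum acts as a floor on billed time with pro rata billing beyond it, which the rate card does not spell out.

```python
from decimal import Decimal

# Minimum billable hours per Raw Compute tier, taken from the table.
MINIMUM_HOURS = {
    "Shared": Decimal(1),
    "Isolated": Decimal(1),
    "Dedicated": Decimal(4),
}


def billable_hours(tier: str, reserved_hours) -> Decimal:
    """Return the hours billed for a session.

    Assumption: the tier minimum is a floor; time beyond it is billed pro rata.
    Billing is based on reserved time, not utilisation.
    """
    return max(Decimal(str(reserved_hours)), MINIMUM_HOURS[tier])


def session_charge(tier: str, reserved_hours, hourly_rate) -> Decimal:
    """Return the USD charge given a caller-supplied hourly rate (rates are TBD)."""
    return billable_hours(tier, reserved_hours) * Decimal(str(hourly_rate))
```

Under these assumptions, a Dedicated session released after 1.5 hours would still bill 4 hours, while a Shared session held for 2.5 hours would bill 2.5 hours.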

All prices USD. Token rates carry explicit TODO comments pending captain ruling F2. Volume discounts apply at 1M / 10M / 100M tokens per month (docs/41 §2). Discounts do not stack with subscription packs.
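
The volume thresholds above could be applied along the following lines. This is only a sketch: the thresholds (1M / 10M / 100M tokens per month) come from the text, but the discount percentages are not published here, so the caller supplies them as `band_discounts`. The function names are hypothetical. Two further assumptions are made: a threshold is reached at exactly that token count, and "do not stack" means no volume discount applies while a subscription pack is active.

```python
from bisect import bisect_right

# Monthly token thresholds at which a new volume band starts (from the text).
THRESHOLDS = [1_000_000, 10_000_000, 100_000_000]


def volume_band(monthly_tokens: int) -> int:
    """Return 0 below 1M, 1 from 1M, 2 from 10M, 3 from 100M tokens per month.

    Assumption: a threshold is inclusive (exactly 1M tokens is in band 1).
    """
    return bisect_right(THRESHOLDS, monthly_tokens)


def discount_rate(monthly_tokens: int, band_discounts, has_subscription_pack: bool):
    """Look up the discount fraction for a month's usage.

    band_discounts: four caller-supplied fractions, one per band. The actual
    percentages are not given in this document.
    Assumption: with a subscription pack active, no volume discount applies.
    """
    if has_subscription_pack:
        return 0.0
    return band_discounts[volume_band(monthly_tokens)]
```

The sketch only returns the applicable rate; whether the discount applies to all tokens in the month or only to tokens above the threshold is not stated here and is left to the rate card.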

Both lanes are subject to the same Terms of Service. Trust & transparency →