From agent to silicon. Six layers, six decisions.

Every layer has a distinct role, a distinct cost profile, and a distinct decision for any organisation adopting AI. Read top-down, the stack shows where the user interacts; read bottom-up, it shows where the spend lives.

Six layers, top to bottom.

1. Agent
Autonomous reasoning, tool use, planning loops (ReAct). Sits on top of everything else and orchestrates work.

2. Orchestration
Memory, RAG, prompt chaining, vector retrieval. Connects the model to your private data without retraining it.

3. Inference Engine
Tokenization, API gateway, sampling strategies. Every token costs money and adds latency.

4. Transformer Model
Attention heads, embeddings, decoder stack. The 175B to 1T parameters that ARE the compressed knowledge.

5. Training / ML Core
Pre-training, supervised fine-tuning, RLHF, Constitutional AI. Where the model gets its values.

6. Infrastructure
GPU clusters (NVIDIA H100), HBM3 memory, NVLink, InfiniBand. Buy, do not build. Cloud-first.
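The agent layer's "planning loop" named above (ReAct) can be sketched in a few lines: the model alternates between a thought, an action (a tool call), and an observation of the result, until it decides to finish. The `fake_model` and `lookup` tool below are illustrative stand-ins, not any real model API.

```python
# Minimal ReAct-style loop: think -> act -> observe, repeated until done.

def fake_model(history):
    """Stand-in for an LLM call: returns (thought, action, argument)."""
    if not any(step[1] == "lookup" for step in history):
        return ("I need the revenue figure.", "lookup", "revenue")
    return ("I have what I need.", "finish", "Revenue is 120")

TOOLS = {"lookup": lambda key: {"revenue": "120"}[key]}  # toy tool registry

def react_loop(max_steps=5):
    history = []
    for _ in range(max_steps):
        thought, action, arg = fake_model(history)
        if action == "finish":
            return arg                      # agent decides it can answer
        observation = TOOLS[action](arg)    # act, then observe the result
        history.append((thought, action, observation))
    return None                             # step budget exhausted

print(react_loop())
```

The step budget (`max_steps`) is the part that matters for cost: every loop iteration is another model call at the inference layer below.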

One value lever per layer.

Layer          | Business insight                                                | Value lever
Agent          | Automate multi-step knowledge work                              | Process cost
Orchestration  | RAG over private data, no retraining needed                     | Data moat
Inference      | Every token costs money; caching and prompt design control OpEx | OpEx control
Transformer    | Capability is largely fixed; choose the right model             | CapEx avoidance
Training       | Fine-tuning at 1 to 5% of pre-training cost                     | Competitive edge
Infrastructure | Buy compute, do not own it                                      | Capital discipline

Which layer is our spend actually on?

Most organisations think they are buying AI. They are buying inference (per-token costs) and orchestration (RAG infrastructure). Knowing which layer carries the cost makes budget conversations honest.
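The orchestration spend mentioned above buys one core mechanism: retrieve the most relevant private document by vector similarity, then put it in the prompt. A minimal sketch, using toy bag-of-words vectors in place of the learned embeddings a real RAG pipeline would use; the documents and query are invented examples.

```python
# Toy retrieval step of a RAG pipeline: embed, score, pick the best match.
from collections import Counter
from math import sqrt

def embed(text):
    """Toy embedding: word-count vector (real systems use learned embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "refund policy: customers may return items within 30 days",
    "shipping policy: orders dispatch within two business days",
]

def retrieve(query):
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

context = retrieve("how long do customers have to return items")
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
print(context)
```

Note what never happens here: the model is not retrained. The private data stays in the retrieval index, which is exactly why this layer is a data moat rather than a CapEx line.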

Want the boardroom version of this?