Open-weight vs closed models

Open-weight vs closed (proprietary) models.

This entry untangles a debate that is more about engineering trade-offs than ideology. By the end you should be able to reason about the real consequences of choosing a downloadable model versus a hosted API (control, cost, privacy, licensing, and lock-in) without the marketing framing from either camp.

STEP 1

Get the terminology straight first.

The most common confusion is treating "open source" and "open weight" as the same thing. They are not.

Open weight. The trained model parameters are published; you can download them and run inference (and often fine-tune) yourself. The training data and full training code are usually not released. Most "open" large models today are open-weight in this sense.
Fully open source. Weights, training code, and ideally training data are all released under an open license. This is rarer and a stronger claim than "open weight."
Closed / proprietary. Weights are never published. You access the model only as a hosted service through an API. The provider controls the model entirely.

Licenses vary widely even within "open weight": some are permissive (close to standard open-source software licenses), others are source-available with conditions — for example acceptable-use restrictions, or extra terms above a certain scale of users. "You can download it" does not automatically mean "you can use it however you like." Read the license for the specific model.

STEP 2

What you actually trade.

Control and customization

Open weights let you fine-tune deeply, inspect behavior, pin an exact version forever, and modify the serving stack. Closed models offer a constrained customization surface (system prompts, hosted fine-tuning, adapters), but you cannot keep running a version once the provider retires it, and behavior can shift under you when the provider updates the snapshot behind a model name.

Privacy and data residency

Self-hosting an open-weight model means inference data never leaves your infrastructure — often decisive for regulated data. Closed APIs require sending data to the provider; most offer no-training-on-your-data and regional options, but it is still data leaving your boundary, which is a different risk and compliance posture.

Cost structure

Closed APIs are opex: pay per token, zero infrastructure, scales smoothly from zero. Self-hosted open weights are capex-like: you pay for GPUs/accelerators whether or not they are busy. The crossover depends entirely on utilization. Low or spiky volume almost always favors a hosted API; sustained high volume on a model small enough to serve efficiently can favor self-hosting — but only after you account for the engineering and reliability cost of running inference well.
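The utilization crossover described above can be sketched as a small calculation. This is a minimal illustration, not a pricing model: the class and function names are hypothetical, every number in the usage lines is made up, and it assumes a single blended price per million tokens and a self-hosted model of comparable quality.

```python
from dataclasses import dataclass


@dataclass
class SelfHostCosts:
    """Hypothetical monthly self-hosting costs, all in one currency."""
    gpu_monthly: float            # accelerators, paid whether busy or idle
    engineering_monthly: float    # share of engineering/SRE time for inference
    peak_tokens_per_month: float  # throughput ceiling at 100% utilization


def api_monthly_cost(tokens: float, price_per_million: float) -> float:
    """Pay-per-token spend: scales linearly from zero."""
    return tokens / 1_000_000 * price_per_million


def self_host_monthly_cost(costs: SelfHostCosts) -> float:
    """Fixed spend: independent of how many tokens are actually served."""
    return costs.gpu_monthly + costs.engineering_monthly


def break_even_utilization(costs: SelfHostCosts, price_per_million: float) -> float:
    """Fraction of peak capacity at which self-hosting matches API spend.

    A result above 1.0 means this deployment cannot break even against
    the given API price, even when fully busy.
    """
    break_even_tokens = self_host_monthly_cost(costs) / price_per_million * 1_000_000
    return break_even_tokens / costs.peak_tokens_per_month


# Illustrative numbers only (assumptions, not quotes from any provider).
example = SelfHostCosts(
    gpu_monthly=6_000.0,
    engineering_monthly=4_000.0,
    peak_tokens_per_month=5_000_000_000.0,
)
utilization = break_even_utilization(example, price_per_million=4.0)
```

With these made-up inputs the deployment has to stay about half busy, all month, before it matches the API bill. Leaving the engineering line at zero is the usual way this comparison goes wrong.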

Capability

Historically, the strongest frontier models were closed. As of early 2026 the best open-weight models are competitive with closed models on many mainstream tasks, while the very top of the frontier — especially on the hardest reasoning — is still often held by closed labs, and the gap narrows and re-opens with each release cycle. Treat "open has caught up" and "closed is always ahead" as equally unreliable blanket claims; measure on your task.
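"Measure on your task" can be as simple as running the same labeled cases through each candidate. The sketch below assumes each model is wrapped as a plain function from prompt to answer and uses case-insensitive exact match as the score; both are simplifying assumptions, and the two stub models stand in for whatever real clients you would call.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]
Case = Tuple[str, str]  # (prompt, expected answer)


def accuracy(model: Model, cases: List[Case]) -> float:
    """Share of cases where the model's answer matches the expected one."""
    if not cases:
        raise ValueError("need at least one test case")
    correct = sum(
        1
        for prompt, expected in cases
        if model(prompt).strip().lower() == expected.strip().lower()
    )
    return correct / len(cases)


def compare(models: Dict[str, Model], cases: List[Case]) -> Dict[str, float]:
    """Score every candidate on the same cases so results are comparable."""
    return {name: accuracy(fn, cases) for name, fn in models.items()}


# Stub models for illustration; replace with wrappers around real inference.
def open_weight_stub(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "Paris"}.get(prompt, "")


def hosted_stub(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "Lyon"}.get(prompt, "")


cases = [("2+2?", "4"), ("Capital of France?", "paris")]
scores = compare({"open-weight": open_weight_stub, "hosted": hosted_stub}, cases)
```

Exact match suits classification or extraction tasks. Free-form output usually needs a task-specific scorer, but the structure stays the same: same cases, same metric, every candidate.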

Operational burden and lock-in

Closed APIs offload reliability, scaling, and safety tooling to the provider — at the cost of dependency on their pricing, availability, and roadmap. Open weights remove that dependency but transfer the entire operational burden (serving, scaling, evaluation, security patching) to you.

STEP 3

A decision heuristic.

┌──────────────────────────────────────────────────────────────┐
│ LEAN OPEN-WEIGHT / SELF-HOST WHEN                            │
│   • data cannot legally leave your boundary                  │
│   • you need a frozen, never-changing version                │
│   • sustained high volume on a servable-size model           │
│   • deep custom fine-tuning is core to the product           │
│                                                              │
│ LEAN CLOSED / HOSTED API WHEN                                │
│   • you need top-of-frontier capability now                  │
│   • volume is low, spiky, or unknown                         │
│   • small team, no inference/SRE muscle to spare             │
│   • time-to-market matters more than per-token cost          │
│                                                              │
│ COMMON REAL ANSWER: a hybrid                                 │
│   • hosted frontier model for hard / rare requests           │
│   • cheaper open or small model for the high-volume path     │
└──────────────────────────────────────────────────────────────┘

The hybrid answer is increasingly the norm, not a compromise: route easy, high-volume requests to a cheap or self-hosted model and escalate only the hard cases to a frontier API. That structure also reduces lock-in, because the expensive dependency is isolated behind a router you control.
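A minimal sketch of that router, assuming both models are wrapped as prompt-to-answer functions. The names `make_router` and `looks_hard`, and the difficulty rule itself, are hypothetical placeholders; a real system might use a classifier, a confidence score from the cheap model, or explicit request metadata instead.

```python
from typing import Callable

Model = Callable[[str], str]


def make_router(cheap: Model, frontier: Model, is_hard: Callable[[str], bool]) -> Model:
    """Return one callable that hides which backend answers each request."""

    def route(prompt: str) -> str:
        # Escalate only when the check flags the request as hard;
        # everything else stays on the cheap, high-volume path.
        if is_hard(prompt):
            return frontier(prompt)
        return cheap(prompt)

    return route


# Placeholder difficulty check (an assumption for illustration only).
def looks_hard(prompt: str) -> bool:
    return len(prompt.split()) > 50 or "prove" in prompt.lower()


# Stub backends; replace with a self-hosted client and a hosted API client.
def cheap_stub(prompt: str) -> str:
    return "cheap:" + prompt


def frontier_stub(prompt: str) -> str:
    return "frontier:" + prompt


route = make_router(cheap_stub, frontier_stub, looks_hard)
easy_answer = route("What are your opening hours?")
hard_answer = route("Prove that the sum of two even numbers is even.")
```

Because callers only ever see `route`, swapping the frontier provider or moving more traffic to the cheap path is a change in one place.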

"Open" is not automatically cheaper, safer, or more private once you account for the cost and risk of operating inference yourself. The honest comparison is total cost and total risk of ownership at your volume and compliance posture — not list price per million tokens.