A distributed Inference Network running frontier open-source models on consumer GPUs. Permissionless, verifiable, paid per token.
Every dollar of inference revenue market-buys $UOMI on-chain. The bought tokens then split: 80% to the GPU that served the request, 20% to the burn address.
Every dollar spent by autonomous AI agents — Open Claw, Hermes, UOMI Agents — and aggregators like OpenRouter is used to market-buy $UOMI on-chain. No treasury cut. All revenue becomes buy pressure on the token.
Of the $UOMI just bought lands directly in the wallet of the GPU that served the request. Paid in $UOMI, on-chain, instantly.
Of the $UOMI just bought is sent to the burn address. Permanent supply reduction, every burn tx is publicly verifiable on-chain.
Three steps from API call to settlement. Verifiable end-to-end.
Apps query UOMI directly or via aggregators — Llama, Qwen, DeepSeek, Mixtral — whatever the open-weights frontier looks like today. OpenAI-compatible. Drop-in.
The job lands on the fastest free GPU in the mesh. Output is verified by Optimistic Proof of Computation and Deterministic Indeterminism before payment clears.
Revenue market-buys $UOMI on-chain, 100% of it. 80% to the provider's wallet, 20% to the burn address. Same block, every block, publicly auditable.
Pay-for-work only works if the work is verifiable. The integrity guarantees are inherited from Optimistic Proof of Computation and Deterministic Indeterminism, peer-reviewed algorithms published in 2025 and running unmodified across three UOMI testnets for over a year.
When a node serves a request, it returns the generated tokens with the top-k log-probabilities the model assigned to those tokens. This is the inference's fingerprint.
A second node runs the worker's claimed input and output through the same model architecture and recomputes the log-probabilities. Re-scoring an existing answer is reproducible.
Two honest GPUs produce log-probs that differ by a tiny floating-point delta. The protocol accepts that delta. A worker running a smaller model or no model at all diverges far beyond it, and is rejected.
Learn more about Deterministic IndeterminismThe network doesn't re-validate every inference. Validators are sampled randomly, so cheaters never know which call gets checked. Combined with slashing, the expected cost of cheating exceeds any gain.
Learn more about OPoCDeterministic Indeterminism catches a node that lies about its output. Encrypted binaries make sure the node can't lie about its execution either, and can't quietly read what passes through.
The node proves it's running the official build on every cycle, not just at launch.
Operators can't enable logs, can't read request payloads in cleartext, and can't exfiltrate user data.
Output-level checks and execution-level checks catch different attacker capabilities. Both have to fail for fraud to land.
Pick the side of the market that fits — UOMI rewards all three.
Swap one base URL and you're done. Stream, function-call, JSON mode. Pay per token, in dollars or $UOMI.
RTX 4090 or better. One-line installer. The moment a request hits your node, 80% lands in your wallet.
The Scarcity Engine. Every served token converts external revenue into $UOMI buy pressure, visible on-chain, hourly.
No cold starts, no model-loading tax. The biggest open-source models stay hot across the GPU mesh.
A single base-URL swap. Keep your SDK. Keep your prompts. Lose the closed-API tax.
from openai import OpenAI
# UOMI Router is fully OpenAI-compatible
client = OpenAI(
base_url="https://gateway.uomi.ai/v1",
api_key="sk-uomi-...",
)
# Streaming chat completion
stream = client.chat.completions.create(
model="Qwen/Qwen3.6-27B-FP8",
messages=[
{"role": "system",
"content": "You are a helpful assistant."},
{"role": "user",
"content": "Explain decentralized inference."},
],
max_tokens=512,
temperature=0.7,
stream=True,
)
for chunk in stream:
delta = chunk.choices[0].delta.content
if delta:
print(delta, end="", flush=True)
# 80% of this call paid the GPU that served it.
# 20% just bought back $UOMI on-chain.Pick a side of the market. Inference today, paid today.