What your GPU could earn.

Every row is a model deployment. Every column is a GPU cluster configuration. The number inside is the upper bound of what one cluster of that shape can earn per day serving that model on UOMI.

The full $/cluster/day matrix

Pick a model. Pick a GPU.

How to read. Each cell shows the maximum daily revenue one cluster of that GPU configuration can earn serving the model on UOMI. Background color encodes the value on a log scale: pale gold = low, deep red = high, green = exceptional. A marker flags each row's best configuration.

Color scale: $0 to $280+. Marker: best config for that model.
All values are upper bounds ("up to"), in $/cluster/day.

| Model | A: 2×4090 | B: 4×4090 | C: 2×5090 | D: 4×5090 | E: 2×L40s | F: 4×L40s | G: 1×Pro6K | H: 2×Pro6K |
|---|---|---|---|---|---|---|---|---|
| Qwen3.6 27B | $49.7 | $74.5 | $74.5 | $149 | $49.7 | $74.5 | $74.5 | $149 |
| Gemma 4 31B | $15.2 | $22.8 | $25.8 | $36.9 | $13.6 | $22.8 | $25.8 | $45.5 |
| Gemma 4 31B | $17.6 | $29.3 | $29.3 | $44 | $17.6 | $29.3 | $29.3 | $44 |
| Gemma 4 26B A4B | $34 | $34 | $34 | $34 | $34 | $34 | $34 | $34 |
| Qwen3.5-9B | $10.5 | $21 | $21 | $21 | $10.5 | $21 | $21 | $21 |
| Qwen3.6 35B A3B | $90.3 | $136 | $136 | $136 | $67.8 | $136 | $136 | $271 |
| Qwen3.5-35B-A3B | $79 | $119 | $119 | $237 | $79 | $119 | $119 | $237 |
| Llama 3.3 70B | $2.36 | $13.7 | $3.97 | $21.6 | $8.39 | $13.7 | $15.1 | $25.2 |
| Gemma 4 26B A4B | $48 | $48 | $96 | $96 | $48 | $48 | $96 | $96 |
| gpt-oss-120b | $4.75 | $9.5 | $6.33 | $9.5 | $9.5 | $19 | $9.5 | $19 |

Deployments with values for only some configurations (which columns they belong to is not shown):

GLM 5: up to $39.7
MiniMax M2.5: up to $21.4, up to $37.5
DeepSeek V4 Flash: up to $10.7, up to $21.3
MiniMax M2.5: up to $21.3, up to $35.5
DeepSeek V4 Flash: up to $8.86, up to $17.7

Why 2× RTX Pro 6000 dominates. With 96 GB on a single GPU, a 1T-parameter MoE that needs 28 clusters per instance on 2×4090 needs only 1, so the interconnect penalty stops compounding. The larger the model, the more larger GPUs pay off.

Caveats. Throughput estimates are batched aggregates at FP8, with sublinear scaling for multi-cluster sharding: interconnect_factor = 1/√(1 + K·(N−1)), where N is the number of clusters per instance and K = 3.0. That K is calibrated for public-internet latency (~50 ms RTT, ~500 Mbps effective) and cross-region operator distribution. A provider running entirely within one cloud region with private peering would see closer to K = 0.5. Single-cluster numbers (N = 1) are unaffected by the factor and are the most reliable.
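
To make the scaling penalty concrete, here is a minimal Python sketch of the interconnect factor quoted above, treating the 3.0 coefficient as K. The function name and the sample values of N are illustrative assumptions. How UOMI applies the factor to throughput or revenue is not specified here, so the sketch computes only the factor itself.

```python
import math


def interconnect_factor(n_clusters: int, k: float = 3.0) -> float:
    """Return 1 / sqrt(1 + K * (N - 1)) for a model sharded over N clusters.

    k=3.0 is the cross-region value quoted in the caveats;
    k=0.5 is the quoted single-region, private-peering value.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    return 1.0 / math.sqrt(1.0 + k * (n_clusters - 1))


if __name__ == "__main__":
    # Compare the two quoted K values for a few illustrative cluster counts.
    for n in (1, 2, 4, 28):
        cross_region = interconnect_factor(n, k=3.0)
        same_region = interconnect_factor(n, k=0.5)
        print(f"N={n:>2}  K=3.0: {cross_region:.3f}  K=0.5: {same_region:.3f}")
```

With N = 1 the factor is exactly 1 for any K, which is why single-cluster figures carry no interconnect penalty. With N = 2 and K = 3.0 it is already 0.5.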

Plug in. Get paid.

80% of every dollar earned lands directly in the wallet of the GPU that served the request. The other 20% buys back $UOMI on the open market.
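
The split works out as simple arithmetic. The sketch below assumes the input is gross revenue before the split; the function name is hypothetical, and whether the matrix figures are gross or already net of the 20% is not stated on this page.

```python
def split_revenue(gross_usd: float, provider_share: float = 0.80) -> tuple[float, float]:
    """Split gross revenue into (provider payout, $UOMI buyback).

    provider_share=0.80 reflects the 80/20 split described above.
    The gross_usd input is an assumption: it stands for revenue
    before any split is applied.
    """
    if gross_usd < 0:
        raise ValueError("gross_usd must be non-negative")
    provider = gross_usd * provider_share
    buyback = gross_usd - provider
    return provider, buyback


if __name__ == "__main__":
    provider, buyback = split_revenue(100.0)
    print(f"provider: ${provider:.2f}  buyback: ${buyback:.2f}")
```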