Everyone benchmarks the RTX 5090 for games. Almost nobody pushes it on AI. So when developer-YouTuber Alex Ziskind lined up a regular RTX 5090, a 1,000-watt "Lightning" 5090, and NVIDIA's RTX Pro 6000 and ran them all through local-LLM and video-generation workloads, the results were a useful reality check for anyone about to spend serious money on a desktop AI rig.
The three contenders
The test bench had a regular RTX 5090 (32 GB VRAM, 600 W cap), a 5090 "Lightning" (liquid-cooled, limited edition, a wild 1,000 W cap), and the RTX Pro 6000 Blackwell — the workstation card with 96 GB of VRAM (but, notably, also capped at 600 W).
What the benchmarks actually showed
A few findings that matter if you're buying for AI, not frame rates:
- For prompt processing (prefill), the Pro 6000 is in another league — roughly twice as fast as either 5090.
- But for chatting (token generation), they're basically tied: ~160 tok/s across the board, and the pricey Pro 6000 was actually the slowest by a hair. So if you just talk to models, the expensive card buys you nothing.
- VRAM is the real story. A 14B model in full BF16 needs ~28 GB for the weights alone (14 billion parameters at 2 bytes each), so it effectively won't fit on a 32 GB 5090 once you add runtime overhead and the KV cache. Both 5090s collapsed under concurrency, while the 96 GB Pro 6000 didn't flinch (12,000+ tok/s). That headroom is what you're really paying for.
- The Lightning's 1,000 watts is mostly theater for AI. Across every LLM test it never pulled more than ~695 W (inference is memory-bound); video generation topped out near 650 W. Only a pure matrix-multiply torture test actually hit 1,000 W. The upside: it runs ~20° cooler and quieter than a stock 5090.
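
If you want to sanity-check the VRAM math for your own model before buying, here's a rough back-of-the-envelope sketch. It only counts weights plus a flat runtime overhead; the 2 GB `overhead_gb` default is a placeholder assumption, not a number from the video, and the function names are made up for illustration. It also treats "GB" loosely and ignores that KV cache grows with context length and concurrent requests.

```python
def weight_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate weight footprint in GB. BF16 stores 2 bytes per parameter."""
    return params_billion * bytes_per_param


def headroom_gb(
    vram_gb: float,
    params_billion: float,
    bytes_per_param: float = 2.0,
    overhead_gb: float = 2.0,  # assumed placeholder for runtime overhead
) -> float:
    """VRAM left for KV cache after weights and a fixed runtime overhead."""
    return vram_gb - weight_gb(params_billion, bytes_per_param) - overhead_gb


if __name__ == "__main__":
    # A 14B model in BF16 on the two VRAM sizes from the test bench.
    for name, vram in [("RTX 5090", 32), ("RTX Pro 6000", 96)]:
        left = headroom_gb(vram, params_billion=14)
        print(f"{name}: {left:.0f} GB left for KV cache")
```

Under those assumptions the 32 GB card ends up with about 2 GB to spare and the 96 GB card with about 66 GB, which lines up with why concurrency is where the 5090s fell over. Swap in `bytes_per_param=1.0` or `0.5` to see what 8-bit or 4-bit quantization would free up.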

What viewers are saying
- "Been wanting to see this exact setup comparison — thank you again!" — @JustinFYI
- "The RTX Pro 6000 runs games just fine as well, you know. Thanks for the video, Alex." — @b1lleman
- "Excellent video, as always." — @accuratecalcs
The bottom line
If you're training models or running sustained, pure-compute workloads (and want a card that stays cool and quiet), the 1,000 W RTX 5090 Lightning earns its keep. But for most local-AI work — inference and image/video generation — that extra wattage sits unused, and the thing that actually unlocks bigger models is raw VRAM. That's why, for AI-only builds, the 96 GB RTX Pro 6000 keeps winning despite being the "slower" card on paper. Watch Alex's full breakdown above for the numbers.