Definition

CUDA is NVIDIA’s free-to-use (but proprietary) software platform that lets ordinary programs use the thousands of cores inside an NVIDIA graphics card to run heavy math far faster.

At a glance

CUDA (Compute Unified Device Architecture) turns a graphics chip into a general number-crunching engine^[2].
A CPU runs a handful of tasks very fast; a GPU with CUDA runs thousands in parallel, which suits AI workloads^[1].
It runs only on NVIDIA hardware, so using it ties you to NVIDIA.
Nearly 20 years of CUDA libraries create high switching costs, the heart of NVIDIA’s moat.

How it works

NVIDIA introduced CUDA in 2006 as a free software layer. Programmers write code in familiar languages (C++ directly, or Python through CUDA-backed libraries) and send the heavy, parallel parts to the graphics card instead of the main processor. The card’s parallel power, once used only to draw images, now handles any math that splits into many independent pieces, such as training an AI model.
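
To make the "thousands at once" idea concrete, here is a minimal sketch in plain Python. It does not use a GPU or any CUDA library; it only imitates the pattern CUDA programs follow, where one small function (a "kernel") is written for a single data element and then launched once per element. The names add_kernel and launch are hypothetical and chosen for illustration only.

```python
# CPU-only imitation of the CUDA "one thread per element" pattern.
# No GPU or CUDA library is involved; add_kernel and launch are
# hypothetical names used purely for illustration.

def add_kernel(thread_id, a, b, out):
    """Work done by ONE thread: add a single pair of numbers."""
    if thread_id < len(out):  # guard: surplus threads do nothing
        out[thread_id] = a[thread_id] + b[thread_id]


def launch(kernel, num_blocks, threads_per_block, *args):
    """Stand-in for a kernel launch: call the kernel once per thread.

    A real GPU would run these calls in parallel across its cores.
    This loop runs them one after another, which is the CPU way.
    """
    for block in range(num_blocks):
        for thread in range(threads_per_block):
            kernel(block * threads_per_block + thread, *args)


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(a)

threads_per_block = 4
# Round up so every element gets a thread (5 elements -> 2 blocks of 4).
num_blocks = (len(a) + threads_per_block - 1) // threads_per_block

launch(add_kernel, num_blocks, threads_per_block, a, b, out)
print(out)  # [11.0, 22.0, 33.0, 44.0, 55.0]
```

On a real card the loop inside launch disappears: the hardware runs every thread's copy of the kernel at the same time, which is where the speedup for large arrays comes from.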

Why it matters

If your business touches AI, analytics, video, or scientific computing, it likely runs on NVIDIA through CUDA. Most AI tools (PyTorch, TensorFlow) are tuned for it, so adopting them usually means committing to NVIDIA, which concentrates cost and supplier risk in one vendor^[3].

The moat in numbers

In fiscal 2025, NVIDIA’s data center sales hit roughly $115 billion, about 88% of its revenue, and the company held an estimated 80% share of the AI accelerator market^[4]. Rivals exist (Google TPUs, AMD MI300X), but the cost of rewriting CUDA-tuned systems keeps most customers locked in.
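
As a rough sanity check on those figures, the short calculation below backs out the total revenue implied by the two rounded numbers in the text. The inputs are the article's rounded values, not reported financials, so the results are approximations only.

```python
# Back-of-envelope check using the rounded figures quoted in the text.
data_center_sales = 115e9   # roughly $115 billion (rounded, from the text)
data_center_share = 0.88    # about 88% of revenue (rounded, from the text)

# If data center sales are 88% of the total, the total is sales / 0.88.
implied_total_revenue = data_center_sales / data_center_share
everything_else = implied_total_revenue - data_center_sales

print(f"Implied total revenue: ${implied_total_revenue / 1e9:.1f}B")
print(f"Implied non-data-center revenue: ${everything_else / 1e9:.1f}B")
```

In other words, all of NVIDIA's other business lines combined would come to roughly one eighth of the company's sales on these numbers.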

Bottom line

Betting on AI today usually means betting on CUDA, and that means betting on NVIDIA.

References