Apple's MLX adding CUDA support

(github.com)

Comments

lukev 15 July 2025
So to make sure I understand, this would mean:

1. Programs built against MLX -> Can take advantage of CUDA-enabled chips

but not:

2. CUDA programs -> Can now run on Apple Silicon.

Because #2 would be a copyright violation (specifically with respect to Nvidia's famous moat).

Is this correct?

nxobject 14 July 2025
If you're going "wait, no Apple platform has first-party CUDA support!", note that this set of patches also adds support for "Linux [platforms] with CUDA 12 and SM 7.0 (Volta) and up".

https://ml-explore.github.io/mlx/build/html/install.html

paulirish 15 July 2025
It's coming from zcbenz, who created Electron, among other projects: https://zcbenz.com/ Nice.

zdw 14 July 2025
How does this work when one of the key features of MLX is using a unified memory architecture? (see bullets on repo readme: https://github.com/ml-explore/mlx )

I would think that bringing that to all UMA APUs (of any vendor) would be interesting, but discrete GPUs would definitely need a different approach?

edit: reading the PR comments, it appears that CUDA supports a UMA API directly, and will transparently copy as needed.
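
For reference, the unified-memory model the backend has to preserve looks roughly like this in the standard mlx.core Python API (how the CUDA backend maps it onto managed memory is an implementation detail of the PR):

    import mlx.core as mx

    a = mx.random.normal((256, 256))
    b = mx.random.normal((256, 256))

    # No .to(device) calls: the same arrays are visible to every device.
    # The target device is chosen per operation via the stream argument.
    c_cpu = mx.add(a, b, stream=mx.cpu)
    c_gpu = a @ b  # runs on the default device (the GPU, when one is available)

    mx.eval(c_cpu, c_gpu)  # MLX is lazy; force both computations to run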

MuffinFlavored 14 July 2025
Is this for Macs with NVIDIA cards in them, or for Apple Metal/Apple Silicon speaking CUDA? I can't really tell.

Edit: looks like it's "write once, use everywhere": write MLX, run it on Linux/CUDA and on Apple Silicon/Metal.
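
Something like the following should then run unchanged on either backend, assuming the CUDA build keeps the standard mlx.core API (a sketch, not code from the PR):

    import mlx.core as mx

    # The same source targets the Metal backend on Apple Silicon and the
    # CUDA backend on a Linux + NVIDIA box; nothing backend-specific here.
    print(mx.default_device())   # e.g. Device(gpu, 0)

    x = mx.random.normal((1024, 1024))
    y = (x @ x.T).sum()
    mx.eval(y)
    print(y.item())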

neurostimulant 15 July 2025
> Being able to write/test code locally on a Mac and then deploy to super computers would make a good developer experience.

Does this mean you can use MLX on Linux now?

Edit:

Just tested it and it's working, but only the Python 3.12 version is available on PyPI right now: https://pypi.org/project/mlx-cuda/#files
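
For anyone else trying it: after installing that wheel (Python 3.12 only for now), a minimal smoke test along these lines should work, assuming the CUDA wheel still exposes the usual mlx.core module:

    import mlx.core as mx

    # Quick sanity check that the CUDA backend is picked up.
    print(mx.default_device())     # expect a GPU device if CUDA is active

    a = mx.arange(8)
    print(mx.sum(a * a).item())    # 0^2 + 1^2 + ... + 7^2 = 140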

numpad0 15 July 2025
> This PR is an ongoing effort to add a CUDA backend to MLX

Looks like it allows MLX code to compile and run on x86 + GeForce hardware, not the other way around.

dnchdnd 15 July 2025
Random aside: a lot of the people working on MLX don't seem to be officially affiliated with Apple, at least from a superficial review. See for example: https://x.com/prince_canuma

Idly wondering: is Apple bankrolling this but wanting to keep it on the DL? There were also rumours the team was looking to move at one point?

mattfrommars 15 July 2025
It's 2025 and we have yet to see CUDA have the kind of "write once, run anywhere" impact that Java had.

Academia and companies continue to write proprietary code. It's as if we were still writing code for Adobe Flash or Microsoft Silverlight in 2025.

Honestly, I don't mind as an Nvidia shareholder.

benreesman 15 July 2025
I wonder how much this is a result of Strix Halo. I had a fairly standard stipend for a work computer that I didn't end up using for a while, so I recently cashed it in on the EVO-X2 and, fuck me sideways, that thing is easily competitive with the mid-range znver5 EPYC machines I run substitors on. It mops the floor with any mere-mortal EC2 or GCE instance; maybe some r1337.xxxxlarge.metal.metal or something has an edge, but it blows away the z1d.metal and c6.2xlarge or whatever type of stuff (fast cores, good NIC, table stakes). And those things are 3-10K a month with heavy provisioned IOPS. This thing has real NVMe and it cost 1800.

I haven't done much local inference on it, but various YouTubers are starting to call the DGX Spark overkill / overpriced next to Strix Halo. The catch, of course, is that ROCm isn't there yet (they seem serious about it now, though; matter of time).

Flawless CUDA on Apple gear would make it really tempting in a way that isn't true today, with Strix so cheap and good.

albertzeyer 14 July 2025
This is exciting. So this is using CUDA's unified memory? I wonder how well that works. Is the behavior of unified memory in CUDA actually the same as on Apple Silicon? On Apple Silicon, as I understand it, the memory is shared between GPU and CPU anyway, but with CUDA this is not the case. So when you have some tensor on the CPU, how does it end up on the GPU? That needs a copy somehow. Or is this all hidden by CUDA?
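
My rough mental model: CUDA's managed ("unified") memory gives a single allocation that both host and device can touch, and the driver migrates pages on demand, so the copy is hidden rather than absent. A sketch of that behaviour using Numba, purely to illustrate the CUDA feature (nothing MLX-specific):

    import numpy as np
    from numba import cuda

    # One allocation visible to both CPU and GPU; the driver migrates
    # pages on demand instead of requiring explicit cudaMemcpy calls.
    x = cuda.managed_array(1 << 20, dtype=np.float32)
    x[:] = 1.0                    # written by the CPU

    @cuda.jit
    def scale(arr, factor):
        i = cuda.grid(1)
        if i < arr.size:
            arr[i] *= factor

    scale[1024, 1024](x, 3.0)     # the GPU operates on the same allocation
    cuda.synchronize()
    print(x[:4])                  # the CPU sees the result, no manual copy
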
sciencesama 15 July 2025
Apple is planning to build data centers with M-series chips for app development, testing, and hosting external services!

Abishek_Muthian 15 July 2025
I’ve been very impressed with MLX models; I can open up local models to everyone in the house, something I wouldn’t dare with my Nvidia computer for the risk of burning down the house.

I’ve been hoping Apple Silicon becomes a serious contender for Nvidia chips; I wonder if the CUDA support is just Embrace, extend, and extinguish (EEE).

qwertox 15 July 2025
If Apple supported Nvidia cards, it would be the #1 solution for developers.

teaearlgraycold 14 July 2025
I wonder if Jensen is scared. If this opens the door to other implementations, this could be a real threat to Nvidia. CUDA on AMD, CUDA on Intel, etc. Might we see actual competition?

neuroelectron 15 July 2025
Just remember to name the fp8 kernels "cutlass" for +50% performance.

orliesaurus 15 July 2025
Why is this a big deal? Can anyone explain, if they're familiar with the space?

m3kw9 15 July 2025
I thought you either use MLX for Apple silicon or you compile it for CUDA.

adultSwim 15 July 2025
This is great to see. I had wrongly assumed MLX was Apple-only.

Keyframe 14 July 2025
Now do Linux support / drivers for Mac hardware!

gsibble 14 July 2025
Awesome

natas 15 July 2025
That means the next Apple computer is going to use Nvidia GPU(s).

nerdsniper 14 July 2025
Edit: I had the details of the Google v Oracle case wrong. SCOTUS found that re-implementing an API does not infringe copyright. I was remembering the first and second appellate rulings.

Also apparently this is not a re-implementation of CUDA.