A developer recently stacked 8 Mac Minis into a "mini data center" to run large language models locally.
Is it really using the Mac's GPU cores?
And is there any stable way to actually use Apple GPUs for neural-network training and inference? (A sketch follows below.)
5 tokens per second is … not too bad
The cost is only about 10% of comparable hardware, or even less.
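On the "stable way to use Apple GPUs" question: Apple's MLX framework targets the unified-memory GPU on Apple Silicon, and PyTorch exposes the same hardware through its MPS backend. Below is a minimal MLX training-loop sketch; the tiny MLP, synthetic data, and hyperparameters are placeholders for illustration, not anything from the 8-Mac-Mini setup described above.

```python
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim


class MLP(nn.Module):
    """Tiny two-layer network, just to exercise the GPU path."""

    def __init__(self, in_dims: int, hidden: int, out_dims: int):
        super().__init__()
        self.fc1 = nn.Linear(in_dims, hidden)
        self.fc2 = nn.Linear(hidden, out_dims)

    def __call__(self, x):
        return self.fc2(nn.relu(self.fc1(x)))


def loss_fn(model, X, y):
    return nn.losses.mse_loss(model(X), y)


model = MLP(16, 64, 1)
mx.eval(model.parameters())  # materialize lazily-initialized parameters

optimizer = optim.SGD(learning_rate=1e-2)
loss_and_grad = nn.value_and_grad(model, loss_fn)

# Synthetic data; MLX arrays live in unified memory and ops run on the
# Apple GPU by default on Apple Silicon.
X = mx.random.normal((256, 16))
y = mx.random.normal((256, 1))

for step in range(200):
    loss, grads = loss_and_grad(model, X, y)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)  # force lazy graph to evaluate

print(f"final loss: {loss.item():.4f}")
```

For LLM inference specifically, llama.cpp's Metal backend and PyTorch's MPS device are other commonly used routes on Apple GPUs; throughput figures like the 5 tokens per second quoted above depend heavily on model size and quantization.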