

It’s definitely chiplet, as the article points out. You can see the seams.


The question is, why would anyone engage with this instead of immediately reporting it as bot spam?
I get that people are not tech literate, which is fine, but I don’t get how they are so gullible/undiscerning online.


100%
This is some engagement farm. Or maybe a future spammer bot building up a profile.


Billionaires seem to have an… unscientific view of a sci-fi future. Especially Musk, since he thinks he’s so transcendental, but apparently Bezos can’t help himself now either.
It doesn’t look like Star Trek.
It doesn’t look like a Cyberpunk movie.
I’d recommend diving into this for a more scientifically ‘thought out’ and optimistic extrapolation: https://www.orionsarm.com/
Interestingly, this is a neat idea waaay down the line, in the way a Dyson Swarm is interesting. But not anytime in the near future, not until humanity is very, very different (assuming we survive that long).


It’s ironic how conservative the spending actually is.
Awesome ML papers and ideas come out every week. Low-power training/inference optimizations, fundamental changes in the math like bitnet (rough sketch below), new attention mechanisms, cool tools to make models more controllable and steerable and grounded. This is all getting funded, right?
No.
Universities and such are seeding and putting out all this research, but the big model trainers holding the purse strings/GPU clusters are not using it. They just keep releasing very similar, mostly bog-standard transformer models over and over again, bar a tiny expense for a little experiment here and there. In other words, it’s full corporate: tiny, guaranteed incremental improvements without changing much, and no sharing with each other. It’s hilariously inefficient. And it relies on lies and jawboning from people like Sam Altman.
Deepseek is what happens when a company is smart but resource constrained. An order of magnitude more efficient, and even their architecture was very conservative.
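For the curious, the bitnet idea mentioned above boils down to something like this (a rough sketch of BitNet b1.58-style absmean weight quantization; the function is my own illustration, not the paper’s reference code):

    import torch

    def ternary_quantize(w: torch.Tensor):
        # Scale by the mean absolute value of the weight matrix...
        scale = w.abs().mean().clamp(min=1e-5)
        # ...then round every weight to {-1, 0, +1}.
        w_q = (w / scale).round().clamp(-1, 1)
        # Matmuls against w_q reduce to additions/subtractions; the output
        # gets rescaled by `scale` afterwards.
        return w_q, scale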
Kinda odd. 8 GPUs to a CPU is pretty much standard, and less ‘wasteful,’ as the CPU ideally shouldn’t do much for ML workloads.
Even wasted CPU aside, you generally want 8 GPUs to a pod for inference, so you can batch a model as much as possible without physically going ‘outside’ the server. It makes me wonder if they just can’t put as much PCIe/NVLink on it as AMD can?
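For concreteness, this is roughly what that looks like from the software side (a minimal sketch assuming vLLM; the model name and prompts are placeholders):

    from vllm import LLM, SamplingParams

    # tensor_parallel_size=8 shards the model across the 8 GPUs in the pod, so
    # every decode step stays on the local NVLink/PCIe fabric instead of
    # crossing over to another server.
    llm = LLM(model="meta-llama/Llama-3.1-70B-Instruct", tensor_parallel_size=8)

    # Big batches amortize the weight streaming per step across many requests.
    prompts = [f"Example prompt {i}" for i in range(256)]
    outputs = llm.generate(prompts, SamplingParams(max_tokens=64))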
LPCAMM is sick though. So is the sheer compactness of this thing; I bet HPC folks will love it.