this post was submitted on 11 Nov 2024
TechTakes
to the computing side, and with the proviso that in my own estimation of my skills I am at best slightly less than "dangerously clueless": unfortunately not as much as may be desired because the kind of chips being added are fairly specialised silicon
it's not impossible that people may find other uses for it over time but to the best of my knowledge as it stands right now much of this shit is dead weight the moment this bubble pops
(I don't think it will all go entirely away; there are some ML uses that are not complete trash. but that's a long different arc)
I'm not sure I follow the skeu side of your comment?
that's exactly the catch I was hoping wouldn't be the case. When the AI shit is abandoned, is the hardware useful for regular stuff...
So, from what you're saying: Generative AI is fucking up in the past, present, and future
broad brush strokes, yes largely that
there's some extremely fucking interesting details in the weeds, but that's beyond the scope of merely a comment (and also I don't feel equipped to make a goodpost about it as yet)
My baseline understanding is that "NPUs," as such, are vector accelerators with perhaps lower precision and definitely lower peak TDP. I say this because much of the incremental ML research I've skimmed over seems to be around getting away with lower precision, dropping down to FP8 or even FP4 from FP16 when they can get away with it.
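To put rough numbers on the "getting away with lower precision" bit, here's a minimal sketch that rounds values to the mantissa width of each format and compares a small dot product. It assumes 10 explicit mantissa bits for FP16, 3 for the E4M3 flavour of FP8 and 1 for the E2M1 flavour of FP4, and it ignores exponent range, subnormals and saturation entirely. The helper names (`round_to_mantissa`, `quantized_dot`) and the sample vectors are made up for illustration, not taken from any NPU toolchain.

```python
import math

# Explicit mantissa bits per format. Exponent range, subnormals and
# saturation are deliberately ignored (assumption of this sketch).
MANTISSA_BITS = {"FP16": 10, "FP8 (E4M3)": 3, "FP4 (E2M1)": 1}


def round_to_mantissa(x: float, bits: int) -> float:
    """Round x to the nearest value with `bits` explicit mantissa bits."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)       # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (bits + 1)    # implicit leading bit plus explicit bits
    return math.ldexp(round(m * scale) / scale, e)


def quantized_dot(weights, activations, bits):
    """Dot product with inputs rounded first; accumulation stays in float64."""
    return sum(
        round_to_mantissa(w, bits) * round_to_mantissa(a, bits)
        for w, a in zip(weights, activations)
    )


if __name__ == "__main__":
    # Hypothetical sample data, chosen only to show the effect.
    weights = [0.3, -1.1, 0.72, 2.4]
    activations = [1.7, 0.45, -0.9, 0.13]
    exact = sum(w * a for w, a in zip(weights, activations))
    for name, bits in MANTISSA_BITS.items():
        approx = quantized_dot(weights, activations, bits)
        print(f"{name}: {approx:.4f} (exact {exact:.4f})")
```

Each mantissa bit dropped roughly doubles the worst-case relative rounding error per value, which is why the research keeps asking how low it can go before the model output degrades.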
I'm still confused as to why and how this is an acceptable tradeoff compared to firing up an iGPU with precise power/TDP stepping. Perhaps it's one of those situations where the power budget and latency to fire up the whole GPU block or burst it to max power ends up costing as much as the actual calculation. I think for purposes of this discussion, we also need a source that sheds light on the architectural differences between NPUs and GPU shader/execution units.
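The wake-up-cost guess can be written down as a toy break-even model: each block pays a fixed energy cost to power up, then a per-operation cost while running. Everything here is hypothetical, including the `ComputeBlock` type and every number in the example. None of it is measured from real NPU or iGPU hardware, so treat it as a way to frame the question rather than an answer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComputeBlock:
    """Toy model of a compute block; all fields are illustrative assumptions."""
    wake_j: float      # fixed energy to power the block up, in joules
    active_w: float    # power draw while computing, in watts
    ops_per_s: float   # sustained throughput, in operations per second

    def energy_j(self, ops: float) -> float:
        """Total energy for a job: wake-up cost plus active power times runtime."""
        return self.wake_j + self.active_w * ops / self.ops_per_s


def break_even_ops(a: ComputeBlock, b: ComputeBlock) -> Optional[float]:
    """Job size at which a and b use equal energy, or None if there is none."""
    per_op_a = a.active_w / a.ops_per_s
    per_op_b = b.active_w / b.ops_per_s
    if per_op_a == per_op_b:
        return None
    ops = (b.wake_j - a.wake_j) / (per_op_a - per_op_b)
    return ops if ops > 0 else None


if __name__ == "__main__":
    # Made-up numbers: cheap-to-wake NPU vs. an iGPU that costs more to
    # wake but is slightly more efficient per operation once running.
    npu = ComputeBlock(wake_j=0.001, active_w=2.0, ops_per_s=1e12)
    igpu = ComputeBlock(wake_j=0.05, active_w=15.0, ops_per_s=1e13)
    print("break-even job size (ops):", break_even_ops(npu, igpu))
```

With these invented figures the NPU wins for every job smaller than the break-even size, which would fit the "small, frequent inference" use case. Real numbers could easily flip the picture.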