this post was submitted on 11 Nov 2024
TechTakes
My baseline understanding is that "NPUs," as such, are vector accelerators with perhaps lower precision and definitely a lower TDP. I say this because much of the incremental ML research I've skimmed seems to be about getting away with lower precision, dropping from FP16 to FP8 or even FP4 wherever the accuracy loss is tolerable.
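To put rough numbers on what "dropping precision" costs, here's a quick sketch. It only models the mantissa width (10 explicit bits for FP16, 3 for the E4M3 flavor of FP8, 1 for the E2M1 flavor of FP4) and ignores exponent range, subnormals, and saturation, so treat it as an illustration and not a faithful emulation of any hardware format. The function name `round_mantissa` is made up for this example.

```python
import math

def round_mantissa(x: float, bits: int) -> float:
    """Round x to `bits` explicit mantissa bits (plus the implicit leading 1).

    Simplification: exponent range is unlimited, so overflow, underflow and
    subnormals of real FP16/FP8/FP4 formats are NOT modeled.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)          # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (bits + 1)       # total significant bits = bits + 1
    return math.ldexp(round(m * scale) / scale, e)

# Explicit mantissa widths of the formats mentioned above.
formats = {"FP16": 10, "FP8 (E4M3)": 3, "FP4 (E2M1)": 1}

x = 0.1
for name, bits in formats.items():
    q = round_mantissa(x, bits)
    rel = abs(q - x) / abs(x)
    print(f"{name:<11} {x} -> {q}  (relative error {rel:.2%}, bound {2 ** -(bits + 1):.2%})")
```

The worst-case relative rounding error is 2^-(bits+1), so each step down the ladder (FP16 to FP8 to FP4) is a big jump in error, which is presumably why the research is about finding where a model tolerates it.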
I'm still confused as to why and how this is an acceptable tradeoff compared to firing up an iGPU with precise power/TDP stepping. Perhaps it's one of those situations where the power and latency cost of waking the whole GPU block, or bursting it to max power, ends up as large as the cost of the actual calculation. For the purposes of this discussion, I think we also need a source that sheds light on the architectural differences between NPUs and GPU shader/execution units.
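For what it's worth, the wake-up-cost guess can be written as a toy energy model: total energy is a fixed wake-up cost plus active power times compute time. Every number below is a made-up placeholder, not a measurement of any real iGPU or NPU; the point is only to show how a slower but cheaper-to-wake block could come out ahead on small jobs.

```python
def total_energy_mj(wake_energy_mj: float, active_power_mw: float,
                    compute_time_ms: float) -> float:
    """Energy for one inference burst in millijoules.

    mW * ms gives microjoules, so divide by 1000 to get millijoules.
    """
    return wake_energy_mj + active_power_mw * compute_time_ms / 1000.0

# HYPOTHETICAL figures, chosen only to illustrate the shape of the tradeoff.
igpu = total_energy_mj(wake_energy_mj=5.0, active_power_mw=3000.0, compute_time_ms=2.0)
npu = total_energy_mj(wake_energy_mj=0.5, active_power_mw=1000.0, compute_time_ms=4.0)

print(f"iGPU burst: {igpu} mJ")
print(f"NPU burst:  {npu} mJ")
```

With these placeholder inputs the NPU takes twice as long but uses less than half the energy, because the fixed wake-up cost dominates the iGPU's total. Whether real hardware behaves this way is exactly the part that needs a source.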