Technology



This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.


A jargon-free explanation of how AI large language models work (arstechnica.com)

submitted 2 years ago by Gaywallet@beehaw.org to c/technology@beehaw.org


[–] PlatinumPangolin@kbin.social 1 points 2 years ago (1 children)

The whole "we don't know how they work" thing is a bit overblown. We have all the formulas, and we know exactly how the math and code work. You can go and look at the weights for every node; you're just not going to derive any meaning from them, or necessarily be able to explain why one number works better than another.
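To make that concrete, here is a minimal sketch of a two-input, two-hidden-unit network. The weights are hypothetical numbers picked arbitrarily for illustration, and the `forward` helper is not taken from any real model or library. Every step of the math is visible, yet swapping the two hidden units moves every weight to a different position without changing the output, which is one reason a single number on its own does not carry much meaning.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Fully specified forward pass: weighted sums, tanh, weighted sum."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical weights, chosen arbitrarily for illustration.
w_hidden = [[0.8, -1.3], [-0.4, 2.1]]
b_hidden = [0.1, -0.7]
w_out = [1.5, -0.9]
b_out = 0.2

x = [0.5, -1.0]
y = forward(x, w_hidden, b_hidden, w_out, b_out)

# Swap the two hidden units: the weight lists now look different,
# but the network computes the same function.
y_swapped = forward(x, w_hidden[::-1], b_hidden[::-1], w_out[::-1], b_out)

print(y, y_swapped)
```

Both printed values agree, even though the two sets of weights look different when read entry by entry.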

[–] u_tamtam@programming.dev 1 points 2 years ago* (last edited 2 years ago)

This is the definition of complexity, isn't it? The fact here is that we can't scale up our understanding of the small-scale pieces to make sense of the bigger picture. Having worked with (much simpler) artificial neural networks myself, I think it's very much correct and to the point to say that "we don't know how it works". I would even go further and claim that we will never fully know how it works: the weights in the network in essence form structures that do what they do, and that we can recognize by analogy (e.g. logic gates, contour extractors, ...), but this is an anthropomorphic approximation which, moreover, only holds within a certain range of values or set of conditions. If we had a formal definition of what the weights represent, we would be dealing with a (much simpler and more efficient) algorithm in the traditional sense (with cleanly delineated and rigorously defined specialized functions).
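A toy illustration of the "analogy only works in a certain range" point, as a sketch under stated assumptions: the single threshold unit below uses hand-picked, hypothetical weights (`w1`, `w2`, `bias`) rather than anything learned. On binary inputs it looks exactly like an AND gate, but feed it a value outside {0, 1} and the label stops describing what it does.

```python
def neuron(x1, x2, w1=1.0, w2=1.0, bias=-1.5):
    """One threshold unit: outputs 1 when the weighted sum is positive."""
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0

# On binary inputs the unit matches the AND truth table.
and_table = {(a, b): neuron(a, b) for a in (0, 1) for b in (0, 1)}

# Outside that range the analogy breaks: the second input is 0,
# yet the unit still fires because the first input is large enough.
off_range = neuron(2.0, 0.0)

print(and_table)
print(off_range)
```

Calling this unit an "AND gate" is a description we impose for a restricted set of inputs; the weights themselves only define a threshold on a weighted sum.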