Inside An LLM
I’ve always been amazed by children.

They are sponges.

Give them something to learn and they learn it quickly.

Too quickly.

Psychologists call this plasticity.

A child can absorb sensory information, hold it together, and make sense of it almost immediately.

Learning doesn’t arrive one piece at a time. It happens in parallel.

Many impressions, held at once, until patterns begin to stand out on their own.

As we grow older, that plasticity fades.

We stop absorbing so easily.

We carry more. But we change less.

In 2017, a Google research paper helped ignite the current wave of AI.
Its title was simple: Attention Is All You Need.

The idea was not to hand-build understanding. Not to carefully specify every connection in advance.

Instead: turn experience into tokens, examine their relationships all at once, and let structure emerge.
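If you like code, here is a rough sketch of that "all at once" idea in Python with NumPy. It is only an illustration, not the paper's full architecture: the learned query, key, and value projections are left out, and the token vectors are random stand-ins for real embeddings.

```python
import numpy as np

def attention(tokens):
    """Simplified self-attention over a sequence of token vectors.

    Every token is compared against every other token in one matrix
    product, so all the relationships are examined at once rather than
    one pair at a time.
    """
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)        # pairwise similarity of all tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ tokens                        # each output mixes every input

# Five "tokens", each a 4-dimensional vector standing in for learned embeddings.
example = np.random.default_rng(0).normal(size=(5, 4))
print(attention(example).shape)  # (5, 4): one updated vector per token
```

No single connection is specified by hand; the weights fall out of the comparisons themselves.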
Up to that point, much of AI had tried to design intelligence explicitly.

Representations. Connections. Rules.

It worked. But slowly. At enormous cost.

The new proposal was different.

Just throw everything at it. Let the system figure it out.

In other words: teach the system the way a baby learns.

But the environments are not the same.

Children learn by being immersed in the world.
Large language models learn by being immersed in the internet.

One of these environments contains playgrounds, stories, and banged knees.

The other contains comment sections. At scale.

And then there is a hard boundary.

At some point, the learning must stop.

The figuring-out is frozen into place, for better or worse, so the system can be used.

An LLM may have learned a great deal.
But it has learned only what was present in its training.

This is what developers mean when they say a model is stateless.

It does not progress. It does not accumulate. It resets.

Each time you use it, you are meeting the same frozen system again.

It may be intelligent. But it cannot learn more than it already knows, except for what you place in the prompt.

And when the session ends, that too disappears.
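To make that concrete, here is a minimal sketch in Python. The `call_llm` function is a hypothetical stand-in for any LLM API, not a real client. The point is that the frozen model only ever sees the prompt it is handed; any "memory" is text the caller places back into that prompt, and it vanishes when the transcript is thrown away.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call: the frozen model sees only what is in `prompt`."""
    return f"(model reply to: {prompt!r})"

# Nothing carries over between calls. If you want the model to "remember",
# you have to put the earlier exchange back into the prompt yourself.
transcript = []

def chat(user_message: str) -> str:
    transcript.append(f"User: {user_message}")
    reply = call_llm("\n".join(transcript))   # memory = whatever you resend
    transcript.append(f"Assistant: {reply}")
    return reply

chat("My name is Roger.")
chat("What is my name?")   # only answerable because the transcript was resent

transcript.clear()          # the session ends: that context disappears too
```

The intelligence is in the frozen weights. The memory, such as it is, lives entirely in the prompt.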
This has become a quiet frustration for many users.

Because the question isn’t whether these systems are intelligent.

It’s whether intelligence without the ability to change is learning at all.

---

Also on Medium: https://medium.com/@roger_gale/where-mistakes-go-to-learn-51a82a6f1187

If you enjoyed this, I'm writing a series on AI limitations and learning.