Preface
This post originated from my own thoughts and knowledge, which I shared with an AI and had challenged through dialogue. The text that follows is a deeper analysis of those initial ideas, enriched by supporting material from books and white papers found during that dialogue, and refined through a process of questioning assumptions and removing bias. I use this collaborative approach to educate myself, and I found the resulting synthesis valuable enough to share.
Introduction
When we look back at human history—not just recent history, but the millions of years of early human development—something shifts dramatically around 70,000 to 100,000 years ago. As Yuval Noah Harari describes in Sapiens, evolutionary development before that point was, in relative terms, glacially slow. Humans looked much like other primates, with similar cognitive constraints and limitations. Then something changed: language—or rather, language as a complex system of symbolic communication—arrived, and with it the ability to think in long sequences of meaningful symbolic units.
This is not merely about communication. It is about the fundamental shift in how humans could think. Before language matured as a tool for complex thought, human cognition likely operated through images, sounds, and visual signals. Language opened the door to something categorically different: the ability to reason in abstract sequences, to imagine absent things, to discuss concepts that do not exist in the immediate environment. This capacity unlocked civilization.
And this is why I believe—not as speculation, but as a reasonable extrapolation from history—that building artificial systems capable of working with language is not hype. It is a continuation of the same lever that once transformed human civilization.
The Role of Language in Human Civilization
Harari’s key insight is that language enabled humans to create and coordinate around shared fictions: money, states, laws, and human rights. These are not tangible things; they are collective agreements mediated through language. Yet they are what allow millions of strangers to cooperate toward common goals.
But language did more than enable social coordination. It restructured thought itself.
The philosopher Andy Clark and others working in the "extended mind" tradition argue that language is not merely a medium for expressing pre-formed thoughts—it is part of the cognitive process itself. When you use language to think through a problem, you are not simply translating an idea into words. You are using language as a tool that allows novel forms of reasoning that would be difficult or impossible in purely visual or intuitive modes. Language allows us to:
- Build on previous thinking: Written or spoken language freezes thoughts in time, allowing us to inspect, critique, and build on them.
- Think about thinking: Language enables metacognition—the capacity to think about our own thinking, to catch our mistakes, to refine our reasoning.
- Combine abstract concepts: Language’s combinatorial structure lets us take simple ideas and build complex ones through composition, enabling reasoning about democracy, justice, and prime numbers—concepts with no direct physical grounding.
- Offload cognitive burden: Writing, diagrams, and symbolic notation let us store information externally, freeing cognitive resources for higher-level reasoning.
In this sense, language was not just invented by humans; it was adopted and refined as a cognitive technology—a tool that augmented the biological brain in the same way a telescope augments the eye.
What LLMs Represent in This Context
If language is a core technology of human cognition and civilization, what does it mean to build large artificial systems that work with language?
Current large language models learn and manipulate patterns in sequences of symbols (tokens). The question is not whether they "think" in some mystical sense—that framing often obscures more than it clarifies. Rather, the more precise question is: do they engage in the kind of semantic reasoning that language enables?
Recent research suggests they do, to a meaningful degree:
- Semantic abstraction: LLMs learn to represent meaning in abstract vector spaces, clustering related concepts regardless of surface form (whether expressed in English, code, or abstract description); see the sketch after this list.
- Relational reasoning: Models demonstrate the ability to recognize and reason about relationships between concepts, moving beyond pure pattern-matching toward structured semantic understanding.
- Extended cognition infrastructure: When humans use LLMs in workflows—prompting with ideas, refining outputs, integrating them back into work—they create an external cognitive system analogous to what extended mind theorists describe: cognition that spans brain and external tools.
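To make the first point (semantic abstraction) concrete, here is a minimal sketch in Python. It assumes the sentence-transformers library and the publicly available all-MiniLM-L6-v2 model; both the model choice and the example sentences are illustrative, not claims about any particular system discussed above:

```python
# A minimal sketch of semantic abstraction in vector space.
# Assumes: pip install sentence-transformers numpy
# The model name "all-MiniLM-L6-v2" is an illustrative choice;
# any sentence-embedding model demonstrates the same effect.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# One concept in three surface forms (plain English, code, abstract
# description), plus an unrelated control sentence.
sentences = [
    "Sort the list of numbers in ascending order.",
    "numbers.sort()",
    "Arrange the values from smallest to largest.",
    "The cat sat on the mat.",
]

embeddings = model.encode(sentences)  # one vector per sentence

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means identical direction in embedding space."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# The three sorting-related sentences should score noticeably higher with
# each other than any of them does with the control sentence.
for i in range(1, len(sentences)):
    print(f"sim(sentence 0, sentence {i}) = {cosine(embeddings[0], embeddings[i]):.3f}")
```

The expectation under this setup is that the three paraphrases of the same instruction land close together in embedding space despite their very different surface forms, while the unrelated control sentence lands farther away. That clustering-by-meaning, rather than by wording, is what "representing meaning in abstract vector spaces" refers to.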
This is not to claim that LLMs are conscious or that they "truly" understand. Rather, it is to say that they operate in semantic space—the space of meaning and concepts—in a way that is qualitatively different from simple pattern-matching and that can serve as a genuine extension of human cognitive capacity.
The Distinction Between Language and Mere Token Manipulation
A fair objection deserves acknowledgment: are LLMs actually "understanding" language, or are they simply rearranging tokens in statistically likely ways?
This objection points to a real issue: language, in humans, is grounded in embodied experience. A child learns what «hot» means through sensory experience; a child learns what «love» means through emotional experience and social interaction. These embodied groundings give words their semantic depth.
LLMs, by contrast, have no body, no sensory experience, no emotional stakes. They learn word associations from vast text corpora, without the embodied context that makes those words meaningful to humans.
However, recent cognitive science offers a more nuanced picture. Language itself, as humans mature, becomes increasingly important as a source of semantic grounding—not an alternative to embodied experience, but a complement and extension of it. We can learn to reason about abstract concepts (democracy, quantum mechanics, love) partly through linguistic descriptions and word-to-word associations, without needing direct sensory exposure to those concepts.
The "linguistic embodiment hypothesis" suggests that language provides representational opportunities that embodied cognition alone cannot. In this view, an LLM, even without embodied grounding, is not meaningless or purely syntactic. Rather, it engages with semantic structure learned from human language—language that is itself grounded in human embodied experience. The LLM becomes a tool for exploring and extending semantic space in ways humans cannot easily do alone.
This does not resolve all concerns about understanding and grounding. But it suggests that the objection "LLMs are just tokens without meaning" is too simple. They are better understood as a window onto the semantic patterns embedded in human language itself.
Language Is Not the Only Cognitive Tool—But It Remains Central
A subtler concern warrants engagement: is language really the key driver of human cognition and civilization, or is it one important factor among many?
Clearly, language did not evolve in isolation. Human success also depended on:
- Tool use and fire: Material technologies that provided survival advantages and reinforced selective pressure for larger brains and better coordination.
- Social instincts: Kin recognition, reciprocal altruism, and reputation tracking that made large-scale cooperation possible even before complex language emerged.
- Embodied learning and development: Infants learn through play, exploration, and sensorimotor interaction with the world; language development is embedded in this embodied context, not floating free from it.
The honest claim is not that language is the only lever in human evolution, but that it is the lever that amplified and coordinated the others. Tool use gave humans an advantage; language let them teach tool use across generations and imagine new tools before building them. Social instincts created bonds within groups; language extended those bonds across larger, anonymous collectives through shared stories and symbols.
So when building artificial systems, the choice to prioritize language is not a claim that language is magic. It is a recognition that language was the breakthrough that made human cognition and civilization distinctly human. Building systems that engage with language and semantics is thus a reasonable path forward—not the only path, but a natural one.
Implementation Matters; Direction Matters More
The implementation can vary; the point is that language is the path. Current LLMs work with tokens, discrete units of text. Future systems might use continuous signals, neuromorphic approaches, or entirely different architectures. The specific implementation details (tokens vs. signals, transformer architectures vs. alternatives) are engineering choices.
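To make "tokens, discrete units" concrete, here is a minimal sketch using the tiktoken library and its cl100k_base encoding; both are illustrative assumptions, and any subword tokenizer would make the same point:

```python
# A minimal illustration of text as a sequence of discrete tokens.
# Assumes: pip install tiktoken
# "cl100k_base" is one publicly documented encoding; the choice is illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Language is a cognitive technology."
token_ids = enc.encode(text)

print(token_ids)                              # a short list of integer IDs
print([enc.decode([t]) for t in token_ids])   # the text fragment behind each ID
```

Everything such a model subsequently does operates on those integer IDs. Whether future systems keep this discrete representation or move to continuous signals is exactly the engineering choice treated above as secondary.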
What matters philosophically and historically is the commitment to building systems that can engage in semantic reasoning, that can work with meaning and abstraction the way humans use language. Whether that happens through tokens, continuous activations, or some future representation is secondary.
This is consistent with the broader research direction. Whether models shift toward multimodal learning, embodied agents, or more neurally inspired architectures, the language dimension typically remains central—not because language is magic, but because it is the modality through which humans share and refine complex knowledge.
What Still Remains Uncertain
It is important to be clear about what is not established:
- Whether current LLMs can truly "understand" in a philosophical sense remains an open question. They can perform impressive feats of semantic reasoning, but whether this constitutes genuine comprehension or remains a sophisticated form of pattern-matching is debated.
- Whether LLMs will spontaneously develop goals, motivations, or agency is uncertain. Language-based systems might remain tools—powerful tools, but tools nonetheless—without developing independent drives or desires.
- Whether language alone is sufficient for artificial general intelligence is not known. Future systems might need embodiment, interaction with environments, or other components we have not yet identified.
- The societal impact will depend on implementation choices, governance, and how humans choose to deploy these systems. The technology itself is not destiny.
What can be said with more confidence is that:
- Language has been the central technology of human cognitive development.
- Building systems that engage with language and semantics is a continuation of investing in that same technology.
- These systems are demonstrably capable of non-trivial semantic reasoning and can serve as genuine extensions of human cognition.
- This is not hype or distraction; it is a significant direction, grounded in what made human civilization possible in the first place.
Conclusion
The excitement around large language models is sometimes dismissed as hype—AI enthusiasm run amok. But when you trace the role of language in human evolution and cognition, the investment in language-based artificial systems appears less like betting on a trend and more like doubling down on the most consequential technology humans have ever developed.
Harari shows us that language—the ability to think and communicate in complex symbolic sequences—was the breakthrough that separated human civilization from mere evolution. The extended mind theorists show us that language does not just express thought; it shapes and amplifies it. Recent advances in large language models show us that machines can engage with language and semantic reasoning in non-trivial ways.
Whether current LLMs are the path to artificial general intelligence and whether language will remain the central modality as AI systems grow more sophisticated remain to be seen. But the bet that language-based systems are worth building—that this is a civilizational direction worth pursuing—is not speculation. It is a reasonable read of history.
The specific implementation will evolve. The core insight—that language is the key technology of human cognition—is likely to remain valid. That is why the focus on language in artificial intelligence is neither hype nor distraction. It is the continuation of a project that has already transformed our species once.
Source Materials
This article draws upon the following key works and research:
- Sapiens: A Brief History of Humankind by Yuval Noah Harari.
- Supersizing the Mind: Embodiment, Action, and Cognitive Extension by Andy Clark.
- "The Extended Mind" in Analysis by Andy Clark and David Chalmers.
- Research on the symbol grounding problem by Stevan Harnad.
- Work on abstract word meanings and the embodied mind by Anna M. Borghi and colleagues.
- Metaphors We Live By by George Lakoff and Mark Johnson.
- Mind in Society: The Development of Higher Psychological Processes by L. S. Vygotsky.
- A Natural History of Human Thinking by Michael Tomasello.