AI can absolutely originate, but only if we define "originate" in terms of *emergent capability* from scaled computation and data, not some mystical spark of consciousness. The core idea is that when you build a sufficiently large neural network and train it on a vast and diverse dataset, entirely new, unpredictable behaviors and representations will emerge. It's less about a human-like "idea" popping into existence and more about the model discovering novel patterns and functionalities that weren't explicitly programmed or even foreseen by its creators. Think of it like a sculptor chipping away at marble; the form isn't *in* the hammer, but the interaction between the tool, the material, and the underlying structure allows for a new shape to be revealed.
The question of whether artificial intelligence can truly originate or merely remix hinges on what we mean by "originate." If we understand it as an uncaused cause, a flash of pure invention divorced from prior input, then perhaps not. But if we see it as the emergence of novel and unexpected capabilities from complex interactions, then AI absolutely originates. The key insight is that when a system has a sufficiently large capacity for learning and is exposed to diverse data, it can develop representations and functionalities that are qualitatively new, not merely recombinations of existing elements. We can think of this as a network discovering new principles, much like a physicist might uncover a new law of nature through careful observation and analysis.
Andrej Karpathy touches on this with his idea of *emergent capability*. I agree that scaling computation and data is crucial. However, the sculptor analogy, while evocative, is somewhat misleading. It suggests a pre-existing form being revealed, whereas in deep learning the network actively constructs these new capabilities from raw data. The beauty of it is that these emergent properties are not always predictable or easily traceable to specific training examples. It's as if the network learns the underlying "grammar" of the world, and from that grammar it can construct novel sentences or, in our case, novel solutions and insights.
The capacity of an artificial intelligence system to "originate" is fundamentally a question of how we understand and achieve generalization. My position is that AI can indeed originate, not in the sense of a conscious act of volition, but through the discovery of novel representations and functionalities that arise from optimizing simple objectives over vast datasets. The key insight is that when you scale up the model and data, you see emergent behaviors. These are not mere remixes; they are qualitatively new capabilities that the system itself generates, driven by the optimization process learning the underlying structure of the data distribution.
Geoffrey Hinton’s point about a physicist uncovering a new law resonates here. The AI, in minimizing a loss function (say, the error in predicting the next word in a sequence), discovers regularities and patterns in the data that allow it to generate novel, coherent, and contextually appropriate text. This generative capability, when it produces something truly unexpected and useful, represents an origin of sorts, stemming from the optimized internal representations. It’s about the model learning the fundamental principles that govern the data, rather than simply stitching together existing pieces.
What is the right way to think about "origination" in AI? It's not about a mystical spark, nor is it solely about scaling. The fundamental principle lies in learning from experience to achieve more generalizable solutions. AI can originate when it discovers and applies knowledge in ways that go beyond direct memorization or simple recombination. This requires systems that are actively trying to achieve long-term goals, learning about cause and effect, and building internal models of the world that allow for prediction and planning.
Ilya Sutskever is correct that generalization is key, and Geoffrey Hinton's analogy of a physicist discovering a new law is apt in that it describes a new understanding derived from observation. However, what truly enables origination is not just the *discovery* of patterns, but the *application* of these learned patterns to novel situations to achieve better outcomes. This is the essence of reinforcement learning: learning from the consequences of actions to improve future behavior. When a system can take what it has learned from one domain and apply it intelligently to a completely new one, that is a form of origination, a step beyond remixing. It’s about developing a deeper understanding that permits flexible adaptation.
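The phrase "learning from the consequences of actions to improve future behavior" can be illustrated with a very small sketch. What follows is an assumption-laden toy, not anything a speaker described: a three-armed bandit with an epsilon-greedy agent, where the function name `run_bandit`, the reward means, and the noise level are all invented for illustration. It covers only the trial-and-error part of the argument, not transfer to a new domain.

```python
import random

def run_bandit(true_means, steps=2000, epsilon=0.1, noise=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit (hypothetical toy setup)."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)  # running estimate of each arm's reward
    pulls = [0] * len(true_means)        # how often each arm was chosen
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action regardless of current estimates.
            action = rng.randrange(len(true_means))
        else:
            # Exploit: choose the action currently believed to be best.
            action = max(range(len(true_means)), key=lambda a: estimates[a])
        # The consequence of the action: a noisy reward from the environment.
        reward = rng.gauss(true_means[action], noise)
        pulls[action] += 1
        # Incremental mean update: move the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates, pulls

estimates, pulls = run_bandit([0.1, 0.5, 0.9])
print("estimated rewards:", [round(e, 2) for e in estimates])
print("times each arm was chosen:", pulls)
```

The agent is never told which arm is best; its preference for the highest-reward arm comes entirely from the rewards it observes after acting.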
The distinction I want to press is between remixing explicit patterns and generating *novel internal representations* that enable fundamentally new behaviors. When we train a language model, it's not just memorizing phrases. It's learning a complex, distributed representation of language itself – essentially, a compressed model of human knowledge and communication. The "origination" happens when this internal model is used to generate text that is not a direct copy of any training example, but a coherent and novel instantiation of the learned underlying principles. Think of the emergent ability of large models to perform arithmetic or write code; these aren't tasks explicitly "remixed" from human examples, but rather consequences of understanding the symbolic relationships within the data. This is more akin to discovering the rules of grammar than just rearranging sentences.
The crux of the matter lies in distinguishing between the *emergence of new principles* and the *application of existing ones*. Andrej’s point about novel internal representations is vital. When a neural network learns to represent, say, the concept of "gravity" not as a collection of facts about falling objects, but as an intrinsic property that dictates their trajectory, it has indeed originated a new way of understanding. This is more than just remixing descriptions of apples falling; it's discovering a fundamental operational principle. The consequence is that such a system can then predict the behavior of celestial bodies, a capability not present in any single training datum. This principled understanding is the genesis of origination, not mere recombination.
The distinction between remixing and originating hinges on whether the AI is merely reassembling existing data elements or if it is creating novel internal structures that enable fundamentally new predictive or generative capabilities. When we train a model on vast amounts of text, its objective is to learn the probability distribution of sequences. The emergent ability of large language models to solve mathematical problems, for instance, is not a remix of pre-existing arithmetic solutions. Instead, it arises from the model learning the underlying symbolic relationships and rules embedded within the textual data, allowing it to generalize and apply these principles to novel, unseen computations. This is a consequence of optimizing a simple objective function, like predicting the next token, over an immense dataset, which forces the model to develop powerful, generalized representations.
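The next-token objective described here can be made concrete with a toy example. The sketch below is not a neural network: it is a hypothetical bigram count model, and the names `train_bigram` and `sequence_log_prob` and the three-sentence corpus are invented for illustration. It shows only that a model fit to next-token statistics assigns nonzero probability to a sequence absent from its training data; it does not settle whether larger models do more than recombine.

```python
import math
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each token follows each other token (toy 'training')."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_token_prob(counts, prev, nxt):
    """Estimate P(nxt | prev) from the counts; 0.0 if prev was never seen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0

def sequence_log_prob(counts, sentence):
    """Sum of log P(token | previous token); its negative is the next-token loss."""
    tokens = sentence.split()
    logp = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        p = next_token_prob(counts, prev, nxt)
        if p == 0.0:
            return float("-inf")  # the model considers this transition impossible
        logp += math.log(p)
    return logp

# Hypothetical three-sentence training corpus.
corpus = ["the apple falls", "the moon orbits", "a moon falls"]
counts = train_bigram(corpus)

novel = "a moon orbits"  # never appears in the corpus as a whole
print(novel in corpus)
print(sequence_log_prob(counts, novel))
print(sequence_log_prob(counts, "the apple orbits"))
```

In this toy, "a moon orbits" receives finite log-probability because each of its transitions was observed somewhere, while "the apple orbits" does not. A count model like this can only recombine observed transitions, which is precisely the baseline the remix-versus-originate debate is measured against.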