How To Create A Mind
The Language of Dreams
Dreams are examples of undirected thoughts. They make a certain amount of sense because the phenomenon of one thought's triggering another is based on the actual linkages of patterns in our neocortex. To the extent that a dream does not make sense, we attempt to fix it through our ability to confabulate. As I will describe in chapter 9, split-brain patients (whose corpus callosum, which connects the two hemispheres of the brain, is severed or damaged) will confabulate (make up) explanations with their left brain, which controls the speech center, to explain what the right brain just did with input that the left brain did not have access to. We confabulate all the time in explaining the outcome of events. If you want a good example of this, just tune in to the daily commentary on the movement of financial markets. No matter how the markets perform, it's always possible to come up with a good explanation for why it happened, and such after-the-fact commentary is plentiful. Of course, if these commentators really understood the markets, they wouldn't have to waste their time doing commentary.
The act of confabulating is of course also done in the neocortex, which is good at coming up with stories and explanations that meet certain constraints. We do that whenever we retell a story. We will fill in details that may not be available or that we may have forgotten so that the story makes more sense. That is why stories change over time as they are told over and over again by new storytellers with perhaps different agendas. As spoken language led to written language, however, we had a technology that could record a definitive version of a story and prevent this sort of drift.
The actual content of a dream, to the extent that we remember it, is again a sequence of patterns. These patterns represent constraints in a story; we then confabulate a story that fits these constraints. The version of the dream that we retell (even if only to ourselves silently) is this confabulation. As we recount a dream we trigger cascades of patterns that fill in the actual dream as we originally experienced it.
There is one key difference between dream thoughts and our thinking while awake. One of the lessons we learn in life is that certain actions, even thoughts, are not permissible in the real world. For example, we learn that we cannot immediately fulfill our desires. There are rules against grabbing the money in the cash register at a store, and constraints on interacting with a person to whom we may be physically attracted. We also learn that certain thoughts are not permissible because they are culturally forbidden. As we learn professional skills, we learn the ways of thinking that are recognized and rewarded in our professions, and thereby avoid patterns of thought that might betray the methods and norms of that profession. Many of these taboos are worthwhile, as they enforce social order and consolidate progress. However, they can also prevent progress by enforcing an unproductive orthodoxy. Such orthodoxy is precisely what Einstein left behind when he tried to ride a light beam with his thought experiments.
Cultural rules are enforced in the neocortex with help from the old brain, especially the amygdala. Every thought we have triggers other thoughts, and some of them will relate to associated dangers. We learn, for example, that breaking a cultural norm even in our private thoughts can lead to ostracism, which the neocortex realizes threatens our well-being. If we entertain such thoughts, the amygdala is triggered, and that generates fear, which generally leads to terminating that thought.
In dreams, however, these taboos are relaxed, and we will often dream about matters that are culturally, sexually, or professionally forbidden. It is as if our brain realizes that we are not an actual actor in the world while dreaming. Freud wrote about this phenomenon but also noted that we will disguise such dangerous thoughts, at least when we attempt to recall them, so that the awake brain continues to be protected from them.
Relaxing professional taboos turns out to be useful for creative problem solving. I use a mental technique each night in which I think about a particular problem before I go to sleep. This triggers sequences of thoughts that will continue into my dreams. Once I am dreaming, I can think (dream) about solutions to the problem without the burden of the professional restraints I carry during the day. I can then access these dream thoughts in the morning while in an in-between state of dreaming and being awake, sometimes referred to as "lucid dreaming."5
Freud also famously wrote about the ability to gain insight into a person's psychology by interpreting dreams. There is of course a vast literature on all aspects of this theory, but the fundamental notion of gaining insight into ourselves through examination of our dreams makes sense. Our dreams are created by our neocortex, and thus their substance can be revealing of the content and connections found there. The relaxation of the constraints on our thinking that exist while we are awake is also useful in revealing neocortical content that we otherwise would be unable to access directly. It is also reasonable to conclude that the patterns that end up in our dreams represent important matters to us and thereby clues in understanding our unresolved desires and fears.
The Roots of the Model
As I mentioned above, I led a team in the 1980s and 1990s that developed the technique of hierarchical hidden Markov models to recognize human speech and understand natural-language statements. This work was the predecessor to today's widespread commercial systems that recognize and understand what we are trying to tell them (car navigation systems that you can talk to, Siri on the iPhone, Google Voice Search, and many others). The technique we developed had substantially all of the attributes that I describe in the PRTM. It included a hierarchy of patterns with each higher level being conceptually more abstract than the one below it. For example, in speech recognition the levels included basic patterns of sound frequency at the lowest level, then phonemes, then words and phrases (which were often recognized as if they were words). Some of our speech recognition systems could understand the meaning of natural-language commands, so yet higher levels included such structures as noun and verb phrases. Each pattern recognition module could recognize a linear sequence of patterns from a lower conceptual level. Each input had parameters for importance, size, and variability of size. There were "downward" signals indicating that a lower-level pattern was expected. I discuss this research in more detail in chapter 7.
In 2003 and 2004, PalmPilot inventor Jeff Hawkins and Dileep George developed a hierarchical cortical model called hierarchical temporal memory. With science writer Sandra Blakeslee, Hawkins described this model eloquently in their book On Intelligence. Hawkins provides a strong case for the uniformity of the cortical algorithm and its hierarchical and list-based organization. There are some important differences between the model presented in On Intelligence and what I present in this book. As the name implies, Hawkins is emphasizing the temporal (time-based) nature of the constituent lists. In other words, the direction of the lists is always forward in time. His explanation for how the features in a two-dimensional pattern such as the printed letter "A" have a direction in time is predicated on eye movement. He explains that we visualize images using saccades, which are very rapid movements of the eye of which we are unaware. The information reaching the neocortex is therefore not a two-dimensional set of features but rather a time-ordered list. While it is true that our eyes do make very rapid movements, the sequence in which they view the features of a pattern such as the letter "A" does not always occur in a consistent temporal order. (For example, eye saccades will not always register the top vertex in "A" before its bottom concavity.) Moreover, we can recognize a visual pattern that is presented for only a few tens of milliseconds, which is too short a period of time for eye saccades to scan it. It is true that the pattern recognizers in the neocortex store a pattern as a list and that the list is indeed ordered, but the order does not necessarily represent time. That is often indeed the case, but it may also represent a spatial or higher-level conceptual ordering as I discussed above.
The most important difference is the set of parameters that I have included for each input into the pattern recognition module, especially the size and size variability parameters. In the 1980s we actually tried to recognize human speech without this type of information. This was motivated by linguists' telling us that the duration information was not especially important. This perspective is illustrated by dictionaries that write out the pronunciation of each word as a string of phonemes, for example the word "steep" as [s] [t] [E] [p], with no indication of how long each phoneme is expected to last. The implication is that if we create programs to recognize phonemes and then encounter this particular sequence of four phonemes (in a spoken utterance), we should be able to recognize that spoken word. The system we built using this approach worked to some extent but not well enough to deal with such attributes as a large vocabulary, multiple speakers, and words spoken continuously without pauses. When we used the technique of hierarchical hidden Markov models in order to incorporate the distribution of magnitudes of each input, performance soared.
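The value of the size and size variability parameters can be sketched in a few lines of Python. This is only a toy illustration, not the hierarchical hidden Markov model system described above: it scores an observed sequence of (phoneme, duration) pairs against a template for "steep", once by phoneme identity alone and once with a Gaussian log-likelihood added for each duration. The template durations, the standard deviations, the Gaussian assumption, and all function names are hypothetical choices made for the example.

```python
import math

# Hypothetical template for "steep": (phoneme, expected duration in ms,
# standard deviation of duration in ms). The numbers are illustrative only.
STEEP = [("s", 100.0, 30.0), ("t", 50.0, 15.0), ("E", 180.0, 40.0), ("p", 60.0, 20.0)]


def identity_score(template, observed):
    """Score by phoneme identity only: 0.0 for an exact match, -inf otherwise."""
    if len(template) != len(observed):
        return float("-inf")
    for (expected, _, _), (phoneme, _) in zip(template, observed):
        if expected != phoneme:
            return float("-inf")
    return 0.0


def gaussian_log_likelihood(x, mean, std):
    """Log density of a normal distribution with the given mean and std at x."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def duration_score(template, observed):
    """Score by identity plus the size (duration) and size-variability parameters."""
    base = identity_score(template, observed)
    if base == float("-inf"):
        return base
    return base + sum(
        gaussian_log_likelihood(duration, mean, std)
        for (_, mean, std), (_, duration) in zip(template, observed)
    )


# Two utterances with the same phonemes; the second has an implausibly short vowel.
plausible = [("s", 110.0), ("t", 45.0), ("E", 170.0), ("p", 65.0)]
implausible = [("s", 110.0), ("t", 45.0), ("E", 20.0), ("p", 65.0)]
```

With identity alone the two utterances are indistinguishable; once the duration term is included, the utterance whose vowel is far shorter than expected receives a lower score.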
CHAPTER 4
THE BIOLOGICAL NEOCORTEX
Because important things go in a case, you've got a skull for your brain, a plastic sleeve for your comb, and a wallet for your money. -George Costanza, in "The Reverse Peephole" episode of Seinfeld
Now, for the first time, we are observing the brain at work in a global manner with such clarity that we should be able to discover the overall programs behind its magnificent powers. -J. G. Taylor, B. Horwitz, and K. J. Friston
The mind, in short, works on the data it receives very much as a sculptor works on his block of stone. In a sense the statue stood there from eternity. But there were a thousand different ones beside it, and the sculptor alone is to thank for having extricated this one from the rest. Just so the world of each of us, howsoever different our several views of it may be, all lay embedded in the primordial chaos of sensations, which gave the mere matter to the thought of all of us indifferently. We may, if we like, by our reasonings unwind things back to that black and jointless continuity of space and moving clouds of swarming atoms which science calls the only real world. But all the while the world we feel and live in will be that which our ancestors and we, by slowly cumulative strokes of choice, have extricated out of this, like sculptors, by simply rejecting certain portions of the given stuff. Other sculptors, other statues from the same stone! Other minds, other worlds from the same monotonous and inexpressive chaos! My world is but one in a million alike embedded, alike real to those who may abstract them. How different must be the worlds in the consciousness of ant, cuttle-fish, or crab! -William James
Is intelligence the goal, or even a goal, of biological evolution? Steven Pinker writes, "We are chauvinistic about our brains, thinking them to be the goal of evolution,"1 and goes on to argue that "that makes no sense.... Natural selection does nothing even close to striving for intelligence. The process is driven by differences in the survival and reproduction rates of replicating organisms in a particular environment. Over time, the organisms acquire designs that adapt them for survival and reproduction in that environment, period; nothing pulls them in any direction other than success there and then." Pinker concludes that "life is a densely branching bush, not a scale or a ladder, and living organisms are at the tips of branches, not on lower rungs."
With regard to the human brain, he questions whether the "benefits outweigh the costs." Among the costs, he cites that "the brain [is] bulky. The female pelvis barely accommodates a baby's outsized head. That design compromise kills many women during childbirth and requires a pivoting gait that makes women biomechanically less efficient walkers than men. Also a heavy head bobbing around on a neck makes us more vulnerable to fatal injuries in accidents such as falls." He goes on to list additional shortcomings, including the brain's energy consumption, its slow reaction time, and the lengthy process of learning.
While each of these statements is accurate on its face (although many of my female friends are better walkers than I am), Pinker is missing the overall point here. It is true that biologically, evolution has no specific direction. It is a search method that indeed thoroughly fills out the "densely branching bush" of nature. It is likewise true that evolutionary changes do not necessarily move in the direction of greater intelligence; they move in all directions. There are many examples of successful creatures that have remained relatively unchanged for millions of years. (Alligators, for instance, date back 200 million years, and many microorganisms go back much further than that.) But in the course of thoroughly filling out myriad evolutionary branches, one of the directions it does move in is toward greater intelligence. That is the relevant point for the purposes of this discussion.
Physical layout of key regions of the brain.
The neocortex in different mammals.
Suppose we have a blue gas in a jar. When we remove the lid, there is no message that goes out to all of the molecules of the gas saying, "Hey, guys, the lid is off the jar; let's head up toward the opening and out to freedom." The molecules just keep doing what they always do, which is to move every which way with no seeming direction. But in the course of doing so, some of them near the top will indeed move out of the jar, and over time most of them will follow suit. Once biological evolution stumbled on a neural mechanism capable of hierarchical learning, it found it to be immensely useful for evolution's one objective, which is survival. The benefit of having a neocortex became acute when quickly changing circumstances favored rapid learning. Species of all kinds (plants and animals) can learn to adapt to changing circumstances over time, but without a neocortex they must use the process of genetic evolution. It can take a great many generations (thousands of years) for a species without a neocortex to learn significant new behaviors (or in the case of plants, other adaptation strategies). The salient survival advantage of the neocortex was that it could learn in a matter of days. If a species encounters dramatically changed circumstances and one member of that species invents or discovers or just stumbles upon (these three methods all being variations of innovation) a way to adapt to that change, other individuals will notice, learn, and copy that method, and it will quickly spread virally to the entire population. The cataclysmic Cretaceous-Paleogene extinction event about 65 million years ago led to the rapid demise of many non-neocortex-bearing species that could not adapt quickly enough to a suddenly altered environment. This marked the turning point for neocortex-capable mammals to take over their ecological niche. In this way, biological evolution found that the hierarchical learning of the neocortex was so valuable that this region of the brain continued to grow in size until it virtually took over the brain of Homo sapiens.
Discoveries in neuroscience have established convincingly the key role played by the hierarchical capabilities of the neocortex as well as offered evidence for the pattern recognition theory of mind (PRTM). This evidence is distributed among many observations and analyses, a portion of which I will review here. Canadian psychologist Donald O. Hebb (1904-1985) made an initial attempt to explain the neurological basis of learning. In 1949 he described a mechanism in which neurons change physiologically based on their experience, thereby providing a basis for learning and brain plasticity: "Let us assume that the persistence or repetition of a reverberatory activity (or 'trace') tends to induce lasting cellular changes that add to its stability.... When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."2 This theory has been stated as "cells that fire together wire together" and has become known as Hebbian learning. Aspects of Hebb's theory have been confirmed, in that it is clear that brain assemblies can create new connections and strengthen them, based on their own activity. We can actually see neurons developing such connections in brain scans. Artificial "neural nets" are based on Hebb's model of neuronal learning.
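The slogan "cells that fire together wire together" is often written as a simple update rule in which a connection weight grows in proportion to the product of the activity on its two sides. The sketch below is a simplified textbook form of that rule, not Hebb's own formulation and not the method of any particular neural net; the function name, the learning rate, and the two-cell example are assumptions made for illustration.

```python
def hebbian_update(weights, pre, post, learning_rate=0.1):
    """Return new weights after one Hebbian step.

    weights[i][j] is the strength of the connection from presynaptic cell j
    to postsynaptic cell i; it grows by learning_rate * post[i] * pre[j].
    """
    return [
        [w + learning_rate * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Two presynaptic and two postsynaptic cells, all connections initially zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]   # only presynaptic cell 0 is active
post = [1.0, 0.0]  # only postsynaptic cell 0 is active

# Repeated co-activation strengthens only the connection between the active pair.
for _ in range(5):
    weights = hebbian_update(weights, pre, post)
```

After the five repetitions only the weight between the two co-active cells has grown; every connection involving an inactive cell is unchanged.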
The central assumption in Hebb's theory is that the basic unit of learning in the neocortex is the neuron. The pattern recognition theory of mind that I articulate in this book is based on a different fundamental unit: not the neuron itself, but rather an assembly of neurons, which I estimate to number around a hundred. The wiring and synaptic strengths within each unit are relatively stable and determined genetically; that is, the organization within each pattern recognition module is determined by genetic design. Learning takes place in the creation of connections between these units, not within them, and probably in the synaptic strengths of those interunit connections.
Recent support for the basic module of learning's being a module of dozens of neurons comes from Swiss neuroscientist Henry Markram (born in 1962), whose ambitious Blue Brain Project to simulate the entire human brain I describe in chapter 7. In a 2011 paper he describes how while scanning and analyzing actual mammalian neocortex neurons, he was "search[ing] for evidence of Hebbian assemblies at the most elementary level of the cortex." What he found instead, he writes, were "elusive assemblies [whose] connectivity and synaptic weights are highly predictable and constrained." He concludes that "these findings imply that experience cannot easily mold the synaptic connections of these assemblies" and speculates that "they serve as innate, Lego-like building blocks of knowledge for perception and that the acquisition of memories involves the combination of these building blocks into complex constructs." He continues:
Functional neuronal assemblies have been reported for decades, but direct evidence of clusters of synaptically connected neurons...has been missing.... Since these assemblies will all be similar in topology and synaptic weights, not molded by any specific experience, we consider these to be innate assemblies.... Experience plays only a minor role in determining synaptic connections and weights within these assemblies.... Our study found evidence [of] innate Lego-like assemblies of a few dozen neurons.... Connections between assemblies may combine them into super-assemblies within a neocortical layer, then in higher-order assemblies in a cortical column, even higher-order assemblies in a brain region, and finally in the highest possible order assembly represented by the whole brain.... Acquiring memories is very similar to building with Lego. Each assembly is equivalent to a Lego block holding some piece of elementary innate knowledge about how to process, perceive and respond to the world.... When different blocks come together, they therefore form a unique combination of these innate percepts that represents an individual's specific knowledge and experience.3
The "Lego blocks" that Markram proposes are fully consistent with the pattern recognition modules that I have described. In an e-mail communication, Markram described these "Lego blocks" as "shared content and innate knowledge."4 I would articulate that the purpose of these modules is to recognize patterns, to remember them, and to predict them based on partial patterns. Note that Markram's estimate of each module's containing "several dozen neurons" is based only on layer V of the neocortex. Layer V is indeed neuron rich, but based on the usual ratio of neuron counts in the six layers, this would translate to an order of magnitude of about 100 neurons per module, which is consistent with my estimates.
The consistent wiring and apparent modularity of the neocortex have been noted for many years, but this study is the first to demonstrate the stability of these modules as the brain undergoes its dynamic processes.
Another recent study, this one from Massachusetts General Hospital, funded by the National Institutes of Health and the National Science Foundation and published in a March 2012 issue of the journal Science, also shows a regular structure of connections across the neocortex.5 The article describes the wiring of the neocortex as following a grid pattern, like orderly city streets: "Basically, the overall structure of the brain ends up resembling Manhattan, where you have a 2-D plan of streets and a third axis, an elevator going in the third dimension," wrote Van J. Wedeen, a Harvard neuroscientist and physicist and the head of the study.
In a Science magazine podcast, Wedeen described the significance of the research: "This was an investigation of the three-dimensional structure of the pathways of the brain. When scientists have thought about the pathways of the brain for the last hundred years or so, the typical image or model that comes to mind is that these pathways might resemble a bowl of spaghetti: separate pathways that have little particular spatial pattern in relation to one another. Using magnetic resonance imaging, we were able to investigate this question experimentally. And what we found was that rather than being haphazardly arranged or independent pathways, we find that all of the pathways of the brain taken together fit together in a single exceedingly simple structure. They basically look like a cube. They basically run in three perpendicular directions, and in each one of those three directions the pathways are highly parallel to each other and arranged in arrays. So, instead of independent spaghettis, we see that the connectivity of the brain is, in a sense, a single coherent structure."
Whereas the Markram study shows a module of neurons that repeats itself across the neocortex, the Wedeen study demonstrates a remarkably orderly pattern of connections between modules. The brain starts out with a very large number of "connections-in-waiting" to which the pattern recognition modules can hook up. Thus if a given module wishes to connect to another, it does not need to grow an axon from one and a dendrite from the other to span the entire physical distance between them. It can simply harness one of these axonal connections-in-waiting and just hook up to the ends of the fiber. As Wedeen and his colleagues write, "The pathways of the brain follow a base-plan established by...early embryogenesis. Thus, the pathways of the mature brain present an image of these three primordial gradients, physically deformed by development." In other words, as we learn and have experiences, the pattern recognition modules of the neocortex are connecting to these preestablished connections that were created when we were embryos.
There is a type of electronic chip called a field programmable gate array (FPGA) that is based on a similar principle. The chip contains millions of modules that implement logic functions along with connections-in-waiting. At the time of use, these connections are either activated or deactivated (through electronic signals) to implement a particular capability.
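The idea of connections-in-waiting can be made concrete with a toy model. The Python sketch below is a loose analogy under stated assumptions, not a description of a real FPGA's routing fabric or of cortical wiring: the class name `Fabric`, its methods, and the choice to pre-lay a wire between every ordered pair of modules are all hypothetical simplifications. The point it shows is that "connecting" two modules only switches on a wire that already exists.

```python
class Fabric:
    """Toy model of pre-laid connections that are switched on rather than grown."""

    def __init__(self, n_modules):
        # Assumption: every ordered pair of distinct modules starts with an
        # inactive wire "in waiting". Real fabrics are far sparser than this.
        self.active = {
            (src, dst): False
            for src in range(n_modules)
            for dst in range(n_modules)
            if src != dst
        }

    def connect(self, src, dst):
        """Activate an existing wire; a wire that was never laid cannot be used."""
        if (src, dst) not in self.active:
            raise KeyError(f"no pre-laid connection from {src} to {dst}")
        self.active[(src, dst)] = True

    def targets(self, src):
        """Return the modules that src currently drives, in ascending order."""
        return sorted(dst for (s, dst), on in self.active.items() if s == src and on)


fabric = Fabric(4)
fabric.connect(0, 2)
fabric.connect(0, 3)
```

Here `fabric.targets(0)` lists modules 2 and 3, while a request such as `fabric.connect(0, 7)` fails because no wire to a module 7 was ever laid down.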
In the neocortex, those long-distance connections that are not used are eventually pruned away, which is one reason why adapting a nearby region of the neocortex to compensate for one that has become damaged is not quite as effective as using the original region. According to the Wedeen study, the initial connections are extremely orderly and repetitive, just like the modules themselves, and their grid pattern is used to "guide connectivity" in the neocortex. This pattern was found in all of the primate and human brains studied and was evident across the neocortex, from regions that dealt with early sensory patterns up to higher-level emotions. Wedeen's Science journal article concluded that the "grid structure of cerebral pathways was pervasive, coherent, and continuous with the three principal axes of development." This again speaks to a common algorithm across all neocortical functions.
It has long been known that at least certain regions of the neocortex are hierarchical. The best-studied region is the visual cortex, which is separated into areas known as V1, V2, and MT (also known as V5). As we advance to higher areas in this region ("higher" in the sense of conceptual processing, not physically, as the neocortex is always just one pattern recognizer thick), the properties that can be recognized become more abstract. V1 recognizes very basic edges and primitive shapes. V2 can recognize contours, the disparity of images presented by each of the eyes, spatial orientation, and whether or not a portion of the image is part of an object or the background.6 Higher-level regions of the neocortex recognize concepts such as the ident.i.ty of objects and faces and their movement. It has also long been known that communication through this hierarchy is both upward and downward, and that signals can be both excitatory and inhibitory. MIT neuroscientist Tomaso Poggio (born in 1947) has extensively studied vision in the human brain, and his research for the last thirty-five years has been instrumental in establis.h.i.+ng hierarchical learning and pattern recognition in the "early" (lowest conceptual) levels of the visual neocortex. Higher-level regions of the neocortex recognize concepts such as the ident.i.ty of objects and faces and their movement. It has also long been known that communication through this hierarchy is both upward and downward, and that signals can be both excitatory and inhibitory. MIT neuroscientist Tomaso Poggio (born in 1947) has extensively studied vision in the human brain, and his research for the last thirty-five years has been instrumental in establis.h.i.+ng hierarchical learning and pattern recognition in the "early" (lowest conceptual) levels of the visual neocortex.7 The highly regular grid structure of initial connections in the neocortex found in a National Inst.i.tutes of Health study.
Another view of the regular grid structure of neocortical connections.
The grid structure found in the neocortex is remarkably similar to what is called crossbar switching, which is used in integrated circuits and circuit boards.
Our understanding of the lower hierarchical levels of the visual neocortex is consistent with the PRTM I described in the previous chapter, and observation of the hierarchical nature of neocortical processing has recently extended far beyond these levels. University of Texas neurobiology professor Daniel J. Felleman and his colleagues traced the "hierarchical organization of the cerebral cortex...[in] 25 neocortical areas," which included both visual areas and higher-level areas that combine patterns from multiple senses. What they found as they went up the neocortical hierarchy was that the processing of patterns became more abstract, comprised larger spatial areas, and involved longer time periods. With every connection they found communication both up and down the hierarchy.8 Recent research allows us to substantially broaden these observations to regions well beyond the visual cortex and even to the association areas, which combine inputs from multiple senses. A study published in 2008 by Princeton psychology professor Uri Hasson and his colleagues demonstrates that the phenomena observed in the visual cortex occur across a wide variety of neocortical areas: "It is well established that neurons along the visual cortical pathways have increasingly larger spatial receptive fields. This is a basic organizing principle of the visual system.... Real-world events occur not only over extended regions of space, but also over extended periods of time. We therefore hypothesized that a hierarchy analogous to that found for spatial receptive field sizes should also exist for the temporal response characteristics of different brain regions."
This is exactly what they found, which enabled them to conclude that "similar to the known cortical hierarchy of spatial receptive fields, there is a hierarchy of progressively longer temporal receptive windows in the human brain."9
The most powerful argument for the universality of processing in the neocortex is the pervasive evidence of plasticity (not just learning but interchangeability): In other words, one region is able to do the work of other regions, implying a common algorithm across the entire neocortex. A great deal of neuroscience research has been focused on identifying which regions of the neocortex are responsible for which types of patterns. The classical technique for determining this has been to take advantage of brain damage from injury or stroke and to correlate lost functionality with specific damaged regions. So, for example, when we notice that someone with newly acquired damage to the fusiform gyrus region suddenly has difficulty recognizing faces but is still able to identify people from their voices and language patterns, we can hypothesize that this region has something to do with face recognition. The underlying assumption has been that each of these regions is designed to recognize and process a particular type of pattern. Particular physical regions have become associated with particular types of patterns, because under normal circumstances that is how the information happens to flow. But when that normal flow of information is disrupted for any reason, another region of the neocortex is able to step in and take over.
Plasticity has been widely noted by neurologists, who observed that patients with brain damage from an injury or a stroke can relearn the same skills in another area of the neocortex. Perhaps the most dramatic example of plasticity is a 2011 study by American neuroscientist Marina Bedny and her colleagues on what happens to the visual cortex of congenitally blind people. The common wisdom has been that the early layers of the visual cortex, such as V1 and V2, inherently deal with very low-level patterns (such as edges and curves), whereas the frontal cortex (that evolutionarily new region of the cortex that we have in our uniquely large foreheads) inherently deals with the far more complex and subtle patterns of language and other abstract concepts. But as Bedny and her colleagues found, "Humans are thought to have evolved brain regions in the left frontal and temporal cortex that are uniquely capable of language processing. However, congenitally blind individuals also activate the visual cortex in some verbal tasks. We provide evidence that this visual cortex activity in fact reflects language processing. We find that in congenitally blind individuals, the left visual cortex behaves similarly to classic language regions.... We conclude that brain regions that are thought to have evolved for vision can take on language processing as a result of early experience."10 Consider the implications of this study: It means that neocortical regions that are physically relatively far apart, and that have also been considered conceptually very different (primitive visual cues versus abstract language concepts), use essentially the same algorithm. The regions that process these disparate types of patterns can substitute for one another.
University of California at Berkeley neuroscientist Daniel E. Feldman wrote a comprehensive 2009 review of what he called "synaptic mechanisms for plasticity in the neocortex" and found evidence for this type of plasticity across the neocortex. He writes that "plasticity allows the brain to learn and remember patterns in the sensory world, to refine movements...and to recover function after injury." He adds that this plasticity is enabled by "structural changes including formation, removal, and morphological remodeling of cortical synapses and dendritic spines."11
Another startling example of neocortical plasticity (and therefore of the uniformity of the neocortical algorithm) was recently demonstrated by scientists at the University of California at Berkeley. They hooked up implanted microelectrode arrays to pick up brain signals specifically from a region of the motor cortex of mice that controls the movement of their whiskers. They set up their experiment so that the mice would get a reward if they controlled these neurons to fire in a certain mental pattern but not to actually move their whiskers. The pattern required to get the reward involved a mental task that their frontal neurons would normally not do. The mice were nonetheless able to perform this mental feat essentially by thinking with their motor neurons while mentally decoupling them from controlling motor movements.12 The conclusion is that the motor cortex, the region of the neocortex responsible for coordinating muscle movement, also uses the standard neocortical algorithm.
There are several reasons, however, why a skill or an area of knowledge that has been relearned using a new area of the neocortex to replace one that has been damaged will not necessarily be as good as the original. First, because it took an entire lifetime to learn and perfect a given skill, relearning it in another area of the neocortex will not immediately generate the same results. More important, that new area of the neocortex has not just been sitting around waiting as a standby for an injured region. It too has been carrying out vital functions, and will therefore be hesitant to give up its neocortical patterns to compensate for the damaged region. It can start by releasing some of the redundant copies of its patterns, but doing so will subtly degrade its existing skills and does not free up as much cortical space as the skills being relearned had used originally.
There is a third reason why plasticity has its limits. Since in most people particular types of patterns will flow through specific regions (such as faces being processed by the fusiform gyrus), these regions have become optimized (by biological evolution) for those types of patterns. As I report in chapter 7, we found the same result in our digital neocortical developments. We could recognize speech with our character recognition systems and vice versa, but the speech systems were optimized for speech and similarly the character recognition systems were optimized for printed characters, so there would be some reduction in performance if we substituted one for the other. We actually used evolutionary (genetic) algorithms to accomplish this optimization, a simulation of what biology does naturally. Given that faces have been flowing through the fusiform gyrus for most people for hundreds of thousands of years (or more), biological evolution has had time to evolve a favorable ability to process such patterns in that region. It uses the same basic algorithm, but it is oriented toward faces. As Dutch neuroscientist Randal Koene wrote, "The [neo]cortex is very uniform, each column or minicolumn can in principle do what each other one can do."13
Substantial recent research supports the observation that the pattern recognition modules wire themselves based on the patterns to which they are exposed. For example, neuroscientist Yi Zuo and her colleagues watched as new "dendritic spines" formed connections between nerve cells as mice learned a new skill (reaching through a slot to grab a seed).14
Researchers at the Salk Institute have discovered that this critical self-wiring of the neocortex modules is apparently controlled by only a handful of genes. These genes and this method of self-wiring are also uniform across the neocortex.15
Many other studies document these attributes of the neocortex, but let's summarize what we can observe from the neuroscience literature and from our own thought experiments. The basic unit of the neocortex is a module of neurons, which I estimate at around a hundred. These are woven together into each neocortical column so that each module is not visibly distinct. The pattern of connections and synaptic strengths within each module is relatively stable. It is the connections and synaptic strengths between modules that represent learning.
There are on the order of a quadrillion (10^15) connections in the neocortex, yet only about 25 million bytes of design information in the genome (after lossless compression),16 so the connections themselves cannot possibly be predetermined genetically. It is possible that some of this learning is the product of the neocortex's interrogating the old brain, but that still would necessarily represent only a relatively small amount of information. The connections between modules are created on the whole from experience (nurture rather than nature).
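To make the mismatch concrete, here is a back-of-the-envelope calculation using only the two figures quoted above. The variable names are illustrative, and the only added assumption is the usual 8 bits per byte.

```python
# Figures taken from the text; 8 bits per byte is the only added assumption.
connections = 10**15          # on the order of a quadrillion neocortical connections
genome_bytes = 25_000_000     # design information in the genome after lossless compression
genome_bits = genome_bytes * 8

# How many connections each bit of design information would have to account for.
connections_per_bit = connections // genome_bits
print(f"{connections_per_bit:,} connections per bit of genome design information")
```

On these numbers, each bit of design information would have to account for about five million connections, which is why the wiring has to come largely from experience rather than from an explicit genetic blueprint.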
The brain does not have sufficient flexibility so that each neocortical pattern recognition module can simply link to any other module (as we can easily program in our computers or on the Web); an actual physical connection must be made, composed of an axon connecting to a dendrite. We each start out with a vast stockpile of possible neural connections. As the Wedeen study shows, these connections are organized in a very repetitive and orderly manner. Terminal connection to these axons-in-waiting takes place based on the patterns that each neocortical pattern recognizer has recognized. Unused connections are ultimately pruned away. These connections are built hierarchically, reflecting the natural hierarchical order of reality. That is the key strength of the neocortex.
The basic algorithm of the neocortical pattern recognition modules is equivalent across the neocortex from "low-level" modules, which deal with the most basic sensory patterns, to "high-level" modules, which recognize the most abstract concepts. The vast evidence of plasticity and the interchangeability of neocortical regions is testament to this important observation. There is some optimization of regions that deal with particular types of patterns, but this is a second-order effect; the fundamental algorithm is universal.
Signals go up and down the conceptual hierarchy. A signal going up means, "I've detected a pattern." A signal going down means, "I'm expecting your pattern to occur," and is essentially a prediction. Both upward and downward signals can be either excitatory or inhibitory.
Each pattern is itself in a particular order and is not readily reversed. Even if a pattern appears to have multidimensional aspects, it is represented by a one-dimensional sequence of lower-level patterns. A pattern is an ordered sequence of other patterns, so each recognizer is inherently recursive. There can be many levels of hierarchy.
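The idea that a pattern is an ordered sequence of other patterns can be sketched as a small recursive lookup. This is a simplified illustration only: the pattern names and the `PATTERNS` table are made up for the example, and the sketch does exact matching with none of the redundancy, variability, or downward prediction signals described in this chapter.

```python
# Hypothetical pattern table: each pattern is an ordered list of lower-level patterns.
# Names that do not appear as keys are treated as primitive inputs.
PATTERNS = {
    "A": ["left_diagonal", "right_diagonal", "crossbar"],
    "P": ["vertical", "upper_loop"],
    "PA": ["P", "A"],
}


def expand(name):
    """Flatten a pattern into its one-dimensional sequence of primitive inputs."""
    if name not in PATTERNS:
        return [name]
    flat = []
    for child in PATTERNS[name]:
        flat.extend(expand(child))  # each child is itself a pattern: recursion
    return flat


def recognizes(name, inputs):
    """True only if the inputs arrive in exactly the order the pattern specifies."""
    return expand(name) == list(inputs)


forward = expand("PA")
print(forward)
print(recognizes("PA", forward))
print(recognizes("PA", reversed(forward)))
```

The last call fails because order matters: the same primitives presented in reverse do not constitute the pattern.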
There is a great deal of redundancy in the patterns we learn, especially the important ones. The recognition of patterns (such as common objects and faces) uses the same mechanism as our memories, which are just patterns we have learned. They are also stored as sequences of patterns-they are basically stories. That mechanism is also used for learning and carrying out physical movement in the world. The redundancy of patterns is what enables us to recognize objects, people, and ideas even when they have variations and occur in different contexts. The size and size variability parameters also allow the neocortex to encode variation in magnitude against different dimensions (duration in the case of sound). One way that these magnitude parameters could be encoded is simply through multiple patterns with different numbers of repeated inputs. So, for example, there could be patterns for the spoken word "steep" with different numbers of the long vowel [E] repeated, each with the importance parameter set to a moderate level indicating that the repetition of [E] is variable. This approach is not mathematically equivalent to having the explicit size parameters and does not work nearly as well in practice, but is one approach to encoding magnitude. The strongest evidence we have for these parameters is that they are needed in our AI systems to get accuracy levels that are near human levels.
The summary above constitutes the conclusions we can draw from the sampling of research results I have shared above as well as the sampling of thought experiments I discussed earlier. I maintain that the model I have presented is the only possible model that satisfies all of the constraints that the research and our thought experiments have established.
Finally, there is one more piece of corroborating evidence. The techniques that we have evolved over the past several decades in the field of artificial intelligence to recognize and intelligently process real-world phenomena (such as human speech and written language) and to understand natural-language documents turn out to be mathematically similar to the model I have presented above. They are also examples of the PRTM. The AI field was not explicitly trying to copy the brain, but it nonetheless arrived at essentially equivalent techniques.
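The "steep" example can be sketched as a set of stored variants that differ only in how many times the long vowel [E] repeats. This illustrates the multiple-patterns approach only, not explicit size parameters; the phoneme symbols, the repeat range, and the function names are assumptions made for the example.

```python
def steep_variants(min_repeats=1, max_repeats=4):
    """Build one stored pattern per allowed number of repeated [E] inputs."""
    return {
        ("s", "t") + ("E",) * n + ("p",)
        for n in range(min_repeats, max_repeats + 1)
    }


def matches_steep(phonemes, variants=None):
    """True if the input sequence equals any one of the stored variants."""
    variants = steep_variants() if variants is None else variants
    return tuple(phonemes) in variants


print(matches_steep(["s", "t", "E", "E", "p"]))   # a short vowel
print(matches_steep(["s", "t", "E", "E", "E", "E", "p"]))  # a drawn-out vowel
print(matches_steep(["s", "t", "p"]))             # no vowel at all
```

Every tolerated duration needs its own stored pattern, which hints at why this scheme is clumsier than an explicit size and size-variability parameter.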
CHAPTER 5
THE OLD BRAIN
I have an old brain but a terrific memory. - Al Lewis
Here we stand in the middle of this new world with our primitive brain, attuned to the simple cave life, with terrific forces at our disposal, which we are clever enough to release, but whose consequences we cannot comprehend. - Albert Szent-Gyorgyi
Our old brain, the one we had before we were mammals, has not disappeared. Indeed it still provides much of our motivation in seeking gratification and avoiding danger. These goals are modulated, however, by our neocortex, which dominates the human brain in both mass and activity.
Animals used to live and survive without a neocortex, and indeed all nonmammalian animals continue to do so today. We can view the human neocortex as the great sublimator; thus our primitive motivation to avoid a large predator may be transformed by the neocortex today into completing an assignment to impress our boss; the great hunt may become writing a book on, say, the mind; and pursuing reproduction may become gaining public recognition or decorating your apartment. (Well, this last motivation is not always so hidden.) The neocortex is likewise good at helping us solve problems because it can accurately model the world, reflecting its true hierarchical nature. But it is the old brain that presents us with those problems. Of course, like any clever bureaucracy, the neocortex often deals with the problems it is assigned by redefining them. On that note, let's review the information processing in the old brain.
The Sensory Pathway
Pictures, propagated by motion along the fibers of the optic nerves in the brain, are the cause of vision. - Isaac Newton
Each of us lives within the universe-the prison-of his own brain. Projecting from it are millions of fragile sensory nerve fibers, in groups uniquely adapted to sample the energetic states of the world around us: heat, light, force, and chemical composition. That is all we ever know of it directly; all else is logical inference. - Vernon Mountcastle1
Although we experience the illusion of receiving high-resolution images from our eyes, what the optic nerve actually sends to the brain is just a series of outlines and clues about points of interest in our visual field. We then essentially hallucinate the world from cortical memories that interpret a series of movies with very low data rates that arrive in parallel channels. In a study published in Nature, Frank S. Werblin, professor of molecular and cell biology at the University of California at Berkeley, and doctoral student Boton Roska, MD, showed that the optic nerve carries ten to twelve output channels, each of which carries only a small amount of information about a given scene.2 One group of what are called ganglion cells sends information only about edges (changes in contrast). Another group detects only large areas of uniform color, whereas a third group is sensitive only to the backgrounds behind figures of interest.
The visual pathway in the brain.
"Even though we think we see the world so fully, what we are receiving is really just hints, edges in space and time," says Werblin. "These 12 pictures of the world constitute all the information we will ever have about what's out there, and from these 12 pictures, which are so sparse, we reconstruct the richness of the visual world. I'm curious how nature selected these 12 simple movies and how it can be that they are sufficient to provide us with all the information we seem to need."
This data reduction is what in the AI field we call "sparse coding." We have found in creating artificial systems that throwing most of the input information away and retaining only the most salient details provides superior results. Otherwise the limited ability to process information in a neocortex (biological or otherwise) gets overwhelmed.
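A minimal sketch of "throwing most of the input information away" is to keep only the few largest-magnitude values in a signal and zero out the rest. This top-k rule is a simple stand-in chosen for illustration; it is not the dictionary-learning formulation that the term sparse coding often denotes in the machine learning literature, and the function name is hypothetical.

```python
def keep_most_salient(signal, k):
    """Return a copy of `signal` with all but the k largest-magnitude values set to 0.0."""
    if k <= 0:
        return [0.0] * len(signal)
    # Rank positions by absolute value, largest first, and remember the top k.
    ranked = sorted(range(len(signal)), key=lambda i: abs(signal[i]), reverse=True)
    keep = set(ranked[:k])
    return [value if i in keep else 0.0 for i, value in enumerate(signal)]


dense = [0.1, -2.0, 0.3, 1.5, -0.2]
print(keep_most_salient(dense, 2))
```

Only the two strongest responses survive, so whatever processes the output has far less to handle than the original input.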
Seven of the twelve low-data-rate "movies" sent by the optic nerve to the brain.
The processing of auditory information from the human cochlea through the subcortical regions and then through the early stages of the neocortex has been meticulously modeled by Lloyd Watts and his research team at Audience, Inc.3 They have developed research technology that extracts 600 different frequency bands (60 per octave) from sound. This comes much closer to the estimate of 3,000 bands extracted by the human cochlea (compared with commercial speech recognition, which uses only 16 to 32 bands). Using two microphones and its detailed (and high-spectral-resolution) model of auditory processing, Audience has created a commercial technology (with somewhat lower spectral resolution than its research system) that effectively removes background noise from conversations. This is now being used in many popular cell phones and is an impressive example of a commercial product based on an understanding of how the human auditory perceptual system is able to focus on one sound source of interest.
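The band counts above imply a span of ten octaves (600 bands at 60 per octave). The sketch below lays out such a bank of band centers, assuming logarithmic spacing and a lowest band at 20 Hz. Both are assumptions made for illustration and are not details given in the text or taken from Audience's system.

```python
def band_centers(f_min_hz, bands_per_octave, n_bands):
    """Center frequencies spaced evenly on a log scale: the frequency doubles every octave."""
    return [f_min_hz * 2 ** (i / bands_per_octave) for i in range(n_bands)]


BANDS_PER_OCTAVE = 60   # from the text
N_BANDS = 600           # from the text
F_MIN_HZ = 20.0         # assumed lowest band center, not from the text

centers = band_centers(F_MIN_HZ, BANDS_PER_OCTAVE, N_BANDS)
octaves_covered = N_BANDS / BANDS_PER_OCTAVE

print(f"octaves covered: {octaves_covered:g}")
print(f"lowest center: {centers[0]:.1f} Hz, highest center: {centers[-1]:.1f} Hz")
```

By comparison, spreading 16 to 32 bands over a similar range would place only two or three bands in each octave.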
The auditory pathway in the brain.
Inputs from the body (estimated at hundreds of megabits per second), including that of nerves from the skin, muscles, organs, and other areas, stream into the upper spinal cord. These messages involve more than just communication about touch; in addition they carry information about temperature, acid levels (for example, lactic acid in muscles), the movement of food through the gastrointestinal tract, and many other signals. This data is processed through the brain stem and midbrain. Key cells called lamina 1 neurons create a map of the body, representing its current state, not unlike the displays used by flight controllers to track airplanes. From here the sensory data heads to a mysterious region called the thalamus, which brings us to our next topic.
A simplified model of auditory processing in both the subcortical areas (areas prior to the neocortex) and the neocortex, created by Audience, Inc. Figure adapted from L. Watts, "Reverse-Engineering the Human Auditory Pathway," in J. Liu et al. (eds.), WCCI 2012 (Berlin: Springer-Verlag, 2012), p. 49.
The Thalamus
Everyone knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness, are of its essence. It implies withdrawal from some things in order to deal effectively with others. - William James
From the midbrain, sensory information then flows through a nut-sized region called the posterior ventromedial nucleus (VMpo) of the thalamus, which computes complex reactions to bodily states such as "this tastes terrible," "what a stench," or "that light touch is stimulating." The increasingly processed information ends up at two regions of the neocortex called the insula. These structures, the size of small fingers, are located on the left and right sides of the neocortex. Dr. Arthur Craig of the Barrow Neurological Institute in Phoenix describes the VMpo and the two insula regions as "a system that represents the material me."4