Tuesday, 28 February 2012

The Evolution of Intelligence – From Neural nets to Language

Brain Storms - 9
An Information Processing Approach to the Evolution of Human Language and Intelligence
The aim of this post is to suggest that the way to understand the evolution of language and human intelligence is to start from the assumption that the human brain is no more than an animal brain in which some of the features have been stretched – in a way analogous to the giraffe's neck or the elephant's trunk. It notes how a very simple recursive architecture, demonstrated by an unconventional experimental computer language called CODIL, can actually be used to support a wide range of “intelligent” information processing tasks, and suggests how the approach could be extended to support natural languages. In particular it suggests trigger points where a comparatively small change could lead to an explosive increase in the potential information processing power of the brain.
Chris Reynolds

The new-born Animal Brain
So let's start by considering what a new-born animal brain has to do.
The initial assumption is that the “empty” brain knows nothing about the world outside, but will be immersed in a sea of signals, and its goal is to make some sense of good and bad feelings about its condition. It needs to categorise the signals and this means saving signals in some form of memory node in the neural net - referred to as memodes in this post. It will then need a deductive mechanism to compare the incoming signals with the memodes it has earlier saved, to predict some action to increase the time feeling good, or to avoid the threats.
To be useful the memodes need to be linked into groups so that if a signal arrives the brain can predict which signals are likely to follow and take early appropriate action. This suggests a simple approach. At any one time there are a number of active signals forming what can be considered a short term memory. The memodes are stored in the long term memory and are compared with the signals in the short term memory. If one memode is deduced as relevant it is activated to produce the predicted signal so that the appropriate action can be taken before the expected signal arrives from the outside world.
Take a very simple example. A mouse's brain receives a SMELL-OF-CAT signal, followed by a FAST-MOVING-OBJECT signal, followed by a FLIGHT-IN-PROGRESS signal and these appear for a short time in the short term memory. Having survived the incident, the brain prepares to recognise any repeat by linking three memodes SMELL-OF-CAT, FAST-MOVING-OBJECT and FLIGHT-IN-PROGRESS in the long term memory (signals in italics, memodes in normal print.)
Later the signal SMELL-OF-CAT appears in the short term memory and is recognised by the SMELL-OF-CAT memode. It "wakes up" the FAST-MOVING-OBJECT memode which checks the short term memory - and if it finds a FAST-MOVING-OBJECT signal, it wakes up the FLIGHT-IN-PROGRESS memode - which in turn activates the part of the brain which generated the original signal - anticipating the need to flee. If the FAST-MOVING-OBJECT memode does not find a matching signal it activates the part of the brain that had generated the original signal to, in effect, open the eyes and look!
In this case the linked memodes relate three different features of an event, but there are other ways they may be linked - for instance a CAT memode may actually act to bring together the memodes SIGHT-OF-CAT, SMELL-OF-CAT and SOUND-OF-CAT - which for many purposes could be considered as synonyms. A similar linkage could link together CHEESE, APPLE and OATS under the higher level memode FOOD.
The most general way of considering how the information in the brain is stored is to consider each memode as a set representing either a real entity linked to the signal from a sense organ observing something in the real world or some internally generated concept. Each memode is a set which in most cases has other memodes as "components", and the long term memory is a collection of recursively linked memodes.
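As a rough illustration, the cat and mouse example can be sketched in a few lines of Python. This is only a sketch under assumptions that are not part of the model itself: the class name Memode, the function predict and the "LOOK-FOR" reply are hypothetical names invented for the example, and a real neural net would of course not work through lists and sets in this way.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Memode:
    """Hypothetical memory node: a name plus the memodes it is built from."""
    name: str
    components: list["Memode"] = field(default_factory=list)

    def matches(self, short_term: set[str]) -> bool:
        """True if this memode or any nested component matches an active signal."""
        if self.name in short_term:
            return True
        return any(c.matches(short_term) for c in self.components)


def predict(chain: list[Memode], short_term: set[str]) -> Optional[str]:
    """Follow a chain of linked memodes against the short term memory.

    Returns None if the first memode is not woken by any signal, a request
    to look for the first missing signal, or the name of the final memode
    (the predicted signal) when everything before it matches.
    """
    if not chain[0].matches(short_term):
        return None  # nothing wakes the chain up
    for memode in chain[1:-1]:
        if not memode.matches(short_term):
            return "LOOK-FOR " + memode.name  # e.g. open the eyes and look
    return chain[-1].name


# CAT groups three "synonym" memodes, as in the text.
cat = Memode("CAT", [Memode("SIGHT-OF-CAT"),
                     Memode("SMELL-OF-CAT"),
                     Memode("SOUND-OF-CAT")])
chain = [cat, Memode("FAST-MOVING-OBJECT"), Memode("FLIGHT-IN-PROGRESS")]

print(predict(chain, {"SMELL-OF-CAT", "FAST-MOVING-OBJECT"}))  # FLIGHT-IN-PROGRESS
print(predict(chain, {"SMELL-OF-CAT"}))  # LOOK-FOR FAST-MOVING-OBJECT
```

The point of the sketch is that the same recursive test serves both for a single signal and for a grouping memode such as CAT.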

How does a memode work?
A memode is a memory unit which can store, recognise or signal information relevant to the animal whose brain it is.
Except at the lowest levels of mental activity processing single "bit" signals from the sense organs, a memode can be seen as a collection of linked memodes (i.e. the concept of a memode is recursive).
A memode may be in the following states:
Excited – signalling that it is an active part of the current information context. There is a limit to the number of active signals – so the memodes representing the older less accessed signals revert to sleeping.
Active Compare – If a memode is asked if it is “True” it looks at the signals in the short term memory and if it matches any of them it responds to the request as “True”. Otherwise it interrogates linked memodes to see if they are true. (If the original signal came from a sense organ such as the eye this request might result in the eye actively looking for evidence of the signal.) If no “True” response is received in a reasonable time it goes back to sleep. (The approach works on the basis of “If no reply then not true – so ignore”.)
Active do – If a memode is deduced to be true it becomes excited and is added to the short term memory. If the original signal represented some bodily action this may trigger an appropriate response.
Asleep – The memode is waiting for an active call.
All memodes have a priority relative to those they are linked to, and where there are multiple links the ones with the higher priorities will be activated first. The more a memode is used the higher the priority, unused memodes eventually being forgotten. If a memode is associated with a threat it will automatically be given a high priority.
When something “memorable” happens the current excited memodes are linked together in the long term memory, either to form a completely new memode or to increase the priority of those already in long term memory.
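The priority rules can be illustrated with a small Python sketch. Everything specific in it is an assumption made for the example rather than part of the model: the names Link, LongTermMemory, remember, wake_order and decay are hypothetical, the threat priority of 100 is arbitrary, excited memodes are simply linked in the order they occurred, and forgetting is modelled as a crude step that lowers every priority by one.

```python
from dataclasses import dataclass

THREAT_PRIORITY = 100  # assumed value: threats go straight to the front


@dataclass
class Link:
    target: str
    priority: int = 1


class LongTermMemory:
    def __init__(self) -> None:
        self.links: dict[str, list[Link]] = {}

    def remember(self, excited: list[str], threat: bool = False) -> None:
        """Link consecutive excited memodes, or strengthen existing links."""
        for source, target in zip(excited, excited[1:]):
            links = self.links.setdefault(source, [])
            for link in links:
                if link.target == target:
                    link.priority += 1  # used again, so more important
                    break
            else:
                link = Link(target)  # a completely new link
                links.append(link)
            if threat:
                link.priority = max(link.priority, THREAT_PRIORITY)

    def wake_order(self, source: str) -> list[str]:
        """Linked memodes in the order they would be woken."""
        ordered = sorted(self.links.get(source, []), key=lambda l: -l.priority)
        return [l.target for l in ordered]

    def decay(self) -> None:
        """Lower every priority by one and forget links that reach zero."""
        for source, links in self.links.items():
            for link in links:
                link.priority -= 1
            self.links[source] = [l for l in links if l.priority > 0]


ltm = LongTermMemory()
ltm.remember(["SMELL-OF-CHEESE", "FOOD"])
ltm.remember(["SMELL-OF-CHEESE", "TRAP"], threat=True)
ltm.remember(["SMELL-OF-CHEESE", "FOOD"])
print(ltm.wake_order("SMELL-OF-CHEESE"))  # ['TRAP', 'FOOD']

ltm.decay()
ltm.decay()
print(ltm.wake_order("SMELL-OF-CHEESE"))  # ['TRAP']
```

In this toy version a single threatening event outranks a link that has been reinforced several times, and an unused link disappears after a couple of decay steps.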

Could this really be the basis for the capabilities of the Human Brain?
This is basically a very simple architecture where a memode can only become excited, decide whether it is true in the context of the current signals in the short term memory, generate a new signal, or pass wake up messages to other memodes who will often not bother to reply! Surely we are so clever (or like to think we are) that this is a totally inadequate foundation for our intelligence – but perhaps it is time to question this assumption, especially as recent research suggests that many animals are more intelligent than had previously been credited.
If we think about the giraffe's neck expanding under evolutionary pressure then we need to consider ways that brain power can be expanded without changing the basic way it works.
Clearly the bigger the brain the more memodes can be accommodated, and the more information can be stored. However there is no point in a bigger brain if it takes too long to fill up with usable knowledge – as the basic learning system is really based on trial and error (via adjusting priorities when situations repeat) and looks just too slow for what humans do. This simple learning system may well be adequate for very young children learning the phonemes of their mother tongue, but seems inadequate for the later rapid burst in learning. This leads to one of the key trigger points to be discussed later.
The number of excited memodes limits the capacity to handle a large number of inputs simultaneously. But Miller's idea that the human short term memory is limited to about seven items suggests there is no need to suggest any great increase in capacity compared with animals here.
The use of recursion is important as in computer programming it is known to be a very powerful concept. But the computer way of doing it has serious complications when mapped onto a neural net, in terms of handling parameters, working memory, return links, and looping. The simple brain model gets round this by working on the basis of “Only respond if I need to be told that the answer to my request is 'Yes'” - possibly accompanied by some kind of "time out" feature.
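To make the "only reply if the answer is Yes" idea a little more concrete, here is a minimal Python sketch. It rests on assumptions of my own: the function name ask is hypothetical, the "time out" is modelled as a simple budget of requests rather than real time, and a memode that has already been asked stays silent, which keeps circular links from looping for ever.

```python
from typing import Optional


def ask(name: str,
        links: dict[str, list[str]],
        signals: set[str],
        budget: list[int],
        asked: Optional[set[str]] = None) -> bool:
    """Return True only if a 'Yes' comes back before the budget runs out.

    Silence (False) covers three cases that the caller cannot tell apart:
    no match, time out, and a memode that was already awake.
    """
    asked = set() if asked is None else asked
    if budget[0] <= 0 or name in asked:
        return False  # timed out, or already awake: no reply
    budget[0] -= 1
    asked.add(name)
    if name in signals:
        return True  # matches a signal in the short term memory
    # Otherwise pass the request on to the linked memodes.
    return any(ask(child, links, signals, budget, asked)
               for child in links.get(name, []))


links = {
    "FOOD": ["CHEESE", "APPLE", "OATS"],
    "CHEESE": ["SMELL-OF-CHEESE"],
    "APPLE": ["FOOD"],  # a circular link, to show that it does no harm
}

print(ask("FOOD", links, {"SMELL-OF-CHEESE"}, [10]))  # True
print(ask("FOOD", links, {"SMELL-OF-CHEESE"}, [2]))   # False (timed out)
print(ask("FOOD", links, set(), [10]))                # False (no reply)
```

Note that there are no parameters to hand back, no return links to remember and no explicit loop control: a request either produces a "Yes" or it simply fades away.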
More needs to be said about the ways memodes are linked and this will be discussed below, and at least one of the matters to be considered is another potential tipping point. Apart from perhaps selecting which links to activate, this is not seen as affecting the basic decision making process, but rather the way the links are established.
While the memodes can be discussed in terms of sets, the pure mathematician will undoubtedly argue that such a system could not operate logically within the confines of formal set theory. My response to this is “So what!!!” The point is that humans (and animals) are not perfect logic machines, and do not operate on perfect logic rules. In fact a very obvious feature of humanity is the way that people brought up in different cultures, with different belief systems, end up with logically incompatible views of the world, so any brain model must be capable of doing the same! We evolved in a world where incomplete knowledge is the general rule and the key question is whether a simple system such as the above is capable (with giraffe-neck-like extensions where needed) of supporting a wide range of sophisticated information processing activities.

Introducing CODIL
More information on the development of CODIL is given elsewhere (An Introduction to Publications on CODIL). The approach taken was, in some ways, diametrically opposite to that taken in the development of the stored program computer and it is useful to quickly highlight some of the differences.
The stored program computer architecture was designed to process arrays of numbers using well defined algorithms. The tasks were ones which many humans would find difficult (if not impossible) to understand whatever way they were carried out. The effect of the design is that a stored program computer is a black box in which the inner workings are always going to be hidden from the user, and most of the software on any computer is there to protect the users from ever getting a glimpse of the inner workings.
The proposed CODIL processor architecture was designed to help people with open ended tasks (i.e. ones which cannot be defined in advance) and a basic requirement was that there should be good understanding at the basic operating level. In effect the target was to produce a white box system which could tell the user, step by step, what it was doing, using the same language as the user used to instruct the system. The idea was to build a system where the human user and the machine could work together on difficult information processing tasks as a symbiotic team.
Unfortunately no CODIL hardware ever got built, but various experimental programmed models were produced and tested on a wide range of tasks, including online tutoring, data bases of medical and historical information, and a wide variety of artificial intelligence tasks. In addition a mini-version was made available as a school teaching package and received many favourable reviews. The problem (together with health related problems) that led to the project being abandoned was that the approach was philosophically incompatible with the way the computer industry was going and adequate funding proved impossible to obtain (see A Short History of CODIL).
The reason why CODIL is relevant is that, while there are differences, there are considerable similarities with the animal brain model described above, and an examination of what CODIL can do suggests ways in which the simple brain model discussed above can be extended to support human language and human behaviour.

CODIL and the Brain
In making the comparison there are several facts that need to be taken into account. Much of the software, as written, was concerned with purely computer matters – such as processing keystrokes on the keyboard, or organising the display screen. It was also designed as a potentially practical tool with, for example, a powerful arithmetic facility – although any initial model of the evolving human brain will not consider numbers (much less arithmetic functions) until language is well established. Such “computer” related features will be ignored in the following discussion.
In CODIL the knowledge base consists of lists of linked items, organised into associatively addressed (i.e. virtual linked) files. While the linking mechanism is different to that of memodes in a neural net, the effect is very similar. The most significant difference is that, by default, there is no learning function to modify the “priority” of items and links. However some tests have been done with learning functions included.
The CODIL system has a series of registers, called The Facts, which contain items which define the current context. These play the same role as the excited memodes of the short term memory of the animal brain model, but there is a different “garbage collection” approach. In theory there is no limit to the number of registers in the Facts but for most CODIL trials the number needed was under 10 if items such as “date” were considered as a single item. This suggests that the limit is in the way the human thinks about the problem (limited by the size of their short term memory) and not the potential power of the CODIL system.
The diagram shows the CODIL decision making unit which compares items from the knowledge base with the Facts. It involves comparing items, and moving selected items to the Facts. While the CODIL system uses a single processor and moves items around to access them, the underlying logic is virtually identical to that of the suggested animal model – which is working on a network with each memode being responsible for its own processing. There are differences which involve self-referencing situations (A links to B which links to C which links back to A) and the way such a situation is resolved. (In the animal brain model an item which is already “active” simply ignores all further requests to become “active”.)
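The general idea of comparing items with the Facts and moving selected items into the Facts can be illustrated with a toy Python sketch. I must stress that this is not the CODIL interpreter and makes no claim to reproduce its actual rules. It assumes, purely for illustration, that an item is a set name with a value (as in the examples later in this post), that a statement is abandoned as soon as one of its items disagrees with the Facts, and that items not yet in the Facts are moved there. The names run_statement and decide, and the sample knowledge base, are hypothetical.

```python
Item = tuple[str, str]  # (set name, value), an assumed representation


def run_statement(statement: list[Item],
                  facts: dict[str, str]) -> tuple[dict[str, str], bool]:
    """Compare each item of one statement with the Facts.

    Returns the (possibly extended) Facts and whether the statement applied.
    If any item disagrees with the Facts the original Facts are returned
    unchanged.
    """
    new_facts = dict(facts)
    for set_name, value in statement:
        if set_name in new_facts:
            if new_facts[set_name] != value:
                return facts, False  # mismatch: statement does not apply
        else:
            new_facts[set_name] = value  # move the item into the Facts
    return new_facts, True


def decide(knowledge_base: list[list[Item]],
           facts: dict[str, str]) -> dict[str, str]:
    """Try every statement in turn, keeping whatever reaches the Facts."""
    for statement in knowledge_base:
        facts, _applied = run_statement(statement, facts)
    return facts


knowledge_base = [
    [("ANIMAL", "cat"), ("THREAT", "yes"), ("ACTION", "flee")],
    [("ANIMAL", "cow"), ("THREAT", "no"), ("ACTION", "ignore")],
]

final = decide(knowledge_base, {"ANIMAL": "cat"})
print(final)  # {'ANIMAL': 'cat', 'THREAT': 'yes', 'ACTION': 'flee'}
```

Even in this crude form the Facts end up holding only a handful of items, which fits the observation that fewer than ten registers were usually enough.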

Moving towards Natural Language
If we are to look at how the animal brain model can be extended to accommodate natural language it is useful to look at how the memodes can be linked together, and I will use the simple model I introduced in Brain Storms 6: CODIL and Natural Language.
Let us consider the sentence “Macbeth murdered Duncan with a dagger.” and extract the nouns “Macbeth; Duncan; Dagger” and link the appropriate memodes together.

The problem is that in doing this information is lost as the relationship between the nouns is not retained, because they are not independent. In CODIL this problem is overcome by defining normal items to consist of a set name relevant to the context, and a text value:
MURDERER = Macbeth, VICTIM = Duncan, WEAPON = Dagger
This makes CODIL very much more powerful than a simple memode model and I recently realised (Brain Storms 7: Getting rid of those pesky numbers) that the CODIL format was a hang-over from the computer-related origins of the idea. The approach becomes far more general if a normal CODIL item is seen as a set name pair, the above statement becoming:
This can then be mapped into a memode structure in which the memodes in the above diagram become a nested structure involving memode pairs.

Of course the MACBETH memode will be linked elsewhere with other memodes (set names), such as KING, SCOTTISH, SHAKESPEAREAN-CHARACTER, etc. and this suggests that some links will be context dependent. The important thing to note is that this approach does not significantly affect the way decisions are made, but does require a specific linkage to be established which is context dependent. More work needs to be done in this area but it seems unlikely that the simplest animal brains will be handling such sophisticated links, and brains that can handle such links in any quantity may need more capacity.
This suggests that one distinct evolutionary step in creating a more advanced brain might be the ability to have "dummy" memodes - with no explicit signal of their own - whose role is to link together groups of memodes into specific contexts.
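The difference between simply linking the three nouns and linking them through a "dummy" memode can be shown with a short Python sketch. It is an illustration only, and the names Memode, Context, roles and matches are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Memode:
    name: str
    member_of: set[str] = field(default_factory=set)  # context-free links


@dataclass
class Context:
    """A 'dummy' memode: no signal of its own, only role -> memode links."""
    roles: dict[str, Memode]

    def matches(self, **known: str) -> bool:
        """True if every named role is filled by the named memode."""
        return all(role in self.roles and self.roles[role].name == name
                   for role, name in known.items())


macbeth = Memode("MACBETH", {"KING", "SCOTTISH", "SHAKESPEAREAN-CHARACTER"})
duncan = Memode("DUNCAN")
dagger = Memode("DAGGER")

# Plain linking keeps the nouns but loses who did what to whom.
plain = {macbeth.name, duncan.name, dagger.name}

# The context memode keeps the relationship between them.
murder = Context({"MURDERER": macbeth, "VICTIM": duncan, "WEAPON": dagger})

print(murder.matches(MURDERER="MACBETH"))  # True
print(murder.matches(MURDERER="DUNCAN"))   # False
print(murder.roles["VICTIM"].name)         # DUNCAN
```

The MACBETH memode itself is unchanged by being used in the murder context, which is why its other links (KING, SCOTTISH and so on) can sit alongside the context dependent ones.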

Communicating information between brains
As mentioned earlier, there may be little evolutionary advantage in developing a large brain, and then spending all your life filling it with detailed knowledge of your environment, if everything is lost when you die. Ideally you will need to do most of the learning early in life to maximise the benefit and the faster you can learn the better. If information learnt by one generation can be passed to the next this is clearly an advantage, and there are widespread examples of learning by imitation in animals which have social groups, such as chimpanzees and orca, where different groups clearly have different cultures passed from generation to generation.
Symbolic communications in social animals in such activities as hunting can be useful and I have examined the human implications in Brain Storms 3: Evolutionary factors starting on the African Plains. If such communication methods can be extended to enable transfer of learnt information this could alter the economics of the larger brain. If a species (or cultural group within a species) develops a moderately sophisticated signing language to control communal hunting it is probably only a small step (in terms of brain development) to adapt the way the signs are used to plan a future hunt and to recount what happened in a recent hunt. One step more and the result could be "teaching" the next generation the elements of how to hunt without actually taking them into what might be a dangerous situation for the inexperienced juveniles. Once communication has reached a stage of being able to say "Don't do this because that is how 'Uncle Joe' died" the advantage of a larger brain to accept more communications will become important. However there seems to be no obvious reason why there should be a change in the way the brain makes decisions. Such activities may well involve adding a further layer of interconnecting memodes, which represent the communication language symbols (words).

Once there is an effective mechanism for passing information from one generation to the next, the speed at which such information can be transferred becomes important. The normal mechanism described earlier is a pattern matching activity where the signal history affects the relative priority of memodes and their links to each other. This mechanism seems perfectly appropriate to the way a human child learns to recognise the phonemes of its mother tongue but is totally inappropriate to the later stage - when a child appears to absorb new ideas (via language) like a sponge. It is almost as if the new information is loaded directly into the brain with no "trial and error" learning or assessment.
At first sight we appear to need a completely new "speed learning" mechanism - but actually all effective animal brains will need to have one to deal with the cat and mouse example at the beginning of this blog. If an animal's brain receives signals which suggest a real threat it is a matter of life and death to recognise such a situation and take the appropriate avoiding action. If signals A and B are followed by threatening signal C it is better to take action whenever A or B occur, rather than wait to see whether the next time they are followed by C. Of course sometimes the presence of A and/or B will be irrelevant to the appearance of C - but if a threat is involved it is better to be safe than sorry.
This suggests that all that is needed for high speed learning is for the existing threat mechanism to be extended to cover information that comes in via the "language" channel. There are implications in this approach as, in effect, such information will be given priority, and hence taken as true. Most may well be true but in many cases the information imparted may relate to the myths and traditions of earlier generations. Perhaps all that it is appropriate to say at this stage is that virtually all successful cultures put a lot of effort into training young children in their control to believe the traditions associated with the culture or the beliefs which the rulers consider are politically correct.
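The contrast between slow "trial and error" learning and the fast route used for threats can be sketched in Python as follows. The sketch is only an illustration built on assumptions of my own: the class name Learner, the flags threat and told, and the threshold of three repeats are all hypothetical, and the only point being made is that information arriving through the "language" channel is treated exactly like a threat.

```python
ACT_THRESHOLD = 3  # assumed number of repeats before a normal link is trusted


class Learner:
    def __init__(self) -> None:
        self.priority: dict[tuple[str, str], int] = {}

    def observe(self, cue: str, outcome: str,
                threat: bool = False, told: bool = False) -> None:
        """Record that an outcome followed a cue.

        Ordinary experience raises the priority one step at a time.
        Threats, and anything told via language, are trusted at once.
        """
        key = (cue, outcome)
        if threat or told:
            self.priority[key] = max(self.priority.get(key, 0), ACT_THRESHOLD)
        else:
            self.priority[key] = self.priority.get(key, 0) + 1

    def expects(self, cue: str) -> list[str]:
        """Outcomes the learner is prepared to act on for this cue."""
        return sorted(outcome for (c, outcome), p in self.priority.items()
                      if c == cue and p >= ACT_THRESHOLD)


learner = Learner()

learner.observe("RUSTLING-GRASS", "BIRD")
print(learner.expects("RUSTLING-GRASS"))  # [] (one event is not enough)

learner.observe("SMELL-OF-CAT", "ATTACK", threat=True)
print(learner.expects("SMELL-OF-CAT"))    # ['ATTACK'] (better safe than sorry)

learner.observe("RED-BERRIES", "POISON", told=True)
print(learner.expects("RED-BERRIES"))     # ['POISON'] (accepted without testing)
```

Nothing in the sketch checks whether what was told is actually true, which is exactly the weakness, and the strength, of learning through the language channel.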

Where do we go from Here?
I think many people will agree that there is a Black Hole in Brain Research - but you may well be reacting negatively to the above proposed solution. For many reasons you are right to be cautious. It is often said that "Exceptional claims require exceptional proof" and you probably feel that what I am saying on this blog is exceptional for the following reasons.
  • My earlier work on CODIL was, in effect, blue sky research going back to first principles and asking what would be the best architecture for a human friendly information processing machine involving hard-to-define tasks. I am suggesting that the stored program computer architecture, devised for implementing well defined algorithms for processing numbers, was not a good starting point, despite the enormous successes of the technology (which I do not dispute) and the fact that most countries now insist on teaching their young people that the stored program computer is "the greatest thing since sliced bread."
  • In this post I am saying, in effect, that all the detailed research looking for the fundamental difference between the thought processes of humans and animals is irrelevant because there are no fundamental differences. Instead I am suggesting that the human brain is no more exceptional, in evolutionary terms, than the giraffe's neck or the elephant's trunk. Our intelligence is the result of simply stretching certain features of all animal brains to a very large extent - helped by positive cultural feedback from the use of language.
So I am being controversial and you are right to ask "Where is your exceptional proof?"
The problem is that exceptional proof requires exceptional resources to supply - and where do you think such resources should come from - and what kinds of support do you think a researcher would need while he was collecting sufficient preliminary evidence to be able to successfully capture the necessary resources?
Let me simply say that I abandoned the CODIL research because of lack of resources, sheer exhaustion, a family suicide, and a bullying boss. OK I collected quite a few brownie points - in the form of refereed papers in top journals, and a large number of favourable reviews of some educational software. I decided that my life, and family, were more important than trying to explore what looked like very promising but unfunded and highly controversial research. It really wasn't practical to continue to do research that questioned the established views held by the sources of support and funding. (See Why "Blue Sky" Research is so Difficult.)

So why have I posted here, more than 20 years after I retired and abandoned all formal contacts with the academic world?
It's actually quite simple. My son raised the question of the eventual need to clear my house when the Grim Reaper called - and pointed out that in the absence of any instructions the contents of the garage could end up in a skip. In retirement I have become very much involved in historical research (I run the Genealogy in Hertfordshire web site) and this caused me to think about the possibility of finding an organisation who might be interested in all or part of my computer archives.
The first step was to carry out an online survey to see what relevant research had been done since my retirement - and I discovered the fact that there was nothing to link work on the physiology of the brain, the path of childhood development, and how language worked. I was also impressed by various projects trying to understand animal intelligence. It seemed that the research I had abandoned might map onto a neural net and provide the missing link. If I was forty years younger, and still working in an academic environment, I would undoubtedly be rushing to get in a research application to fund further research.
The truth is that I am now in my 70s and have no desire to return to full time research. Definitely I do not have the energy to catch up on all the research literature relating to all aspects of the brain and its evolution that might be relevant. I am also aware that while my ideas are interesting, there is a lot I haven't said, some areas where I know that there are difficulties and numerous questions which still need investigating.
So what can I do? The answer is that I have set up this blog to alert people to a possibly interesting area of future research. Please do not treat the above post as a definite answer to the problem. As the title indicates I would like you to treat this post as a brain storming session and add comments giving your reactions. Obviously if anyone is interested in actually taking the research further I would be happy to advise, make available the MicroCODIL software, and in many cases I may well have unpublished information in my archives that could be of use.

Earlier Brain Storms
  1. Introduction
  2. The Black Hole in Brain Research
  3. Evolutionary Factors starting on the African Plains
  4. Requirements of a target Model
  5. Some Factors in choosing a Model
  6. CODIL and Natural Language
  7. Getting rid of those pesky numbers
  8. Was Douglas Adams right about the Dolphins


  1. Hello Chris, very interesting brain storm, it needs a lot of research in computing and psychology. About "memodes", how different or similar are they compared to a neuron in the context of neural networks?
    Will give further comment later.
    Best wishes

  2. It can help to think of a memode as an interconnected group of one or more memodes/neurons – so in the simplest case a memode would be a single neuron. At a higher level the entry point to a memode could be considered to be a “master neuron” which responds after “consulting” lower level linked memodes/neurons. On this basis all memodes can be considered to be a single neuron backed by a group of zero or more other neurons.
    However this is an over simple view – in that we need to allow for what, in computer terms, would be called fuzzy logic – so that a group of memodes/neurons which can recognise a face may have many different possible “entry neurons” any of which could act as the “master neuron”. In addition subgroups of memodes/neurons may be linked into many different higher groups (in computer terms think of subroutines or shared definitions).
    I will expand on this later as a separate post, but will wait until I have more comments on this post. The whole point of the brain storming approach is to help me to identify the areas where people have questions about the model, and to expand on my thoughts on the subject. The subject is so large I could not possibly address all potential issues in one post.

  3. I believe it is time we consider Post Neural Networks models, to overcome limitations of the NN model in such applications as natural language ...

    1. You may be surprised (perhaps you didn't look at the publications on CODIL before replying) as I think that we may be in agreement. See http://trapped-by-the-box.blogspot.com/2012/03/neural-nets-or-networks-of-neurons.html for a detailed discussion on what I am trying to model.