Science and mathematics are all about building abstract models which attempt to reflect various aspects of reality. As someone who did a Ph.D. in theoretical organic chemistry I am well used to the idea of multiple models of atoms and how they interact – whether, for example, it is more useful to think of them as miniature billiard balls or as abstract probability functions. The problem Chris Yapp refers to arises because the computer industry has myopically concentrated on a single model, based on pre-defined algorithms, for the way information can be processed.
The early computers were developed to carry out highly repetitive mathematical calculations which could not be done quickly or accurately enough by specially trained human beings. It turns out that many highly repetitive tasks can be represented by a predefined set of rules (an algorithm) and hence carried out using a computer. What could be done was limited by the speed and memory of the computer, and by the ability of people to program the systems. However this proved to be no real barrier, as every few years faster computers with more memory, and easier-to-use programming languages, appeared on the market, while more and more people were trained to use them. There was big money to be made and careers to be built, and it seemed that everyone tried to get on the bandwagon. Worldwide, hundreds of people started to develop better hardware and software, and the result was a rat race in which the first to get a successful product to market won, and the rest fell by the wayside.
In this heated environment did anyone stop and ask whether there was an alternative information model for handling open-ended tasks involving dynamic interaction with human beings? Even the pioneering work at Xerox PARC, which led to the kinds of user interfaces we find today on personal computing systems, did not go back to first principles. It took it for granted that computers were inherently opaque black-box systems and that what was needed was a front end which hid the incomprehensible internal workings from the human users. Dozens of different computer languages were devised to find different ways to write algorithms – without asking whether humans naturally thought in an algorithmic way. It was suggested that there was no point in looking for an alternative approach because theoreticians such as Turing had related the stored program computer to a “universal machine” – and surely one couldn’t possibly start with anything better than a universal machine. In fact anyone who took time off to question the scientific foundations of what was an outrageously successful industry would soon find themselves at the back of the queue in the race for fame and fortune.
But is the algorithmic model really the best, or the only, “universal machine” model for handling information – especially where incompletely understood and dynamically changing real-world tasks, involving incomplete and fuzzy information, are concerned?
My own research suggests that there is an alternative – but to someone who is immersed in the world of formal algorithms the first steps are counter-intuitive.
In 1967 I made a mistake as far as my career was concerned, as I would undoubtedly have had an easier life if I had not queried the establishment line. I was a comparative newcomer to the computer industry, but one who had entered via an unusual career path. I had experience of working in a very complex manual management information system where the key was spotting and reporting the unexpected. I then moved to a very large and complex commercial sales accounting system (Shell Mex & BP) in a completely different industry, where the problem was interfacing with a wide and ever-changing market. It may well have been one of the most advanced computer systems of its type at the time. Finally I moved to a planning department concerned with the probable market place for the next generation of large computers. My mistake was to pass my boss a note which said that I thought it might be possible to reprogram the microcode of an IBM-architecture computer to give it a human-friendly symbolic assembly language. The language was called CODIL, short for Context Dependent Information Language. In retrospect what I had done was to take my manual skills in processing open-ended tasks and transfer them to the computer.
The note was passed to the computer pioneers David Caminer and John Pinkerton (who I understand consulted Professor Maurice Wilkes) and as a result I was quickly transferred to research with a useful-sized budget and told not to talk to anyone until the patents had been taken out. What happened was that an initial tentative idea, which in retrospect needed several years of interdisciplinary brainstorming, was dropped straight into the computer industry rat race. Apart from the fact that the idea clearly caused excitement, I had no idea how unconventional it was, and knew nothing about research into the relevant mathematical theory or the psychological studies relevant to modelling human thinking. I spent two years writing and testing a pilot simulation program which demonstrated that the idea was at least capable of processing a range of different applications. My reward was to be declared redundant because of the formation of ICL and the closure of the research division in which I worked. Despite the support of Basil de Ferranti (the new Research Director) my project was deemed irrelevant to the company policy of developing the 2900 Series of computers – and it had to go.
So, with the benefit of nearly 50 years hindsight, what was the idea at the heart of my proposal?
The stored program model is a rule-based, top-down approach which uses numbers to process numbers and assumes that there is a human “creator” who can, a priori, define the rules. If you look carefully at the “universal machine” approach you realise that the theory does not cover cases where the rules are not knowable in advance. In practice there is the additional restriction that any “knowable” rules must be identifiable and implementable at a reasonable cost and on a realistic timescale.
In contrast, the CODIL model I developed is a bottom-up pattern-recognition approach which assumes no prior knowledge of the task to be handled. Viewed as a mathematical model it uses sets and partitions of sets, but these can be thought of as concepts where its human user interface is concerned. (For example the CODIL item “Murderer = Macbeth” is treated by the system as defining “Macbeth” as a member of the set “Murderers”.) In set-theoretic terms the model seems weak, but its strength lies in the power of recursion plus the ability to morph into the stored program computer model. This happens if you can divide the patterns into two groups – one set of patterns being “the rules” and the other “the data”. However the system is at its best when handling problems where there is no clear pre-definable global model, and it can become very inefficient when handling tasks which require millions of iterations through a small number of precisely defined rules working within a highly constrained global model – the very area where the stored program computer is strongest.
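To make the set-based reading concrete, here is a minimal sketch in Python of how items such as “Murderer = Macbeth” can be held as set memberships and matched against stored patterns. It is purely my illustration – the original interpreters were not written in Python, and the little knowledge base and the function names are invented for the example.

    # Toy model of the CODIL item/pattern idea, not the original interpreter.
    # An item ("Murderer", "Macbeth") asserts that "Macbeth" is a member of
    # the set "Murderer"; a statement is simply a list of such items.
    from typing import List, Tuple

    Item = Tuple[str, str]
    Pattern = List[Item]

    knowledge: List[Pattern] = [
        [("Play", "Macbeth"), ("Murderer", "Macbeth"), ("Victim", "Duncan")],
        [("Play", "Hamlet"), ("Murderer", "Claudius"), ("Victim", "King Hamlet")],
    ]

    def consistent(context: Pattern, stored: Pattern) -> bool:
        """A stored pattern is relevant if it does not contradict the context:
        wherever both name the same set, the members must agree."""
        ctx = dict(context)
        return all(ctx.get(name) in (None, member) for name, member in stored)

    def recall(context: Pattern) -> List[Pattern]:
        """Return every stored pattern consistent with the current context."""
        return [p for p in knowledge if consistent(context, p)]

    # "Who murdered Duncan?" - fixing Victim = Duncan leaves only the pattern
    # that supplies Murderer = Macbeth.
    for pattern in recall([("Victim", "Duncan")]):
        print(dict(pattern).get("Murderer"))   # -> Macbeth

Note that there is no division into program and data here: the same structure serves as stored knowledge, query and answer – which is the sense in which the patterns can later be partitioned into “rules” and “data”.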
The two models are in fact complementary – providing different views of the world of information processing:
· The stored program computer model can be considered a special case of the CODIL model – but at the same time the CODIL model can be considered a special case of the stored program computer model, as it represents an algorithm whose specialist task is to provide human users with a tool to manage tasks for which the relevant algorithms are not known in advance.
· The stored program computer model is best at formal mathematical tasks which people find difficult – while the CODIL model is more appropriate to open-ended real-world tasks where human interaction is essential. Each excels in the areas where the other is weakest.
· The original proposal (relating to reprogramming the processor microcode of an existing system), and some later research, suggests that it could be possible to build systems which combine both models.
· Recent work suggests that the CODIL model will map onto a neural network and provide the basis of an evolutionary pathway to explain human intelligence.
So what happened next?
1971 – Looking for a home.
ICL had agreed that I could continue the research at a university with access to the patent, but without any funding from them, on the understanding that I didn’t make explicit claims suggesting that ICL had made a mistake in closing down the project. John Pinkerton helped me draft a paper for the Computer Journal [The CODIL Language and its Interpreter], and a couple of conference papers. The problem was finding a suitable opening, and in the interim I worked on a vast and soul-destroying military project called Linesman. I still had no real idea of how potentially controversial my research was, and naively thought that as long as I got paid and had access to computers any university would do. Then I got two invitations to interview at the same time. The first was for a Readership in Computer Science at Brunel University at a significant increase in salary, and the second was for a poorly paid assistant lectureship at Cambridge under Professor Maurice Wilkes. With a wife, three children and a mortgage to support I accepted the job as Reader and cancelled the interview at Cambridge.
To be fair Brunel University was set up to produce competent engineering graduates to work in industry, and was probably not the place to do research which was questioning the foundations of the technology it was teaching. It had been a technical college only a few years before, and still had very little experience of conventional research and no experience of anything as unconventional as CODIL proved to be. In addition it was very poorly equipped for what I wanted to do. The Computer Science Department had no modern computer of its own, and the only system I could use was an ICL1903 run by the separate Computer Unit. As a result the research on CODIL (which was basically a dynamic interactive language) was carried out by simulating interaction using punch cards on a batch system for the first 8 years!
1972-79 – The problems with Artificial Intelligence
The first thing I had to do on arriving at Brunel was to completely rewrite the early simulation software, and I decided to use COBOL (in retrospect not a good choice) because the idea had started from considering complex, large-volume data processing applications. One of the first real applications was a medical research records data base [Using CODIL to handle poorly structured clinical information], but one of my colleagues suggested that it might be appropriate to look at some Artificial Intelligence tasks. I responded that the literature looked difficult, and he said it was mainly hype – and that if you looked under the surface much of it was trivial. He lent me a copy of a recent American Ph.D. thesis and within a few days I had described all its tasks in CODIL and got the simulator program to produce the answers. As I mentioned above, CODIL can morph into something resembling a programming language. Every week the New Scientist published a logic puzzle called Tantalizer, and I used CODIL to write a heuristic problem-solving package called Tantalize which asked questions about the problem, generated a series of patterns, then (at least in some cases) ranked the patterns to put them into an optimum order, and matched the patterns against each other to produce the answer. On one occasion the package solved 15 consecutive Tantalizer puzzles week by week as they were published. I also combed the published literature and found it could solve nearly all of the similar puzzles in recent A.I. publications.
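To give a flavour of the strategy just described – generate candidate patterns, rank them so that the most restrictive are applied first, then match them against each other – here is a toy logic-puzzle solver in Python. The puzzle, the ranking heuristic and all the names are my own invention for illustration, not a reconstruction of the CODIL package.

    # Toy illustration of the Tantalize strategy, not the original package.
    from itertools import permutations

    people = ["Alice", "Bob", "Carol"]
    jobs = ["baker", "doctor", "pilot"]

    # Each candidate "pattern" is a complete assignment of jobs to people.
    candidates = [dict(zip(people, p)) for p in permutations(jobs)]

    # The clues of an invented puzzle, expressed as tests on a candidate.
    clues = [
        lambda c: c["Alice"] != "pilot",   # Alice is afraid of flying
        lambda c: c["Bob"] != "baker",     # Bob cannot cook
        lambda c: c["Carol"] == "doctor",  # Carol went to medical school
    ]

    # Heuristic ranking: apply the clue that eliminates most candidates first.
    clues.sort(key=lambda clue: sum(clue(c) for c in candidates))

    # Matching: keep only the candidates which survive every clue.
    for clue in clues:
        candidates = [c for c in candidates if clue(c)]

    print(candidates)  # -> [{'Alice': 'baker', 'Bob': 'pilot', 'Carol': 'doctor'}]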
There was only one problem: could I get any papers on the subject published? The answer was only with difficulty [A Psychological Approach to Language Design includes something about the A.I. research]. For example I submitted a pair of papers to one A.I. conference, the first describing the Tantalize package and the second full of examples of the problems it had solved. They were rejected on the grounds that the idea was too theoretical ever to work. Having tried a number of UK outlets I cheekily sent a paper on the subject to Comm. A.C.M. and got yet another rejection note. The paper had been to four different referees. Two vigorously denounced it as rubbish, one said he did not understand it, and one approved. I found the wording of the rejections very hurtful and I abandoned all further A.I.-related research work. In fact I didn’t even make any further attempts to publish what I had already done. It was only several years later, on re-reading the rejection letter, that I realised I had been so upset by the wording of the reviews that I had missed the significance of the editor’s last sentence – which suggested that I should continue the work, as there could well be something in it because it annoyed the reviewers so much. After all, he would have known who the reviewers were – and their likely responses to “not invented here” ideas.
In a way this highlights the biggest problem with being at Brunel. I needed a strong, supportive manager who understood the problems associated with promoting unconventional ideas. When I got these rejections I needed a shoulder to cry on, a voice to encourage me to persevere, and someone to suggest alternative ways forward. In fact I just got more depressed, and tried to avoid research and publication in areas where rejection would simply make me more miserable.
During this period I started to investigate the use of CODIL for handling historical data – as an example of how the approach could handle poorly structured information filled with uncertainties [CODIL as a knowledge base system for handling historical information]. In retrospect I probably spent too much time building a large demonstration data base.
1980-8 – Jim Fixed It for me.
Two things happened. The first was that the Department was asked by the BBC to help Jimmy Savile “Fix It” for a young boy who wanted a computer to do his homework, and I prepared a demonstration package, using CODIL, showing how a computer might help. The second was that shortly afterwards the University moved from batch to online computing. I quickly converted the CODIL software to the new system, in time to use a modified “Fix It” to introduce a class of 125 first-year undergraduates to using a computer – at a time when a small number had early personal computers and a third had not even used a typewriter, much less a computer. The results were very successful, and from then on CODIL was used to support a variety of interactive teaching packages and investigations into online publishing [A Software Package for Electronic Journals] associated with the British Library funded BLEND project. In addition I found myself with a heavier teaching load.
1982-6 – Is small beautiful?
I decided that what was required was a portable demonstration version of CODIL. I also decided to return to the original idea that it might eventually be possible to microcode the main routines. I therefore set out to do a complete design, to see how small the key parts of the system could be, and chose to use the BBC computer, which had a mere 32 Kbytes of memory for the program and all work areas, including the display and input/output buffers. The result was that I was able to introduce more recursive pathways, and the resulting MicroCODIL package [A Microcomputer Package for Demonstrating Information Processing Concepts] was logically very much more powerful than its mainframe predecessor – but, because of the small size of the BBC, it could only handle small quantities of information. The bulkiest part of the software was written in conventional code to interface with the display and the keyboard – the latter using dynamic colour coding to verify the syntax as the user typed information in, character by character [The Use of Colour in Language Syntax Analysis]. The result was a powerful “beta” schools package which demonstrated many different ways in which information could be processed. The package attracted enthusiastic reviews from a variety of magazines including the New Scientist, the Times Educational Supplement, The Psychologist, and other educational and hobby titles. I buckled down to write a more up-to-date paper for the Computer Journal [CODIL: The Architecture of an Information Language] and a number of other papers on the system’s applications.
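The colour-coded input idea is easy to illustrate. The sketch below – again in Python rather than anything that ran on the BBC Micro, with a grammar and colour rules invented for the example – checks a single hypothetical item of the form “Name = value” after every keystroke: green for a complete valid item, yellow while the entry could still become valid, red once no continuation can rescue it.

    # Illustration of per-character syntax colouring; not the MicroCODIL code.
    GREEN, YELLOW, RED = "green", "yellow", "red"

    def colour_so_far(text: str) -> str:
        """Colour the line as typed so far, for items like 'Murderer = Macbeth'."""
        name, eq, value = text.partition("=")
        if name.strip() and not name.strip().isalpha():
            return RED       # the set name must be purely alphabetic
        if not eq:
            return YELLOW    # still typing the set name
        return GREEN if value.strip() else YELLOW

    # Simulate keystrokes: the colour is recomputed after every character.
    line = ""
    for ch in "Murderer = Macbeth":
        line += ch
        print(f"{line!r:22} -> {colour_so_far(line)}")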
1987-8 – The End of the Road
The idea behind MicroCODIL was that it could give me the opportunity to move to a more suitable university environment. However there was a problem. During the 1980s I had spent too much time trying to produce a better CODIL interpreter, and appropriate applications, and too little time trying to unravel the underlying theory or making grant applications. In addition the work on MicroCODIL had been significantly disrupted by the illness and subsequent tragic death of my daughter Lucy, and I was suffering from post-traumatic stress disorder.
What happened was that a new Head of Department was appointed, and at the same time Maggie Thatcher decided that universities should have the opportunity to get rid of “deadwood.” I was singled out for the chop, and the new head used every opportunity to humiliate me into taking early retirement. For instance he insisted that all evidence of my research should be removed from the department notice boards – including the very complimentary reviews of MicroCODIL in magazines such as New Scientist and the Times Educational Supplement. When my paper was accepted by the Computer Journal his reaction was to say that B.C.S. standards must have fallen if they would accept such rubbish from me, and I am sure that when it was eventually published he did not include it in the list of publications by department staff. Under such treatment, combined with the effects of my daughter’s death, I simply took the quick and easy way out, accepted early retirement, and left as quickly as possible. Shortly before I left I had an article published in the New Scientist on the virtual impossibility of doing unconventional creative research at a modern university [Why genius gets nipped in the bud].
1989-2010 – Retirement
I had not intended to abandon work on CODIL but I urgently needed a break. I had been at Brunel seventeen years and as far as I can remember no-one in the department had had a sabbatical in all that time. So on leaving I arranged to spend a year in Australia working with the CSIRO building a database of information on climate change – although I ended up working on a database on part of the Tasmanian rainforest for Australian Heritage. On my return to England I thought about writing a CODIL interpreter for the personal computer but decided life was too short to keep banging one’s head against a powerful establishment brick wall. My pension was just about enough to live on and I treated it as a salary which allowed me to do voluntary work for better mental health services at both the local and national levels. I was not going to give up on research but switched to local history studies and I now run a large web site [Genealogy in Hertfordshire] which provides free help and advice to local and family historians.
When I reached 70 I decided to gracefully retire from the mental health work, and I started to think about CODIL, and whether it had any relevance today, or whether I should simply dump the complete set of research papers in a skip. I therefore decided to have a good look on the internet to catch up on what had happened in the 25 or so years since I abandoned the research. After all I was now in a “Blue Sky research” situation with no need to please my boss, no need to churn out regular papers, and (at least at the initial thinking stage) no need for funding. This is what I found:
· I could find no evidence that anyone else had gone down this route.
· It is now generally agreed that the approach to Artificial Intelligence common in the 1970s contributed little or nothing to the understanding of human intelligence. (It was this school of thought that rejected my 1970s papers and so led me to abandon research in the area.)
· In the 1970s I had thought about how CODIL might interface with natural language, but abandoned the idea because I couldn’t relate my ideas to those of Chomsky. I now find that a significant number of people currently working in linguistics reject Chomsky’s models.
· Again in the 1970s I very briefly looked at the idea of neural nets, but Minsky had shown that there were insurmountable problems in using simple networks. Minsky, however, was viewing the problem in terms of formal mathematical logic – and this would only apply to modelling human intelligence if the human brain were a perfect logic-processing system.
· If you take an overview of all the brain-related disciplines, from investigations into the workings of neurons through to psychology, linguistics, child learning, etc., you find there is a gaping hole: there is no adequate predictive working model to link neural activity to activities which we would recognise as intelligent.
· There is also no adequate predictive model of how the brain works to explain why we are so much more intelligent than other animals.
What became clear was that CODIL should be seen as a pattern-recognition system which used words (in psychological terms, “concepts”) to provide a working model of human short-term memory. What’s more, if one stripped out some of the more advanced facilities, such as arithmetic functions (which human hunter-gatherers would not have needed), it would map easily onto a simple neural net.
But if CODIL mimics human short-term memory, could it provide the basis of an evolutionary model that bridges the gaping hole in brain research? I have examined various issues earlier on this blog, and I am currently drafting details of a possible evolutionary pathway which suggests why the intelligence of an animal’s brain may be limited by the amount it can usefully learn in a lifetime, and how language, seen as a self-modifying tool, triggers a tipping point. As mentioned earlier, CODIL is a pattern-recognition rather than a rule-based approach, but it can morph into a rule-based mode. Probably about 150,000 years ago humans discovered an efficient rule-based way of passing culture to the next generation. This meant that more information could be passed more rapidly – and also stored in the brain more efficiently, because learning generalisations is easier than learning large numbers of specific examples. This allowed language to develop into an even better teaching tool, and the more efficient it became the more, and the faster, we learnt – with no need for any change in brain size, because we were storing the acquired information more efficiently.
I started reassessing the research because there was a question of whether to keep the documentation or put it in a skip. Having done the reassessment, I feel the papers should find a proper home. But I have to be realistic: I am no longer young.
Chris Yapp raised the subject of “The Limits of Algorithms” and, even if you don’t agree with me, the above should have given you something to think about. Of course the algorithmic way of thinking is extremely powerful, but do our brains naturally work that way? Perhaps they are simply pattern-recognition organs that bootstrap themselves up from a state of ignorance.
If you think any of my ideas are worth following up, please contact me: at the age of 77 I feel that any further research should be done by the next generation, and that my role would be merely to pass on what I have learned. In that case it would be useful to preserve the surviving documentation.
One thing is certain. My initial plans to build a user-friendly “white box” computer were a failure, and while there are many studies of how research projects succeed, my experiences illustrate dramatically what can go wrong when a creative idea does not get adequate support and funding. This suggests that the documentation may well also have some historical interest – and could help ensure that other creative scientists with “outside the box” ideas get the support they need.