Science and mathematics are all about building abstract models which attempt to reflect various aspects of reality. As someone who did a Ph.D. in theoretical organic chemistry I am well used to the idea of multiple models of atoms and how they interact – whether, for example, it is more useful to think of them as miniature billiard balls or as abstract probability functions. The problem Chris Yapp refers to arises because the computer industry has myopically concentrated on a single model, based on pre-defined algorithms, for the way information can be processed.
The early computers were developed to carry out highly repetitive mathematical calculations which could not be done quickly or accurately enough by specially trained human beings. It turned out that many highly repetitive tasks could be represented by a predefined set of rules (an algorithm) and hence carried out using a computer. What could be done was limited by the speed and memory of the computer, and by the ability of people to program the systems. However, this proved to be no real barrier, as every few years faster computers, with more memory and easier-to-use programming languages, appeared on the market, while more and more people were trained to use them. There was big money to be made and careers to be built, and it seemed that everyone tried to get on the bandwagon. Worldwide, hundreds of people started to develop better hardware and software, and the result was a rat race where the first to get a successful product to the market won, and the rest fell by the wayside.
In this heated environment did anyone stop and ask whether there was an alternative information model for handling open-ended tasks involving dynamic interaction with human beings? Even the pioneering work at Xerox PARC, which led to the kinds of user interfaces we find today on personal computing systems, did not go back to first principles. It took it for granted that computers were inherently opaque black-box systems and that what was needed was a front end which hid the incomprehensible internal workings from the human users. Dozens of different computer languages
were devised to find different ways to write algorithms – without asking
whether humans naturally thought in an algorithmic way. It was suggested that
there was no point in looking for an alternative approach because theoreticians
such as Turing related the stored program computer to a “universal machine” –
and surely one couldn’t possibly start with anything better than a universal
machine. In fact anyone who took time off to question the scientific
foundations of what was an outrageously successful industry would soon find
themselves at the back of the queue in the race for fame and fortune.
But is the algorithmic model really the best, or the only, “universal machine” model for handling information – especially where incompletely understood, dynamically changing real-world tasks involving incomplete and fuzzy information are concerned?
My own research suggests that
there is an alternative – but to someone who is immersed in the world of formal
algorithms the first steps are counter-intuitive.
In 1967 I made a mistake as far as my career was concerned, as I would undoubtedly have had an easier life if I had not queried the establishment line. I was a comparative newcomer to the
computer industry, but one who had entered via an unusual career path. I had
experience of working in a very complex manual management information system where the key was spotting and reporting the unexpected. I then moved to a very large and complex commercial sales accounting system (Shell Mex & BP) in a completely different industry, where the problem was interfacing with a wide and ever-changing market. It may well have been one of the most advanced computer systems of its type at the time. Finally I moved to a planning department concerned with the probable market place for the next generation of large computers. My mistake was to pass my boss a note which said that I thought it might be possible to reprogram the microcode of an IBM-architecture computer to give it a human-friendly symbolic assembly language. This language was called CODIL, as it was a Context Dependent Information Language. In retrospect, what I had done was to take my manual skills in processing open-ended tasks and transfer them to the computer.
The note was passed to the
computer pioneers David Caminer and John Pinkerton (who I understand consulted
Professor Maurice Wilkes) and as a result I was quickly transferred to research
with a useful sized budget and told not to talk to anyone until the patents had
been taken out. What happened was that an initial tentative idea, which in retrospect needed several years’ interdisciplinary brainstorming, was dropped straight into the computer industry rat race. Apart from the fact that the idea
clearly caused excitement I had no idea how unconventional it was, and knew
nothing about research into the relevant mathematical theory or psychological
studies relevant to modelling human thinking. I spent two years writing and
testing a pilot simulation program which demonstrated that the idea was at
least capable of processing a range of different applications. My reward was to
be declared redundant because of the formation of ICL and the closure of the
research division in which I worked. Despite the support of Basil de Ferranti
(the new Research Director) my project was deemed irrelevant to the company
policy of developing the 2900 Series of computers – and it had to go.
So, with the benefit of nearly 50 years’ hindsight, what was the idea at the heart of my proposal?
The stored program model is a rule-based, top-down approach which uses numbers to process numbers and assumes that there is a human “creator” who can, a priori, define the rules. If you look carefully at the “universal machine” approach you realise that the theory does not cover cases where the rules are not knowable in advance. In practice there is the additional restriction that any “knowable” rules must be identifiable and implementable at a reasonable cost and on a realistic timescale.
In contrast, the CODIL model I developed is a bottom-up pattern recognition approach which assumes no prior knowledge of the task to be handled. Viewed as a mathematical model it uses sets and partitions of sets, but these can be considered as concepts when its human user interface is considered. (For example the CODIL item “Murderer = Macbeth” is treated by the system as defining “Macbeth” as a member of the set “Murderers”.) In set-theoretic terms the model seems weak, but its strength lies in the power of recursion plus the ability to morph into the stored program computer model. This can happen if you can divide the patterns into two – with one set of patterns being “the rules” and the other set being “the data”. However, the system is best when handling problems where there is no clear pre-definable global model, and it can become very inefficient when handling tasks which require millions of iterations through a small number of precisely defined rules working within a highly constrained global model – the very area where the stored program computer is strongest.
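To make this more concrete, here is a minimal sketch, in Python rather than CODIL, of the kind of pattern matching just described. Everything in it – the statements, the names, and the matches and recall helpers – is my own illustrative assumption, not the original interpreter:

```python
# Illustrative sketch only - not the original CODIL interpreter.
# Knowledge is held as statements: bundles of items, where an item
# such as "Murderer = Macbeth" asserts membership of a set.

from typing import Dict, List

Statement = Dict[str, str]  # set name -> member

knowledge: List[Statement] = [
    {"Play": "Macbeth", "Murderer": "Macbeth", "Victim": "Duncan"},
    {"Play": "Hamlet", "Murderer": "Claudius", "Victim": "King Hamlet"},
]

def matches(pattern: Statement, fact: Statement) -> bool:
    """A fact matches a pattern if it agrees with every item in it."""
    return all(fact.get(name) == member for name, member in pattern.items())

def recall(pattern: Statement) -> List[Statement]:
    """Return every stored statement consistent with a partial pattern."""
    return [fact for fact in knowledge if matches(pattern, fact)]

# A query is itself just a partial pattern, not a compiled program:
print(recall({"Murderer": "Macbeth"}))
# [{'Play': 'Macbeth', 'Murderer': 'Macbeth', 'Victim': 'Duncan'}]
```

Treating one group of stored patterns as “the rules” and the rest as “the data” is exactly the division that lets this kind of machinery morph into something resembling a conventional program.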
The two models are in fact
complementary – providing different views of the world of information
processing:
· The stored program computer model can be considered a special case of the CODIL model – but at the same time the CODIL model can be considered a special case of the stored program computer model, as it represents an algorithm whose specialist task is to provide human users with a tool to manage tasks for which the relevant algorithms are not known in advance.
· The stored program computer model is best at formal mathematical tasks which people find difficult – while the CODIL model is more appropriate to open-ended real-world tasks where human interaction is essential. This means that each excels in the areas where the other is weakest.
· The original proposal (relating to reprogramming the processor microcode of an existing system), and some later research, suggests that it could be possible to build systems which combine both models.
· Recent work suggests that the CODIL model will map onto a neural network and provide the basis of an evolutionary pathway to explain human intelligence.
So what happened next?
1971: Looking for a home.
ICL had agreed that I could
continue the research at a university with access to the patent but without
any funding from them, on the understanding that I didn’t make explicit claims
that suggested that ICL had made a mistake in closing down the project. John Pinkerton helped me draft a paper for the Computer Journal [The CODIL Language and its Interpreter], and a couple of conference papers. The problem was finding a suitable opening, and in
the interim I worked on a vast and soul-destroying military project called
Linesman. I still had no real idea of how potentially controversial my research
was and naively thought that as long as I got paid and had access to computers
any university would do. Then I got two invitations for interview at the same
time. The first was as Reader in Computer Science at Brunel University at a
significant increase in salary, and the second was as a poorly paid assistant lectureship at Cambridge under Professor Maurice Wilkes. With a wife, three
children and a mortgage to support I accepted the job as Reader and cancelled
the interview at Cambridge.
To be fair Brunel University was
set up to produce competent engineering graduates to work in industry, and was
probably not the place to do research which was questioning the foundations of
the technology it was teaching. It had been a technical college only a few
years before, and still had very little experience of conventional research and
no experience of anything as unconventional as CODIL proved to be. In addition
it was very poorly equipped for what I wanted to do. The Computer Science
Department had no modern computer of its own, and the only system I could use
was an ICL1903 run by the separate Computer Unit. As a result the research on
CODIL (which was basically a dynamic interactive language) was carried out by
simulating interaction using punch cards on a batch system for the first 8
years!
1972-79 – The problems with Artificial Intelligence
The first thing I had to do on arriving at Brunel was to completely rewrite the early software simulation, and I decided to use COBOL (in retrospect not a good choice) because the idea had started by considering complex large-volume data processing applications. One of the first
real applications was a medical research records data base [Using CODIL to handle poorly structured clinical information], but one of my
colleagues suggested that it might be appropriate to look at some Artificial
Intelligence tasks. I responded that the literature looked difficult and he
said it was mainly hype – and if you looked under the surface much of it was
trivial. He lent me a copy of a recent American Ph.D. and within a few days I
had described all the tasks in CODIL and got the simulator program to produce the
answers. As I mentioned above CODIL can morph into something resembling a
programming language. Every week the New Scientist published a logic puzzle called Tantalizer, and I used CODIL to write a heuristic problem-solving package
called Tantalize which asked questions about the problem, generated a series of
patterns, then (at least in some cases) ranked the patterns to put them into an
optimum order, and matched the patterns against each other to produce the
answer. On one occasion the package solved 15 consecutive Tantalizer puzzles
week by week as they were published. I also combed the published literature and
found it could solve nearly all of the similar puzzles in recent A.I.
publications.
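To give a feel for this generate-and-test style, here is a rough modern analogue in Python. The tiny puzzle and all of the names are invented for illustration, and the ranking stage which put the patterns into an optimum order is omitted:

```python
# A toy analogue (not CODIL) of solving a Tantalizer-style logic puzzle:
# generate candidate patterns, discard any that clash with the clues,
# and whatever survives is the answer. The puzzle is invented.

from itertools import permutations

people = ("Ann", "Bob", "Cy")
pets = ("cat", "dog", "fish")

def consistent(owner: dict) -> bool:
    """The clues play the role of patterns each candidate must match."""
    return (
        owner["Ann"] != "dog"      # clue 1: Ann does not own the dog
        and owner["Bob"] != "cat"  # clue 2: Bob does not own the cat
        and owner["Cy"] == "fish"  # clue 3: Cy owns the fish
    )

solutions = []
for p in permutations(pets):
    owner = dict(zip(people, p))
    if consistent(owner):
        solutions.append(owner)

print(solutions)  # [{'Ann': 'cat', 'Bob': 'dog', 'Cy': 'fish'}]
```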
There was only one problem. Could
I get any papers on the subject published? The answer was only with difficulty [A Psychological Approach to Language Design includes something about the A.I. research]. For example I
submitted a pair of papers to one A.I. conference, the first describing the Tantalize package and the second full of examples of the problems it had solved. They were rejected on the grounds that the idea was too theoretical ever to work.
Having tried a number of UK outlets I cheekily sent a paper on the subject to
Comm. A.C.M. and got yet another rejection note. The paper had been to four
different referees. Two vigorously denounced it as rubbish, one said he did not
understand it, and one approved. I found the wording of the rejections very
hurtful, and I abandoned all further A.I.-related research work. In fact I
didn’t even make any further attempts to publish what I had already done. It
was only several years later that I re-read the rejection letter and realised that I
had been so upset by the wordings of the rejection reviews that I have missed
the significance of the editor’s last sentence – which suggested that I should
continue the work as there could well be something in it because it annoyed the
reviewers so much. After all he would have known who the reviewers were – and
their likely responses to “not invented here” ideas.
In a way this highlights the
biggest problem with being at Brunel. I needed a strong supportive manager
who understood the problems associated with promoting unconventional ideas.
When I got these rejections I needed a shoulder to cry on, a voice to encourage me to persevere, and someone to suggest alternative ways forward. In fact I just got more depressed, and tried to avoid research and publication in areas where rejection would simply make me more miserable.
During this period I started to
investigate the use of CODIL for handling historical data – as an example of
how the approach could handle poorly structured information filled with
uncertainties [CODIL as a knowledge base system for handling historical information]. In retrospect I probably spent too much time building a
large demonstration data base.
1980-8 – Jim Fixed It for me.
Two things happened. The first was that the Department was asked
by the BBC to help Jimmy Savile “Fix It” for a young boy who wanted a computer
to do his homework. I prepared a demonstration package, using CODIL, showing
how a computer might help and shortly afterwards the University moved from
batch to online computing. I quickly converted the CODIL software to the new
system in time to use a modified “Fix It” as an aid to introduce a class of 125
first year undergraduates to using a computer at a time when a small number had
early personal computers and a third had not even used a typewriter, much less
a computer. The results were very successful and from then on CODIL was used to
support a variety of interactive teaching packages and investigations
into online publishing [A Software package for Electronic Journals] associated with the British Library funded BLEND project. In addition I found myself with a heavier teaching load.
1982-6 – Is small beautiful?
I decided that what was required was a portable demonstration version of CODIL, and to return to the original idea that it might eventually be possible to microcode the main routines. I therefore did a complete design to see how small the key parts of the system could be, and chose to use the BBC computer, which had a mere 32 Kbytes of memory for the program and all work areas, including the display and input/output
buffers. The result was that I was able to introduce more recursive pathways
and the resulting MicroCODIL package [A Microcomputer Package for demonstrating Information Processing Concepts] was logically very much more powerful than
its mainframe predecessor – but, because of the small size of the BBC, it could only handle small quantities of information. The bulkiest part of the software
was written in conventional code to interface with the display and the keyboard
– and the latter used dynamic colour coding to verify the syntax as the user typed information in, character by character [The Use of Colour in Language Syntax Analysis]. The result was a powerful “beta” Schools
package which demonstrated many different ways in which information could be processed.
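The colour-coding technique itself is easy to demonstrate. The sketch below, in Python, shows the general idea – reclassify the partial line after every keystroke and report a feedback colour – though the toy grammar and the colour choices are my assumptions, not the rules MicroCODIL actually used:

```python
# Sketch of keystroke-by-keystroke colour feedback (my assumptions,
# not MicroCODIL's actual rules): after each character, classify the
# partial line and report a colour the display could use.

import re

def classify(partial: str) -> str:
    """Return a feedback colour for a partially typed "Name = value" item."""
    if re.fullmatch(r"[A-Za-z ]*", partial):
        return "green"  # still typing the set name
    if re.fullmatch(r"[A-Za-z ]+= ?[^=]*", partial):
        return "blue"   # a valid "Name = value" shape so far
    return "red"        # the line can no longer become a valid item

for n in range(1, len("Murderer = Macbeth") + 1):
    line = "Murderer = Macbeth"[:n]
    print(f"{line!r:22} -> {classify(line)}")
```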
The package attracted enthusiastic reviews from a variety of magazines including
the New Scientist, the Times Educational Supplement, The Psychologist, and other educational and hobby magazines. I buckled down to write a more up-to-date paper for the Computer Journal
[CODIL: The Architecture of an Information Language] and a number of other papers on the system’s applications.
1987-8 – The End of the Road
The idea behind MicroCODIL was
that it could give me the opportunity to move to a more suitable University
environment. However there was a problem.
During the 1980s I had spent too much time trying to produce a better CODIL interpreter and appropriate applications, and too little time trying to unravel the underlying theory or making grant applications. In addition the
work on MicroCODIL had been significantly disrupted by the illness and
subsequent tragic death of my daughter Lucy and I was suffering from post-traumatic
stress disorder.
What happened was that a new Head
of Department was appointed and at the same time Maggie Thatcher decided that universities
should have the opportunity to get rid of “deadwood.” I was singled out for the
chop and the new head used every opportunity to humiliate me into taking early
retirement. For instance he insisted that all evidence of my research should be
removed from the department notice boards – including the very complimentary reviews
of MicroCODIL in magazines such as New Scientist and the Times
Educational Supplement. When my paper was accepted by the Computer
Journal his reaction was to say the B.C.S. standards must have fallen to accept
such rubbish from me and I am sure that when it was eventually published he did
not include it in the list of publications by department staff. Such treatment, combined with the effects of my daughter’s death, meant that I simply took the quick and easy way out: I accepted early retirement and left as quickly as possible.
Shortly before I left I had an article published in the New
Scientist on the virtual impossibility of doing unconventional creative
research at a modern university [Why genius gets nipped in the bud].
1989-2010 – Retirement
I had not intended to abandon
work on CODIL but I urgently needed a break. I had been at Brunel seventeen
years and as far as I can remember no-one in the department had had a sabbatical in all that time. So on leaving I
arranged to spend a year in Australia working with the CSIRO building a
database of information on climate change – although I ended up working on a
database on part of the Tasmanian rainforest for Australian Heritage. On my return to
England I thought about writing a CODIL interpreter for the personal computer
but decided life was too short to keep banging one’s head against a powerful
establishment brick wall. My pension was just about enough to live on and I
treated it as a salary which allowed me to do voluntary work for better mental
health services at both the local and national levels. I was not going to give
up on research but switched to local history studies and I now run a large web
site [Genealogy in Hertfordshire] which provides free help and advice to local and family historians.
CODIL revisited
When I reached 70 I decided to
gracefully retire from the mental health work, and I started to think about CODIL,
and whether it had any relevance today, or whether I should simply dump the
complete set of research papers in a skip. I therefore decided to have a good
look on the internet to catch up on what had happened in the 25 or so years since
I abandoned the research. After all I was now in a “Blue Sky research”
situation with no need to please my boss, no need to churn out regular papers,
and (at least at the initial thinking stage) no need for funding. This is what
I found:
· I could find no evidence that anyone else had gone down this route.
· It is now generally agreed that the approach to Artificial Intelligence common in the 1970s has contributed little or nothing to the understanding of human intelligence. (It was this school of thought that had rejected my 1970s papers, leading me to abandon research in the area.)
· In the 1970s I had thought about how CODIL might interface with natural language – but abandoned the idea because I couldn’t relate my ideas to those of Chomsky. I now find that a significant number of people currently working in the field of linguistics reject Chomsky’s models.
· Again in the 1970s I very briefly looked at the idea of neural nets, but Minsky had shown that there were insurmountable problems in using simple networks. However, Minsky was viewing the problem in terms of formal mathematical logic – and this would only apply to modelling human intelligence if the human brain were a perfect logic-processing system.
· If you take an overview of all the brain-related disciplines, from investigations into the workings of neurons through to psychology, linguistics, child learning, etc., you find there is a gaping hole: there is no adequate predictive working model to link neural activity to activities which we would recognise as intelligent.
· There is also no adequate predictive model of how the brain works to explain why we are so much more intelligent than animals.
What became clear was that CODIL should be seen as a pattern recognition system which uses words (in psychological terms “concepts”) to provide a working model of human short-term memory. What’s more, if one stripped out some of the more advanced facilities, such as arithmetic functions (which human hunter-gatherers would not have needed), it would map easily onto a simple neural net.
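The sketch below shows the kind of mapping I have in mind. It is a deliberately speculative toy of my own devising, not a worked-out neural model: each concept becomes a node, each stored statement links the concepts it contains, and recall is activation spreading from a cue node:

```python
# Speculative toy: concepts as nodes, co-occurrence as links,
# recall as one step of spreading activation from a cue node.

from collections import defaultdict

links = defaultdict(set)  # concept node -> directly linked nodes

def learn(statement: dict) -> None:
    """Link every pair of concepts that occur in the same statement."""
    nodes = [f"{name}={member}" for name, member in statement.items()]
    for a in nodes:
        for b in nodes:
            if a != b:
                links[a].add(b)

learn({"Play": "Macbeth", "Murderer": "Macbeth", "Victim": "Duncan"})

def recall(cue: str) -> list:
    """Activation spreads from the cue to its neighbours."""
    return sorted(links[cue])

print(recall("Murderer=Macbeth"))  # ['Play=Macbeth', 'Victim=Duncan']
```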
But if CODIL mimics human short-term memory, could it provide the basis of an evolutionary model that bridges the gaping hole in brain research? I have examined various issues earlier on
this blog, and I am currently drafting details of a possible evolutionary pathway
which suggests why the intelligence of an animal’s brain may be limited by the
amount it can usefully learn in a lifetime, and how language, seen as a self-modifying
tool, triggers a tipping point. As mentioned earlier, CODIL is a pattern recognition rather than a rule-based approach, but it can morph into a rule-based mode. Probably about 150,000 years ago humans discovered an efficient rule-based way of passing culture to the next generation. This meant that more
information could be passed more rapidly – and also stored in the brain more
efficiently because learning generalizations is easier than learning large
numbers of specific examples. This allowed us to develop language to become an
even better teaching tool and the more efficient it became the more and faster
we learnt – with no need for any change in the brain size, because we were storing the acquired information more efficiently.
What Now?
I started reassessing the research because there was a
question of whether to keep the research documentation or put it in a skip. And
having done the reassessment I feel the papers should find a proper home. But I have to be realistic and I am no longer young.
Chris Yapp raised the subject of “The Limits of Algorithms”
and even if you don’t agree with me, the above should have given you something
to think about. Of course the algorithmic way of thinking is extremely powerful
but do our brains naturally work that way? Perhaps they are simple pattern
recognition organs that bootstrap themselves up from a state of ignorance?
If you think any of my ideas are worth following up please
contact me as at the age of 77 I feel any further research should be done by
the next generation and that my role would be to merely pass on what I have
learned. In such a case it would be useful to preserve the surviving
documentation.
One thing is certain. My initial plans to build a user-friendly white-box “computer” were a failure, and while there are many studies about how research projects succeed, my experiences illustrate dramatically what can go wrong when a creative idea does not get adequate support and funding. This suggests that the documentation may well also have some historical interest, and could help ensure that other creative scientists with “outside the box” ideas get the support they need.