Monday, 28 July 2014

Why using CODIL is very different to programming a computer

So you have already learnt to program a computer and you now want to understand CODIL (Context Dependent Information Language) and how it works. I suspect that you have a problem because you have learnt to think in a “computer programming” type of mental box. Let me explain.


In 1971, when high level computer programming languages were beginning to be used, Gerald Weinberg wrote the book The Psychology of Computer Programming. He was interested in teaching a new language, PL/1. Some of the students had already learnt the commercial programming language COBOL, others the scientific programming language Fortran, and some had never programmed before. He found that if a student had learnt COBOL or Fortran they tended to use PL/1 as if it was a version of the language they knew, and hence failed to make full use of the novel features of PL/1. Students who had no prior experience had no such inhibitions.

In fact there are a very large number of procedural programming languages such as Ada, Basic, C, etc., but all are designed to allow you to tell the computer explicitly what you want it to do, and exactly how to do it. For highly repetitive tasks this can be very useful. You tell the computer once, check that it does what you want, and it will then work reliably and cheaply thousands, millions or even billions of times, for years if needs be. But how can you be certain that what it was told was exactly what you want? The computer cannot tell you exactly what it did in terms you can easily understand. And of course you need to know what you want in advance.

CODIL (Context Dependent Information Language) is conceptually very different. It is a pattern recognition language which stores and compares patterns and if you give it an incomplete pattern it attempts to provide the missing parts. It was originally proposed as the symbolic assembly language of a “white box” computer which would work symbiotically with its human master to tackle open-ended and poorly defined tasks where it is not practical to provide a pre-defined specification suitable for conventional programming.

Because CODIL and conventional programming languages work in very different ways there is a real problem when someone who has learnt to program tries to use CODIL. CODIL is designed to be very flexible, work well with people, and in effect provides an intelligent information mirror of the user’s own thoughts about the task. But of course physically a CODIL system looks just like any other computer system. This means that if someone approaches CODIL with a “computer programming” mind-set they will use it as if it was a programming language and ignore all the dynamic flexibility it provides. I remember a student whose undergraduate project involved comparing COBOL and CODIL for a particular commercial application. After a few days he came to me and said he had got the application working in CODIL and it was so easy that there seemed little point in wasting time writing a COBOL program. I had a look and discovered that he had used CODIL as if it was a friendly COBOL interpreter!

I have been thinking about how to explain this dilemma and I think the easiest way to do so is to look at the steps by which CODIL came into existence and how a pattern based approach differs from a procedural approach.
… … … …
After getting a Ph.D. in theoretical organic chemistry I worked in an information department of an international subsidiary of the Wellcome Foundation. My job was to monitor the internal technical reports we were getting from overseas, ensuring that copies went to those interested, producing a monthly report for senior management, and maintaining appropriate indexes. The brief was, in effect, “If it is relevant to the company’s business – report it.” Reports included topics such as the insecticidal properties of Lake Victoria’s mud, news of a competitor’s new product overheard in a bar in Argentina, the veterinary needs of ostrich farmers, clearer labels aimed at people who are illiterate, and the failure of our product on a farm in Australia.

No computers were used, but as a student member of the Institute of Information Scientists I realised that they were being introduced in libraries for indexing purposes, and I may have read Vannevar Bush’s As We May Think at this stage. When my suggestion that I should work with the company’s computer was turned down I applied to a large computer centre a few miles away, suggesting that I would make a good systems analyst.

In 1965 Shell Mex & BP’s computer centre at Hemel Hempstead was possibly running the biggest magnetic tape based sales accounting system in the UK, with circa 250,000 customers and 5,000 products. On my first day there was a major crisis, as there had been a major update which involved amending the customer data and printing out maybe half a million customer record cards. Something like 30,000 customer records had been lost. Goods had been supplied, but invoices could not be produced until a new customer record had been created. After the unplanned week helping to sort out the mess, the idea was that I should learn to program before moving to systems. I became very interested in how and why programming errors occurred, how they could be minimised, and how they could be caught in the testing stages. In particular I used a table driven approach to make it possible for the program to check itself for self-consistency and generate a copy of its system spec from the validated tables.

The company was planning to replace their computer system with a more modern system which would replace the magnetic tapes with direct access storage, and introduce at least some online terminals. I was moved into systems to look at the problems of moving their extremely complex sales contracts system to the next (as yet not agreed) computer. No-one in the company (and possibly in the world) had any experience of such a large system move from batch to online working and there were no clear guidelines for me to work to.

My experience while programming had led me to believe that most difficulties with the old system lay in the Chinese Whispers chain between the sales staff, via the systems and programming departments and into the code, aggravated by the fact that the sales staff didn’t really understand what the system was doing and tended to assume that if it was on the computer it must be right. I soon found further support in the deviations between what the system spec said (where it actually had been maintained), what the clerical support manual said, and what the program actually did. If the sales team wanted changes relating to a new product, a special sales campaign, or a very large customer with special requirements, they took months to arrange – if they were not vetoed by the computer side as requiring too much work. Such changes were the least well documented, and once the sales campaign, etc., was no longer needed the code remained in the program – as sales never asked for the redundant code to be removed. One of my activities was to sample the changes in contracts already on the system and look at what was happening. To do this I made some notes and several of the clerks commented on how easy my notes were to read.

Basically I had identified three interlinked problems, and as a comparative novice to computers I didn’t see any obstacle to solving them. A flexible sales system needed an open-ended approach, the “Chinese Whispers” way of instructing the computer was error prone and too cumbersome to be flexible, and the sales staff could only be in control if they understood how the system worked. But I was used to working manually with complex open-ended information processing problems, I had just written a program using tables in a way that allowed it to generate its own specification, reducing the danger of Chinese Whispers type errors, and the clerical staff said they could understand my notes.

OK, I thought, let us move all the application information, apart from the actual price calculation, into a massive “virtual” table. Each line of the table would consist of two fields. The first field would be a list of conditions and the second field would contain a new bit of information. For instance a table entry for a customer with a special contract might be something like:
Customer = XYZ Ltd
Product = Widget X
Quantity >= 1000
Rebate = 10%
In some cases a special customer might be buying many different products, and there may be many such lines in the table. A more normal customer on a single standard contract might only have one line in the table.
Customer = J Smith
Standard Contract = Central Heating

There would be similar entries for different products, different contracts, etc.

The old system for processing the sales contracts was vast and limited by the memory of the computers available, while for the above approach the program to read in and price a delivery reduced to something like:
  1. Read in list of relevant items: Customer, Product, Date, Quantity.
  2. Find matching table entries.
  3. Add any new items to list of relevant items and go back to 2 with revised list of items.
  4. Calculate price.
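
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the 1967 design: the `Entry` class, the `expand` function, the two comparison operators and the third table line (a made-up "Price List" item) are all hypothetical names introduced here. Step 4 is left out because the price calculation itself was to stay outside the table.

```python
from dataclasses import dataclass
import operator

# Assumed comparison operators; only "=" and ">=" appear in the example entry.
OPS = {"=": operator.eq, ">=": operator.ge}

@dataclass(frozen=True)
class Entry:
    conditions: tuple  # first field: (name, op, value) conditions
    adds: tuple        # second field: (name, value) items of new information

TABLE = [
    Entry((("Customer", "=", "XYZ Ltd"),
           ("Product", "=", "Widget X"),
           ("Quantity", ">=", 1000)),
          (("Rebate", "10%"),)),
    Entry((("Customer", "=", "J Smith"),),
          (("Standard Contract", "Central Heating"),)),
    # Hypothetical follow-on line, invented to show step 3 looping back to step 2.
    Entry((("Standard Contract", "=", "Central Heating"),),
          (("Price List", "Domestic"),)),
]

def matches(entry, items):
    """Step 2: an entry matches when every one of its conditions holds."""
    return all(name in items and OPS[op](items[name], value)
               for name, op, value in entry.conditions)

def expand(items, table):
    """Steps 1 to 3: keep adding new items until nothing more matches."""
    items = dict(items)          # step 1: the list of relevant items
    changed = True
    while changed:               # step 3: go back to step 2 with the revised list
        changed = False
        for entry in table:
            if matches(entry, items):
                for name, value in entry.adds:
                    if name not in items:
                        items[name] = value
                        changed = True
    return items

order = {"Customer": "XYZ Ltd", "Product": "Widget X", "Quantity": 1500}
priced_context = expand(order, TABLE)
```

Note that the program contains no customer-specific logic at all. Everything that distinguishes one contract from another lives in `TABLE`, which is the point of the approach.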

While many customers would have been on standard contracts, some of the larger customers would have had complex contracts involving many lines, which would add to the table size, and all the pricing tables would be included as well. I never did the calculation but the complete table would have had at least a million lines. If the approach had been implemented circa 1968 the table would have been split up, with all entries for individual customers being placed on the customer record in the indexed customer file, and similarly for products, standard contracts, etc.
   
One of the most obvious advantages of this approach is that all relevant information about the contracts is in a single searchable table, where it can be displayed, and when appropriate updated, in a form that a salesman can easily understand. A question such as “how was this transaction processed?” can be answered by showing the relevant lines in the table (and for most transactions there may well be fewer than half a dozen), and assuming most conditions include date information it should be possible to ask about historic and future pricing.

The approach also makes it much easier to have special contracts, possibly containing novel features, tailored for individual large customers, and also one-off sales campaigns. Not only could an appropriately authorised salesman enter the contract, but if he made a mistake it could not affect transactions with other customers because the “special code” could only be accessed for the correct customer. This is unlike the conventionally programmed approach where a typing error in an “IF” statement could cause havoc. In fact the approach is inherently “fail safe” because if a mistake is made the system will not be able to find a table entry.

What actually happened in 1967 was that the idea was rejected as being too revolutionary. In retrospect part of the problem was that I had no idea I was saying anything unusual, and did not have the experience to describe it in the way I have just done. In any case I know one of my colleagues liked the idea and in the weeks before I moved to a different company we discussed how part of the idea might be adapted so that it didn’t look too controversial. I don’t have details but I gather the result was called “Variants”.
* * *
CODIL generalises the above approach. Items of information, such as “CUSTOMER = XYZ Ltd”, are presented to the user in the same way, but are interpreted as “XYZ Ltd” being a member of the set “CUSTOMER”. Of course you choose set names appropriate to whatever application you are doing, and an item may be any valid partition of the set, including the whole set or the null set. Set names can also be applied to one or more rows in a table, so that, when appropriate, the set can be used recursively to provide potentially very extensive additional ways of defining the set. (This is similar to putting the relevant rows of the table onto the customer record in the sales contract example.)

There is a working list of items called the FACTS – and items are added there by the user or are moved from the table. These items act as a filter – in that the only statements in the table that are visible are those that match the FACTS. There are also additional facilities for arithmetic and controlling the input and output of information.
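
The filtering role of the FACTS can be illustrated with a small Python sketch. It is not how the CODIL interpreter works internally, and the matching rule is an assumption made for this example: a statement is treated as visible when it shares at least one set name with the FACTS and agrees with the FACTS on every set name they share. The names `KNOWLEDGE_BASE`, `is_visible` and `visible_statements` are hypothetical.

```python
# Each item is (set_name, value), read as "value is a member of set_name".
# Each statement is a list of items; the sample data reuses the contract example.
KNOWLEDGE_BASE = [
    [("CUSTOMER", "XYZ Ltd"), ("PRODUCT", "Widget X"), ("REBATE", "10%")],
    [("CUSTOMER", "J Smith"), ("STANDARD CONTRACT", "Central Heating")],
]

def is_visible(statement, facts):
    """Assumed rule: share at least one set name with FACTS, contradict none."""
    shared = [(name, value) for name, value in statement if name in facts]
    return bool(shared) and all(facts[name] == value for name, value in shared)

def visible_statements(facts, knowledge_base):
    """Return only the statements that the current FACTS let through."""
    return [s for s in knowledge_base if is_visible(s, facts)]

# FACTS is the working list of items supplied by the user or moved from the table.
facts = {"CUSTOMER": "J Smith"}
seen = visible_statements(facts, KNOWLEDGE_BASE)
```

With these FACTS only the J Smith statement is visible; the XYZ Ltd statement is simply never seen, rather than being tested and rejected by application code that the user has to write.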

As there is information on CODIL elsewhere on the site (An Introduction to Publications on CODIL) I don’t go into any great detail here, but just highlight a few issues that can cause difficulty if someone is “thinking computer programming”:
· CODIL is a pattern recognition system. There is a knowledge base of patterns (called statements), each pattern consisting of a list of items. It is NOT a rule based system – but a user can “teach” it to behave as if it was.
· Each item represents a set or a partition of a set. The set name is used for all instances referring to the set. This is completely different to the use of names in a programming language, where the name refers to the location of information and not its meaning.
· There is no “IF” statement in CODIL – the system automatically selects the statements that match and ignores everything else – basically it makes “intelligent” decisions automatically.
· There is no global definition of any application – the system never sees more than the context defined by the contents of the FACTS. By contrast, the whole stored program approach to programming requires the program to be a global definition of the operations to be carried out on the data.
· CODIL is not suitable for mathematically oriented applications which naturally use arrays, as they require tools which address information by its location and not its meaning.



4 comments:

  1. Read it. Am reading followed links & related posts, but I read slowly. Don't know how long it will take before I've read enough for it to cleave together. Cleave, an interesting word. Means either to adhere firmly, or split apart depending upon context.

  2. Hello Chris,

    The government plans to change the nature of IT teaching in schools, moving the emphasis from learning to use existing applications to learning to write programs. I think this is a good move. I've looked at Scratch, a derivative of LOGO developed at MIT and used widely in schools. It looks like an excellent introduction to procedural programming that allows quite young children to develop interesting programs. However, these developments might serve to aggravate the problem you address in this post. Children will equate programming with procedural programming from the outset.

    It seems rather tragic that Micro CODIL is no longer available to complement traditional approaches to IT teaching.

    Regards,
    Roger

    Replies
    1. Thanks for your comments - Of course MicroCODIL is still running if you have a BBC computer and it would not be impossible to create an updated version - but I have decided I am no longer the person to do the coding of the interpreter software.

      I had been planning to do a post on the primary school programming changes - and your comment has inspired me to do it - see "Should we be teaching Primary School Children to Program Computers?"

  3. An interesting article on brain function. One article, two places. Both behind paywall. The SN version, I think, may be open in about two or three weeks or so, but the abstracts from the journal and the comments on SN might be of interest.
    https://www.sciencenews.org/article/electrode-turns-consciousness-and

    http://www.epilepsybehavior.com/article/S1525-5050(14)00201-7/abstract
    or, as the journal says, you may have institutional access.
