Ada Lovelace To Deep Learning

(This column originally appeared in the Westchester Guardian of April 4, 2016: http://www.westchesterguardian.com/Community%209.html)

Creative Disruption

by John F. McMullen

Creative Disruption is a continuing series examining the impact of constantly accelerating technology on the world around us. These changes normally happen under our personal radar until we find that the world as we knew it is no more.

Ada Lovelace To Deep Learning

c. 780–850 — Life of Mohammed ibn-Musa al-Khwarizmi, from whose name we get the word “algorithm” (as well as “algebra”)

1786 — Hessian Army engineer J. H. Müller publishes a paper describing a “Difference Engine” but cannot obtain funding to proceed.

1822 — Charles Babbage proposes to develop such a machine and, in 1823, obtains funding from the British government. After developing an early version of such a machine, he specifies a much more ambitious project, the “Analytical Engine,” which is never completed.

1843 — Ada King, Countess of Lovelace, writes the “First Computer Program.”

1945 — John von Neumann authors the first draft of a paper containing the first published description of the logical design of a computer using the stored-program concept.

1946 — The first working electronic computer, the “ENIAC,” is announced to the public.

1948 — An experimental computer, the “Manchester Small-Scale Experimental Machine,” successfully ran a stored program.

1956 — John McCarthy organizes the first international conference to emphasize “artificial intelligence.”

1975 — The first consumer microcomputer, the “Altair 8800,” is introduced. Upon reading of the machine, Bill Gates and Paul Allen develop Altair BASIC to allow the Altair to run stored programs (this was the product that launched Microsoft, then called “Micro-Soft”).

1997 — IBM’s “Deep Blue” defeats World Chess Champion Garry Kasparov 3½–2½.

2011 — IBM’s “Watson” defeats Jeopardy! champions.

2016 — Google’s “AlphaGo” defeats world-class Go player Lee Se-dol 4–1.

Algorithms — https://en.wikipedia.org/wiki/Algorithm

We constantly hear terms such as “algorithm,” “computer program,” and, more and more, “Deep Learning.” Yet, while most have an understanding of computer programs, the other terms are somewhat elusive. Normally, it’s not very important for the average person to understand technical terms, but a knowledge of the progression from what’s known as “Ada’s Algorithm” to Deep Learning helps in appreciating our now-rapid movement toward true “Artificial Intelligence.”

An algorithm, quite simply, is a rule or a method of accomplishing a task. No matter how complex computers are, they are no more than a collection of wiring and physical components. They must receive direction to accomplish whatever task or tasks are desired by the owners of the device.

So, if a computer were to be used to calculate employee payroll, the way to do this would be contained in the payroll algorithm. The algorithm would contain a number of instructions or “program steps” to properly complete its processing.

One step might be to calculate “Gross Pay” for an employee; an instruction to do this might simply be:
 “Gross = Hours * Rate” where * stands for multiplication.

However, that is a very simple statement that might be used only in a case where no one could work overtime as defined by state law. If there were many employees, all “on the clock” in a jurisdiction where hours over 40 had to be compensated at 1 ½ times the normal rate, the instruction might look like this:
 “IF Hours Are More Than 40,
 THEN Regular Gross = Rate * 40 and OTGross = (Hours - 40) * 1.5 * Rate
 ELSE Regular Gross = Rate * Hours and OTGross = 0
 Total Gross = Regular Gross + OTGross”
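
In an actual programming language, the same rule would look much like the pseudocode above. Here is a minimal sketch in Python, chosen only for readability; the function name, the default parameters, and the sample figures are invented for illustration, while the 40-hour cutoff and the 1½-times multiplier come from the example just given.

def gross_pay(hours, rate, ot_threshold=40, ot_multiplier=1.5):
    """Return (regular gross, overtime gross, total gross) for one employee."""
    if hours > ot_threshold:
        # Hours beyond the cutoff are paid at the overtime rate
        regular_gross = rate * ot_threshold
        ot_gross = (hours - ot_threshold) * ot_multiplier * rate
    else:
        regular_gross = rate * hours
        ot_gross = 0.0
    return regular_gross, ot_gross, regular_gross + ot_gross

# Example: 45 hours at $20/hour -> $800 regular + $150 overtime = $950 total
print(gross_pay(45, 20.0))

In a payroll run, a function like this would be called once for each hourly employee, which is the repetition the next paragraph describes.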

 
This series of instructions must be carried out for each employee, as must, in a normal firm, the determination of whether the employee is salaried or “on the clock,” and how much tax (if any) must be withheld for the federal government, state, and city (based on the number of dependents and the applicable federal and state regulations).

Additionally, reports (and possibly checks) would have to be produced. All in all, something that we might consider straightforward turns out to be quite complex once we get into the details: there can be no errors in the instructions; they must be precise and accurate. Even minor errors may cause large financial loss, mechanical failure, and/or loss of life.

Ada King, Countess of Lovelace, daughter of the famed English poet Lord Byron (George Gordon), is called the “first programmer,” even the “first computer programmer,” even though there was no understanding of programming, and certainly no computers, in 1843. She is referred to in these terms simply because her writing in a notebook about the never-to-be-finished “Analytical Engine” showed an understanding of the concepts that would become important over 100 years later (see James Essinger’s “Ada’s Algorithm: How Lord Byron’s Daughter Ada Lovelace Launched the Digital Age,” 2014, Melville House Publishing, for the entire fascinating and somewhat tragic story). In recognition of her contributions, the US Department of Defense named a programming language, developed in the 1970s, “Ada.”

By the time the first working electronic computer, the “ENIAC,” was completed (its development began during World War II, but it was not finished until 1946), it was well understood that the computer was no Frankenstein monster that could “think” on its own; it had to be programmed! The original programming for the ENIAC was done on paper and thoroughly checked for logic (hence the term “desk checking”) before touching the computer. To do otherwise would be costly; computer time was considered very expensive, and wasting it was frowned upon. In the case of the ENIAC, there could also be a great waste of human effort, as each instruction had to be entered, one at a time, by “throwing” mechanical switches before it could be executed.

While the ENIAC was being developed, the famed mathematician John von Neumann had postulated the concept of a “stored program.” A program would be written, tested, “debugged” (all errors found and corrected), and stored on some medium (punched cards, paper tape, etc.). When needed, it would be loaded into the computer along with the data to be processed and used (think of Microsoft Word, kept on your hard drive or, even now, “in the Cloud” and only called into the computer when you want to write a letter or create a memo).

When the Altair 8800 first appeared, 30 years after the ENIAC, it was purely a hobbyist’s machine for tinkering until Altair BASIC arrived and allowed it to utilize stored programs.

For over 50 years after the ENIAC, progress in computer technology came from bigger, faster, and cheaper components, communication breakthroughs (such as the Internet), enhanced programming languages (COBOL, Fortran, BASIC, Ada, C, Forth, APL, Logo, LISP, Pascal, Java, etc.), and tools to make program development more efficient and, hopefully, more “bulletproof” (error-free).

While this mainstream of computer progress had been going on, lurking on the sidelines had been the science-fiction-sounding dream of “Artificial Intelligence” (a term coined by John McCarthy in the mid-1950s): the ability to have something other than humans exhibit human intelligence, a dream going back to the mythical Golem and Mary Shelley’s Frankenstein, a dream thought to be much more attainable with the advent of computer technology.

The term “artificial intelligence” has taken on many meanings since McCarthy coined it (robotics, expert systems, case-based reasoning, etc.), but none seems as profound as a system that emulates human learning.

The business and scientific systems developed over the first 50+ years of electronic computing were all rule-based systems built on “deductive reasoning,” in which we are given general principles and then apply them to individual cases as we go along (such as the “IF-THEN-ELSE” example given above). In short, we proceed from the abstract or general to the particular or individual case.

Humans, however, also learn by the reverse of this — from the particular to the abstract or rule; this is inductive reasoning. If we go out of the house when it snows, turn right, and snow falls off a tree on our head, sooner or later we start turning left when it snows. In short, we learn and build rules based on the learning.

If we consider human-generated algorithms to be analogous to deductive reasoning, so-called Deep Learning is the inductive opposite: we may set goals, but we then “dump into the system” thousands or millions of related facts or games or war scenarios and, more or less, say “you figure it out.”
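
To make the contrast concrete, here is a toy sketch, again in Python and entirely hypothetical: instead of being handed the 40-hour overtime rule, a small program is given labeled examples and left to find the cutoff itself. The sample data and the simple “try every candidate cutoff” learner are invented for illustration; real Deep Learning systems adjust millions of internal parameters rather than a single number, but the direction of reasoning is the same.

# (hours worked, was overtime paid?) examples supplied by a human "teacher"
examples = [(20, False), (35, False), (38, False), (40, False),
            (41, True), (45, True), (50, True), (60, True)]

def learn_threshold(examples):
    """Pick the cutoff that classifies the examples with the fewest errors."""
    best_cutoff, best_errors = None, len(examples) + 1
    for candidate in range(0, 81):  # try every whole-hour cutoff from 0 to 80
        errors = sum((hours > candidate) != paid_ot
                     for hours, paid_ot in examples)
        if errors < best_errors:
            best_cutoff, best_errors = candidate, errors
    return best_cutoff

print(learn_threshold(examples))  # prints 40: the rule was discovered, not written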

In short, the computer is writing the algorithm, and it is doing so thousands of times faster than a human could, based on analyzing millions more related facts than any human could absorb. Ada would be proud!

I welcome comments on this piece to johnmac13@gmail.com.

John F. McMullen is a writer, poet, college professor and radio host. Links to other writings, Podcasts, & Radio Broadcasts at www.johnmac13.com, and his books are available on Amazon.

© 2016 John F. McMullen