model.fit() part III

Parikshit Sanyal
Significant others
Published in
6 min readOct 30, 2021

Making whole organisms from recursion

You know you are getting old when you mention ‘K&R C’ amidst a bunch of geeks and get blank looks all around. Even a mention of the more popular ‘Let Us C’ doesn’t ring a bell in most (usually, what I get is a flurry of exasperations and sighs, as the geeks wait for me to get on with the ‘real’ topic, i.e. some fancy new JavaScript or Python library for bleeding-edge stuff, like make-AI-pre-chew-your-food etc). You then realise that like many before you, you too are clinging on to the good-ol-bad-ol-days like a lousy lover. C is probably never going to be a general purpose language again. Sigh.

On hindsight, C was not much of a novelty. Most of its features were decades old, and C provided a convenient wrapper around them all (plus, a blockbuster operating system — UNIX — to go with it). It was my third programming language, though (after LOGO & BASIC), and the first one to have a semblance of structure. Very early on, the book made it quite clear, that unlike mathematics, code is something that has to be done, not just thought. To put the point more succinctly: mathematics is declarative, code is imperative (From SICP). One can define a square root in mathematics as

let x = √y

… and be happy about it. A computer program, however, has to be painstakingly crafted to determine the square root

# Python
def square_root(n): # n is a square integer
for i in range(n/2): # iterate all numbers between 0 and n/2
if i*i == n:
return i

And now your hands are dirty. To write a program, one must come down from the ivory tower of mathematics and formulate an algorithm. The above algorithm works only for whole numbers, which are also square numbers (25, 36, 64 etc). If we want to make an algorithm for any positive number, then:

# Python
def newtons_square_root(n, iterations=100):
x = float(n)
for i in range(iterations):
n = (n + x / n)/2
return n

There’s also another method of determining square roots, which most of us have learnt in middle school. Does anyone remember it? Heck, I am not very sure I can do a simple division on paper anymore. A famous short story ‘The Feeling of Power’ by Isaac Asimov, portrays a man, in a not so distant future, who could do calculations on paper.

“No, sir,” said Shuman patiently. “Not a paper computer. Simply a sheet of paper. General, would you be so kind as to suggest a number?”
“Seventeen,” said the general.
“And you, Congressman?”
“Twenty-three.”
“Good! Aub, multiply those numbers and please show the gentlemen your manner of doing it.”
“Yes, Programmer,” said Aub, ducking his head. He fished a small pad out of one shirt pocket and an artist’s hairline stylus out of the other. His forehead corrugated as he made painstaking marks on the paper.
General Weider interrupted him sharply. “Let’s see that.”
Aub passed his the paper and Weider said, “Well, it looks like the figure seventeen.”
Congressman Brant nodded and said, “So it does, but I think I could make a passable seventeen myself, even without practice.”
“If you will let Aub continue, gentlemen,” said Shuman without heat.
Aub continued, his hand trembling a little. Finally he said in a low voice, “The answer is three hundred and ninety-one.”
Congressman Brant took out his computer a second time and flicked it, “By Godfrey, so it is. How did he guess?”
“No guess, Congressman,” said Shuman. “He computed that result. He did it on this sheet of paper.”

In Newton’s method, one is introduced to the idea of iteration, which is roughly the mathematicians idea of a series, or a progression of numbers. Iterations like the for loop mentioned here are found aplenty in all kinds of programs, and provide a bridge between mathematics and code. The example also demonstrates a procedure (‘newtons_square_root’) which converts a number to its square root: in some ways, it is similar to (but not exactly) a mathematical function.

The thing about procedures is that they can call themselves, i.e. circular definitions are allowed (and in some cases, encouraged). In mathematics, circular definitions are usually prohibited. You simply cannot say that a triangle is a ‘triangular shape on two dimensions’. There are few scenarios, however, that circular definitions make perfect sense. Consider this procedure, mentioned somewhere in the middle of the K&R book:

# C
# Factorial function
int fact(int n)
{
if (n == 0)
return 1;
else
return n*fact(n-1)
}

This is an exact replica of the factorial function from mathematics books. There’s no hacks, no iterations, no tricks with silly divisions and additions in between. Just copying the definition of factorial function to code feels so … clean!

n! = 1 if n equals 0,
n ⨉ (n-1)! if n > 0

And it works! The factorial function gets the thing done. It’s been 25 years since I typed that code on my old Turbo C compiler, and I can still feel that awe when it compiled and ran. Later, I found that many famous mathematical series, like the Fibonacci sequence, can be copy-pasted from maths to code:

def fibo(n):
if n <= 1:
return n
else:
return(fibo(n-1) + fibo(n-2))

… which is exactly the same as the formula for the n-th Fibonacci number, which is the sum of two previous Fibonacci numbers

f(n) = 1 if n ≤ 1
f(n-1) + f(n-2) if n > 1

So how far does this rabit hole drill? Surely, McCarthy’s paper must have introduced the world to the idea of recursion (and LISP), right?

…DNA makes RNA makes protein makes DNA…

Now why does that sound surprising? This is recursion, albeit on several levels. In prokaryotes, the DNA molecule must make arrangements to replicate itself, which must be carried out by DNA polymerases, which are proteins that are coded somewhere inside the same DNA molecule.

The bacterial (and also human) genome contains the code for the enzyme which replicate itself (purple); replication will usually start from a different site(s) — Ori — shown in blue

There might be instances where, let’s say, the polymerase thus produced forgets to copy an Ori site (which can still be managed, because there are several Ori sites), or worse, forgets to copy the code for itself!

Again, the mRNA (‘messenger RNA’), an intermediate between DNA and protein, must be transcribed from DNA. However, the mRNA depends on ribosomes (made of rRNA) and tRNA (‘transfer’) molecules to get the job done, which are both encoded in the original DNA, and must be there — pretranscribed. The very process of transcription, however, is dependent on transcription factors that are themselves proteins who bind to original DNA, and thus they must themselves be ‘translated’ and ‘transcribed’ in the first place. Add to these the miRNAs (‘micro-inhibitory’) which capture mRNAs and inactivate them, and you’ve got yourself a colorful soup of arrows pointing at every direction.

The recursion loops in the central dogma

The whole cycle must have started somewhere, but its’ still unclear where. There must have been preformed tRNAs and rRNAs and transcription factors for an organism to start its life. For the individual cell, its parent cells must have provided those through division of the cytoplasm. (In humans, the ovum provides for all this paraphernilia). For life as a whole, nobody is quite sure how the first protein was made. But after that, this self-propagating loop seems to be running without pause, for a few billion years now. Which only provides irrefutable testimony to the efficiency of recursive algorithms!

--

--