Perceptron Perceiving
A polygon with infinite sides is a circle. Any polygonal area can be represented as an aggregation of many many circles of different radii. Inceptive indeed.
Winter! Chill breeze with warm blankets, hot pho with rustic Merlot, and. Holiday season! For a change, I decided to spend my holidays back in the U S of A this year. With everybody away on trips, I had a lot of time to spend with myself. My kind of vacation indeed. Except for the two Christmas holidays, I had all the time to myself. And being the to-do checker I am, I had a ton of things I wanted to do during the week-long break. Writing, music practice, studying, side projects, books, cooking, cricket, fitness and so on. I made an elaborate plan to check most of these items off starting on the 23rd. The first two days went fine, as per my plan, but Christmas made my holidays even merrier. I did not get any work done and was often left contemplating my interactions with two high school kids I met at my aunt’s place in LA.
Walnut-Nutella-cheese-cake! I like every word in that sentence. While I was completely focused on hogging all the cake at my aunt’s place, “Oh, the solution to that Object-Oriented Programming exam question was to combine 3 perceptron systems and perform a logical OR from each of their outputs”, explained high school senior Ellie to her classmates Doug and my cousin Palak. My eyes lit up and my jaw dropped on hearing high school kids speak about perceptrons. Yes, I was breathing in the AI-ML hype every single day in the tech industry, but I had not given any thought to the depths it had reached, touching even young high school children. My high school computer assignments only involved basic C programs to either perform arithmetic operations or display fancy patterns of asterisks. But that was 12 years ago. I realized how outdated the education I went through has become, and I thanked everyone who pushed me onto the path of constant learning and higher education. I carried my cake over to the kids’ table to have a conversation about their holiday assignments. I instantly got the “you are so old and boring” look from Palak, but I chose to get past the moment. “So Ellie, what did you build in the perceptron experiment?”, I started.
Over the next hour, young Ellie would make me realize that it was not just what I had learned that was archaic; more importantly, the processes and protocols used in the schools around me were bizarre and counter-intuitive through a modern lens. Almost irrelevant. And the sad truth is that, over a decade later, my cousins in India are still building Pascal’s triangle out of asterisks in their programming 101 assignments.
“A perceptron is a mathematical representation of a neuron. It fires when the strength of the input signal is higher than a particular threshold. An easier way to imagine it: a perceptron is like a function that represents a line. Points to the right of the line represent signals that cross the threshold, and those to the left represent input signals too weak to cross it. The line acts as a separator. And since any line can be represented with its slope and intercept, a perceptron can also be represented with a slope and intercept. For any given input coordinates, the perceptron function tells us whether the coordinate is to the left or right of the line, i.e. in terms of neurons, whether it crosses the threshold”, the kids enunciated.
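The kids' description can be sketched in a few lines of Python. This is only an illustration, not their code: the name side_of_line is made up here, and because a slope-intercept line cannot be vertical, the sketch talks about "below" and "above" the line rather than "right" and "left". Which side counts as crossing the threshold is just a convention.

```python
def side_of_line(slope, intercept, x, y):
    """Tell where (x, y) sits relative to the line y = slope * x + intercept.

    Returns +1 if the point is below the line, -1 if it is above,
    and 0 if it lies exactly on the line.
    """
    value = slope * x + intercept - y
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0


# The line y = x, i.e. slope 1 and intercept 0.
print(side_of_line(1.0, 0.0, 2.0, 0.0))  # prints 1  (below the line)
print(side_of_line(1.0, 0.0, 0.0, 2.0))  # prints -1 (above the line)
print(side_of_line(1.0, 0.0, 1.0, 1.0))  # prints 0  (on the line)
```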
Sure, it was mostly textbook knowledge that most of you might already know. But what stood out was how students were asked to apply concepts of coordinate geometry learned earlier in high school to code a bunch of pluggable functions as part of their current programming course, build a mathematical representation of a physical system, and eventually understand the fundamentals of a future concept: classifiers.
Ellie went on to show me how her system of perceptrons could be extended to represent any region in the XY plane. By creating multiple perceptron functions (lines), each with a different slope and intercept, we can create closed convex polygons. Programmatically, for any input (x, y), checking whether the point lies on the correct side of EACH of the created perceptron lines tells us whether the point is inside or outside the polygon. I was in awe of their level of understanding and application.
“Given a regular polygon of size i and shape n, implement a system of Perceptrons to classify a given point (x, y) as ‘in’ or ‘out’ of the polygon”, was their problem.
They were given a code-base with an interface for Line, with an activate function to identify an input Point (another class) as positive or negative based on the point’s orientation w.r.t. the line instance. They had aliased the interface Line as Perceptron. The interface declared another function called output, which was supposed to return 0/1 based on the output of activate(). A higher level class System was created with the abstractions addPerceptron, to add Line objects to the System, and evaluate, to return the result of classification. Class System was aliased as Model. This framework was written by the TA and given to students. Palak, Doug, and Ellie had to implement a few classes and the main function.
As submission one, they were asked to implement the interface Perceptron in a MyPerceptron class. They added constructors to set the slope and intercept. They programmed the logic for the activate function by computing the geometrical orientation of the point relative to the line. The class also had support for initializing the output function dynamically (functions as arguments) as a function of activate.
sign(weight*point.x + bias - point.y)
Before submission, the tutors made them alias slope as weight and intercept as bias.
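Here is a rough Python sketch of what such a MyPerceptron could look like. The original framework is not shown in this post, so everything beyond the names MyPerceptron, Point, activate, output, weight and bias is an assumption: in particular the output_fn parameter and its default are my own way of showing "functions as arguments".

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def sign(value: float) -> int:
    """Return +1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


class MyPerceptron:
    def __init__(self, weight: float, bias: float,
                 output_fn: Optional[Callable[[int], int]] = None):
        self.weight = weight  # slope of the line
        self.bias = bias      # intercept of the line
        # Assumed default: fire (1) when the point is on or below the line.
        if output_fn is None:
            output_fn = lambda activation: 1 if activation >= 0 else 0
        self.output_fn = output_fn

    def activate(self, point: Point) -> int:
        # Same expression as in the post: sign(weight*point.x + bias - point.y)
        return sign(self.weight * point.x + self.bias - point.y)

    def output(self, point: Point) -> int:
        # output is a pluggable function of activate
        return self.output_fn(self.activate(point))


below_line = MyPerceptron(weight=1.0, bias=0.0)
above_line = MyPerceptron(weight=1.0, bias=0.0,
                          output_fn=lambda a: 1 if a <= 0 else 0)

p = Point(2.0, 0.0)          # a point below the line y = x
print(below_line.output(p))  # prints 1
print(above_line.output(p))  # prints 0
```

Passing the output function in from outside lets the same class describe either side of a line, which matters once several lines have to enclose a region.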
Submission two required completion of the Model class. They used a list of the interface type Line to hold the constituent perceptrons. They implemented the constructors and the add_perceptron and evaluate functions in the Model class, with evaluate aggregating the outputs of all the constituent perceptrons and returning “in” or “out”.
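A minimal sketch of such a Model, assuming the aggregation is a logical AND over the perceptron outputs (which is what the convex polygon idea needs). The HalfPlane class below is only a stand-in so the example runs on its own; it is not part of the original framework, and the triangle is my own test shape.

```python
class HalfPlane:
    """Stand-in perceptron: outputs 1 on the chosen side of y = weight*x + bias."""

    def __init__(self, weight, bias, inside_is_below):
        self.weight = weight
        self.bias = bias
        self.inside_is_below = inside_is_below

    def output(self, x, y):
        value = self.weight * x + self.bias - y
        if self.inside_is_below:
            return 1 if value >= 0 else 0
        return 1 if value <= 0 else 0


class Model:
    def __init__(self, perceptrons=None):
        self.perceptrons = list(perceptrons or [])

    def add_perceptron(self, perceptron):
        self.perceptrons.append(perceptron)

    def evaluate(self, x, y):
        # Logical AND: the point must be on the inner side of every line.
        # Note that an empty model classifies every point as "in".
        inside = all(p.output(x, y) == 1 for p in self.perceptrons)
        return "in" if inside else "out"


# Triangle with vertices (0, 0), (4, 0) and (2, 2).
triangle = Model()
triangle.add_perceptron(HalfPlane(0.0, 0.0, inside_is_below=False))  # above y = 0
triangle.add_perceptron(HalfPlane(1.0, 0.0, inside_is_below=True))   # below y = x
triangle.add_perceptron(HalfPlane(-1.0, 4.0, inside_is_below=True))  # below y = -x + 4

print(triangle.evaluate(2, 1))  # prints in
print(triangle.evaluate(0, 3))  # prints out
```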
They were made to test their models on a regular hexagon of 2 units length. In the main function, they diligently created 6 instances of the Perceptron class using constructors with appropriate weights and biases. For each of the six perceptrons, they initialized their respective output functions as functions of activate. They instantiated a Model class and added the six perceptrons to their model, which was now ready for testing with any given (x, y). Randomly generated coordinates inside the hexagon, on the hexagon and outside the hexagon were thrown at the model for evaluation.
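To make the hexagon test concrete, here is a sketch that wires six perceptrons by hand. It assumes "2 units" means the side length, that the hexagon is centred on the origin with two vertices on the x-axis (so no edge is vertical), and that points on the boundary count as "in". The slopes and intercepts follow from the vertices (2, 0), (1, √3), (-1, √3), (-2, 0), (-1, -√3) and (1, -√3). The helper names below and above are mine.

```python
import math


class Perceptron:
    def __init__(self, weight, bias, output_fn):
        self.weight = weight
        self.bias = bias
        self.output_fn = output_fn

    def activate(self, x, y):
        value = self.weight * x + self.bias - y
        return (value > 0) - (value < 0)  # sign of value

    def output(self, x, y):
        return self.output_fn(self.activate(x, y))


class Model:
    def __init__(self):
        self.perceptrons = []

    def add_perceptron(self, perceptron):
        self.perceptrons.append(perceptron)

    def evaluate(self, x, y):
        inside = all(p.output(x, y) == 1 for p in self.perceptrons)
        return "in" if inside else "out"


def below(activation):
    """Output function for edges whose inner side is below the line."""
    return 1 if activation >= 0 else 0


def above(activation):
    """Output function for edges whose inner side is above the line."""
    return 1 if activation <= 0 else 0


S3 = math.sqrt(3)
edges = [
    (-S3, 2 * S3, below),   # upper right edge
    (0.0, S3, below),       # top edge
    (S3, 2 * S3, below),    # upper left edge
    (-S3, -2 * S3, above),  # lower left edge
    (0.0, -S3, above),      # bottom edge
    (S3, -2 * S3, above),   # lower right edge
]

hexagon = Model()
for weight, bias, output_fn in edges:
    hexagon.add_perceptron(Perceptron(weight, bias, output_fn))

for point in [(0, 0), (0, 1.7), (0, 1.9), (1.9, 1.0), (3, 0)]:
    print(point, hexagon.evaluate(*point))
# prints in, in, out, out, out (one line per point)
```

The point (1.9, 1.0) is the interesting one: it is within 2 units of the centre horizontally and still falls outside, because it lies above the upper right edge.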
The final submission required them to write a utility function that automated the model creation they had earlier coded by hand in main, i.e. for any given size and shape. Ellie, Doug and Palak methodically converted the size and shape of the polygon into angles and distances, building upon high school geometry. They summoned their trigonometry skills and derived the vertices of the polygon. From there, it was straightforward to infer the slopes and intercepts of each of the lines. It took effort on their end to convert geometry and trigonometry into a generic piece of code, but their fundamentals in programming (learned earlier) were strong enough to back them. They completed the utility function, called it from main to get the model based on “i” and “n”, and tested it for accuracy. No surprise, they got a full score.
I asked them how they classified a point as “in” or “out” based on its presence in three disconnected hexagons, as in the exam question.
They initially started by representing each of the 18 lines from the hexagons as a perceptron and aggregating the results, but soon realized that a logical AND of all the perceptron outputs would encompass incorrect regions, or maybe no region at all. They identified that the problem could be solved separately for each hexagon, with a separate combiner, implemented as a logical OR of the three outputs, producing the final result. To test them further, I asked them how they would extend their assignment code to incorporate a system like the one in the exam. It took a while and some nudging, but they arrived at a fairly decent solution: one more higher level class which encapsulated an ordered collection of their earlier model instances and chained the output of one to the next, resulting in the final output. My jaw just dropped.
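The geometry behind such a utility can be sketched as follows. This is my reconstruction, not their submission, and it leans on several assumptions: size means the side length, the polygon is centred on the origin, and the polygon is rotated so that no edge is vertical (a vertex on the positive x-axis for even n, a vertex on the positive y-axis for odd n), since a slope-intercept line cannot be vertical. The function names are made up.

```python
import math


def sign(value):
    return (value > 0) - (value < 0)


def build_polygon_lines(side_length, n):
    """Return one (weight, bias, inside_sign) triple per edge of a regular n-gon."""
    if n < 3:
        raise ValueError("a polygon needs at least 3 sides")
    # Distance from the centre to each vertex, from the side length.
    radius = side_length / (2 * math.sin(math.pi / n))
    # Assumed orientation that keeps every edge non-vertical.
    start = 0.0 if n % 2 == 0 else math.pi / 2
    vertices = [
        (radius * math.cos(start + 2 * math.pi * k / n),
         radius * math.sin(start + 2 * math.pi * k / n))
        for k in range(n)
    ]
    lines = []
    for k in range(n):
        (x1, y1), (x2, y2) = vertices[k], vertices[(k + 1) % n]
        weight = (y2 - y1) / (x2 - x1)  # slope of the edge
        bias = y1 - weight * x1         # intercept of the edge
        # At the centre (0, 0) the expression weight*x + bias - y equals bias,
        # so the sign of bias tells which side of the line is the inside.
        lines.append((weight, bias, sign(bias)))
    return lines


def classify(lines, x, y):
    for weight, bias, inside_sign in lines:
        if sign(weight * x + bias - y) * inside_sign < 0:
            return "out"
    return "in"


hexagon = build_polygon_lines(side_length=2, n=6)
triangle = build_polygon_lines(side_length=2, n=3)
print(classify(hexagon, 0, 1.7), classify(hexagon, 1.9, 1.0))    # prints in out
print(classify(triangle, 0, 1.0), classify(triangle, 0.9, 0.5))  # prints in out
```

Using the centre of the polygon to decide the inner side of each edge avoids having to reason about edge orientation case by case.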
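The difference between the two approaches is easy to see in code. The sketch below is an illustration under my own assumptions: three hexagons of side 2 centred at (0, 0), (10, 0) and (0, 10), with class names HexagonModel and ChainedModel invented for the example. Each hexagon is an AND over its own six lines, and the higher level class passes a running OR from one model to the next.

```python
import math

S3 = math.sqrt(3)

# (weight, bias, inside_sign) for a regular hexagon of side 2 centred on the origin.
HEXAGON_LINES = [
    (-S3, 2 * S3, 1), (0.0, S3, 1), (S3, 2 * S3, 1),
    (-S3, -2 * S3, -1), (0.0, -S3, -1), (S3, -2 * S3, -1),
]


def sign(value):
    return (value > 0) - (value < 0)


class HexagonModel:
    def __init__(self, cx, cy):
        # Shifting a line y = w*x + b by (cx, cy) changes only its intercept.
        self.lines = [(w, b + cy - w * cx, s) for w, b, s in HEXAGON_LINES]

    def evaluate(self, x, y):
        # Logical AND over this hexagon's six perceptrons.
        inside = all(sign(w * x + b - y) * s >= 0 for w, b, s in self.lines)
        return 1 if inside else 0


class ChainedModel:
    """Ordered collection of models; each output is OR-ed into the next."""

    def __init__(self, models):
        self.models = list(models)

    def evaluate(self, x, y):
        result = 0
        for model in self.models:
            result = result or model.evaluate(x, y)
        return "in" if result == 1 else "out"


hexagons = [HexagonModel(0, 0), HexagonModel(10, 0), HexagonModel(0, 10)]
system = ChainedModel(hexagons)


def and_of_all_lines(x, y):
    """The first attempt: one AND over all 18 lines."""
    return "in" if all(h.evaluate(x, y) == 1 for h in hexagons) else "out"


print(system.evaluate(0, 0), system.evaluate(0.5, 10.5), system.evaluate(5, 5))
# prints in in out
print(and_of_all_lines(0, 0))  # prints out, although (0, 0) is inside a hexagon
```

Because the three hexagons do not overlap, no point can satisfy all 18 lines at once, which is the "no region at all" case the kids ran into.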
Taking it even further, this is also a possibility
There are many takeaways from this kind of assignment.
- It is a lot more top-down than bottom-up. The programming assignments that I have done make you write primitive implementations of things like string comparison, pattern drawing, fancy math tricks, etc. But the focus should rather be on developing skills for incremental development, i.e. given the building blocks, can you extend them, complete them and connect them to solve a problem? Not only would students appreciate the skeleton (which they might have to build on their own in the future) a lot more, but they would also understand it much better by applying it with their own hands. As an introduction, a top-down approach gives a broader picture of the domain and seeds a stronger motivation in the student to learn.
- The second key takeaway from the assignment was the connection-ism. Using known concepts (geometry, trigonometry, programming) to learn something new (OOP), to build something which I will learn in the future (a classifier), has the connotation of story-telling. It adds a sense of purpose to what I am learning, and the learning is taken a lot more seriously. Humans have a tendency to derive satisfaction when things connect, and this assignment was a bold testament to that.
- A final takeaway from the assignment was something I call metaphoric-scaffolding: the act of explaining complex concepts by representing them as interactions between collections of simpler, physical objects which serve as metaphors. Explaining “training a classifier” technically to a kid of that age might not be all that easy, but converting it into a configuration problem in coordinate geometry with reduced dimensions was a solid metaphor.
I was not just impressed by their ability to adapt, but also felt like attending their school just to enjoy learning. If taught so intuitively, how can learning become boring? Ever. The idea of building on what you have already learned to plant seeds for something you are going to learn should be the foundation of an education system. This not only cements a better understanding but also engenders creativity by virtue of connection-ism, the most pivotal skill in the decade to come. Complex abstract concepts should be broken down using a scaffolding of analogies and metaphors to the tangible physical world. Robert Greene’s observation that
“The future belongs to those who learn more skills and combine them in creative ways.”
was very evident in their system. I kept telling myself that had I been taught like this, my interest would probably have manifested differently. Much of the onus for how students turn out is on the teacher and the teaching paradigms. A good teacher can make you fall in love with something rather boring, while a bad teacher can wreck even the most interesting of concepts. Of course, there is an element of subjectivity here, but the need for learning to be done right embodies an inherent objective truth.
Learning to learn is an art, mastery of which manifests in the finest of quality.