On Getting Started
Experts. Transitions. Friends.
In one of our previous discussions, we examined how the simple act of predicting the next word within a context window, pursued relentlessly with copious data and staggering computational resources, gave birth to a program (a large language model) with emergent qualities, capable of mimicking the prose of Milton and the meandering musings of Murakami. Yet these emergent traits are but mirages, lacking practical value without task-specific setup and fine-tuning. In plain terms, we possess a splendid horse that requires discipline to triumph in a specific race. For instance, ChatGPT has been meticulously fine-tuned for one-on-one conversation. However, the path from an illusory super-bot to a capable conversational companion demands several steps. Today we shall discuss the essence of the very first step in that fine-tuning journey.
Have you ever felt utterly out of place, as if you didn’t belong where you were? Maybe you recall that initial week amidst the cool kids, the fleeting illusion of fitting in, only to find yourself trailing behind the rest. Or learning a difficult new skill and realizing how mentally unprepared you were for it? Or maybe a life transition that burdened you with myriad responsibilities and left you doubting your ability to manage it all? When the learning curve is too steep, progress is slow, and the short-term gains remain meager, a sense of inadequacy sets in, also known as impostor syndrome.
The good news is, we’ve all been there. The bad news? Adventurous souls will encounter these impostor-like situations more often than they anticipate. But the great news is, a simple, well-known solution exists.
Introducing Sam — a virtuoso in programming and logic, a wunderkind who crafted remarkable apps during his teenage years, and a troubleshooter who amazed his peers. College life unveiled promising opportunities, and in the early months of his sophomore year, job offers graced his doorstep, thanks to his prowess in computer science.
A couple of years into college, he develops a passion for combinatorics, which drives him to switch his major to pure mathematics. Suddenly, the tables turn: his classmates, with more years of math behind them, excel at proving theorems, while he struggles to keep up. His grades falter, his motivation wanes, and he questions his choices.
Yet Sam is no quitter, and he is charming. So he befriends the theorem wizards of his class, spending time with them and absorbing their work styles, down to the failed attempts etched on their whiteboards. He imitates their approach relentlessly, applying it to other problems in the textbook. And finally, success! The next semester is smoother, and the rest…
This, in essence, is behavioral cloning, an easy way to bootstrap yourself when plunged into unfamiliar waters. Just three steps: find an expert, watch them demonstrate their modus operandi, and then blindly emulate their approach until you gain momentum. To devotees of originality, it may seem sacrilegious to do so. And indeed, it is. However, developing your own strategy, exploring different opportunities, and crafting your own style can only happen once you feel at ease; behavioral cloning merely serves as the starting recipe.
Dialogue agents took the same first step: they were trained on Q&A logs from sources like Stack Overflow to transform the intelligent hallucination into an agent capable of answering questions. Years before ChatGPT, behavioral cloning was already popular in robotics as a way to kickstart learning. Industrial robots, ChatGPT, and our friend Sam should ideally learn by doing, through trial and error, forging their own path, not by mimicking an expert’s modus operandi. Yet that process can be slow, agonizing, and arduous, which makes imitation a necessary initial step.
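For the curious, the three steps (find an expert, watch, emulate) can be sketched in a few lines of Python. This is only a toy illustration, not how ChatGPT or any real robot is trained: the one-dimensional world, the `GOAL` cell, and every function name below are assumptions made up for this example. The "expert" walks toward a goal, we record what it did in each state, and the learner simply copies the most common expert action per state.

```python
from collections import Counter, defaultdict

GOAL = 5  # hypothetical goal cell on a line of cells 0..10


def expert_policy(state):
    """Hypothetical expert: step toward GOAL, stand still once there."""
    if state < GOAL:
        return 1
    if state > GOAL:
        return -1
    return 0


def collect_demonstrations(start_states, max_steps=20):
    """Step 2: watch the expert and record (state, action) pairs."""
    demos = []
    for state in start_states:
        for _ in range(max_steps):
            action = expert_policy(state)
            demos.append((state, action))
            if action == 0:
                break
            state += action
    return demos


def clone_policy(demos):
    """Step 3: emulate. For each state, copy the expert's most frequent action."""
    votes = defaultdict(Counter)
    for state, action in demos:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}

    def policy(state):
        # States the expert never visited have no entry: the learner freezes.
        return table.get(state, 0)

    return policy


def rollout(policy, state, max_steps=20):
    """Run a policy from a start state and return where it ends up."""
    for _ in range(max_steps):
        action = policy(state)
        if action == 0:
            break
        state += action
    return state


demos = collect_demonstrations(start_states=[0, 10])
learner = clone_policy(demos)
final_from_2 = rollout(learner, 2)    # a state the expert passed through
final_from_12 = rollout(learner, 12)  # a state the expert never showed us
```

Starting from cell 2, the learner reaches the goal because the expert demonstrated what to do there. Starting from cell 12, it never moves, because no demonstration covered that cell. That gap is one reason imitation is only the starting recipe and trial and error still matters afterwards.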
In conclusion, when we face new life challenges and feel cornered, we can make friends. Finding those who have trodden the same path and emulating their playbook should get us a foot in the door. The rest, as they say, will be history.