Your Personal AI (PAI): Pt 5 — Deep Training

The Brave New World of Personalized Smart Agents & their Data

A Multi-Part Series

Part 1 — Your Attention Please
Part 2 — Why Agents Matter
Part 3 — The Agent Environment
Part 4 — Deep Agents
Part 5 — Deep Training (this post)
More soon…

This is an excerpt from my book, The Foresight Guide, free online at ForesightGuide.com. The Guide intros the field of professional foresight and offers a Big Picture view of our accelerating 21st century future.


Open, Massively Bottom-Up Software Design: Conversational Coders

So what does agent and PAI development and training look like as deep learning grows? What general kind of platform will take us another big step toward the “technological singularity”? Let me offer a rough vision.

GitHub, a Facebook for programmers, launched in 2008 and now has over 14 million coders and 35 million repositories of open source code. It is already the largest bottom-up code repository in the world. But open, mass-collaboration platforms like GitHub are still in their infancy. Today's programming is quite technical, and the code being manipulated has only low-level capabilities. Imagine what these platforms will be like when our deep learning-based natural language understanding systems become the front end to development environments that let people code in more natural ways.

In a vision of the future I call conversational coding, programmers will be able to manipulate neural networks as objects, change their architectures, and try them on different data sets and physical environments using natural language, gestures, and visual environments. (Typing will decline but never go away among performance coders, as long as parts of our brains are specialized to use our fingers.) There won't be just a few tens of millions of coders in that world; there'll be hundreds of millions, even billions, because every human who speaks back to their own agent will, in that world, be an entry-level conversational coder.
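Even before natural language front ends arrive, today's deep learning libraries already treat networks as manipulable objects, which is the substrate conversational coding would drive. Here's a minimal sketch, assuming PyTorch, of the kinds of edits a spoken request like "make the hidden layer wider" would translate into; the layer sizes and data are purely illustrative:

```python
import torch
import torch.nn as nn

# A network is already an object you can inspect and rewrite piece by piece.
net = nn.Sequential(
    nn.Linear(784, 128),  # hidden layer: 28x28 inputs -> 128 units
    nn.ReLU(),
    nn.Linear(128, 10),   # output layer: 10 classes
)

# "Make the hidden layer wider" becomes two attribute swaps:
net[0] = nn.Linear(784, 256)
net[2] = nn.Linear(256, 10)

# "Try it on this data" is one training step on a stand-in batch:
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
loss = nn.functional.cross_entropy(net(x), y)
loss.backward()  # gradients flow through the edited architecture
```

Conversational coding would put a language interface between the spoken request and edits like these; the objects underneath change very little.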

Alex Repenning's vision of a conversational programming environment. (Agentsheets.com)

In a deep learning future, whoever has the largest repository of neural network variations and open data sets, and the largest community of testers and trainers, will increasingly build the most useful and trustable variants of naturally intelligent software, including Web 3.0's next operating system and tools. Who will that be? Individual corporations, or the crowd? The massive parallelism of the web and the growing ease of conversational coding argue that open and crowd tools will increasingly be our preferred way to create the best naturally intelligent software. There's a saying in open source: "given enough eyeballs, all bugs are shallow."

Besides being open and massively bottom-up, this coding approach will have to be much more modular. Modularity is yet another thing that biology does very well. Neuropsychologists believe human brains have at least 500 discrete cognitive modules, specialized subsystems used to resolve cognitive tasks.

Many software platforms are taking a big step forward in modularity right now. Open source platforms like Docker, which my developer and futurist friend Bino Gopal says are the future of large application architecture, split large applications into thousands of software containers. Many of these deliver microservices: modular software processes that each do a small task for the user, communicating with each other through language-independent interfaces.

Containers allow programmers to push individual updates to any microservice or its OS "instance" in minutes, behind the scenes, without risk of breaking the application. This design increases the resiliency, parallelism, and virtualization of applications, an obvious advance in natural computing. Application virtualization follows operating system virtualization, pioneered in the 2000s by companies like VMware and with open source software like Linux. It is possible, then, that developing software containers for neural network modules may be one of the next steps forward in natural computing. The seeders, growers, and trainers of tomorrow's massive neural nets will certainly need the ability to update microservices within each application without breaking the architecture. Each application will need to fail or improve gracefully, just like the human brain.
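To make the microservice idea concrete, here is a minimal sketch of one: a single small process exposing one task behind a language-independent interface (JSON over HTTP). The route, port, and toy scoring logic are invented for illustration; in a Docker-style platform, each such process would ship in its own container and be updated independently:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class ScoreService(BaseHTTPRequestHandler):
    """One small task, callable from any language that speaks HTTP."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        # Toy logic; a neural network module could sit behind this same interface.
        score = sum(1 for word in body.get("text", "").split() if word == "good")
        reply = json.dumps({"score": score}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

if __name__ == "__main__":
    # Swapping in a new version of this one service leaves the rest
    # of the application untouched.
    HTTPServer(("0.0.0.0", 8080), ScoreService).serve_forever()
```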

A Small Cryptocurrency Mining Rig. Source: Linustechtips.com

Deep Training — A Better Distributed Computing Vision than the Blockchain for the Digitally-Empowered Crowd

Let's now discuss a currently very popular topic, the blockchain. One job of a foresight professional is to try to tell the most accurate stories one can, while calling out faulty ones. Unfortunately, there are a ton of faulty stories being told today in the blockchain space. Aside from bitcoins (digital currencies), which are very real innovations and, in the leading ones, smart long-term investments in need of better regulation, most of today's blockchain businesses are hype. They'll never work. See my posts, The Truth About Bitcoins and Blockchain, if you want a deeper story.

Fortunately, a small and growing number of entrepreneurs besides the greedy utopians over at Ethereum are recognizing the great value represented by the global crowd of distributed computing volunteers. We need to offer those volunteers something more useful to do on their increasingly parallelized and powerful machines than mining yet another digital coin or investing in highly implausible blockchain startups.

A good example of a better alternative is Quantiacs, a crowdsourced hedge fund working to democratize algorithmic trading (quantitative investing). Quantiacs takes 20% of the profits generated in trading on their platform, where everyone’s algorithms compete in the open. The really democratizing part is that they pay algorithm developers 50% of the profits generated by their algorithm, for as long as it generates a return. They also offer free tutorials and run quant coding workshops to teach data management, backtesting, and strategy and algorithm development skills, which are the future of trading. Quantopian, with 85,000 members in July 2016, is doing exactly the same thing.
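To give a flavor of the skills these platforms teach, here is a minimal backtesting sketch (not Quantiacs' or Quantopian's actual APIs): a moving-average crossover strategy scored against a synthetic price series. The window sizes and the random-walk prices are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in daily closes: a random walk, since we have no real data here.
prices = 100 * np.exp(np.cumsum(rng.normal(0.0002, 0.01, 1000)))

def sma(series, window):
    """Simple moving average; sma[i] averages the window ending on day i."""
    csum = np.cumsum(np.insert(series, 0, 0.0))
    out = np.full(len(series), np.nan)
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

fast, slow = sma(prices, 10), sma(prices, 50)
signal = np.where(fast > slow, 1.0, 0.0)  # long when the fast average leads

# Day i's signal earns day i -> i+1's return, so there is no lookahead bias.
daily_returns = np.diff(prices) / prices[:-1]
strategy_returns = signal[:-1] * daily_returns
total = np.prod(1 + strategy_returns) - 1
print(f"backtested total return: {100 * total:.1f}%")
```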

These companies have a smart, two-sided market business model. The first market is algorithm developers who develop and test their own algorithms, trading their own money, paying their own commissions, and competing against each other in open contests on the platform. The second market is a crowd-sourced hedge fund run by investor-members, who commit to having their investments managed by winning algorithms, and in which the hedge fund pays substantial commissions to the winning algorithms. This is a great distributed model, based on identifiable traders, and it seems to recognize that for the next decade at least, human traders are going to be a lot smarter than the AI they will be building into their algorithms.

Then there's Numer.ai, a crowd hedge fund startup that uses blockchain-like encryption to anonymize trading data before sharing it with a community of equally anonymous data scientists. The traders are all paid anonymously in Bitcoin. Apparently the fund's performance is kept secret, too. This sounds like a recipe for terrible results to me, and a system ripe for financial abuse.

At best, it is an unproven encryption experiment with strong parallels to utopian libertarian thinking, like many blockchain applications. I don't expect it to survive in its current form. Even if it does, I'd bet it will be a very minor player versus non-anonymous platforms like Quantiacs and Quantopian, where the hard work of creating better algorithms is, like the best science, open to group improvement and critique.

Algorithmic trading was only 13% of all trading in 2012, but it is now over 30% and still trending up strongly. Some estimates say that up to 80% of current trading is automated, though we wouldn't call that automation any form of intelligence. Most of the algorithms used today are quite simple, but they'll get increasingly sophisticated, neural network-like, and predictive as time goes on. Today's algorithms mostly correlate vast public data, doing simple dimensional reduction and looking for lots of short-term signals. Eventually they'll incorporate complex memories, parse financial statements, do sentiment and awareness analysis, build preference landscapes, and use both short- and long-term value investing models.
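As a concrete illustration of that "simple dimensional reduction," here is a sketch, on synthetic data, of factoring a panel of asset returns with PCA: a few broad components capture market-wide movement, and what they leave behind is mined for short-term, asset-specific signals. All shapes and numbers here are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
returns = rng.normal(0, 0.01, (250, 50))  # 250 days x 50 assets, stand-in data
# Add a shared "market" factor so there is common structure to find.
returns += np.outer(rng.normal(0, 0.01, 250), np.ones(50))

centered = returns - returns.mean(axis=0)
# PCA via SVD of the centered return matrix.
u, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = (s ** 2) / np.sum(s ** 2)

k = 3                                 # keep the top broad factors
common = (u[:, :k] * s[:k]) @ vt[:k]  # market-wide movement
residual = centered - common          # asset-specific, short-horizon signal

print(f"top {k} factors explain {100 * explained[:k].sum():.0f}% of variance")
print(f"residual daily volatility: {100 * residual.std():.2f}%")
```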

Today, quant investment firms are getting tons of capital inflow. Quant firms like D.E. Shaw, still minor players even as late as 2008, have come to dominate the industry since the financial crisis, and particularly since the rise of deep learning in 2012. According to Joe Rago (a brilliant WSJ reporter, now sadly dead at 34), several of the top quant firms now spend millions a quarter on energy bills alone for their quant trading server farms. See Scott Patterson's Dark Pools (2013) for a fascinating but now somewhat historical take on this emerging industry.

Fortunately, these top-down solutions are already in competition with distributed platforms like those we've mentioned. The faster fund redemption money flows into today's dead-simple trading algorithms, the more their "alphas" (excess returns) will drop. While the big companies will continue to take most of the profits, it will be the open and well-incentivized crowd, enabled by increasingly powerful open source coding tools, not the top firms, that increasingly develops the best new experimental algorithms. Diversity and experimentation are always fastest in that environment. Many of those algorithms will outcompete the majors for a while, and our planet's collective trading intelligence will grow.

We can improve our collective agent intelligence as well. As I argue in this series, smart agents and PAIs are where our greatest new opportunities for social progress now lie. The big companies will continue to lead in agent and PAI production and revenues, but as with quant trading, the crowd will own agent diversity and experimentation.

Here's my suggestion: Google, Microsoft, IBM, Amazon, Nvidia, Intel, a gaming company, or another major hardware or software player presently seeking to become a cloud-based deep learning company should build a platform to recruit and pay the distributed computing crowd to train and test a wide variety of neural nets, inside simulated worlds running on their now highly advanced gaming machines. It should pair that platform with an innovation market like Brightidea that pays deep trainers and testers for valuable new application, training, and testing ideas.

As with GitHub, both open public and private corporate training environments would exist on the platform. This Deep Training Ecosystem should also pay its trainers and testers in Bitcoin or another top digital currency, to maximally reward their efforts.

OpenAI’s Universe Platform — Helping AIs Get Into Computer Games

We see a tiny bit of this vision in OpenAI's release of Universe, a platform that allows AI programs to interact with 3D games originally designed for humans; in the opening of Google's DeepMind Lab to outside software developers; and in Microsoft's Malmo, a platform built on Minecraft that lets algorithms build and act in that primitive virtual world. But we need so much more.
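All three platforms expose variations on the same observe-act-reward loop. Here is a minimal sketch of that loop using the classic OpenAI Gym interface (the pre-0.26 API, contemporary with this post; Universe exposed its games through the same interface), with a random policy standing in for a trained agent:

```python
import gym  # classic (pre-0.26) Gym API

env = gym.make("CartPole-v1")  # stand-in for a 3D game or Minecraft world
observation = env.reset()

total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()  # a trained network would choose here
    observation, reward, done, info = env.step(action)
    total_reward += reward

print(f"episode reward: {total_reward}")
env.close()
```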

Given the difficulties of decentralized software, which we'll discuss shortly, I expect most of our deep learning hardware, virtual worlds, and core software architectures will continue to improve fastest in central locations and clouds run by corporations, not crowds. But even with our current bandwidth and machines, the greatest diversity, experimentation, and innovation can now come from the crowd: training, testing, and permuting open software, and generating next-step innovation ideas, in the trainer-tester's virtual worlds and local physical environments. That's what has become truly new in recent years.

A good Deep Training Ecosystem would be something like GitHub, Kaggle, or Algorithmia, with a reputation system, an innovation market like Brightidea, and plenty of free guides and courses to onboard human trainers and testers. It would include biz dev and marketing freelance opportunities, to enlist the crowd in growing the field. The platform would match the best data managers, trainers, and testers with companies curious to try out deep learning on their problems, starting out in low-risk ways, in either virtual or physical environments.

Such an exponentially growing use of distributed computing would get our global volunteer computing communities, and their increasingly bio-inspired rigs, working on problems that really matter. Any takers? Without viable employment opportunities like this, far too many well-meaning, fed-up people will continue to fall victim to the blockchain hype.

So we can see that many fields are driving this great phase transition (aka “singularity” :) to natural machine intelligence. Each of us can contribute to this epic event in many ways, and the first is just to be aware of it and to share that awareness with others.

If you want to do more: for any curious and analytical mind, young or old, fields and communities like IT, data science, computer science, biotech and bioscience, nanotech and nanoscience, and deep learning and neuroscience offer great places to make an outsized contribution to our naturally intelligent future. Have fun and change the world!

Our Next Post — Safe Agents

Our next post considers the big questions of safety and morality in our smart agents and personal AIs as their intelligence grows. We’ll look at biology to understand how life maintains safety, trust, and moral behavior in social collectives, and understand that something we can call natural security will have to be the future of physical and cyber security in all of tomorrow’s most intelligent machines. There isn’t any other alternative, as I see it.


John Smart is CEO of Foresight University and author of The Foresight Guide. You can find him on Twitter, LinkedIn, or YouTube.

Feedback? Leave it here or reach me at john@foresightU.com.
Want access to my events? Enter your email at ForesightU.com.
Need a speaker? See my speakers page, JohnMSmart.com.

CC 4.0. Please share or adapt, with link and attribution.