Opex AI Roundup — October 2018
The Opex AI Roundup provides you with our take on the coolest and most interesting Artificial Intelligence (AI) news and developments each month. Stay tuned, and feel free to comment on anything you think we missed!
All Is Fair in Love and War, and Evidently ML Models Too
Machine learning models are only as good as their training data. That holds for both senses of “good,” but in this case, we mean “good” in the Glinda-the-Good-Witch sense: noble, fair, and just. When models that inform hiring practices, legal sentencing, or other important real-world decisions are trained on historical data, they often inherit harmful biases that can create unfair outcomes (because historical data reflects our historical terribleness). These comparatively Wicked-Witch-of-the-West models are pretty difficult to fix, and in this episode of the O’Reilly Data Show Podcast, Stanford professor Sharad Goel and his student Sam Corbett-Davies break down exactly why it’s so hard to design machine learning models that are truly fair.
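To make "bias in historical data" a little more concrete, here is a minimal sketch of one common check: comparing how often each group receives a favorable decision. The gap between the highest and lowest rate is often called the demographic parity gap. The toy records, group labels, and function names below are all made up for illustration, and this is only one of several competing definitions of fairness, not necessarily the one the podcast guests would endorse.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the fraction of favorable decisions (1s) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, decision in records:
        totals[group] += 1
        positives[group] += decision
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(records):
    """Return the difference between the highest and lowest selection rate."""
    rates = selection_rates(records)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: (group label, 1 = hired, 0 = not hired).
historical_decisions = [
    ("A", 1), ("A", 1), ("A", 1), ("A", 0),
    ("B", 1), ("B", 0), ("B", 0), ("B", 0),
]

rates = selection_rates(historical_decisions)
gap = demographic_parity_gap(historical_decisions)
print(rates)
print(gap)
```

A model trained to imitate decisions like these would likely reproduce the gap, which is part of why "just train on the data we have" is not a neutral choice.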
Banking on Bilinguals
Would you believe me if I told you that M.I.T. is creating a college to “educate the bilinguals of the future”? Well, it just so happens that the President of M.I.T., L. Rafael Reif, defines “bilinguals” as people in fields like biology, chemistry, or history who are also skilled in modern computing and analytical techniques. Starting next fall, M.I.T.’s new college (backed by a modest investment of $1 billion) will teach students how to wield the power of A.I. (responsibly, of course, as your friendly neighborhood data scientist must).
Great Deals at the Data Science Barn!
Data science is a hot field, and with academic programs and boot camps popping up all over the place, it may feel like shelling out some coin and going back to school is the best way in. While a formal education will definitely help, if you’re motivated and disciplined, you don’t need to go back to school to learn the essentials of data science. This handy guide lays out a great curriculum for a (nearly) free data science education. One of its truest points? “Don’t feel like you have to memorize every method or function name, that comes with practice. If you forget, Google it.” Most of a data scientist’s job is using what they do know to Google effectively for what they don’t (or at least that’s what I tell myself at night).
The Truth Is Out There (in Tweet Form)
Twitter has publicly released the complete archives of tweets attributed to foreign states that allegedly attempted to influence the 2016 U.S. Presidential Election (or, more broadly, the U.S. political climate). In these archives are “…all public, nondeleted Tweets and media (e.g., images and videos) from accounts we believe are connected to state-backed information operations.” With file sizes in the gigabytes, these datasets present a great opportunity for data scientists to flex their big data muscles and help detect political misinformation at the same time.
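If you want to poke at files this large without loading everything into memory, one simple option is to stream the rows and keep only running tallies. The sketch below assumes the archive is a CSV file and that it has a column named tweet_language; both the column names and the three sample rows are placeholders for illustration, so check the actual header of whatever file you download.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Stream a CSV file row by row and count the values in one column."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        counts[row[column]] += 1
    return counts


# Tiny in-memory stand-in for a multi-gigabyte file (hypothetical columns).
sample = io.StringIO(
    "tweetid,tweet_language,tweet_text\n"
    "1,en,hello\n"
    "2,ru,privet\n"
    "3,en,world\n"
)

language_counts = count_by_column(sample, "tweet_language")
print(language_counts.most_common())
```

For a real file, you would pass in something like open(path, newline="", encoding="utf-8") in place of the in-memory sample. Only one row is held at a time, so memory use stays small no matter how big the file is.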
I Wonder How Thin We Can Stretch These Bodily Analogies
To date, deep learning models have primarily mimicked the body’s image processing system, wherein features are identified and processed in an information hierarchy. But now, researchers are considering a different sense as a basis for modeling signal processing: olfaction, or the sense of smell. Scent-based information is naturally unstructured, unlike images, which have defined shapes, colors, and dimensionality. The complexity of olfactory input, as well as how this information is translated into the sensations we process, could serve as the inspiration for a new kind of machine learning approach.
That’s it for this month’s roundup! Check back in November for more of the most interesting news and developments in the AI community (from our point of view, of course).