Keeping up with Machine Learning — (February 2020 — Week 1)

Jad Slim
Published in Learn The Part
5 min read · Feb 6, 2020

Welcome to your weekly dose of Artificial Intelligence, Machine Learning and Data Science. Sign up for our weekly newsletter to stay up to date on the latest news.

Some Good News

Web scraping is now legal in the U.S.

  • Publicly available data is now legal for web crawlers to collect, since it was determined that a request from a web scraper bot is no different from a request made through a browser.
  • In both cases “a user is still requesting open data and is doing something with it on their side”.
  • The court now prohibits competitors from removing information from your site if the competitor’s site is public.
  • The court also prohibits companies from interfering with web scraping activities.
  • Among numerous applications, being able to freely automate the extraction of data from public websites will have considerable implications for machine learning model training.

OpenAI Switches to PyTorch

  • OpenAI announced that they’re migrating from TensorFlow, and will primarily use PyTorch as their deep learning framework.
  • Switching to PyTorch is supposed to increase research productivity at scale on GPUs. The OpenAI research team believes that deep reinforcement learning will play a pivotal role in the development of powerful AI technology. Switching to PyTorch made it a lot easier to execute new research ideas, drastically reducing iteration time from weeks to days.
  • OpenAI cited PyTorch’s efficiency, scalability, and adoption as the reasons for its decision.
  • PyTorch claimed one of the top spots for fastest-growing open source projects in the past 12 months.
  • As part of this migration, OpenAI has released a PyTorch-enabled version of Spinning Up in Deep RL, an educational resource for those looking to learn about deep reinforcement learning.

Google Dataset Search is out of beta and adds new features

  • Google just announced that Dataset Search is officially out of beta. Dataset Search is a service that lets you search publicly available data sets, which researchers can use to train and test their machine learning models.
  • Google has also added new features to Dataset Search, most notably the ability to filter by the type of data set you want to see (tables, images, text, etc.), making it easier to find the data that you’re looking for.
  • Google said Dataset Search has indexed almost 25 million data sets.
  • Anybody who owns an interesting data set can make it available for indexing by using standard schema.org markup to describe the data in more detail.
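To make the schema.org point more concrete, here is a minimal Python sketch that builds a `Dataset` description as JSON-LD and wraps it in a script tag for a web page. The helper names (`dataset_jsonld`, `to_script_tag`) and all of the example values are hypothetical, and the handful of properties shown is only a small subset of what the schema.org vocabulary offers, so treat it as a starting point rather than a complete recipe.

```python
import json


def dataset_jsonld(name, description, url, keywords, license_url):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
        "license": license_url,
    }


def to_script_tag(metadata):
    """Wrap the JSON-LD in a script tag that can be placed in a page's HTML."""
    body = json.dumps(metadata, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical example values, for illustration only.
example = dataset_jsonld(
    name="Example pet adoption records",
    description="A made-up data set used to illustrate schema.org markup.",
    url="https://example.com/datasets/pet-adoption",
    keywords=["pets", "adoption", "tabular"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

print(to_script_tag(example))
```

The printed tag would go on the page that hosts the data set, so that a crawler can read the description alongside the data.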

Some Bad News

The winner of Kaggle’s PetFinder competition was caught cheating

  • A non-profit devised a contest for creating an algorithm that could predict how quickly a pet would be adopted.
  • A $25,000 prize was set to reward the best solutions.
  • The winner was Pavel Pleskov and his team, Bestpetting.
  • Prior to the cheating incident, Pavel Pleskov was regarded as a Kaggle Grandmaster, and described as “the best of the best” on Kaggle.
  • Benjamin Minixhofer, a teenage student, volunteered to put Bestpetting’s code into production.
  • Minixhofer later found out that Bestpetting cheated in the competition. The algorithm Bestpetting developed had access to the answers, and wasn’t capable of making any actual predictions as they were all pre-defined within the algorithm.
  • According to Minixhofer, this undermines the legitimacy of Kaggle competitions, and everyone who wins money from a competition should be required to open-source their solution.
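To see why a solution with pre-defined answers cannot make real predictions, consider the toy Python sketch below. It is not Bestpetting’s code; the pet IDs, labels, and function names are all hypothetical. It only illustrates the general idea that a “model” which looks up memorized answers scores perfectly on the rows it has seen and fails on anything new.

```python
# Hypothetical leaked answers: pet ID -> adoption speed label.
leaked_answers = {"pet_a": 2, "pet_b": 4, "pet_c": 1}


def lookup_model(pet_id):
    """Return the memorized label, or None if the pet was never seen."""
    return leaked_answers.get(pet_id)


def accuracy(model, labelled):
    """Fraction of pets for which the model returns the true label."""
    correct = sum(model(pet_id) == label for pet_id, label in labelled.items())
    return correct / len(labelled)


# Pets whose answers were leaked versus pets the "model" has never seen.
seen = dict(leaked_answers)
unseen = {"pet_x": 3, "pet_y": 0}

print("accuracy on leaked pets:", accuracy(lookup_model, seen))
print("accuracy on new pets:", accuracy(lookup_model, unseen))
```

Scoring a submission on data that was never available to the competitors is one way to expose this kind of gap, which is effectively what happened when the code was prepared for production.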

Other News

Google claims its new chatbot Meena is the best in the world

  • Google released a neural-network-powered chatbot called Meena that it claims is better than any other chatbot out there.
  • Google says Meena can talk about pretty much anything, and can even make up (bad) jokes.
  • Google has developed a new metric it calls the Sensibleness and Specificity Average (SSA), which captures important attributes for natural conversations. For example, if you say “I like tennis” and a chatbot replies “That’s nice,” the response makes sense but is not specific.
  • Google says it won’t be releasing a public demo until it has vetted the model for safety and bias.
A chat between Meena (left) and a person (right)
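As a rough illustration of how a metric like SSA could be computed, the Python sketch below averages two fractions over a set of human-rated responses: the share rated sensible and the share rated specific. The `RatedResponse` class, the `ssa` function, and the sample ratings are hypothetical, and counting a response as specific only when it is also sensible is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class RatedResponse:
    sensible: bool  # does the reply make sense in context?
    specific: bool  # is the reply specific to the conversation?


def ssa(ratings):
    """Average of the sensibleness rate and the specificity rate."""
    if not ratings:
        raise ValueError("at least one rated response is required")
    n = len(ratings)
    sensibleness = sum(r.sensible for r in ratings) / n
    # Assumption: a reply only counts as specific if it is also sensible.
    specificity = sum(r.sensible and r.specific for r in ratings) / n
    return (sensibleness + specificity) / 2


# Hypothetical ratings; the first mirrors the "That's nice" reply,
# which makes sense but is not specific.
sample = [
    RatedResponse(sensible=True, specific=False),
    RatedResponse(sensible=True, specific=True),
    RatedResponse(sensible=False, specific=False),
    RatedResponse(sensible=True, specific=True),
]

print(ssa(sample))
```

With these made-up ratings, three of four replies are sensible and two of four are specific, so the score is the average of 0.75 and 0.5.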

Content To Help You Grow

Should You Launch an AI Startup in 2020?

Machine Learning with Phil shares his perspective on the viability of starting an AI startup.

23 Amazing Deep Learning Project Ideas

23 project ideas for you to practice and improve your deep learning knowledge and skills. These project ideas are divided according to their difficulty level.

Conclusion

You’ve reached the end of the article; feel free to subscribe below to stay up to date on the latest developments.

We also have courses on machine learning and data analysis; feel free to check them out.
