Data Science News Flash: 10–17–2019

District Data Labs
7 min read · Oct 25, 2019

The latest Data Science articles — algorithmically curated, ranked, and summarized just for you.

News Flash is a weekly publication that features the top news stories for a specific topic. The stories are algorithmically curated, evaluated for quality, and ranked so that you can stay on top of the most important developments. Additionally, the most important sentences for each story are extracted and displayed as highlights so you can get a sense of what each story is about. If you want more information for a particular story, just click on it to read the entire article.

You can see the other topics we have News Flashes available for here and sign up to receive any that you’re interested in.

Artificial intelligence: Cheat sheet

Highlights:

  • There are many artificial intelligence software platforms and AI machines designed to do all that heavy lifting, and the results are transforming businesses: What was once out of reach for smaller organizations is now feasible, and businesses of all sizes can make the most of each resource by using artificial intelligence to design the perfect future.
  • Analytics may be the rising star of business AI, but it’s hardly the only application of artificial intelligence in the commercial and industrial worlds.
  • AWS offers pre-built algorithms, one-click machine learning training, and training tools for developers getting started in, or expanding their knowledge of, AI development.
  • Watson is IBM’s version of cloud-hosted machine learning and business AI, but it goes a bit further with more AI options.
  • As previously reported by TechRepublic, finding employees with the right set of AI skills is the problem most commonly cited by organizations looking to get started with artificial intelligence.

7 start-ups shaking up Ireland’s data science scene

Highlights:

  • From optimising athlete performance to developing predictive analytics for public transport, we’ve rounded up some of the most exciting data science start-ups from around the country.
  • And as it’s Data Science Week on Siliconrepublic.com, it only makes sense to look at some of the most interesting players emerging in Ireland’s data science scene, with a primary focus on the west coast.
  • The company’s clients include global brands and multinationals in Ireland and the UK, and the firm is also involved in data analytics projects in China and the US.
  • Eberle, O’Reilly and Whelan merge the worlds of sports science, engineering and data science to create scientifically valid wearable technologies that can change the way coaches optimise their athletes’ performance.
  • The start-up’s founders also created Big Data Belfast, which is one of the leading data and technology events on the island of Ireland.

Thoughtfully Using Artificial Intelligence in Earth Science

Highlights:

  • Artificial intelligence (AI) methods have emerged as useful tools in many Earth science domains (e.g., climate models, weather prediction, hydrology, space weather, and solid Earth).
  • The field of theory-guided data science investigates ways in which AI and scientific knowledge can be combined into hybrid algorithms that incorporate the best of both worlds (Karpatne et al., 2017).
  • There is a tremendous need to develop guidelines and best practices to prepare the future Earth science workforce for innovative, interdisciplinary research bridging Earth science and AI. It might not always be possible to find suitable collaborators, so one option is to join learning communities, such as the National Science Foundation–sponsored EarthCube Research Coordination Network IS-GEO: Intelligent Systems Research to Support Geosciences.
  • The increasing number of sessions (e.g., coordinated by AGU’s Earth and Space Science Informatics section), workshops (e.g., Climate Informatics), and conferences (e.g., the American Meteorological Society’s Conference on Artificial Intelligence for Environmental Science) dedicated to AI research in Earth science is encouraging, yet there is still a large need for additional events that engage Earth science and AI researchers simultaneously and build bridges between these communities.
  • Numerous institutions are starting to incorporate data science and AI courses into their curricula (e.g., Cornell’s Institute for Computational Sustainability and National Science Foundation Research Traineeship programs at the University of Chicago, University of California, Berkeley, and Northwestern University).

Geospark Analytics deploys AI for risk assessment and effective decision-making

Highlights:

  • With the help of Machine Learning and AI, the company makes sense of data, identifies what’s relevant and what’s not, and provides actionable insights that enable clients to make informed decisions.
  • Geospark Analytics, a Washington DC-based startup, was founded in 2017 with the purpose of helping organizations avoid risks and make decisions based on highly accurate data.
  • The company geotags all the data that it gathers, categorizes it using Machine Learning models, and using the data, assesses activity and stability levels.
  • Geospark Analytics recently signed an agreement with Twitter, enhancing its Hyperion platform with hyper-focused real-time breaking events and further strengthening its Machine Learning and AI models.
  • Hyperion uses AI and Machine Learning to compile automated risk and threat intelligence information by ingesting massive amounts of data, determining what is relevant, and delivering the user streaming assessments and alerts.

Health Care Startup Doc.ai Announces New Digital Trial and Lays Blueprint for Changing the $3.5 Trillion Health Care Industry

Highlights:

  • Doc.ai’s platform allows researchers to obtain more patient data across increasingly diverse data sets, and to process the data with advanced AI to identify important correlations, causation or conclusions.
  • Compared to a centralized approach that uses machine learning in the cloud, Federated Learning decentralizes the processing of data to end devices, like a mobile phone.
  • Federated Learning will have a profound impact on AI, machine learning, and edge computing, and it will affect virtually every industry in some way.
  • It will be important for any technology provider handling data or looking to add AI and machine learning to their products and services.
  • Whether you consider the implications with consumer data rights, AI, Federated Learning or the power of data, you should understand how this announcement could have a significant impact on your life, your work or your industry.
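
To make the contrast between centralized and federated training concrete, here is a minimal sketch of federated averaging in plain Python. It is an illustration only and not Doc.ai's implementation: the one-parameter model, the toy data, and every function name below are assumptions made for the example. Each simulated device fits the model on its own data, and only the resulting weight is sent back to be averaged.

```python
from typing import List, Tuple

# Hypothetical setup: each "device" holds its own (x, y) pairs and never shares them.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for the model y = w * x on one device's data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w: float, clients: List[Dataset]) -> float:
    """One round: every client trains locally, then the server averages the
    returned weights, weighting each client by how much data it holds."""
    total = sum(len(data) for data in clients)
    return sum(local_update(w, data) * len(data) for data in clients) / total


def train(clients: List[Dataset], rounds: int = 50, w0: float = 0.0) -> float:
    """Repeat federated rounds starting from the initial weight w0."""
    w = w0
    for _ in range(rounds):
        w = federated_round(w, clients)
    return w


# Toy data generated from y = 3 * x, split unevenly across three devices.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]

if __name__ == "__main__":
    print(f"learned weight: {train(clients):.4f}")
```

The raw data never leaves the `clients` lists; the server only ever sees weights. Real systems add pieces this sketch leaves out, such as secure aggregation and handling devices that drop out mid-round.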

Facebook Has Been Quietly Open Sourcing Some Amazing Deep Learning Capabilities for PyTorch

Highlights:

  • For years, Facebook has based its deep learning work on a combination of PyTorch and Caffe2 and has put a lot of resources into supporting the PyTorch stack and developer community.
  • Not surprisingly, the artificial intelligence (AI) research community has started adopting PyTorch as one of the preferred stacks to experiment with new deep learning methods.
  • CrypTen incorporates security and data privacy techniques as a native citizen of machine learning models, allowing researchers to leverage these methods without having to become experts in cryptography.
  • Developers can also use Captum to identify the different features that contribute to a model’s output, which helps them design better models and troubleshoot unexpected model outputs.
  • Projects like Captum, Detectron2 and CrypTen complement the core PyTorch stack and help to bridge the gap between research and production deep learning systems.

The Rise of Meta Learning

Highlights:

  • The term “Meta-Learning” is used frequently in Deep Learning literature, often alongside “AutoML”, “Few-Shot Learning”, or “Neural Architecture Search”, in reference to the automated design of neural network architectures.
  • Despite the use of Reinforcement Learning to train a single agent compared to Population-based Learning to adapt a group of agents, POET and Automatic Domain Randomization are very similar.
  • Data Augmentation is most easily understood in the context of image data, although we have already seen how the physics data can be augmented and randomized as well.
  • Neural Architecture Search has employed a wide range of algorithms to search for architectures, including Random Search, Grid Search, Bayesian Optimization, Neuro-evolution, Reinforcement Learning, and Differentiable Search.
  • This paradigm, described in Jeff Clune’s AI-GAs, of algorithms that contain meta-learning architectures, meta-learning the learning algorithms themselves, and generating effective learning environments stands to be an enormous opportunity for the advancement of Deep Learning and Artificial Intelligence.
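
Of the search algorithms listed in the highlights above, Random Search is the simplest to sketch. The example below is a toy illustration, not code from the article: the search space, the `proxy_score` function, and all names are assumptions. In a real search, each candidate would be trained and scored on validation data instead of being scored by a made-up formula.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical search space; a real one would describe actual network choices.
SEARCH_SPACE: Dict[str, List] = {
    "layers": [1, 2, 3, 4],
    "units": [16, 32, 64, 128],
    "activation": ["relu", "tanh"],
}


def sample_architecture(rng: random.Random) -> Dict[str, object]:
    """Draw one candidate by picking each setting uniformly at random."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def proxy_score(arch: Dict[str, object]) -> float:
    """Made-up stand-in for validation accuracy (higher is better).
    It peaks at 3 layers, 64 units, and relu purely for illustration."""
    score = -abs(arch["layers"] - 3) - abs(arch["units"] - 64) / 64
    if arch["activation"] == "relu":
        score += 0.1
    return score


def random_search(n_trials: int, seed: int = 0) -> Tuple[Dict[str, object], float]:
    """Sample n_trials candidates and keep the best-scoring one."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(n_trials):
        arch = sample_architecture(rng)
        score = proxy_score(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


if __name__ == "__main__":
    arch, score = random_search(n_trials=50)
    print(arch, round(score, 3))
```

The other methods named above differ mainly in how the next candidate is chosen: instead of sampling blindly, they use the scores of earlier candidates to guide the search.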

Trade and Invest Smarter — The Reinforcement Learning Way

Highlights:

  • Under the hood, the framework uses many of the APIs from existing machine learning libraries to maintain high quality data pipelines and learning models.
  • Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.
  • While the recommended use case is to plug a trading environment into a trading strategy, you can obviously use the trading environment separately, any way you’d otherwise use a gym environment.
  • In this example, we will be using the Stable Baselines library to provide learning agents to our trading strategy; however, the TensorTrade framework is compatible with many reinforcement learning libraries such as Tensorforce, Ray’s RLLib, OpenAI’s Baselines, Intel’s Coach, or anything from the TensorFlow line such as TF Agents.
  • A TradingEnvironment is a gym environment that takes an InstrumentExchange, an ActionStrategy, a RewardStrategy, and an optional FeaturePipeline, and returns observations and rewards that the learning agent can be trained and evaluated on.
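
The idea of an agent acting in an environment to maximize cumulative reward can be sketched without any library. The class below is a hypothetical stand-in that only imitates the gym-style `reset`/`step` loop; it is not TensorTrade's `TradingEnvironment`, and the price series, reward rule, and names are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


class ToyTradingEnv:
    """Hypothetical gym-style environment: the observation is the current price,
    and the reward is the next price change if the agent holds the asset."""

    def __init__(self, prices: List[float]) -> None:
        self.prices = prices
        self.t = 0

    def reset(self) -> float:
        """Start a new episode and return the first observation."""
        self.t = 0
        return self.prices[0]

    def step(self, action: int) -> Tuple[float, float, bool]:
        """action 1 = hold the asset over the next tick, 0 = stay in cash.
        Returns (next observation, reward, done)."""
        change = self.prices[self.t + 1] - self.prices[self.t]
        reward = change if action == 1 else 0.0
        self.t += 1
        done = self.t == len(self.prices) - 1
        return self.prices[self.t], reward, done


def run_episode(env: ToyTradingEnv, policy: Callable[[float], int], gamma: float = 1.0) -> float:
    """Roll out one episode and return the discounted cumulative reward."""
    obs = env.reset()
    total, discount, done = 0.0, 1.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += discount * reward
        discount *= gamma
    return total


if __name__ == "__main__":
    env = ToyTradingEnv([10.0, 11.0, 9.0, 12.0])
    print("always hold:", run_episode(env, lambda obs: 1))
    print("always cash:", run_episode(env, lambda obs: 0))
```

A learning agent from a library such as Stable Baselines would take the place of the fixed `policy` function here, adjusting its actions over many episodes to increase the value that `run_episode` returns.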

Produced and Sponsored by:

Innovative Data Science & Advanced Analytics Solutions
