A Guide to Apache Kafka, Paradoxes in Data Science, GitHub Copilot, and Jobs

ODSC - Open Data Science

Published in ODSCJournal, sent as a newsletter

Jan 7, 2022

ODSC East Warmup Guide to Apache Kafka

This free download is a comprehensive guide to Apache Kafka, a popular open-source distributed event streaming platform that you can learn more about at ODSC East 2022.

Paradoxes in Data Science

Here’s a look into some of the main paradoxes associated with data science and its statistical foundations.

Reviewing the TensorFlow Decision Forests Library

This library can be used to build tree-based models with TensorFlow and Keras.

Tips and Tricks in RStudio and R Markdown

These tips and shortcuts in RStudio and R Markdown can help you to speed up the writing of your code.

What is GitHub Copilot?

Check out GitHub Copilot, a Visual Studio Code extension that auto-generates code based on the content of the file and the location of your cursor.

Is Data Science Team Training Right For Your Team?

From sharing knowledge to improved ROI, here are seven benefits of data science team training that you should consider.

Earn a certification in deep learning in just 6 weeks with Dr. Jon Krohn’s virtual Deep Learning Bootcamp. You’ll cover a range of essential topics from building and training networks to deep reinforcement learning.

Register here.

Neural Network Models Can Hide Malware, Research Shows

A group of researchers recently revealed that it’s possible to hide at least 36.9 megabytes of malware in neural network models.

Ai+ Live Data Science Training Coming January 2022

With offerings ranging from building ML pipelines to using NetworkX, here are a few live data science training sessions coming to the Ai+ platform in January 2022.

Ai+ Highlight of the Week: A Spurious Outlier Detection System For High-Frequency Time Series Data

Like any other data source, sensor data can be contaminated by noise (outliers), which may or may not be preventable. These outlier points can adversely affect the performance of analytical models. Here, we propose an integrated and scalable approach to detect spurious outliers.

Featured Jobs from Hiring Partners:

Upcoming Webinars:

Git-based CI/CD for ML

Tue, Jan 11, 2022 1:30 PM — 2:30 PM EST

In this session, we’ll discuss how to enable continuous delivery of machine learning to production using Git-based ML pipelines (GitHub Actions) with hosted training and model serving environments.

ML System Design for Continuous Experimentation

Tue, Jan 18, 2022 1:00 PM — 2:00 PM EST

In this live webinar, we will examine some naive ML workflows that don’t take the development-production feedback loop into account and explore why they break down, showcase some system design principles that will help manage these feedback loops more effectively, and more.

The Rise of Vector Databases for Machine Learning at Scale: An Interview with Weaviate’s co-creator

Thurs, Jan 20, 1:00 PM EST

In our next Lightning Interview, we speak with Weaviate’s co-creator, Bob van Luijt.

What Can You Expect from ML in 2022

Wed, Jan 26, 2022 1:00 PM — 2:00 PM EST

In this webinar, join us as we talk about noteworthy highlights in the AI/ML space from 2021, upcoming trends in ML/AI for 2022, and more.
