Many images, many masks, bounding boxes, and key points. How to transform them in sync?

Image source: https://www.newsbreak.com/news/1480496874024/elon-musk-revealed-his-favorite-film-of-2019-was-parasite

I am one of the authors of the image augmentation library Albumentations.

Image augmentation is an interpretable regularization technique. You transform the existing data to generate new samples. (For more details, check the great post about different augmentation libraries by Neptune: Data Augmentations in Python)
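To make "in sync" concrete, here is a minimal pure-Python sketch of one augmentation, a horizontal flip, applied consistently to an image, its mask, its bounding boxes, and its key points. This is not the Albumentations API: the helper name `hflip_sample` and the coordinate conventions (boxes as `(x_min, y_min, x_max, y_max)` pixel edges, key points as pixel indices) are assumptions made for illustration.

```python
def hflip_sample(image, mask, bboxes, keypoints):
    """Horizontally flip an image together with all of its targets.

    Hypothetical helper for illustration only.
    image, mask: lists of rows of equal width.
    bboxes: (x_min, y_min, x_max, y_max) pixel edges, x_max exclusive.
    keypoints: (x, y) pixel indices.
    """
    width = len(image[0])

    # Pixels: reverse every row of both the image and the mask.
    flipped_image = [row[::-1] for row in image]
    flipped_mask = [row[::-1] for row in mask]

    # Boxes: mirror the edges; the old right edge becomes the new left edge.
    flipped_bboxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for (x_min, y_min, x_max, y_max) in bboxes
    ]

    # Key points: pixel index x maps to width - 1 - x.
    flipped_keypoints = [(width - 1 - x, y) for (x, y) in keypoints]

    return flipped_image, flipped_mask, flipped_bboxes, flipped_keypoints


# Toy 2x4 sample: the object occupies the two left-most pixels of row 0.
image = [[1, 2, 3, 4], [5, 6, 7, 8]]
mask = [[1, 1, 0, 0], [0, 0, 0, 0]]
bboxes = [(0, 0, 2, 1)]
keypoints = [(0, 0)]

new_image, new_mask, new_bboxes, new_keypoints = hflip_sample(
    image, mask, bboxes, keypoints
)
```

After the flip, the mask, the box, and the key point all land on the right side of the image, so the labels still describe the same object. Applying the flip to the image alone would leave every label pointing at the wrong place.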


Face detection, face recognition on 330 million faces at OK.Ru.

https://habr.com/ru/company/odnoklassniki/blog/350566/

This article was originally written in Russian by Alexander Tobol and it was published on March 7, 2018, at habr.com.

In 2018 every student or even every high school kid did a pet project with face recognition. The task becomes much harder if your dataset is not one million people but:

  • 330 million user accounts
  • 20 million photos are uploaded every day
  • 0.2 seconds is the maximum time that you can spend per image
  • your hardware is limited

In this article, we share our experience with the development and deployment of the face recognition system in the social…


http://basementrejects.com/review/ducktales-season-2/

In my previous article, I talked about how researchers can improve the readability of the code that they write.

Today I will talk about something that may be interesting even to those who do not write code yet. This blog post will be about money.

Note 1: There will be a lot of $$$ numbers in the text. All of them are before taxes! Taxes will eat a considerable part of your compensation. There are some ways to minimize the impact, but I will not talk about them.

Note 2: Some links in the post are referral links. I do…


I was contacted by a person who is researching corporate practices concerning Kaggle participation by employees.

I do not take surveys; I prefer more open discussions. In this blog post, I will try to comment on the questions that I got.

How many people at your company are active on Kaggle?

I work at a company called Lyft. We have a few thousand employees, hundreds of whom are Software Engineers, Data Scientists, and Research Scientists. I do not know all of them, and I cannot tell how many of them are active on Kaggle.

136k+ people are participating in competitions on the platform, but there are only 180 Grandmasters and 1398…


Source Comfreak, via pixabay. (Pixabay Licence)

I regularly look at the code that accompanies academic papers and released datasets, and I analyze solutions to Kaggle competitions.

I have great respect for the researchers who share the code to reproduce their results, as well as for the people who participate in Machine Learning (ML) competitions and share their solutions.

Having the code is better than not having it, but I believe its readability could be improved.

The code that I see reminds me of the code that I was writing in academia. I did not like it. This was one of the reasons why I moved to industry…


Hardware in Machine Learning competitions is not everything, but having a good machine helps.

I like Google Colab, I like Kaggle kernels, and I love that Amazon and Google provide credits for AWS and GCP to participants in competitions.

I want to share one more initiative in this direction.

Hostkey is a web service provider.

They have a program in which competitors in Machine Learning (ML) challenges with a proven record of getting to the top in competitions may get a dedicated server with GPUs to use on the problem.

We offer free GPU servers to the winners of…


How I moved from debt collection to self-driving cars


The new year is a time for reflection on our past and planning for the future. Within that spirit, I would like to take the opportunity to recount my journey within the industry as a deep learning (DL) / computer vision (CV) engineer.

In January of 2017, I was working at a company called TrueAccord, located in San Francisco. I was building traditional recommender systems using time series and tabular data. At that time, the internet began to flood with blog posts and papers describing new cutting-edge deep learning models that were even able to outperform humans. …


https://github.com/albu/albumentations

My name is Vladimir Iglovikov. I am a Sr. Computer Vision Engineer at Lyft, Level5, applying Deep Learning to the problems of autonomous vehicles.

I am also a Kaggle Grandmaster and one of the authors of the image augmentation library Albumentations. The library emerged from a set of winning solutions for computer vision competitions by Kaggle Masters: Alexander Buslaev, Alex Parinov, Eugene Khvedchenya, and me.

We released an alpha version of the library about a year ago. Since then, it has been adopted by machine learning engineers and researchers in industry, academia, and of course by the competitive machine learning community…


The importance of data augmentation

From https://habr.com/company/ods/blog/415571/

About a year ago, kaggle.com hosted a computer vision challenge named IEEE’s Signal Processing Society - Camera Model Identification. The task was to identify which camera model was used to capture an image. After the competition was over, Arthur Kuzin, Artur Fattakhov, Ilya Kibardin, Ruslan Dautov, and I decided to write a tech report describing how to approach this problem and share some insights we acquired along the way. The report was accepted to The 2nd International Workshop on Big Data Analytic for Cyber Crime Investigation in Seattle, where I will present it on December 10th, 2018. …



Hello, my name is Vladimir.

After graduating from university with a degree in theoretical physics, I moved to Silicon Valley in search of a data science role in the industry. This led me to my current position in Lyft’s autonomous vehicle division, where I work on computer vision applications.

In the past few years, I have invested a lot of time in Machine Learning competitions. On the one hand, it is pretty fun, but on the other, it is a very efficient way to boost some of your data science skills. I would not say that all the competitions…

Vladimir Iglovikov
