5 Chrome extensions for GitHub to boost your productivity!

Image via Pixabay

For those who have yet to explore Chrome extensions (via the Chrome Web Store): there is now an abundance of extensions to add on, each designed to enhance the interface, your productivity, or the features available in some way. I have recently been setting up a system for my research in applied machine learning, which often targets imagery as the input and output signals. Here, I share the Chrome-specific setup that I found useful. …


Face verification data with priority to balance in gender, ethnicity, and identity.

The Problems of Bias in AI: Labeled Faces and Benchmark Dataset

Our paper, "FR: Too bias, or Not Too Bias?", was published as part of the Workshop of Fair, Data-Efficient, and Trusted Computer Vision, held in conjunction with the 2020 Conference on Computer Vision and Pattern Recognition (CVPR).

The purpose of this brief tutorial is to provide the reader with the following:

  1. A high-level understanding of the paper, answering: So what? Who cares?
  2. My opinion on subjective matters, to spice up the tutorial; this, of course, was not included in our published paper.
  3. A one-stop resource. As part of this effort, we provide not only the paper but also the data, in an easy-to-use structure, and the source code (mostly Python). The code base includes several notebooks exemplifying aspects of this work (i.e., one notebook per figure in the paper, as a means of quickly reproducing it). …


Configure 1x; Connect each time.

It is that easy!

PyCharm, for me, is a great IDE: it is complete with features that promote productive programming, it has a community devoted to sharing clever plug-ins, and, my personal favorite trait, its Professional licenses are free to students. With this, the JetBrains Toolbox with its many IDEs (one for most modern programming languages) is available to students free of charge (no strings attached). Bravo, JetBrains! Free for students is a service that more products should embrace.

With that, let’s move on to the point — working remotely via PyCharm.

With the Coronavirus now an international concern, a vast percentage of professionals must work remotely (myself included). As I set up an iMac (i.e., local machine) to work in sync with a PC running Ubuntu (i.e., remote host), the next step is to configure PyCharm to edit locally and run remotely. There are many reasons one may want to do this — my motivation is to easily deploy jobs to the remote host with GPUs. This is not the first time I have stepped through this process — each time having to recall how it is done. For this, I figured it was worth taking notes and sharing on Medium. …


Command-line tools in production.


In terms of manual labor, preparing data for an ML pipeline takes the majority of the time more often than not. Furthermore, building or extending a database usually costs an astronomical amount of time, subtasks, and attention to detail. The latter led me to find a great command-line tool for cleaning out duplicates and near-duplicates, especially when used with iTerm2 (or iTerm): imgdupes.

Note that the aim here is to introduce imgdupes. See reference for the technical details of specifications, algorithms, options, and such (or stay tuned for a future post on the details).

Problem Statement: De-Duplicating an Image Set

My situation while building a facial image database was as follows: a directory of subdirectories, each containing the images for its respective class. This is a common scenario in ML tasks, as many renowned datasets follow this convention: class samples are separated by directory, both for convenience and as explicit labels. In my case, I was cleaning face data, and each subdirectory was named for the identity of the faces within. …
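To make the directory layout concrete, here is a minimal Python sketch of the simplest version of the problem: finding byte-identical files inside each class subdirectory. Note that this is not how imgdupes works; it swaps in a plain SHA-256 content hash, so it catches exact copies only and no near-duplicates. The function name and the layout assumptions (one level of class subdirectories under a root) are mine, for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Group byte-identical files within each class subdirectory of root.

    Returns a dict mapping (class_name, sha256_hex) to the sorted list of
    paths sharing that content. Only groups with more than one file are kept.
    Assumption: root contains one subdirectory per class (e.g., per identity).
    """
    groups = defaultdict(list)
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for path in sorted(p for p in class_dir.iterdir() if p.is_file()):
            # Hash the raw bytes; identical content yields an identical digest.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[(class_dir.name, digest)].append(path)
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

Because the class name is part of the key, the same image appearing under two different identities is not reported here; that would be a labeling question rather than a duplicate within a class.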


DATA ANALYTICS LIKE A PYTHON PRO

A Personal Favorite 1-Liner

Kungfu Panda

Overview

My last post demonstrated a simple process for evaluating a set of face pairs to determine whether or not the two are blood relatives. Several snippets were breezed over like black boxes. Let us look at one of those snippets, a simple 1-liner: a Python pandas feature I recently learned and now use frequently.

In that line, the DataFrame contains the list of face pairs p1 and p2 (Fig. 1), and features is a dict with filenames as the keys and face encodings as the values.

Fig. 1. df_pairlist.head(5): top 5 of >100,000 rows. Notice that samples are repeated, but pairs are unique (i.e., member 1 of family 7 is the brother of member 9 of family 7, each with various face samples paired; specifically, N-choose-2 face pairs, where N is the total number of faces of the two relatives).

So there you are! In just 1 line of code, we apply a scoring metric to millions of pairs without memory or speed concerns, and while creating a column for scores for pairs as part of the same DataFrame. …
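The 1-liner itself is not reproduced in this excerpt, so here is a hedged sketch of what such a line might look like, based only on the description above: a DataFrame df_pairlist with columns p1 and p2, and a dict features keyed by filename. The cosine similarity metric, the score column name, and the toy filenames and encodings are assumptions for illustration, not the original code.

```python
import numpy as np
import pandas as pd


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy stand-ins (hypothetical filenames); real encodings come from a face model.
features = {
    "F0007/MID1/a.jpg": np.array([1.0, 0.0]),
    "F0007/MID9/b.jpg": np.array([1.0, 0.0]),
    "F0007/MID9/c.jpg": np.array([0.0, 1.0]),
}
df_pairlist = pd.DataFrame({
    "p1": ["F0007/MID1/a.jpg", "F0007/MID1/a.jpg"],
    "p2": ["F0007/MID9/b.jpg", "F0007/MID9/c.jpg"],
})

# The 1-liner: look up both encodings per row, score them, and store the
# result as a new column of the same DataFrame.
df_pairlist["score"] = df_pairlist.apply(
    lambda row: cosine_similarity(features[row["p1"]], features[row["p2"]]), axis=1
)
```

The appeal of this pattern is that the scores land next to the pairs they belong to, so later filtering or grouping needs no extra bookkeeping.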



Visual Recognition of Families In the Wild

Recognizing Families In the Wild

Table of Contents

Problem Formulation

The goal of kinship verification is to determine whether a pair of faces of different subjects are kin of a specific type, like parent-child. This is a classical Boolean problem: the system response is either KIN or NON-KIN (true or false, respectively). This formulates the one-vs-one paradigm of automatic kinship recognition.

Given a pair of faces, the task here would be to determine whether or not the two are a father-son pair.

Overview

This basic demo shows some tricks for using pandas in the Recognizing Families In the Wild (RFIW) data challenge, specifically kinship verification (Task-I). The FIW dataset supports RFIW. The purpose here is to demonstrate how to complete the evaluation in a few easy steps. …
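The Boolean formulation above can be sketched in a few lines of plain Python. This is a minimal illustration and not the official RFIW evaluation code: the function names, the toy scores and labels, the 0.5 threshold, and the use of plain accuracy as the metric are all assumptions made for the example.

```python
def verify_pairs(scores, threshold):
    """Map similarity scores to Boolean decisions: True = KIN, False = NON-KIN."""
    return [score >= threshold for score in scores]


def accuracy(predictions, labels):
    """Fraction of pairs whose predicted decision matches the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example (made-up values): one similarity score per face pair.
scores = [0.9, 0.2, 0.6, 0.4]
labels = [True, False, True, True]   # ground truth: True = KIN
predictions = verify_pairs(scores, threshold=0.5)
```

In practice the threshold would be chosen on held-out data rather than fixed by hand, since the best operating point depends on the face encoder and the relationship type.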


Set it up once, and it should work forever

Photo by Patrick Ward on Unsplash

Motivation (Personal, That Is)

I stumbled upon Medium several months ago. Initially, I found the content broad in scope and the material high in quality. Since then, I have already seen significant improvements in the articles and the overall interface. With that, and as I configure a new iMac for the lab, I figured I would record the process I followed to install and configure Git. I intend to make this the first of many articles, and I thought this simple tutorial would serve as an excellent way to break the ice. I hope you enjoy it.

Let’s now move on to some good, old-fashioned Git for Mac. We’ll get your Mac set up correctly. …


Visual Recognition of Families In the Wild

The Road Ahead (Part I of many)

Training machinery to identify relatives using visual cues like imagery, videos, and such.

Related Links

  1. 2020 Recognizing Families In the Wild (RFIW) workshop webpage
  2. RFIW-2019, RFIW-2018, and RFIW-2017 challenge websites
  3. Families In the Wild (FIW) project page
  4. Kinship Verification: A Python Pandas Tutorial (Part 2 of many)

Overview

My Ph.D. research involves kinship recognition and the FIW dataset. Over the near future, the goal is to summarize key findings while providing lessons and demos of data challenges held at top-tier conferences annually (i.e., this is the first of many to come).



Please check out the updated version: https://towardsdatascience.com/to-recognize-families-in-the-wild-a-machine-vision-tutorial-6d6ed85ca1c4?source=---------6------------------

My PhD research involves kinship recognition and our FIW dataset. Over the near future the goal is to summarize key findings, while providing lessons and demos of data challenges being held at top tier conferences annually (i.e., this is the first of many to come).

To learn more about FIW, visit the project page. To learn more about, register for, and get involved with the upcoming RFIW 2020, visit the workshop page.

The ability to automatically recognize blood relatives (i.e., kinship) through imagery holds promise in an abundance of applications. To name a few areas: forensics (e.g., human tracking, missing children, crime-scene investigations), border-control and security, dislocated refugee families, historic and genealogical lineage studies, social media, predictive modeling, and even as search cues for facial recognition (i.e., …

About

Joseph Robinson

PhD student of Yun Fu and SMILE Lab Northeastern U. Focus: applied ML w emphasis on vision, big data, automatic face understanding. https://www.jrobsvision.com
