Choosing a residency project (ML)

Christopher W. Beitel, Ph.D. · Published in Project Clarify · Dec 3, 2019 · 6 min read

The Project Clarify residency program is getting underway, with planning for work to begin in Q1. Naturally, some guidance is needed on how to choose a good ML project given the context of both (1) the residency program and (2) the broader project.

Of our more than 24 new members, at least 21 (88%) are interested in working on machine learning (see also issues in area/machine-learning). All of our current program participants are undergraduates, with some downward skew in class level (first year, 6/24 or 25%; second, 8/24 or 33%; third, 7/24 or 29%; fourth, 2/24 or 8%). More than 17/24 (over 70%) have estimated they will spend approximately 10h/wk or more contributing to the project in Q1. These numbers represent current participation, against which we expect both some attrition and new growth at the start of Q1. They tell us we need to manage the challenge of dividing and integrating work across a large group, find an appropriate fit for a range of skill sets, and provide a path to building skill in new areas for those interested in doing so.

The broader project context is the improvement of mental (meta-/cognitive and meta-/emotional) training techniques through the application of deep learning for state understanding. These capabilities are meant to be used in randomized controlled trials of training experiences built around state understanding. That is, we are seeking (1) improved technical capabilities for state understanding that (2) are designed with the product/user experience in mind and (3) are translated into use in production. Thus it is essential to understand that we are working to build things for the benefit of real people, not for the objective of publishing papers, training models, etc.

A component of continued success with (1) is building such methods in a way that is technically sound: well tested, documented, and designed to organize the contributions of our various members. A component of (2) is both (a) capabilities that enable new or improved forms of training and (b) capabilities that simply enable measurement of state, both acutely and over the long term, which will improve statistical analyses of traditional training methods.

Before getting into a discussion of particular projects, it's important to understand what we need in ML projects in general as well as how we do things in this project; this will provide clarity on why certain kinds of projects are a better fit for people with certain interests or skills than others. These needs are summarized below:

  1. Datasets: Naturally we need to identify and obtain data sets that can be used for training; for some data sets, doing this the right way can be time-consuming, as in the case of the Facial Expression Correspondence (FEC) dataset, and awareness of this is important for those considering breaking new ground. It is important that the code for obtaining and preparing such data sets be well tested and documented so that others can reuse it.
  2. Examples: A further non-trivial step once a data set is obtained is preparing training examples from it. In this project we use tensor2tensor Problem definitions (see also), such as those for FacialExpressionCorrespondence and VGGFace2 pre-training, which in turn extend the TripletImageProblem and PCMLProblem base classes (a minimal sketch of this pattern appears after this list).
  3. Models, losses, and their hyper-parameters: Deep learning research is more or less the process of choosing models, losses, and hyper-parameters that “work”, examining the results, and making changes that yield improvements. In this project we use tensor2tensor Model objects to define our models in a usefully structured way; see our primary triplet embedding model object and the ResNet implementation it uses for processing images (a generic sketch of the underlying triplet loss follows this list).
  4. Evaluation: A means of evaluating your model is important. Typically models are trained on one subset of the data and evaluated during training and tuning on another subset; then, when you’re ready to report results, the model is applied to a third data set disjoint from the others, preferably from a different acquisition with sufficiently different statistics and other properties to emulate the process of using the model in other contexts.
  5. Qualitative analysis: When possible we also like to have a means of qualitative analysis that complements performance metrics with more intuitive indicators of model performance. In the case of our FEC/FEx understanding model we have so far accomplished this with image similarity lookup (a minimal lookup sketch follows this list); the relevant means for audio, EEG, etc. are not yet established for this project.
  6. Application demo: One of the easiest ways to communicate the value of a new or improved ML capability to a non-ML audience is by building a Colab-based interactive demo they can experience and use to intuitively grasp the potential; our current capabilities demo notebook is an example of this. A key point of strategy is to establish technical foundations that let us accomplish both steps (6) and (7) more rapidly.
  7. Demo application deployment: The last step in the chain of responsibility of the ML research portions of Project Clarify is deploying new and improved ML capabilities into production in the form of additions to our demo application running at ai4hi.org. Doing so is not only a close approximation of the way these capabilities will be used in studies but also an important way to further communicate to collaborators and stakeholders the immediate usefulness of what is being accomplished in the ML program. Our current methods for running models in production revolve around Cloud Functions that run in response to the arrival of modality (e.g. video) data in Firestore from the front-end; see this. We are updating the current approach from Eager inference within the Cloud Function (which is slow both because inference runs on a single CPU and because model and code dependencies and parameters must be staged in) to queries from those functions to GPU- or TPU-backed models served on Kubernetes with the Kubeflow TFServing module (a rough sketch of this pattern follows this list).
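
To make a few of the items above concrete, here are some minimal sketches. First, the Problem-definition pattern from item (2). This is not the actual FacialExpressionCorrespondence or TripletImageProblem code from our repository; the class name, feature keys, and the _triplets helper below are illustrative assumptions, but the registry/generate_data structure is the standard tensor2tensor idiom for turning raw data into sharded TFRecord training examples.

```python
from tensor2tensor.data_generators import generator_utils
from tensor2tensor.data_generators import problem
from tensor2tensor.utils import registry


@registry.register_problem
class ExampleTripletImageProblem(problem.Problem):
  """Illustrative triplet problem; names and feature keys are assumptions."""

  NUM_TRAIN_SHARDS = 10

  def _triplets(self, tmp_dir, dataset_split):
    # Placeholder for the real work: download and parse the source data
    # (e.g. the FEC CSVs) into tmp_dir and yield (anchor, positive,
    # negative) encoded-image byte strings for the requested split.
    for _ in range(100):
      yield b"anchor-jpeg", b"positive-jpeg", b"negative-jpeg"

  def _sample_generator(self, tmp_dir, dataset_split):
    # generator_utils.to_example expects list-valued features.
    for anchor, positive, negative in self._triplets(tmp_dir, dataset_split):
      yield {
          "image/a/encoded": [anchor],
          "image/p/encoded": [positive],
          "image/n/encoded": [negative],
      }

  def generate_data(self, data_dir, tmp_dir, task_id=-1):
    # Materialize train and eval TFRecord shards from the sample generators.
    generator_utils.generate_dataset_and_shuffle(
        self._sample_generator(tmp_dir, problem.DatasetSplit.TRAIN),
        self.training_filepaths(
            data_dir, self.NUM_TRAIN_SHARDS, shuffled=False),
        self._sample_generator(tmp_dir, problem.DatasetSplit.EVAL),
        self.dev_filepaths(data_dir, 1, shuffled=False))
```

Our real problems add quite a bit on top of this (decoding, preprocessing, dataset-specific download logic), but the shape of the work is the same.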
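Second, item (3). Our primary triplet embedding model is a tensor2tensor Model object built on the t2t ResNet; rather than reproduce that here, the sketch below shows only the generic triplet margin loss at its core, with the margin value and L2 normalization as assumptions rather than the project's actual settings.

```python
import tensorflow as tf


def triplet_loss(anchor, positive, negative, margin=0.2):
  """Standard triplet margin loss over [batch, dim] embedding tensors."""
  # Normalize so squared Euclidean distance is a monotone function of
  # cosine similarity.
  a = tf.math.l2_normalize(anchor, axis=-1)
  p = tf.math.l2_normalize(positive, axis=-1)
  n = tf.math.l2_normalize(negative, axis=-1)
  pos_dist = tf.reduce_sum(tf.square(a - p), axis=-1)
  neg_dist = tf.reduce_sum(tf.square(a - n), axis=-1)
  # Encourage the anchor to sit closer to the positive than to the
  # negative by at least the margin.
  return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))
```

In practice the same ResNet tower embeds all three images, and choices like the margin, the embedding dimensionality, and how triplets are sampled are exactly the kind of hyper-parameters item (3) is about.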
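Third, the qualitative analysis of item (5). Image similarity lookup amounts to embedding a query image and ranking a gallery of embedded images by distance; the function below is a minimal, generic sketch (the dot-product ranking assumes unit-normalized embeddings and is not necessarily how our notebook does it).

```python
import numpy as np


def nearest_neighbors(query_embedding, gallery_embeddings, k=5):
  """Indices of the k gallery embeddings closest to the query.

  Assumes L2-normalized numpy arrays of shape [dim] and [num_gallery, dim],
  so ranking by dot product matches ranking by Euclidean distance.
  """
  scores = gallery_embeddings @ query_embedding
  return np.argsort(-scores)[:k]
```

In a Colab demo the returned indices are then used to display the corresponding gallery images next to the query, which is what makes model quality intuitively graspable to a non-ML audience.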
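Finally, item (7). The direction described above can be sketched roughly as a Firestore-triggered Cloud Function that delegates inference to a served model over TF Serving's REST API instead of running it in-process. This is not our actual function: the endpoint URL, model name, and document field names are assumptions made for illustration.

```python
import requests

# Hypothetical TF Serving endpoint, e.g. a GPU- or TPU-backed deployment on
# our Kubernetes cluster via the Kubeflow TFServing module.
TFSERVING_URL = (
    "http://tfserving.example.internal:8501/v1/models/fex_embedding:predict")


def on_frame_written(data, context):
  """Firestore-triggered Cloud Function handling a newly arrived frame.

  Assumes the front-end wrote a base64-encoded JPEG into a `frame` field
  of the new document; `data` is the Firestore event payload.
  """
  fields = data["value"]["fields"]
  frame_b64 = fields["frame"]["stringValue"]

  # Delegate inference to the served model rather than doing eager
  # inference on the function's single CPU.
  response = requests.post(
      TFSERVING_URL,
      json={"instances": [{"image_bytes": {"b64": frame_b64}}]},
      timeout=10)
  response.raise_for_status()
  embedding = response.json()["predictions"][0]
  # ...write the embedding back to Firestore for downstream use.
  return embedding
```

The payoff of this shape is that model upgrades become a serving-side change, and the function itself stays small and fast.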

There are two benefits to outlining these. One is to say that if you are primarily interested in ML theory, or more narrowly in concerns related to improving architectures, loss functions, and training techniques, it would probably be better for you to focus on an end-to-end data/model/eval/demo setup we already have or will soon have in place, such as for image- or video-based expression understanding. Likewise, if you are more interested in studying existing models (such as through the development of data visualization or qualitative analysis techniques), or if you are more interested in developing demonstrations that make use of existing capabilities, this first case is probably a better fit for you. The other benefit is for those interested in breaking new ground on the EEG understanding portions of the project: it is important that you see the big picture and work to put in place high-quality components for each of these.

That is, to summarize, a good project (A) pursues a new technical capability with a specific product goal in mind, (B) spans all of 1–7 above, and (C) improves upon what was available before in one or more of quality, informativeness, speed, or cost. Tweaking a setup such as the one we already have for FEx understanding (demo notebook) will give you a greater opportunity to build your skills in deep learning, whereas breaking new ground on EEG understanding (after working with me and your peers to establish a plan for (A)-(C)) will present a more diverse set of engineering challenges and will firstly be concerned with establishing 1–7 before going deep on what some would consider the more interesting aspects of (3).

Currently open epics in area/machine-learning can be viewed at bit.ly/pmcl-areaml-epics. A good approach is to discuss these generally in Slack, to have specific technical discussions on the component issues, and to keep up with discussions about these in our weekly SIG meetings (in this case, SIG-ML).

Opportunities in other domains can be found by filtering open issues by area (such as area/platform, area/feedback-experiences, and area/systems). A post similar to this one regarding the philosophy and plan for our web platform and feedback experiences is forthcoming, so stay tuned. A separate post will also collect resources for junior contributors to get up to speed with technologies of interest to them.

Christopher

Founder, Project Clarify

