Looking at your examples (e.g., white, male resumes), I'd say there are both explicit and implicit biases that affect data. Explicit biases are easy to spot, such as a name or a gender marker; a professional organization, especially one that builds a learning model, would be well advised to scrub personal information from the data before handing it to the 'curators', to reduce bias. Implicit biases, though, are harder to reason about, since they show up as patterns in the data, such as a tendency to favor resumes from candidates who took a particular program at a particular school, which happened to be in a predominantly white city. As Isaac Wolkerstorfer mentioned, removing explicit biases should be the absolute minimum, and perhaps a good middle ground is to try to separate out any explicit biases that may be embedded in the implicit biases we find.
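
To make the "scrub explicit identifiers" step concrete, here's a minimal Python sketch. The field names and regex patterns are my own toy assumptions, not a real pipeline; a production system would use a curated PII-detection tool. Note that it only touches explicit markers, which is exactly the point: implicit proxies like school or city pass straight through.

```python
import re

# Hypothetical list of explicit gendered terms and titles to redact.
# A real scrubber would rely on a proper PII-detection library instead.
GENDER_TERMS = re.compile(
    r"\b(he|she|him|her|his|hers|mr\.?|mrs\.?|ms\.?)\b",
    re.IGNORECASE,
)

def scrub_resume(record: dict) -> dict:
    """Drop explicit identity fields and redact gendered terms in free text.

    `record` uses an assumed schema: {"name", "gender", "text", ...}.
    """
    # Remove explicit identity fields entirely.
    scrubbed = {k: v for k, v in record.items() if k not in {"name", "gender"}}
    # Redact gendered terms that appear inside the free-text body.
    scrubbed["text"] = GENDER_TERMS.sub("[REDACTED]", record.get("text", ""))
    return scrubbed

resume = {
    "name": "John Smith",
    "gender": "male",
    "text": "Mr. Smith led his team through a major migration.",
    "school": "Example University",  # an implicit proxy; survives scrubbing
}
print(scrub_resume(resume))
# The surname "Smith" still leaks through the free text, and "school"
# is untouched -- naive scrubbing handles only the explicit layer.
```

Even this toy example shows the limits of the "absolute minimum": the surname leaks through the free text, and proxy fields like school or city survive untouched, which is why separating the explicit bias out of the implicit patterns is the harder half of the problem.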