Capital One Data Science Interview Questions

Capital One had $105.293 billion in credit card loans outstanding in the United States as of December 31, 2017.

Capital One is ranked 10th on the list of largest banks in the United States by assets. The bank has 755 branches, including 30 café-style locations, and 2,000 ATMs. Its volume of credit card transactions generates a huge quantity of data, which makes it an attractive place for Data Scientists who want to move into fintech. Capital One has offices in multiple countries, which makes it accessible to Data Scientists globally.

Interview Process

Their interview process consists of a multiple-choice Data Interpretation test, followed by a case interview and then in-person interviews. The process is fairly straightforward, and you can expect the datasets and questions to be finance-centric.

Important Reading
Source: Streaming Pipeline
AI/Data Science Related Questions
  • How would you build a model to predict credit card fraud?
  • How do you handle missing or bad data?
  • How would you derive new features from features that already exist?
  • If you’re attempting to predict a customer’s gender, and you only have 100 data points, what problems could arise?
  • Suppose you were given two years of transaction history. What features would you use to predict credit risk?
  • Design an AI program for Tic-tac-toe.
  • Explain how RDDs work with Scala in Spark.
  • If you have 70 red marbles, and the ratio of green to red marbles is 2 to 7, how many green marbles are there?
  • What would the distribution of daily commutes in New York City look like?
  • Given a die, would it be more likely to get a single 6 in six rolls, at least two 6s in twelve rolls, or at least one-hundred 6s in six-hundred rolls?
  • How would you ‘disjoin’ two arrays (like JOIN for SQL, but the opposite)?
  • Create a function that does addition where the numbers are represented as two linked lists.
  • Create a function that calculates matrix sums.
  • How would you use Python to read a very large tab-delimited file of numbers to count the frequency of each number?
  • What is Hadoop serialization?
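
To make a few of the coding and probability questions above concrete, here is a rough Python sketch of how one might approach them. It is only one possible reading of each question, not an official answer: I am assuming "a single 6" means at least one 6, that "disjoin" means keeping the elements that appear in only one of the two arrays, and that the linked lists store digits least-significant first. All function and class names are made up for illustration.

```python
import csv
from collections import Counter
from math import comb


def prob_at_least(k, n, p=1 / 6):
    """P(X >= k) for X ~ Binomial(n, p), e.g. at least k sixes in n rolls."""
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


def disjoin(a, b):
    """Assumed meaning of 'disjoin': items found in exactly one of the arrays."""
    set_a, set_b = set(a), set(b)
    return [x for x in a if x not in set_b] + [x for x in b if x not in set_a]


class Node:
    """Singly linked list node holding one decimal digit."""

    def __init__(self, digit, next=None):
        self.digit = digit
        self.next = next


def from_int(n):
    """Build a list for a non-negative int, least-significant digit first."""
    head = None
    for ch in str(n):  # most-significant digit first, each one prepended
        head = Node(int(ch), head)
    return head


def to_int(node):
    """Convert a least-significant-first digit list back to an int."""
    value, place = 0, 1
    while node:
        value += node.digit * place
        place *= 10
        node = node.next
    return value


def add_lists(a, b):
    """Add two numbers stored as digit lists, carrying like schoolbook addition."""
    dummy = tail = Node(0)
    carry = 0
    while a or b or carry:
        total = carry + (a.digit if a else 0) + (b.digit if b else 0)
        carry, digit = divmod(total, 10)
        tail.next = Node(digit)
        tail = tail.next
        a = a.next if a else None
        b = b.next if b else None
    return dummy.next


def count_numbers(lines):
    """Count each tab-separated token while streaming, one line at a time.

    Tokens are compared as strings, so "1" and "1.0" are counted separately.
    """
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        counts.update(field for field in row if field)
    return counts
```

For the dice question, comparing prob_at_least(1, 6), prob_at_least(2, 12) and prob_at_least(100, 600) shows the probability shrinking as the number of rolls grows, so under this reading the first option is the most likely. For the large file question, passing an open file handle (for example, with open(path, newline="") as f: count_numbers(f)) keeps memory use proportional to the number of distinct values rather than the file size.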
Reflecting on the Questions

Capital One’s Data Science stack includes Hadoop. They also use Spark, with Scala as one of the languages. They have a trove of data, and the data science team has been around for a long time, which brings a higher level of maturity to the Data Science projects they take on. Data is embedded in their product, which makes the team central rather than an afterthought. Studying fintech datasets thoroughly can help you land a job at the fourth largest credit card bank in the world.

The sole motivation of this blog article is to learn about Capital One and its technologies and to help people get hired there. All data is sourced from public online sources. I aim to make this a living document, so updates and suggested changes are always welcome. Please provide relevant feedback.