Preparing for Programming interview with Python
I have been preparing for a few interviews in tech/finance companies (Google, Goldman Sachs and alike). I am a final year computer science PhD student at National University of Singapore, and my PhD did not have much programming component to it. While preparing, I have gathered a few pointers for going over again and again before the interview — so when I say “general tips” I mean tips for me to remember. Hope it helps.
General Tips
- Watch the excellent technical interview tips from Google engineers and interview tips from Google recruiters. Not many companies take this much effort to clarify their interview process.
- Follow the Udacity course on preparing for technical interview using Python. This course is the comprehensive version of technical details we would discuss here.
- Read very succinct tips by Philip J. Guo and probably wander around internet some more.
- Do not open links from click bait news articles (e.g. from businessinsider and forbes).
- The most comprehensive guide towards programming practice is Google interview university. Think of it as a reference material, do not get overwhelmed by it. The resources for system design questions are pretty good there.
- I have used Python as my first preference. Some Googlers suggest to use C++/Java, but the unanimous decision is: choose the language you are most comfortable with, unless the job description has a requirement. A formal interview coach from Google ensured that Python is fine, and often helps in conveying basic understanding faster in an interview.
- Stick to one interview practice platform and do the problems thoroughly. I highly recommend leetcode. Do not use platforms that focus on programming contests. If you are practicing for phone interview, use Google Docs for practice and then copy paste into the platform (or your personal editor). For face-to-face interview (both hangout and on-site), practice programming on a whiteboard. It frustrates at first, but helps a lot as well. If you have a fancy editor with code linting, autocomplete and what not, do not open it till the interview.
- In the interview, ask questions and clarify every single thing no matter how small. Validate assumptions of data structure and complexity beforehand. In this document, whenever you see 🙋, it gives an example of a context where asking questions to the interviewer is common.
- Practice every problem as if it is an interview. Solving them right as early as possible is not the goal (contrary to programming competitions), communication is key. Do not hand waive to yourself. See my solutions to some leetcode problems, each tackled as if it was an interview. Each file has the details written as comment.
- Start with listing down examples that will cover most edge, extreme and common cases.
- A few type of problems are common across all interview related platforms, books and videos (which probably implies they are common in interviews too). The main four types of problems one should focus are: 1. String/array manipulation, 2. Tree structures, 3. Graph structures, 4. Number manipulation. Each has many different subtypes, look for them below with 💡. This list is by no means exhaustive.
String/Array manipulation
This is the most diverse area of possible questions and an easy source of questions as well. It is easy to ask and explain string/array related questions over phone too, so paying attention to this section helps.
One should know the following algorithms/structures, their complexity, example of applications and implementations.
- Bottom-up mergesort.
- Quicksort.
- Priority queues.
- Double-ended queues.
- Stacks.
- Permutation/combination.
- Suffix array (Manber and Meyers).
- Aho Corasick Algorithm.
- KMP algorithm.
The following are some class of techniques that can use any of the aforementioned algorithms/data structures.
💡 Subarray related problems
Given an array often consisting of integers (rarely it may involve only non-negative integers 🙋), a subarray is a contiguous slice of that array. Rarely it is allowed to be empty 🙋. You may have to find the subarray with maximum sum (Kadane’s algorithm) or maximum product, or maybe the minimum size subarray which has sum greater than a given target.
Most interview problems in this category can be solved in linear time without extra space. The trick is often to keep a window of subarray and growing/shrinking the subarray depending on the context. You may have to
- start the subarray with size zero starting at the left. Often when the array is sorted (or had to be sorted), the subarray may start as full array, with two pointers at start and end shrinking the subarray depending on the context (a slightly different type of example: in a sorted array, find two values that add up to a target).
- Keep a global return value (e.g. the sum) and a local one. Often you may have to keep two local variables (e.g. min and max product).
If it does not seem obvious at start, chances are, it needs O(n²) solution and you should start with that as early as possible 🙋. See if you can sort the array and then solve it with O(n) passes, thus giving a better O(n log n) solution.
💡 Subsequence related problems
Subsequences are not contiguous, but the order of the elements in the original array is kept. Finding if a string is a subsequence of another string is the easiest form, but you may have to find the longest common subsequence (the dynamic programming problem every textbook has), longest increasing subsequence, or a quirky longest wiggle subsequence. These problems often only asks you to return the length of the subsequence. However, a good followup question is to find the subsequence itself. You may have to
- First clearly state the recursion of the solution.
- Keep a cache of smaller subarray results (umbrella term: dynamic programming). In Python, you may use a decorator-styled cache that works for a lot of cache related problems, but in interviews just keep a matrix/list. See an excellent roundup.
- You should expect a O(n²) solution, though some clever algorithms may achieve O(n log n) solution using log-search techniques.
- Try to see if greedy methods work (they usually do not).
💡 Log-search Techniques
Log-search technique is a term I just made up for problems that reduces at least one O(n) to O(log n) by traversing a search space chopping half of it at every step. For a list, the search space can be the set of indexes, or even the range of values (e.g. if the min and max values of the array are known). The most common form is binary search in a sorted array.
💡 Interval related problems
Given a set of intervals, the problem often involve understanding overlaps or finding the optimal size for a given feature. Almost always, you need to sort the intervals based on their end (or start), so unless a very obvious problem is asked, you can start with trying for a O(n log n) solution 🙋. Usually greedy approaches work after that. Example problems include finding minimum number of intervals to remove to make the rest non-overlapping and minimum number of arrows needed to burst balloons in 2D.
Number manipulation
Most number related problems are either from discrete mathematics or bit manipulation.
- Modular arithmetic.
- Modular multiplicative inverse (when modulus is prime).
- Bit manipulation using operators.
💡 Bit manipulation techniques
Unless you are a natural and/or have done intense practice, it is pretty hard to do correct bit manipulation without help from a compiler. Nonetheless, at least know very basic definitions and tricks using bit wise (different from usual) AND (&), OR (|), and most importantly XOR (^) and bit-shift (<< and >>). Basic should-know tips (n being the number) include
- n&(n-1) switches of the last set bit.
- n^n cancels all bits pairwise. It is often useful when cancelling duplicates (e.g. finding the single number). This trick can often get out of hand real quickly, though (e.g. finding the single number when every other element appears thrice).
- Use a mask value to find out the set bit at certain index or limiting an integer to 32-bit (specially in Python) 🙋.
- In particular, n&1 is True if and only if n is odd.
Some more areas
Problems that involve basic probability are fairly common. Although most programming interviews do not ask for probability puzzles (quant positions in finance institutions may do), one should know reservoir sampling, bloom filter and similar data-structure-oriented probabilistic algorithms.
Common Python tips that will come handy
Lists:
- Python lists can be used to implement array, stack, queue right off-the-shelf. There is a better Queue module as well, but stick to basic data structures in interview as much as you can.
- You can iterate both front and backwards of a list easily, even with an optional hop. It never reaches the stop, so you can think of range as [start,stop) with hop. If start > stop, returns empty list.
- Slice a list easily with the colon (:) operator. It creates a new copy of the slice, so be careful about the space complexity.
- Use list comprehension, but not complicated ones (according to Google Python Style Guide).
- Strings are technically list-style, but they are immutable. So iterating and slicing works but not append/pop etc.
Dictionary:
- The Python dictionary data structure is as universally useful as it gets. It works out-of-box as an excellent hashmap.
- Use defaultdict from collections for saving some interview time initializing values for new keys.
- To create a hashmap of elements of a string or list (some other iterables will work too), use Counter from collections. These are simple techniques to save a few error-prone lines when time is essential. Caution: be careful about traversing a Counter, some functions from usual dict may not work.
Loops:
- Loops in Python are same as other languages, if not easier. You’d use while and for loop, but try to use for loop only, as missing to increment the counter is a common mistake. Always push the increment to as later as possible (see line 4 in the following code snippet).
- Do not put too many return statements inside a loop (again, Google Style Guide), and always check if one last return is needed after you just ended the loop.
- Use break/continue, they are useful. Break will get you out of the loop as continue will throw you back at the loop invariant. You should not need to use pass much.
- Python has a cute else statement after the loop, which executes if you finish the loop normally (say, without a break). This can be very useful for shorthand coding, but do mention this to the interviewer 🙋as it looks like part of a faulty if-else at first glance.
Basic Data structures:
- Linked lists and Binary trees are the most common data structures you would need in an interview. It is straightforward to implement them in Python.
- While loops are useful to loop through a linked list or traverse a tree (e.g. while node.next is not None).
- A graph is most often represented as an adjacency list in terms of Python dictionaries. For undirected graphs, same edge will be counted twice.
- Breadth-first and depth-first search are absolutely simple in Python using a queue and a stack respectively. Check out an excellent roundup.
Random Python quirks
- Use float(“inf”) and float(“-inf”) for initializing a far bigger and smaller value 🙋.
- The way python checks for emptiness/None/zero values is not obvious. For example, if not x: will evaluate to True for x being an empty list, None, 0, and some others. While this syntax is recommended, go with explicit syntax (e.g. x != 0:) if you are even slightly uncertain.
- In usual cases, comparing two variables can be done in two ways: a == b or a is b. They serve different purposes (read up string interning for more), hence to avoid confusion, use a == b when comparing the values of variables. However, if x is not None is pretty acceptable way to check if a variable is set to None.
Any suggestion is more than welcome!