Solving Leetcode Hard Problems with ChatGPT?

How does it do?

Published in

The Research Nest

5 min readDec 10, 2022

While ChatGPT is impressive, it looks like it doesn’t give correct answers easily to complex problems. I tried solving the first 10 Leetcode hard problems (tagged under top interview questions) using ChatGPT to verify the same.

You can find the problems here: Problem Set. Some of these include famous problems like trapping rainwater and sliding window.

I won’t waste your time in this article showing all the prompts, code, and responses ChatGPT gives. Instead, I will condense my observations and learnings from the overall experience. The basic approach is to give the entire or part of the Leetcode question as a prompt to ChatGPT.

Here’s a summary of the results.

Median of two sorted arrays — First try, direct solution, no code change/further analysis required. This was an easy question, either way. Finished in a minute.
Regular expression matching — First try, gave a running code. However, it passed only 287/353 test cases. The code failed for an edge scenario where the pattern is longer than the string. That’s pretty impressive too. Then I retried with ChatGPT, including all the examples, and looked for an optimized solution. It came up with an approach using dynamic programming, and it passed all the test cases without any code change. Finished in ~15 minutes, as I was trying different modifications to arrive at a DP solution. Do note that I did not mention it to use DP anywhere.
Merge k sorted lists — First three tries, it doesn’t give the correct solution. There were syntax errors as well. Initially, it tried to give a solution using heaps. Then I refresh and modify the prompt and keep it short and straightforward. Like that, I get a solution where it first creates a helper function to merge two lists and uses it to merge all the lists. No code change was required. All test cases passed. Finished in ~10 minutes.
First missing positive — 5th retry, and it works! No code change was required. All test cases passed. Finished in ~6 minutes. The first few solutions it gave were all over, having either syntax errors or not passing the test cases. Then I refresh everything and try to add the entire question in the prompt with the constraints, and boom, I get an elegant solution.
Trapping rain water — This is one of the famous and trickier questions. And GUESS WHAT! First try, no code change and ALL TEST CASES PASSED. Finished in under 1 minute. I expected it as it is a famous problem and can arguably have a lot of info in the training data. Here’s a gist of the two-pointer approach that it uses:
“To solve this problem, we can use a two-pointer approach. We will keep two pointers at the beginning and end of the array, and move them inward until they meet. At each step, we will compute the amount of water that can be trapped by the two bars at the pointers.
To compute the amount of water that can be trapped, we will take the minimum of the two bars at the pointers and subtract the height of the bar at the lower pointer. This will give us the height of the water that can be trapped. We will then multiply this by the distance between the two pointers to get the total amount of water that can be trapped.”
Wildcard matching — Second try, it gives an elegant recursive solution that passes 1616/1811 test cases. Then it throws a time limit exceeded error. I asked ChatGPT to come up with a more optimized solution. It generates a solution using dynamic programming. All test cases pass without any code change required. Finished in about ~3 minutes. Looks like ChatGPT is pretty good with dynamic programming.
Minimum window substring — This one’s crazy. Finished in ~20 seconds. I didn’t even read the whole problem or the solution. 1st try, all test cases passed. Things are getting interesting. I thought I would have to fine-tune my approach to more challenging problems. Looks like that is not the case.
Largest rectangle in histogram — The streak continues. It generated a solution using a stack with more than 40 lines of code. 1st try, It passed all test cases. It finished in under a minute.
Binary tree maximum path sum — It first gives some recursive solution. I ask it to try to optimize using DP directly. It does so. However, the solution passes only 62/94 test cases. This time, I refresh the conversation, prompt the question again, and explicitly ask it to solve in DP. It does, and the code passes all test cases. Finished in 5 minutes.
Word ladder — Second try, and it generates a solution using deque. I just had to change some variable names to get the code running. All test cases passed. Finished in ~3 minutes.

These results exceeded my expectations. I was able to solve ten problems in under 60 minutes with minimal code changes required. Here are a few observations I made.

For some questions, it is effective to give example inputs, outputs, and constraints in the prompt. For some other questions, just the direct question works. If one doesn’t work, refresh and try the other.
You can generate the required responses faster if you understand the technique you can use for a given problem statement. For example, if you know that DP or BFS can solve a problem, you can ask ChatGPT to use those techniques immediately.
It is possible to resolve errors or modify code for edge cases by explicitly asking for the same in the chat.
If you understand code, you can probably tell at a glance if it would work. If you feel it is going in the wrong direction, you can quickly refresh the conversation and nudge it correctly. It will likely come around to give a correct answer.
One thing I find pretty intuitive is how it tries to give explanations and writes code comments. It helps me immensely understand the logic and verify if it is plausible.

I want to try something even more challenging next time. #StayTuned

Perhaps, I will test the waters with competitive coding and Google Code Jam.

Solving Leetcode Hard Problems with ChatGPT?

How does it do?

Written by XQ