GitHub Copilot X will NOT replace developers

Bharat Ruparel
Data Science at Microsoft
10 min read · Dec 26, 2023

For the last six months or so, I have been using GitHub Copilot X, and now I don’t think I can write code without it. It simplifies my work a lot. But it is not a replacement for thorough, systematic thinking and debugging, the careful work of isolating and removing errors. GitHub Copilot cannot do that for you.

Many developers are unnecessarily worried about losing their jobs to it. But remember, it is literally your assistant — the copilot — and you are the captain — the pilot. I will show you how I learned this myself when I tried to deploy and debug an Azure OpenAI embeddings model in two different Azure regions.

First deployment

I have been experimenting with OpenAI tools a lot recently, specifically Azure OpenAI, because I work for Microsoft. This exercise is about writing the code for creating vector embeddings. For a simple explanation of vector embeddings, see Kirk Kirkconnell’s Medium article “What are vector embeddings?”. For a more thorough explanation, Pinecone’s “What are Vector Embeddings” is a good reference. Briefly, vector embeddings are numerical representations of words and sentences that facilitate semantic search.

My goal was to write a simple toy program in Python to produce a vector embedding, a semantic representation, for the word “cat”. When run correctly, the program should output a list of numbers. Here is a screenshot of my Visual Studio Code (VS Code) window that shows the lines of the program:
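The screenshot does not reproduce here, so what follows is a minimal sketch of the program, assuming the v1 openai Python SDK. The environment variable names, endpoint URL, and API version are illustrative stand-ins, and the code is laid out so that the line numbers referenced below roughly line up:

```python
import os

import dotenv
import openai

dotenv.load_dotenv()

api_key = os.getenv("API_KEY")
azure_endpoint = os.getenv("AZURE_ENDPOINT")
deployment_name = os.getenv("DEPLOYMENT_NAME")

# api_key = "<key copied from the Azure portal>"
# azure_endpoint = "https://bharat-text-embedding.openai.azure.com/"
# deployment_name = "embed_demo"

# When uncommented, the hard-coded values above override the ones
# read from the .env file.

client = openai.AzureOpenAI(
    api_key=api_key,
    api_version="2023-05-15",
    azure_endpoint=azure_endpoint,
)

response = client.embeddings.create(input="cat", model=deployment_name)
print(f"Deployment name: {deployment_name}")
print(response.data[0].embedding)
```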

There is nothing complicated about this “toy” Python program. It imports the necessary Python packages (lines 1 to 4) and then loads the necessary environment variables (line 6) to set up the openai.AzureOpenAI call (lines 19 to 23) that creates the openai client object. Finally, it uses the client to create the embedding on line 25 using the Azure OpenAI deployment. I have purposely left the commented-out code in lines 12 to 14 that uses hard-coded values which, when uncommented, override those loaded from the “.env” (called dot env) file shown below:
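Using the same illustrative variable names as in the sketch above, the dot env file would look something like this:

```
API_KEY=<key copied from the Azure portal>
AZURE_ENDPOINT=https://bharat-text-embedding.openai.azure.com/
DEPLOYMENT_NAME=embed_demo
```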

You can compare the values shown in the dot env file above to the hard-coded values in the previous screenshot and check that they are the same. Also, please note that I show the key values so that you can compare the ones in the program code and the dot env file with those shown in the Azure portal. I will be deleting both the deployments and the instances of Azure OpenAI before this article is made publicly available.

Azure lets you create multiple deployments of a single instance of the OpenAI service. In other words, one instance of the service can have many deployments, depending on the models and versions. The following screenshot shows an instance of the Azure OpenAI service in my subscription:

Note that either of the keys (Key 1 or Key 2) can be used to access this resource but there is only one endpoint that makes it accessible from the outside world. We can navigate to the deployments screen by either clicking on the “Model deployments” menu option, or from the “Overview” menu option and choosing the “Go to Azure OpenAI Studio” option as shown below:

Please make a note of the Location (East US) where the Azure OpenAI resource is deployed. We will create a new instance of the Azure OpenAI resource in the North Central US region to experiment further.

The following screenshot shows the existing deployment for this instance of the Azure OpenAI service:

Please make a note of this deployment name, embed_demo, which is the same as shown in the program code screenshot as well as the dot env file screenshot above. Also, you can use the “Create new deployment” button to create additional deployments for this instance of the Azure OpenAI service. Typically, you create additional deployments to test different models and/or different versions of the same model.

Let’s try to run the program shown in the first screenshot in VS Code Terminal. The following screenshot shows the result:
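The screenshot’s content is roughly the following (the file name app.py and the numbers shown are illustrative, and the vector is truncated):

```
$ python app.py
Deployment name: embed_demo
[-0.0070, -0.0024, 0.0014, -0.0153, 0.0089, ...]
```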

This is encouraging. You can see that the deployment name is embed_demo, and the generated output shows (partially) the embedding as a vector (represented in Python as a list of numbers). In fact, this list contains 1536 numbers, which is the number of dimensions of the text-embedding-ada-002 model. But that is beside the point.

To deploy the service in the North Central US region, we must create a new instance of the Azure OpenAI service because each instance is tied to a specific region. This is a common scenario in real life: clients have preferences for specific regions based on their location.

We do this in the following section.

Deploying the model in the North Central US region

As mentioned above, we need to create a new instance of the Azure OpenAI service so that we can place it in the “North Central US” region. This is shown in the following screenshot:

Also, please note that I have given the new Azure OpenAI instance an appropriate name to disambiguate it from the previous one. Upon successful creation, we have two instances of the service deployed in two different regions as shown below:

Let’s create a deployment for the new instance of the Azure OpenAI service. We do that using the Azure OpenAI Studio as shown below:

In the Azure OpenAI Studio, we select the same embedding model (text-embedding-ada-002) and the same (default) version 2 and give the deployment an appropriate name, embed_demo_north. Upon its successful deployment, we see it as shown below:

With this done, we are ready to test and adapt our program to use the new Azure OpenAI instance (bharat-text-embedding-north) with its new deployment (embed_demo_north). To be on the safe side, I chose to copy the existing code to a new file (app_north.py) in the same folder, as shown below:
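In terms of the earlier sketch, the only lines that change in app_north.py are the commented-out hard-coded values, updated for the new instance (the endpoint URL is again an illustrative stand-in):

```python
# api_key = "<key copied from the new instance in the Azure portal>"
# azure_endpoint = "https://bharat-text-embedding-north.openai.azure.com/"
# deployment_name = "embed_demo_north"
```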

Please note that the hard-coded values of api_key, azure_endpoint, and deployment_name have been updated to match the new Azure OpenAI instance and its deployment. Again, I show the commented-out hard-coded values for a reason that becomes clear shortly. I copied the key and endpoint values from the following screen:

And of course, the new deployment name is embed_demo_north. Let’s change these values in the dot env file as well, as shown below:
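With the same illustrative variable names, the updated dot env file looks like this:

```
API_KEY=<key copied from the new instance in the Azure portal>
AZURE_ENDPOINT=https://bharat-text-embedding-north.openai.azure.com/
DEPLOYMENT_NAME=embed_demo_north
```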

With these changes, in theory, we should be able to run app_north.py from the terminal and expect it to produce the embedding vector. And indeed, when run, it succeeds, as shown below:

But look at the deployment name! It is using the old deployment, which is in an entirely different region!!

Being naively happy that our program ran and produced embeddings could lead to some serious issues downstream. We need to figure out why this is happening. I will delete the old deployment to eliminate it from the equation; it is quite easy to recreate deployments of Azure OpenAI base models. I can do this from the Azure OpenAI Studio as shown below:

After deleting the deployment, we go back to the terminal and run the program again. It produces the following output:

An error is actually a bit better than the quite misleading embeddings we were getting from the wrong (old) deployment (embed_demo instead of embed_demo_north). But how can this happen when we have already edited the dot env file to set the new environment variable values? The screenshot is reproduced below for your convenience:

Clearly something is wrong, because we are still reading the old deployment value even though that deployment has been deleted. Let’s use the hard-coded value by uncommenting line number 14 as shown below. This should override the value read from the dot env file.
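In the sketch’s terms, line 14 now reads (uncommented):

```python
deployment_name = "embed_demo_north"  # hard-coded; overrides the value read on line 10
```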

This replaces the deployment_name value read from the dot env file on line 10 with the hard-coded value on line 14. Save the changes and run the program again. Here is what we see:

This is quite confusing! Why is it still complaining that the deployment is not found? Could it have something to do with the other two environment values that are still being read from the dot env file? To find out, we can uncomment lines 12 and 13, save, and rerun the program. I show the uncommented lines in the screenshot below:
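All three values are now hard-coded (again, illustrative), so nothing in the program depends on the dot env file anymore:

```python
api_key = "<key copied from the new instance in the Azure portal>"
azure_endpoint = "https://bharat-text-embedding-north.openai.azure.com/"
deployment_name = "embed_demo_north"
```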

Let’s run it. We see the following heart-warming output:

Now we have a better idea of what the problem might be. Somehow, it seems as though the terminal is “caching” the old environment variable values, but we don’t quite know why. To prove it further, I wrote an even smaller “toy” program called temp.py, shown below, and ran it:
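A sketch of temp.py, using the same illustrative variable name:

```python
import os

import dotenv

dotenv.load_dotenv()
print(os.getenv("DEPLOYMENT_NAME"))  # prints the old value, embed_demo, not embed_demo_north
```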

This proves that the terminal session is indeed supplying the old environment variable values. Surely it must have something to do with the dotenv.load_dotenv() or os.getenv(“DEPLOYMENT_NAME”) calls?

Let me ask Copilot using the chat interface, as shown in the following screenshot:
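(The chat screenshot is not reproduced here; the gist of Copilot’s answer is that load_dotenv() does not, by default, overwrite environment variables that are already set in the current shell session, and that passing override=True makes the values in the dot env file take precedence.)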

As you can see above, Copilot’s explanation couldn’t have been clearer. I will follow it and use the override argument. Doing so indeed fixes the issue, as shown below:
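In temp.py, the fix is a single argument; the sketch below is laid out so that the load_dotenv call falls on line 6, matching the reference further down:

```python
import os

import dotenv

# override=True makes the values in the .env file take precedence
dotenv.load_dotenv(override=True)
print(os.getenv("DEPLOYMENT_NAME"))  # now prints: embed_demo_north
```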

Armed with this proof, we make the same change in our bigger toy program app_north.py and rerun it, producing the correct output as shown below:

As this process shows, we didn’t rely on Copilot until we clearly understood the fundamental issue: somehow the terminal session had cached the old values of the environment variables. We fixed it using Copilot’s clear explanation once we had figured out the issue with old-fashioned systematic debugging. As a matter of fact, we didn’t even use the VS Code debugger. OK, I am lying: I did use the debugger and was surprised to see it pick up the “correct” values from the dot env file while the program still failed when run directly from the terminal. Now I know that the debugger does what the “override” flag does on line number 6 above.

I hope this was an instructive experience for you, if you stayed with me this long. Now it’s time for some retrospective learnings.

Lessons learned

GitHub Copilot X is not a substitute for clear and logical thinking and debugging, but rather a helper that can handle the tedious, time-consuming, and tiring research work that I have had to repeat over and over and that has become dull and stale. I still must do the difficult and rewarding work of finding root causes.

Some of the best developers I have worked with in my career do not rely too much on tools when it is time to figure out and fix challenging bugs. Rather, they have a very focused and methodical approach to problem solving. They do use tools when appropriate, though. I have tried to learn from them.

That is why my opinion is firmly on the side of using AI to leverage developers, not replace them. Until next time!

Bharat Ruparel is on LinkedIn.
