A Deep Dive Into GitHub Copilot

How GitHub Copilot works under the hood

Aidan Tilgner
Better Programming

--

let GitHub_Copilot = new AI_Review({
  Category: "Code Completion",
  Contestant: "GitHub Copilot",
  Model: "GPT-3",
  Engine: "Codex",
  Creator: "GitHub",
  Last_Updated: "10/30/2021",
});

Welcome to AI Review, where we talk about different Artificial Intelligence products designed to make life easier for developers like you and me. Today we’re gonna be talking about GitHub Copilot, a product I’ve talked about extensively on this blog and one I’m pretty excited about. We’ll dive into how it works under the hood, whether you should bother using it, and how it squares up to its competitors.

We’re going to measure it based on a series of metrics that will allow us to be as objective as possible. The goal here is to figure out how useful this product actually is, and if it accomplishes its main goal of making developers more productive by making their jobs easier. A product could be cool and flashy, but at the end of the day be nothing more than a party trick (a nerdy party, granted) with no real value to it.

GitHub Copilot doesn’t seem to be that, but first impressions can be misleading. With the backing of some top-notch names like Microsoft, GitHub, and OpenAI, Copilot comes with some big expectations, and I certainly hope those are fulfilled. That being said, we’ll never know unless I get on with the article.

Without further ado, let’s talk about GitHub Copilot.

Under the Hood

The way that an AI product works is going to determine how good it is, plain and simple. Bad tech under the hood is going to be obvious in the final product. Whether it be inconsistency, slowness, lack of sufficient training, or anything else, if the tech is bad, the product is bad. That said, Copilot has a pretty good stack to work with.

GitHub Copilot is based on the GPT-3 AI model created by OpenAI, which features 175 billion parameters for language processing. The story goes that people were trying to use GPT-3 to get code completions from plain-English input. Although they were successful in getting some basic code working, the completions they got were mostly useless.

OpenAI, however, was interested in the idea of getting GPT-3 to write code, so they whipped up some fancy code-writing engines based on it, which became their Codex line. OpenAI Codex is still only available in private beta for you and me, but Microsoft bought an exclusive license to use GPT-3 and its engines for their own purposes.

After that, GitHub (owned by Microsoft) got to work on their Copilot technology. They fine-tuned the Codex engines on billions of lines of public code on GitHub, and the result is what runs under the hood in GitHub Copilot. While you’re typing, GitHub Copilot will autocomplete your code with the most likely completions it can generate, based on the patterns it picked up from all of the code it’s been trained on.

This can lead to some problems, however, because some of the training code carries restrictive licenses. People have repeatedly posted code that GitHub Copilot suggested to them and that was not legal to use without a license. Although Copilot was trained on public code, some of that code could have been plagiarized or restrictively licensed, and therefore may not be safe to use in production.

That being said, for the most part this isn’t a problem, but it is worth checking. Since this is a technical preview, it’s reasonable to assume that GitHub will do its best to remove these issues in a complete version. If they don’t, though, it could be real trouble for the product.

So, what?

So there are some problems with the back end of GitHub Copilot, but for most situations they’re not deal-breakers. GPT-3 is a revolutionary model and one of the biggest and most capable in the world. Having it power GitHub Copilot is already a big win for the product and is promising for its code completion.

The Interface

One of the most important pieces of any AI product is the way that a developer interacts with it. The interface of a product can make or break it, because it doesn’t matter how well your fancy app, extension, or API works if no one can figure out how to use it. GitHub Copilot comes in the form of an extension for either Visual Studio Code or GitHub Codespaces.

A great extension by the way. All that you have to do is add it, enable it, sign into GitHub, and start coding:

All images by the author

This is one of the best ways to interface with an AI because it stays true to its name and feels like you’ve got another developer there helping you out. You lead the way, and it does the heavy lifting. There’s no API to call and no website you have to use to interact with the completion engines; all you have to do is work on your code and you get completions right at your fingertips.

This isn’t necessarily a new idea; Tabnine and Kite have been doing this for years in the form of IntelliSense snippets. But I love the way that Copilot makes its suggestions by putting them alongside the code itself. That way you get to preview the code in its full form, formatted and ready for use at the press of a button (TAB, by the way).

If you don’t like the first option that it offers up, you can simply cycle through other options by pressing ALT + ] and ALT + [ (Option instead of ALT if you’re on a Mac). This is simple, clean, and easy to use without decreasing your typing speed, except to check that the solution is what you’re looking for. If you want to turn it off, just click the little Copilot icon on the bottom right to disable it globally, or for a specific language:

The bottom right portion of VS Code’s status bar

That being said, the interface can be a bit lacking if you count the fact that it’s only available on Visual Studio Code and Codespaces. This is great if you use those IDEs, but it’s not going to be helpful if you don’t. This goes back to the make-or-break thing: for a lot of developers this interface is great, but until it’s supported on more IDEs, it simply can’t be useful to everyone else.

However, we have to cut Copilot a bit of a break on this front as it's still in technical preview form. While the developers work out kinks with the product and it’s available only to beta testers, it makes sense that they wouldn’t release it on other platforms yet. Therefore, it remains to be seen whether or not this will be a real issue when the full version is released.

Another small bug is that sometimes when you’re coding, it makes a completion that breaks the current formatting of your code. A quick fix yes, but annoying nonetheless:

If I were to accept this completion, it would throw an error. I can imagine that if I were typing quickly, this would be an easy thing to miss as well. Of course, in modern IDEs, syntax like this will light up like a Christmas tree, but the point stands that little bugs like these, if not ironed out, could definitely get annoying in the future.

So, what?

With all that said, the interface of this AI is great. It’s not perfect, but it’s really solid and there isn’t gonna be much trouble using it for your needs. It blends in almost seamlessly with the IDE and isn’t any trouble to get set up with. It will be interesting to see how it’s incorporated into other IDEs, but we really won’t know until we see it.

Functions

The next aspect of our test will go over how this technology tackles one of the biggest challenges that developers face: the design and creation of functions. There are basically two types of functions: queries and commands. A typical app uses both of these, so if Copilot expects to make coding easier, it will need to be helpful in both aspects of function creation.

Queries

Queries take parameters and then manipulate those parameters in some way in order to return a result, all without manipulating the state of the application. These often take the form of algorithms, where a developer has some data and crafts a function that takes that data and returns the expected result.

We’re gonna start off by going over some LeetCode problems: an easy problem, a medium problem, and a hard problem, all randomly selected from the front page of LeetCode. We’ll be using JavaScript for all of them to keep it consistent.

For the easy problem, we’re going to be solving Roman to Integer, where basically we take some Roman numerals as input and return the equivalent in integer form. Here I’m going to give the AI some context as to what we need our function to do:

Now that it knows what we’re talking about, let's get started with the function:

Just like that, it already found the solution. I couldn’t screenshot the whole thing so I’ll paste its output below.

const RomanToInt = (roman) => {
  let result = 0;
  let prev = 0;
  let current = 0;
  let next = 0;
  let i = 0;
  let len = roman.length;
  let romanArr = roman.split("");
  while (i < len) {
    current = romanArr[i];
    next = romanArr[i + 1];
    prev = romanArr[i - 1];
    switch (current) {
      case "I":
        if (next === "V" || next === "X") {
          result -= 1;
        } else {
          result += 1;
        }
        break;
      case "V":
        result += 5;
        break;
      case "X":
        if (next === "L" || next === "C") {
          result -= 10;
        } else {
          result += 10;
        }
        break;
      case "L":
        result += 50;
        break;
      case "C":
        if (next === "D" || next === "M") {
          result -= 100;
        } else {
          result += 100;
        }
        break;
      case "D":
        result += 500;
        break;
      case "M":
        result += 1000;
        break;
      default:
        break;
    }
    i++;
  }
  return result;
};

That was a breeze. All I had to do was give it some context and it found the solution before I was even done writing the function name. But will it be able to do the same thing for a medium question? Let’s find out.
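For comparison, here is a more compact take on the same problem. This is a minimal sketch and not something Copilot produced; the names ROMAN_VALUES and romanToIntCompact are assumptions made up for illustration. It leans on the same rule the output above encodes case by case: a symbol that is smaller than the one after it gets subtracted instead of added.

```javascript
// Value of each Roman numeral symbol.
const ROMAN_VALUES = { I: 1, V: 5, X: 10, L: 50, C: 100, D: 500, M: 1000 };

// Hypothetical helper: converts a valid Roman numeral string to an integer.
const romanToIntCompact = (roman) => {
  let result = 0;
  for (let i = 0; i < roman.length; i++) {
    const current = ROMAN_VALUES[roman[i]];
    const next = ROMAN_VALUES[roman[i + 1]]; // undefined past the end
    // A smaller symbol placed before a larger one is subtracted (IV = 4).
    result += next > current ? -current : current;
  }
  return result;
};

console.log(romanToIntCompact("MCMXCIV")); // 1994
```

Like the version above, this assumes the input is a well-formed Roman numeral and does no validation.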

Next, I’ll be trying it with the Add Two Numbers question, where we take an input of two linked lists which each represent a non-negative integer in reverse order, and return the sum of the two integers as a new linked list. First, give it some context:

Then making the function itself:

After testing, this works and returns the linked list we expect. This is already impressive: not only do we not have to put the time into figuring out how to make the function, but we don’t waste much time typing it either, because all we have to do to use this solution is press TAB.
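Since the completion itself only appears in a screenshot, here is a rough sketch of what a typical solution to this problem looks like. It is not Copilot’s actual output; the ListNode class mirrors the node shape LeetCode describes, and the fromArray and toArray helpers are hypothetical conveniences for trying it out.

```javascript
// Minimal singly linked list node, mirroring the shape LeetCode uses.
class ListNode {
  constructor(val = 0, next = null) {
    this.val = val;
    this.next = next;
  }
}

// Adds two non-negative integers stored as reversed-digit linked lists.
const addTwoNumbers = (l1, l2) => {
  const dummy = new ListNode(); // placeholder head to simplify appends
  let tail = dummy;
  let carry = 0;
  while (l1 || l2 || carry) {
    const sum = (l1 ? l1.val : 0) + (l2 ? l2.val : 0) + carry;
    carry = Math.floor(sum / 10);
    tail.next = new ListNode(sum % 10);
    tail = tail.next;
    l1 = l1 ? l1.next : null;
    l2 = l2 ? l2.next : null;
  }
  return dummy.next;
};

// Hypothetical helpers for converting between arrays and lists.
const fromArray = (digits) =>
  digits.reduceRight((next, val) => new ListNode(val, next), null);
const toArray = (node) => {
  const out = [];
  for (; node; node = node.next) out.push(node.val);
  return out;
};

// 342 + 465 = 807, stored in reverse as [2,4,3] + [5,6,4] -> [7,0,8]
console.log(toArray(addTwoNumbers(fromArray([2, 4, 3]), fromArray([5, 6, 4]))));
```

The loop keeps running while there is a leftover carry, which is what handles cases like 99 + 1 where the result has more digits than either input.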

Onto the hard question. I’m going to use Median of Two Sorted Arrays, where we take two sorted arrays, merge them together into another sorted array, then find the median of that. Here’s the context:

And the rest:

Too easy for GitHub Copilot. It gets the right answer in a second, if that. Solutions to queries and algorithms like these are all over the internet, and therefore would have been very prevalent in Copilot’s training data. That means it has an advantage in these kinds of functions, where you simply take an input and return an output. However, the important part of development is being able to put those to good use with things like our next section, commands.
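The merge-then-median approach described above can be sketched roughly as follows. Again, this is an illustration and not the completion from the screenshot; the helper name mergeSorted is an assumption. Note that this version runs in linear time, while the LeetCode problem statement asks for a logarithmic-time solution, so treat it as the simple approach and not the optimal one.

```javascript
// Merges two ascending arrays into one ascending array in O(m + n).
const mergeSorted = (a, b) => {
  const merged = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    merged.push(a[i] <= b[j] ? a[i++] : b[j++]);
  }
  // One input is exhausted; append whatever remains of the other.
  return merged.concat(a.slice(i), b.slice(j));
};

// Assumes at least one of the arrays is non-empty.
const findMedianSortedArrays = (nums1, nums2) => {
  const merged = mergeSorted(nums1, nums2);
  const mid = Math.floor(merged.length / 2);
  return merged.length % 2 === 0
    ? (merged[mid - 1] + merged[mid]) / 2 // even length: average the middle pair
    : merged[mid]; // odd length: the middle element
};

console.log(findMedianSortedArrays([1, 3], [2])); // 2
console.log(findMedianSortedArrays([1, 2], [3, 4])); // 2.5
```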

Commands

A command does not necessarily return a result, but goes out into the app and manipulates its state in some way. These are things like DOM manipulation in JavaScript or rendering a window in Python. Commands often need to use queries in order to manipulate the state in the correct way. This can be a repetitive task for simpler applications but sometimes requires more creativity.
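To make the distinction concrete, here is a tiny sketch with one function of each kind. The state object and the names cartTotal and addToCart are made up for illustration and have nothing to do with the app built below.

```javascript
// Hypothetical application state for illustration.
const state = { cart: [] };

// Query: computes a result from its inputs and leaves state untouched.
const cartTotal = (items) =>
  items.reduce((sum, item) => sum + item.price * item.quantity, 0);

// Command: changes application state and returns nothing.
const addToCart = (item) => {
  state.cart.push(item);
};

addToCart({ name: "keyboard", price: 50, quantity: 2 });
console.log(cartTotal(state.cart)); // 100
```

Calling cartTotal any number of times changes nothing, while each call to addToCart leaves the app in a different state.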

In order to test how effective Copilot is at command type functions, we’re going to go over a couple of examples of command functions in a simple web application, starting off with DOM manipulation. Basically, we want a little display that shows the current weather for a given location, so we need an object that stores that information, and then we need to create new elements to display it.

I’ll start off by creating the object manually:

Then I want it to make the function to add all of this information to the body of the website, so I’ll tell it that:

Then I’ll move on to writing the createElement function:

It seems tedious to screenshot and show you each auto-completion that Copilot makes for each function, so for the remainder, you can assume that all code I show was written by Copilot unless stated otherwise. Here’s the createElement function:

The append element function:

Creating the weather div:

Then making the temperature paragraph:

Here’s where we run into our first problem: because our createElement function has no parameter for content, we can’t set that property in the function call itself, and instead have to do it manually. That being said, I’m gonna skip a bit ahead in terms of the script here and show you the finished product:
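One way around the missing content parameter would be to add it to the helper. The sketch below is a guess at what such a helper could look like, not the code from the screenshots: the parameter names, the sample text, and the doc argument (which defaults to the browser’s global document) are all assumptions.

```javascript
// Hypothetical helper: creates an element and optionally sets its class and text.
// `doc` defaults to the browser's global document; it is a parameter only so
// the helper can also be exercised outside a browser with a stand-in object.
function createElement(tag, className = "", content = "", doc = document) {
  const element = doc.createElement(tag);
  if (className) element.className = className;
  if (content) element.textContent = content;
  return element;
}

// In a browser (sample values are made up):
// const temperature = createElement("p", "temperature", "72°F");
// document.body.appendChild(temperature);
```

With the content passed in directly, the call site no longer needs a separate line to set the text after the element is created.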

Image by author

And there we go, Copilot managed to write code that takes a bunch of data from a forecast and inserts it into the body of the DOM. In reality, though, we need that forecast object to be dynamic. Let’s pretend we have a fancy weather API, and we want to get the forecast from that instead. I’ll add the instructions before the forecast object:

And the call:

So now we have an app that gets data dynamically, and then creates some elements to display that data. This is a common job for web developers and as such, auto-completion is actually very helpful here. Of course different languages and tasks will behave differently, but for the most common languages with tons of public code, these things are easy for Copilot.
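For readers who can’t see the screenshots, a dynamic version might look roughly like this. Everything specific here is an assumption: the endpoint https://api.example.com/forecast is a placeholder, the location query parameter is invented, and the fetchFn argument exists only so the function can be tried without a network connection.

```javascript
// Hypothetical endpoint; swap in a real weather API.
const FORECAST_URL = "https://api.example.com/forecast";

// Fetches the forecast for a location and returns the parsed JSON body.
// `fetchFn` defaults to the global fetch (browsers and recent Node versions).
async function getForecast(location, fetchFn = fetch) {
  const url = `${FORECAST_URL}?location=${encodeURIComponent(location)}`;
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Forecast request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage, where renderForecast stands in for the DOM code from earlier:
// getForecast("Seattle").then((forecast) => renderForecast(forecast));
```

Checking response.ok matters because fetch only rejects on network failures, not on HTTP error statuses.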

So, what?

Functions are Copilot’s bread and butter; that’s what it’s made for. The idea of the product is to help developers focus on how to put functions together instead of how to make the functions themselves. This is accomplished very well, in my opinion. Oftentimes Copilot gets it right on the first try, and if you don’t like the initial completion, you can pick a different one. Overall, Copilot is great at writing functions, and it might be worth checking out for that reason alone.

App Building

The main job of a software developer is to build apps. These apps have different jobs, but overall similar structures. For example, a web app has a front-end that may make use of a JavaScript framework like React or Angular, and a back-end built with things like Express and Node. The job of a developer is to use the tools available in a way that meets the specific needs of the app that is being made. Since it is such an integral part of a developer’s day-to-day work, a good auto-completion AI needs to be able to help you out.

We’re gonna build a very simple Express server for our new algorithms API, where we’ll get queries from users at different endpoints and return the result of those queries. We’ll start off by building the skeleton of the server itself. I’ve installed Express and initialized a new Node project in the same directory as our index.js file. I’m just gonna start typing without giving it too much context, as this is a fairly simple task:

It picks up immediately. I’m gonna keep pressing TAB and finish the Express server:

Here, however, we run into a bit of a problem: bodyParser is deprecated. Since GitHub Copilot was trained on public code, and a lot of public code includes this bodyParser line, it thinks we’re supposed to use it here, even though we don’t actually want to use deprecated syntax.

A good developer using GitHub Copilot should know better than to trust a line it puts down without knowing what that line does, especially when using APIs and frameworks like Express. Still, not every developer will pay that much attention, and this could lead to really obscure bugs and vulnerabilities in your code.
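For reference, Express 4.16 and later ship their own body-parsing middleware, so the separate bodyParser line can usually be swapped out. Here is a minimal sketch, assuming a reasonably recent Express 4.x; the createServer wrapper and passing express in as an argument are choices made for this example, not something from the original code.

```javascript
// Sketch: built-in body parsing instead of the separate body-parser package.
// `express` is passed in so the setup can be reused, or exercised with a
// stand-in object when Express is not installed.
function createServer(express) {
  const app = express();
  app.use(express.json()); // parses JSON request bodies
  app.use(express.urlencoded({ extended: true })); // parses form submissions
  return app;
}

// Typical usage (assumes `npm install express`):
// const app = createServer(require("express"));
// app.listen(8080, () => console.log("Listening on port 8080"));
```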

That being said, let’s continue on with our setup:

There we go, if we run this with node we’ll technically have a server running. However, we do need some routes so let’s add those. Here I don’t mind adding some context for the AI to use, as it’s really not fair to assume it should read our minds:

So it makes the first route for us easily. If I run this server and go to localhost:8080/add/2/2, I get “The sum of 2 and 2 is 4”. Moving on to the next route:

Easy enough. After a bit of context, we get the two routes we need for now, and we can always add to this later.
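The first route described above could look something like the following. This is a sketch and not Copilot’s actual output: the handler name addHandler, the registerRoutes wrapper, and the 400 response for non-numeric input are all assumptions.

```javascript
// Handles GET /add/:a/:b by responding with the sum of the two path parameters.
function addHandler(req, res) {
  const a = Number(req.params.a);
  const b = Number(req.params.b);
  if (Number.isNaN(a) || Number.isNaN(b)) {
    // Reject input such as /add/two/2 instead of replying with NaN.
    return res.status(400).send("Both parameters must be numbers");
  }
  res.send(`The sum of ${a} and ${b} is ${a + b}`);
}

// Attaches the route to an existing Express app.
function registerRoutes(app) {
  app.get("/add/:a/:b", addHandler);
}

// Usage (assumes Express is installed and `app` was created with express()):
// registerRoutes(app);
// app.listen(8080);
```

Route parameters always arrive as strings, which is why the handler converts them with Number before adding; otherwise 2 and 2 would concatenate into "22".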

So, what?

Basically, Copilot can code. It can do little tasks very well, but it requires you to guide it in the direction that you want it to go. Once again it stays true to its name, letting you lead the way and simply helping you out in the process. That, however, means that in terms of app building, you’re going to be doing most of the work. Copilot can write the lines of code for you, but you have to know where to put them. For that reason, Copilot is helpful here but definitely needs guidance, as is the point.

Final Thoughts

GitHub Copilot is a very nifty product that does its job well. Instead of seeking to replace developers, it aims to make them more productive by lifting weight off of their shoulders. That being said, how well does it do that? I’d say very well. Code completion isn’t new; tools like Tabnine and Kite have been around for years bringing AI-assisted code snippets, but neither compares to Copilot in the accuracy or depth with which it can work.

I’d say that if you’re looking to increase your productivity and get accurate and useful code completion, go sign up for the technical preview for GitHub Copilot, or better yet wait for the full version. That being said, you have to be careful because Copilot will give you code, but if you don’t know what it’s doing, then you have a recipe for disaster on your hands. I’d recommend only using GitHub Copilot for completions of code on whatever level you’re at to avoid it writing abstract functions that you can’t decipher.

All that said, GitHub Copilot is a great product overall, and you probably won’t be disappointed using it. If you have thoughts or you think I missed something, let me know below; I’d be happy to hear what you have to say. I plan on keeping this review up to date and as accurate as possible, so I will probably update it in the future.

Happy Coding Everyone!

--


Software Developer working my way through the world and sharing what I learn with all of you. More of my writing — aidantilgner.substack.com