The Copilot Pause
If you regularly use GitHub Copilot or another AI code completion tool, I want you to try something: just briefly turn it off and continue with whatever you’re programming at the moment.
Do you feel it? The Pause?
It’s strange, isn’t it? There’s a moment where your brain almost shuts off, as you stare at the screen waiting for something that will never arrive. After a couple of seconds, you’ll snap back to your senses and continue typing. This will happen repeatedly, though less often over time. There’s a feeling of extra mental effort every time it happens, which is strange because you’ll usually encounter it while doing something incredibly mundane.
I’m certainly not the first person to talk about this; tech influencers like ThePrimeagen have discussed it repeatedly. However, I’d like to dig a little deeper into it. What I’m mainly interested in is how it changes the way we think about our code and the way we improve over time.
Why are we doing this?
As I mentioned earlier, it seems like most of the time when you encounter The Pause, you’re not doing something especially complicated. Examples include:
- Writing unit tests
- Writing basic functions
- Writing conditional statements
- Moving around small blocks of code
I think that initially when you start using AI Code Completion, these are areas where it immediately becomes useful. You’re an experienced developer, you’ve done all this stuff a million times before! You could type all of this out, but why bother? Copilot spits everything out and you give it a read to check if it matches what was in your head, and you move on.
Over time, however, you start checking less. You start accepting things at face value. You stop thinking. It looks right! It’s a simple function! They’re basic tests! I think this ignores the reality that these small things often aren’t simple. Each unit test created or function refactored carries more than just the logic you see; it also rests on a set of requirements and assumptions.
What vs Why
AI Code Completion tools are (usually) fairly good at the What of programming. If you accurately name a function you want to write, or add a useful comment, it can spit out something that closely matches what you want. However, it has no context on why you chose to do any of it. It lacks even the most basic context on your business or industry, the specific feature you’re trying to build, the problem you’re solving, or the bug you’re fixing. It doesn’t understand why you’ve chosen one particular algorithm over another; it just writes it.
It also doesn’t have any way to tell if the What you’re having it create lines up with the Why that’s in your head. It just assumes that the code that it can already see is correct and builds that into every future decision it makes. This is especially prominent in Unit Tests. When you ask Copilot to write you a suite of unit tests for a function, it almost never tests the intention of the function, just the implementation.
Imagine you have a function that you intend to do Thing A, but due to a bug it actually does Thing B. You then ask Copilot to write unit tests for this function. It’ll read your function without any context as to why you wanted it, and write a whole suite of tests that test the wrong thing. You’re met with a wall of green ticks. Fantastic! Time to move on.
Now imagine you had taken the time to write these tests yourself. You’re not writing unit tests against the implementation you just wrote. Rather, you’re testing what you’re expecting based on what’s in your head, your intention. Not only would this help you find the bug in your function immediately, you (unlike Copilot) also have the context necessary to fix it. While AI is great at generating code, it’s the why behind each decision that shapes it.
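A minimal sketch of this scenario in Python (the function, its bug, and the test values are all hypothetical, invented for illustration):

```python
def remaining_seats(capacity, booked):
    """Intended behaviour (the 'why' in your head):
    return how many seats are still free, never negative."""
    # Bug: the subtraction is reversed.
    return max(booked - capacity, 0)

# Implementation-derived tests, the kind a completion tool tends to produce.
# They describe what the code *does*, so every one of them passes.
assert remaining_seats(100, 120) == 20
assert remaining_seats(100, 100) == 0

# An intention-derived test, written from the spec in your head before
# looking at the code. This one would fail and expose the bug:
# assert remaining_seats(100, 30) == 70   # the buggy function returns 0
```

The implementation-derived tests give you the wall of green ticks; only the test written from intention catches the reversed subtraction, and only you have the context to fix it.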
Small Decisions
Even when context is less important, all code we write comes with a set of small decisions. What data structure to use, how (or even if) to handle errors, which algorithm is optimal, and when a comment is appropriate are all decisions we often only make while writing the implementation itself. Individually these might seem small or insignificant, but they pile up. By skipping this step, you’re letting Copilot make all of these decisions for you. Usually it’ll just pick whatever is most popular, because that’s what’s in its training set. However, what’s most popular is often not the most appropriate for your situation.
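To make one of these small decisions concrete, here’s a hypothetical Python example (the names are invented for illustration) of choosing a data structure for membership checks:

```python
# The most popular pattern, and so the likeliest completion, is a list:
registered = ["ana", "ben", "chloe"]      # membership check is O(n)

# The deliberate choice for frequent lookups is a set:
registered_set = {"ana", "ben", "chloe"}  # membership check is O(1) on average

def is_registered(name):
    # On toy data the two look interchangeable; with thousands of users
    # and a check on every request, the list quietly becomes a hot spot.
    return name in registered_set

assert is_registered("ben")
assert not is_registered("dan")
```

Neither choice is wrong in the abstract; the point is that picking between them is exactly the kind of decision you only weigh while writing the code yourself.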
Over time this results in a codebase that is littered with confusing variable names, inefficient data structures, and comments that often make no sense. This then needs to be cleaned up and refactored down the line (often also by Copilot), which seems to be resulting in a huge increase in code churn.
Learning by Doing
As developers, most of us learn best by taking new technologies we encounter and applying them practically. It’s one of the reasons we spend so much time telling newcomers to the field that they should be building things to learn new skills. I think this process of practical application also extends to ingraining knowledge that we already (at least partially) have.
Most developers will eventually reach a point in their preferred languages where they can program for hours on end without pausing to look up anything minor. Some, however, follow a mantra of “I know what I want to do, the how is less important.” Personally, I’ve always found this somewhat unproductive in the long run, because pausing to figure out the “how” breaks my flow. With Copilot, though, this mindset is not just affecting a larger portion of developers but also impacting more of their work.
Every small implementation detail you leave to Copilot is one repetition you’ve lost, one opportunity to further ingrain the intricacies of your tools into your mind. Implementing code yourself reinforces not only the syntax but also your ability to navigate and resolve complex issues, which AI can obscure if you rely on it for every step. Now, on its face, you could argue that this is fine if you think (like Thomas Frank did) that you’re outsourcing syntax knowledge to focus on higher-level decisions. However, as I discussed previously, I’d argue that Copilot doesn’t truly allow us to do that. What’s left feels like mashing barely coherent strings of code into something that merely resembles well-crafted code, while gradually eroding your own problem-solving intuition and technical skills. Stepping back from the automation and re-engaging with the basics lets us re-evaluate the balance between efficiency and technical mastery.
Final Thoughts
In some respects I’m probably too harsh on Copilot; there are plenty of times when you’re just writing boilerplate or the dreaded CRUD system, and this technology saves you time (and sanity). Further, there are also developers who don’t really enjoy the programming aspect that much and just want to build products and get them out the door quickly.
However, I think all of us benefit from refining and maintaining our skills. In the same way that we can’t endlessly power out new products and features while the mountain of tech debt grows behind us, we should be careful not to let these tools take over too much of our development for the sake of short-term efficiency. We should take the time to maintain our skills by doing the rough and boring parts ourselves. So if you’ve made it to the end of this post, this is my challenge to you:
Turn off Copilot or Cursor for at least a week. See how it makes you feel. If you don’t like how you feel, I suggest you keep it off a while longer. Let it push you to reconnect with and reinforce your skills, rather than relying on Copilot to fill the gaps for you. That dip in productivity is temporary; don’t risk it becoming permanent.