Writing Tests for Legacy Code is Slow – AI Can Help You Do It Faster


Does this pitch sound familiar?

Welcome aboard, new developer! The team was eagerly waiting for you. You see, we are working on an ambitious application that is serving hundreds of ski resorts across the globe and expanding! Our primary focus this year is to develop the module calculating the pricing for ski lift passes based on what kind of lift pass you want, your age, and the specific date you’d like to ski.

Unfortunately, the previous developer has quit, and we have accumulated a lot of delays. So you must work fast!


There are two new features we would like to ship tomorrow:

  1. Being able to get the price for several lift passes, not just one. Currently, only the pricing for a single lift pass is implemented, but that is not enough for all of our clients.
  2. If the pass type is “night” and the age of the skier is under 20, add an extra 20% reduction.

So go on. The clock is ticking. Make the desired change, and please, please don’t create bugs—we had enough problems with the previous developer who was regularly upsetting some of our biggest clients with new releases breaking their service. We are counting on you!

Well, let’s see the code we are talking about

It turns out to be an HTTP endpoint exposing a REST API to get and update lift pass prices. It is implemented in Python using Flask:

@app.route("/prices", methods=['GET', 'PUT'])
def prices():
    res = {}
    global connection
    if connection is None:
        connection = create_lift_pass_db_connection(connection_options)
    if request.method == 'PUT':
        lift_pass_cost = request.args["cost"]
        lift_pass_type = request.args["type"]
        cursor = connection.cursor()
        cursor.execute('INSERT INTO `base_price` (type, cost) VALUES (?, ?) ' +
            'ON DUPLICATE KEY UPDATE cost = ?', (lift_pass_type, lift_pass_cost, lift_pass_cost))
        return {}
    elif request.method == 'GET':
        cursor = connection.cursor()
        cursor.execute(f'SELECT cost FROM base_price '
                       + 'WHERE type = ? ', (request.args['type'],))
        row = cursor.fetchone()
        result = {"cost": row[0]}
        if 'age' in request.args and request.args.get('age', type=int) < 6:
             res["cost"] = 0
        else:
            if "type" in request.args and request.args["type"] != "night":
                cursor = connection.cursor()
                cursor.execute('SELECT * FROM holidays')
                is_holiday = False
                reduction = 0
                for row in cursor.fetchall():
                    holiday = row[0]
                    if "date" in request.args:
                        d = datetime.fromisoformat(request.args["date"])
                        if d.year == holiday.year and d.month == holiday.month and holiday.day == d.day:
                            is_holiday = True
                if not is_holiday and "date" in request.args and datetime.fromisoformat(request.args["date"]).weekday() == 0:
                    reduction = 35

                # TODO: apply reduction for others
                if 'age' in request.args and request.args.get('age', type=int) < 15:
                     res['cost'] = math.ceil(result["cost"]*.7)
                else:
                    if 'age' not in request.args:
                        cost = result['cost'] * (1 - reduction/100)
                        res['cost'] = math.ceil(cost)
                    else:
                        if 'age' in request.args and request.args.get('age', type=int) > 64:
                            cost = result['cost'] * .75 * (1 - reduction / 100)
                            res['cost'] = math.ceil(cost)
                        elif 'age' in request.args:
                            cost = result['cost'] * (1 - reduction / 100)
                            res['cost'] = math.ceil(cost)
            else:
                if 'age' in request.args and request.args.get('age', type=int) >= 6:
                    if request.args.get('age', type=int) > 64:
                        res['cost'] = math.ceil(result['cost'] * .4)
                    else:
                        res.update(result)
                else:
                    res['cost'] = 0

    return res

On the bright side, there is not much code. Its maintainability, however, is debatable. Several things are going on at once, and everything is mixed together: SQL queries, price computation based on the request, crafting the response… there’s even a TODO comment left by the previous developer.

To run the whole thing, you need to set up a MariaDB database, install the dependencies, and run the server. Once it’s up, you can query the API locally to put or get prices:
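As a rough sketch of what such queries could look like from Python, the snippet below builds a PUT and a GET request for the `/prices` endpoint using only the standard library. The query parameters (`type`, `cost`, `age`, `date`) come from the endpoint code above; the base URL (Flask's default development port), the helper name `build_prices_request`, and the pass type `day` are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the server runs locally on Flask's default development port.
BASE_URL = "http://localhost:5000"


def build_prices_request(method, **params):
    """Build (but do not send) a request to the /prices endpoint."""
    query = urlencode(params)
    return Request(f"{BASE_URL}/prices?{query}", method=method)


# Set the base price of a hypothetical "day" pass to 35.
put_request = build_prices_request("PUT", type="day", cost=35)

# Ask for the price of that pass for a 10-year-old on a given date.
get_request = build_prices_request("GET", type="day", age=10, date="2019-02-25")

# With the server running, the requests could be sent like this:
# from urllib.request import urlopen
# with urlopen(get_request) as response:
#     print(response.read())
```

Building the requests separately from sending them keeps the example runnable even when no server or database is available.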


You have to change that code to implement the desired features quickly. You shouldn’t break the existing scenarios, though. But what are the existing scenarios?

Unfortunately, after a quick tour of the codebase, you realize no automated test captures these. Since the developer is gone already, you’ll have to find them out by yourself! And here comes the dilemma:

  • Should you spend time writing the missing tests? They will help you be more confident that your changes didn’t break any behavior unintentionally. But will you have time to do that before tomorrow?
  • Should you just try to figure it out and hack the features for the release tomorrow? Maybe you’ll have more time later to add the missing tests… but how can you be sure nothing will break?

What if you could write the missing tests faster?

Ideally, you would have some automated tests that would fail if and only if existing behavior changes unexpectedly. Since there are none, you would have to write them yourself. And that can be time-consuming since you don’t know everything the current code does.
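Tests of this kind are often called characterization tests: they pin down what the code does today, quirks included, rather than what it should do. Below is a minimal sketch, assuming the GET branching of the endpoint were copied into a pure function. `compute_price` and its parameters are hypothetical (they do not exist in the original code), and the base costs 35 and 19 are made-up sample values.

```python
import math
from datetime import date


def compute_price(base_cost, pass_type, age=None, day=None, holidays=()):
    """Hypothetical pure copy of the endpoint's GET branching (no Flask, no SQL)."""
    if age is not None and age < 6:
        return 0
    if pass_type == "night":
        if age is None:
            return 0  # quirk of the legacy code: a night pass without an age is free
        return math.ceil(base_cost * .4) if age > 64 else base_cost
    # A Monday that is not a holiday gets a 35% reduction.
    reduction = 0
    if day is not None and day not in holidays and day.weekday() == 0:
        reduction = 35
    if age is not None and age < 15:
        return math.ceil(base_cost * .7)  # the reduction is not applied here (see the TODO)
    if age is not None and age > 64:
        return math.ceil(base_cost * .75 * (1 - reduction / 100))
    return math.ceil(base_cost * (1 - reduction / 100))


MONDAY = date(2019, 2, 25)
HOLIDAY_MONDAY = date(2019, 2, 18)


def test_current_behavior_is_preserved():
    # Each assertion records what the code does now, not what it "should" do.
    assert compute_price(35, "day") == 35
    assert compute_price(35, "day", age=5) == 0
    assert compute_price(35, "day", age=10) == 25
    assert compute_price(35, "day", age=70) == 27
    assert compute_price(35, "day", day=MONDAY) == 23
    assert compute_price(35, "day", day=HOLIDAY_MONDAY, holidays=(HOLIDAY_MONDAY,)) == 35
    assert compute_price(35, "day", age=10, day=MONDAY) == 25
    assert compute_price(19, "night") == 0
    assert compute_price(19, "night", age=40) == 19
    assert compute_price(19, "night", age=70) == 8
```

Note how some assertions capture behavior that looks like a bug, such as the Monday reduction being ignored for children. A characterization test keeps it as-is until someone decides to change it on purpose.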

Time pressure pushes most developers to keep hacking and skip the tests. The code is unfamiliar, and it wasn’t designed to be tested. As a result, it’s difficult to figure out a way to test it in a timely manner. The sad part is that it’s a vicious circle. There will be another deadline after this one, and another one, and another one… Regressions will happen, hotfixes will be required, and time will never be a luxury.

Now, imagine you could click a button above the code to generate within seconds:

  • An analysis of the code
  • A few relevant tests


This is qodo (formerly Codium). It integrates with VS Code and JetBrains IDEs and supports Python, JavaScript, and TypeScript code.

Let’s see what it generates:


It suggests a series of tests to be implemented. It’s a quick way to see what behavior this code can handle. Each test can be excluded and refined.

There is also a Code Analysis tab that describes the function behavior in simple English:

  • The main goal of the function is to create a Flask application for lift pass pricing and establish a database connection.
  • It takes no inputs.
  • It first creates a dictionary of connection options for the database.
  • It then calls the create_lift_pass_db_connection function to establish a connection to the database.
  • It creates a Flask application named “lift-pass-pricing.”
  • It defines a route for “/prices” that can handle GET and PUT requests.
  • If a PUT request is received, it inserts or updates the cost of a lift pass type in the database.
  • If a GET request is received, it retrieves the cost of a lift pass type from the database and applies any relevant discounts based on age and date.
  • It returns a dictionary containing the final cost of the lift pass.
  • It returns both the Flask application and the database connection as outputs.

This comes in very handy for understanding unfamiliar code faster.

Refine the tests and persist them

The suggested tests are interesting, but they can be even better. qodo (formerly Codium) has a few options that allow you to:

  • Tell the tests to automatically mock the calls to the database, for instance
  • Provide a reference to some existing tests so the suggested ones look similar
  • Change the number of tests to suggest (more tests = more edge cases)
  • And provide extra instructions to the AI assistant for the generation of the test (e.g., “Use Given-When-Then style.”)
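To give an idea of what a database-mocked test in Given-When-Then style can look like, here is a hand-written sketch. It is not output from the tool. It assumes the endpoint's base-price lookup were extracted into a helper; `fetch_base_cost` is a hypothetical function that mirrors the SELECT on `base_price` shown earlier, and the pass type `day` is a made-up value.

```python
from unittest.mock import MagicMock


def fetch_base_cost(connection, pass_type):
    """Hypothetical helper mirroring the endpoint's SELECT on base_price."""
    cursor = connection.cursor()
    cursor.execute('SELECT cost FROM base_price WHERE type = ? ', (pass_type,))
    row = cursor.fetchone()
    return row[0]


def test_fetch_base_cost_returns_stored_cost():
    # Given a database whose base_price row for the requested type costs 35
    connection = MagicMock()
    connection.cursor.return_value.fetchone.return_value = (35,)

    # When the base cost is fetched
    cost = fetch_base_cost(connection, "day")

    # Then the stored cost is returned and the query was parameterized by type
    assert cost == 35
    connection.cursor.return_value.execute.assert_called_once_with(
        'SELECT cost FROM base_price WHERE type = ? ', ("day",)
    )
```

Because the connection is a `MagicMock`, the test runs without MariaDB, which is the main appeal of mocking the database calls when time is short.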


Without spending too long on the configuration, you can save the generated tests into a file:


This is a version with auto-mock turned on.

If you try to run the tests, they may first fail because:

  • Imports are missing
  • The mocked paths weren’t resolved properly (e.g., mocker.patch('app.xxx') is incorrect because it couldn’t resolve the proper path)

This is not surprising when it comes to large language model-based products, but as opposed to other AI assistants, qodo (formerly Codium) combines AI, Engineering (context collection, static code analysis, etc.), and interaction with the developer to eventually prevent such mistakes from happening.

Fortunately, these errors are explicit and don’t take long to fix. Sometimes generated tests fail because they are not properly capturing the existing behavior, although they have the benefit of highlighting what the actual behavior is.
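The mocked-path failure usually comes down to one rule of Python's `unittest.mock`: patch the name where it is looked up, not where it is defined. The sketch below illustrates that rule under stated assumptions. The two modules `liftpass_db` and `liftpass_prices` are hypothetical stand-ins created in memory so the example is self-contained; in a real project they would be ordinary `.py` files.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for a hypothetical liftpass_db.py that defines the real factory.
db = types.ModuleType("liftpass_db")


def create_lift_pass_db_connection(options):
    raise RuntimeError("no database available in tests")


db.create_lift_pass_db_connection = create_lift_pass_db_connection
sys.modules["liftpass_db"] = db

# Stand-in for a hypothetical liftpass_prices.py that imports the factory by name.
prices = types.ModuleType("liftpass_prices")
exec(
    "from liftpass_db import create_lift_pass_db_connection\n"
    "def connect():\n"
    "    return create_lift_pass_db_connection({})\n",
    prices.__dict__,
)
sys.modules["liftpass_prices"] = prices


def patching_definition_site_has_no_effect():
    # liftpass_prices holds its own reference, so the real factory still runs.
    with patch("liftpass_db.create_lift_pass_db_connection", return_value="fake"):
        try:
            prices.connect()
        except RuntimeError:
            return True
    return False


def patching_usage_site_works():
    # Patching the name inside the module under test replaces what connect() sees.
    with patch("liftpass_prices.create_lift_pass_db_connection", return_value="fake"):
        return prices.connect() == "fake"
```

When a generated test patches the wrong path, changing the target string to the module that actually uses the name is typically the whole fix.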

Therefore, within a few minutes, you end up with almost a dozen tests that are passing and covering the code you need to change:


Are these tests covering all behavior exhaustively? Probably not.

Could these tests be written better? Certainly.

Yet, very quickly, you were able to move from “0 tests running” to “8 tests passing” on code you are not very familiar with. Reading the tests and trying to make them fail will be an effective way to learn what this code is really doing. They will also give you insights into other things you could test. Or maybe you can just move on and implement your features with a little more confidence that you won’t break existing behavior unintentionally!

Moving on faster, thanks to qodo (formerly Codium)

Under time pressure, we developers tend to stay in our comfort zone as much as possible. We aren’t likely to invest an unknown amount of time trying to write missing tests on code that we are unfamiliar with. We just try to get the work done.

Unfortunately, reality kicks back from time to time.

This is where qodo (formerly Codium) can help: it dramatically reduces the time it takes to capture the existing behavior of the code and start setting up a safety net so you can work with it.

Instead of running in circles for hours, it gives you a direction and a decent set of tests to lean on.

Install it in your favorite IDE now and give it a try!