Benchmarking Command Line Tools Against a Custom Go Line Remover

Stephen Wayne · Published in Better Programming · Aug 23, 2022 · 4 min read

In this fourth (and final) part of our series, we’ll cover benchmarking our custom Go line remover against an equivalent command in grep. You can find the other parts below:

In the previous three segments, we built a tool in Go that removes matching lines from a file, added regular expression support, and compared the performance of the two approaches using Go’s built-in benchmark testing. Now we’d like to measure its performance against inverse string matching (-v or --invert-match) in grep.

We’ll be using hyperfine to benchmark, as it’s a well-regarded tool that’s available via Homebrew for my machine (a MacBook with Apple Silicon). There are other tools (such as bench) that might better fit your needs and architecture.
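If you want to follow along on macOS, hyperfine can be installed with Homebrew (it’s also packaged for most Linux distributions):

brew install hyperfine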

We’ll be comparing the same three test cases that we looked at in Part 3 (described below), but with a significantly larger input text to see how the tools perform with more work to do.

The setup

A fair comparison requires a good setup. To that end, we’ll be benchmarking the Go tool against a grep invocation that does three things:

  • loading a file into grep
  • doing the inverse matching in grep
  • writing the results of that inverse match to a file

We should also note which version of grep we’re benchmarking against.
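You can check yours with the version flag; note that the grep bundled with macOS is BSD grep, which can behave (and perform) differently from GNU grep:

grep --version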

Grep commands will be of the following format:

grep "<pattern>" <filepath> > <output filepath>

Commands for lineremover will be of the following format:

./lineremover -file="path/to/file" -keys="keys if testing substring search" -pattern="pattern if testing regular expression search"

And we can run the benchmarks as:

hyperfine '<command to benchmark>' --min-runs=10000

Where '<command to benchmark>' runs either grep or lineremover as shown above. To keep things consistent and ensure we have enough data to draw conclusions from, we’ll use hyperfine’s --min-runs option to ensure each test is performed at least 10,000 times. We also need to be conscious of what else is running on the machine performing the benchmarks: if the machine is slowed down during one test relative to another, the benchmark will reflect that slowdown rather than the tools themselves. With that in mind, we’ll keep the machine in as similar a state as possible across tests, with minimal background processes running.

We’ll be testing against example/inputlong.txt from the lineremover repo. This file is essentially the same as example/input.txt, but repeated 2048 times so that program start/stop time will have less impact on our benchmark results relative to the actual string search work.
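Concretely, a pair of benchmark runs for one test case looks roughly like this (the output path is arbitrary, and the lineremover flags follow the template above):

hyperfine --min-runs=10000 'grep -v "big" example/inputlong.txt > /tmp/grep_big.txt'

hyperfine --min-runs=10000 './lineremover -file="example/inputlong.txt" -keys="big"'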

Running the benchmarks

Benchmark matching “hello”

Note that for grep we told hyperfine to ignore errors (-i), since grep returns a non-zero exit code when there is no match (which is expected for this test).
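In hyperfine that just means adding --ignore-failure (-i) to the invocation, roughly:

hyperfine -i --min-runs=10000 'grep -v "hello" example/inputlong.txt > /tmp/grep_hello.txt'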

So we can see that grep is quite a bit faster in this test, which is more or less the expected outcome given that it’s the standard in this space and has been around since 1974. Within the lineremover tool, we can see that pattern matching is a bit slower than substring search as well. There’s also quite a bit of variation between runs.

Benchmark matching “big”

Here, we see that grep is the slowest of the bunch — not what I was expecting! We can also see that pattern match is slightly faster than substring search, but it’s likely not significant and has slightly more variation.

Benchmark matching “big”, “brig”, “bight”, “bright”

As the query gets more complex, we can see that pattern matching slows down significantly compared to substring search. Here again we see that grep is significantly slower than lineremover in both pattern match and substring search modes. It also has way more variance. Again, an unexpected result!
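For reference, one way to express this multi-word inverse match in grep is with extended-regex alternation, and the equivalent lineremover pattern run would look something like this (assuming the -pattern flag accepts Go regexp syntax, as in Part 2):

grep -v -E "big|brig|bight|bright" example/inputlong.txt > /tmp/grep_words.txt

./lineremover -file="example/inputlong.txt" -pattern="big|brig|bight|bright"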

Wrapping up

Today we learned the mechanics of benchmarking two command line tools against each other, and set up a few tests that reasonably represent how these tools might be used in the real world. The real work when benchmarking, however, is figuring out what to benchmark, not necessarily how to run the tests. What tests would give you confidence that some tool is faster for your workload or use case? That is the most important question to answer when designing a test; only then do we design an environment to answer it.

As far as this test goes, it was interesting! This was my first time using hyperfine, so that’s pretty cool. Within the tool, I expected pattern matching to be slower than substring search, but I was not expecting the Go tool we built to outperform grep under some search conditions. I’m not yet sure why grep was slower, but one guess is that its reading of the input file and writing of the output file are not as optimized as the buffered IO we used in Go. I also thought some OS effects (like caching) might have been at play, but I ran the tests back to back and in various orders and got similar results each time. If you have ideas, please let me know!
