Benchmarking Command Line Tools Against a Custom Go Line Remover

Stephen Wayne · Published in Better Programming · Aug 23, 2022 · 4 min read

In this fourth (and final) part of our series, we’ll cover benchmarking our custom Go line remover against an equivalent command in grep. You can find the other parts below:

In the previous three segments, we built a tool in Go that removes matching lines from a file, added regular expression support, and compared the performance of the two approaches using Go’s built-in benchmark testing. Now we’d like to measure its performance against inverse string matching (-v or --invert-match) in grep.

We’ll be using hyperfine to benchmark, as it’s a well-regarded tool that’s available via Homebrew for my machine (a MacBook with Apple Silicon). There are other tools (such as bench) that might better fit your needs and architecture.
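If you want to follow along on macOS, hyperfine can be installed with Homebrew (it’s also packaged for most Linux distributions):

brew install hyperfine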

We’ll be comparing the same three test cases that we looked at in Part 3 (described below), but with a significantly larger input text to see how the tools perform with more work to do.

The setup

A fair comparison requires a good setup. To that end, we’ll be benchmarking the Go tool against a grep invocation that does three things:

  • loading a file into grep
  • doing the inverse matching in grep
  • writing the results of that inverse match to a file

We should also note which version of grep we’re benchmarking against.
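You can check yours with the version flag; note that the grep bundled with macOS is BSD grep, which can behave (and perform) differently from GNU grep:

grep --version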

Grep commands will be of the following format:

grep "<pattern>" <filepath> > <output filepath>

Commands for lineremover will be of the following format:

./lineremover -file="path/to/file" -keys="keys if testing substring search" -pattern="pattern if testing regular expression search"

And we can run the benchmarks as:

hyperfine '<command to benchmark>' --min-runs=10000

Where '<command to benchmark>' runs either grep or lineremover as shown above. To keep things consistent and ensure we have enough data to draw conclusions from, we’ll use hyperfine’s --min-runs option to ensure each test is performed at least 10,000 times. We also need to be conscious of what else is running on the machine performing the benchmarks: if the machine is slowed down during one test relative to another, the benchmark will reflect that slowdown rather than the tools themselves. With that in mind, we’ll keep the machine in as similar a state as possible across tests, with minimal background processes running.

We’ll be testing against example/inputlong.txt from the lineremover repo. This file is essentially the same as example/input.txt, but repeated 2048 times so that program start/stop time will have less impact on our benchmark results relative to the actual string search work.
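Concretely, a pair of benchmark runs for one test case looks roughly like this (the output path is arbitrary, and the lineremover flags follow the template above):

hyperfine --min-runs=10000 'grep -v "big" example/inputlong.txt > /tmp/grep_big.txt'

hyperfine --min-runs=10000 './lineremover -file="example/inputlong.txt" -keys="big"'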

Running the benchmarks

Benchmark matching “hello”

Note that for grep we told hyperfine to ignore errors (-i), since grep returns a non-zero exit code when there is no match (which is expected for this test).
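In hyperfine that just means adding --ignore-failure (-i) to the invocation, roughly:

hyperfine -i --min-runs=10000 'grep -v "hello" example/inputlong.txt > /tmp/grep_hello.txt'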

So we can see that grep is quite a bit faster in this test, which is more or less the expected outcome given that it’s the standard in this space and has been around since 1974. Within the lineremover tool, we can see that pattern matching is a bit slower than substring search as well. There’s also quite a bit of variation between runs.

Benchmark matching “big”

Here, we see that grep is the slowest of the bunch — not what I was expecting! We can also see that pattern match is slightly faster than substring search, but it’s likely not significant and has slightly more variation.

Benchmark matching “big”, “brig”, “bight”, “bright”

As the query gets more complex, we can see that pattern matching slows down significantly compared to substring search. Here again we see that grep is significantly slower than lineremover in both pattern match and substring search modes. It also has way more variance. Again, an unexpected result!
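For reference, one way to express this multi-word inverse match in grep is with extended-regex alternation, and the equivalent lineremover pattern run would look something like this (assuming the -pattern flag accepts Go regexp syntax, as in Part 2):

grep -v -E "big|brig|bight|bright" example/inputlong.txt > /tmp/grep_words.txt

./lineremover -file="example/inputlong.txt" -pattern="big|brig|bight|bright"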

Wrapping up

Today we learned the mechanics of benchmarking two command line tools against each other, and set up a few tests that reasonably represent how these tools might be used in the real world. The real work when benchmarking, however, is figuring out what to benchmark, not necessarily how to run the tests. What tests would give you confidence that some tool is faster for your workload or use case? That is the most important question to answer when designing a test; only then do we design an environment to answer it.

As far as this test goes, it was interesting! This was my first time using hyperfine, so that’s pretty cool. Within the tool, I expected pattern matching to be slower than substring search, but I was not expecting the Go tool we built to outperform grep under some search conditions. I’m not yet sure why grep was slower, but one guess is that its reading of the input file and writing of the output file are not as optimized as the buffered IO we used in Go. I also thought some OS effects (like caching) might have been at play, but I ran the tests back to back and in various orders and got similar results each time. If you have ideas, please let me know!
