Adding Regular Expression Support to an Existing Golang Tool
Text Filtering Part 2: Electric Regex Boogaloo

This is part two of a four-part series (some parts are still in progress). You can find the others here:
- Part 1: Building a Text Filtering Tool in Go
- Part 3: Benchmarking a Program in Go
- Part 4: Command-line Benchmarking Against grep
Previously, we built an inverse string matching tool in Go, with the goal of removing noisy logs while debugging. The first iteration accepted one or more keyphrases to match against a source file. This is great for somewhat static, repetitive lines, but what if we want to remove lines matching a particular pattern?
Enter regular expressions.
Today, we’ll cover some of the basics of regular expressions in Go, as well as how to extend an existing program with new functionality in a backward-compatible manner.
What are regular expressions?
Regular expressions (regex for short) are a series of characters that specify a search pattern. They are used by string searching algorithms to identify a relevant section of text.
You’re likely already familiar with wildcards: using * to represent any number of characters (including an empty string). Regex take that a step further.
They allow you to pattern match much more specifically. You can search for any character other than something using [^<characters to avoid>]. You can search for something beginning with a phrase using ^<phrase>. Likewise, you can search for something ending with a phrase using <phrase>$. There’s a lot more you can do, which you can learn about here and test out here.
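To make those three building blocks concrete, here is a minimal sketch using Go’s standard regexp package. The helper name matches and the sample strings are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern finds a match anywhere in s.
// It panics on an invalid pattern, which is acceptable for fixed demo patterns.
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	// [^0-9] matches any single character that is not a digit.
	fmt.Println(matches(`[^0-9]`, "12345")) // false
	fmt.Println(matches(`[^0-9]`, "123a5")) // true

	// ^error matches only when the string begins with "error".
	fmt.Println(matches(`^error`, "error: disk full")) // true
	fmt.Println(matches(`^error`, "fatal error"))      // false

	// done$ matches only when the string ends with "done".
	fmt.Println(matches(`done$`, "backup done"))  // true
	fmt.Println(matches(`done$`, "done already")) // false
}
```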
Why do we care?
Log messages are often well suited to substring match against, but not always. What if we wanted to remove log lines between two specific timestamps, say August 1–4 of 2022? We could do that using the existing tool with substring match by repeating ourselves:
-keys="logcreated-20220801:|logcreated-20220802:|logcreated-20220803:|logcreated-20220804:"
Or we could search for patterns that match the following regular expression:
^logcreated-2022080([1-4]):
As the complexity of the query increases, regular expressions become powerful tools for custom, repeatable searches.
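As a rough illustration of the trade-off, the sketch below checks a few lines against both the repeated substrings and the single pattern. The log lines are hypothetical, and the helper names are not from the tool itself.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// keys mirrors the repeated substrings from the -keys example.
var keys = []string{
	"logcreated-20220801:",
	"logcreated-20220802:",
	"logcreated-20220803:",
	"logcreated-20220804:",
}

// datePattern is the single regular expression that replaces them.
var datePattern = regexp.MustCompile(`^logcreated-2022080([1-4]):`)

// matchesKeys reports whether line contains any of the key substrings.
func matchesKeys(line string) bool {
	for _, k := range keys {
		if strings.Contains(line, k) {
			return true
		}
	}
	return false
}

// matchesPattern reports whether line matches the date pattern.
func matchesPattern(line string) bool {
	return datePattern.MatchString(line)
}

func main() {
	// Hypothetical log lines, made up for this illustration.
	lines := []string{
		"logcreated-20220731: before the window",
		"logcreated-20220802: inside the window",
		"logcreated-20220805: after the window",
	}
	for _, line := range lines {
		fmt.Printf("%-42q keys=%-5t pattern=%t\n", line, matchesKeys(line), matchesPattern(line))
	}
}
```

One difference worth keeping in mind: the pattern is anchored with ^, so it only matches when the timestamp starts the line, whereas the substring check matches the key anywhere in the line.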
Adding new functionality
As in the previous article, we should consider how a user will interact with the new feature. For simplicity’s sake, we can assume a user will supply a set of substrings (-keys) to match against, a pattern (-pattern), or both. This keeps the instructions and error handling logic easy: complain only if neither keys nor pattern was given.
To add inverse pattern matching, we’ll need to do the following:
- update the readme and help instructions
- add the new flag and error handling
- write the regex matching code
- update the processing loop to select between regex and substring match
For brevity, the updated help and readme guidance aren’t reproduced here; they can be viewed in the tool repo.
Add the new flag and error handling:
Because we’re getting user inputs through getUserInput() and storing them in config, we only need to update these two entities to add a new flag. Here, we’re simply defining the new flag and verifying that either it or keys has been passed in.
For this code we’re using regexp, another standard library package, to handle the regular expression matching. We’re also validating that the provided pattern actually compiles as a regular expression. By doing it here, we fail early and keep all user-input validation in one place.
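A minimal sketch of what that flag handling might look like follows. It is not the tool’s actual code: parseArgs stands in for getUserInput(), and the config field names and the pipe-separated key format are assumptions based on the examples in this article.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"regexp"
	"strings"
)

// config holds validated user input. Field names are assumptions.
type config struct {
	file    string
	keys    []string
	pattern *regexp.Regexp // nil when -pattern was not supplied
}

// parseArgs is a hypothetical stand-in for getUserInput(). It takes the
// argument list explicitly so it can be exercised without os.Args.
func parseArgs(args []string) (*config, error) {
	fs := flag.NewFlagSet("lineremover", flag.ContinueOnError)
	file := fs.String("file", "", "path to the input file")
	keys := fs.String("keys", "", "pipe-separated keyphrases to remove")
	pattern := fs.String("pattern", "", "regular expression to remove") // the new flag
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *file == "" {
		return nil, errors.New("missing -file")
	}
	// Complain only if neither matcher was given.
	if *keys == "" && *pattern == "" {
		return nil, errors.New("provide -keys, -pattern, or both")
	}

	cfg := &config{file: *file}
	if *keys != "" {
		cfg.keys = strings.Split(*keys, "|")
	}
	if *pattern != "" {
		// Fail early: reject patterns that do not compile.
		re, err := regexp.Compile(*pattern)
		if err != nil {
			return nil, fmt.Errorf("invalid -pattern: %w", err)
		}
		cfg.pattern = re
	}
	return cfg, nil
}

func main() {
	// Fixed demo arguments; a real tool would pass os.Args[1:] instead.
	cfg, err := parseArgs([]string{"-file=example/input.txt", "-pattern=^hello"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("file=%s keys=%v hasPattern=%t\n", cfg.file, cfg.keys, cfg.pattern != nil)
}
```

Storing the compiled *regexp.Regexp rather than the raw string means the rest of the program never has to handle a compile error.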
Note that Go’s regexp package uses RE2 syntax, which deliberately leaves out some functionality, such as negative look-ahead, in exchange for matching time that is linear in the size of the input. From some quick Googling, it seems there are third-party packages that offer that additional functionality, but that’s an exploration for another time.
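The early validation is what surfaces this limitation to the user. As a small sketch (the helper name compiles is made up here), a negative look-ahead fails at compile time instead of silently matching nothing:

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// A negative look-ahead, common in PCRE-style engines.
	_, err := regexp.Compile(`foo(?!bar)`)
	fmt.Println("look-ahead error:", err)

	fmt.Println(compiles(`foo(?!bar)`)) // false
	fmt.Println(compiles(`foo[^b]`))    // true
}
```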
Write the regex matching code:
Since we’ve already compiled our regular expression and included it in the config, we only need to call MatchString() on the regular expression, passing in the string to check against. This will yield a boolean indicating whether there was a match. Similarly to substring checking, we can iterate through all lines in the input file, calling MatchString() against them, and save the ones that do not match.
To simplify the processing logic, let’s add a method to config: lineMatches(). It can be called on a config instance to determine whether the given string matches the pattern or any of the keyphrases.
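A sketch of such a method might look like the following. The config fields and the newConfig helper are assumptions for illustration. substrInLine is the helper name used in this series, but its body here is a guess at the behavior.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// config holds the matchers. Field names are assumptions.
type config struct {
	keys    []string
	pattern *regexp.Regexp // nil when no pattern was supplied
}

// newConfig is a hypothetical constructor used only for this demo.
// It panics on an invalid pattern; the real tool validates earlier.
func newConfig(keys []string, pattern string) *config {
	cfg := &config{keys: keys}
	if pattern != "" {
		cfg.pattern = regexp.MustCompile(pattern)
	}
	return cfg
}

// substrInLine reports whether line contains any of the keys.
func substrInLine(line string, keys []string) bool {
	for _, k := range keys {
		if strings.Contains(line, k) {
			return true
		}
	}
	return false
}

// lineMatches reports whether line matches the pattern or any keyphrase.
func (c *config) lineMatches(line string) bool {
	if c.pattern != nil && c.pattern.MatchString(line) {
		return true
	}
	return substrInLine(line, c.keys)
}

func main() {
	cfg := newConfig([]string{"hello"}, `^debug:`)
	fmt.Println(cfg.lineMatches("hello world"))      // true (keyphrase)
	fmt.Println(cfg.lineMatches("debug: cache hit")) // true (pattern)
	fmt.Println(cfg.lineMatches("request finished")) // false
}
```

The nil check lets the same method serve users who pass only -keys, only -pattern, or both.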
Update the processing loop
There isn’t much to change in transformInputImpl(). We are going to take a *config as input (rather than a few variables), and we’ll swap:
if !substrInLine(line, keys) {
for:
if !cfg.lineMatches(line) {
A nice side effect of this is that all of the line matching logic now lives in lineMatches(), which makes further tweaks to the processing clear and centralized.
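Put together, the processing loop could be sketched roughly as below. The signature of transformInputImpl is an assumption (the real one may differ), and newConfig and filterString are hypothetical helpers added so the loop can be tried on in-memory text.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"regexp"
	"strings"
)

type config struct {
	keys    []string
	pattern *regexp.Regexp // nil when no pattern was supplied
}

// newConfig is a hypothetical constructor used only for this demo.
func newConfig(keys []string, pattern string) *config {
	cfg := &config{keys: keys}
	if pattern != "" {
		cfg.pattern = regexp.MustCompile(pattern)
	}
	return cfg
}

// lineMatches reports whether line matches the pattern or any keyphrase.
func (c *config) lineMatches(line string) bool {
	if c.pattern != nil && c.pattern.MatchString(line) {
		return true
	}
	for _, k := range c.keys {
		if strings.Contains(line, k) {
			return true
		}
	}
	return false
}

// transformInputImpl copies every line of in that does NOT match cfg to out.
func transformInputImpl(cfg *config, in io.Reader, out io.Writer) error {
	scanner := bufio.NewScanner(in)
	for scanner.Scan() {
		line := scanner.Text()
		if !cfg.lineMatches(line) {
			if _, err := fmt.Fprintln(out, line); err != nil {
				return err
			}
		}
	}
	return scanner.Err()
}

// filterString runs the loop over an in-memory string.
func filterString(cfg *config, input string) (string, error) {
	var sb strings.Builder
	err := transformInputImpl(cfg, strings.NewReader(input), &sb)
	return sb.String(), err
}

func main() {
	cfg := newConfig([]string{"hello"}, `^debug:`)
	out, err := filterString(cfg, "hello world\ndebug: cache hit\nrequest finished\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out) // only "request finished" survives
}
```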
Using the Tool
Building on our previous examples, we can now replace:
go run main.go -file="example/input.txt" -keys="hello"
with:
go run main.go -file="example/input.txt" -pattern="hello"
That will accomplish the same thing, but isn’t very exciting or new. How about we do something like:
go run main.go -file="example/input.txt" -pattern=".*b([r]?)ig([ht]?).*"
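This pattern can be explored further using regex101, but essentially it matches any input line containing “big” or “brig”, which covers longer words such as “bight” and “bright”. (Note that [ht]? is a character class matching at most one of h or t, not the pair “ht”; because that part is optional, the set of matching lines is the same either way.) In example/input.txt this will leave only hello world remaining!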
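To see how the pattern behaves without the input file at hand, the sketch below runs it against a few made-up sample strings. Because MatchString() looks for a match anywhere in the line, and both the surrounding .* and the trailing group are optional, the pattern accepts the same lines as the shorter br?ig. That shorter form is shown only for comparison and is not part of the tool.

```go
package main

import (
	"fmt"
	"regexp"
)

// articlePattern is the pattern from the command above.
var articlePattern = regexp.MustCompile(`.*b([r]?)ig([ht]?).*`)

// shortPattern is a shorter pattern that accepts the same lines.
var shortPattern = regexp.MustCompile(`br?ig`)

func matchesArticle(s string) bool { return articlePattern.MatchString(s) }
func matchesShort(s string) bool   { return shortPattern.MatchString(s) }

func main() {
	// Hypothetical sample lines, not the contents of the real input file.
	samples := []string{"big dog", "the brig", "a bight", "bright sun", "hello world", "bag"}
	for _, s := range samples {
		fmt.Printf("%-14q article=%-5t short=%t\n", s, matchesArticle(s), matchesShort(s))
	}
}
```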
We can even supply both a set of keyphrases and a pattern if we’d like!
go run main.go -file="example/input.txt" -keys="hello" -pattern=".*b([r]?)ig([ht]?).*"
This will leave nothing remaining from our input file.
Wrapping Up
Today, we’ve extended an existing tool with new functionality without needing to change much of the core structure. When building software we must consider the current use cases, but it is also prudent to consider future ones. Thinking about this during the design and build phase allows software to be extended with minimal changes to existing code, and therefore with less room for bugs, headaches, and gotchas.
Further extending lineremover to accept a path for the destination file is left as an exercise for the reader.
If you clone the repo and follow the above process you can go on a quick and painless side quest to get your hands a bit dirty with Go!