Better Programming


Adding Regular Expression Support to an Existing Golang Tool

Stephen Wayne
Published in Better Programming
5 min read · Aug 15, 2022


This is part two of a four-part series (some parts are still in progress).

Previously, we built an inverse string matching tool in Go, with the goal of removing noisy logs while debugging. The first iteration accepted one or more keyphrases to match against a source file. This is great for somewhat static, repetitive lines, but what if we want to remove lines matching a particular pattern?

Enter regular expressions.

Today, we’ll cover some of the basics of regular expressions in Go, as well as how to extend an existing program to add functionality in a backward compatible manner.

What are regular expressions?

Regular expressions (regex for short) are a series of characters that specify a search pattern. They are used by string searching algorithms to identify a relevant section of text.

You’re likely already familiar with wildcards: using * to represent any number of characters (including an empty string). Regular expressions take that a step further.

They allow you to pattern match much more specifically. You can match any single character other than a given set using [^<characters to avoid>]. You can match text beginning with a phrase using ^<phrase>. Likewise, you can match text ending with a phrase using <phrase>$. There’s a lot more you can do, which you can learn about here and test out here.
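To make those three constructs concrete, here is a minimal sketch using Go's standard regexp package. The helper name matches and the sample strings are made up for illustration; they are not part of the tool.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in s.
// It is a hypothetical helper for this example only. regexp.MustCompile
// panics on an invalid pattern, which is acceptable here because the
// patterns are hard-coded.
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	// ^<phrase>: the text must begin with the phrase.
	fmt.Println(matches(`^error`, "error: disk full"))  // true
	fmt.Println(matches(`^error`, "an error occurred")) // false

	// <phrase>$: the text must end with the phrase.
	fmt.Println(matches(`done$`, "backup done")) // true

	// [^...]: any single character that is NOT in the set.
	fmt.Println(matches(`[^0-9]`, "12345")) // false: every character is a digit
	fmt.Println(matches(`[^0-9]`, "123a5")) // true: 'a' is not a digit
}
```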

Why do we care?

Log messages are often well suited to substring matching, but not always. What if we wanted to remove log lines between two specific timestamps, say August 1–4 of 2022? We could do that using the existing tool with substring match by repeating ourselves:

-keys="logcreated-20220801:|logcreated-20220802:|logcreated-20220803:|logcreated-20220804:"

Or we could search for patterns that match the following regular expression:

^logcreated-2022080([1-4]):

As the complexity of the query increases, regular expressions become powerful tools for custom, repeatable searches.
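The two approaches can be compared side by side with a short sketch. The sample log lines and the helper containsAnyKey are assumptions made up for this example. Note one difference: the pattern is anchored with ^, so it only matches lines that start with the timestamp, while a substring match finds the key anywhere in the line.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// datePattern is the regular expression from the article.
var datePattern = regexp.MustCompile(`^logcreated-2022080([1-4]):`)

// dateKeys is the equivalent list of substrings.
var dateKeys = []string{
	"logcreated-20220801:",
	"logcreated-20220802:",
	"logcreated-20220803:",
	"logcreated-20220804:",
}

// containsAnyKey is a hypothetical helper that reports whether line
// contains at least one of the keys.
func containsAnyKey(line string, keys []string) bool {
	for _, k := range keys {
		if strings.Contains(line, k) {
			return true
		}
	}
	return false
}

func main() {
	// Sample lines invented for illustration.
	lines := []string{
		"logcreated-20220731: before the window",
		"logcreated-20220802: inside the window",
		"logcreated-20220805: after the window",
		"host1 logcreated-20220803: timestamp not at line start",
	}
	for _, line := range lines {
		fmt.Printf("keys=%t pattern=%t  %s\n",
			containsAnyKey(line, dateKeys), datePattern.MatchString(line), line)
	}
}
```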

Adding new functionality

As in the previous article, we should consider how a user will interact with the new feature. For simplicity’s sake, we can assume a user will supply either a set of substrings (-keys) to match against, or a pattern (-pattern), or both. This will make the instructions and error handling logic easy — complain if neither keys nor pattern were given.

To add inverse pattern matching, we’ll need to do the following:

  • update the readme and help instructions
  • add the new flag and error handling
  • write the regex matching code
  • update the processing loop to select between regex and substring match

For brevity, the updated help and readme text are not reproduced here; they can be viewed in the tool repo.

Add the new flag and error handling:

Because we’re getting user inputs through getUserInput() and storing them in a config struct, we only need to update these two entities to add a new flag. Here, we’re simply defining the new flag and verifying that either it or keys has been passed in.

For this code we’re using regexp, another standard library package to handle the regular expression matching. We’re also validating that the provided pattern can actually be used as a regular expression. By doing it here, we can fail early and in one place for user-input-related things.

Note that Go’s regexp package follows the RE2 syntax, which deliberately leaves out some features, such as negative look-ahead. Some quick Googling suggests that there are third-party packages that offer this additional functionality, but that’s an exploration for another time.
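The flag handling described in this section might look roughly like the sketch below. It is not the tool's actual code: parseArgs is a hypothetical stand-in for getUserInput() that takes its arguments explicitly, the config field names are assumptions, and the '|' separator for keys is inferred from the -keys example earlier in this article.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"regexp"
	"strings"
)

// config holds validated user input. The field names are assumptions.
type config struct {
	file    string
	keys    []string
	pattern *regexp.Regexp // nil when -pattern was not supplied
}

// parseArgs is a hypothetical stand-in for getUserInput(). It parses and
// validates everything user-related in one place so we can fail early.
func parseArgs(args []string) (*config, error) {
	fs := flag.NewFlagSet("lineremover", flag.ContinueOnError)
	file := fs.String("file", "", "path to the input file")
	keys := fs.String("keys", "", "'|'-separated keyphrases to remove")
	pattern := fs.String("pattern", "", "regular expression to remove")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *file == "" {
		return nil, errors.New("no input file provided")
	}
	// Complain only if neither keys nor pattern were given.
	if *keys == "" && *pattern == "" {
		return nil, errors.New("provide -keys, -pattern, or both")
	}

	cfg := &config{file: *file}
	if *keys != "" {
		cfg.keys = strings.Split(*keys, "|")
	}
	if *pattern != "" {
		// Compile here so an invalid pattern is reported before any
		// file processing starts.
		re, err := regexp.Compile(*pattern)
		if err != nil {
			return nil, fmt.Errorf("invalid -pattern: %w", err)
		}
		cfg.pattern = re
	}
	return cfg, nil
}

func main() {
	// Hard-coded argument sets, for illustration only.
	examples := [][]string{
		{"-file=example/input.txt", "-keys=hello"},
		{"-file=example/input.txt", "-pattern=("}, // invalid regex
		{"-file=example/input.txt"},               // neither flag given
	}
	for _, args := range examples {
		if _, err := parseArgs(args); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", args)
	}
}
```

Storing the compiled *regexp.Regexp rather than the raw string means the pattern is compiled once, not once per input line.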

Write the regex matching code:

Since we’ve already compiled our regular expression and included it in the config, we only need to call MatchString() on the regular expression, passing in the string to check against. This will yield a boolean on whether there was a match. Similarly to substring checking, we can iterate through all lines in the input file, calling MatchString() against them, and save the ones that do not match.

To simplify the processing logic, let’s add a method to config:

lineMatches() can be called on a config instance to determine if the given string matches something provided in the pattern, or something provided in the keyphrases.

Update the processing loop

There isn’t much to change in transformInputImpl(). We are going to take a *config as input (rather than a few variables), and we’ll swap:

if !substrInLine(line, keys) {

for:

if !cfg.lineMatches(line) {

A nice side effect of this is that all of the line-matching logic now lives in lineMatches(), which keeps further tweaks to the processing clear and centralized.

Using the Tool

Building on our previous examples, we can now replace:

go run main.go -file="example/input.txt" -keys="hello"

with:

go run main.go -file="example/input.txt" -pattern="hello"

That will accomplish the same thing, but isn’t very exciting or new. How about we do something like:

go run main.go -file="example/input.txt" -pattern=".*b([r]?)ig([ht]?).*"

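This pattern can be further explained by using regex101, but essentially it matches any input lines containing “big”, “brig”, “bight”, or “bright”. Strictly speaking, [ht]? is a character class that matches at most one h or t, and because it is optional, any line containing “big” or “brig” matches. In example/input.txt this will leave only hello world remaining!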
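If you'd rather check the pattern in Go than in a browser, a quick sketch like this one works. The sample lines are made up, and the shorter pattern br?ig is included only as a comparison: with MatchString, which looks for a match anywhere in the line, the two should agree on every input.

```go
package main

import (
	"fmt"
	"regexp"
)

// articlePattern is the pattern used in the command above.
var articlePattern = regexp.MustCompile(`.*b([r]?)ig([ht]?).*`)

// shortPattern is an assumed simplification for comparison: the leading
// and trailing .* and the optional [ht]? do not change whether an
// unanchored match exists.
var shortPattern = regexp.MustCompile(`br?ig`)

func main() {
	// Sample lines invented for illustration.
	lines := []string{
		"a big dog",
		"the brig sailed",
		"a bight in the rope",
		"bright lights",
		"hello world",
	}
	for _, line := range lines {
		fmt.Printf("article=%t short=%t  %s\n",
			articlePattern.MatchString(line), shortPattern.MatchString(line), line)
	}
}
```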

We can even supply both a set of keyphrases and a pattern if we’d like!

go run main.go -file="example/input.txt" -keys="hello" -pattern=".*b([r]?)ig([ht]?).*"

This leaves nothing remaining from our input file.

Wrapping Up

Today, we’ve extended an existing tool with new functionality without needing to change much of its core structure. When building software we must consider the current use cases, but it is also prudent to consider future ones. Thinking about this during the design and build phase allows software to be extended with minimal changes to existing code, which leaves less room for bugs, headaches, and gotchas.

Further extending lineremover to accept a path for the destination file is left as an exercise to the reader.

If you clone the repo and follow the above process you can go on a quick and painless side quest to get your hands a bit dirty with Go!

Written by Stephen Wayne

Backend cloud engineer at HashiCorp. Former Electrical Engineer turned to the dark side.
