Why I Made My Own Code Quality Tool, Attractor

Sometimes, to get what you want, you have to build it yourself

Julian Rubisch
Better Programming



I needed a language-agnostic tool, available as a CLI, to investigate refactoring candidates in full-stack codebases.

Get it here: https://rubygems.org/gems/attractor.

Introduction

Before I start introducing the tool I crafted, and more importantly, why I did so, let me tell you a little bit about my context.

These days, I’m primarily a full-stack web developer (mostly Ruby on Rails stuff), but I also do quite a lot of front-end work (mainly React). More often than not, I’m consulted for codebase reviews or refactorings, or pulled in on short notice to cover short-term demand.

In all of these roles, I’ve come to realize that I need tooling to quickly get an overview of a codebase, address the pain points, and find out where the crucial logic of the software lies, so I can get to grips with it fast.

Now, let’s switch gears. Many authors (Michael Feathers and Sandi Metz, among others) have shown that evaluating the churn (how often a file changes) vs. the complexity of files in a software project provides a valuable code quality metric.

Churn/complexity charts are at the heart of many code quality evaluation tools, including CodeClimate, RubyCritic, es6-plato, and others.

You can read up on the science at the links I provided — I just want to concur with these authors that surveying the files/classes with the highest churn * complexity product virtually always gives the best leads as to where to poke my nose in first.
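
To make that concrete, here’s a toy Ruby sketch (not attractor’s internals; the file names and numbers are made up) that ranks files by that product:

# Toy illustration: given churn and complexity per file, the highest
# churn * complexity products surface first.
files = [
  { path: 'app/models/user.rb',      churn: 42, complexity: 120 },
  { path: 'app/services/billing.rb', churn: 18, complexity: 95 },
  { path: 'app/helpers/ui.rb',       churn: 3,  complexity: 15 }
]
files.sort_by { |f| -(f[:churn] * f[:complexity]) }
     .each { |f| puts "#{f[:path]}: #{f[:churn] * f[:complexity]}" }
# user.rb scores 5040 and tops the list, so that's where to look first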

So, when all these tools already exist, why build another one?

First, I don’t believe that just because an idea already has one materialization, no other variant of it should be built. By the same token — and that’s just speaking of the same domain — we’d only need one operating system, one programming language, one editor, one testing framework, etc. — you could continue this list endlessly.

No, open-source software as such has proven that a variety of tools to choose from is nearly always beneficial for the ecosystem. The community will have to decide whether I have built something valuable.

And second, as I already hinted at in the introduction, I’m not really satisfied with having to install — and learn — one tool for each of the languages I’m concerned with.

I’d rather have one single source of truth I can consult about code quality. So, I’ve set out to create a tool that serves mainly two purposes:

  1. Provide (roughly) the same code quality metrics (along with high-quality visualizations for easy skimming through the codebase) for a multitude of programming languages.
  2. Make it easily extensible so that anybody who’s interested can provide a Calculator for <fill_in_your_language_of_choice>.

Feature Tour

I’ll be referring to the live-served version of attractor here, that is, the one you get if you type:

$ attractor serve

in your console. Most commonly, you’ll want to give this command a path prefix option, -p, to restrict the calculations to a certain directory, such as app/.
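
For example, to restrict the report to a Rails app’s application code:

$ attractor serve -p app/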

The first thing you’ll be presented with is a churn/complexity chart of the Ruby source in the specified folder, along with some UI filtering options.

Initially, the plot is quite minimal, but you can opt in to displaying file names and a regression line, if you like. If you’d like to drill further down into the source directory, you can add to the path in a text input box.

If you click on a dot, you’ll get additional information displayed in the sidebar:

  • A method breakdown of the file in question.
  • The Git history of the last 10 commits.

Scrolling down further, you’ll get a list of refactoring candidates: by default, the files above the 95th percentile of the churn * complexity product, though you can also change that percentage.
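
One way to compute such a cutoff is a simple percentile over the products. Here’s a minimal sketch of the idea (not necessarily attractor’s actual implementation), reusing the files array from the sketch above:

# Sketch only: keep files whose churn * complexity product sits at or
# above the 95th percentile.
scores = files.map { |f| f[:churn] * f[:complexity] }.sort
cutoff = scores[[(scores.size * 0.95).floor, scores.size - 1].min]
candidates = files.select { |f| f[:churn] * f[:complexity] >= cutoff }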

In addition to the default scatterplot, attractor also features a treemap representation, which visualizes the file "weight" even more radically by dividing the total available area according to the specified metric.

Which brings me to the final cherry on the cake — the ability to select the metric used — either churn * complexity or each of those separately.

Since a scatterplot for a single metric doesn’t make any sense, I’ve fallen back to a histogram in those cases.

Finally, as promised, you can expect the exact same features (as was my primary goal) in your JS(X) source files. Neat!

Extensibility — Bring Your Own Calculator

I’ve tried to keep the code as modular as possible. Since file churn is language-agnostic (it is derived from Git history alone), all you have to do is implement a complexity calculator, as you can observe in the JS calculator on GitHub.

At the moment, all you have to do is provide your file extension in the initializer:

def initialize(file_prefix: '', minimum_churn_count: 3)
  super(file_prefix: file_prefix, file_extension: '(js|jsx)', minimum_churn_count: minimum_churn_count)
end

And in the calculate method, a change hash is yielded, providing, among other things, access to the file_path of the file under scrutiny.

The block you supply here is expected to return an array containing two elements:

  1. The total computed complexity of the file.
  2. (Optionally) a “details” hash containing method names as keys and their respective complexities as values.

So, in the case of js_calculator, first, in the Ruby class, I overrode the calculate method like so:

def calculate
  super do |change|
    complexity, details = JSON.parse(`node #{__dir__}/../../../dist/calculator.bundle.js #{Dir.pwd}/#{change[:file_path]}`)
    [complexity, details]
  end
end

And then wrote a Node script that provides these values:

// assumed imports: `escomplex` presumably comes from the typhonjs-escomplex package
const escomplex = require('typhonjs-escomplex');
const fs = require('fs');
const file = process.argv[2]; // the file path passed in by the Ruby calculator
const report = escomplex.analyzeModule(fs.readFileSync(file).toString());
const details = {};
report.methods.forEach(m => { details[m.name] = m.cyclomatic; });
console.log(JSON.stringify([report.aggregate.cyclomatic, details]));

That’s probably one of the drawbacks here: You will have to use whatever tooling the target language provides to supply complexity (or other) metrics.
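
To illustrate, here’s what a calculator for another language might look like. This is a hypothetical sketch: the BaseCalculator name and the use of radon (a Python complexity tool) are assumptions for illustration, not part of attractor today.

require 'json'

# Hypothetical sketch of a Python calculator following the same contract;
# the superclass name and the radon invocation are assumptions.
class PythonCalculator < BaseCalculator
  def initialize(file_prefix: '', minimum_churn_count: 3)
    super(file_prefix: file_prefix, file_extension: 'py', minimum_churn_count: minimum_churn_count)
  end

  def calculate
    super do |change|
      # `radon cc -j` prints a JSON map of file path => list of code blocks,
      # each carrying a name and a cyclomatic complexity
      report = JSON.parse(`radon cc -j #{Dir.pwd}/#{change[:file_path]}`)
      details = report.values.flatten.to_h { |b| [b['name'], b['complexity']] }
      [details.values.sum, details]
    end
  end
end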

I’m currently thinking about the architecture to use for supporting different programming languages.

Probably, I will split it up into multiple modules/gems, but I’ll defer that decision until the demand for more languages arrives.

Future Directions

More languages

As outlined above, creating a Calculator is actually quite easy. You don’t even need any detailed Ruby knowledge, just a notion of a code quality tool in your language of choice.

If you have demand for <your_language>, open a GitHub issue and/or give me a ping.

More metrics

Two I can think of are automated code smell detection and test coverage. If you have an idea or would like to contribute, open a GitHub issue and let’s talk.

Alternate visualizations

Scatterplots and treemaps are what came to my mind, but maybe there are even better-suited ones, like network graphs, chord diagrams, arc diagrams, or what have you. Just ping me with your ideas or open a GitHub issue.

Auto-detection

Based on which package files are present in the app’s root directory, attractor could auto-detect which complexity calculators to use. There’s already a GitHub issue for this, and it will be implemented in a future version.
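
A minimal sketch of how that detection could work (hypothetical, since this isn’t implemented yet):

# Hypothetical sketch of package-file auto-detection; not yet part of
# attractor, and the mapping below is an assumption.
PACKAGE_FILES = {
  'Gemfile'      => :ruby,
  'package.json' => :javascript
}.freeze

def detect_calculator_types(root = Dir.pwd)
  PACKAGE_FILES.select { |file, _| File.exist?(File.join(root, file)) }.values
end

detect_calculator_types # => [:ruby, :javascript] in a full-stack Rails app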

GitHub app

Installing Attractor and/or adding it to your project’s Gemfile may not be possible in your case. It is conceivable to create GitHub and/or GitLab integration for such cases, to have a running instance of attractor somewhere in the cloud.

I'm not sure whether this is of any value at all, since Code Climate already offers a generous free plan. At any rate, if you have demand for something like this, let's get in contact!

Export functionality

This might be more realistic — you might have a need to export code quality data, e.g. to track it over time, compare it to other projects, etc. If you have any ideas concerning this, give me a ping!

Concluding Thoughts

My initial motivation was to scratch my own itch with this project, to get a quick overview of whatever codebase I’m in at the moment. That goal has been only partially reached, since I only implemented what I already knew from other tools.

But then, I seriously came to believe that this could become a community effort, because interested code aesthetes such as myself could contribute ideas on how to better parse, process, and visualize source code, not to mention what machine learning, clustering algorithms, and the like could bring to light in such an environment.

So, if you have some high-flying visions of code discoverability in the back of your head, let’s get in touch!

Credits

Thanks go to:

  • Ernesto Tagwerker (@etagwerker) of fastruby.io for consistently giving feedback and pointers. I feel the two of us are very much on the same page!
