How Can We Measure Our Software’s Modularity and Dependencies?

What is modularity, and what metrics are available to measure modularity?

Héla Ben Khalfallah
Better Programming


Diagram showing dependence on outside or external coupling.
Dependence on outside or external coupling. Image by the author.

Introduction

In this article, I’ll talk about modularity: What is modularity? Why is it important? And how can we measure modularity?

Modularity consists of dividing a system into separate and independent parts called groups or modules.

Brown wooden blocks on a black table that the author is using to visualize a modular application.
Modular application. Image by Kieran Wood on Unsplash.
  • We can divide an application into separate technical layers: business, persistence, UI, and database
  • We can divide an application into separate functional layers: user, payment, order, etc.
  • We can group related methods inside a class (Car: move, accelerate, park, setPreferences, etc.)

Modularity has a direct impact on reuse and maintainability

Good organization and division of modules increase clarity and maintainability.

An example of a non-modular structure and a modular structure. The modular one looks like puzzle pieces that fit together.
Non-modular versus modular. Image by the author.

The relationship between elements inside of the same group is defined according to a common property that can be functional (authentication, payment, delivery, etc.) or technical (business, service, persistence, UI, etc.).

This definition reminds us of a mathematical theory that has already dealt with groups, their elements, their relations, and their operations: set theory.

Sets have strong internal coupling and weak external dependency. Image by the author.

To explain modularity, I’ll refer to some set principles. And I’ll be doing this not from a microscopic perspective (e.g., a set as a data structure) but from a macroscopic perspective.

Then, because modularity is a mathematical concept, we can measure it: How modular is an architecture? How independent are the modules? How cohesive are the elements inside a module?

Let’s begin!

Modularity: Mathematical Aspect (Set)

A set is a collection of elements that share a common property and gather together according to this common property.

Different sets of elements: one is a collection of clothing items, one is a collection of transport vehicles. One is foods.
Different sets of elements. Image by the author.

Inside each set, elements are strongly connected among themselves and relatively weakly connected to elements in other sets.

Other known types of sets:

  • ∅ denotes the empty set
  • Z denotes the set of integers
  • R denotes the set of real numbers
  • N denotes the set of natural numbers

“In mathematics, a Set is a well-defined collection of distinct elements or members. The elements that make up a set can be anything: people, letters of the alphabet, or mathematical objects, such as numbers, points in space, lines or other geometrical shapes, algebraic constants and variables, or other sets.”

Jain Ahmad via Wikipedia

  • Well-defined means that for any element whatsoever, the question “does this element belong to the collection?” has an unequivocal yes or no answer
  • Elements being distinct means that no element in a set is counted twice
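
As a small illustration of these two properties, Python’s built-in set behaves the same way: duplicates collapse, and membership is a yes/no question. The element names below are invented for the example.

```python
# Illustrative only: the element names are invented for this example.
clothes = {"hoodie", "jeans", "hoodie"}  # "hoodie" is listed twice on purpose

# Distinct: the duplicate collapses into a single element.
print(len(clothes))          # 2

# Well-defined: membership always has an unequivocal yes/no answer.
print("hoodie" in clothes)   # True
print("apple" in clothes)    # False
```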

“In computer science, separation of concerns (SoC) is a design principle for separating a computer program into distinct sections such that each section addresses a separate concern.”—Wikipedia

“The DRY principle is stated as ‘Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.’”— Wikipedia

“The Single Responsibility Principle (SRP) states that each software module should have one and only one reason to change.” — Robert C. Martin

A software module is just a collection of sections or layers grouped together with a certain property in common (a concern). Common properties can be functional (authentication, payment, delivery, etc.) or technical (business, service, persistence, UI, etc.).

In object-oriented programming, a class is a set of member methods that transform member variables.

Sets are structurally independent of one another, but they can be assembled together to form other sets.

For example, we can combine our Clothes set, Transport set, and Foods set into a single context/set called Human needs.

  • A human needs to wear clothes
  • A human needs a means of transport to go to work, visit family, travel, etc.
  • A human needs food to survive
The clothes, transport, and foods sets are grouped within a larger context called “human needs”
Global context — the application. Image by the author.

We can group the different independent modules together into a single global context: the application.

By the way, Human needs is a set, and Clothes, Transport, and Foods are now subsets.

Subsets keep the same set properties.

We can also define independent subsets inside a subset:

The subset transport now has its own subsets: air, land, and sea. Image by the author.

When we have a complex global context, we can divide it into small independent subcontexts.

“A complex system can be managed by dividing it up into smaller pieces and looking at each one separately.”

Carliss Y. Baldwin and Kim B. Clark in “Design Rules: The Power of Modularity”

Sets or subsets can also have shared elements

We can use/bring a map when we travel using an airplane, boat, train, or car:

The map is shared between the different transport subsets. Image by the author.

The intersection of different subsets is the map subset.

{1, 2} ∩ {2, 3} = {2}.
{1, 2} ∩ {3, 4} = ∅.
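
The same intersections can be checked with Python sets. This is only a sketch: the contents of the transport subsets are hypothetical, chosen to mirror the map example.

```python
# Hypothetical contents for the transport subsets; only "map" is shared.
air = {"airplane", "map"}
land = {"car", "train", "map"}
sea = {"boat", "map"}

shared = air & land & sea    # intersection of the three subsets
print(shared)                # {'map'}

print({1, 2} & {2, 3})       # {2}
print({1, 2} & {3, 4})       # set()  (the empty set)
```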

Each element must exist only one time inside a set:

“A set is a gathering together into a whole of definite, distinct objects of our perception and of our thought — which are called elements of the set.”

— Georg Cantor, the founder of the set theory

When there are duplicated or common elements between sets or subsets:

“A new set can also be constructed by determining which members two sets have ‘in common.’”

Wikipedia

If modules or classes share common elements, we create a common shared module or class (DRY: Don’t repeat yourself).

What Can We Retain From Set Theory?

Set and elements

  • Internal set cohesion (elements are strongly grouped together according to a common criterion)
  • Elements inside a set are cohesive by default
  • Each element inside a set exists only one time (DRY and SRP)
  • A set is independent from others (loose external coupling, SoC)
  • Decomposition (subsets)
  • New set for common elements between sets or subsets (DRY)

Software modularity is a mathematical aspect

  • A module is a set of sections or layers
  • A class is a set of member methods

Example of Set Theory Applied in Software: Layered Architecture

In a layered architecture, the common property is the technical role. We group files, classes, or code according to their technical roles: presentation, business, persistence, and database.

We have four sets

  • Presentation set = {HTML, CSS}
  • Business set = {UI logic adapters, interfaces}
  • Persistence set = {ORM}
  • Database set = {Tables, SQL queries}

Each set can also be divided into subsets.
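
As a rough sketch, the four layer sets can be written as Python sets to check that they are independent (no element appears in two layers) and that together they form the application. The dictionary keys and the check itself are my own illustration, not part of any framework.

```python
from itertools import combinations

# The four layer sets listed above, written as Python sets.
layers = {
    "presentation": {"HTML", "CSS"},
    "business": {"UI logic adapters", "interfaces"},
    "persistence": {"ORM"},
    "database": {"Tables", "SQL queries"},
}

# Layers are independent when no element belongs to two of them.
overlaps = {
    (a, b): layers[a] & layers[b]
    for a, b in combinations(layers, 2)
    if layers[a] & layers[b]
}
print(overlaps)  # {} -> no shared elements between layers

# The application is the union of all its layers.
application = set().union(*layers.values())
print(len(application))  # 7
```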

Example of Set Theory Applied in Software: Service-Oriented Architecture (SOA)

In a service-oriented architecture, we group files, classes, or code according to their functional roles: book, order, account, user, etc.

Service-oriented architecture
Via “Hands-On Microservices — Monitoring and Testing” by Dinesh Rajput

How to Measure Modularity

Because modularity is a mathematical concept, we can measure:

  • Internal group cohesion
Diagram showing internal cohesion.
Internal cohesion. Image by the author.
  • Degree of interdependence between software modules, or coupling
Diagram showing dependence on outside or external coupling.
Dependence on the outside or external coupling. Image by the author.

Cohesion

Definition

“In computer programming, cohesion refers to the degree to which the elements inside a module belong together.”

— Edward Yourdon and Larry L. Constantine in “Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design”

Types

Cohesion types: sequential, communicational, procedural, temporal, logical, and coincidental.
Cohesion types. Image by the author.

Functional cohesion is when parts of a module are grouped because they all contribute to a single well-defined task of the module.

Sequential cohesion is when parts of a module are grouped because the output from one part is the input to another part (e.g., a function which reads data from a file and processes the data).

Measuring cohesion

The elements of the set below aren’t cohesive: they are grouped arbitrarily, and the only relationship between them is that they’ve been grouped together (coincidental cohesion).

Noncohesive group: a hoodie, a balloon, a car, an apple, and a lamp.
Noncohesive group. Image by the author.

When the elements inside a set don’t share any common property, the set is no longer well defined: no property determines whether a given element belongs to it.

Cohesive subsets: Now the hoodie, balloon, car, apple, and lamp are each placed within their own distinct sets.
Cohesive subsets. Image by the author.

Elements inside a set must be cohesive; otherwise, we’d have to keep decomposing until we have cohesive subsets.
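
The decomposition step can be sketched as grouping elements by the property they share. The category labels below are assumptions added for this sketch (the figure doesn’t name them), and `decompose` is a hypothetical helper.

```python
from collections import defaultdict

# The noncohesive group from the figure; the category labels are assumptions
# added for this sketch, not something the original figure specifies.
items = {
    "hoodie": "clothes",
    "balloon": "toys",
    "car": "transport",
    "apple": "foods",
    "lamp": "furniture",
}

def decompose(elements):
    """Group elements into subsets that each share one common property."""
    subsets = defaultdict(set)
    for name, category in elements.items():
        subsets[category].add(name)
    return dict(subsets)

subsets = decompose(items)
print(len(subsets))          # 5: every element ends up in its own subset
print(subsets["transport"])  # {'car'}
```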

Class cohesion (LCOM)

Example:

Consider a class C with three methods: M1, M2, and M3.

Let {Ii} = set of instance variables used by method Mi

Let {I1} = {a,b,c,d,e} and {I2} = {a,b,e} and {I3} = {x,y,z}

  • {I1} ∩ {I2} is non-empty
  • {I1} ∩ {I3} and {I2} ∩ {I3} are null sets

LCOM = (number of null intersections) - (number of non-empty intersections), or 0 if this difference is negative.

In our case, LCOM = 2 - 1 = 1.

General formula:

LCOM = |P| - |Q| if |P| > |Q|, otherwise LCOM = 0

“P = {(Ii, Ij) | Ii ∩ Ij = ∅} and Q = {(Ii, Ij) | Ii ∩ Ij ≠ ∅}

Take each pair of methods in the class. If they access disjoint sets of instance variables, increase P by one. If they share at least one variable access, increase Q by one.

LCOM is a count of the number of method pairs whose similarity is zero, minus the count of method pairs whose similarity is not zero.

LCOM = 0 indicates a cohesive class.

LCOM > 0 indicates that the class needs or can be split into two or more classes, since its variables belong in disjoint sets.”

Chidamber and Kemerer
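
The pairwise counting described above can be sketched in a few lines of Python. The function name `lcom` and the dictionary layout are my own choices for illustration. The first call uses the class C from the example, and the second shows a hypothetical split where M3 has been moved out.

```python
from itertools import combinations

def lcom(variables_by_method):
    """LCOM as |P| - |Q| when positive, else 0.

    P counts method pairs with no shared instance variable,
    Q counts method pairs sharing at least one.
    """
    p = q = 0
    for vars_a, vars_b in combinations(variables_by_method.values(), 2):
        if vars_a & vars_b:
            q += 1
        else:
            p += 1
    return max(p - q, 0)

# The class C from the example above.
class_c = {
    "M1": {"a", "b", "c", "d", "e"},
    "M2": {"a", "b", "e"},
    "M3": {"x", "y", "z"},
}
print(lcom(class_c))  # 1

# Hypothetical split: M3 moves to its own class, leaving a cohesive class.
print(lcom({"M1": class_c["M1"], "M2": class_c["M2"]}))  # 0
```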

An example of low cohesion versus high cohesion.
Lack of cohesion — split into subclasses. Image by the author.

Coupling

Definition

“In software engineering, coupling is the degree of interdependence between software modules; a measure of how closely connected two routines or modules are.”— Wikipedia

Types

Coupling types: none, data, stamp, control, external, common, and content.
Coupling types. Image by the author.

Low coupling is often a sign of a well-structured computer system and a good design, and when combined with high cohesion, it’s a sign of high readability and maintainability.

Measuring coupling

Efferent coupling (CE)

CE measures the total number of classes inside a package that depend upon classes outside of the package.

Efferent coupling example. Image by the author.

In the above example, Module A has outgoing dependencies to three other classes.

A high CE value (above 20) indicates an unstable package: a change in any of the numerous external classes can force changes to the package. Preferred CE values are in the range of 0 to 20; higher values make the code harder to maintain and evolve.

Afferent coupling (CA)

CA measures the total number of classes outside of a package that depend upon classes within that package.

Afferent coupling example. Image by the author.

In the above example, Module A has two incoming dependencies.

The CA metric is closely related to portability. Packages with a high CA are problematic because they’re harder to replace: many other packages depend upon them. Preferred values for the metric CA are in the range of 0 to 500.
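
To make the two definitions concrete, here is a minimal sketch that counts CE and CA from a class-level dependency graph. The graph, the `package.Class` naming scheme, and the helper functions are all hypothetical; real tools derive this information from source code or bytecode.

```python
# Hypothetical class-level dependency graph: class -> classes it depends on.
# Class names are "package.Class"; all names are invented for this sketch.
dependencies = {
    "a.Order": {"a.OrderLine", "b.Payment", "c.Customer"},
    "a.OrderLine": {"d.Product"},
    "b.Payment": {"a.Order"},
    "c.Customer": set(),
    "d.Product": set(),
    "e.Report": {"a.Order", "a.OrderLine"},
}

def package_of(class_name):
    """Return the package part of a 'package.Class' name."""
    return class_name.split(".")[0]

def efferent_coupling(package, deps):
    """CE: classes inside `package` that depend on classes outside it."""
    return sum(
        1
        for cls, targets in deps.items()
        if package_of(cls) == package
        and any(package_of(t) != package for t in targets)
    )

def afferent_coupling(package, deps):
    """CA: classes outside `package` that depend on classes inside it."""
    return sum(
        1
        for cls, targets in deps.items()
        if package_of(cls) != package
        and any(package_of(t) == package for t in targets)
    )

print(efferent_coupling("a", dependencies))  # 2
print(afferent_coupling("a", dependencies))  # 2
```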

Instability

This metric measures the instability of packages, where stability is measured by calculating the effort to change a package without impacting other packages within the application.

Module instability. Image by the author.

Instability I = CE / (CE + CA)

“This metric produces results in the range [0, 1]. A value I=0 indicates a maximally stable package that depends upon nothing and I=1 indicates a total Instable package that has no incoming dependencies but depends upon other packages. So instability negatively influences re-usability, maintainability and portability.”

Gurpreet Kaur and Deepak Sharma in “A Study on Robert C.Martin’s Metrics for Packet Categorization Using Fuzzy Logic”
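
The formula is simple enough to sketch directly. The first call assumes the two coupling examples above describe the same Module A (three outgoing and two incoming dependencies). Returning 0.0 for a package with no dependencies at all is my own convention, since the formula is undefined in that case.

```python
def instability(ce, ca):
    """I = CE / (CE + CA); returns 0.0 for an isolated package (assumption)."""
    total = ce + ca
    return ce / total if total else 0.0

print(instability(3, 2))  # 0.6
print(instability(0, 4))  # 0.0 -> maximally stable
print(instability(4, 0))  # 1.0 -> maximally unstable
```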

Conclusion

In this article, we’ve discussed what modularity is, why it’s important, and how we can measure it.

Cohesion is a metric to measure how good a software design is in terms of the SRP and SoC principles: if the elements inside a module or a class aren’t cohesive, the module or class carries multiple or excessive responsibilities and should be divided.

A piece of code without separation of concerns and responsibilities is like a house without chairs, walls, doors, and windows: everything is open, and everything changes together (regression tests must cover everything, related and unrelated code alike).

We lack modularity when we lack cohesion, SRP, and SoC. Without them, there are no borders or groups, and therefore no modules: there’s only a single big piece.

A class or module without cohesion is like a messy garage where we throw everything in randomly and quickly: it’ll be hard to find things.

Cohesion measures how strongly related elements inside a group are — that’s why it must be high.

When cohesion is low, this indicates that the SRP and SoC principles have been broken.

On the other hand, the impact of the group on its environment, or the interaction between the group and its environment, must be as weak as possible and limited to what is strictly necessary.

Strong coupling with the outside has a strong impact on maintainability, refactoring, and stability. For each small change or evolution on a class or module, we should verify that every link still works.

When efferent coupling (CE), afferent coupling (CA), and instability are high, this gives an idea of how weak the design is and how fragile each change and refactoring will be.

I recommend using these metrics during software design and as you move forward in development so you don’t create a big debt.

