Sweating the small stuff in Java

The story of small FixedSizeCollection types in Eclipse Collections

Donald Raab
Better Programming
Published in
15 min readJun 3, 2023

--

Photo by Edward Howell on Unsplash

Sometimes, You’re on Your Own

Every once in a while, we are required as application developers to roll up our sleeves and find ways to squeeze performance or memory savings beyond the built-in capabilities of our language and libraries.

I started programming professionally in DOS/Clipper in the late 1980s when 640K was the memory limit, so I was accustomed to memory-constrained programming. I wasn’t used to anything else until I started programming in Smalltalk, where I had access to hundreds of megabytes of memory.

Towards the end of the 1990s, I worked in Smalltalk, loading decent-sized object graphs into memory and doing things at blazing memory speed. Processes that used to run in minutes in DOS/Clipper could be completed in hundreds of milliseconds. I was working with a 32-bit memory constraint, but I never seemed to get close to the edge of the RAM limit to worry about running out of memory with the domain I was working in.

This was a good progress. Memory was plentiful and fast. Life as a programmer was good. I hadn’t yet encountered a big data application. That would happen soon enough.

Big memory meet bigger data

In 2004, I worked on a Java application designed and built using an in-memory caching architecture. I experienced firsthand the saying, “you can’t fit ten pounds of $%*# in a five-pound bag.”. I was still working within the confines of 32-bit software imposed memory limit. The hardware had already progressed beyond this and offered tens of gigs of RAM, but it was inaccessible to me now. A 64-bit version of Java was available with the JDK 1.4.0 release, but I didn’t have access yet.

My choices at the time were simple.

  1. Abandon the architecture and start with something more scalable but with a different performance profile.
  2. Wait for a 64-bit JVM.
  3. Figure out how to make ten pounds of $%*# fit in a five-pound bag.

I went with option number 3. By measuring, executing, and repeating small memory efficiency tricks, I could make the application work. I did things that never would have occurred to me to do in the previous 15 years of my programming career. I felt a bit like Mark Watney (Matt Damon) from the movie The Martian.

In the face of overwhelming odds, I’m left with only one option. I’m gonna have to science the $%*# out of this.

— Mark Watney, The Martian

Step 0: Find Tools To Measure Memory

In 2004, I used jmap -histo <pid> to figure out where I can save memory. Jmap is a command line tool in the JDK for analyzing Java heaps. It still works fine in OpenJDK 20 today. Jmap is a blunt tool that can help you spot glaring issues on the Java heap and measure the overall impact of any changes you make.

Today, I use Java Object Layout (JOL) from the OpenJDK tools for measuring the memory cost of specific objects. JOL gives a more precise and targeted set of information than jmap. There is a JOL plugin for IntelliJ available as well. I have not used the IntelliJ plugin, but some folks I know have said positive things about it.

Once you have JOL included as a Maven dependency, you can use GraphLayout to look at the memory cost and layout of particular instances of objects programmatically. You will see some code examples below that use GraphLayout.

Step 1: Understand Your Data

I had a large static object graph loaded one time up front and cached, and then multiple calculated object graphs created from that data graph and also cached that was explorable in either direction. Every node in the graph had a List of children and a List of parents. The requirement I needed to satisfy was to store two calculated graphs in memory to be explored on demand by the users.

  • Total memory required for both data and calculations— ~4-6 GB
  • Total memory needed for application to run — ~7–8 GB
  • Available RAM on Solaris with 32-bit JVM- ~4 GB

I started looking at the heap output using jmap -histo <pid> . Jmap is a simple but effective tool for looking at and starting to understand what is occupying a Java heap. I used jmap for quite a few years, helping folks find waste and trim the fat out of their Java heaps. I once saw a Java heap with two million Boolean objects in it. Yeah, that kind of stuff can happen when you’re not looking. I convinced the team with the two million Boolean objects in their heap that they didn’t need that kind of fault tolerance for boolean and would survive just as well with one instance of each of true and false.

I digress.

In the application I was working on, I saw an extremely large number of List, Set and Map instances in the heap output from jmap, along with the corresponding array instances that occupied the data structures.

What I didn’t know right away was the size of the List, Set and Map instances. So, I dug around in the code. I saw ArrayList being created in multiple places using a default constructor. Anyone who has programmed with Java before JDK 7_u40 might remember when ArrayList eagerly initialized an empty array of size 10.

When I was working on this application, we were using JDK 4. This was seven years before Java 7 was released. I didn’t know then, but I would later help validate the importance of the change to ArrayList that was introduced to lazy initialize the backing array of size 10. The benefit of this for all Java developers on the planet for the rest of the time is that empty ArrayList instances stay empty. This has a significant and lasting impact on the total memory savings of Java applications globally forever. Win!

When I realized all of the List instances would be backed by default sized arrays, my first thought was to initialize them all using new ArrayList(0). This did have a noticeable benefit right away. Unfortunately, this wasn’t enough savings, and I would go on to discover that most of the List, Set, and Map instances in a heap were of sizes zero through six. I kept investigating where I could save memory.

Step 2: Understanding Array Instances

Arrays are used in a lot of places in Java. Lists have them. Maps have them. Strings have them.

Every time you say new with an empty array, you get back a new empty array. Each new empty array is effectively immutable and equal to every other empty array of the same type because it has no elements and is the same size. Empty array sharing is an important optimization I discovered later.

What I couldn’t easily tell is how many of the array instances on the heap were empty. I had to guess. What I started with was adding an EmptyList class that I could use anywhere I wanted an empty List. This would result in reasonable savings for empty lists in the heap. In Java 5, Collections.emptyList() would be added, which returns a type-safe immutable instance. Unfortunately, I didn’t know this and didn’t have time to wait for Java 5.

Step 3: Optimize for Empty Lists

I created an EmptyList that was a singleton and initialized all Lists to this singleton instance. Today, in Eclipse Collections, there is ImmutableEmptyList and EmptyList . EmptyList was created first and implements an interface called FixedSizeList, which extends MutableList and java.util.List. ImmutableEmptyList extends both ImmutableList and java.util.List.

Using EmptyList would necessarily complicate the application code because there is no way to add to a fixed-size empty list. Wherever I used EmptyList, I had to add code that tested first whether it was an empty List (i.e., size == 0) and, if so, created a new List if I needed to add to it. Then I had to set that new List instance into the variable that pointed to the EmptyList. This increased the cost of testing and implementing all of the methods where List instances were created and grown to make sure bugs weren’t introduced. The cost was worth the benefit.

Step 4: Optimize for Fixed Size Lists w/ Sizes 1–6

Creating ArrayList with an initial size of zero, provided some memory savings benefit, but I was still seeing millions of instances of object arrays due to what I guessed were List instances in size one to ten range. I created SingletonList and tried it out and saw some great benefits.

SingletonList holds onto a single Object reference and has no backing array. Then I introduced DoubletonList and saw some more benefits. Then came TripletonList, QuadrupletonList, QuintupletonList and SextupletonList. This is as far as I went in 2004. The savings I saw with all of these changes were dramatic in the dynamic calculation object graph. This was because the design of the object graph was bi-directional, and every node in the graph knew its parents and children. Most nodes in the graph had one parent and most often had one to three children.

The following chart shows the potential savings even today comparing a default sized ArrayList (new ArrayList()), a zero initial-sized ArrayList (new ArrayList(0)), and FixedSizeList instances created using Lists.fixedSize.of() from Eclipse Collections.

I wrote the following test using OpenJDK 20 with JOL version 0.17 to print out all the memory sizes in the chart. Here’s the code I used:

@Test
public void fixedSizeListsToSizeSix()
{
ArrayList arrayList = new ArrayList();
System.out.println("ArrayList Empty: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList.add(null);
System.out.println("ArrayList 1: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList.add(null);
System.out.println("ArrayList 2: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList.add(null);
System.out.println("ArrayList 3: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList.add(null);
System.out.println("ArrayList 4: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList.add(null);
System.out.println("ArrayList 5: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList.add(null);
System.out.println("ArrayList 6: " +
GraphLayout.parseInstance(arrayList).totalSize());

arrayList = new ArrayList(0);
System.out.println("ArrayList 0: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList.add(null);
System.out.println("ArrayList0 1: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList = new ArrayList(0);
arrayList.add(null);
arrayList.add(null);
System.out.println("ArrayList0 2: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList = new ArrayList(0);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
System.out.println("ArrayList0 3: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList = new ArrayList(0);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
System.out.println("ArrayList0 4: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList = new ArrayList(0);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
System.out.println("ArrayList0 5: " +
GraphLayout.parseInstance(arrayList).totalSize());
arrayList = new ArrayList(0);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
arrayList.add(null);
System.out.println("ArrayList0 6: " +
GraphLayout.parseInstance(arrayList).totalSize());

List list = Lists.fixedSize.empty();
System.out.println("FixedSizeList Empty: " +
GraphLayout.parseInstance(list).totalSize());
list = Lists.fixedSize.of((Object)null);
System.out.println("FixedSizeList 1: " +
GraphLayout.parseInstance(list).totalSize());
list = Lists.fixedSize.of(null, null);;
System.out.println("FixedSizeList 2: " +
GraphLayout.parseInstance(list).totalSize());
list = Lists.fixedSize.of(null, null, null);
System.out.println("FixedSizeList 3: " +
GraphLayout.parseInstance(list).totalSize());
list = Lists.fixedSize.of(null, null, null, null);
System.out.println("FixedSizeList 4: " +
GraphLayout.parseInstance(list).totalSize());
list = Lists.fixedSize.of(null, null, null, null, null);
System.out.println("FixedSizeList 5: " +
GraphLayout.parseInstance(list).totalSize());
list = Lists.fixedSize.of(null, null, null, null, null, null);
System.out.println("FixedSizeList 6: " +
GraphLayout.parseInstance(list).totalSize());
}

In these examples, I used null as the elements, so the memory cost of the lists was the only thing on display. Savings add up quickly here when you have millions of EmptyList, SingletonList, DoubletonList, and TripletonList instances. There were smaller numbers of QuadrupletonList, QuintupletonList and SextupletonList, but enough that the memory savings mattered.

Step 5: Optimize for Fixed Size Sets w/ Sizes 0–4

Nothing will prepare you when you discover how terrible the memory footprint of the java.util.HashSet class is. HashSet is a suboptimal class that can grow quietly in your heap if you’re not careful to use it sparingly and dispose of instances when you are done using them. HashSet performs very well for most use cases where a Set is required, but the cost is unnecessarily high. This cost is the product of using delegation to a HashMap inside of HashSet.

The following chart shows the memory savings today comparing a default sized HashSet (new HashSet()), a zero initial-sized HashSet (new HashSet (0)), and FixedSizeSet instances created using Sets.fixedSize.of() from Eclipse Collections.

I wrote the following test using OpenJDK 20 with JOL version 0.17 to print out all the memory sizes in the chart. Here’s the code I used:

@Test
public void fixedSizeSetsToSizeFour()
{
HashSet hashSet = new HashSet();
System.out.println("HashSet Empty: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet.add(new Object());
System.out.println("HashSet 1: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet.add(new Object());
System.out.println("HashSet 2: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet.add(new Object());
System.out.println("HashSet 3: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet.add(new Object());
System.out.println("HashSet 4: " +
GraphLayout.parseInstance(hashSet).totalSize());

hashSet = new HashSet(0);
System.out.println("HashSet 0: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet.add(new Object());
System.out.println("HashSet0 1: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet = new HashSet(0);
hashSet.add(new Object());
hashSet.add(new Object());
System.out.println("HashSet0 2: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet = new HashSet(0);
hashSet.add(new Object());
hashSet.add(new Object());
hashSet.add(new Object());
System.out.println("HashSet0 3: " +
GraphLayout.parseInstance(hashSet).totalSize());
hashSet = new HashSet(0);
hashSet.add(new Object());
hashSet.add(new Object());
hashSet.add(new Object());
hashSet.add(new Object());
System.out.println("HashSet0 4: " +
GraphLayout.parseInstance(hashSet).totalSize());

Set set = Sets.fixedSize.empty();
System.out.println("FixedSizeSet Empty: " +
GraphLayout.parseInstance(set).totalSize());
set = Sets.fixedSize.of(new Object());
System.out.println("FixedSizeSet 1: " +
GraphLayout.parseInstance(set).totalSize());
set = Sets.fixedSize.of(new Object(), new Object());;
System.out.println("FixedSizeSet 2: " +
GraphLayout.parseInstance(set).totalSize());
set = Sets.fixedSize.of(new Object(), new Object(), new Object());
System.out.println("FixedSizeSet 3: " +
GraphLayout.parseInstance(set).totalSize());
set = Sets.fixedSize.of(new Object(), new Object(), new Object(), new Object());
System.out.println("FixedSizeSet 4: " +
GraphLayout.parseInstance(set).totalSize());
}

With sets, I needed to add data to the sets that would have unique a hashCode and equals combination. If I used null, there only would have been one element in each Set. So there is an extra 16 bytes for each element in the set. Just multiply 16 times the Set size to determine the extra overhead for the elements.

Step 6 and Beyond

There were other important lessons over the years I learned about how to save memory. Object Pooling is usually the most beneficial for loading data from external storage (e.g., database) into long-lived objects in memory. Immutable objects like String and LocalDate are good candidates for pooling. You have to understand the makeup of your data and have a decent number of duplicate strings and dates to see the benefit of pooling. I will not go more into object pooling in this article as it is a topic worthy of its own post.

Finally, if you know the lower and upper ranges of your numeric data and that they will be known to stay within those ranges forever, you can also save memory by using smaller integral or float types to store data in long-lived objects in memory. Using byte, short, and int instead of long The alignment and padding in the memory layout can help you fit more into an object. Be careful to widen your type when doing math (e.g., summing or counting) because you may otherwise encounter silent overflow errors.

The JDK is constantly improving in the memory-efficiency and performance, so new help is constantly on the way. The future of Java is looking good.

No One Has Ever Been Stranded on Mars

Mark Watney was a character in a movie. The story made a great film with memorable quotes, and that is all. No human that we know has ever been to Mars.

You may have never encountered or even heard of an application that creates millions of small List, Set, and Map instances and holds onto them for a long time in memory. That is maybe until now. I had never seen or heard of an application like this until I started working on one in 2004.

There is one potential gotcha with this small collection memory savings strategy. You might wind up trading off memory for performance. I was faced with the problem of getting an application working. I wasn’t concerned with how fast the calls to methods on the collection classes were or whether there would be megamorphic virtual calls in hot code paths to cause significant slowdowns.

When the application was finally running, it was so fast in memory we didn’t notice any bottlenecks caused by megamorphic virtual calls. If performance and throughput are your biggest concern, I recommend using JMH or other performance profilers to measure specific hotspots for tuning. I recommend only tuning for performance if you see a specific performance issue. You’ll be in the best position to determine whether memory or performance is the biggest concern for your application.

I hope you never face this kind of memory problem in the applications you work on today. If you encounter this kind of situation in the future (hey, we might see real folks go to Mars one day), you can leverage the knowledge you have gained here to help you address any memory issues you have. All of the small List, Set, and Map implementations exist in Eclipse Collections today. The JDK also has a small immutable List and Set implementations (List12, ListN, Set12, SetN) that optimize for both throughput and memory. I wish I had access to these classes in the JDK in 2004.

The following chart compares the memory cost of the JDK and Eclipse Collections ImmutableList implementations up to size 11.

If you’re stranded on Mars with an application that won’t work and memory savings for small List, Set and Map instances are exactly what you need to help you, then Eclipse Collections might be your best option. When it comes to raw throughput performance with some great memory savings for very small collections, the JDK (after Java 11) may be the best option today. The JDK now offers a memory-efficiency option that it didn’t have in 2004 when I was stuck working on my application on Mars. Progress is a good thing. Your mileage may vary.

Tracer Bullets in your Java Heap

There is an additional subtle benefit to having named versions of all of the small List, Set, and Map instances in your Java heap. The named classes show up in both jmap and JOL output. When used, these classes tell you more about the distribution of sizes of your collections in memory. Seeing that you have a large number of ArrayList instances tells you nothing about the size of them.

If we output the following code using JOL, you will see the size distribution of your Lists when using Eclipse Collections ImmutableList implementations. The ImmutableList types are similar to their FixedSizeList counterparts but are hand-optimized to size ten instead of six.

@Test
public void ecListOfToSizeEleven()
{
ImmutableList<String>[] array = new ImmutableList[]{
Lists.immutable.of(),
Lists.immutable.of(""),
Lists.immutable.of("", ""),
Lists.immutable.of("", "", ""),
Lists.immutable.of("", "", "", ""),
Lists.immutable.of("", "", "", "", ""),
Lists.immutable.of("", "", "", "", "", ""),
Lists.immutable.of("", "", "", "", "", "", ""),
Lists.immutable.of("", "", "", "", "", "", "", ""),
Lists.immutable.of("", "", "", "", "", "", "", "", ""),
Lists.immutable.of("", "", "", "", "", "", "", "", "", ""),
Lists.immutable.of("", "", "", "", "", "", "", "", "", "", "")
};
Assertions.assertEquals(496L,
GraphLayout.parseInstance(array).totalSize());
System.out.println(GraphLayout.parseInstance(array).toFootprint());
}

The following is the output from JOL after calling GraphLayout.parseInstance(array).toFootprint().

Note: I shortened the package names manually to remove the scroll bar.

COUNT   AVG   SUM   DESCRIPTION
1 16 16 [B
1 64 64 [Ljava.lang.String;
1 24 24 java.lang.String
1 16 16 org.ecl.co.impl.list.imm.ImmutableArrayList
1 56 56 org.ecl.co.impl.list.imm.ImmutableDecapletonList
1 24 24 org.ecl.co.impl.list.imm.ImmutableDoubletonList
1 16 16 org.ecl.co.impl.list.imm.ImmutableEmptyList
1 48 48 org.ecl.co.impl.list.imm.ImmutableNonupletonList
1 48 48 org.ecl.co.impl.list.imm.ImmutableOctupletonList
1 32 32 org.ecl.co.impl.list.imm.ImmutableQuadrupletonList
1 32 32 org.ecl.co.impl.list.imm.ImmutableQuintupletonList
1 40 40 org.ecl.co.impl.list.imm.ImmutableSeptupletonList
1 40 40 org.ecl.co.impl.list.imm.ImmutableSextupletonList
1 16 16 org.ecl.co.impl.list.imm.ImmutableSingletonList
1 24 24 org.ecl.co.impl.list.imm.ImmutableTripletonList
15 496 (total)

Consider the following code and JOL output for the small immutable collections in the JDK to show the difference where class names do not give you as much information.

@Test
public void jdkListOfToSizeEleven()
{
List<String>[] array = new List[]{
List.of(),
List.of(""),
List.of("", ""),
List.of("", "", ""),
List.of("", "", "", ""),
List.of("", "", "", "", ""),
List.of("", "", "", "", "", ""),
List.of("", "", "", "", "", "", ""),
List.of("", "", "", "", "", "", "", ""),
List.of("", "", "", "", "", "", "", "", ""),
List.of("", "", "", "", "", "", "", "", "", ""),
List.of("", "", "", "", "", "", "", "", "", "", "")
};
Assertions.assertEquals(776L,
GraphLayout.parseInstance(array).totalSize());
System.out.println(GraphLayout.parseInstance(array).toFootprint());
}

JOL Output:

COUNT   AVG   SUM   DESCRIPTION
1 16 16 [B
10 43 432 [Ljava.lang.Object;
1 16 16 java.lang.Object
1 24 24 java.lang.String
2 24 48 java.util.ImmutableCollections$List12
10 24 240 java.util.ImmutableCollections$ListN
25 776 (total)

You’ll notice that there are two instances of List12 and ten instances of ListN. If you know that List12 should actually be read as ListOneTwo and not ListTwelve, you will at least know that all instances have a size of one or two.

In 2004, once we had made the changes to our code using the small FixedSizeCollection implementations, we got immediate feedback on the changes to the size distributions of our small collections any time we looked at jmap -histo <pid>. Changes in code or data could have caused changes in these sizes.

Best of Luck on Your Journey

After almost 20 years, I am telling you this story because you can now buy a MacBook Pro with as many cores (12) and as much memory (96gig) as I had access to between 2004–2010 on big Solaris servers. I don’t know if this will result in more applications with large heaps and lots of collections being built now, but I do have to believe it will not result in less.

If you’re heading out for your own Martian application experience soon, remember that there may be solutions available for memory waste issues you might encounter. The JDK is continually improving and finding ways to save memory that were out of reach for me two decades ago. The Core JDK team has prioritized balancing throughput with memory savings, which is great. Eclipse Collections focused primarily on memory savings in the ImmutableCollection implementations, based on my challenging experience in 2004.

Project Lilliput and Project Valhalla will be two of the most important changes to the JDK regarding memory savings and performance. Both projects are complementary. If you’ve never heard of these OpenJDK projects, you should check out the links I provided to them above. We are fortunate to have such an amazing language and library as the JDK that continues to evolve after 28 years.

There are two articles from Aleksey Shipilёv I would recommend reading whether or not you find yourself in a dire situation involving memory or performance:

JVM Anatomy Quark #24: Object Alignment

The Black Magic of (Java) Method Dispatch

Thank you for reading this story. I hope you find the lessons and information in this story useful in your travels.

Enjoy!

I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions.

--

--

Java Champion. Creator of the Eclipse Collections OSS Java library (https://github.com/eclipse/eclipse-collections). Inspired by Smalltalk. Opinions are my own.