Grouping Fruity Collections
Published: 2025-12-01 10:45AM
Two previous blog posts have been inspired by the use of emojis in combination with collections of fruit or pets as per a related Eclipse Collections kata. Those posts had a bit of fun using deep learning, and clustering using k-means to do image recognition and color guessing based on emojis.
This post doesn’t look at machine learning topics. Instead, it looks at a seemingly more mundane but very common task: grouping. Grouping occurs naturally in any dataset where relationships exist.
The groupBy method does the job perfectly for one-to-many relationships.
It appears as one of the "top 25" methods in the previously mentioned Eclipse Collections kata,
and is one of the examples carried over into the aforementioned Groovy blog posts:
assert Fruit.ALL.groupBy(Fruit::getColor) ==
    Multimaps.mutable.list.empty()
        .withKeyMultiValues(RED, Fruit.of('🍎'), Fruit.of('🍒'))
        .withKeyMultiValues(ORANGE, Fruit.of('🍑'), Fruit.of('🍊'))
        .withKeyMultiValues(YELLOW, Fruit.of('🍌'))
        .withKeyMultiValues(MAGENTA, Fruit.of('🍇'))
We have a collection of fruit and we have colors.
The fruit is an enum. The colors are java.awt.Color values.
We won’t show the details of those but see the previous blog posts or the
associated GitHub repo if you want all the details.
As originally presented,
there is a one-to-many relationship between a color and its fruit.
A fruit has one color, but there can be many fruit of a particular color.
That’s what groupBy allows us to explore.
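For readers more used to JDK streams, the same one-to-many idea can be sketched in plain Java with Collectors.groupingBy. This is only an illustration: the Fruit enum below is a simplified, hypothetical stand-in for the one in the repo, and it uses String color names rather than java.awt.Color values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OneToManyGrouping {
    // Hypothetical stand-in for the post's Fruit enum: one typical color per fruit.
    enum Fruit {
        APPLE("RED"), PEACH("ORANGE"), BANANA("YELLOW"),
        CHERRY("RED"), ORANGE("ORANGE"), GRAPE("MAGENTA");

        final String color;
        Fruit(String color) { this.color = color; }
        String getColor() { return color; }
    }

    // Each fruit lands in exactly one bucket, keyed by its single color.
    static Map<String, List<Fruit>> byColor() {
        return Arrays.stream(Fruit.values())
                .collect(Collectors.groupingBy(Fruit::getColor));
    }

    public static void main(String[] args) {
        System.out.println(byColor());
    }
}
```

Because every fruit contributes exactly one key, no fruit can appear in more than one bucket.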
In this post, we want to explore grouping in the context of many-to-many relationships.
As an example, suppose now that rather than coming in just one typical color
(in our case the predominant color of the supplied emoji), multiple colors are possible for
any given fruit: red and green apples, a green unripe banana, and so forth.
So, let’s expand the previous example so that rather than calling getColor to find the color,
we’ll call getColors to find a list of potential colors.
We’ll do that first using Eclipse Collections and then look at various possibilities for JDK collections when using Groovy.
Grouping Eclipse Collections Fruit Salad
We saw earlier that we could do some grouping using the groupBy method offered
by Eclipse Collections classes implementing the RichIterable interface.
In fact, if we combine groupBy with some other methods like flatCollect (flatMap) or the inject
family of methods, we’d be able to represent the many fruit to many color relationship that we now
want to explore. However, the groupByEach method combines several steps into just one
and that’s what we’ll use here.
The groupByEach method didn’t make the cut of being in the "top 25" methods in the Eclipse
Collections kata but is exactly what we need for a many-to-many relationship
(many fruit and many potential colors).
For our example, we’ll add GREEN as a possible color for apples, bananas (unripe), and grapes.
Here is how we can explore the relationship between fruit and colors with these additions:
assert Fruit.ALL.groupByEach(Fruit::getColors) ==
    Multimaps.mutable.list.empty()
        .withKeyMultiValues(GREEN, Fruit.of('🍎'), Fruit.of('🍌'), Fruit.of('🍇'))
        .withKeyMultiValues(RED, Fruit.of('🍎'), Fruit.of('🍒'))
        .withKeyMultiValues(ORANGE, Fruit.of('🍑'), Fruit.of('🍊'))
        .withKeyMultiValues(YELLOW, Fruit.of('🍌'))
        .withKeyMultiValues(MAGENTA, Fruit.of('🍇'))
Grape colors are sometimes known by the juice or wine they produce (red and white) and sometimes by their skin color, green and purple or magenta. We’ll stick with the latter for the purposes of this post.
Grouping JDK Collections Fruit Salad
If we want to achieve the same thing for JDK collections, we have a few options.
We could consider stream functionality but, as with Eclipse Collections, there
are ways to achieve what we want building on the fairly widely known groupBy
functionality offered by Groovy. We’ll explore that next, but first let’s
look at our expected result. It will be similar to what we had for Eclipse
Collections but just using normal JDK lists and maps:
var expected = [
    (GREEN) : [Fruit.of('🍎'), Fruit.of('🍌'), Fruit.of('🍇')],
    (RED) : [Fruit.of('🍎'), Fruit.of('🍒')],
    (ORANGE) : [Fruit.of('🍑'), Fruit.of('🍊')],
    (YELLOW) : [Fruit.of('🍌')],
    (MAGENTA) : [Fruit.of('🍇')]
]
Note that Groovy defaults to keys being String values in its literal map notation,
so we use round brackets around key values so that Groovy will use keys with java.awt.Color
values like we have been using in earlier examples.
Now, we can find fruits by color simply by using groupBy
in combination with collectMany (flatMap) and collectEntries (or we could use inject):
assert expected == Fruit.values()
    .collectMany(f -> f.colors.collect{ c -> [c, f] })
    .groupBy{ c, f -> c }
    .collectEntries{ k, v -> [k, v*.get(1)] }
This works well but isn’t necessarily obvious at first glance.
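For comparison, here is a rough sketch of the stream option mentioned at the start of this section, written in plain Java. It follows the same three steps: flatMap plays the role of collectMany, and groupingBy with a mapping collector covers groupBy plus collectEntries. The Fruit enum is again a simplified, hypothetical stand-in using String color names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ManyToManyGrouping {
    // Hypothetical stand-in for the post's Fruit enum: several possible colors per fruit.
    enum Fruit {
        APPLE("RED", "GREEN"), PEACH("ORANGE"), BANANA("YELLOW", "GREEN"),
        CHERRY("RED"), ORANGE("ORANGE"), GRAPE("MAGENTA", "GREEN");

        final List<String> colors;
        Fruit(String... colors) { this.colors = List.of(colors); }
        List<String> getColors() { return colors; }
    }

    static Map<String, List<Fruit>> byEachColor() {
        return Arrays.stream(Fruit.values())
                // like collectMany: one (color, fruit) pair per color of each fruit
                .flatMap(f -> f.getColors().stream().map(c -> Map.entry(c, f)))
                // like groupBy + collectEntries: group on the color, keep only the fruit
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(byEachColor());
    }
}
```

The intermediate pairs are what make a fruit show up under every one of its colors, which is exactly the many-to-many behavior we are after.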
Grouping JDK Collections Fruit Salad with GQuery
Dealing with many-to-many relationships is very common in database systems. Query languages like SQL have special support for querying such relationships. It should come as no surprise then, that Groovy’s integrated query technology, GQuery (groovy-ginq), would also support such relationships.
Here is the same example again using GQuery:
assert expected == GQL {
    from f in Fruit.values()
    crossjoin c in Fruit.values()*.colors.sum().toSet()
    where c in f.colors
    groupby c
    select c, list(f)
}.collectEntries()
The crossjoin gives us the cross-product and the where and groupby clauses
select the desired elements. The collectEntries at the end converts from GQuery’s
tabular format to the map used for our expectation.
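To make the roles of the query clauses more concrete, here is a hand-written Java sketch of the same cross-join, filter, and group steps using nested loops. It is not how GQuery is implemented; it only mirrors the clauses. The data is a hypothetical map from fruit name to color names, standing in for the Fruit enum.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CrossJoinGrouping {
    // Hypothetical data: fruit name -> possible colors (mirrors the post's example).
    static final Map<String, List<String>> COLORS_OF = new LinkedHashMap<>();
    static {
        COLORS_OF.put("APPLE", List.of("RED", "GREEN"));
        COLORS_OF.put("PEACH", List.of("ORANGE"));
        COLORS_OF.put("BANANA", List.of("YELLOW", "GREEN"));
        COLORS_OF.put("CHERRY", List.of("RED"));
        COLORS_OF.put("ORANGE", List.of("ORANGE"));
        COLORS_OF.put("GRAPE", List.of("MAGENTA", "GREEN"));
    }

    static Map<String, List<String>> fruitByColor() {
        // The distinct colors form the right-hand side of the cross join.
        Set<String> allColors = new LinkedHashSet<>();
        COLORS_OF.values().forEach(allColors::addAll);

        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String fruit : COLORS_OF.keySet()) {                 // from f
            for (String color : allColors) {                      // crossjoin c
                if (COLORS_OF.get(fruit).contains(color)) {       // where c in f.colors
                    result.computeIfAbsent(color, k -> new ArrayList<>())
                          .add(fruit);                            // groupby c, list(f)
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fruitByColor());
    }
}
```

The loops visit every fruit and color combination, and the filter discards the pairs that don’t belong together, so only genuine fruit-color pairs reach the grouping step.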
Grouping JDK Collections Fruit Salad in Groovy 6
Inspired by the groupByEach method from
Eclipse Collections and the examples in the
Eclipse Collections Categorically book,
the Groovy team has recently added a groupByMany method. It is in Groovy 6,
which is still at the alpha/snapshot stage, so it is subject to change.
Using groupByMany our example becomes:
assert expected == Fruit.values().groupByMany(Fruit::getColors)
That was easy! And that’s the appeal of adding this method to Groovy.
Let’s look at some other variations. One variant takes a second closure which allows the value to be transformed (mapped). In our case, we’ll just get the emoji representation for our fruit rather than the enum used in previous examples:
assert Fruit.values().groupByMany(Fruit::getEmoji, Fruit::getColors) == [
    (GREEN) : ['🍎', '🍌', '🍇'],
    (RED) : ['🍎', '🍒'],
    (ORANGE) : ['🍑', '🍊'],
    (YELLOW) : ['🍌'],
    (MAGENTA) : ['🍇']
]
In more typical cases, you might map from a domain class, e.g. Person,
to some desired value, e.g. a String name for that person.
As another example, let’s group the fruit by the vowels they contain:
var vowels = 'AEIOU'.toSet()
var vowelsOf = { String word -> word.toSet().intersect(vowels) }
assert Fruit.values().groupByMany(Fruit::getEmoji, f -> vowelsOf(f.name())) == [
    A: ['🍎', '🍑', '🍌', '🍊', '🍇'],
    E: ['🍎', '🍑', '🍒', '🍊', '🍇'],
    O: ['🍊']
]
Our ORANGE fruit makes all three lists. BANANA and CHERRY each appear
in just one list. The other fruit all have both A and E.
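If you are curious what a value-transforming grouping like this might look like without the new method, here is a small Java sketch. The groupByMany helper below is our own hypothetical function, not Groovy's implementation, and its parameter order is simply chosen to match the examples above (value function first, then keys function). Values are lower-cased fruit names rather than emojis.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

public class VowelGrouping {
    // Hypothetical helper: adds the transformed value under every key produced for an item.
    static <T, K, V> Map<K, List<V>> groupByMany(List<T> items,
                                                 Function<T, V> valueFn,
                                                 Function<T, ? extends Iterable<K>> keysFn) {
        Map<K, List<V>> result = new LinkedHashMap<>();
        for (T item : items) {
            V value = valueFn.apply(item);
            for (K key : keysFn.apply(item)) {
                result.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
            }
        }
        return result;
    }

    // The distinct upper-case vowels found in a word.
    static Set<Character> vowelsOf(String word) {
        Set<Character> found = new TreeSet<>();
        for (char ch : word.toCharArray()) {
            if ("AEIOU".indexOf(ch) >= 0) {
                found.add(ch);
            }
        }
        return found;
    }

    static Map<Character, List<String>> fruitNamesByVowel() {
        List<String> names = List.of("APPLE", "PEACH", "BANANA", "CHERRY", "ORANGE", "GRAPE");
        return groupByMany(names, name -> name.toLowerCase(), VowelGrouping::vowelsOf);
    }

    public static void main(String[] args) {
        System.out.println(fruitNamesByVowel());
    }
}
```

Keeping the value function and the keys function separate means the grouped lists can hold whatever representation suits the caller, while the keys still come from the original item.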
There is also a variant of groupByMany taking no parameters.
It caters for Maps where each value is already a list of the keys to group by.
As an example, suppose we want to buy fruit locally.
I’ll roughly base this on subtropical Brisbane, but you could modify as appropriate
if you are interested.
We might now be interested in knowing when seasonal fruit will be available:
var availability = [
    '🍎': ['Spring'],
    '🍌': ['Spring', 'Summer', 'Autumn', 'Winter'],
    '🍇': ['Spring', 'Autumn'],
    '🍒': ['Autumn'],
    '🍑': ['Spring']
]
assert availability.groupByMany() == [
    Winter: ['🍌'],
    Autumn: ['🍌', '🍇', '🍒'],
    Summer: ['🍌'],
    Spring: ['🍎', '🍌', '🍇', '🍑']
]
[Sorry U.S. folks, Autumn has the same number of letters as the other season names and makes the last map look prettier - Fall just didn’t cut it this time!]
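The availability example amounts to inverting a map of lists: every entry in a value list becomes a key, and the original key becomes one of its values. Here is a minimal Java sketch of that idea. The invert helper is hypothetical, and the sample data uses fruit names in place of emojis.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SeasonalInversion {
    // Hypothetical helper: turns key -> [values] into value -> [keys].
    static <K, V> Map<V, List<K>> invert(Map<K, List<V>> source) {
        Map<V, List<K>> result = new LinkedHashMap<>();
        source.forEach((key, values) -> {
            for (V value : values) {
                result.computeIfAbsent(value, v -> new ArrayList<>()).add(key);
            }
        });
        return result;
    }

    // Sample data mirroring the availability map above, with names instead of emojis.
    static Map<String, List<String>> availability() {
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("APPLE", List.of("Spring"));
        map.put("BANANA", List.of("Spring", "Summer", "Autumn", "Winter"));
        map.put("GRAPE", List.of("Spring", "Autumn"));
        map.put("CHERRY", List.of("Autumn"));
        map.put("PEACH", List.of("Spring"));
        return map;
    }

    static Map<String, List<String>> bySeason() {
        return invert(availability());
    }

    public static void main(String[] args) {
        System.out.println(bySeason());
    }
}
```

Using LinkedHashMap keeps both maps in insertion order, so fruit appear within each season in the order they were listed.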
Further information
- Repo with example code: https://github.com/paulk-asert/fruity-eclipse-collections
- Eclipse Collections homepage: https://www.eclipse.org/collections/
- Eclipse Collections Categorically: https://www.amazon.com/Eclipse-Collections-Categorically-Level-programming/dp/B0DZVK69D3
