Handling maps in Groovy vs Java
Discover the differences in map handling between Groovy and Java with this hands-on demo.
Image by: WOCinTech Chat. Modified by Opensource.com. CC BY-SA 4.0
Java is a great programming language, but sometimes I want a Java-like language that's just a bit more flexible and compact. That's when I opt for Groovy.
In a recent article, I reviewed some of the differences between creating and initializing maps in Groovy and doing the same thing in Java. In brief, Groovy has a concise syntax for setting up maps and accessing map entries compared to the effort necessary in Java.
This article will delve into more differences in map handling between Groovy and Java. For that purpose, I will use the sample table of employees used for demoing the JavaScript DataTables library. To follow along, start by making sure you have recent versions of Groovy and Java installed on your computer.
Install Java and Groovy
Groovy is based on Java and requires a Java installation as well. A recent, decent version of both Java and Groovy might already be in your Linux distribution's repositories, or you can download and install Groovy from the Apache Groovy website. A good option for Linux users is SDKMan, which can be used to get multiple versions of Java, Groovy, and many other related tools. For this article, I'm using SDKMan's releases of:
- Java: version 11.0.12-open of OpenJDK 11
- Groovy: version 3.0.8.
Back to the problem: maps
First, in my experience, maps and lists (or at least arrays) often end up in the same program. For example, processing an input file is very similar to passing over a list; often, I do that when I want to categorize data encountered in the input file (or list), storing some kind of value in lookup tables, which are just maps.
Second, Java 8 introduced the whole Streams functionality and lambdas (or anonymous functions). In my experience, converting input data (or lists) into maps often involves using Java Streams. Moreover, Java Streams are at their most flexible when dealing with streams of typed objects, providing grouping and accumulation facilities out of the box.
Employee list processing in Java
Here's a concrete example based on those fictitious employee records. Below is a Java program that defines an Employee class to hold the employee information, builds a list of Employee instances, and processes that list in a few different ways:
1 import java.lang.*;
2 import java.util.Arrays;
3 import java.util.Locale;
4 import java.time.format.DateTimeFormatter;
5 import java.time.LocalDate;
6 import java.time.format.DateTimeParseException;
7 import java.text.NumberFormat;
8 import java.text.ParseException;
9 import java.util.stream.Collectors;
10 public class Test31 {
11 static public void main(String args[]) {
12 var employeeList = Arrays.asList(
13 new Employee("Tiger Nixon", "System Architect",
14 "Edinburgh", "5421", "2011/04/25", "$320,800"),
15 new Employee("Garrett Winters", "Accountant",
16 "Tokyo", "8422", "2011/07/25", "$170,750"),
...
81 new Employee("Martena Mccray", "Post-Sales support",
82 "Edinburgh", "8240", "2011/03/09", "$324,050"),
83 new Employee("Unity Butler", "Marketing Designer",
84 "San Francisco", "5384", "2009/12/09", "$85,675")
85 );
86 // calculate the average salary across the entire company
87 var companyAvgSal = employeeList.
88 stream().
89 collect(Collectors.averagingDouble(Employee::getSalary));
90 System.out.println("company avg salary = " + companyAvgSal);
91 // calculate the average salary for each location,
92 // compare to the company average
93 var locationAvgSal = employeeList.
94 stream().
95 collect(Collectors.groupingBy((Employee e) ->
96 e.getLocation(),
97 Collectors.averagingDouble(Employee::getSalary)));
98 locationAvgSal.forEach((k,v) ->
99 System.out.println(k + " avg salary = " + v +
100 "; diff from avg company salary = " +
101 (v - companyAvgSal)));
102 // show the employees in Edinburgh approach #1
103 System.out.print("employee(s) in Edinburgh (approach #1):");
104 var employeesInEdinburgh = employeeList.
105 stream().
106 filter(e -> e.getLocation().equals("Edinburgh")).
107 collect(Collectors.toList());
108 employeesInEdinburgh.
109 forEach(e ->
110 System.out.print(" " + e.getSurname() + "," +
111 e.getGivenName()));
112 System.out.println();
113 // group employees by location
114 var employeesByLocation = employeeList.
115 stream().
116 collect(Collectors.groupingBy(Employee::getLocation));
117 // show the employees in Edinburgh approach #2
118 System.out.print("employee(s) in Edinburgh (approach #2):");
119 employeesByLocation.get("Edinburgh").
120 forEach(e ->
121 System.out.print(" " + e.getSurname() + "," +
122 e.getGivenName()));
123 System.out.println();
124 }
125 }
126 class Employee {
127 private String surname;
128 private String givenName;
129 private String role;
130 private String location;
131 private int extension;
132 private LocalDate hired;
133 private double salary;
134 public Employee(String fullName, String role, String location,
135 String extension, String hired, String salary) {
136 var nn = fullName.split(" ");
137 if (nn.length > 1) {
138 this.surname = nn[1];
139 this.givenName = nn[0];
140 } else {
141 this.surname = nn[0];
142 this.givenName = "";
143 }
144 this.role = role;
145 this.location = location;
146 try {
147 this.extension = Integer.parseInt(extension);
148 } catch (NumberFormatException nfe) {
149 this.extension = 0;
150 }
151 try {
152 this.hired = LocalDate.parse(hired,
153 DateTimeFormatter.ofPattern("yyyy/MM/dd"));
154 } catch (DateTimeParseException dtpe) {
155 this.hired = LocalDate.EPOCH;
156 }
157 try {
158 this.salary = NumberFormat.getCurrencyInstance(Locale.US).
159 parse(salary).doubleValue();
160 } catch (ParseException pe) {
161 this.salary = 0d;
162 }
163 }
164 public String getSurname() { return this.surname; }
165 public String getGivenName() { return this.givenName; }
166 public String getLocation() { return this.location; }
167 public int getExtension() { return this.extension; }
168 public LocalDate getHired() { return this.hired; }
169 public double getSalary() { return this.salary; }
170 }
Wow, that's a lot of code for a simple demo program! I'll go through it in chunks first.
Starting at the end, lines 126 through 170 define the Employee class used to store employee data. The most important thing to mention here is that the fields of the employee record are of different types, and in Java that generally leads to defining this type of class. You could make this code a bit more compact by using Project Lombok's @Data annotation to automatically generate the getters (and setters) for the Employee class. In more recent versions of Java, I can declare these sorts of things as a record rather than a class, since the whole point is to store data. Storing the data as a list of Employee instances facilitates the use of Java streams.
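As a rough sketch of the record idea mentioned above, the seven fields could be declared as shown here. This assumes Java 16 or later (newer than the Java 11 used for this article), and the names EmployeeRecord and RecordDemo are hypothetical, chosen only for illustration. It also skips the string parsing done in the original constructor and takes already-typed values.

```java
import java.time.LocalDate;
import java.util.List;

class RecordDemo {
    // Two of the article's sample employees, built from already-parsed values
    static List<EmployeeRecord> sample() {
        return List.of(
            new EmployeeRecord("Nixon", "Tiger", "System Architect", "Edinburgh",
                5421, LocalDate.of(2011, 4, 25), 320800d),
            new EmployeeRecord("Winters", "Garrett", "Accountant", "Tokyo",
                8422, LocalDate.of(2011, 7, 25), 170750d));
    }

    public static void main(String[] args) {
        // Record accessors are named after the components: salary(), not getSalary()
        sample().forEach(e ->
            System.out.println(e.surname() + "," + e.givenName() + " earns " + e.salary()));
    }
}

// Hypothetical record version of the Employee class (requires Java 16+).
// The compiler generates the constructor, accessors, equals(), hashCode(), and toString().
record EmployeeRecord(String surname, String givenName, String role,
                      String location, int extension, LocalDate hired, double salary) {}
```

With a record, method references in the stream code would become EmployeeRecord::salary and EmployeeRecord::location instead of the getter names.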
Lines 12 through 85 create the list of Employee instances, so now you've already dealt with 119 of 170 lines.
There are nine lines of import statements up front. Interestingly, there are no map-related imports! This is partly because I'm using stream methods that yield maps as their results, and partly because I'm using the var keyword to declare variables, so the type is inferred by the compiler.
The interesting parts of the above code happen in lines 86 through 123.
In lines 87-90, I convert employeeList into a stream (line 88) and then use collect() with Collectors.averagingDouble(), passing it the Employee::getSalary method reference (line 89), to calculate the average salary across the whole company. This is pure functional list processing; no maps are involved.
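To see the averaging step in isolation, here is a minimal sketch that applies Collectors.averagingDouble() to a plain list of salaries. The class name AverageDemo and the helper average() are made up for this illustration, and the data is just the four salaries visible in the listing above, not the full employee table.

```java
import java.util.List;
import java.util.stream.Collectors;

class AverageDemo {
    // Average of a list of salaries; averagingDouble() yields 0.0 for an empty list
    static double average(List<Double> salaries) {
        return salaries.stream()
            .collect(Collectors.averagingDouble(Double::doubleValue));
    }

    public static void main(String[] args) {
        // Salaries of the four employees shown in the abbreviated listing
        var salaries = List.of(320800d, 170750d, 324050d, 85675d);
        System.out.println("avg salary = " + average(salaries));
    }
}
```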
In lines 93-101, I convert employeeList into a stream again. I then use the Collectors.groupingBy() method to create a map whose keys are employee locations, returned by e.getLocation(), and whose values are the average salary for each location, returned by Collectors.averagingDouble(), which is again given the Employee::getSalary method but now applied to each employee in the location subset rather than the entire company. That is, the groupingBy() method creates subsets by location, which are then averaged. Lines 98-101 use forEach() to step through the map entries, printing the location, the average salary, and the difference between the location average and the company average.
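The grouping-then-averaging step can be sketched on a cut-down data set as follows. The Emp class here is a hypothetical stand-in for Employee that carries only the fields needed, and the sample holds just the four employees visible in the listing, so the numbers differ from the full program's output.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupAverageDemo {
    // Minimal stand-in for the article's Employee class (illustration only)
    static class Emp {
        final String location;
        final double salary;
        Emp(String location, double salary) {
            this.location = location;
            this.salary = salary;
        }
        String getLocation() { return location; }
        double getSalary() { return salary; }
    }

    static List<Emp> sample() {
        return List.of(
            new Emp("Edinburgh", 320800d),
            new Emp("Tokyo", 170750d),
            new Emp("Edinburgh", 324050d),
            new Emp("San Francisco", 85675d));
    }

    // groupingBy() splits the stream by location; the downstream collector
    // averages the salaries inside each group
    static Map<String, Double> averageByLocation(List<Emp> emps) {
        return emps.stream()
            .collect(Collectors.groupingBy(Emp::getLocation,
                Collectors.averagingDouble(Emp::getSalary)));
    }

    public static void main(String[] args) {
        double companyAvg = sample().stream()
            .collect(Collectors.averagingDouble(Emp::getSalary));
        averageByLocation(sample()).forEach((loc, avg) ->
            System.out.println(loc + " avg salary = " + avg +
                "; diff from avg company salary = " + (avg - companyAvg)));
    }
}
```

Spelling out the return type as Map<String, Double> shows what var infers in the original program.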
Now, suppose you wanted to look at just those employees located in Edinburgh. One way to accomplish this is shown in lines 103-112, where I use the stream filter() method to create a list of only those employees based in Edinburgh and the forEach() method to print their names. No maps here, either.
Another way to solve this problem is shown in lines 113-123. In this approach, I create a map where each entry holds a list of employees by location. First, in lines 113-116, I use the groupingBy() method to produce the map I want, with keys of employee locations whose values are sublists of employees at that location. Then, in lines 117-123, I use the forEach() method to print out the sublist of names of employees at the Edinburgh location.
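The two approaches can be put side by side in a small sketch. Again, Emp is a hypothetical cut-down stand-in for Employee, and the helper names are invented for this example. One deliberate difference from the program above: the sketch uses getOrDefault() so that a location with no employees yields an empty list, whereas a plain get() would return null for a missing key.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FilterVsGroupDemo {
    // Minimal stand-in for the article's Employee class (illustration only)
    static class Emp {
        final String surname;
        final String location;
        Emp(String surname, String location) {
            this.surname = surname;
            this.location = location;
        }
        String getSurname() { return surname; }
        String getLocation() { return location; }
    }

    static List<Emp> sample() {
        return List.of(
            new Emp("Nixon", "Edinburgh"),
            new Emp("Winters", "Tokyo"),
            new Emp("Mccray", "Edinburgh"),
            new Emp("Butler", "San Francisco"));
    }

    // Approach #1: filter the stream down to the one location of interest
    static List<String> surnamesByFilter(List<Emp> emps, String location) {
        return emps.stream()
            .filter(e -> e.getLocation().equals(location))
            .map(Emp::getSurname)
            .collect(Collectors.toList());
    }

    // Approach #2: build the full location -> employees map, then look up one key
    static List<String> surnamesByGrouping(List<Emp> emps, String location) {
        Map<String, List<Emp>> byLocation = emps.stream()
            .collect(Collectors.groupingBy(Emp::getLocation));
        return byLocation.getOrDefault(location, List.of()).stream()
            .map(Emp::getSurname)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(surnamesByFilter(sample(), "Edinburgh"));
        System.out.println(surnamesByGrouping(sample(), "Edinburgh"));
    }
}
```

The grouping approach does more work up front but pays off when several locations need to be looked up from the same map.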
When we compile and run the above, the output is:
company avg salary = 292082.5
San Francisco avg salary = 284703.125; diff from avg company salary = -7379.375
New York avg salary = 410158.3333333333; diff from avg company salary = 118075.83333333331
Singapore avg salary = 357650.0; diff from avg company salary = 65567.5
Tokyo avg salary = 206087.5; diff from avg company salary = -85995.0
London avg salary = 322476.25; diff from avg company salary = 30393.75
Edinburgh avg salary = 261940.7142857143; diff from avg company salary = -30141.78571428571
Sydney avg salary = 90500.0; diff from avg company salary = -201582.5
employee(s) in Edinburgh (approach #1): Nixon,Tiger Kelly,Cedric Frost,Sonya Flynn,Quinn Rios,Dai Joyce,Gavin Mccray,Martena
employee(s) in Edinburgh (approach #2): Nixon,Tiger Kelly,Cedric Frost,Sonya Flynn,Quinn Rios,Dai Joyce,Gavin Mccray,Martena
Employee list processing in Groovy
Groovy has always provided enhanced facilities for processing lists and maps, partly by extending the Java Collections library and partly by providing closures, which are somewhat like lambdas.
One outcome of this is that maps in Groovy can easily be used with different types of values. As a result, you aren't pushed into making the auxiliary Employee class; instead, you can just use a map. Let's examine a Groovy version of the same functionality:
1 import java.util.Locale
2 import java.time.format.DateTimeFormatter
3 import java.time.LocalDate
4 import java.time.format.DateTimeParseException
5 import java.text.NumberFormat
6 import java.text.ParseException
7 def employeeList = [
8 ["Tiger Nixon", "System Architect", "Edinburgh",
9 "5421", "2011/04/25", "\$320,800"],
10 ["Garrett Winters", "Accountant", "Tokyo",
11 "8422", "2011/07/25", "\$170,750"],
...
76 ["Martena Mccray", "Post-Sales support", "Edinburgh",
77 "8240", "2011/03/09", "\$324,050"],
78 ["Unity Butler", "Marketing Designer", "San Francisco",
79 "5384", "2009/12/09", "\$85,675"]
80 ].collect { ef ->
81 def surname, givenName, role, location, extension, hired, salary
82 def nn = ef[0].split(" ")
83 if (nn.length > 1) {
84 surname = nn[1]
85 givenName = nn[0]
86 } else {
87 surname = nn[0]
88 givenName = ""
89 }
90 role = ef[1]
91 location = ef[2]
92 try {
93 extension = Integer.parseInt(ef[3]);
94 } catch (NumberFormatException nfe) {
95 extension = 0;
96 }
97 try {
98 hired = LocalDate.parse(ef[4],
99 DateTimeFormatter.ofPattern("yyyy/MM/dd"));
100 } catch (DateTimeParseException dtpe) {
101 hired = LocalDate.EPOCH;
102 }
103 try {
104 salary = NumberFormat.getCurrencyInstance(Locale.US).
105 parse(ef[5]).doubleValue();
106 } catch (ParseException pe) {
107 salary = 0d;
108 }
109 [surname: surname, givenName: givenName, role: role,
110 location: location, extension: extension, hired: hired, salary: salary]
111 }
112 // calculate the average salary across the entire company
113 def companyAvgSal = employeeList.average { e -> e.salary }
114 println "company avg salary = " + companyAvgSal
115 // calculate the average salary for each location,
116 // compare to the company average
117 def locationAvgSal = employeeList.groupBy { e ->
118 e.location
119 }.collectEntries { l, el ->
120 [l, el.average { e -> e.salary }]
121 }
122 locationAvgSal.each { l, a ->
123 println l + " avg salary = " + a +
124 "; diff from avg company salary = " + (a - companyAvgSal)
125 }
126 // show the employees in Edinburgh approach #1
127 print "employee(s) in Edinburgh (approach #1):"
128 def employeesInEdinburgh = employeeList.findAll { e ->
129 e.location == "Edinburgh"
130 }
131 employeesInEdinburgh.each { e ->
132 print " " + e.surname + "," + e.givenName
133 }
134 println()
135 // group employees by location
136 def employeesByLocation = employeeList.groupBy { e ->
137 e.location
138 }
139 // show the employees in Edinburgh approach #2
140 print "employee(s) in Edinburgh (approach #2):"
141 employeesByLocation["Edinburgh"].each { e ->
142 print " " + e.surname + "," + e.givenName
143 }
144 println()
Because I am just writing a script here, I don't need to put the program body inside a method inside a class; Groovy handles that for us.
In lines 1-6, I still need to import the classes needed for the data parsing. Groovy imports quite a bit of useful stuff by default, including java.lang.* and java.util.*.
In lines 7-80, I use Groovy's syntactic support for lists as comma-separated values bracketed by [ and ]. In this case, there is a list of lists; each sublist is the employee data. Notice that you need the \ in front of the $ in the salary field. This is because a $ occurring inside a string surrounded by double quotes indicates the presence of a field whose value is to be interpolated into the string. An alternative would be to use single quotes.
But I don't want to work with a list of lists; I would rather have a list of maps analogous to the list of Employee class instances in the Java version. I use the Groovy Collection.collect() method in lines 80-111 to take apart each sublist of employee data and convert it into a map. The collect method takes a Groovy Closure argument, and the syntax for creating a closure surrounds the code with { and } and lists the parameters as a, b, c -> in a manner similar to Java's lambdas. Most of the code looks quite similar to the constructor method in the Java Employee class, except that there are items in the sublist rather than arguments to the constructor. However, the last two lines:
[surname: surname, givenName: givenName, role: role,
location: location, extension: extension, hired: hired, salary: salary]
create a map with keys surname, givenName, role, location, extension, hired, and salary. And, since this is the last expression in the closure, the value returned to the caller is this map. No need for a return statement. No need to quote these keys; Groovy assumes they are strings. In fact, if they were variables, you would need to put them in parentheses to indicate the need to evaluate them. The value assigned to each key appears on its right side. Note that this is a map whose values are of different types: The first four are String, then int, LocalDate, and double. It would have been possible to define the sublists with elements of those different types, but I chose to take this approach because the data would often be read in as string values from a text file.
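For comparison, here is a hedged sketch of what the same mixed-type map might look like on the Java side if you skipped the Employee class. It is not part of the original program; MixedMapDemo and employee() are names invented for this illustration. The values have to be declared as Object, and every typed read needs a cast, which is part of why Java code tends to end up with a dedicated class instead.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;

class MixedMapDemo {
    // One employee as a map with values of different types, mirroring the Groovy map
    static Map<String, Object> employee() {
        Map<String, Object> e = new LinkedHashMap<>();
        e.put("surname", "Nixon");
        e.put("givenName", "Tiger");
        e.put("role", "System Architect");
        e.put("location", "Edinburgh");
        e.put("extension", 5421);                    // boxed as Integer
        e.put("hired", LocalDate.of(2011, 4, 25));
        e.put("salary", 320800d);                    // boxed as Double
        return e;
    }

    public static void main(String[] args) {
        var e = employee();
        // The compiler only knows the values are Objects, so casts are required
        double salary = (Double) e.get("salary");
        LocalDate hired = (LocalDate) e.get("hired");
        System.out.println(e.get("surname") + " hired in " + hired.getYear() +
            ", salary " + salary);
    }
}
```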
The interesting bits appear in lines 112-144. I've kept the same kind of processing steps as in the Java version.
In lines 112-114, I use the Groovy Collection average() method, which like collect() takes a Closure argument, here iterating over the list of employee maps and picking out the salary value. Note that using these methods on the Collection class means you don't have to learn how to transform lists, maps, or some other element to streams and then learn the stream methods to handle your calculations, as in Java. For those who like Java Streams, they are available in newer Groovy versions.
In lines 115-125, I calculate the average salary by location. First, in lines 117-119, I use the Collection groupBy() method to transform employeeList, which is a list of maps, into a map whose keys are the location values and whose values are sublists of the employee maps pertaining to that location. Then I process those map entries with the collectEntries() method, using the average() method to compute the average salary for each location.
Note that collectEntries() passes each key (location) and value (employee sublist at that location) into the closure (the l, el -> parameters) and expects a two-element list of key (location) and value (average salary at that location) to be returned, converting those into map entries. Once I have the map of average salaries by location, locationAvgSal, I can print it out using the Collection each() method, which also takes a closure. When each() is applied to a map, it passes in the key (location) and value (average salary) in the same way as collectEntries().
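For readers coming from the Java side, the two-step shape of groupBy() followed by collectEntries() can be imitated with streams, roughly as sketched here. This is only an analogy, not how the Java program above does it (that version folds both steps into one groupingBy() call). The class TwoStepDemo, its helpers, and the use of (location, salary) pairs are assumptions made to keep the example short.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TwoStepDemo {
    // (location, salary) pairs for the four employees visible in the listing
    static List<Map.Entry<String, Double>> sample() {
        return List.of(
            Map.entry("Edinburgh", 320800d),
            Map.entry("Tokyo", 170750d),
            Map.entry("Edinburgh", 324050d),
            Map.entry("San Francisco", 85675d));
    }

    // Step 1, like groupBy: location -> list of salaries at that location
    static Map<String, List<Double>> salariesByLocation(List<Map.Entry<String, Double>> pairs) {
        return pairs.stream()
            .collect(Collectors.groupingBy(Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    // Step 2, like collectEntries: map each (location, salaries) entry
    // to a (location, average salary) entry
    static Map<String, Double> averages(Map<String, List<Double>> grouped) {
        return grouped.entrySet().stream()
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                en -> en.getValue().stream()
                    .mapToDouble(Double::doubleValue)
                    .average()
                    .orElse(0d)));
    }

    public static void main(String[] args) {
        averages(salariesByLocation(sample())).forEach((loc, avg) ->
            System.out.println(loc + " avg salary = " + avg));
    }
}
```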
In lines 126-134, I filter the employeeList to get a sublist of employeesInEdinburgh, using the findAll() method, which is analogous to the Java Streams filter() method. And again, I use the each() method to print out the sublist of employees in Edinburgh.
In lines 135-144, I take the alternative approach of grouping the employeeList into a map of employee sublists at each location, employeesByLocation. Then in lines 139-144, I select the employee sublist at Edinburgh, using the expression employeesByLocation["Edinburgh"] and the each() method to print out the sublist of employee names at that location.
Why I often prefer Groovy
Maybe it's just my familiarity with Groovy, built up over the last 12 years or so, but I feel more comfortable with the Groovy approach to enhancing Collection with all these methods that take a closure as an argument, rather than the Java approach of converting the list, map, or whatever is at hand to a stream and then using streams, lambdas, and data classes to handle the processing steps. I seem to spend a lot more time with the Java equivalents before I get something working.
I'm also a huge fan of strong static typing and parameterized types, such as a Map declared with explicit key and Employee value types, as found in Java. However, on a day-to-day basis, I find that the more relaxed approach of lists and maps accommodating different types does a better job of supporting me in the real world of data without requiring a lot of extra code. Dynamic typing can definitely come back to bite the programmer. Still, even knowing that I can turn static type checking on in Groovy, I bet I haven't done so more than a handful of times. Maybe my appreciation for Groovy comes from my work, which usually involves bashing a bunch of data into shape and then analyzing it; I'm certainly not your average developer. So is Groovy really a more Pythonic Java? Food for thought.
I would love to see in both Java and Groovy a few more facilities like average() and averagingDouble(). Two-argument versions to produce weighted averages and statistical methods beyond averaging (like median, standard deviation, and so forth) would also be helpful. Tabnine offers interesting suggestions on implementing some of these.
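In the meantime, helpers of this kind are not hard to write by hand. The following is a minimal sketch under my own assumptions, not an existing Java or Groovy API: the class StatsSketch and its three methods are hypothetical names, the inputs are plain double arrays, and the standard deviation uses the sample (n - 1) form.

```java
import java.util.Arrays;

class StatsSketch {
    // Weighted mean: sum(w[i] * x[i]) / sum(w[i])
    static double weightedAverage(double[] values, double[] weights) {
        if (values.length != weights.length) {
            throw new IllegalArgumentException("values and weights differ in length");
        }
        double weightedSum = 0d;
        double weightSum = 0d;
        for (int i = 0; i < values.length; i++) {
            weightedSum += values[i] * weights[i];
            weightSum += weights[i];
        }
        if (weightSum == 0d) {
            throw new IllegalArgumentException("weights sum to zero");
        }
        return weightedSum / weightSum;
    }

    // Median: middle value of the sorted data, or the mean of the two middle values
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2d;
    }

    // Sample standard deviation, computed in two passes
    static double sampleStdDev(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double mean = Arrays.stream(values).average().orElse(0d);
        double sumSq = 0d;
        for (double v : values) {
            sumSq += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumSq / (values.length - 1));
    }

    public static void main(String[] args) {
        double[] salaries = {320800d, 170750d, 324050d, 85675d};
        System.out.println("median salary = " + median(salaries));
        System.out.println("salary std dev = " + sampleStdDev(salaries));
    }
}
```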
Groovy resources
The Apache Groovy site has a lot of great documentation. Other good sources include the reference page for Groovy enhancements to the Java Collection class, the more tutorial-like introduction to working with collections, and Mr. Haki. The Baeldung site provides a lot of helpful how-tos in Java and Groovy. And a really great reason to learn Groovy is to learn Grails, a wonderfully productive full-stack web framework built on top of excellent components like Hibernate, Spring Boot, and Micronaut.
via: https://opensource.com/article/22/6/maps-groovy-vs-java
Author: Chris Hermansen. Topic selected by: lkxed.