Streams#
Stream API simplifies the process of manupulating collection of objects. It supports operations like map
, filter
, limit
, reduce
, find
, match
and sort
in a declarative way.
Collectors#
Collectors in Java are a part of the java.util.stream
package and are typically used with streams to perform various operations such as grouping, summarizing, and partitioning data.
Collecting to a List:
List<String> list = stream.collect(Collectors.toList());
Collecting to a Set:
Set<String> set = stream.collect(Collectors.toSet());
Joining Strings:
String result = stream.collect(Collectors.joining(", "));
Summarizing Integers:
IntSummaryStatistics summary = stream.collect(Collectors.summarizingInt(Integer::intValue));
Grouping by a Classifier:
Map<Integer, List<String>> groupedByLength = stream.collect(Collectors.groupingBy(String::length));
Partitioning by a Predicate:
Map<Boolean, List<String>> partitioned = stream.collect(Collectors.partitioningBy(s -> s.length() > 3));
Counting Elements:
long count = stream.collect(Collectors.counting());
Mapping and Collecting:
List<Integer> lengths = stream.collect(Collectors.mapping(String::length, Collectors.toList()));
Reducing to a Single Value:
Optional<String> concatenated = stream.collect(Collectors.reducing((s1, s2) -> s1 + s2));
Collectors provide a powerful way to perform complex collection operations succinctly and efficiently in Java.
Reduce#
Optional<T> reduce(BinaryOperator<T> accumulator)
List<Integer> spendings = List.of(14, 22, 10, 18, 16, 15, 20);
int total = spendings
.stream()
.reduce((partialSum, next) partialSum + next)
.orElse(0);
Collect#
<R> R collect(
Supplier<R> supplier,
BiConsumer<R, ? super T> accumulator,
BiConsumer<R, R> combiner
);
ArrayList<String> strings = stream.collect(
() -> new ArrayList<>(),
(collection, element) -> collection.add(element.toString()),
(collection1, collection2) -> collection1.addAll(collection2)
);
Pulling the mapping operation from accumulator function:
ArrayList<String> strings =
stream
.map(Object::toString)
.collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
Processing methods#
String top3EmpNames =
employees()
.stream()
.filter(Employee::isActive)
.limit(3)
.collect(Collectors.joining(", "));
map() vs flatMap()#
Collectors#
collect(Collectors.toList());
collect(Collectors.toSet());
collect(Collectors.toMap(e -> e.name, e -> e));
collect(Collectors.joining(", "));
Group By department
Map<Department, List<Employee>> deptWiseEmployees =
employees
.stream()
.collect(Collectors.groupingBy(e -> e.getDepartment()));
Department wise count of employees
Map<Department, Long> deptWiseEmpCount =
employees
.stream()
.collect(Collectors.groupingBy(Employee::getDepartment, Collectors.counting()));
Reduction, concurrency and ordering#
Map<Buyer, List<Transaction>> salesByBuyer =
txns
.parallelStream();
.collect(Collectors.groupingBy(Transaction::getBuyer));
Parallel Streams#
Beneficial only when have a significant size of data like 10K+ else, the overhead might be more and doesn’t give any improvement in speed. Benchmarking is the way to go.