Idiomatic Kotlin: Sequences
This article is a part of the Idiomatic Kotlin series. The complete list is at the bottom of the article.
In this article, we will talk about Kotlin sequences in a manner consistent with the Idiomatic Kotlin series. I have previously written articles on sequences and Java streams, and I encourage you to read those first.
What is a Sequence?
A sequence is similar to a Java stream in that both serve as a data source on which you can perform operations in a declarative (or functional) way. Neither is a data structure or a form of storage. Each is simply an abstraction over an underlying source, with added support for aggregate operations.
Motivation
The main selling point of sequences over the standard collections is that they evaluate operations lazily and minimize the creation of intermediate collections. They are well suited to large collections that pass through several operations in a processing pipeline.
Sequences are similar in function to Java streams. Kotlin provides its own implementation so that the feature is also available on targets that predate Java 8.
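To make the difference concrete, here is a minimal sketch that counts how often a mapping step runs when only the first matching element is needed. The function names `eagerMapCalls` and `lazyMapCalls` are made up for this illustration.

```kotlin
// Counts how many times the mapping step runs on a plain List pipeline.
fun eagerMapCalls(): Int {
    var calls = 0
    listOf(1, 2, 3, 4, 5)
        .map { calls++; it * 2 }       // builds a full intermediate list first
        .first { it > 4 }
    return calls
}

// Same pipeline, but routed through a sequence.
fun lazyMapCalls(): Int {
    var calls = 0
    listOf(1, 2, 3, 4, 5)
        .asSequence()
        .map { calls++; it * 2 }       // runs per element, only on demand
        .first { it > 4 }
    return calls
}
```

With this input, `eagerMapCalls()` returns 5 because every element is mapped before `first` runs, while `lazyMapCalls()` returns 3 because the sequence stops as soon as a match is found.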
How to create a Sequence?
There are a number of ways to define a sequence. Some of them are listed below.
- From a collection using the asSequence extension function
listOf<String>().asSequence()
mapOf<String, String>().asSequence()
- Using the Sequence function
Sequence { listOf<String>().listIterator() }
- Using the sequenceOf function
sequenceOf(1,2,3)
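The snippets above use empty collections, so here is a small sketch of the same approaches with actual elements. The helper function names are hypothetical and exist only to keep each approach separate.

```kotlin
// From a collection: asSequence() wraps the list without copying it.
fun fromCollection(): Sequence<Int> = listOf(1, 2, 3).asSequence()

// From a map: the resulting sequence yields Map.Entry values.
fun fromMap(): Sequence<Map.Entry<String, Int>> =
    mapOf("a" to 1, "b" to 2).asSequence()

// The Sequence { ... } function wraps a lambda that returns a fresh Iterator.
fun fromIterator(): Sequence<Int> = Sequence { listOf(1, 2, 3).iterator() }

// sequenceOf takes the elements directly.
fun fromElements(): Sequence<Int> = sequenceOf(1, 2, 3)
```

Calling `toList()` on any of the three `Int` sequences gives `[1, 2, 3]`.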
Sequence operations and operator classifications are described in detail in my previous article, mentioned at the start.
Sequence under the hood
Let us investigate how sequences process operations lazily compared to standard collections. First, we take a look at the filter operation of List.
Notice that a call to the filter method immediately instantiates an ArrayList, and the entire collection is iterated eagerly to produce a new list containing the elements that satisfy the predicate.
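The eager behaviour described here can be sketched by hand. The function below is a simplified stand-in, not the standard library source, and the names `eagerFilter` and `eagerFilterCallCount` are invented for this illustration.

```kotlin
// Hypothetical stand-in for the List filter, written out to make the
// eager behaviour visible. `eagerFilter` is not a real library function.
fun <T> Iterable<T>.eagerFilter(predicate: (T) -> Boolean): List<T> {
    val result = ArrayList<T>()          // destination list is created right away
    for (element in this) {              // the whole source is walked immediately
        if (predicate(element)) result.add(element)
    }
    return result
}

// Counts predicate calls to show that all the work happens inside eagerFilter.
fun eagerFilterCallCount(): Int {
    var calls = 0
    listOf(1, 2, 3, 4).eagerFilter { calls++; it % 2 == 0 }
    return calls
}
```

Even though the result is never used, `eagerFilterCallCount()` returns 4: every element is tested as soon as the function is called.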
Now let’s take a look at the same function in sequence.
The code is longer than expected, but we do not need to analyze all of it to understand how lazy evaluation happens. We only need to look at a few key things.
First, let us check how exactly a Sequence is defined.
Sequence is a generic interface with a single iterator method. This iterator is used to iterate over the sequence. So far so good.
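Because the interface asks for nothing but an iterator, a custom sequence is short to write. The following is a minimal sketch; `Countdown` is a hypothetical class made up for this example.

```kotlin
// A hand-written Sequence: the only thing it has to provide is iterator().
// `Countdown` is a hypothetical class used purely for illustration.
class Countdown(private val start: Int) : Sequence<Int> {
    override fun iterator(): Iterator<Int> = object : Iterator<Int> {
        private var current = start

        override fun hasNext(): Boolean = current > 0

        override fun next(): Int {
            if (!hasNext()) throw NoSuchElementException()
            return current--             // return the value, then step down
        }
    }
}
```

Once the class implements Sequence, the usual extension functions apply to it, so `Countdown(3).map { it * 2 }.toList()` yields `[6, 4, 2]`.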
Now, the filter method. FilteringSequence is an implementation of the Sequence interface. Notice that there are no init blocks, meaning that no processing is done when filter is called. No containers are defined either. All processing happens in the iterator methods. When an iterator method is invoked, it applies the predicate to the next upstream element and passes that element downstream only if it satisfies the predicate. If you haven't noticed, this is laziness. Nothing is done eagerly, and only a terminal operation, that is, one that actually iterates over the sequence, triggers the iterator methods.
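The mechanism can be approximated in a few lines. This is a simplified sketch and not the actual FilteringSequence source; the names `LazyFilter` and `lazyFilterTrace` are assumptions introduced for the example.

```kotlin
// Simplified, hypothetical version of a filtering sequence. The real
// FilteringSequence in the standard library differs in its details.
class LazyFilter<T>(
    private val source: Sequence<T>,
    private val predicate: (T) -> Boolean
) : Sequence<T> {
    // No init block and no buffer: constructing this object does no work.
    override fun iterator(): Iterator<T> = object : Iterator<T> {
        private val upstream = source.iterator()
        private var nextItem: T? = null
        private var ready = false

        // Pull from upstream only until one matching element is found.
        private fun advance() {
            while (!ready && upstream.hasNext()) {
                val candidate = upstream.next()
                if (predicate(candidate)) {
                    nextItem = candidate
                    ready = true
                }
            }
        }

        override fun hasNext(): Boolean {
            advance()
            return ready
        }

        @Suppress("UNCHECKED_CAST")
        override fun next(): T {
            if (!hasNext()) throw NoSuchElementException()
            ready = false
            return nextItem as T
        }
    }
}

// Records when the predicate runs relative to building the pipeline.
fun lazyFilterTrace(): List<String> {
    val log = mutableListOf<String>()
    val evens = LazyFilter(sequenceOf(1, 2, 3, 4)) { log.add("test $it"); it % 2 == 0 }
    log.add("pipeline built")            // nothing has been tested yet
    log.add("first = ${evens.first()}")  // the terminal operation pulls elements
    return log
}
```

`lazyFilterTrace()` returns `[pipeline built, test 1, test 2, first = 2]`: the predicate does not run until `first()` asks for an element, and 3 and 4 are never tested at all.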
Notes
Unlike Java streams, some Sequences can be iterated multiple times. The official documentation states:
Sequences can be iterated multiple times, however some sequence implementations might constrain themselves to be iterated only once. That is mentioned specifically in their documentation (e.g. [generateSequence] overload).
The latter sequences throw an exception on an attempt to iterate them the second time.
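A short sketch of both cases follows. The function names are hypothetical. The second function relies on the `generateSequence` overload that takes no seed value, which is the one documented as iterable only once.

```kotlin
// A list-backed sequence creates a fresh iterator on every pass, so it can be reused.
fun reusableSum(): Int {
    val numbers = listOf(1, 2, 3).asSequence()
    return numbers.sum() + numbers.sum()
}

// generateSequence without a seed value can be iterated only once.
fun secondPassFails(): Boolean {
    var n = 0
    val once = generateSequence { if (n < 3) ++n else null }
    once.toList()                        // first pass consumes the sequence
    return try {
        once.toList()                    // second pass is rejected
        false
    } catch (e: IllegalStateException) {
        true
    }
}
```

Here `reusableSum()` returns 12, and `secondPassFails()` returns true because the second `toList()` call throws an IllegalStateException.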
Check out the other articles in the Idiomatic Kotlin series. The sample source code for each article can be found on GitHub.
- Extension Functions
- Sealed Classes
- Infix Functions
- Class Delegation
- Local functions
- Object and Singleton
- Sequences
- Lambdas and SAM constructors
- Lambdas with Receiver and DSL
- Elvis operator
- Property Delegates and Lazy
- Higher-order functions and Function Types
- Inline functions
- Lambdas and Control Flows
- Reified Parameters
- Noinline and Crossinline
- Variance
- Annotations and Reflection
- Annotation Processor and Code Generation