Reactive Programming in the Netflix API with RxJava

by Ben Christensen and Jafar Husain

Our recent post on optimizing the Netflix API introduced how our web service endpoints are implemented using a reactive programming model for composition of asynchronous callbacks from our service layer.

This post takes a closer look at how and why we use the reactive model and introduces our open source project RxJava — a Java implementation of Rx (Reactive Extensions).

Embrace Concurrency

Server-side concurrency is needed to effectively reduce network chattiness. Without concurrent execution on the server, a single “heavy” client request might not be much better than many “light” requests, because each network request from a device naturally executes in parallel with the other network requests. If the server-side execution of a collapsed “heavy” request does not achieve a similar level of parallel execution, it may be slower than the multiple “light” requests, even accounting for saved network latency.

Java Futures are Expensive to Compose

Java Futures are straightforward to use for a single level of asynchronous execution, but they start to add non-trivial complexity when they’re nested (prior to Java 8’s CompletableFuture).

Conditional asynchronous execution flows become difficult to optimally compose (particularly as latencies of each request vary at runtime) using Futures. It can be done of course, but it quickly becomes complicated (and thus error prone) or prematurely blocks on ‘Future.get()’, eliminating the benefit of asynchronous execution.
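The blocking trap can be seen in a plain-Java sketch (the service calls and thread pool here are hypothetical stand-ins, not Netflix API code): to start a call that depends on a Future’s result, something must call `Future.get()`, either blocking the caller or tying up a pool thread while it waits.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureComposition {
    // Hypothetical example: two dependent "service" calls composed with plain Futures.
    static String compose() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // first async call (stands in for a remote service request)
            Future<Integer> userId = pool.submit(() -> 42);
            // the dependent call must wait on get() somewhere; here it ties up
            // a second pool thread while blocked, defeating the async model
            Future<String> profile = pool.submit(() -> "profile-" + userId.get());
            return profile.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(compose()); // prints profile-42
    }
}
```

Either placement of `get()` forfeits something: in the caller it serializes execution, inside a task it consumes a thread doing nothing but waiting.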

Callbacks Have Their Own Problems

Callbacks offer a solution to the tendency to block on Future.get() by not allowing anything to block. They are naturally efficient because they execute when the response is ready.

Similar to Futures though, they are easy to use with a single level of asynchronous execution but become unwieldy with nested composition.
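A small hypothetical Java sketch shows the nesting problem: each dependent call pushes the continuation one lambda deeper (the service methods here are invented for illustration).

```java
import java.util.function.Consumer;

public class CallbackNesting {
    // each hypothetical "service" hands its result to a callback
    static void getUserId(Consumer<Integer> cb) { cb.accept(42); }
    static void getProfile(int id, Consumer<String> cb) { cb.accept("profile-" + id); }
    static void getRecommendations(String profile, Consumer<String> cb) { cb.accept(profile + "/recs"); }

    static String run() {
        final String[] result = new String[1];
        // one level reads fine; every dependent call adds another layer of nesting,
        // and branching or error handling multiplies the layers further
        getUserId(id ->
            getProfile(id, profile ->
                getRecommendations(profile, recs ->
                    result[0] = recs)));
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints profile-42/recs
    }
}
```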


Reactive

Reactive programming offers efficient execution and composition by providing a collection of operators capable of filtering, selecting, transforming, combining and composing Observables.

The Observable data type can be thought of as a “push” equivalent to Iterable which is “pull”. With an Iterable, the consumer pulls values from the producer and the thread blocks until those values arrive. By contrast with the Observable type, the producer pushes values to the consumer whenever values are available. This approach is more flexible, because values can arrive synchronously or asynchronously.

The Observable type adds two missing semantics to the Gang of Four’s Observer pattern, which are available in the Iterable type:

  1. The ability for the producer to signal to the consumer that there is no more data available.
  2. The ability for the producer to signal to the consumer that an error has occurred.
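A minimal Java sketch (all names hypothetical, not the RxJava API) of an Observer interface carrying these two extra signals alongside the pushed values:

```java
import java.util.Arrays;
import java.util.List;

public class PushExample {
    public interface Observer<T> {
        void onNext(T value);       // next value pushed by the producer
        void onCompleted();         // added semantic 1: no more data available
        void onError(Throwable t);  // added semantic 2: an error occurred
    }

    // push each element to the observer, then signal completion;
    // any failure is delivered through onError instead of being thrown
    public static <T> void from(List<T> source, Observer<T> obs) {
        try {
            for (T value : source) {
                obs.onNext(value);
            }
            obs.onCompleted();
        } catch (Throwable t) {
            obs.onError(t);
        }
    }

    public static void main(String[] args) {
        from(Arrays.asList("a", "b"), new Observer<String>() {
            public void onNext(String v) { System.out.println("onNext => " + v); }
            public void onCompleted() { System.out.println("onCompleted"); }
            public void onError(Throwable t) { System.out.println("onError => " + t); }
        });
    }
}
```

Note the symmetry with iteration: `onNext` corresponds to `Iterator.next()`, `onCompleted` to `hasNext()` returning false, and `onError` to an exception thrown during iteration, only with the producer pushing rather than the consumer pulling.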

With these two simple additions, we have unified the Iterable and Observable types. The only difference between them is the direction in which the data flows. This is very important because now any operation we can perform on an Iterable can also be performed on an Observable. Let’s take a look at an example…

/**
 * Asynchronously calls 'customObservableNonBlocking' and defines
 * a chain of operators to apply to the callback sequence.
 */
def simpleComposition() {
    // fetch an asynchronous Observable<String>
    // that emits 75 Strings of 'anotherValue_#'
    customObservableNonBlocking()
        // skip the first 10
        .skip(10)
        // take the next 5
        .take(5)
        // transform each String with the provided function
        .map({ stringValue -> return stringValue + "_transformed" })
        // subscribe to the sequence and print each transformed String
        .subscribe({ println "onNext => " + it })
}

// output:
onNext => anotherValue_10_transformed
onNext => anotherValue_11_transformed
onNext => anotherValue_12_transformed
onNext => anotherValue_13_transformed
onNext => anotherValue_14_transformed

Observable Service Layer

The Netflix API takes advantage of Rx by making the entire service layer asynchronous (or at least appear so) — all “service” methods return an Observable<T>.

Making all return types Observable combined with a functional programming style frees up the service layer implementation to safely use concurrency. It also enables the service layer implementation to:

  • conditionally return immediately from a cache
  • block instead of using threads if resources are constrained
  • use multiple threads
  • use non-blocking IO
  • migrate an underlying implementation from network based to in-memory cache

This can all happen without ever changing how client code interacts with or composes responses.

In short, client code treats all interactions with the API as asynchronous but the implementation chooses if something is blocking or non-blocking.

This next example code demonstrates how a service layer method can choose whether to synchronously return data from an in-memory cache or asynchronously retrieve data from a remote service and callback with the data once retrieved. In both cases the client code consumes it the same way.

/**
 * Non-blocking method that immediately returns the value
 * if available or uses a thread to fetch the value and
 * callback via `onNext()` when done.
 */
def Observable<T> getData(int id) {
    if (availableInMemory) {
        // if data available return immediately with data
        return Observable.create({ observer ->
            observer.onNext(valueFromMemory);
            observer.onCompleted();
        })
    } else {
        // else spawn thread or async IO to fetch data
        return Observable.create({ observer ->
            executor.execute({
                try {
                    // do work on separate thread
                    T value = getValueFromRemoteService(id);
                    // callback with value
                    observer.onNext(value);
                    observer.onCompleted();
                } catch (Exception e) {
                    observer.onError(e);
                }
            })
        })
    }
}
Retaining this level of control in the service layer is a major architectural advantage particularly for maintaining and optimizing functionality over time. Many different endpoint implementations can be coded against an Observable API and they work efficiently and correctly with the current thread or one or more worker threads backing their execution.

The following code demonstrates the consumption of an Observable API with a common Netflix use case — a grid of movies:

/**
 * Demonstrate how Rx is used to compose Observables together,
 * such as a web service would to generate a JSON response.
 * The simulated methods for the metadata represent different
 * services that are often backed by network calls.
 * This will return a sequence of dictionaries such as this:
 * [id:1000, title:video-1000-title, length:5428, bookmark:0,
 *  rating:[actual:4, average:3, predicted:0]]
 */
def Observable getVideoGridForDisplay(userId) {
    getListOfLists(userId).mapMany({ VideoList list ->
        // for each VideoList we want to fetch the videos
        list.getVideos()
            .take(10) // we only want the first 10 of each list
            .mapMany({ Video video ->
                // for each video we want to fetch metadata
                def m = video.getMetadata().map({
                    Map<String, String> md ->
                    // transform to the data and format we want
                    return [title: md.get("title"),
                            length: md.get("duration")]
                })
                // and its bookmark position
                def b = video.getBookmark(userId).map({
                    position ->
                    return [bookmark: position]
                })
                // and its rating
                def r = video.getRating(userId).map({
                    VideoRating rating ->
                    return [rating:
                        [actual: rating.getActualStarRating(),
                         average: rating.getAverageStarRating(),
                         predicted: rating.getPredictedStarRating()]]
                })
                // compose these together
                return Observable.zip(m, b, r, {
                    metadata, bookmark, rating ->
                    // now transform to complete dictionary of data
                    // we want for each Video
                    return [id: video.videoId] << metadata << bookmark << rating
                })
            })
    })
}

// emits results such as
[id:1002, title:video-1002-title, length:5428, bookmark:0,
rating:[actual:2, average:4, predicted:3]]
[id:1003, title:video-1003-title, length:5428, bookmark:0,
rating:[actual:4, average:4, predicted:4]]
[id:1004, title:video-1004-title, length:5428, bookmark:0,
rating:[actual:4, average:1, predicted:1]]

That code is declarative and lazy as well as functionally “pure” in that no mutation of state is occurring that would cause thread-safety issues.

The API Service Layer is now free to change the behavior of the methods ‘getListOfLists’, ‘getVideos’, ‘getMetadata’, ‘getBookmark’ and ‘getRating’ — some blocking others non-blocking but all consumed the same way.

In the example, ‘getListOfLists’ pushes each ‘VideoList’ object via ‘onNext()’ and then ‘getVideos()’ operates on that same parent thread. The implementation of that method could however change from blocking to non-blocking and the code would not need to change.


RxJava

RxJava is our implementation of Rx for the JVM and is available in the ReactiveX repository on GitHub (prior to September 2014 it was in the Netflix repo).

It is not yet feature complete with the .Net version of Rx, but what is implemented has been in use for the past year in production within the Netflix API.

We are open sourcing the code as version 0.5 as a way of acknowledging that it’s not yet feature complete. The outstanding work is logged in the RxJava Issues.

(Update: As of August 2014 the project hit the 1.0.0 Release Candidate milestone.)

Documentation is available on the RxJava Wiki including links to material available on the internet.

Some of the goals of RxJava are:

  • Stay close to the original Rx.Net implementation while adjusting naming conventions and idioms to Java
  • All contracts of Rx should be the same
  • Target the JVM, not a language. The first languages supported (beyond Java itself) are Groovy, Clojure, Scala and JRuby. New language adapters can be contributed.
  • Support Java 6 (to include Android support) and higher with an eventual goal to target a build for Java 8 with its lambda support. (Update: Java 8 support was achieved without a separate build)

Here is an implementation of one of the examples above but using Clojure instead of Groovy:

(defn simpleComposition []
  "Asynchronously calls 'customObservableNonBlocking' and defines a
   chain of operators to apply to the callback sequence."
  (->
    ; fetch an asynchronous Observable<String>
    ; that emits 75 Strings of 'anotherValue_#'
    (customObservableNonBlocking)
    ; skip the first 10
    (.skip 10)
    ; take the next 5
    (.take 5)
    ; transform each String with the provided function
    (.map #(str % "_transformed"))
    ; subscribe to the sequence and print each transformed String
    (.subscribe #(println "onNext =>" %))))

; output
onNext => anotherValue_10_transformed
onNext => anotherValue_11_transformed
onNext => anotherValue_12_transformed
onNext => anotherValue_13_transformed
onNext => anotherValue_14_transformed


Summary

Reactive programming with RxJava has enabled Netflix developers to leverage server-side concurrency without the typical thread-safety and synchronization concerns. The API service layer implementation has control over concurrency primitives, which enables us to pursue system performance improvements without fear of breaking client code.

RxJava is effective on the server for us and it spreads deeper into our code the more we use it.

We hope you find the RxJava project as useful as we have and look forward to your contributions.

If this type of work interests you we are always looking for talented engineers.

September 2014 Update

  • This blog post originally used the term “functional reactive programming” or FRP. This term was used in error. RxJava does not implement “continuous time” which is a requirement for FRP from previous literature.
  • Updated to new ReactiveX location for RxJava.

Originally published on February 4, 2013.