LINQ in Javascript

Ondrej Hlouzek
The Startup
Published in
29 min readJul 23, 2020

LINQ in Javascript is a great simple tool for manipulating data collections in javascript. You can easily transform, sort, select and combine data collections using javascript commands in react, angular, etc.

Photo by Cláudio Luiz Castro on Unsplash

LINQ (Language Integrated Query)

LINQ is the name for a set of technologies that can be used to make adjustments to data sources in simple and easy-to-read operations. In our case, we will be based on an array of objects, which is widely used, especially in javascript.

Who program in C# certainly know advantage of this technology, which has existed since 3.0 version under the .NET Framework 3.5.

Why LINQ in javascript?

The benefit of LINQ is a simple and readable syntax for accessing and manipulating data. It makes easier transformation of data structures into others, sorting and merging data between multiple sources.

Data sources

A huge advantage is that the source does not have to be just a simple an array (table) but it can be a complex structured array of data. Nesting objects and collections in other objects.

Project goal

In this article, we will be based exclusively on practice and the most commonly used forms, which often occur in real projects. We will not show only simple examples. In most cases, more complex adjustments are needed, both at the request of superiors and to optimize and speed up queries. Next, I will deal with cases that the developer often solves when manipulating data, but on the internet, the process is often confusing or unnecessarily complicated.

The aim of this project is to demonstrate technology that will simplify your work and solve situations that often occur in real projects and using javascript is much more complicated, almost “impossible”. I believe that you will find the benefits of this technology and will be a good helper in future. From my many years of experience, I can assure you that this technology will save you a lot of time and nights, and at the same time this code will be much more readable for developers who will read back your code.

It’s interesting? so let’s do it!

Npm LINQ

Thanks to the efforts of the developers who created the npm linq package, this technology can be used in javascript projects or, as in my case, in the react web framework.

Install the npm package

npm install linq

Import LINQ into the project

Import Enumerable from 'linq';orconst Enumerable = require(‘linq’);

Method of writing

The method of writing can be divided into three parts.

  • Input — Wrap array and gain access to Enumerable.from() transformation methods
  • Transformations — Transformation, sorting and merging of data. Using transformation commands, you can transform data into new structures, create new objects and connect multiple data sources into one .select(), .groupBy(), .join() ...
  • Output — return the result to the array using the .toArray()method. Calling the .toArray() command calls the operation itself and returns the result in the collection of the Enumerable<T> element.

Enumerable.from()

Enumerable.from<T>(obj: T[]): IEnumerable<T>;

To access the advantages of LINQ technology, we use the static method Enumerable.from(obj: T[]) which returns an object of type IEnumerable<T>. IEnumerableis an interface of collection and T is an object of generic type (collection item).

var _arr = [1, 2, 3, 4];
var _linq = Enumerable.from(_arr);

Wrapping the array with this static command gives us access to a large number of commands that can be used to manipulate data. A list of all functions can be found in the github repository in the linq.d.ts file.

In practice, you will certainly encounter a situation where you want to generate a numerical series instead of data. For example, you may want to display data in a row over days. There are a number of methods available to help you with this. For example:

make<T>(element: T): IEnumerable<T>;range(start: number, count: number, step?: number): IEnumerable<number>;repeat<T>(element: T, count?: number): IEnumerable<T>;toInfinity(start?: number, step?: number): IEnumerable<number>;

Probably the most interesting for us will be the range method.

Enumerable.range()

range(start: number, count: number, step?: number): IEnumerable<number>;

This method returns a collection of consecutive numbers. start says the starting number. count the number of iterations and a step with a default value of 1 by how much the iterator moves on the next cycle. E.g. for range (10, 5, 2) the result will be 10, 12, 14, 16, 18, 20.

Enumerable.make()

export function make<T>(element: T): IEnumerable<T>;

If you have an object and want to convert it to a collection of objects with one element, use the make method. Why would you do something like that? For example, if you use a grid to display it, it probably expects an IEnumerable <T> data collection on the input.

Enumerable.repeat()

export function repeat<T>(element: T, count?: number): IEnumerable<T>;

The repeat function generates a collection of a specified amount of a particular T element.

Arrow function

Using the arrow functions, we get an even more readable code. This notation clearly indicates that we want to sort in descending order by the name field. Therefore, we will prefer this type of notation in the rest of the article.

orderByDescending(item => item.name);

While the following notation is much less readable

orderBy((item) { return { item.name }});

Select

Projects each element of a sequence into a new form.

select<TResult>(selector: (element: TSource, index: number) => TResult): IEnumerable<TResult>;

The select method is used to transform the source data into a new structure. The method is called for each TSource element and returns a new element of type TResult. It helps us to simply transform an object into a new object array structure, extend an existing object with new properties, or modify existing properties.

var array = [
{name: 'ondrej', lastname: 'hlouzek', age: 38},
{name: 'jakub', lastname: 'hlouzek', age: 39}
];
var result = Enumerable.from(array)
.select(user => ({
first_name: user.name,
last_name: user.lastname,
year_of_birth: 2020 - user.age,
adult: ((user.age) >= 18)
})).toArray();

https://stackblitz.com/edit/linq-select

Spread operator in select

Using the spread operator, you can easily extend and modify an existing object. For example, if you want to extend an existing object structure with two new properties. See the MDN documentation for more information on spread syntax.

var result = Enumerable.from(array)
.select(user => ({
...user,
full_name:`${user.name} ${user.lastname}`,
year_of_birth: 2020 - user.age
}))
.toArray();

https://stackblitz.com/edit/linq-spread

Indexing records

The Select method provides another parameter, the select (T, number) iterator. It tells the number of the currently executed line. In practice, it is often required to have a serial number for each record. In our example, the serial number of the user. Let’s take an example.

var result = Enumerable.from(array)
.select((user, index) => ({
...user,
id: index + 1
}))
.toArray();

https://stackblitz.com/edit/linq-select-index

Example from practice

We want to display a list of users and their current age calculated using their date of birth. For this case I will use the package “moment”. Install this package npm install moment.

var result = Enumerable.from(array)
.select(user => ({
first_name: user.name,
last_name: user.lastname,
year_of_birth: moment().diff(user.day_of_birth, 'year')
})).toArray();

https://stackblitz.com/edit/linq-select-moment

moment() returns the current date, and the diff example calculates the difference from today and day in the user.day_of_birth variable.

First, FirstOrDefault

Returns the first element of a sequence.

first(predicate?: (element: T, index: number) => boolean): T;
firstOrDefault(predicate?: (element: T, index: number) => boolean, defaultValue?: T): T | undefined;

first returns the first element of the collection. If no element exists in the collection, the application throws an exception. FirstOrDefault returns the default value if the collection is empty.

var array = [
{name: 'ondrej', lastname: 'hlouzek', age: 38},
{name: 'jakub', lastname: 'hlouzek', age: 39}
];
var result = Enumerable.from(array)
.orderBy(o => o.name)
.first();

https://stackblitz.com/edit/linq-first

Last, LastOrDefault

Returns the last element of a sequence.

last(predicate?: (element: T, index: number) => boolean): T; lastOrDefault(predicate?: (element: T, index: number) => boolean, defaultValue?: T): T | undefined;

last returns the last element of the collection. If no element exists in the collection, the application throws an exception. LastOrDefault returns the default value if the collection is empty.

var array = [
{name: 'ondrej', lastname: 'hlouzek', age: 38},
{name: 'jakub', lastname: 'hlouzek', age: 39}
];
var result = Enumerable.from(array)
.orderBy(o => o.name)
.last();

https://stackblitz.com/edit/linq-last

Single, SingleOrDefault

Returns a single, specific element of a sequence.

single(predicate?: (element: T, index: number) => boolean): T; singleOrDefault(predicate?: (element: T, index: number) => boolean, defaultValue?: T): T | undefined;

The single method returns an object only if the collection to which it is applied has only one element. In the case of singleOrDefault, it returns an object only if the collection to which it is applied returns a single element or an empty collection. If the collection contains 2 or more elements, an exception is thrown.

var array = [
{name: 'ondrej', lastname: 'hlouzek', age: 38}
];
var result = Enumerable.from(array)
.single();

https://stackblitz.com/edit/linq-single

Take

Returns a specified number of contiguous elements from the start of a sequence.

take(count: number): IEnumerable<T>;

This method returns the specified number of records. It is usually used together with the orderBy method.

var array = [
{name: 'Order 1', cost: 15233},
{name: 'Order 2', cost: 33533},
{name: 'Order 3', cost: 23511},
{name: 'Order 4', cost: 11244},
{name: 'Order 5', cost: 44432}
];
var result = Enumerable.from(array)
.orderBy(o => o.cost)
.take(2)
.toArray();

https://stackblitz.com/edit/linq-take

SelectMany

Projects each element of a sequence to an IEnumerable<T> and flattens the resulting sequences into one sequence.

selectMany<TCollection, TResult>(collectionSelector: (element: T, index: number) => IEnumerable<TCollection>, resultSelector: (outer: T, inner: TCollection) => TResult): IEnumerable<TResult>;

Let’s show the use with a specific example. We have a collection of orders and each order has a collection of items. It is therefore a collection of items nested in a collection of orders.

And you want to combine the collections of items of all orders. SelectMany goes through the individual orders and returns a collection of order items. Finally, it combines all collections of items into one.

Another illustration of the example in the tables.

Now you can tell, if I get a list of items using selectMany, I will lose the order information. SelectMany has another parameter resultSelector: (outer T: inner TCollection) => TResult. This parameter is a function that returns the current element and the entire input object. In our case, the selected orderItem and the entire order object.

So, for example, if you want a collection of orderItems of all orders and a specific order name, id, you do so as follows.

var result = Enumerable.from(array)
.selectMany(m => m.orderItems, (order, result) => ({
...result,
id: order.id,
name: order.name
})).toArray();

https://stackblitz.com/edit/linq-selectmany

With this approach, the “left join” operation can be performed as we will show below.

Conditions

You do not always want to return the entire collection, but specific records.
The conditions are just coming. You can use a condition to specify which data set to return.

Where

Filters a sequence of values based on a predicate.

where(predicate: (element: T, index: number) => boolean): IEnumerable<T>;

The most common way for conditional queries is to use the where method. I assume everyone knows the logical operators well. and && a or || etc.

var result = Enumerable.from(array)
.where(w => w.createdDate >= '2019-11-10' && w.total < 10000)
.toArray();

https://stackblitz.com/edit/linq-where

Contains

Determines whether an element is in the List<T>.

contains(value: T): boolean;

Another condition is the contains method. You can use this function to search for a specific list of values. In this example, I’m looking for two specific document numbers.

var numbers = ['139976', '139971'];var result = Enumerable.from(array)
.where(w =>
Enumerable.from(numbers).contains(w.documentNumber))
.toArray();

https://stackblitz.com/edit/linq-contains

All

Determines whether all elements of a sequence satisfy a condition.

all(predicate: (element: T) => boolean): boolean;

The all method returns data only if all elements satisfy the condition specified in the predicate.

Any

Determines whether any element of a sequence exists or satisfies a condition.

any(predicate?: (element: T) => boolean): boolean;

The any method returns data only if no element satisfies the condition specified in the predicate.

Distinct

Returns distinct elements from a sequence.

distinct<TCompare>(compareSelector: (element: T) => TCompare): IEnumerable<T>;

If we have duplicates in the collection that we want to eliminate and always use only the first element of the duplicity, we use the distinct command. When applied to simple data types, the distinct() syntax can be used. However, if it is a complex data type object, we must specifically define what is duplication and what is not.

distinct data type number:

var array = [
1234,
1234,
1555,
1422
];
var result = Enumerable.from(array)
.distinct()
.toArray();

distinct complex type object. Duplicate object can be defined eg by a string. If the string appears more than once, this element is considered a duplicate. The easiest way to generate a string from an object is to convert the record to a JSON string.

return Enumerable.from(object)
.distinct(d => JSON.stringify(d))
.toArray();

https://stackblitz.com/edit/linq-distinct-object

I added the Distinct field to make it clear how LINQ decides that it is a duplicate.

Result:

Groups

Grouping is one of the most powerful capabilities of LINQ. With grouping, you can group data according to selected keys and creates a result value from each group and its key.

This group is located in a special type of collection named IGrouping <TKey, T>. TKey is the key according to which the group is created and T is the collection of elements that corresponds to the given key.

GroupBy

Groups the elements of a sequence according to a specified key selector function and creates a result value from each group and its key. The elements of each group are projected by using a specified function.

groupBy<TKey, TElement, TResult, TCompare>(
keySelector: (element: T) => TKey,
elementSelector: (element: T) => TElement,
resultSelector: (key: TKey, element: IEnumerable<TElement>) => TResult, compareSelector: (element: TKey) => TCompare): IEnumerable<TResult>;

You could say that groupBy is actually the opposite of the selectMany method. Use groupBy to group data according to the specified key. keySelector is a function where fields are defined according to which it will be grouped. What does grouping actually mean? Everything that has the same key, or the specified field in the key, is grouped. elementSelector is a function that describes the resulting group structure. resultSelect is a function that returns a key and a group of type IEnumerable.

var array = [
{id: 1, name: 'Order 1', product: 'Ford'},
{id: 1, name: 'Order 1', product: 'BMW'},
{id: 2, name: 'Order 2', product: 'Toyota'},
{id: 2, name: 'Order 2', product: 'Skoda'},
{id: 2, name: 'Order 2', product: 'Fiat'}
];
var result = Enumerable.from(array)
.groupBy(
g => g.name,
element => element,
(key, items) => ({key, items: items.toArray()}
)).toArray();

https://stackblitz.com/edit/linq-groupby

In this example, I grouped the data by the name of the order object. For example, if you want to group using multiple fields, the notation is as follows:

.groupBy(
g => ({ g.name, g.id })),
element => element,
(key, items) => ({key, items: items.toArray()}
)).toArray();

GroupJoin

Correlates the elements of two sequences based on key equality and groups the results.

groupJoin<TInner, TKey, TResult>(inner: IEnumerable<TInner>, outerKeySelector: (outer: T) => TKey, innerKeySelector: (inner: TInner) => TKey, resultSelector: (outer: T, inner: IEnumerable<TInner>) => TResult, compareSelector?: (obj: T) => TKey): IEnumerable<TResult>;

An atypical function that has no equivalent in databases. This involves joining a table and grouping at the same time. GroupJoin appends the table according to the primary and foreign keys and then groups this data according to the specified grouping key.

var orders = [
{id: 1, name: 'Order 1'}
];
var orderItems = [
{ orderId: 1, product: 'Ford' },
{ orderId: 1, product: 'BMW' }
]
var result = Enumerable.from(orders)
.groupJoin(
Enumerable.from(orderItems),
pk => pk.id,
fk => fk.orderId,
(order, items) => ({...order, items: items.toArray()})
).toArray();

https://stackblitz.com/edit/linq-groupjoin

Aggregation functions

An aggregation operation computes a single value from a collection of values.

min(selector?: (element: T) => number): number;
max(selector?: (element: T) => number): number;
average(selector?: (element: T) => number): number; count(predicate?: (element: T, index: number) => boolean): number;
sum(selector?: (element: T) => number): number;

You can apply aggfunc (selector ?: (element: T) => number): number aggregation functions of the following types to the IEnumerable <T> collection:

  • min — returns the lowest value from the collection
  • max — returns the highest value from the collection
  • average — returns the average values from the collection
  • count — returns the number of elements in the collection
  • sum — returns the sum of elements in the collection
var orderItems = [
{ orderId: 1, product: 'Ford', cost: 10000 },
{ orderId: 1, product: 'BMW', cost: 30000 },
{ orderId: 2, product: 'Toyota', cost: 20000 },
{ orderId: 2, product: 'Skoda', cost: 8000 },
{ orderId: 2, product: 'Fiat', cost: 6000 }
]
var result = Enumerable.from(orderItems)
.groupBy(g => g.orderId)
.select(s => ({
orderId: s.key(),
max: s.max(m => m.cost),
min: s.min(m => m.cost),
avg: s.average(m => m.cost),
count: s.count(),
sum: s.sum(s => s.cost)
}))
.toArray();

https://stackblitz.com/edit/linq-aggregations

MaxBy, MinBy

maxBy<TKey>(keySelector: (element: T) => TKey): T; 
minBy<TKey>(keySelector: (element: T) => TKey): T;

maxBy and minBy are very interesting and useful functions that do not normally occur in LINQ or the database world. It’s actually the same as min and max, but with the difference that the result is not a value but an entire element.

const result = Enumerable.from(orders)
.maxBy(s => s.total);

https://stackblitz.com/edit/linq-maxby

The query looks for the highest value in the column (red column) and returns the entire row (yellow row).

Zip

Applies a specified function to the corresponding elements of two sequences, producing a sequence of the results.

zip<U, TResult>(second: IEnumerable<U>, resultSelector: (first: T, second: U, index: number) => TResult): IEnumerable<TResult>

The zip function joins two collections simply according to the order of the items stored in the collection. If the collections do not have the same number of items, the result is the number of records with the least number of items.

https://stackblitz.com/edit/linq-zip

We will use the zip function as we will solve partitioning.

Loops

forEach(action: (element: T, index: number) => void): void;

You can use the ForEach command to scroll through the individual elements of an array. With this command you can perform your own action for each record, such as in our case display each item in the log:

Enumerable.from(array)
.forEach(f => console.log(`item ${f}`);

Ordering

orderBy<TKey>(keySelector: (element: T) => TKey): IOrderedEnumerable<T>;
thenBy<TKey>(keySelector: (element: T) => TKey): IOrderedEnumerable<T>;
orderByDescending<TKey>(keySelector: (element: T) => TKey): IOrderedEnumerable<T>;
thenByDescending<TKey>(keySelector: (element: T) => TKey): IOrderedEnumerable<T>;

Have you ever tried to sort javascript? No problem if you sort by one field. And what if you need to sort using multiple fields? With LINQ it can be done really very elegantly.

Enumerable.from(array)
.orderByDescending(o => o.lastname)
.thenBy(o => o.name)
.thenByDescending(o => o.age)

https://stackblitz.com/edit/linq-ordering

It sorts first in descending order by lastname, then in ascending order by name, and last in descending order by age.

The sort commands are as follows:
orderBy — the first command to sort in ascending order
orderByDescending — the first command to sort descending
thenBy — the following command to sort in ascending order
thenByDescending — The following command to sort descending

Joining collections

Often you will not be enough with just one table, respectively. with one data field. Often in real projects, you will need to join two or more tables together, for example because you are retrieving data from different rest API. And such a reason can be optimization. The volume of data transferred in the two tables can be orders of magnitude smaller than their join on the server side and subsequent transfer to the client. We will show such a real situation below.

It is a combination of two or more records that have a mutual relationship (common keys)

There are several types of joins. We will show the most frequently used ones, which will most likely be sufficient for all projects.

Relationship with each other

1: 1 OneToOne — there can be only one specific record from the second table for one record from the first table.
1: N OneToMany — there can be several records from the second table for one record from the first table
N: N ManyToMany — that there may be several records from the second table for several records from the first table.

Inner Join

In relational database terms, an inner join produces a result set in which each element of the first collection appears one time for every matching element in the second collection. If an element in the first collection has no matching elements, it does not appear in the result set.

join<TInner, TKey, TResult>(inner: IEnumerable<TInner>, outerKeySelector: (outer: T) => TKey, innerKeySelector: (inner: TInner) => TKey, resultSelector: (outer: T, inner: TInner) => TResult, compareSelector?: (obj: T) => TKey): IEnumerable<TResult>;

An inner join returns data only if the record exists on both sides (collections, tables). If the record is missing on one or the other side, the record will not be returned.

The join command is used to implement an inner join. The join of the left and right collections (tables) is solved using the primary and foreign keys. As soon as a match is found on the left and right, the joined record is returned. E.g. according to the picture. On the left is the primary key Id = 1 and on the right is the foreign key OrderId = 1 pk 1 = fk 1. In this case, two lines are returned, where the left side is repeated according to the lines on the right. This is a type 1: N connection.

return Enumerable.from(Orders)
.join(OrderItems,
pk => pk.id,
fk => fk.orderId,
(left, right) => ({...left, ...right}));

https://stackblitz.com/edit/linq-inner-join

Result:

Left Join

left outer join is a join in which each element of the first collection is returned, regardless of whether it has any correlated elements in the second collection. You can use LINQ to perform a left outer join by calling the DefaultIfEmpty method on the results of a group join.

In practice, the most common case of joining two data sources. This is a join, where a secondary resource joins to a primary resource only if there is a join between the primary and foreign keys. The primary source remains unchanged

It is paradoxical that this most common type of joining has no equivalent in javascript, and performing such an operation is not at all a simple topic and often discussed in forums. There are often examples where a single solution is seen by going through the records, which can have a big impact on the performance of the whole system. And that’s what we’re all about.

Even with linq.js and linq in C#, it’s no different, and the leftJoin isn’t implemented anywhere. But don’t despair. I will show you how to create your own method that will solve everything for us.

But first let’s show how to perform a left join using LINQ.

const result = Enumerable.from(orders)
.groupJoin(Enumerable.from(orderItems),
pk => pk.id,
fk => fk.orderId,
(left, right) => ({...left, right}))
.selectMany(m => m.right.defaultIfEmpty(),
(left, right) => ({...left, ...right}))
.toArray();

https://stackblitz.com/edit/linq-left-join

We have explained groupJoin and selectMany before, so it should not be difficult to understand this syntax. But let’s be honest. Is it readable? Not much :(… never mind.

Left join component

We can create our own component that will be much more readable.

export const LeftJoin = (source, inner, pk, fk, result) 
=> Enumerable.from(source)
.groupJoin(Enumerable.from(inner),
s => pk(s),
i => fk(i),
(left, right) => ({left, right}))
.selectMany(m => m.right.defaultIfEmpty(),
(prev, right) => result(prev.left, right));

I will encapsulate the previous code that performs the left join operation on the LeftJoin method. This is much better. In addition, we will extend Enumerable with a new leftJoin function that can chain functions.

Enumerable.prototype.leftJoin = function(inner, pk, fk, result) {
return LeftJoin(this, inner, pk, fk, result);
}

The result can be following:

Enumerable.from(orders)
.leftJoin(orderItems,
pk => pk.id,
fk => fk.orderId,
(left, right) => ({...left, ...right}))
.toArray();

https://stackblitz.com/edit/linq-left-join-component

Better? This is much more readable and also shorter notation. Says that it is a combination of two collections using the leftJoin method. There is a clearly defined primary key, foreign key and join left and right collections.

Right join

right outer join is a join in which each element of the second collection is returned, regardless of whether it has any correlated elements in the first collection. You can use LINQ to perform a right outer join by calling the DefaultIfEmpty method on the results of a group join.

Right join is the exact opposite of left join. This is a join, where a primary resource joins to a secondary resource only if there is a link between the primary and foreign keys. The secondary source remains unchanged.

As in the case of leftJoin, there is no rightJoin function in LINQ, so we will write it ourselves using LINQ.

export const RightJoin = (source, inner, pk, fk, result) 
=> Enumerable.from(inner)
.groupJoin(Enumerable.from(source),
i => fk(i),
s => pk(s),
(right, left) => ({right, left}))
.selectMany(
m => m.left.defaultIfEmpty(),
(prev, left) => result(left, prev.right));

And then we extend Enumerable with the rightJoin function for concatenation.

Enumerable.prototype.rightJoin = function(inner, pk, fk, result) {
return RightJoin(this, inner, pk, fk, result);
}

The result can look like this:

const result = Enumerable.from(orders)
.rightJoin(Enumerable.from(orderItems),
pk => pk.id,
fk => fk.orderId,
(left, right) => ({...left, ...right})
).toArray();

https://stackblitz.com/edit/linq-right-join

Full outer join

The FULL OUTER JOIN keyword returns all records when there is a match in left (table1) or right (table2) table records.

It basically performs a left join and a right join together. As a result, all records from both the left and right collections are displayed. Again, there is no equivalent in LINQ for calling full outer join, so we write our own component. And as you will see, the full outer join will be a composition from our already created LeftJoin and RightJoin methods.

export const FullOuterJoin = (source, inner, pk, fk, result) => {
var left = LeftJoin(source, inner, pk, fk, result);
var right = RightJoin(source, inner, pk, fk, result);

return left.unionDistinct(right);
}

And we will extend Enumerable with fullOuterJoin methods

Enumerable.prototype.fullOuterJoin = function(inner, pk, fk, result) {
return FullOuterJoin(this, inner, pk, fk, result);
}

https://stackblitz.com/edit/linq-full-outer-join

You will also use this method in the merge method. The following example shows the result of joining two collections using a full outer join.

Even if the record Id = 3 on the left side does not have the corresponding record on the right side and the record UserId = 4 on the right side does not have the corresponding record on the left side, they will still be in the resulting collection. The result is as follows:

Cross join

The cross join produces a result set which is the number of rows in the first table multiplied by the number of rows in the second table if no where clause is used along with cross join.

The last case of join is where each record from the left collection joins with each record from the right collection. The result is then a collection that has the number of records (left collection * right collection).

There is no equivalent for this function in LINQ, but it is easy to use. We apply an inner join where the primary key and the foreign key always return the same value. eg true.

export const CrossJoin = (source, inner, result) =>
Enumerable.from(source)
.join(Enumerable.from(inner),
pk => true,
fk => true,
(left, right) => result(left, right));

And we will extend Enumerable with the crossJoin function.

Enumerable.prototype.crossJoin = function(inner, result) {
return CrossJoin(this, inner, result);
}

https://stackblitz.com/edit/linq-cross-join

The result is as follows:

Union

Produces the set union of two sequences.

union(second: { length: number;[x: number]: T; }): IEnumerable<T>;

In LINQ npm, there is only one method for merging two collections, called union. This method simply merges two collections together.

const result = Enumerable.from(first)
.union(second)
.toArray();

Result:

https://stackblitz.com/edit/linq-union

This operation can be replaced by the spread operator in javascript, for example, as follows:

const result = [...firstCollection, ...secondCollection];

Union distinct

A more interesting function than union would be a function that joins two collections together, but if a record exists in both collections, the process returns only one record.

The process performs a distinct over the joined collections. viz. following image

There is no such function in LINQ, but we can create our own function.

https://stackblitz.com/edit/linq-union-distinct

export const UnionDistinct = (left, right) => {
return Enumerable.from(left)
.union(Enumerable.from(right))
.distinct(s => JSON.stringify(s));
}

And again we will extend Enumerable with a new function unionDistinct in order to chain functions.

Enumerable.prototype.unionDistinct = function(right) {
return UnionDistinct(this, right);
}

Result:

const result = Enumerable.from(leftCollection)
.unionDistinct(Enumerable.from(rightCollection)
.toArray();

Except

Produces the set difference of two sequences.

except<TCompare>(second: IEnumerable<T>, compareSelector: (element: T) => TCompare): IEnumerable<T>;

The except method makes a difference to the sets. It basically subtracts the secondary set from the primary set. The result is records from the first collection that do not exist in the second collection.

Implementation is simple. You take the first collection, you apply the except method with the parameter of the second collection. The second parameter tells which field to compare in terms of duplication.

In our case, the documentNumber column is compared.

const except = Enumerable.from(first)
.except(Enumerable.from(second), key => key.documentNumber)
.toArray();

https://stackblitz.com/edit/linq-except

Intersect

Produces the set intersection of two sequences.

intersect<TCompare>(second: IEnumerable<T>, compareSelector: (element: T) => TCompare): IEnumerable<T>;

The result is a collection that is both in the first collection and in the second. The use is the same as for the except method. You take the first collection, you apply the intersect method with the parameter of the second collection. The second parameter tells which field to compare in terms of duplication.

https://stackblitz.com/edit/linq-intersect

const except = Enumerable.from(first)
.intersect(Enumerable.from(second), key => key.documentNumber)
.toArray();

Partition

Bosses are crazy and have crazy ideas. From the programmer’s point of view of course :)

The boss comes up with the idea that he wants the most expensive order for today. hmm. that’s not a problem.

var result = Enumerable.from(orders)
.where(w => moment(w.createdDate).format("YYYY.MM.DD") === '2020.08.01')
.orderByDescending(o => o.cost)
.take(1);

so return me the most expensive orders for each day. And to make matters worse, return the two most expensive orders for each day

… How to do it… loop? is this the only solution?

no, definitely not .. forget the loop … Now comes the partitioning scene. The procedure is as follows:

  • First, I group orders by individual days and descending by price.
  • then you mark the group’s orders with a serial number descending according to the price
  • finally, we select orders by serial number
return Enumerable.from(orders)
.orderBy(d => moment(d.createdDate).format("YYYY.MM.DD"))
.thenByDescending(c => c.total)
.groupBy(g => moment(g.createdDate).format('YYYY.MM.DD'))
.select(group => ({group, count: group.count()}))
.selectMany(g
=> g.group.zip(Enumerable.range(1, g.count),
(j, i) => ({...j, ["partition"]: i})))
.where(w => w.partition <= 2)
.toArray();

https://stackblitz.com/edit/linq-partition

In this case, we will use the zip function to attach the serial number of the group line. However, this is already a more complicated notation, so we put it in the method.

export const PartitionBy = (grouping, rowNumberPropertyName)
=> grouping
.select(group => ({group, count: group.count()}))
.selectMany(groupWithCount
=> groupWithCount.group.zip(
Enumerable.range(1, groupWithCount.count),
(j, i) => ({...j, [rowNumberPropertyName]: i})));

and we will extend Enumerable with the partitionBy method with which we will be able to chain queries.

Enumerable.prototype.partitionBy = function(rowNumberPropertyName) {
return PartitionBy(this, rowNumberPropertyName);
}

and we apply our example to the just created component

return Enumerable.from(orders)
.orderBy(d => moment(d.createdDate).format('YYYY.MM.DD'))
.thenByDescending(c => c.total)
.groupBy(g => moment(g.createdDate).format('YYYY.MM.DD')
.partitionBy("partition")
.where(w => w.partition <= 2);

Result:

Pivot

PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output. And PIVOT runs aggregations where they're required on any remaining column values that are wanted in the final output.

You will definitely be dealing with a situation where you need to display the data you have stored in a row in a column. This method of rotating rows into columns is called a pivot. Typical examples are orders and products. The source data is stored in a row as shown in the following figure. Both order number and product code with quantity.

The resulting collection looks like product codes in a row and orders in a column. The columns will therefore be dynamic and there will be as many orders as there are orders.

The pivot function is characterized by the fact that the data in both the row and the column are grouped. As can be seen from the left picture, Ford repeats twice, but on the right side is ford only once. The same in the column with orderId 1. This means that the values (cost) are collections of type IEnumerable <T>. So you can apply aggregation functions to them. But it’s up to you what you display in the value, it doesn’t have to be just an aggregation of values, but you can easily display a list of values in a row, etc ..

Since LINQ has no pivot function, we will write it using LINQ.

export const Pivot = (source, columnSelector, rowSelector, rowPropertyName, dataSelector)
=> {
var columns = Enumerable
.from(source)
.select(columnSelector)
.distinct();
var cols = Enumerable
.from([rowPropertyName])
.concat(columns.select(s => s));
var rows = Enumerable
.from(source)
.groupBy(rowSelector)
.select(rowGroup => {
var values = columns.groupJoin(
rowGroup,
c => c,
r => columnSelector(r),
(c, columnGroup) => dataSelector(columnGroup));
return {
key: rowGroup.key(),
values: [rowGroup.key(), ...values.toArray()]
};
}).toArray();
var dics = Enumerable
.from(rows)
.select(s => {
var result = {};
cols.zip(s.values, (k, v) => ({k, v}))
.forEach(s => {result = {...result, [s.k]: s.v}});
return result;
}, {});
return dics;
}

Before we get to the detailed description of this function, we must explain the individual input parameters:

  • source — source data.
  • columnSelector (red) — is a function that describes how to display column,
  • datarowSelector (green) — is a function that describes how to display data in a row,
  • rowPropertyName (blue) — is the name of the first column of rowSelector,
  • dataSelector (yellow) — is a function that describes the display of individual values.

And a new method in Enumerable

Enumerable.prototype.pivot = function(columnSelector, rowSelector, rowPropertyName, dataSelector) {
return Pivot(this, columnSelector, rowSelector, rowPropertyName, dataSelector);
}

Using our function, you can create a pivot from source data that is simple and readable.

const result = Enumerable.from(orderItems)
.pivot(
columnSelector => `orderId:${columnSelector.orderId}`,
rowSelector => rowSelector.product,
"Product",
dataSelector => dataSelector.sum(s => s.cost)
).toArray();

https://stackblitz.com/edit/linq-pivot

Merge

Merge the specified collection with the current collection.

Imagine a situation where you have a list of orders. And suddenly you get a list of corrected and new orders. How easy is it to update existing orders and assign new orders? There is nothing simple in javascript. So we will write our own component that will solve it for us. It won’t be complicated at all. But we will use our components, which we have created so far.

How to do it?

I take a list of original orders and attach to them (using the full outer join method) a list of new or changed orders. Full outer join because we need both records from the left part of the collection (original collection) and records from the right part (new collection) on the output. How the original and new collections will be merged:

  • i will return the right side record,
  • if there is no record on the right side (new collection) then I will return the left side (original collection).

I’ll just show it in the picture:

var result = Enumerable.from(orders)
.fullOuterJoin(newOrders,
pk => pk.documentNumber,
fk => fk.documentNumber,
(left, right) => (right || left))

To make it even more readable, we wrap all the functionality in a function

const Merge = (source, inner, pk, fk, result)
=> Enumerable.from(source).fullOuterJoin(
Enumerable.from(inner),
s => pk(s),
i => fk(i),
(left, right) => result(left, right));

And again we extend the functionality Enumerable a new function merge in order to chaining.

Enumerable.prototype.merge = function(inner, pk, fk, result) {
return Merge(this, inner, pk, fk, result);
}

Calling the merge function is really simple and readable:

Enumerable.from(orders)
.merge(newOrders,
pk => pk.documentNumber,
fk => fk.documentNumber,
(left, right) => ({...left, ...right}));

the result is as follows. Document Number 140710 has been updated and document number 139617 has been added.

https://stackblitz.com/edit/linq-merge

Denormalize

Everyone tries to write code to match the specified patterns, to follow all the rules, not to duplicate the code, etc. Unfortunately, not everything is as you imagine, and sometimes it is necessary to denormalize the code. Mainly for optimization purposes. Someone just can’t avoid denormalization. Although it may be less readable, non-standard, but it can be much faster, it can load data faster, response responses can be faster, and so on.

Imagine a situation where we need to return 100,000 orders with all the details, such as currency, supplier detail, product detail, unit, etc., and display them as items, ie. one line = one order item.

The first thing that occurs to me is that I use the functionality that returns the detail of one order and I apply all 100,000 orders to it. Ideal situation. The code is not duplicated and everything works as it should.

The problem occurs when the customer wants to retrieve this data set and waits for the result for 2 minutes because a huge amount of data is being transmitted.

We also got into a similar problem and we finally managed to transfer the data in the order of seconds. And here comes the code denormalization. I will describe specifically one of the many cases of denormalization that could be applied. Don’t take it as a consensus that everything should be done this way. This is really just one specific case out of many, which can also be solved in various ways.

Let’s take a deep look at it. The detail of one order item could look like this.

Pretty much data, right? Couldn’t it be optimized? Of course. I will not focus on element names (documentItems will be repeated many thousands of times, so such an element could be rewritten to di). We will be mainly interested in duplicate data.

Elements such as client, supplier, currency, product, unit will be repeated many times. And they are not small data. Therefore, we will send a bulk list of clients, suppliers, product and currencies, and in orders we will refer to them only by id. Then such denormalized data could look like this.

We already have denormalized data and we transfer an order of magnitude smaller amount of data. In our case, there can be a reduction of up to 50%. Now I will show you how to connect such data and display it in a table structure for the grid.

Because the result will be individual items of orders, the first thing we will start with is that we display a list of all items of all orders in one table.

const result = Enumerable.from(denormalize)
.selectMany(m => m.documentItems,
(all, item) => ({...item, all }))

Now I will use leftJoin to join other objects.

.leftJoin(Enumerable.from(clients),
left => left.all.clientId,
right => right.id,
(left, client) => ({...left, client}))
.leftJoin(Enumerable.from(suppliers),
left => left.all.supplierId,
right => right.id,
(left, supplier) => ({...left, supplier}))
.leftJoin(Enumerable.from(products),
left => left.productId,
right => right.id,
(left, product) => ({...left, product}))

Finally, I display this structured data in one table so that it can be displayed in a grid.

.select(item => ({
id: item.id,
clientCode: item.client.code,
clientName: item.client.name,
supplierCode: item.supplier.code,
supplierName: item.supplier.name,
supplierAddress: item.supplier.address,
currencyCode: item.currency.code,
code: item.product.code,
productName: item.product.name,
unit: item.product.unit,
description: item.description,
quantity: item.quantity,
cost: item.cost,
createdDate: item.createdDate
}));

https://stackblitz.com/edit/linq-denormalize

The result can be the following:

This was a demonstration I wanted to show how well LINQ can be used to format data.

Conclusion

LINQ is a great technology, thanks to which you can easily manipulate data and I believe that you will stay true to this technology forever. When I learned that it also exists in the javascript version, I already had a lot of experience with LINQ in C #. But for javascript programmers, the documentation is really sparse, almost non-existent. That’s why I decided to write this comprehensive article to help them and solve most common questions.

Thank you for every like you give me. It gives me the strength to go on and create other interesting articles, and the feeling that what I’m doing makes at least a little sense.

If you like this article and it helps you, you can buy me coffee. I love coffee👍

I would like to apologize for the grammar mistakes that will certainly appear in the text. I’m not an English speaking person. But I believe that you will forgive me for these shortcomings, because this article is not about grammar, but about the technical values that are contained in the article.

--

--