Update: After a bit of thought, I created a generalized memoizer that doesn’t require foreknowledge of the arity of the calculator function. See the gist at the bottom. I also created a similar memoizer that uses plain stores rather than memoizer functions, if you have access to those.
To start, this isn’t something I’m certain I’m going to use anywhere with real regularity; perhaps as a quick measure while working to convert the current implementation of a project closer to the ideal.
I made this because, when I’d started on one of my work projects, I didn’t have a clear understanding of how to properly use Reselect with React-Redux: neither selector factories, nor how best to use selectors to derive data and memoize those derivations. Now, trying to convert things to use those two libraries properly will be a very large ordeal indeed.
Still, it was an interesting problem to solve, I think, and one worth exploring. It is (maybe?) only partially solved, but solved for the case I was interested in: memoization of a large number of derived entities that are extensions of a particular entity combined with several others.
Taking a look at most memoizers out there, one usually sees memoizers that memoize on the first parameter and leave it at that. So far as I can tell, the reason for this is that managing 1-dimensional Maps is easy; n-dimensional Maps, not so much. Perhaps it would be possible if JS had a native tuple type, where an n-tuple could be used to treat a 1-D Map as an n-D Map, but alas, it is not to be.
So just about any strategy for memoizing on more than one parameter is going to be less efficient. Indeed, most strategies I’ve seen so far involve either serialization, which is an interesting trade-off indeed, or iterating over an array of previous argument lists doing shallow comparisons. Surely there was a better way than this?
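For contrast, here’s a minimal sketch of that serialization approach (the helper name is mine, not from any library): every single call pays for stringifying the entire argument list.

```javascript
// Hypothetical serialization-based memoizer: caches on the JSON form
// of the whole argument list. Simple, but every call — hit or miss —
// pays the serialization cost, which grows with the arguments.
function memoizeBySerializing(fn) {
  const cache = new Map();
  return (...args) => {
    const key = JSON.stringify(args);
    if (!cache.has(key)) {
      cache.set(key, fn(...args));
    }
    return cache.get(key);
  };
}

const add = memoizeBySerializing((a, b, c) => a + b + c);
add(1, 2, 3); // computes → 6
add(1, 2, 3); // cache hit, but still stringifies → 6
```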
After some thought, I came to the idea of combining closures with single-parameter memos to create a sort of tiered memoization. With the assumption that creating functions and closures in JS is cheaper than serialization, and that multiple equality checks are much cheaper than repeated serializations, I devised a simple strategy to break the memoization of multiple parameters into multiple memoizations of single parameters.
To start off, I made a few assumptions:
- The problem space I’m interested in assumes that, in the set of all sets of arguments to a given calculator, the first argument is unique across every set.
- Any argument changing means a different result, as is expected for a memoization.
As it would turn out, this solution neatly disregards the need for those assumptions, but I needed to start smaller.
To achieve the use of existing memos across multiple parameters, I essentially wrap the calculator function in a series of functions which each take one single parameter of that original function and memoize on that single parameter.
That is, I take a function like this:
(a0, a1, a2, ...aN) => result
And make it into a series of functions like this:
a0 => a1 => a2 => ... aN => result
Each function is memoized, and each subsequent function closes over the previous partial results, meaning that we are able to effectively leverage existing single-argument memos to cache the result of each argument application.
How does this work?
Note that the second form, a0 => a1 => ... etc., is really a series of separate functions, each one returned by the previous. For added clarity, you can rewrite it in this identical form:
a0 => (a1 => (a2 => ... (aN => result) ...))
(ML folks: Hush, we’re still learning FP over here in JS land.)
The solution becomes obvious here: Each function from the first to the second-to-last caches the next function as its result! The last of course caches the actual result itself.
We thus get something like this:
memoize(a0 => memoize(a1 => memoize(a2 => ... memoize(aN => calcResult(a0, a1, a2, ...aN)) ...)))
This is easy enough to do manually: just nest each function and return a memoization of the next, much like the sketch above. But what about the general case of N parameters?
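Concretely, the manual two-argument version might look like this, assuming a simple last-value memoizer that caches on a single argument via === (names here are illustrative):

```javascript
// A minimal single-argument, last-value memoizer.
const memoize = fn => {
  let called = false;
  let lastArg;
  let lastResult;
  return arg => {
    if (!called || arg !== lastArg) {
      called = true;
      lastArg = arg;
      lastResult = fn(arg);
    }
    return lastResult;
  };
};

// Manually nested tiers: the outer memo caches the inner memoized
// function per `a`, so a repeated (a, b) pair never recomputes.
const memoizedAdd = memoize(a => memoize(b => a + b));

memoizedAdd(1)(2); // computes → 3
memoizedAdd(1)(2); // both tiers hit their caches → 3
```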
I turned to a bit of recursion here to allow arbitrary nesting of these memoized selectors, and broke up the internal process into two steps:
To start, we have two functions:
- applyValue has the signature a => (applyValue | result)
- createApplier has the signature internalState => applyValue
This then works by createApplier returning an applyValue which, if it is accepting the last argument, memoizes the resulting calculation, and, if it is not, memoizes the result of calling createApplier again, thus caching the next applyValue.
Since I’m building each next function recursively/iteratively rather than eagerly I need to carry the currently applied values forward, as well as have some way of knowing how many values are left before we call the actual result-calculator function. Those things constitute the internalState here.
This also, conveniently enough, shows how we kick the whole process off: just call createApplier with the initial internalState (no values received, all N values remaining) to get back the first applyValue, then feed in all the values from there.
In fact, this part is trivially implemented by reducing over each argument.
const initialApplyValue = createApplier(calculateResult, memoizers, argCount, []);
const result = args.reduce(
  (applyValue, arg) => applyValue(arg),
  initialApplyValue
);
Here, the exact signature I used for createApplier is (calculateResult, memoizers, expectedArgCount, collectedArgs) => applyValue. What does createApplier look like, then?
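As a rough, self-contained sketch of how that signature could be implemented — names are illustrative, and I’m assuming each entry in memoizers is a single-argument memoizer:

```javascript
// A simple single-argument, last-value memoizer used for illustration.
const lastValueMemoize = fn => {
  let called = false;
  let lastArg;
  let lastResult;
  return arg => {
    if (!called || arg !== lastArg) {
      called = true;
      lastArg = arg;
      lastResult = fn(arg);
    }
    return lastResult;
  };
};

// Each applyValue is memoized on its single argument, and caches either
// the final calculation (last argument) or the next applyValue.
const createApplier = (calculateResult, memoizers, expectedArgCount, collectedArgs) =>
  memoizers[collectedArgs.length](arg => {
    const nextArgs = [...collectedArgs, arg];
    return nextArgs.length === expectedArgCount
      ? calculateResult(...nextArgs)
      : createApplier(calculateResult, memoizers, expectedArgCount, nextArgs);
  });

// Wire it up as a normal-looking N-ary function by reducing over the args.
const memoizeTiered = (calculateResult, memoizers, argCount) => {
  const initialApplyValue = createApplier(calculateResult, memoizers, argCount, []);
  return (...args) =>
    args.reduce((applyValue, arg) => applyValue(arg), initialApplyValue);
};

const sum3 = memoizeTiered(
  (a, b, c) => a + b + c,
  [lastValueMemoize, lastValueMemoize, lastValueMemoize],
  3
);

sum3(1, 2, 3); // computes → 6
sum3(1, 2, 3); // every tier hits its cache → 6
```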
Note that I passed in a list of memoizers because, in this case, I wanted to be able to use a different strategy for each level. Particularly, given the first assumption, where I stipulate that the first argument of each set of arguments is unique across all sets, I can use just one multi-value memo (WeakMap or LRU) for the first tier and quick single-value memos for every tier after, since I know the first argument is unique.
In fact, if one or more arguments are unique across all sets of arguments, then you only need one multi-value memo at the very beginning, on one of those unique arguments! This is because, if any argument is unique, then you already know each set of arguments is itself unique among all sets of arguments.
Conversely, however, if none of the arguments are unique, you may need multi-value memos for every tier! That is certainly one reason why creating memoized selectors per-instance with React-Redux is superior to this solution.
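To illustrate the mixed-strategy idea, here’s a hypothetical pair of memoizers — a WeakMap-backed multi-value memo for the unique (object) first argument, and cheap last-value memos for everything after. None of these names come from Reselect or any other library:

```javascript
// Multi-value memo: caches one result per object key, and lets results
// be garbage-collected along with their keys. Object keys only.
const weakMapMemoize = fn => {
  const cache = new WeakMap();
  return arg => {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg));
    }
    return cache.get(arg);
  };
};

// Single-value memo: only remembers the most recent argument.
const lastValueMemoize = fn => {
  let called = false;
  let lastArg;
  let lastResult;
  return arg => {
    if (!called || arg !== lastArg) {
      called = true;
      lastArg = arg;
      lastResult = fn(arg);
    }
    return lastResult;
  };
};

// One multi-value memo up front for the unique argument,
// quick single-value memos for every tier after.
const memoizers = [weakMapMemoize, lastValueMemoize, lastValueMemoize];
```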
Here’s a gist with the final result of what I came up with. Because I’d intended to use it alongside Reselect, I also have an enhancer that wraps its createSelectorCreator function and makes it pass the selectors themselves as the first argument.
It seems to work!
I currently have no idea how performant it is! I hope it’s at least nearly as performant as the slowest memoizer passed to it, though.
Note that if none of the arguments are unique across all the sets of arguments you expect to call that selector with, you’ll have to use multi-value memos for everything. I suppose WeakMaps may be an option if you expect to call it with a number of objects to compute derived entities or something, but you can’t use WeakMaps with non-objects because, well, primitive values aren’t references.
Some finagling could probably be done to not require the expected number of args to be passed in beforehand, and to just memoize on what’s given, dynamically building the tiers. This could probably be done by passing in values until we have none left, then calling the last applyValue with 0 arguments instead of 1; or maybe each applyValue could have a computeResult attached to it that applies the current list of values to the original computer function, and you just call that on the last applyValue returned. Perhaps I’ll take a whack at that for fun some time.
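A sketch of that second idea, with illustrative names and a simple last-value memoizer assumed at every tier — each applyValue carries a computeResult property, so no argument count is needed up front:

```javascript
// Single-argument, last-value memoizer.
const lastValueMemoize = fn => {
  let called = false;
  let lastArg;
  let lastResult;
  return arg => {
    if (!called || arg !== lastArg) {
      called = true;
      lastArg = arg;
      lastResult = fn(arg);
    }
    return lastResult;
  };
};

// Dynamic-arity tiers: every applyValue can either accept another
// argument or compute the result from the values collected so far.
const createDynamicApplier = (calculateResult, collected = []) => {
  const applyValue = lastValueMemoize(arg =>
    createDynamicApplier(calculateResult, [...collected, arg]));
  // Called on whichever applyValue received the final argument;
  // the result is cached so repeat calls don't recompute.
  let computed = false;
  let result;
  applyValue.computeResult = () => {
    if (!computed) {
      computed = true;
      result = calculateResult(...collected);
    }
    return result;
  };
  return applyValue;
};

const sum3 = (a, b, c) => a + b + c;
const applier = createDynamicApplier(sum3);
applier(1)(2)(3).computeResult(); // → 6
```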
Update: Did just that!
Turned out not to be too difficult, a trivial variation of the first version. Like the first implementation, though, it’s probably not optimized, and likely is going to always be slower due to having to always assume we won’t know the number of args until call time. But, at least it doesn’t require finagling Reselect’s
createSelectorCreator! I suppose if you were certain you were going to use only formal params in your functions, you could just use
fn.length but in both of these cases I wrote it without that. It’s likely at least fast enough that it’s better than a React redraw, so, there’s that?
Update 2: Stores Only
I also figured this same strategy could be used with stores themselves. Like the sub-memoizer versions described in the rest of this post, you have to ensure whatever store (or memoizer) you’re using for each arg can support what you’re throwing at it; for example, don’t use a WeakMap for an argument position that might see primitives.
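A minimal sketch of that stores-only idea, assuming one store factory per argument position (nested Maps here; swap in () => new WeakMap() for positions that only ever see objects):

```javascript
// Stores-only variant: each argument indexes one level of nested store;
// the leaf level holds the computed result.
const memoizeWithStores = (calculateResult, createStores) => {
  const root = createStores[0]();
  return (...args) => {
    let store = root;
    // Walk (and lazily build) one store level per argument but the last.
    for (let i = 0; i < args.length - 1; i++) {
      if (!store.has(args[i])) {
        store.set(args[i], createStores[i + 1]());
      }
      store = store.get(args[i]);
    }
    // The last argument keys the result itself.
    const last = args[args.length - 1];
    if (!store.has(last)) {
      store.set(last, calculateResult(...args));
    }
    return store.get(last);
  };
};

const sum = memoizeWithStores((a, b) => a + b, [() => new Map(), () => new Map()]);
sum(1, 2); // computes → 3
sum(1, 2); // cache hit → 3
```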
I think it’s worth at least mentioning that I didn’t come to this solution in minutes, but rather across several days of now-and-again idle thought, and the thinking that led to this current result was only the last bout of several weeks of picking it up and putting it down! I knew I’d needed to somehow reduce the number of arguments per memo, but wasn’t sure at first how to do it, as the use of closures for some reason eluded me. It came together now probably because I’m just a better programmer than I was. (And that’s why you should never stop learning!)
One idea I’d had was to start with a series of two-argument functions which go something like (a, b) => ((a, b), c) => ((a, b, c), d) => ..., but that still runs afoul of the original issue of multi-parameter memoization. It wasn’t until after exploring a number of methods that I finally arrived at the a => memoize(b => memoize(c => ...)) method.
Going back to the n-dimensional Maps for a moment: you could conceive of this as converting a set of coordinates in an n-D space into a series of mappings through 1-D spaces, where each additional argument constrains the target space by one dimension at a time, until you build up to the specific point in the result space that those arguments act as coordinates to.
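That analogy can be seen directly in code: the tiered caches form exactly this kind of nested structure, here built by hand as Maps of Maps.

```javascript
// A 3-D "result space" as nested 1-D Maps. Each argument narrows the
// space by one dimension until the point (the result) is reached.
const space = new Map([
  [1, new Map([
    [2, new Map([
      [3, 'result at (1, 2, 3)'],
    ])],
  ])],
]);

space.get(1).get(2).get(3); // → 'result at (1, 2, 3)'
```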