V8 engine JSObject structure analysis and memory optimization ideas

Background: The author was using a large-scale data file while developing a JS application. While exploring and optimizing its memory usage, abnormal memory changes were observed, which led to the research in this article. Most of the content comes from reading the V8 source code, gdb debugging, and JS experiments in Chrome. If you find any errors, please let me know.

Research and writing time: January 2019

Chrome version: 69 (V8 version may be 6.9) x86_64

V8 source version: 7.3 x86_64

P.S. This is a translation (mainly by Google Translate, with a little of my own correction) of my Chinese article. The original can be found at https://zhuanlan.zhihu.com/p/55903492

Preface

Suppose there are 100,000 × 100 data points stored in a JSON file, expressed as an array of 100,000 objects, each of which has the same 100 attributes. The attribute names and values are very simple, such as {…"t98": 0, "t99": 0}.

After JSON.parse loads this data, the JS memory footprint is 82.5MB (all Chrome memory figures in this article are measured after garbage collection).

At this point, suppose we delete an attribute in the middle of each element, then add it back:

Delete an attribute and add it back — how does the reader think the memory will change? At first the author believed memory should be almost unchanged. However, the JS memory footprint actually soared to 596MB. The amount of data is the same from the user's point of view, yet it now takes up to 6 times the memory.
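The experiment can be sketched in plain JS (a minimal sketch, scaled down to 1,000 rows here — the original uses 100,000 rows loaded via JSON.parse; memory is best observed via DevTools heap snapshots, forcing GC before each one):

```javascript
// Build rows with the same 100 numeric properties t0..t99, then
// delete a middle property from each element and add it back.
function makeRows(numRows, numProps) {
  const rows = [];
  for (let i = 0; i < numRows; i++) {
    const row = {};
    for (let j = 0; j < numProps; j++) row['t' + j] = 0;
    rows.push(row);
  }
  return rows;
}

const rows = makeRows(1000, 100);
for (const row of rows) {
  delete row.t50; // deleting a non-last property drops the object to slow mode
  row.t50 = 0;    // re-adding it does not restore the fast representation
}
```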

If the intermediate attributes are only removed and not re-added, memory soars as well. And if we simply keep adding attributes to each object one by one, memory soars once the count reaches 134 attributes.

Readers may have noticed that the storage structure of the objects has changed in these specific situations (easily observed via a Heap Snapshot in Chrome Developer Tools).

Chrome's JS engine is V8. How does V8 design JS objects? What is the storage structure of an object, and what happens when that structure changes? How do we calculate an object's memory usage, and how do we avoid or mitigate the negative effects? These are the questions this article tries to answer.

References

There have been several excellent articles on the design of JSObject in V8. Here are a few that helped the author form a basic understanding. (The order implies the recommended reading order.)

https://v8.dev/blog/fast-properties V8 official blog post, illustrated, for quickly understanding the JSObject property storage structure. [Please read at least this one; otherwise the rest of this article may be very difficult to follow]

https://www.jayconrod.com/posts/52/a-tour-of-v8--object-representation An older blog post by a V8 contributor, as a complement to the first article.

https://zhuanlan.zhihu.com/p/28777722 a series of V8 articles in Chinese, recommended for Chinese readers.

http://www.mattzeunert.com/2017/03/29/v8-object-size.html an experiment-based look at object construction; easy to read, recommended for readers with less experience in JS memory analysis.

http://eternalsakura13.com/2018/08/02/v8_debug/ another Chinese article, about source-code debugging; recommended for Chinese readers who plan to debug the source code.

More can be found in other V8 contributors' blog posts. Many articles analyze or verify performance, but articles on memory analysis are lacking.

Extended reading:

http://www.cs.cmu.edu/afs/cs/academic/class/15210-f13/www/lectures/lecture24.pdf Helps to understand the open addressing and quadratic probing algorithm behind V8's HashTable implementation.

https://mrale.ph/blog/2015/01/11/whats-up-with-monomorphism.html Another V8 contributor's blog, with a distinctive personal style. This post mainly introduces inline caches; for the query performance rarely touched on in my article, his related posts are worth reading.

https://github.com/thlorenz/v8-perf Found late in the writing process. Recommended extended reading for web or Node developers interested in performance.

The basic structure of JSObject

A full storage view of JSObject

A JSObject has at least three pointers: to its Map, its property backing store, and its element backing store. There may also be in-object properties, up to 128 of them depending on how the object is created.

A Map in V8 is also called a hidden class in some articles. All Maps in this article refer to this hidden class and should not be confused with other meanings of the term. A Map describes the structure of a HeapObject, including its size, how to traverse its data, and so on.

A note for readers unfamiliar with JS objects: elements are values accessed by numeric index, as in an array, while properties are values accessed by string name, as in a dictionary. However, in JS, 1 and "1" are no different when defining elements: both are treated as numeric keys. A JSObject can be used not only as a dictionary but also as an array (generally not recommended).

Note also: this article focuses only on typical objects, i.e. objects with no elements and no function properties — simple dictionary-style objects.

The property backing store may be a PropertyArray or a NameDictionary. The former is a FixedArray corresponding to fast-properties mode; the latter is a HashTable implementation corresponding to slow properties. Simply put, the former has very high query performance, but the cost of property changes grows with the number of properties; the latter is slower to query but better suited to scenarios with frequent property addition and deletion, or a large number of properties.

Elements also have fast and slow modes. If adding an element would make the data too sparse, the store is switched to slow mode and a HashTable-like scheme is used to save storage.

In-object properties are similar to fast properties but are stored directly in the object, with higher performance (one fewer pointer dereference). The number of in-object properties is limited and is usually determined by how the object is created.

Map itself also inherits from HeapObject, so it too has a pointer to its own map (the MetaMap). V8 makes heavy use of bit-level data definitions, packing booleans and small integers into a single word. The Map's instance descriptor is a FixedArray that directly stores each property's name, location, and value or type details. The instance descriptor is only used in fast mode.

The property backing store in fast mode is a PropertyArray, a very straightforward FixedArray that stores property values (such as numbers) or pointers (such as to strings).

The slow-mode NameDictionary is a HashTable implementation (its underlying storage is, of course, still a FixedArray). In slow mode, V8 no longer relies on the DescriptorArray and performs hash queries directly in the dictionary. Here each property is also expressed in three parts: name, value, and property attribute details.

How a JSObject determines its structure: creation and transition details

This section discusses the creation of the two properties forms, fast and slow, and the transitions between them. Conversion from fast mode to slow mode generally has two causes: too many fast properties, and property deletion (mainly deleting a property other than the most recently added one). The following briefly describes the process of adding properties in fast mode and the transition mechanism, to explain the logic behind the conversion.

In fast mode, from the structure described in the previous section, we can intuitively imagine: when a new property is added and the DescriptorArray is full, it is expanded to store the property information, and the property value is stored in the in-object properties or the PropertyArray (expanding it if needed). From the point of view of property storage this is basically correct, but it misses the important transition process.

V8 tends to share structure between objects with the same shape created in the same way. In practice, many objects have similar structures but different property values, so sharing the structural data is a good way to save memory. V8's main mechanism for this is the transition tree. Two figures from the official documentation (the first reference) are cited here.

The transition chain formed when adding fast properties

As shown in the figure above, when object o adds three properties one by one, V8 does not simply change the value of the original map; it creates a new map each time. The new map's back pointer points to the previous map in the transition (and following back pointers eventually reaches the map's constructor). Along the transition chain, objects whose properties were added in the same order share the same map, and thus share the same instance descriptor.

A new branch is created when a different second property is chosen

When an object adds a property not seen before, or the order of property definitions changes, the transition tree grows a new branch.
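A minimal illustration of chaining and branching (the shared hidden classes themselves are only directly observable in d8 with --allow-natives-syntax; here only the enumeration order is visible):

```javascript
// Objects that gain the same properties in the same order walk the
// same map chain; a different order branches the transition tree.
const a = {}; a.x = 1; a.y = 2; // chain: {} -> {x} -> {x, y}
const b = {}; b.x = 3; b.y = 4; // same order: shares a's maps
const c = {}; c.y = 5; c.x = 6; // {} -> {y} -> {y, x}: a new branch
```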

At this point we can try to understand the logic behind the fast-to-slow conversion. As properties are added, the cost of inheriting and maintaining the transition tree grows larger and larger. When the last-added property is deleted, V8 can simply follow the back pointer to the previous map; but when any property in the middle is deleted, V8 can no longer guarantee a matching transition. For these reasons, V8 gives up maintaining transitions when too many properties are added or a non-last property is deleted, and switches to dictionary mode. With transitions understood, we can now describe the details of object creation and property addition.

Objects created in different ways have different storage structures and conversion behaviors. There are two main ways to create data objects: JSON.parse and object literals (such as let obj = {"a": 1, "b": 2, "c": 3}). The former constructs objects at runtime while parsing strings; the latter is handled at compile time (which also covers runtime eval). Pay attention to the similarities and differences between the creation methods (note: all of the following describes single-object creation, not including transition searching, etc.).

An object newly created by JSON.parse is first allocated 4 in-object property slots. In-object slots may be left empty, but their number will not grow once allocated. When the in-object slots are used up, new properties are placed in the PropertyArray. To reduce the number of reallocations caused by insufficient length, the DescriptorArray and PropertyArray are usually grown to larger than strictly necessary when space runs out; for example, the PropertyArray grows by 3 slots at a time. If the object has too many properties (explained in detail later), a migration from fast mode to slow mode is initiated. During MigrateFastToSlow, the newly created slow object no longer has in-object properties, the new map's instance descriptor is an empty FixedArray, and all properties are moved into the new property backing store, a NameDictionary.

Some readers may wonder how objects or dictionaries remain "ordered". In fast mode, sequentially added properties are stored directly in a contiguous array; in slow mode, the positions in the dictionary's storage vary with hash values, but the dictionary still records the insertion order (encoded in each entry's details).
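The preserved order is observable from JS even across the mode change:

```javascript
// Deleting a middle property switches the object to dictionary mode,
// yet keys still enumerate in insertion order, re-added key last.
const o = { a: 1, b: 2, c: 3 };
delete o.b;  // o is now in dictionary (slow) mode
o.b = 2;     // re-added: insertion order is now a, c, b
const order = Object.keys(o).join(',');
```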

Object literal creation is handled at compile time. The template already says how many properties the object has before it is created, so the behavior is very different from JSON.parse. If there are more than 128 properties, V8 directly creates a slow-mode map for compatibility reasons; after the properties are added, V8 tries to convert the object back to fast mode (unless the number of properties exceeds 1020, in which case it stays slow). If there are no more than 128 properties, V8 directly creates a fast map, and all properties become in-object properties.

Under what circumstances does V8 decide there are too many fast properties and convert?

First, V8's two property-store operations. Storing with a statically declared property name, such as foo.bar = 0 or foo["bar"] = 0, is a NamedStore; by contrast, storing through an expression, such as foo[expression] = 0, is a KeyedStore.
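The two store kinds from the paragraph above, side by side:

```javascript
const foo = {};
foo.bar = 0;          // NamedStore: static property name
foo['baz'] = 0;       // a string literal key is still a NamedStore
const key = 't' + 1;  // computed at runtime
foo[key] = 0;         // KeyedStore: the name is an expression
```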

When the PropertyArray has no free space: if the property is defined with a static name, V8 checks whether the current object has more than 128 external properties (i.e. non-in-object properties, those in the backing store; function properties are mostly excluded), or whether the total number of properties exceeds 1020. If the property is added through a "dynamic" expression, the new total number of properties in the PropertyArray must not exceed the in-object property count (more precisely, the larger of 12 and the number of in-object properties).

For example, an object created via an object literal with 80 properties has all of them in-object. If you then generate a batch of property names (expressions), at most 80 more fast properties can be added (perhaps one or two more, depending on the PropertyArray growth step). If you create a 40-property object with JSON.parse, there are naturally 36 external properties in the PropertyArray. Adding by property-name literal, you can add another 92 properties before reaching the 128-external-property limit (again, the limit may be exceeded by one or two); adding by expression, a single property addition is enough to trigger the slow-mode conversion.

In the description above we deliberately ignored transition searching and extension. In fact, when an object is created, V8 tries to search the transition tree for a shared map; if none is found, the fast-mode object builds its own transition chain as it is constructed and properties are added. This does not matter much when creating a single object — without a huge transition tree, a slow-mode object can even use less memory. But with a large number of isomorphic objects the difference becomes obvious, as the next section shows.

JSObject memory estimation


First, let's explain the initial example. Memory soars because all 100,000 objects in the array switch to slow mode. For instance, when the property count is increased to 132, the fast-property limit is reached (4 in-object + 128 external); the 133rd property takes the last empty slot in the store, and the 134th makes the TooManyFastProperties function return true, triggering the MigrateFastToSlow process. Deleting an intermediate attribute destroys the transition, which directly causes the slow-mode conversion. After that, adding several more attributes in slow mode does not increase memory (the dictionary has plenty of slack).

To explain the memory surge, we first give the basic methods for calculating the memory usage of fast-mode and slow-mode objects.

On a 64-bit system, one word is 8 bytes. The object itself has 3 pointers totaling 24 bytes, and a map itself is 80 bytes (80 bytes in the V8 build, 88 bytes in Chrome — an unresolved question). Depending on the data, referenced property values may take extra memory, for example string objects.

A fast-mode object's memory usage mainly comes from the descriptor array and transition tree (back pointers) in the map, plus the in-object properties and the property array.

In-object properties occupy (in bytes, likewise below) 8 * number of in-object properties.

The property array occupies 16 + 8 * external properties + 8 * slack. Here 16 is the type header size, and slack is the number of extra pre-allocated empty slots, usually only a few (the array grows by 3 slots at a time when space runs out).

The descriptor array occupies 24 + 3 * 8 * number of properties + 3 * 8 * slack. The 24 bytes are the type header, and each property stores three kinds of data, one word each. The heuristic for slack allocation is complicated: V8 learns the structure of created objects to optimize slack, so a large number of isomorphic objects should end up with no empty slots. (Chrome's instance descriptor also measures 8 bytes more than in the V8 build; the reason is unknown, perhaps a kernel difference.)

The transition tree is comparatively expensive, though in many cases a fast object carries no transition information. When there are transitions, each map's back pointer points to its parent map in the transition tree, and the root's back pointer points to the constructor. Every transition or property addition creates a new map, and each map itself is 80 bytes.

A slow-mode object generally has no in-object properties, its map has no transition tree, and its descriptor array length is zero; the main memory usage comes from the NameDictionary.

The memory calculation of the dictionary is more complicated:

16 + 5 * 8 + round_up_to_2^n((number of properties + 2) * 1.5) * 3 * 8

Here 16 + 5 * 8 can be regarded as the NameDictionary header. Each property (entry) again stores three kinds of 8-byte data in the dictionary: the property name, the property value, and the property attribute details. The capacity calculation is more involved. V8's HashTable is an open addressing + quadratic probing implementation; quadratic probing has a significantly lower probability of hash collisions when at least 50% of slots are vacant. Balancing performance and storage, V8 sizes the table so that at least half of the capacity remains free after properties are added.

Taking the example from the beginning of the article: for the 100 properties in slow mode, (100 + 2) * 1.5 = 153, which rounds up to a capacity of 256, and 256 * 3 * 8 + header = 6200 bytes.
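The arithmetic above can be sketched as a small estimator (64-bit constants from this section; the function names are mine):

```javascript
// Estimate NameDictionary size in bytes for a slow-mode object,
// per the formula above.
function roundUpToPow2(n) {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}
function nameDictionaryBytes(numProps) {
  const header = 16 + 5 * 8;                            // 56-byte header
  const capacity = roundUpToPow2((numProps + 2) * 1.5); // >= 50% vacancy
  return header + capacity * 3 * 8;                     // 3 words per entry
}
// nameDictionaryBytes(100) gives 6200, matching the example above
```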

In fast mode with 100 properties: 4 in-object properties take 32 bytes; the property array holds 96 properties, 16 + 96 * 8 = 784 bytes; the descriptor array is 24 + 3 * 8 * 100 = 2424 bytes; the transition tree is approximately 8000 bytes — a total of roughly 11336 bytes. By this calculation the fast-mode object occupies more memory, which contradicts our earlier answer.

This number is correct for slow mode, but for fast mode we just ignored the shared-map design for isomorphic objects. In fact, all 100,000 objects share the same map, and each object's unique memory is only its property store: 32 + 784 + 24 = 840 bytes.

Not all isomorphic fast objects can share a map. One prerequisite for sharing is that the object maintains its transitions and can find the desired node in the transition tree. As mentioned above, if an object literal creates more than 128 but fewer than 1021 attributes, V8 first creates a slow object and then converts it to a fast one, the MigrateSlowToFast procedure. During this process the newly created map neither creates nor references any transitions. Fast objects of this kind carry no huge transition tree, but they do not share a map either. In this case the memory footprint is 16 + 8 * 100 (property array) + 24 (JSObject) + 3 * 8 * 100 (descriptor array) = 3240 bytes, still a great advantage over slow mode. (Attentive readers may notice that the methods introduced so far cannot directly create such fast objects, but they can be obtained through a two-step conversion.)

In practice, considering only the memory of the JSObject's own structure is not enough; property names and values affect object memory differently in different situations. For property names, intuitively, long names consume more memory than short ones. However, across a large number of isomorphic objects, the same property name actually shares a single string object, so memory does not keep growing as the number of isomorphic objects grows. Numeric property values, on the other hand, are not shared. The most common numeric representations in V8 are Smi and HeapNumber: the former is a small integer of 4–8 bytes, the latter describes values Smi cannot express, such as floating-point numbers. HeapNumber inherits from HeapObject, and its 16-byte size (map pointer + value) prevents it from being stored directly in the 8-byte property value slot.

When a fast object adds a numeric property, HeapNumber storage is optimized: only its 8-byte double value is stored, as raw bits. Slow objects store the HeapNumber pointer directly (thus an extra 16 bytes of overhead). Moreover, the MigrateSlowToFast process does not re-optimize values taken from the dictionary. Therefore, a floating-point property added to a slow object has higher memory overhead than one added to a fast object.

JSObject memory optimization solution

In practice, the data actually used by the author has 187 properties per object, 66 of which are floating point, and each object has a unique string property value (32 bytes). Loading 50,000 rows directly with JSON.parse consumes 691MB of memory (peak memory during loading can reach 830MB). This section presents a series of optimization strategies with some experimental data.

With so many properties, every object ends up in slow mode. The dictionary size per object: capacity = round_up_to_2^n((187 + 2) * 1.5) = 512, so the dictionary is 512 * 3 * 8 + 16 + 5 * 8 = 12344 bytes; adding floating-point and string property values and other structures, about another 1192 bytes, each object takes up to 13536 bytes. Together with the environment's initial memory usage of about 48MB, the calculation is basically consistent with the observed footprint.

According to the analysis above, the general idea is to keep objects in fast mode and share maps as much as possible; the goal is to reduce memory usage both at peak and after garbage collection.

Without optimizing the floating-point fields or sharing maps, and assuming 100 bytes for the fixed per-object structure, the object size would be (16 + 187 * 8) + (24 + 187 * 3 * 8) + 66 * 16 + 32 + 100 = 7212 bytes, for a total memory usage of about 392MB.

For fast objects sharing a map, the memory usage comes mainly from the in-object properties and the property array (16 + 187 * 8); including the string value and object structure, 1572 bytes per object in total, and the minimum memory usage can be pushed down to about 123MB — the best result our optimization can aim for.

Method 1: Migrate slow objects to fast mode

As we saw above, V8 has a MigrateSlowToFast implementation. How can we get it called in production?

When an object is set as a prototype, V8 optimizes it internally: MigrateSlowToFast is called during the MakePrototypesFast procedure. An example of the technique used in practice is as follows:
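A minimal sketch of the technique, under the assumptions described above (the helper name is mine; the exact trigger conditions inside V8 may vary by version):

```javascript
// Setting obj as a function's prototype and then querying a property
// twice through an instance nudges V8 into running
// MakePrototypesFast -> MigrateSlowToFast on obj.
function migrateToFast(obj) {
  function F() {}
  F.prototype = obj;            // marks obj as a prototype
  const key = Object.keys(obj)[0];
  const instance = new F();
  instance[key];                // query twice so the inline cache and
  instance[key];                // prototype optimization kick in
  return obj;
}

// Usage: force an object into dictionary mode, then migrate it back.
const slow = { a: 1, b: 2, c: 3 };
delete slow.b;                  // slow (dictionary) mode now
migrateToFast(slow);
```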

The author will not try to explain here why, in the example above, the object is instantiated and the property queried twice after being set as a function prototype. Interested readers can consult the blog posts about inline caches.

By setting the object as a prototype, we can migrate a slow object to fast mode, and then keep adding fast properties until the fast-property upper limit of 1020 is reached (the usual fast-property count check is not applied to prototype maps). This method does not have to happen at creation time, so it is flexible: slow objects can be converted on demand.

However, an object converted this way cannot share a map (the new map contains no transitions after conversion), and floating-point properties added afterwards are not optimized. Also, because the object is designated as a prototype, it ends up slightly larger than a normal fast object (some related reference objects are newly created).

Experimental results:

Memory peak: 1135MB

Loading completed: 399MB

The memory footprint of the loaded data has dropped, but the extra conversion work raised the peak memory (GC was forced continuously during the run).

Method 2: Use the object literal to parse the JSON object

Looking back at the earlier analysis: JSON.parse can produce fast objects with up to 132 properties (or 133, counting the slack slot) that share maps. An object literal can create fast objects with up to 1020 properties, and can share maps when creating objects with at most 128 properties. How do we use object literals at runtime? The eval function does, provided we can guarantee the data is trusted and controlled.

Hint: you should not call eval directly in the same scope where the data resides; that causes memory leaks. Keep eval separated from the data.
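One way to honor that hint is an indirect eval wrapped in its own function (a sketch; the helper name is mine, and this must only ever run on trusted, controlled input):

```javascript
// Indirect eval runs in the global scope, so the (potentially huge)
// source string is not retained by closures in the caller's scope.
function parseViaEval(sourceText) {
  return (0, eval)('(' + sourceText + ')');
}

const row = parseViaEval('{"t0": 0, "t1": 0}');
```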

Experimental results:

Memory peak: 620MB

Loading completed: 384MB

Memory usage has generally fallen, but this method still cannot share maps. Is there room to squeeze further?

Method 3: Create a fast object first, then add properties

Methods one and two achieve memory optimization without changing the original data. The two methods discussed next change the input or parsed data structure.

Reviewing the previous section again: when an object literal creates 128 or fewer properties, all of them are in-object properties and the transition tree is maintained — and the more in-object properties there are, the more external fast properties we can add. This applies to our example, though the implementation may not be elegant: split the data file in two by property, e.g. one file with the first 128 properties of each object and the other with the remaining 59.

Specifically: first use eval to create an array of 128-property fast objects, then create (via eval or the parser, as appropriate) the array of objects holding the remaining 59 properties, and finally traverse the arrays, adding the 59 properties to each of the first fast objects. This method shares maps (the transition tree is maintained throughout) and benefits from the floating-point storage optimization of fast objects.
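The merge step can be sketched with tiny stand-in data (helper and variable names are mine; in the real scenario both arrays come from the split files described above):

```javascript
// Second pass: add the remaining properties to each literal-created
// row. Plain named stores keep the rows on the fast path, so all rows
// continue to share one transition chain.
function mergeRowPairs(primaryRows, extraRows) {
  for (let i = 0; i < primaryRows.length; i++) {
    Object.assign(primaryRows[i], extraRows[i]);
  }
  return primaryRows;
}

const first = [{ t0: 0, t1: 0 }, { t0: 1, t1: 1 }];
const rest  = [{ t2: 0 }, { t2: 1 }];
const merged = mergeRowPairs(first, rest);
```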

Experimental results:

Memory peak: 290MB

Loading completed: 125MB

In theory, peak memory can be reduced further depending on the situation — for example, by splitting the file into a more balanced 100 and 87 properties.

At this point, we have basically reached the limit of data object memory optimization.

Method 4: Embrace the array

If you can accept further changes to the data structure — using objects as arrays, or simply migrating to an array-of-arrays structure with a mapping table to restore property-name lookup (at some cost in development effort and lookup performance) — storage efficiency also improves significantly.
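The mapping-table idea, sketched (names are mine): one shared column index replaces per-object property storage.

```javascript
const COLUMNS = ['t0', 't1', 't2'];  // shared by every row
const colIndex = new Map(COLUMNS.map((name, i) => [name, i]));
const table = [[10, 11, 12], [20, 21, 22]];

// Restore name-based lookup through the shared index.
function getCell(rowIdx, name) {
  return table[rowIdx][colIndex.get(name)];
}
```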

If we simply use JSON.parse to load an object used as an array ({…"185": some_value, "186": some_value}):

Memory peak: 214MB

Loading completed: 182MB

Using nested arrays:

Memory peak: 178MB

Loading completed: 164MB

Embracing arrays also significantly reduces the on-disk size of the data file itself. The author did not probe the memory-optimization limits of such data structures — there may be room for further improvement — but ordinary usage already performs very well.

To sum up

This article analyzed the implementation details of JSObject from a concrete perspective, and discussed some effective memory-optimization ideas for large numbers of isomorphic objects. The author acknowledges that most JS or Node developers will not encounter such problems in practice.

Of course, object design cannot be judged purely by storage efficiency; query, addition and deletion performance, extensibility and compatibility are also important considerations, and the author is unable to evaluate the JSObject storage design comprehensively.

Although the article gives a number of memory-optimization methods based on a real case, in practice you still need to design for your specific scenario. How many properties are there? Do properties need to be added or deleted? How many may be added, and by what rules are they deleted? What are the efficiency and memory goals? All of these directly constrain the choice of optimization technique — understanding object structure and transition logic is the foundation of such optimization.

Sorry for the poor writing, and thanks for your tough reading. Please feel free to leave any opinion or ideas.
