Understanding memory management in JavaScript: Garbage collection and more
Memory management ensures systems perform as they need to
Many businesses and developers opt for JavaScript due to its extensive support across various platforms, including web, mobile, servers, and embedded systems. Understanding how JavaScript manages memory is crucial for developers, as it relies on garbage collection instead of manual memory allocation and deallocation like in low-level languages such as C and C++. This knowledge enables developers to minimise memory overhead, optimise cache creation, and maintain application stability. In short, it means they can ensure systems perform as they need to.
This post will explain this with examples. We’ll cover topics such as:
The difference between reference and value types
Garbage collection
Weak data structures
Finalisation
Reference vs value
We'll now go through some basics of JavaScript's type system to provide context for the topics covered later in this article. This won't be a detailed explanation; for more information, you can look up JavaScript data structures on MDN.
Value
Any data represented by the primitive types will be considered a value. This means that you can assign the same primitive value to another variable, but you cannot modify the primitive itself.
Taking numbers as an example:
The value 1 is copied into a new variable bar, but we can't possibly change the value 1 itself to something else.
Also, if the value of the newly created variable is changed to another primitive, JavaScript won’t change the first variable's data.
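A minimal sketch of this copy behaviour. The name bar comes from the text above; foo is an assumed name for the first variable.

```javascript
// `foo` is a hypothetical name; `bar` is the name used in the text.
let foo = 1;
let bar = foo; // the primitive value 1 is copied into bar

bar = 2; // reassigning bar leaves foo untouched

console.log(foo); // 1
console.log(bar); // 2
```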
More information about the primitive data types in JavaScript can be found on MDN.
Reference
In JavaScript, there is another type known as objects. All other constructs in the language, such as arrays, are either derived from objects or add special behaviours to them, as is the case with functions.
Objects in JavaScript are reference values: a variable that holds an object stores a reference to the object's location in memory rather than the object itself.
The referential nature of objects has an important implication: any modification made to an object will be reflected wherever the object's reference is stored. This means that if you have multiple variables or data structures that reference the same object, any changes made to the object will be visible through all the variables which reference the same object.
Understanding the referential nature of objects is crucial when working with JavaScript. It allows you to manipulate and modify objects effectively, but it also requires careful consideration to avoid unintended side effects when multiple references to the same object exist.
Let's see an example of this concept:
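A minimal sketch of shared references. The names original, alias and copy are hypothetical and used only for illustration.

```javascript
// Hypothetical names, chosen for illustration only.
const original = { count: 1 };
const alias = original; // copies the reference, not the object

alias.count = 2; // mutates the single shared object

console.log(original.count); // 2
console.log(original === alias); // true

// A shallow copy creates a separate object, so later changes are not shared.
const copy = { ...original };
copy.count = 3;
console.log(original.count); // 2
```

If an unintended side effect is a concern, copying the object before modifying it, as in the last three lines, is one way to avoid it.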
Garbage collection
Garbage collection is the process of freeing up memory by removing anything that the collector determines no longer needs to be kept in memory.
Different programming languages use different techniques to determine when memory should be released. Tracing, reference counting, and escape analysis are a few of them.
Each JavaScript runtime might use a different algorithm, so the algorithm itself is not important for the sake of this post, but the concept of garbage collection is.
Here’s an example:
What we have here is a function that defines an object and passes back its reference, to be assigned to the result variable, which now holds the reference to it.
The variable inside the function, foo, is no longer needed once the function has finished executing. The object itself stays in memory for as long as result, or anything else, still references it. Once nothing references it, the garbage collector (GC) can get rid of it in its next run.
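A minimal sketch matching the description above. The names foo and result come from the text; the function name createObject and the object's contents are assumptions.

```javascript
// `createObject` and the object's contents are hypothetical.
function createObject() {
  const foo = { name: "example" };
  return foo; // the reference is handed back to the caller
}

let result = createObject();
const nameBefore = result.name; // the object is reachable through result
console.log(nameBefore);

// Dropping the last reference makes the object unreachable,
// so the GC is free to reclaim it on a later run.
result = null;
```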
Weak data structures
We know that objects are used for everything in JavaScript and that all other constructs are built on top of them. The most basic use case of an object is to store values indexed by a key.
The key is what defines how a particular value can be accessed again, but what if the key gets lost?
This cannot happen when working with regular objects because the keys are primitive values (strings or symbols), which makes them strong keys: keys that cannot be garbage collected. This is rarely a big problem, but it also makes it easy to create a large number of references and fill up memory quickly and unnecessarily.
Here's an example:
After the for loop executes, we use a[0]. We are not referencing the other elements elsewhere. However, the garbage collector cannot do anything because the elements are referenced and accessible by their indexes. In other words, the GC cannot make any assumptions about whether the other elements of the array can be garbage collected.
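A minimal sketch of this situation. The array name a comes from the text; the element count and the shape of each element are assumptions.

```javascript
// The loop size and element shape are hypothetical.
const a = [];

for (let i = 0; i < 1000; i++) {
  a.push({ id: i, payload: new Array(100).fill(i) });
}

// Only the first element is ever used...
console.log(a[0].id); // 0

// ...but all 1000 objects stay in memory, because the array
// still holds a strong reference to each of them by index.
console.log(a.length); // 1000
```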
Weak data structures differ from other, more traditional data structures, in that they don’t hold strong references towards the values they contain.
JavaScript provides three types of weak data structures:
WeakMap
WeakSet
WeakRef
Let's take an example where we define two objects to use as keys and two other objects to act as values. The example demonstrates how unreferenced keys get garbage collected when using a WeakMap.
To understand what happens, we need to consider that keyOne and keyTwo are two object references. When we pass them as keys to a WeakMap, the reference becomes the index used to access the value.
Once the reference is cleaned up, there is no way for you to access the value that was stored with it, unless you are also hard referencing it from somewhere else. In our case, valueOne and valueTwo are hard references to the values, so we can still use them, which prevents them from being garbage collected.
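A minimal sketch of the setup described above. The names keyOne, keyTwo, valueOne and valueTwo come from the text; the object contents and the name weakMap are assumptions. When the GC actually runs is up to the runtime, so the sketch only shows the parts that are deterministic.

```javascript
// Object contents are hypothetical.
let keyOne = { id: 1 };
let keyTwo = { id: 2 };
const valueOne = { data: "first" };
const valueTwo = { data: "second" };

const weakMap = new WeakMap();
weakMap.set(keyOne, valueOne);
weakMap.set(keyTwo, valueTwo);

console.log(weakMap.get(keyOne) === valueOne); // true

// Dropping the only strong reference to the key object makes the entry
// unreachable through the WeakMap; the GC may remove it on a later run.
keyOne = null;

// valueOne is still strongly referenced by its own variable,
// so the value object itself is not collected.
console.log(valueOne.data); // "first"

// keyTwo is still referenced, so its entry remains available.
console.log(weakMap.get(keyTwo) === valueTwo); // true
```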
Let’s have a look at another example:
Because of this, it is advisable to carry out null checks when using weak data structures, as a reference might be lost during program execution when the GC kicks in. This is what FinalizationRegistry can help with.
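One way such a check could look with a WeakRef, as a minimal sketch. The names target, ref and readName are hypothetical. WeakRef.prototype.deref() returns undefined once the referenced object has been collected, so that is the case the check guards against.

```javascript
// Hypothetical names, for illustration only.
let target = { name: "cached item" };
const ref = new WeakRef(target);

function readName(weakRef) {
  const value = weakRef.deref(); // undefined if the object was collected
  if (value === undefined) {
    return null; // the caller must handle the missing value
  }
  return value.name;
}

// target is still strongly referenced here, so the object is alive.
console.log(readName(ref)); // "cached item"
```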
Finalisation
To address the issue of losing references held by weak data structures, it is convenient to have a mechanism that allows us to inform our code when the referenced value has been lost.
In other programming languages, such as Java, where the concept of finalisation is commonly used, a class can have a finalize method, which is called when an instance of the class is no longer reachable. In this method, you can perform necessary cleanup steps or modify flags to prevent further processing of the data associated with the instance.
Similarly, JavaScript provides a helper construct, FinalizationRegistry, although it is not tied to a class as in Java. It lets you register a reference/instance that you want to track. Once that object has been garbage collected, the registered callback function is invoked with the value that you supplied at registration.
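A minimal sketch of registering an object for finalisation. The names cleanedUp, registry and session, and the held value string, are hypothetical. The callback runs at some point after collection, if at all, so nothing here should be read as a guarantee about timing.

```javascript
// Hypothetical names and values, for illustration only.
const cleanedUp = [];

const registry = new FinalizationRegistry((heldValue) => {
  // Receives the value passed as the second argument to register().
  cleanedUp.push(heldValue);
  console.log(`Finalised: ${heldValue}`);
});

let session = { user: "alice" };

// Track the object; the string is what the callback will receive.
registry.register(session, "session-for-alice");

// Drop the only strong reference. The callback may run on a later
// GC cycle; the runtime decides when, and may never call it at all.
session = null;
```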
In the example below we will see a simple approach to creating a cache for data loaded from the network, with the intent of keeping it around for as long as it's referentially accessible. Once it's lost, we don't need it in the cache, so we clear any remains of it and fetch from the network again.
To achieve this, we will need to:
Store the cache in a data structure
Fetch network data
Map the data to the cache in a way that it is garbage collectable
Fetch from the cache and fallback on the network if the cached value no longer exists
Let's go through the example to gain a better understanding of how it works:
We create a Map to keep track of the hard keys, which represent the path of the URL used to fetch the network data.
Next, we instantiate a FinalizationRegistry to track what has been cleared. In this case, the callback is called with the same key that we use to make the API call.
We use a network wrapper function together with a WeakRef. Instead of storing the actual data in the cache, we store a WeakRef to it. This is necessary because setting the data directly into the cache under a key would create another strong reference to the data, making it impossible to garbage collect. With a weak reference, the WeakRef stops pointing to the object once the GC collects it.
We register the original data reference, which was created during the network call in the previous step. During registration, we also pass the key to the register call, so that the finalisation registry knows what to pass to the callback.
We attempt to retrieve the reference and check whether the contained object still exists. The if conditions perform two checks. First, whether the key is still part of the Map. Second, whether the retrieved WeakRef still points to the network data. If either check fails, the function fetches the data again.
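The steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the names cache, registry, load, getData and defaultFetchJson are hypothetical, the fetch function is injectable so the sketch can run without a server, and the response is assumed to be a JSON object or array (a WeakRef cannot point to a primitive).

```javascript
// Step 1: hard keys (URL paths) mapped to WeakRefs of the loaded data.
const cache = new Map();

// Step 2: when a cached object is collected, drop its now-useless key.
const registry = new FinalizationRegistry((key) => {
  const ref = cache.get(key);
  // Only delete if the entry has not been replaced by fresher data.
  if (ref && ref.deref() === undefined) {
    cache.delete(key);
  }
});

// Hypothetical default fetcher; relies on the global fetch
// available in browsers and in Node 18+.
async function defaultFetchJson(path) {
  const response = await fetch(path);
  return response.json();
}

// Steps 3 and 4: network wrapper that stores a WeakRef and registers the data.
async function load(key, fetchJson = defaultFetchJson) {
  const data = await fetchJson(key); // assumed to resolve to an object
  cache.set(key, new WeakRef(data)); // weak, so the cache alone won't keep it alive
  registry.register(data, key); // the key is passed to the callback later
  return data;
}

// Step 5: read from the cache, falling back on the network.
async function getData(key, fetchJson) {
  if (cache.has(key)) {
    const cached = cache.get(key).deref();
    if (cached !== undefined) {
      return cached;
    }
  }
  return load(key, fetchJson);
}
```

A caller would hold on to the returned object for as long as it needs the data, for example const users = await getData("/users"). While that reference exists, repeated calls with the same key return the same object without another network request.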
By following this example, we will have a cache that only retains memory as long as the required data from the cache is being referenced somewhere in our program.
It's important to note that implementing these strategies can be challenging and should be approached only when there’s a real need to optimise an application’s memory footprint.
A deep dive into this fascinating topic can be found in Memory Management on MDN.
Conclusion
Having read this article, you now understand how:
The garbage collector works
We can run clean-up steps once the GC has collected an instance or reference that we wanted to monitor, a.k.a. finalisation and FinalizationRegistry
And you also got a simple example of how to create a cache for network data
These concepts represent a subset of the methods available to mitigate memory overhead, create and manage caches, and enhance the fault tolerance of GC-driven processes.