The Virtual DOM was one of React’s main differentiators when it first appeared. It was a big advantage in comparison with previous frameworks, and newer libraries started to follow the same approach (e.g. Vue.js).
Even with all the attention that the concept received in the past few years, there are still several questions surrounding the topic. How does it work behind the scenes? Why is it considered faster than direct DOM manipulation? How does it relate to dirty model checking?
When you're dealing with a client-side application, you quickly face one issue: DOM manipulation is expensive. If your application is big or very dynamic, the time spent in manipulating DOM elements rapidly adds up and you hit a performance bottleneck.
The obvious answer to the problem is to avoid manipulating elements unless strictly necessary. The approach used by Angular, which is arguably the framework that popularized the concept of SPAs (Single Page Applications), is called Dirty Model Checking.
Example model:

```javascript
{
  subject: 'World'
}
```
Example template:

```html
<div>
  <div id="header">
    <h1>Hello, {{model.subject}}!</h1>
    <p>How are you today?</p>
  </div>
</div>
```
With this approach, the framework keeps tabs on all models. If the model changes, it interpolates/executes the corresponding templates, manipulating the DOM directly. If the model doesn't change, it won't touch the DOM.
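As a rough illustration of the idea, here is a minimal sketch of dirty checking. The `createDirtyChecker` helper and the string-based `template` are hypothetical names made up for this example, and comparing serialized snapshots is only one simple way to detect a change, not how any particular framework does it.

```javascript
// Hypothetical helper: returns a function that re-runs the template only when the model is dirty
const createDirtyChecker = (template) => {
  let lastSnapshot = null; // serialized copy of the model from the previous check

  return (model) => {
    const snapshot = JSON.stringify(model);
    if (snapshot !== lastSnapshot) {
      // The model is dirty: run the whole template again
      lastSnapshot = snapshot;
      return { rendered: true, html: template(model) };
    }
    // The model is clean: leave the DOM alone
    return { rendered: false, html: null };
  };
};

const template = model => `<h1>Hello, ${model.subject}!</h1><p>How are you today?</p>`;
const check = createDirtyChecker(template);

const first = check({ subject: 'World' });  // dirty (first run)
const second = check({ subject: 'World' }); // clean, nothing to do
const third = check({ subject: 'Mom' });    // dirty again

console.log(first.rendered, second.rendered, third.rendered); // true false true
```

Note that whenever the model is dirty, the entire template output is produced again, static parts included.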
Now, this is a smart solution. There are still problems with it, though. One of the main issues becomes very obvious when changes to your model don't necessarily translate into a change in the template - or, even worse, when your model and template are super complex.
In the example shown above, that `p` tag will never change. Yet it will still be updated every single time your model is considered dirty - there is nothing between your template and the actual DOM, so the whole thing is modified every time.
A simple solution to this problem is to add a layer between your template and your DOM: the Virtual DOM.
Basically, it's an in-memory representation of the actual elements that are being created for your page.
Let's go back to that previous HTML:

```html
<div>
  <div id="header">
    <h1>Hello, {{state.subject}}!</h1>
    <p>How are you today?</p>
  </div>
</div>
```
After rendering, your virtual DOM could be represented as something like this:

```javascript
{
  tag: 'div',
  children: [
    {
      tag: 'div',
      attributes: {
        id: 'header'
      },
      children: [
        {
          tag: 'h1',
          children: 'Hello, World!'
        },
        {
          tag: 'p',
          children: 'How are you today?'
        }
      ]
    }
  ]
}
```
Now, let's say our `state` changed - `state.subject` is now `Mom`. The new representation will be:
```javascript
{
  tag: 'div',
  children: [
    {
      tag: 'div',
      attributes: {
        id: 'header'
      },
      children: [
        {
          tag: 'h1',
          children: 'Hello, Mom!'
        },
        {
          tag: 'p',
          children: 'How are you today?'
        }
      ]
    }
  ]
}
```
We can now diff the two trees and identify that only that `h1` changed. We then surgically update that single element - no need to manipulate the whole thing.
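To make the diffing step concrete, here is a small sketch that walks two trees of this shape and reports which nodes changed. `findChangedNodes` and `tree` are hypothetical helpers for this illustration, and the sketch assumes both trees have exactly the same structure, with only text differing.

```javascript
// Hypothetical helper: collects a path to every node whose text changed
const findChangedNodes = (previous, current, path = current.tag) => {
  if (typeof current.children === 'string') {
    return previous.children !== current.children ? [path] : [];
  }
  // Not a string, so it's an array: compare children position by position
  return current.children.flatMap((child, index) =>
    findChangedNodes(previous.children[index], child, `${path} > ${child.tag}`));
};

// Builds the tree shown above for a given subject
const tree = subject => ({
  tag: 'div',
  children: [{
    tag: 'div',
    attributes: { id: 'header' },
    children: [
      { tag: 'h1', children: `Hello, ${subject}!` },
      { tag: 'p', children: 'How are you today?' },
    ],
  }],
});

const changed = findChangedNodes(tree('World'), tree('Mom'));
console.log(changed); // [ 'div > div > h1' ]
```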
Let's make things a bit more interesting - we'll write our own naive implementation of a Virtual DOM library!
Since we want to keep things as simple as possible, let's not worry about edge cases at all - we'll provide just enough functionality to abstract our previous Hello World example.
We'll write a few base components: `div`, `p`, and `h1`.
In order to keep things as simple as we possibly can, we'll force each node to contain an id, so that we can easily and quickly find the actual DOM element later.
```javascript
/*
 * Helper to create DOM abstraction
 */
const makeComponent = tag => (attributes, children) => {
  if (!attributes || !attributes.id) {
    throw new Error('Component needs an id');
  }

  return {
    tag,
    attributes,
    children,
  };
};

const div = makeComponent('div');
const p = makeComponent('p');
const h1 = makeComponent('h1');
```
We now have the functions `div`, `p`, and `h1` in scope. If you're into Functional Programming, you'll identify that as partial application. If you're not, you can see the functions as just a bit of syntactic sugar - you won't have to provide the `tag` argument every single time you need a component.
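If partial application is new to you, the following sketch may help. It uses a simplified `makeComponent` (the id check is left out for brevity) and compares passing the tag on every call with fixing it once up front.

```javascript
// Simplified version of the helper: the first call fixes `tag`, the second supplies the rest
const makeComponent = tag => (attributes, children) => ({ tag, attributes, children });

// Without a pre-applied function, the tag has to be repeated on every call
const viaFullCall = makeComponent('h1')({ id: 'title' }, 'Hello');

// With partial application, `h1` remembers the tag for us
const h1 = makeComponent('h1');
const viaPartial = h1({ id: 'title' }, 'Hello');

console.log(JSON.stringify(viaFullCall) === JSON.stringify(viaPartial)); // true
```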
Now that we have a few basic elements, we can start composing more complex components. Let's introduce the concept of state here.
Again, because we want to keep this simple, we won't go into state management. Let's just assume the state is being tracked/managed somewhere else.
```javascript
/*
 * app component - creates a slightly more complex component out of our base elements
 */
const app = state => div({ id: 'main' }, [
  div({ id: 'header' }, [
    h1({ id: 'title' }, `Hello, ${state.subject}!`)
  ]),
  div({ id: 'content' }, [
    p({ id: 'static1' }, 'This is a static component'),
    p({ id: 'static2' }, 'It should never have to be re-created'),
  ]),
]);
```
As you can see, we've just represented something similar to the previous HTML template - but this time in JavaScript. This is the basic essence behind JSX. Below the HTML-esque syntax, it ultimately gets translated to JavaScript function calls - something that's not so fundamentally different from our naive implementation here.
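To illustrate that translation, here is a sketch using a hypothetical `createElement` factory that returns the node shape used in this guide. With React's classic JSX transform, the generated calls go to `React.createElement(type, props, ...children)` instead; the factory below is only a stand-in, not React's implementation.

```javascript
// Hypothetical stand-in for a JSX factory, returning the node shape used in this guide
const createElement = (tag, attributes, ...children) => ({
  tag,
  attributes: attributes || {},
  // A single string child stays a string; anything else becomes an array of nodes
  children: children.length === 1 && typeof children[0] === 'string' ? children[0] : children,
});

// JSX such as:
//   <div id="header"><h1 id="title">Hello, World!</h1></div>
// is compiled into nested function calls along these lines:
const header = createElement('div', { id: 'header' },
  createElement('h1', { id: 'title' }, 'Hello, World!'));

console.log(header.children[0].children); // Hello, World!
```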
In a nutshell, that "component" is a simple function that takes a `state` (analogous to our previously-mentioned `model`) and returns a Virtual DOM tree. Assuming our state looks like this:
```javascript
{
  subject: 'World'
}
```
Then our virtual DOM tree should look like this:

```json
{
  "tag": "div",
  "attributes": {
    "id": "main"
  },
  "children": [
    {
      "tag": "div",
      "attributes": {
        "id": "header"
      },
      "children": [
        {
          "tag": "h1",
          "attributes": {
            "id": "title"
          },
          "children": "Hello, World!"
        }
      ]
    },
    {
      "tag": "div",
      "attributes": {
        "id": "content"
      },
      "children": [
        {
          "tag": "p",
          "attributes": {
            "id": "static1"
          },
          "children": "This is a static component"
        },
        {
          "tag": "p",
          "attributes": {
            "id": "static2"
          },
          "children": "It should never have to be re-created"
        }
      ]
    }
  ]
}
```
You didn't think we'd stop there, did you?
Again, to keep with the theme of this guide, let's not build anything too complicated. We'll write just enough code to cover our simple app.
Here's the code that turns a virtual DOM tree into actual DOM elements:
```javascript
/*
 * Sets element attributes
 * element: a DOM element
 * attributes: object in the format { attributeName: attributeValue }
 */
const setAttributes = (element, attributes) => {
  Object
    .keys(attributes)
    .forEach(a => element.setAttribute(a, attributes[a]));
};

/*
 * Renders a virtual DOM node (and its children)
 */
const renderNode = ({ tag, children = '', attributes = {} }) => {
  // Let's start by creating the actual DOM element and setting attributes
  const el = document.createElement(tag);
  setAttributes(el, attributes);

  if (typeof children === 'string') {
    // If our "children" property is a string, set it as the element's text.
    // Using `textContent` instead of `innerHTML` keeps the string from being parsed as markup.
    el.textContent = children;
  } else {
    // If it's not a string, then we're dealing with an array. Render each child and append it to this element
    children.map(renderNode).forEach(child => el.appendChild(child));
  }

  // We finally have the node and its children - return it
  return el;
};
```
As you can see, this is not super sophisticated and doesn't cover a whole lot of edge cases - but it's just enough for us.
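If you'd like to experiment with the same recursion outside a browser (in Node.js, for instance, where `document` isn't available), a string-based variant is one option. The `renderToString` and `escapeHtml` helpers below are hypothetical additions for this sketch and are not used elsewhere in this guide.

```javascript
// Escapes the characters that are significant in HTML text and attribute values
const escapeHtml = text => String(text)
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;')
  .replace(/"/g, '&quot;');

// Hypothetical string-based counterpart of a DOM renderer
const renderToString = ({ tag, children = '', attributes = {} }) => {
  const attrs = Object.keys(attributes)
    .map(name => ` ${name}="${escapeHtml(attributes[name])}"`)
    .join('');

  // A string is rendered as text; an array is rendered child by child
  const inner = typeof children === 'string'
    ? escapeHtml(children)
    : children.map(child => renderToString(child)).join('');

  return `<${tag}${attrs}>${inner}</${tag}>`;
};

const html = renderToString({
  tag: 'div',
  attributes: { id: 'header' },
  children: [{ tag: 'h1', attributes: { id: 'title' }, children: 'Hello, World!' }],
});

console.log(html); // <div id="header"><h1 id="title">Hello, World!</h1></div>
```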
We can now see it in action by running the following script (assuming our HTML contains an element with the id `root`):

```javascript
const virtualDOMTree = app({ subject: 'World' });
const rootEl = document.querySelector('#root');
rootEl.appendChild(renderNode(virtualDOMTree));
```
So far, we've created a DOM abstraction layer - let's now work on our `diff`.
The first step is to get two nodes and check if they're different. Let's use the following code:
```javascript
/*
 * Runs a shallow comparison between 2 objects
 */
const areObjectsDifferent = (a, b) => {
  // Set of all unique keys (quick and dirty way of doing it)
  const allKeys = Array.from(new Set([...Object.keys(a), ...Object.keys(b)]));

  // Return true if one or more elements are different
  return allKeys.some(k => a[k] !== b[k]);
};

/*
 * Diff 2 nodes
 * Returns true if different, false if equal
 */
const areNodesDifferent = (a, b) => {
  // If at least one of the nodes doesn't exist, we'll consider them different.
  // Also, if the actual `tag` changed, we don't need to check anything else.
  if (!a || !b || (a.tag !== b.tag)) return true;

  const typeA = typeof a.children;
  const typeB = typeof b.children;

  return typeA !== typeB // Cover the case where we went from children being a string to an array
    || areObjectsDifferent(a.attributes, b.attributes) // changes in attributes
    || (typeA === 'string' && a.children !== b.children); // if it's a string, did the text change?
};
```
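To get a feel for what this comparison does and doesn't catch, here are a few sample calls. The two functions are repeated from above in condensed form so the snippet runs on its own; the `title` and `header` helpers are made up for the example.

```javascript
const areObjectsDifferent = (a, b) =>
  Array.from(new Set([...Object.keys(a), ...Object.keys(b)])).some(k => a[k] !== b[k]);

const areNodesDifferent = (a, b) => {
  if (!a || !b || (a.tag !== b.tag)) return true;
  const typeA = typeof a.children;
  const typeB = typeof b.children;
  return typeA !== typeB
    || areObjectsDifferent(a.attributes, b.attributes)
    || (typeA === 'string' && a.children !== b.children);
};

// Hypothetical sample nodes
const title = text => ({ tag: 'h1', attributes: { id: 'title' }, children: text });
const header = text => ({ tag: 'div', attributes: { id: 'header' }, children: [title(text)] });

console.log(areNodesDifferent(title('Hello, World!'), title('Hello, World!'))); // false
console.log(areNodesDifferent(title('Hello, World!'), title('Hello, Mom!')));   // true
console.log(areNodesDifferent(title('Hi'), undefined));                         // true

// The check only looks at the node itself: a parent whose child changed is still "equal"
console.log(areNodesDifferent(header('Hello, World!'), header('Hello, Mom!'))); // false
```

The last call shows why a separate function still has to walk down the tree: changes in nested nodes are only detected when the children themselves are compared.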
Finally, let's write a function that navigates our virtual DOM tree and re-renders elements if necessary:
```javascript
/*
 * Gets the previous and current node representations
 * replaces the real DOM based on whether or not the representation changed
 */
const diffAndReRender = (previousNode, currentNode) => {
  if (areNodesDifferent(currentNode, previousNode)) {
    // Is the current node different? If so, replace it.
    const nodeId = currentNode.attributes.id;
    console.log('Replacing DOM node:', nodeId);

    return document
      .querySelector(`#${nodeId}`)
      .replaceWith(renderNode(currentNode));
  } else if (Array.isArray(currentNode.children)) {
    // If not, and the children prop is an array, recursively call this function for each child
    currentNode.children.forEach((currChildNode, index) => {
      diffAndReRender(previousNode.children[index], currChildNode);
    });
  }
};
```
Note that we're matching children based on index here. This kind of matching is not good enough for a real-world scenario but works in our example app.
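The sketch below shows one way index-based matching can go wrong: inserting a single item at the front of a list shifts every sibling, so each position appears to have changed. Matching by a stable identifier avoids that (React's `key` prop serves a similar purpose). The `matchByIndex`, `matchById`, and `countChanged` helpers are hypothetical and exist only for this comparison.

```javascript
// Pairs each current child with the previous child at the same position
const matchByIndex = (previous, current) =>
  current.map((node, index) => [previous[index], node]);

// Pairs each current child with the previous child that has the same id
const matchById = (previous, current) => {
  const byId = new Map(previous.map(node => [node.attributes.id, node]));
  return current.map(node => [byId.get(node.attributes.id), node]);
};

// Counts the pairs whose previous node is missing or whose text differs
const countChanged = pairs =>
  pairs.filter(([prev, curr]) => !prev || prev.children !== curr.children).length;

const item = (id, text) => ({ tag: 'p', attributes: { id }, children: text });
const before = [item('a', 'First'), item('b', 'Second')];
// A new item is inserted at the front; 'a' and 'b' are untouched but shifted
const after = [item('new', 'Zero'), item('a', 'First'), item('b', 'Second')];

console.log(countChanged(matchByIndex(before, after))); // 3
console.log(countChanged(matchById(before, after)));    // 1
```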
Now that we have a way to run a `diff` and surgically replace specific elements that actually change, let's run our code again - this time simulating a state update:

```javascript
// Render the initial application
const virtualDOMTree = app({ subject: 'World' });
const root = document.querySelector('#root');
root.appendChild(renderNode(virtualDOMTree));

// Generate a new virtual DOM tree based on a change in state:
const newVirtualDOMTree = app({ subject: 'Mom' });

diffAndReRender(virtualDOMTree, newVirtualDOMTree);
```
After running our `diffAndReRender` function, we'll see a message in the console saying `Replacing DOM node: title`. That's it, no other element is replaced.
And indeed, our `#title` element will now say `Hello, Mom!`.
Now, this gives us a nice segue into the next segment.
If you've read the previous section, you'll have noticed that we reran the whole app after changing our state. I wrote that for a reason - this is exactly what React does. In most situations, this behavior is completely fine. You're avoiding hammering the actual DOM, and most of your components won't leave a large footprint anyway.
That said, there are always scenarios where your component is far more complex or is running an expensive algorithm. In these situations, you'll need to worry about optimizing your component to prevent wasted update cycles. In other words, you need to make sure your component is only being executed again if it's actually resulting in changes to the output. Luckily, React provides us with a few ways to optimize for that scenario - namely the `shouldComponentUpdate` lifecycle method, the `React.memo` HOC, and the `React.PureComponent` class. I also wrote a blog post focusing on performance tuning for React components some time ago, if you're interested.
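To show the general idea behind these optimizations without pulling in React, here is a minimal sketch applied to the plain-function components used in this guide. The `memoize` and `shallowEqual` helpers are hypothetical; this is not React's implementation, only a similar shallow comparison in spirit to the one `React.memo` performs on props by default.

```javascript
// Shallow comparison of two plain objects
const shallowEqual = (a, b) => {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  return keysA.length === keysB.length
    && keysA.every(k => Object.prototype.hasOwnProperty.call(b, k) && a[k] === b[k]);
};

// Hypothetical wrapper: skips the component when the state is shallowly unchanged
const memoize = component => {
  let lastState = null;
  let lastTree = null;

  return state => {
    if (lastState !== null && shallowEqual(lastState, state)) {
      return lastTree; // reuse the previous virtual DOM tree
    }
    lastState = state;
    lastTree = component(state);
    return lastTree;
  };
};

let executions = 0;
const title = memoize(state => {
  executions += 1; // stands in for an expensive computation
  return { tag: 'h1', attributes: { id: 'title' }, children: `Hello, ${state.subject}!` };
});

title({ subject: 'World' });
title({ subject: 'World' }); // same state: the component is not executed again
title({ subject: 'Mom' });

console.log(executions); // 2
```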
Another common issue is when one of the elements near the top of the tree changes so dramatically that it ends up completely replaced - say, for instance, that you changed from a `<MyComponentForLargeScreens>` to a `<MyComponentForSmallScreens>`. Because you completely replaced this node, every single element branching off of it will be re-created as well. I've seen it first-hand in situations where, as an example, the application changes its root element based on window width (hence the component names!). Running it on a smartphone and changing the device orientation (i.e. rotating between horizontal and vertical) causes the whole application to be unmounted and re-created from scratch. This sounds OK until you realize that you also lose all state kept within components - half-filled forms can suddenly go blank - and that's on top of the performance penalty. This is something that requires attention.
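The cost of swapping a node near the top can be sketched with the simple node shape used in this guide. The `countRecreated` helper below is hypothetical; it estimates how many nodes would be re-created, using plain tags (`section`, `article`) as stand-ins for the two root components.

```javascript
// Counts a node plus all of its descendants
const countNodes = node =>
  1 + (Array.isArray(node.children)
    ? node.children.reduce((sum, child) => sum + countNodes(child), 0)
    : 0);

// Hypothetical helper: how many nodes would be re-created going from `prev` to `curr`?
const countRecreated = (prev, curr) => {
  // A missing or different node means the whole subtree is rebuilt
  if (!prev || prev.tag !== curr.tag) return countNodes(curr);
  if (!Array.isArray(curr.children)) return prev.children !== curr.children ? 1 : 0;
  if (!Array.isArray(prev.children)) return countNodes(curr);
  return curr.children.reduce(
    (sum, child, index) => sum + countRecreated(prev.children[index], child), 0);
};

const page = rootTag => ({
  tag: rootTag,
  children: [
    { tag: 'h1', children: 'Title' },
    { tag: 'p', children: 'Body' },
  ],
});

console.log(countRecreated(page('section'), page('section'))); // 0
console.log(countRecreated(page('section'), page('article'))); // 3
```

Even though the heading and paragraph are identical in both trees, changing the root tag causes all three nodes to be rebuilt.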
The Virtual DOM is definitely going to be around for a while. It provides a really nice way of decoupling your application's logic from its DOM elements and, therefore, reduces the likelihood of creating unintentional bottlenecks when it comes to DOM manipulation. Other libraries are moving forward with the same approach, further solidifying the concept as one of the preferred strategies for web applications.
It's worth mentioning that dirty model checking and virtual DOM are not mutually exclusive. They both came as solutions for the same problem but tackle it in different ways. An MVC framework could very well implement both techniques. In React's case, it just didn't make much sense - React is mostly a View library after all.
So, in summary, the Virtual DOM implements:
- A `diff` algorithm designed to identify changes between DOM representations.

I consider the virtual DOM one of the cornerstones of mastering React - it certainly allowed me to have more context on some of the choices that went into designing the framework, and even to improve my own components and optimization techniques. Hopefully, it'll be as useful to you as it was to me!