Always plan to virtualize

Posted on October 21, 2018  -  👨🏾‍💻 7 min read

In this post we’ll see what virtualizing is, why it’s so useful when dealing with large data sets, and how to write your components so they’re ready for the day you actually need to virtualize.

What is virtualizing?

Simply put, virtualizing is the act of making your computer think it actually needs or does something while in fact you’re controlling it to do what you want, up to the point of basically “lying” to it.

What do I mean by that? The best-known example of virtualization is Operating System (OS) virtualization (if you’re familiar with it, you can skip this part). OS virtualization means running an OS on top of another OS rather than directly on the actual hardware. The host OS fakes a lot of resources for the guest OS (essentially lying to it) so that the guest thinks it’s running on specific hardware. This is very helpful when you want to test how an OS behaves on different hardware configurations, all without having to buy more than one physical computer.

Why do I need this?

At this point you might be wondering, “how does this apply to components?” That’s a good question. To answer it we need to determine which part of the software needs to be lied to, which part does the actual lying, and how to perform the lie. In our case the software being lied to is the browser, and the part that does the lying is your client code. Later we’ll get to the lying itself and how to achieve it.

Lying to the browser??

Why do we need to lie to the browser? Doesn’t the browser know how to best optimize for whatever you throw at it? Well, some browsers do, some don’t. Some optimize for certain scenarios, others for different ones. Not all browsers were created equal. Even though browser vendors may work pretty fast and add optimizations for your case, you can’t control your users and force them to move to the browser of your choice. That’s what differentiates the web from the desktop: every user can use what you’ve built.

Let’s say you have to support IE11 (released 5 years ago, and now only getting important security updates because it’s being replaced by Edge). While Chrome has highly optimized calls to Object.assign now that immutability has become part of everyday development jargon, IE11 supports barely 11% of the ES6 spec. So you see, sometimes browsers need us to do some extra work in order for the same code to run the same way everywhere.

Wait, we’re gonna lie?? How?

Yeah, we’re going to have to lie. We’re going to virtualize the components. The browser doesn’t know all the possible variations of components you’re showing in a single collection, the dimensions of each item, etc. We need to make it think that there’s something there that takes up that space, and we’ll tell it what to actually display when we need to.

How does this solve anything? It feels like we’re writing a whole lot more code just to get the same effect. Imagine you have 1000 items in a list, where each item is composed of an editable title, an editable description, and a collection of attachments. The browser doesn’t know what you need, so it’ll put 1000 elements on the DOM, each with a whole bunch of complex DOM of its own. You’ll easily end up with a DOM tree of tens of thousands of elements. At that point things start to jitter and lag, no matter the browser. That’s why you need to lie to the browser and make it think there’s a whole lot of items (so that it shows the scrollbar) while actually rendering only the items that are currently visible (usually no more than a few dozen). Instead of handling thousands of DOM elements, the browser now only needs to handle dozens. This is far easier for it to do and a lot more memory efficient.
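To make “only the items that are currently visible” a bit more concrete, here is a minimal sketch of the arithmetic involved. It assumes every row has the same fixed height, and the function name visibleRange, its parameters and the overscan default are all made up for this illustration; they don’t come from any library.

```javascript
// Hypothetical helper: which rows of a fixed-height list are on screen?
// `overscan` renders a few extra rows above and below to soften fast scrolling.
function visibleRange({ scrollTop, viewportHeight, rowHeight, rowCount, overscan = 2 }) {
  if (rowCount === 0) return { start: 0, end: -1 };
  // Index of the first row that is at least partially visible.
  const first = Math.floor(scrollTop / rowHeight);
  // Index of the last row that is at least partially visible.
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount - 1, last + overscan),
  };
}

// A 1000-row list with 55px rows, scrolled 5500px down a 400px viewport.
const range = visibleRange({ scrollTop: 5500, viewportHeight: 400, rowHeight: 55, rowCount: 1000 });
const renderedCount = range.end - range.start + 1;
console.log(range, renderedCount); // rows 98 through 109, so 12 rows instead of 1000
```

Rows with different heights make this calculation much harder, which is one reason to lean on a library rather than rolling your own.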

But I don’t have 1000s of items

Now that you’re convinced that this is something worth doing, why do we need to always plan for it? You know that your list will never exceed 10-20-30 items, so why should you care? For the off chance that it might get bigger. The cost of thinking about it and writing the code in a way that could support virtualizing later is negligible, and I’ll show you just why that is.

Assuming you have a list of items, like most of us do, we can have a component called ItemList that looks something like this:

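const ItemList = ({ items }) => {
  // No unique id is known for these items, so the index serves as the React key.
  return (
    <div>
      {
        items.map((item, index) => (
          <div key={index}>
            <div>{item.title}</div>
            <div>{item.description}</div>
          </div>
        ))
      }
    </div>
  );
};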

This is fairly straightforward: we get an array of items and simply render each and every one of them. What if we decided to split this up, so that the component rendering the collection doesn’t really know how to render each item? Something like this:

const ListComponent = ({ items, renderItem }) => {
  return (
    <div>
      { items.map(item => renderItem(item)) }
    </div>
  );
};

const Item = ({ item }) => {
  return (
    <div>
      <div>{item.title}</div>
      <div>{item.description}</div>
    </div>
  );
};

const ItemList = ({ items }) => {
  return (
    <ListComponent
      items={items}
      renderItem={(item) => <Item item={item} />} />
  );
}

Besides leaving you far better prepared to virtualize, this also separates the responsibilities, one per component:

  • ListComponent - Knows how to render a collection of items.
  • Item - Knows how to render a specific item.
  • ItemList - Composes ListComponent & Item to render a collection of items in a specific way.

So, how does this make our code virtualization-ready? To answer this we need to go back to our definition of virtualization:

The browser doesn’t know all the possible variations of components you’re showing in a single collection, the dimensions of each item, etc. Which means that we need to lie to it.

Well, now you know all the variations of the items in the collection, right? They’re all Item. You can also know the dimensions of each and every one of them, because they all have the same structure. The only difference is that now you can swap the implementation of ListComponent for something else, let’s say a virtualized one, while Item & ItemList remain the same.

Let’s see how to do just that. We’ll use the awesome react-virtualized package.

import { List } from 'react-virtualized';

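const ListComponent = ({ items, renderItem }) => {
  // List needs explicit pixel dimensions; the width and height here are placeholder values.
  return (
    <List
      width={300}
      height={400}
      rowHeight={55}
      rowCount={items.length}
      rowRenderer={({ key, style, index }) => {
        // style positions the row inside the list, so it must be applied to the wrapper.
        return (
          <div style={style} key={key}>
            { renderItem(items[index]) }
          </div>
        );
      }}
    />
  );
};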

And we’re done! Now our component is virtualized. Whether we have thousands of items or just a few dozen, the browser won’t even break a sweat!
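If you’re curious what the “lie” might look like in terms of layout, here is a rough sketch. This is not react-virtualized’s actual implementation, just one plausible way to do it under the assumption of fixed-height rows, and the names spacerStyle and rowStyle are invented for the example. The idea is a tall empty container that gives the browser the full scroll height, with each rendered row absolutely positioned at the offset where it would have been.

```javascript
// Hypothetical helpers, assuming every row has the same fixed height.
const ROW_HEIGHT = 55; // same value as rowHeight in the example above

// The container is as tall as the full list, so the scrollbar looks right
// even though most rows are never rendered.
function spacerStyle(rowCount, rowHeight) {
  return { position: 'relative', height: rowCount * rowHeight };
}

// Each rendered row is placed at the offset it would have had in the full list.
function rowStyle(index, rowHeight) {
  return { position: 'absolute', top: index * rowHeight, left: 0, width: '100%', height: rowHeight };
}

const spacer = spacerStyle(1000, ROW_HEIGHT);
const row = rowStyle(100, ROW_HEIGHT);
console.log(spacer.height, row.top); // 55000 5500
```

This is also why the style passed to rowRenderer has to end up on the row’s wrapper element: without it the row wouldn’t land in the right place.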

The important thing is that both Item & ItemList remained the same.

It all makes sense now!

I’m glad I could help! You can now go and implement this in your code! All you have to do is find the places where you render a list of elements (usually map calls in render functions) and extract each one into its own component. That component will get the items to render and a function that knows how to render each item.
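Here is the same idea stripped of React entirely, as a small sketch you can run in Node. Everything in it (plainList, windowedList, the call counter, the sample data) is made up for illustration. What it tries to show is the contract: both list implementations take the items and a renderItem function, so swapping one for the other doesn’t touch the code that renders an item.

```javascript
// Hypothetical stand-ins for the components in this post, as plain functions.
let calls = 0;
const renderItem = (item) => {
  calls += 1; // count how many items actually get rendered
  return `${item.title}: ${item.description}`;
};

// Renders every item, like the map-based ListComponent.
const plainList = ({ items, renderItem }) => items.map((item) => renderItem(item));

// Renders only the items between start and end (inclusive), like a virtualized list would.
const windowedList = ({ items, renderItem, start, end }) =>
  items.slice(start, end + 1).map((item) => renderItem(item));

const items = Array.from({ length: 1000 }, (_, i) => ({
  title: `Item ${i}`,
  description: `Description ${i}`,
}));

const all = plainList({ items, renderItem });
const callsForPlain = calls; // 1000

calls = 0;
const some = windowedList({ items, renderItem, start: 98, end: 109 });
const callsForWindowed = calls; // 12

console.log(callsForPlain, callsForWindowed, some[0]);
```

renderItem is identical in both runs; only the list implementation changed, and that’s the whole point of the split.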

This might not work for all of your use cases, but some is better than nothing, right? Also, I highly recommend you check out react-virtualized & react-window, because I barely scratched the surface here. Scrolling virtualization is a highly complex subject and they’re doing a wonderful job exposing the right APIs when you need that added customizability.