The way you create and access your data can affect your app’s performance. Let’s see how.
The Romantic Triangle
When the computer needs to do some computation, the processing unit (the CPU) needs data to process. So, depending on the task at hand, it sends a request over the bus to fetch the data from memory.
It looks like this:
So this is our romantic triangle:
CPU -> Bus -> Memory
Keeping a Healthy Relationship in a Trio is Hard…
The CPU is much faster than the memory. So this process of CPU -> BUS -> Memory -> BUS -> CPU “wastes” computation time. While memory is being looked up, the CPU sits idle and cannot do anything else.
In order to prevent idle time, a cache was added to the system. We won’t go into much detail about caches or cache types; suffice it to say that the cache is the CPU’s internal memory.
When the CPU gets orders to run, it first searches the cache for the data, and if the data isn’t there, it sends a request via the bus.
The bus, in turn, brings the requested data plus a contiguous chunk of nearby memory and stores it all in the cache for quick reference.
This way, the CPU will have fewer complaints about how slow the memory is and hence less CPU idle time.
Quarrels are a Part of Every Relationship
A problem that might arise — especially when we are dealing with processing massive amounts of data — is a phenomenon called a “cache miss”.
A cache miss means that during a computation, the CPU finds out it doesn’t have the necessary data in the cache, and needs to request this data via the regular channels — you know, that sloooow memory.
An illustration of a cache miss. Data from an array is being processed, but data outside the cache limit is also requested for the computation — and this creates a “cache miss”
Why would you care about cache misses in JavaScript? Good question. In most cases, you wouldn’t. Today, though, as more and more data pours into Node.js servers and even unfortunate rich clients, you have a greater chance of running into cache misses while iterating over large data sets.
I’ll believe it when I see it!!!
Fair enough. Let’s view an example.
Here’s a class called Boom.
This class (Boom) has just 3 properties — id, x, and y.
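A minimal sketch of such a class (the exact constructor signature is an assumption):

```javascript
// A sketch of the Boom class described above: just id, x, and y.
class Boom {
  constructor(id, x, y) {
    this.id = id;
    this.x = x;
    this.y = y;
  }
}
```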
Now, let’s create a method that populates x and y.
Let’s build the setup:
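A sketch of such a setup, assuming a flat array of ROWS × COLS Boom objects. The text later mentions ROWS being 1000 and a repeats constant of 100; COLS = 1000 and the variable name arr are assumptions:

```javascript
class Boom {
  constructor(id, x, y) {
    this.id = id;
    this.x = x;
    this.y = y;
  }
}

// ROWS = 1000 and repeats = 100 are mentioned in the text;
// COLS = 1000 is an assumption to keep the grid square.
const ROWS = 1000;
const COLS = 1000;
const repeats = 100;

// Populate a flat array with ROWS * COLS Boom objects,
// giving each one random x and y values.
const arr = new Array(ROWS * COLS);
for (let i = 0; i < arr.length; i++) {
  arr[i] = new Boom(i, Math.random(), Math.random());
}
```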
Now we will use this setup in a method:
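A sketch of that method, based on the description below (passing the array in explicitly is an assumption; the original may close over it instead):

```javascript
// localAccess walks the array in order and sets each x to 0.
function localAccess(arr) {
  for (let i = 0; i < arr.length; i++) {
    arr[i].x = 0;
  }
}

// A tiny stand-in for the Boom array from the setup:
const arr = Array.from({ length: 4 }, (_, id) => ({ id, x: 1, y: 1 }));
localAccess(arr);
```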
What localAccess does is go over the array linearly and set x to 0.
If we repeat this function 100 times (look at the repeats constant in the setup), we can measure how long it takes to run:
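A sketch of the measurement, using Date.now() and a hypothetical measure helper (scaled down to a small array here so it runs quickly):

```javascript
// measure is a hypothetical helper name: it runs fn `repeats`
// times and logs the total elapsed time in milliseconds.
function measure(name, fn, repeats) {
  const start = Date.now();
  for (let i = 0; i < repeats; i++) {
    fn();
  }
  const elapsed = Date.now() - start;
  console.log(`${name}: ${elapsed}ms`);
  return elapsed;
}

// A small stand-in array; the real test uses the full setup.
const arr = Array.from({ length: 1000 }, (_, id) => ({ id, x: 1 }));

function localAccess() {
  for (let i = 0; i < arr.length; i++) {
    arr[i].x = 0;
  }
}

measure('localAccess', localAccess, 100);
```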
The log output gives us our baseline: the total time for the 100 local-access runs.
The Price to Pay for Missing Cache
Now, according to what we learned above, we can cause cache misses by processing data that is far away during the iteration. Far-away data is data that is not at an adjacent index. Here’s the function for this:
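A sketch of farAccess that produces the jump pattern described below, assuming the ROWS × COLS layout from the setup:

```javascript
const ROWS = 1000;
const COLS = 1000;

// farAccess jumps ROWS indices on every inner step, so the visit
// order by index is [0, 1000, 2000, ..., 1, 1001, 2001, ...].
function farAccess(arr) {
  for (let i = 0; i < COLS; i++) {
    for (let j = 0; j < ROWS; j++) {
      arr[j * ROWS + i].x = 0;
    }
  }
}
```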
What happens here is that on every iteration we address the index that is ROWS away from the previous one. So if ROWS is 1000 (as in our case), we get the following iteration order: [0, 1000, 2000, … , 1, 1001, 2001, …].
Let’s add it to our speed test:
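Putting the pieces together, the speed test might look like this. This is a self-contained sketch: measure is a hypothetical helper, and repeats is scaled down from the article’s 100 so it finishes quickly:

```javascript
const ROWS = 1000;
const COLS = 1000;
const repeats = 10; // scaled down from the text's 100 for a quick run
const arr = Array.from({ length: ROWS * COLS }, (_, id) => ({ id, x: 1 }));

// Linear pass over the array.
function localAccess() {
  for (let i = 0; i < arr.length; i++) {
    arr[i].x = 0;
  }
}

// Pass that jumps ROWS indices on every inner step.
function farAccess() {
  for (let i = 0; i < COLS; i++) {
    for (let j = 0; j < ROWS; j++) {
      arr[j * ROWS + i].x = 0;
    }
  }
}

// Run fn `repeats` times and log the elapsed time.
function measure(name, fn) {
  const start = Date.now();
  for (let i = 0; i < repeats; i++) fn();
  console.log(`${name}: ${Date.now() - start}ms`);
}

measure('localAccess', localAccess);
measure('farAccess', farAccess);
```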
And here’s the end result: the non-local iteration was almost 4 times slower. This difference will grow with bigger data. It happens because of the time the CPU sits idle due to cache misses.
So what is the price you pay? It all depends on the size of your data.
Ok, I swear I will never do that!
You might not think so, but… there are cases in which you’d like to access an array with some logic that is not linear (e.g. 1, 2, 3, 4, 5) or not contiguous (e.g. for (let i = 0; i < n; i += 1000)).
For instance, you get data from a service or the DB, and need to process it sorted or filtered by some complex logic. This might cause the data to be accessed similarly to what was shown in the farAccess function.
Here’s an illustration that will (hopefully) make it clearer:
Looking at the image above, we see the data as it is stored in memory (top gray bar). Below, we see the array that was created when the data arrived from the server. Finally, we see the sorted array that holds references to objects stored in various positions in memory.
This way, iterating over the sorted array can cause the multiple cache misses seen in the example above.
Note that this example is for a small array. Cache misses are relevant for much bigger data.
In a world where you need slick animations in the frontend and where you can be charged for every millisecond of CPU time in the backend (serverless or other) — this can become critical.
Oh no! All is lost!!!
No, not really.
There are various solutions to this problem, but now that you know the reason for this performance hit, you can probably think of a solution yourself. You just need to store data that is processed together closer in memory.
Such techniques are called the Data Locality design pattern.
Let’s continue our example. Assume that in our app the most common process is going over the data with the logic shown in the farAccess function. We’d like to optimize the data so that it runs fast in the most common for loop.
We will arrange the same data like this:
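One way to sketch that rearrangement, assuming the square ROWS × COLS grid from the setup (the names arr and diffArr follow the text; the loop shape is an assumption):

```javascript
const ROWS = 1000;
const COLS = 1000;
// Stand-in for the original array from the setup.
const arr = Array.from({ length: ROWS * COLS }, (_, id) => ({ id, x: 1 }));

// diffArr holds the same objects, placed in the order the
// far-access pattern will visit them.
const diffArr = new Array(ROWS * COLS);
for (let i = 0; i < COLS; i++) {
  for (let j = 0; j < ROWS; j++) {
    diffArr[i * ROWS + j] = arr[j * ROWS + i];
  }
}
```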
So now, in diffArr, the objects that were at indices [0, 1, 2, …] in the original array are laid out like this: [0, 1000, 2000, … , 1, 1001, 2001, … , 2, 1002, 2002, …]. The numbers denote the original index of each object. This simulates sorting the array — which is one way to implement the Data Locality design pattern.
In order to easily test this, we will change our farAccess function a bit to get a custom array:
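A sketch of the tweak: farAccess now accepts an optional custom array and falls back to the original one (the `data || arr` style is an assumption):

```javascript
const ROWS = 1000;
const COLS = 1000;
const arr = Array.from({ length: ROWS * COLS }, (_, id) => ({ id, x: 1 }));

// With no argument, farAccess behaves exactly as before;
// with a custom array, it runs the same pattern over it.
function farAccess(data) {
  const array = data || arr;
  for (let i = 0; i < COLS; i++) {
    for (let j = 0; j < ROWS; j++) {
      array[j * ROWS + i].x = 0;
    }
  }
}
```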
And now add the scenario to our test:
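The new scenario, side by side with the original far access, might look like this self-contained sketch (measure is a hypothetical helper and repeats is scaled down for a quick run):

```javascript
const ROWS = 1000;
const COLS = 1000;
const repeats = 10; // scaled down from the text's 100
const arr = Array.from({ length: ROWS * COLS }, (_, id) => ({ id, x: 1 }));

// farAccess with an optional custom array.
function farAccess(data) {
  const array = data || arr;
  for (let i = 0; i < COLS; i++) {
    for (let j = 0; j < ROWS; j++) {
      array[j * ROWS + i].x = 0;
    }
  }
}

// diffArr: the same objects, rearranged into far-access visit order.
const diffArr = new Array(ROWS * COLS);
for (let i = 0; i < COLS; i++) {
  for (let j = 0; j < ROWS; j++) {
    diffArr[i * ROWS + j] = arr[j * ROWS + i];
  }
}

// Run fn `repeats` times and log the elapsed time.
function measure(name, fn) {
  const start = Date.now();
  for (let i = 0; i < repeats; i++) fn();
  console.log(`${name}: ${Date.now() - start}ms`);
}

measure('farAccess (original)', () => farAccess());
measure('farAccess (diffArr)', () => farAccess(diffArr));
```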
We run this, and this time the far-access pattern over the rearranged array runs much faster. And voila! — we’ve optimized our data to fit the more common way we need to look at it.
The full example is live at //yonatankra.com/performance/memory/liveExamples/objectPool2/index.html
In this article, we demonstrated the Data Locality design pattern. In essence, we showed that accessing a data structure in a way it is not optimized for may cost us performance. We then optimized the data to fit our way of processing it — and we saw how that improved performance.
Today, with the amount of data being transferred between servers or even pushed into browsers, design patterns that used to be the day-to-day of game developers should be considered by app developers — whether server-side or client-side.