<aside> ❗ The content of this guide was transferred to Memgraph docs on August 2nd, 2022. https://memgraph.com/docs/memgraph/next/under-the-hood/storage (and later) https://memgraph.com/docs/memgraph/under-the-hood/storage

This page will no longer be maintained.

</aside>

Basics

Estimating Memgraph's storage memory usage is not entirely straightforward because it depends on a lot of variables, but it is possible to do so quite accurately. Below is an example that will try to show the basic reasoning.

If you want to estimate the storage memory usage, use the following formula:

$$ StorageRAMUsage = NumberOfNodes \times 260B + NumberOfEdges \times 180B $$

Let's test this formula on the Marvel Comic Universe Social Network dataset, which is also available as a dataset inside Memgraph Lab and contains 21,723 nodes and 682,943 edges.

According to the formula, storage memory usage should be:

$$ StorageRAMUsage = 21{,}723 \times 260B + 682{,}943 \times 180B $$

$$ StorageRAMUsage = 5{,}647{,}980B + 122{,}929{,}740B = 128{,}577{,}720B \approx 125MB $$
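The same arithmetic can be reproduced with a short script. This is only a sketch of the estimate formula; the function and constant names are illustrative and are not part of Memgraph.

```python
# Per-object estimates taken from the formula in this guide.
BYTES_PER_NODE = 260
BYTES_PER_EDGE = 180


def estimate_storage_ram_bytes(number_of_nodes: int, number_of_edges: int) -> int:
    """Return the estimated storage RAM usage in bytes."""
    return number_of_nodes * BYTES_PER_NODE + number_of_edges * BYTES_PER_EDGE


if __name__ == "__main__":
    # Node and edge counts of the dataset used in this guide.
    estimated = estimate_storage_ram_bytes(21_723, 682_943)
    print(f"{estimated:,} B")  # 128,577,720 B
```

Swapping in the node and edge counts of another dataset gives a rough first estimate before loading it.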

Now, let's run an empty Memgraph instance on x86 Ubuntu. It consumes ~75MB of RAM due to baseline runtime overhead. Once the dataset is loaded, RAM usage rises to ~260MB. Memory usage primarily consists of storage and query execution memory usage. After executing the FREE MEMORY query to force the cleanup of query execution memory, RAM usage drops to ~200MB. Subtracting the 75MB baseline runtime overhead from that 200MB total leaves ~125MB of storage memory usage, which agrees with the formula.

Storage Memory Usage

Let's dive deeper into the memory usage values. Because Memgraph works on the x86 architecture, calculations are based on the x86 Linux memory usage.

<aside> ℹ️ For precise/latest memory layout please clone Memgraph and use, e.g., pahole to discover accurate info.

</aside>

Each Vertex and Edge object has a pointer to a Delta object. The Delta object stores all changes made to a given Vertex or Edge, which is why Vertex and Edge memory usage is increased by the memory of the Delta objects they point to. If there are few updates, there are also few Delta objects, because the latest data is stored in the object itself. But if the database handles a lot of concurrent operations, many Delta objects will be created. The Delta objects are kept in memory for as long as they are needed, and a bit longer, because of internal GC inefficiencies.

Delta Memory Layout

Each Delta object is at least 104B.

Vertex Memory Layout

Each Vertex object is at least 112B, plus 104B for the Delta object, for a minimum of 216B in total.
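As a rough illustration, the lower bound can be written as a small helper. The names are hypothetical, and the sketch assumes that every Delta object a Vertex points to costs the 104B minimum; real sizes depend on the build and on what each Delta stores.

```python
# Minimum sizes quoted in this guide (x86 Linux).
VERTEX_BASE_BYTES = 112
DELTA_MIN_BYTES = 104


def min_vertex_bytes(delta_count: int = 1) -> int:
    """Lower bound on the memory of one Vertex with `delta_count` Delta objects.

    Assumption: each Delta contributes exactly its 104B minimum.
    """
    if delta_count < 0:
        raise ValueError("delta_count must be non-negative")
    return VERTEX_BASE_BYTES + delta_count * DELTA_MIN_BYTES


if __name__ == "__main__":
    print(min_vertex_bytes())   # one Delta: 112 + 104 = 216
    print(min_vertex_bytes(3))  # three Deltas: 112 + 3 * 104 = 424
```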