2025.02.18 - Rolling Cache fragmentation, zero-fill and a 100% full cache
During my recap of the last dev stream I wrote that I wanted to revisit the topic of a “100% Full” cache. So here we go.
In that recap I made the following claims:
- At some point every cache will reach its size limit and be “100% full”.
- Actually one should expect that the RC is always at 100%
- … and if not, then it is not a cache, but a waste of storage.
While I think that the above is obvious, I still want to stress some related aspects:
- The goal of a cache is to cache … and not to be empty
- … but there is a tradeoff between the “cost of size” vs the “benefit of performance increase”.
- A cache which on average reduces performance should be disabled.
- To find data quickly one always needs an index
- … and a lookup (read) in a smaller index is always (somewhat) faster than a lookup in a large index
- … especially because an index also needs to be kept “up to date” (write)
- … to match the actual content of the cache (blob items).
- Bigger cache sizes always require a bigger cache index
- … which means more overhead (see the small index sketch after this list).
- A cache tries to store a small part (e.g. GB) of the original data set (e.g. PB = 1 million GB)
- … but the cache stores the data close to the point of usage
- … so that the access is (a lot) faster.
- Finding (defining) the proper size of a cache is more of an “art” … and somewhat less of a “science”
- … as there are many uncertainties (tradeoffs).
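To make the index bullets above a bit more concrete, here is a minimal sketch (in Python, all names hypothetical, nothing to do with the actual FS2024 index format) of what such a cache index conceptually does:

```python
# Minimal conceptual sketch of a cache index (hypothetical names, not the
# actual FS2024 on-disk format): it maps a content key to the location of
# the blob inside the cache file and must be updated on every write.
class CacheIndex:
    def __init__(self):
        self.entries = {}  # key -> (offset, size) inside the blob area

    def lookup(self, key):
        """Read path: where does the cached blob live? (faster if the index is small)"""
        return self.entries.get(key)

    def update(self, key, offset, size):
        """Write path: keep the index in sync with the blob items."""
        self.entries[key] = (offset, size)
```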
The fact that any cache (storage) is limited in its size results in a totally obvious, but somewhat non-trivial “problem”:
In order to store new data a cache always needs to delete old data.
For the implementation of a cache subsystem the resulting questions then are:
- A) Which data to delete?
- B) Which data is the least likely to be needed in the near future again?
- C) Does the new data fit into the place(s) where the old data was?
- D) How to delete the old data?
- E) How to store the new data?
I covered (A) and (B) in a lot of detail in the past. FS2024 currently uses a cache subsystem with a Least Recently Used (LRU) deletion strategy, which does not fit the nature of FS2024. I will not repeat that analysis here.
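As a quick reminder of the general principle only, here is a minimal, generic LRU sketch (in Python, purely illustrative, not the actual FS2024 code): the entry that was used least recently is evicted first once the capacity is reached.

```python
from collections import OrderedDict

class LRUCache:
    """Generic LRU eviction: drop the least recently used entry first."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()              # key -> blob, ordered by last use

    def get(self, key):
        blob = self.items.get(key)
        if blob is not None:
            self.items.move_to_end(key)         # mark as "recently used"
        return blob

    def put(self, key, blob):
        if key in self.items:                   # replacing an existing entry
            self.used -= len(self.items[key])
        self.items[key] = blob
        self.items.move_to_end(key)
        self.used += len(blob)
        while self.used > self.capacity:        # (A)/(B): evict the LRU entry
            _, old = self.items.popitem(last=False)
            self.used -= len(old)
```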
However, I want to provide some insight about (C) to (E).
(E) How to store the new data?
As I have shown in the cache content paintings, the storage (write) is straightforward (simplistic?). Here is the image from the fresh 16 GB RC of Test 1:
- There are two regions:
- 512 MB index area at the beginning
- … with that tiny purple single pixel line at the top-left.
- The blob area after that … with 15.5 GB (ca. 16,000 MB).
- “Blue” are the cached text files and “purple” are the binary texture files, mesh files, etc.
- The “white” regions show zero values in the (yet) unused areas.
- On initial RC creation the zero-fill is important
- … to ensure that the filesystem will really reserve the space for the cache file.
- The “white” lines in the initial purple block most likely are “real” data with lots of zero bytes
- … as it seems unlikely to me that any data deletion was performed in a basically empty cache file.
In Test 7 the cache is approaching “full”.
- The purple area is (basically) a big purple block.
- The white (unused) regions are (basically) only at the end.
What does that tell us?
New data is stored in a continuous “stream” of blob items … without “gaps”!
So, contrary to all existing filesystems, the cache does not use a “common (minimum) block size” to organize the storage inside the RollingCache.ccc file.
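A minimal sketch of what that observation implies (my interpretation; the helper names are hypothetical): writes simply append at a cursor, while a filesystem-style layout would round every item up to a block size.

```python
def append_blob(cache_file, write_cursor, blob, index, key):
    """Append-style write: the new blob starts exactly where the last one ended."""
    cache_file.seek(write_cursor)
    cache_file.write(blob)                     # no padding, no alignment
    index[key] = (write_cursor, len(blob))     # index points at the exact bytes
    return write_cursor + len(blob)            # next blob follows without a gap

def block_aligned_size(size, block_size=4096):
    """What a filesystem-style layout would reserve instead (rounded up)."""
    return ((size + block_size - 1) // block_size) * block_size
```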
(D) How to delete the old data?
After some heavy usage we get to Test P, which was “100% full” many times over.
- There are lots of (tiny) white lines (pixels) sprinkled all over the purple blob area.
- This indicates that the deletion strategy (A/B) does not simply delete the oldest data written
- … which was also part of the previous “LRU analysis”.
What do the white lines tell us?
Old data is deleted actively … by a zero-fill of the old data cache (memory) regions.
But why?
- Basically every piece of data is written twice into the RC:
- First as a “zero-fill” (to “clean up” the cache region)
- … and later with the “real” blob data.
- No matter how fast the write operation is
- … there is no speed difference between writing only “zero” bytes and writing any other byte pattern
- So the zero-fill is reducing the speed of the cache writes by (basically) 50%.
- In addition data needs to be deleted in the index area too.
- So why not only delete it in the index area?
- That would deliver a factor-2 write speed increase.
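To illustrate the difference argued in the list above, here is a small sketch of the two deletion variants (hypothetical helper names, not the actual FS2024 code):

```python
def delete_with_zero_fill(cache_file, index, key):
    """What the paintings suggest: the old region is actively overwritten with zeros."""
    offset, size = index.pop(key)
    cache_file.seek(offset)
    cache_file.write(b"\x00" * size)     # extra write: the region is written once more
                                         # before the real new blob lands there later

def delete_index_only(index, free_list, key):
    """The cheaper alternative: only forget the entry and remember the free region."""
    offset, size = index.pop(key)        # no write to the blob area at all
    free_list.append((offset, size))     # the old bytes stay until they are overwritten
```

In the second variant the old bytes simply remain on disk until a new blob overwrites them, which is exactly where the “factor 2” argument above comes from.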
(C) Does the new data fit into the place(s) where the old data was?
In combination, (A+D+E) define the nature of what is known as …
Memory fragmentation
It is not specific to the RC file; it is common for any kind of storage of “stuff” with different sizes. For example, during the last dev stream Sebastian talked about memory fragmentation in the VRAM of the GPU.
Memory fragmentation …
- is like gravity: it cannot be avoided,
- … and one simply has to find a way to deal with it.
- It always results in a loss of really usable (precious?) memory.
- It induces additional memory management overhead
- … which will reduce performance (significantly).
In computer science there is a lot of research about how to limit the amount or impact of memory fragmentation. As always there are numerous tradeoffs.
In the case of FS2024:
- (E) = New data is stored in a continuous “stream” of blob items … without “gaps”!
- The major benefit is …
- very efficient (initial) storage, as no bytes are “wasted”
- … due to a mismatch between blob item size and cache storage block size.
- A major drawback is …
- (somewhat) inefficient storage, once fragmentation increases
- … as it becomes harder to find (really) old data
- … which (really) fits the size of the new data.
- This again increases fragmentation.
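As a rough sketch of why this gets expensive (the free_list structure is my own illustration, not the actual FS2024 code): to reuse old space without gaps, the allocator has to search the free regions for one that is big enough, and every partial reuse leaves behind an even smaller sliver.

```python
def find_best_fit(free_list, needed):
    """Best-fit search over the free regions; cost grows with the number of holes."""
    best = None
    for i, (_, size) in enumerate(free_list):
        if size >= needed and (best is None or size < free_list[best][1]):
            best = i                                        # smallest hole that still fits
    if best is None:
        return None                                         # nothing fits -> evict more data
    offset, size = free_list.pop(best)
    if size > needed:
        free_list.append((offset + needed, size - needed))  # leftover sliver = new fragment
    return offset
```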
As I already wrote, all (modern) storage solutions work with some kind of “data size aware” placement strategy. Only in the case of “write-once” archive files (or network data streams) is a continuous “stream” of blob items normal (e.g. in a ZIP file).
A cache is not a “write-once” archive file!
So let me put some numbers to those theoretical observations. The following shows the number of “white” regions in my 16 GB RC file over time.
- Test
- The identifier I used for the tests that have been published in previous posts.
- I have included a painting in this post for only 3 of those tests.
- Zero Items
- The (rough, rule-of-thumb) count of “white” regions with zero-fill.
- Size (MB)
- My estimate of the total size of the zero-fill regions in MB, based on the 9 KB “pixel size”.
| Test | Zero Items | Size (MB) |
|------|------------|-----------|
| 1    | 1          | 16,000    |
| 7    | 1          | 1,700     |
| 8    | 450        | 10        |
| B    | 1,400      | 1         |
| D    | 25,000     | 20        |
| F    | 30,000     | 10        |
| I    | 70,000     | 110       |
| J    | 60,000     | 100       |
| K    | 17,000     | 30        |
| N    | 12,000     | 15        |
| P    | 30,000     | 25        |
- Once the RC has been “100% full” many times over … roughly from Test D onward
- … the sum of the empty regions stays in a range of roughly 10 to 110 MB
- … and the fragmentation never drops below roughly 10,000 empty “zero-item” regions.
- Just for context …
- The average size of a blob item is around 30 KB.
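A rough back-of-the-envelope calculation based on the Test P row above (both inputs are my own estimates, so treat the result accordingly):

```python
zero_regions = 30_000                               # "white" regions in Test P (estimate)
total_zero_mb = 25                                  # their total size in MB (estimate)
avg_hole_kb = total_zero_mb * 1024 / zero_regions   # ~0.85 KB per region
avg_blob_kb = 30                                    # average blob item size (from above)
# -> the typical hole is far smaller than the typical blob item, so most of
#    the fragmented space cannot hold a new blob on its own.
```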
The actual performance implications of all this on the FS2024 cache subsystem cannot be derived from my paintings. Real code profiling would be necessary. But a goose would dare to claim: fragmentation does not come for free.
Update 2025-02-19: Maybe I am totally misreading what I am seeing in the “white” pixels! Please check my next post below.
Fragmentation is real. But “zero-fill” overhead does not (seem to) exist.
I am leaving my mistakes in this post, for documentation purposes.
To summarize:
- Data is stored in the blob area as a continuous “stream” of blob items … without “gaps”!
- … and clearly not with a “block” strategy.
- So initial writes into the RC file are fast and efficient … during the first days
- … at the cost of higher memory management overhead, slower writes, etc.
- … for the majority of the cache usage … during the next years.
- Fragmentation of the RC blob area is increasing quickly once the RC is 100% full.
- The management overhead of finding the “proper place” for new data increases over time.
- This will have a negative impact on the cache performance.
- Data in the RC is deleted with a “zero-fill” strategy.
- But there is zero benefit to a zero-fill … only added overhead (50% performance loss).
- FS2024 is not a banking system or a real flight control computer
- … where excessive caution or safety requirements may justify the “zero-fill” overhead.
- However, FS2024 would benefit from higher performance (e.g. lower latency).
I would recommend for a “Hotfix” release of the Rolling Cache:
- Remove the “zero-fill” deletion overhead.
- Simply delete old data by marking it as deleted in the index items.
Beware … you are entering “Brave New Cache” territory.
In addition to that I would also recommend:
- Adopt new strategies (algorithms) which reduce cache fragmentation and the resulting management overhead.
- To avoid a full rewrite of the cache subsystem it could be investigated whether …
- a generational cache design, with multiple but smaller (e.g. 4 GB each) cache files, might achieve good results.
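Just to give a first, very rough idea of what I mean (a minimal sketch of my interpretation, all names hypothetical; details will follow in the future post): always append into the newest generation and, when everything is full, drop the oldest generation as a whole instead of deleting individual blobs.

```python
class GenerationalCache:
    """Append into the newest generation; when full, drop the oldest one as a whole."""

    def __init__(self, num_generations=4, capacity_per_gen=4 * 1024**3):
        self.capacity = capacity_per_gen        # e.g. 4 GB per cache file
        self.num_generations = num_generations
        self.generations = [[]]                 # oldest .. newest, newest is last
        self.newest_size = 0

    def add(self, blob):
        if self.newest_size + len(blob) > self.capacity:
            self.generations.append([])         # start a fresh, empty generation
            self.newest_size = 0
            if len(self.generations) > self.num_generations:
                self.generations.pop(0)          # delete the oldest file in one go:
                                                 # no per-blob deletes, no fragmentation
        self.generations[-1].append(blob)
        self.newest_size += len(blob)
```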
I will try to explain the “generational cache design” idea in a future post in more detail.