RollingCache.ccc performance debugging and tuning … How?

2025.01.13 - Trying to understand the “purple lines” of the Rolling Cache Index Area

In the big picture it was just a tiny purple line, with one pixel summarizing the content of 9 KB of data. So the obvious questions have always been:

  • What does the index area really look like, if painted at a one byte per pixel resolution?
  • How do the changes (deltas) between two snapshots look (and feel)?
    • … and what could such paintings tell us?
  • Are there 72 or 80 bytes per index item?

To recap, here are some observations and assumptions that I made and published during previous tests about the “high level” structure of the RollingCache.ccc file, based on Process Monitor file event recordings:

  • The index area
    • … that tiny purple line … starts at “Offset: 80”
      • … but has some additional TOC (table of contents) data from “Offset: 0”.
    • Each index item seems to have a size of 72 or 80 bytes
      • … due to a suspected 64 bit alignment.
  • The blob area
    • … seems to start after the index … at “Offset: 536,877,473” (512 MB).
    • Each blob item seems to consist of three parts:
      • A kind of TOC
        • … with 28 bytes.
      • Some kind of HEADER
        • … with around 1 KB.
      • The blob (chunk) BODY
        • … with sizes from 32 bytes to 40 MB or more.
          • The write operations seem to take place in 4 MB chunks.

Before showing the actual delta content paintings, I want to list what I would expect to see in an index item of a cache with a Least Recently Used (LRU) replacement policy:

  • Must have:
    • A marker for the least recent usage.
    • A pointer (offset) to the beginning of the actual blob item.
    • The length information for the blob item.
      • Keeping that only in the blob item TOC is possible
        • … but it would be very bad for performance, as it would require multiple read operations.
      • Besides that, the recorded file access events show that the sim “knows” the length of each blob item
        • … without reading the blob item TOC first.

… and since the above would only justify 3 × 4 bytes … and not 72 or more … there should be some additional information:

  • Nice to have:
    • Some form of checksum … for the blob item?
      • Like with the length … duplicating that in the index item would make sense for performance reasons.
    • Some form of checksum … for the index item?
      • To detect bit rot and data corruption in big files.
      • Modern file systems have used checksums in their “index” structures for a long time.
      • This would be a very good idea, because a corrupt index needs to be detected (and deleted)
        • … instead of reading and feeding corrupt data into the sim and the GPU.
    • Due to the benefits of 64 bit alignment on 64 bit CPUs
      • … I would expect most information to use and be aligned with 64 bits.
      • That would also allow scaling the size of the cache file into the terabyte range.
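The “must have” and “nice to have” fields above can be summarized as a hypothetical record layout. Here is a minimal Python sketch; all field names, sizes, and their order are my assumptions, not confirmed facts:

```python
import struct

# Hypothetical layout of one 72-byte index item. Everything here is an
# ASSUMPTION derived from the delta paintings, not reverse-engineered fact:
#   8 bytes  blob offset     (64 bit, allows terabyte-range cache files)
#   8 bytes  blob length
#   8 bytes  LRU marker      (timestamp or counter)
#   8 bytes  blob checksum
#   4 bytes  index checksum  (32 bit)
#  36 bytes  unknown / reserved
INDEX_ITEM = struct.Struct("<QQQQI36s")
assert INDEX_ITEM.size == 72  # matches the observed item size

def parse_index_item(raw: bytes) -> dict:
    offset, length, lru, blob_sum, idx_sum, _unknown = INDEX_ITEM.unpack(raw)
    return {"offset": offset, "length": length, "lru": lru,
            "blob_checksum": blob_sum, "index_checksum": idx_sum}
```

The point of writing it down like this is only to show that the speculated fields fit comfortably into 72 bytes with room to spare.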

The following test runs have been used to paint the delta content paintings:

  • Test 1
    • Was performed at 2024.12.15-12.10.
    • It created and zero-filled the RC, and then inserted the first (essential) data.
    • It thereby created the “iconic” blue text area at the beginning of the blob area
      • … which provided (asset index?) blob items that have been reused (kept alive) in all subsequent test runs.
  • Test 2
    • Was performed on 2024.12.15-15.17
      • … 3 hours … or 10,800+ seconds later … so very shortly after Test 1.
    • Due to the almost empty cache no data from Test 1 had to be deleted
      • … but a lot of (early) data must have been reused.
  • Test G
    • Was performed on 2024.12.22-17.35
      • … 7 days … or 168+ hours … or 604,800+ seconds … or 604,800,000,000+ microseconds … after Test 1
  • Test H
    • Was performed on 2024.12.23-07.58
      • … shortly after Test G.
    • Here 30 GB of new data have wiped out all old data from the RC.

The first painting compares Test 1 to Test 2. The colors are as in previous delta content paintings: a “green” pixel marks identical data, a “red” pixel indicates differences, and “white” represents shared “zeros”.

I picked 144 pixels per row for those paintings. This makes it easy to answer the “72 vs 80 bytes” question.
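The painting logic itself is simple. Here is a sketch of the comparison, assuming two raw snapshot byte strings (the real paintings render colored pixels; this prints one character per byte):

```python
def delta_row_colors(a: bytes, b: bytes) -> str:
    # One character per byte pair:
    #   R = different, W = shared zero, G = identical non-zero
    out = []
    for x, y in zip(a, b):
        if x != y:
            out.append("R")
        elif x == 0:
            out.append("W")
        else:
            out.append("G")
    return "".join(out)

def paint_delta(snap_a: bytes, snap_b: bytes, width: int = 144) -> None:
    # 144 bytes per row means exactly two 72-byte index items per row,
    # so a 72-byte record size shows up as repeating vertical columns.
    for i in range(0, min(len(snap_a), len(snap_b)), width):
        print(delta_row_colors(snap_a[i:i + width], snap_b[i:i + width]))
```

If the items were 80 bytes instead, the columns would drift sideways by 8 bytes per item and no stable column pattern would appear at a width of 144.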

  • The obvious column structure of the images tells us right away,
    • … that each index item uses 72 bytes.
  • There are a lot of shared “zeros” in Test 1 and Test 2
    • … so maybe later tests might indicate their usage.

Beware … in the following text the word “column” means “column within one of the index item columns”, as there are two index items per pixel row.

The differences (red) are the interesting parts:

  • Two red columns indicate updates to index items
    • … where the blob items did not change.
      • So the pointer (offset) to the beginning of the actual blob item must be in one of the green columns.
  • One red column … is 4 bytes long.
  • The second red column … is 2 bytes long.

Where and what is the LRU marker?

There are two common techniques for a least recent usage marker:

  1. A high resolution timestamp.
  2. A constantly growing counter
    • … sometimes in the form of vector clocks.

Since the word “recent” is a concept of time, one would expect (1) to be the most likely candidate. A 64 bit integer can easily hold a timestamp in microseconds or nanoseconds … and filesystems and distributed databases do use such timestamps.

The time resolution restricts the speed at which new information can enter the cache, and so seconds can be ruled out, and even milliseconds feel very “risky”.

A time delta of 3 hours … or 10,800 seconds … or 10,800,000 (0xA4CB80) milliseconds would affect at least 3 bytes of data. Once we go to the “safer” microseconds we should already see at least 5 bytes changing constantly. But the largest red column (so far) is only 4 bytes.

So the timestamp hypothesis seems highly unlikely, because to the right of the 4 red bytes we can see many white zeros. A 64 bit timestamp should at least show some common green bytes, encoding the month and year parts of a timestamp. It also seems highly unlikely that the sim would, without need, choose some custom time epoch.

So either the FS2024 team decided to use a very uncommon and risky (short) timestamp … or they decided to go with option (2), a constantly growing counter. Since the sim can (and, as far as I can see in the file events, does) serialize all access to the cache, using a constantly growing counter would be a cheap, robust, and very good solution.

Option (2) would easily fit into 2 bytes at this early stage of the cache’s life. After all, the cache has not seen that many items yet.

So where is the LRU marker?

The next painting compares Test 1 to Test G, which was the last test that still contained the “iconic” shared blue text area.

After a heavy usage of the cache over a period of 7 days …

  • We still basically see the same picture
    • … which makes sense, as this early data is still in the cache, unchanged.
  • The main difference is … the 2 byte red column
    • … has turned into a 3 byte red column.
    • This would fit the logic of a constantly growing counter.
  • As the 4 byte red column is still 4 bytes
    • … but is always changing everywhere
    • … it is highly likely that we are looking at a 32 bit checksum (hash) of the index item.

So what are the four (or more) green columns then? They highly likely contain the following essential information:

  • A pointer (offset) to the beginning of the actual blob item.
    • This most likely is the “fixed” 4 byte column, as early data is written to roughly the same offset.
      • 512 MB … is 536,870,912 bytes … or 0x2000_0000 (the observed blob area start at Offset 536,877,473 sits just past this).
    • That would also fit the white “zero” pattern at the top of those green columns.
  • The length information for the blob item.
    • This most likely is the 2 or 3 byte data, because blob item lengths are not always the same.

Now a cache index needs to know “what” is inside each blob item. FS2024 reports on launch that it builds a virtual file system (VFS). This takes a long time, because a couple of million items need to be placed inside a tree … most likely based on a unified file (URL) path notation. This requires a lot of memory.

An index needs to be more efficient. And so I would consider it likely that one of the remaining green columns contains …

  • Some form of checksum … for the blob item.

That still leaves (at least) 2 green columns in the realm of mystery.

What happened in the white zero areas?

The next painting compares Test 1 to Test H, after the oldest data in the RC has been wiped out.

  • This mainly confirms that when the blob item of an index item changes
    • … then everything turns red (is different).
  • So it is highly likely that the index item contains some more “nice to have” information:
    • Some form of checksum … for the blob item.
      • That blob checksum can only account for 4 or 8 bytes … but not for all changed bytes.
      • I would assume 8 bytes,
        • … because the hash values inside the layout.json files stored on the regular filesystem do not fit into 32 bit.

However, some interesting questions remain:

  • There is now actual (red) data in the white zero parts.
    • Maybe this is related to …
      • data deletions?
      • larger blob items than before?
      • whatever … I have no clue.

To summarize the observations presented above:

  • The index area uses 72 bytes for each index item.
  • Each index item is highly likely to contain:
    • A constantly growing counter as the marker for the least recent usage.
      • This is a very good choice for this purpose.
    • A pointer (offset) to the beginning of the actual blob item.
    • The length information for the blob item.
    • Some form of checksum … for the index item.
      • Most likely a 32 bit checksum.
    • Some form of checksum … for the blob item.
      • Most likely a 64 bit checksum … like the hash in layout.json files.
  • Each index item also contains … some additional information
    • … whose purpose cannot be deduced from delta content paintings.
  • The usage of checksums for the index items as well as the blob items is a very good decision.
    • This makes storing files in the RollingCache.ccc more robust than storing them in the normal filesystem.

The above will become more important once I try to outline (and figure out) what a Brave New World caching concept should introduce to the existing system.

15 Likes

What a terrific job!
Thank you for all this information.

1 Like

This post is taking me back to the early 2000s on the AVsim scenery design forum, where folks like Dick “rhumbaflappy”, Arno Gerretsen, among others, were really digging deep to figure out the undocumented (in the SDK) mysteries behind the .BGL files. Their efforts led to the development of tools that weren’t even included in the SDK, which in turn made it possible to create sceneries that were way more advanced than anyone ever thought possible. Mr. goose, I just want to say how much I admire your determination, dedication, and your knack for reading data like it’s tea leaves.

And if I may add one last thing, this post is a true example of what is really made by the community for the community.

8 Likes

I can’t comment on any of the technical observations, questions, and conclusions in the myriad of tests shared in this thread. I’m simply not smart enough.

What I can say is that I am learning new things. And I appreciate the methodology. This thread takes up a lot of your time. I can see that you enjoy doing it. I’m enjoying the thoroughness, as well as your taking the time to draw conclusions when you can, and sharing those conclusions in a way that makes them at least somewhat digestible for a dummy like me.

Little lightbulbs sparking questions in my skull jelly. Thank you.

7 Likes

2025.01.18 - v1.2.11.0 - Growing the Rolling Cache from 16 to 256 GB

So far my focus was on testing the default 16 GB size of the RollingCache.ccc file.

The primary insight that I gained from all the past tests was, that today the RC uses a strict Least Recently Used (LRU) cache replacement policy.

Based on that finding back on 2024.12.23 I did post the following “guess” for practical recommendations:

  • Increase the size of your RC as much as possible
    • … but I might change my opinion on this, once I take a closer look at the “tiny purple line” at the beginning of the cache.
    • Especially the landscape data on long flights can and will wipe out important sim data
      • … which then needs to be downloaded again if it only was in the RC.
    • Sizes of up to 250 GB seem to be without noticeable problems (so far).
    • Sizes above 750 GB might be a waste of storage,
      • … as the sim might not be able to use it all (due to the nature of the RC index structure).
  • Once the feature gets added to FS2024, manually cache (aka. “Install”) as many relevant FS2024 Marketplace assets (aircraft, airports) as you can outside of the RC.
    • This form of manually caching outside the RC will reliably reduce the need for network usage.

But that was theory … is that true in practice?

So with that background the first obvious question was:

  • What is the nature of the data in the RC after a very long flight in a very large cache?

For that I did run the following test:

  • Test A1 to A4
    • I upgraded to sim hotfix v1.2.11.0
    • I deleted the RC file … to trigger a clean rebuild of the RC.
    • I waited until all the initial data for the cache arrived
      • … which took some time, due to stressed servers in a rainy cloud.
    • I restarted the sim … and raised the RC size to 256 GB.
  • Test A5
    • I took a long flight … 11.5 hours
      • … from KSFO to KLAX to CYYZ
      • … visiting the 3 TIN cities at low altitude.

As always I made system event recordings with the Process Monitor tool. For easier and more detailed analysis I exported the results to a more readable CSV file format.

What I did see during Test A5 was:

  • Around 2.5 million new blob items have been added to the RC.
    • That is around 200,000 items per hour.
  • The average blob item size was around 30 KB.
  • This translates to around 70 GB of new data in total
    • … or 6 GB per hour
      • … which on average is around 15 to 20 Mbps.
  • Based on the 256 GB RC size
    • … the 70 GB are around 27% of the available size.

With those more concrete blob item numbers I would now make the following index area observations:

  • At 72 bytes per index item
    • … the existing 2.5 million new blob items
      • … require around 171 MB of index area space.
      • Since that area has a fixed size of 512 MB
        • … around 33% (1/3) of the index has been used up.
        • So this means three such 70 GB “long flight” loads will exhaust the index area.
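The index area math above can be double-checked, and it also yields the theoretical item capacity of the fixed 512 MB index area:

```python
ITEM_SIZE = 72                  # bytes per index item (observed)
INDEX_AREA = 512 * 1024 * 1024  # fixed 512 MB index area
NEW_ITEMS = 2_500_000           # blob items added during Test A5

used = NEW_ITEMS * ITEM_SIZE
print(used / 2**20)                 # ~171.7 MB of index area space
print(round(used / INDEX_AREA, 3))  # 0.335 -> about 1/3 used

capacity = INDEX_AREA // ITEM_SIZE
print(capacity)                     # 7,456,540 items fit at most
```

At an average blob item size of 30 KB, roughly 7.45 million items correspond to around 213 GB of blob data, which is where the revised 250 GB recommendation comes from.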

So with that I will revise my recommendation from 2024.12.23, which said:

  • Sizes above 750 GB might be a waste of storage,

Today I would say:

  • Sizes above 250 GB might be a waste of storage,
    • … as the sim might not be able to use it all (due to the nature of the RC index structure).

However, I also stumbled over an issue, which I increasingly do consider a (major) performance issue of large cache sizes. I will try to explain that in the next post … and until then I will leave you with this … cliff hanger.

12 Likes

I hate cliffhangers :grin:

2 Likes

2025.01.20 - v1.2.11.0 - Major performance problems of the index area (especially in a large RC)

Beware … you are (slowly) entering “Brave New Cache” territory.

Leading up to a future post about all the ideas for a “Brave New Cache” (BNC) release, which would require substantial code changes, I want to identify and explain individual (major performance) issues … one by one. I have the feeling that otherwise my “Brave New Cache” post might be even longer than the already long posts of the past.

As a goose I hold the … perhaps outdated … opinion that bold claims demand bold evidence.

Today my bold claim will be:

  • The RC subsystem of the sim is suffering from a severe “index area full dump problem”.
    • As a consequence the entire caching and data processing pipeline will “stutter”
      • … which perhaps even might result in visible “FPS stutter”.

I might return to the “visible FPS stutter” in the future, as that is highly likely caused by a different problem. But I would file it as a “backpressure” consequence of today’s major topic.

I have seen indications of the “index area full dump problem” early on in the Process Monitor recordings. But bold evidence demands a dedicated set of tests.

So now I performed the following:

  • Test A5
    • … was explained in full detail in my previous post.
    • It mainly delivered a fresh new 256 GB RC file
      • … which then stored about 70 GB of landscape data from northern America
      • … written into consecutive blocks of around 2.5 million blob items and index items.
  • Test A6
    • Here I try to touch as little existing, old data as possible.
    • I took a 1.75 hours bush trip in the beautiful landscape of northern Chad.
    • The airstrip of FTTZ
      • … has no buildings, no ground vehicles, no ground personnel, no nothing
        • (OK … for some reason it had a parked ATR aircraft … well).

The main idea of Test A6 was that there should be very large, totally untouched parts in the cache. And so the delta content painting comparing the first 600 MB of Test A5 with Test A6 looks like this:

  • Each pixel represents 512 bytes.
  • Since the index area is only 512 MB …
    • the lower green area shows 88 MB of identical initial blob item content.
  • The tiny red line at the beginning references the essential scenery index data
    • … and other items which get touched and used during every sim launch.
  • The white region is the yet unused index area.
  • The red above the white area references all the new blob items
    • … which have been loaded during Test A6.
  • Within the large green area at the top one can see sporadic red pixels
    • … they again indicate some Test A5 item reuse.

When painting the same data with a block size of 4 MB, the result becomes somewhat more interesting … both less and more obvious:

ROLLINGCACHE.CCC_indexArea_600MB.bin.FCD_deltaz100pc_4096KB_A6_A5

  • Many parts of the previously white area turned green!
    • That is because there are tiny green pixels in the high-res painting, which are hard to see.
      • It seems like the “zero-fill” does periodically leave some “marker” bytes … for yet unknown reasons.
  • The red pixels in the second row mark the actual end of the used index area.
  • Overall the first 46 pixels represent the 184 MB of used index area.
    • 18 (40%) are red, showing blocks with modified data.
    • 28 (60%) are green, showing blocks which have stayed untouched and identical over the entire 1.75 hour flight.

At a more detailed level Test A6

  • … lasted for around 1.75 hours … which is 6300 seconds.
  • There has been no TIN landscape in this flight.
  • It did (again) see an average blob item size of around 30 KB.
  • And with around 110,000 new blob item TOCs (Length: 28 bytes) written
    • … we are seeing ca. 17 blob items per second
      • or 0.5 MB per second … which is 30 MB per minute.

Index area write pattern

Looking at the CSV version of the Process Monitor recordings, and focusing on the write events related to the index area, one can see this pattern:

"07:13:49.4099631",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
..
"07:13:49.4665955",,,"FASTIO_WRITE",,,"O: 176,160,848, L: 4,194,304"
..

"08:00:01.2260702",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
..
"08:00:01.2932592",,,"FASTIO_WRITE",,,"O: 184,549,456, L: 4,194,304"
..

"09:04:27.6738518",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
..
"09:04:27.7271167",,,"FASTIO_WRITE",,,"O: 184,549,456, L: 4,194,304"

The above are three examples: the first index area write, a random one from the middle, and the very last write. They all have this in common … and by “all” I do not mean just the three examples, but actually all index area write activities that I can find in the recording:

  • The index area is always written as a full dump!
    • Each write has around 180 MB
      • … which are copied to “disk” in 4 MB blocks
      • … using the “FASTIO_WRITE” technique.
  • “Writing” the full 180 MB took … 54 to 67 milliseconds.
    • If we go with 60 ms … then this translates to 3,000 MB/s of storage throughput? Hmmm.

The key observation here is that the RC subsystem always writes the full index area, even when basically nothing has changed!
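The throughput figure above follows directly from the numbers:

```python
dump_bytes = 180 * 2**20  # one full index dump of ~180 MB
dump_s = 0.060            # observed 54..67 ms per dump, assume 60 ms

mb_per_s = dump_bytes / dump_s / 2**20
print(round(mb_per_s))    # 3000 MB/s -> far beyond SATA III SSD speeds
```

That physically impossible rate is the first hint that these writes never reach the drive directly (more on “FASTIO_WRITE” below).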

Index area write frequency

So how often does this happen … during a flight, or an hour, or a minute?

If I only focus on the “Offset: 80” events, which mark the beginning of a write process, during one hour in flight Test A6, then I can see this:

"08:00:01.2260702",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:00:47.7367668",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:06:12.0288421",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:09:28.8070146",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:10:21.8668873",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:12:35.5750656",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:14:48.9907696",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:17:27.3803098",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:20:22.6257179",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:23:06.4669770",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:25:55.7940173",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:28:41.9122268",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:31:36.7099956",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:34:31.2476710",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:37:23.2834541",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:40:29.6643522",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:43:28.5271062",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:46:11.9059819",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:48:54.1651227",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:52:04.7905014",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:55:46.5810911",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"08:59:21.2446528",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
  • Sometimes the delta is … 46 seconds
    • sometimes it is … over 5 minutes.

However, in the 11.5 hour Test A5 flight I can see segments with this pattern:

..
"17:07:02.1289697",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:07:20.0209665",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:07:39.3841972",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:07:57.8853231",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:08:15.8243826",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:08:35.2747403",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:08:54.4175400",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:09:13.0769473",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:09:32.6322394",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:09:45.8385360",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:10:06.6215734",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:10:27.7512239",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"17:10:54.0164214",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
..

… or this pattern, while I was (with heavy FPS stuttering) above the TIN landscape of Toronto:

"18:15:54.4138176",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:16:05.2952318",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:18:06.3384782",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:18:31.9234027",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:19:35.2588535",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:20:01.6700722",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:20:23.5176490",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:20:37.1728612",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:33:19.8139865",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:35:56.0415092",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:37:33.0595375",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:38:02.0353212",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:38:39.1219073",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:38:59.8414678",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:39:25.3549464",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:39:59.0339799",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:40:14.4490319",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:40:44.4258269",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:41:04.8299137",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:41:56.3904035",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:43:03.8270682",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:44:41.8691164",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
"18:46:18.3579385",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"
..
  • The first shows a very consistent … 20 seconds … pattern!
  • And under heavy stuttering (around 18:20) it is … once every 13 minutes?

Even when I take into account that, especially under heavy system stress, Process Monitor will not be able to reliably record 100% of all events, the level of variation is still very large.

The most likely reason for the observed variations is that index write operations are triggered by some RC control algorithm, which is not simply based on a fixed time interval.
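Extracting those inter-dump intervals from the recording can be automated. A sketch that assumes the simplified CSV shape shown above (timestamp in the first quoted field; trimmed to 6 fractional digits because Process Monitor logs 7, while Python only parses microseconds):

```python
from datetime import datetime

def index_dump_intervals(lines):
    """Return the gaps (in seconds) between consecutive full index dumps,
    identified by FASTIO_WRITE events starting at "O: 80"."""
    times = []
    for line in lines:
        if "FASTIO_WRITE" in line and '"O: 80,' in line:
            # First quoted field is the timestamp; keep HH:MM:SS.ffffff
            stamp = line.split('"')[1][:15]
            times.append(datetime.strptime(stamp, "%H:%M:%S.%f"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

sample = [
    '"08:00:01.2260702",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"',
    '"08:00:47.7367668",,,"FASTIO_WRITE",,,"O: 80, L: 4,194,304"',
]
print(index_dump_intervals(sample))  # one gap of about 46.5 seconds
```

Feeding a whole exported CSV through this would make the “46 seconds vs. 5 minutes vs. 13 minutes” variation visible as a plain list of numbers.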

Index area write volume

If I take a “best case” example, like the very low write activity during hour 08:xx, then the numbers are the following:

  • 22 index area write events
    • … each with 180 MB … is in total 3,960 MB per hour.
  • 17 blob items per second
    • … translates to 30 MB per minute … is in total 1,800 MB per hour.

I do not even want to do the math for the “worst case” example with 20 second intervals.

I also want to stress that the blob item numbers depend on the type of flight (speed, altitude) and landscape (TIN or not). But they are fairly predictable (somewhat “constant”).

However, the index area will grow until either the blob area is full, or the index area is full. So for a 256 GB RC file the index area can actually grow up to the hard limit of 512 MB. In that case each index write will carry roughly 3 times more data and will require 3 × 60 = 180 ms to complete.

So …

50% to over 90% of all bytes written to disk in Test A6 are caused by totally unnecessary writes of unchanged index area data.

What is this “FASTIO_WRITE” thing?

While scaling the RC up to 256 GB my observed “zero-fill” write rate to my SSD was 300 MB/s … so almost twice the 160 MB/s which I observed back on 2024.12.11 during a previous cache size increase operation. I cannot explain the difference, but at least those two values provide a reasonable “real world” corridor.

This raises two interesting questions:

  • How can I observe 3,000 MB/s index area write speed
    • … on a system that can only deliver 300 MB/s to the actual SSD drive?
  • What does “FASTIO_WRITE” mean?

The RAM caches inside an SSD cannot answer the first question, because the SATA III bus can only deliver around 600 MB/s to the SSD drive. So question two must be the answer to question one.

Access to the file is opened with the following event:

"07:09:48.3574311",,,"IRP_MJ_CREATE",,,"Desired Access: Generic Read/Write, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: None, AllocationSize: 0, OpenResult: Opened"

So “fast” does not mean “asynchronous”.

The official Microsoft documentation explains “fast” like this:

“Fast I/O is designed for rapid synchronous I/O on cached files. In fast I/O operations, data is transferred directly between user buffers and the system cache, bypassing the file system and the storage driver stack.”

The key point is “cached files”. So the timing here is basically a “user process RAM” to “operating system RAM” copy operation. That explains the speed.

However, no matter how fast a storage subsystem is, doing (so much) useless work is never an efficient strategy … it never comes for free.

No other RC read or write operation can take place during that period of time (“Synchronous IO”).

Combined with the observed Read after Write RC usage pattern (see post “2025.01.08 - v1.2.8.0 - Rolling Cache read-to-write ratio”) unnecessary index writes will block (important?) data from reaching the rendering pipeline.

Rendering a single frame at 60 FPS means each frame must be finished in 16 ms, with all the visual magic.

At the same time, with a 256 GB RC full of cached data, for up to 10 frames (180 ms) the sim will not be able to read any data from the RC, due to a very simplistic index area full dump into the system file cache RAM.

But the story does not end there. Those writes will cause system file cache pages to be marked as “dirty”, needing a flush to disk at some point. So lots of bytes will be written to the SSD over and over again … even when the data never actually changed!

Or, if you look at it from another perspective; if you try to reduce the impact of the “LRU problem” by raising the size of the RC, you will automatically increase the effect of the “index area full dump problem”.

Increasingly the text of my posts will be covering “problems” … and so I think I should try to balance that with beautiful FS2024 landscape images.

The northern part of Chad did teach this old goose that desert regions are mostly underappreciated. And there is so much more to discover in northern Chad.

To summarize the observations and claims presented above:

  • A major performance problem is:
    • The sim always writes (dumps) the entire index area … up to 512 MB
      • … even when very large parts of the index did not change.
      • And this happens so frequently, that 50 to over 90% of all data written into the RollingCache.ccc file is useless, unchanged index area!
  • It feels reasonable to suspect that blocking the RC file for 50 to 200 milliseconds might cause visible FPS stutter … by means of backpressure.
    • People reported in other threads that they seem to observe less FPS stutter with fresh (new) empty RC files.
      • That would be consistent with bigger and slower index area writes … due to larger RC files.

Ideas for a Brave New Cache (BNC) …

  • Index item writes:
    • … should only take place, when data actually did change.
    • … should be better aligned with the system cache page size.
    • … should be performed in small chunks only … not a full 512 MB dump.
    • … should have a well defined “worst case” latency impact on the “real” data rendering pipeline.
    • … should never compete with “real” blob item data access.
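The first three ideas above could be sketched as a dirty-page tracker: remember which pages of the in-memory index changed, and flush only those. This is a minimal illustration under my own assumptions (page size, names), not the sim’s code:

```python
PAGE = 4096  # assumed flush granularity, aligned with a system cache page

class DirtyIndex:
    """Sketch of an index area that flushes only modified pages,
    instead of dumping the full 512 MB on every write."""

    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.dirty = set()  # page numbers touched since the last flush

    def write_item(self, offset: int, item: bytes) -> None:
        self.buf[offset:offset + len(item)] = item
        first = offset // PAGE
        last = (offset + len(item) - 1) // PAGE
        self.dirty.update(range(first, last + 1))

    def flush(self, out) -> None:
        # Write only dirty pages, each at its page-aligned file offset.
        for page in sorted(self.dirty):
            out.seek(page * PAGE)
            out.write(self.buf[page * PAGE:(page + 1) * PAGE])
        self.dirty.clear()
```

With a 72-byte item and a 4 KB page, adding one blob item dirties at most two pages (8 KB) instead of 180+ MB … and a per-item checksum, as suspected above, would keep such partial writes detectably safe.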

In my next post I think I will try to provide more background on the nature of index concepts. This might be helpful to better understand some of my (future) BNC ideas. But I also want to come back to the “zero-fill” areas and the problems of “cache fragmentation”.

11 Likes

Can we assume this same methodology is employed in 2020’s RC, hence the conclusion so many of us came to is to turn it off?

I used to see significant stuttering when it was on, which immediately ceased when I turned it off. I’ve had it off for years, as a result.

2024, obviously, can’t be set up that way.

2 Likes

OK, I admit I DID NOT read all posts.
Could you tell me what would be a “safe” recommended setting for the Rolling Cache for a high end PC? I have 64 GB RAM and an RTX 4090.
Thank you

Yes … I would assume that too.

I think the code for managing the RC file was not changed during the transition to FS2024. However, since I uninstalled FS2020 I can not (and do not plan to) verify that assumption.

I also fairly quickly decided in FS2020 that there is less stutter if I completely disable it.

But as I hinted at in the beginning of my index area full dump post … I actually suspect that another bottleneck is the true cause. I will try to explain that in more detail at some point … but basically the handover between the “cache data subsystem” and the downstream 3D rendering pipeline seems to be synchronous (blocking) … and that is the primary cause of the “hiccup” stutter IMHO.

2 Likes

Good question. I honestly do not know (at this point).

Now with 64 GB of RAM, RAM is not the issue. I have never seen the FS2024 process go above 32 GB … and my system is equipped with 128 GB.

Regarding the RC size we find ourselves between “a rock and a hard place”.

  • 16 GB is too small (= big LRU problem) … but at least it keeps the index small.
  • 256 GB “solves” the LRU problem … but the large index will cause “stutters”.

It seems that even a heavily used 16 GB cache may trigger more stutters than a clean, new 16 GB cache (I have seen people report such observations).

In the very first post of this thread I try to maintain a form of high level summary.
There I posted:

  • Sizes above 250 GB might be a waste of storage,
    - … as the sim might not be able to use it all (due to the nature of the RC index structure).

I think I will now have to add something like

  • Sizes above 100 GB may impact overall performance (e.g. more stutters etc.)

In the end it will depend on the kind of flights that individual pilots prefer to make. As a sightseeing goose I take very long flights at low altitudes … so I will try to stick with larger RC sizes … because I do “prefer” stutters over waiting 10 to 30 minutes for the sim to finish launching.

I am sorry that I can not really give a good answer.

5 Likes

That’s OK. I understand, as it is kind of a guessing game from what I see, and the developers will not explain it.

Well, I have now tried it with 64 GB (it was 32 GB before), we will see. :slight_smile:

Thank you again.

Listening to the current Dev Stream I think I will do a “re-cap post” here about the Rolling Cache aspects, once I can download the video in order to get the quotes and timestamps correctly.

In short … I think I agree with most statements.

However … there are details which IMHO matter, and where I would take a slightly different position.

4 Likes

Mine is set to 100GB, and it seems like a good number. The sim is (mostly) stutter-free, and load times are around 3 minutes.

I can live with that. I don’t do long cross-country flights, though, and once I do my opinion may change.

I eagerly await any more info you can share. This thread has been most illuminating.

1 Like

In today’s Dev Update, Seb discussed the rolling cache in some detail. He recommends it be used by those wanting to download data until a “Force Download” function is available. He recommends 16gb, 32gb or 64gb, but not more.

2025.02.05 - Comments to the Rolling Cache related information of the Dev Stream

Allow me to preface this post with the following:

  • IMHO it is extremely unfair to expect that a person can answer every question or knows every detail about a large project or topic
    • … on the spot
    • … in a live stream.
  • It is simply not possible to cover every technical detail
    • … in short common language sentences
    • … that will or can be correctly understood by everybody.
  • Everything is clearer in hindsight … or in “slow motion”.
  • I do not know the FS2024 code
    • … and, looking at it from the outside, I may interpret some aspects totally wrong.

With that background I want to turn now to the Rolling Cache related information that was given in the last Developer Live Stream.

I downloaded the video from the link above and will provide the quotes with timestamps (h:m:s) … plus my comments. Where possible I will add links to some of my older posts where you can find more details and the background of my comments.

And as mentioned in many previous posts, my experience is only based on flying at 4K-Ultra-everything on PC. I know nothing about Xbox.

Data volume aspects

Let me start with the topic of data volumes:

01:19:10

S: "… the first hour of flying I downloaded eight gigs
    but then it sort of settled on an 
	  average of two to three per hour."
		
S: "… if you go from TIN to TIN at low altitude 
    it's going to fill faster … "
	
01:23:33

S: "… the average on PC was five gigs an hour …"

I have measured the same numbers as Sebastian.

However …

  • I think one should put a real number to the TIN case.
    • A 3 hour flight in a (heavy) TIN region like Florida
      • KMIA to KJAX … 150 ktas … 1,000 ft
      • … did add around 60 GB to my Rolling Cache.
    • 1 hour “TIN” = 20 GB
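
The fill-time consequence of these rates can be checked with simple arithmetic. A small sketch, using the rates quoted in the stream (~5 GB/h PC average) and my own TIN measurement (~20 GB/h):

```python
# Back-of-the-envelope fill-time estimates for different RC sizes,
# based on the download rates quoted above (5 GB/h average on PC,
# ~20 GB/h for low-level flying in a heavy TIN region).

def hours_to_fill(cache_gb, rate_gb_per_hour):
    """Hours of flying until a cache of the given size is full."""
    return cache_gb / rate_gb_per_hour

for cache in (16, 32, 64, 128, 256):
    avg = hours_to_fill(cache, 5)   # average PC flying
    tin = hours_to_fill(cache, 20)  # low-level TIN flying
    print(f"{cache:>4} GB cache: {avg:6.1f} h at 5 GB/h, {tin:5.1f} h at 20 GB/h")
```

So a 32 GB cache fills in under 2 hours of low-level TIN flying, while even a 128 GB cache lasts barely a working day of such flights: the “weeks to fill it up” estimate only holds for high-altitude flying.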

Why is this important?

The magic of FS2024, and a focus of the intro video that we are watching on every launch of the sim for 3 minutes, is … flying at very low altitude.

None of the FS2024 challenges is: “Flying at 40,000 ft from KJFK to ZBAA”.

All challenges are about … flying low and fast.

So I would recommend to embrace the obvious …

Flying low and fast … in countryside TIN landscape.

The future of the global twin is TIN

I really love the vision that Jorg repeatedly puts on the table:

01:17:17

J: "… I always love that about the digital twin.
    This is only going to get better.
    There's more and more data."

… and before that he mentioned the goal to increase “countryside TIN”. Later Sebastian added:

01:32:24

S: "… so first of all there's the world data
    … which is three or so petabytes.
    Every time there is a new TIN it's more and more data …"

This is all correct and IMHO this is what makes FS2024 such an impressive tool for exploring our planet.

However …

  • Precisely because of this increasing volume of TIN data
    • … and precisely because that is the future of FS2024
    • … the caching system must become at least one order of magnitude better at “caching”.
      • It at least needs to distinguish between “important” and “not important” data.

I would recommend to embrace some …

Ideas for a “Hotfix” release of the Rolling Cache … which have been posted here:

Rolling Cache only stores the large data?

01:21:33

S: "… we have really two different caching systems.

    The rolling cache caches heavy stuff 
    like meshes, textures, and all these things.

    The little files, like CFG text files for planes,
    these are always cached on … the drive …"

Sebastian is clearly correct. There are two caching systems … actually I would argue that there are already three … and soon four:

  • Automatically managed downloads
    • … cached (stored) in the RollingCache.ccc file.
    • … stored (cached) in the normal packages on the filesystem.
  • Manually managed downloads
    • … stored (cached) in the Community folder
    • … and soon also FS2024 packages in some special FS2024 folder.

However …

  • The “heavy” vs “little” distinction seems somewhat oversimplified to me.
  • “Heavy” sound files are stored in the regular filesystem
    • … and I suspect that this is for reasons of reliable data streaming to the sound subsystem of the OS.
  • There actually are lots of “little” files in the RC.
    • I would argue that 40% of the files in the RC are “little”
      • … but, just to be clear, I am not saying that this is “bad”.

To back up my counterposition let me show concrete numbers. Over 6 flights with a total of 26 hours, my 256 GB RC accumulated around 5.5 million blob items with a total of 180 GB. A “rule of thumb” analysis of the distribution looks something like this:

  Size category    Item %   Size %
  0 to 1 KB          < 5      0.1
  1 to 10 KB          35      5
  10 to 100 KB        55     50
  0.1 to 1 MB          5     30
  1 to 10 MB           0.10  10
  above 10 MB          0.01  < 5

  • Small items below 1 KB still use up 5% of the index area.
  • Over 50% of all items and of the blob volume are in the 10 to 100 KB category.
  • 5% of the data volume requires 40% of the index (management) “attention”.
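
The “5% of the volume needs 40% of the index attention” claim follows directly from the table, because every blob item costs exactly one index entry regardless of its size. A small sketch using the (approximate) percentages from the table above:

```python
# Rough model of the blob size distribution quoted above.
# Percentages are taken from the post; "< 5" entries are treated as 5
# and 0.1 respectively, so the numbers are illustrative only.
distribution = [
    # (size category, item %, size %)
    ("0 to 1 KB",     5.00,  0.1),
    ("1 to 10 KB",   35.00,  5.0),
    ("10 to 100 KB", 55.00, 50.0),
    ("0.1 to 1 MB",   5.00, 30.0),
    ("1 to 10 MB",    0.10, 10.0),
    ("above 10 MB",   0.01,  5.0),
]

# Items below 10 KB: share of index entries vs share of data volume.
small_items = sum(items for _, items, _ in distribution[:2])
small_volume = sum(size for _, _, size in distribution[:2])
print(f"items < 10 KB: {small_items:.0f}% of index entries, "
      f"{small_volume:.1f}% of data volume")
```

Since each index entry is written, scanned, and LRU-managed at the same cost, the many small items dominate index overhead while contributing almost nothing to the cached volume.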

I did not (yet) make any similar analysis of the regular filesystem data, which is stored outside of the RollingCache.ccc file.

I would recommend to embrace this challenge:

The large number of small files does have a large impact on cache performance.

To repeat …

  • I am not recommending to remove the “little” files from the cache.
  • But I am recommending to raise the importance of the RC
    • … so that it will always perform better than the filesystem or manual downloads.

RC size recommendations

During the stream a number of comments have been made with recommendations for the RC size:

01:22:10

S: "32 is what I usually use …

    If you put 128 gigs of rolling cache you're going to 
    have a very good experience because it's going to
    take … weeks to fill it up 

    … 32 or 64 is really … already very good.

    It depends how you fly …"

Again, I agree with Sebastian and in this thread I basically made similar recommendations.

However …

The essence of that post was:

Regarding the RC size we find ourselves between “a rock and a hard place”.

  • 16 GB is too small (= big LRU problem) … but at least it keeps the index small.
  • 256 GB “solves” the LRU problem … but the large index will cause “stutters”.

It seems that even a heavily used 16 GB cache may trigger more stutters than a clean, new 16 GB cache (I have seen people report such observations).

There are a number of previous posts which explain the technical background of the “LRU” and “full index area dump” problems. You should be able to find them if you search for:

  • Today's Least Recently Used (LRU) strategy does not fit the nature of FS2024.
  • The sim always writes (dumps) the entire index area … up to 512 MB

“100% Full” question

One question related to the RC that was asked was:

“Could the rolling cache display information on how full it is and what it contains? Perhaps in My Library.”

… and the direct answer was:

01:30:23

S: "… that's I think something we will write down on the backlog …"

I agree with Sebastian that this is currently not the most important (pressing) issue they have. In addition I would like to point out:

  • At some point every cache will reach its size limit and be “100% full”.
    • That is normal.
  • Actually one should expect that the RC is always at 100%
    • … and if not, then it is not a cache, but a waste of storage.

However …

  • I think the question was really targeted at the problems that the RC is causing today:
    • like “FPS stutter” triggered by a fully (large) cache
      • … where one “manual hotfix” (on PC) is to delete and rebuild the cache once it reached a certain “fill level”.
  • I fully agree with the person who asked the question, that the RC
    • … needs to become more “transparent” to the user
    • … which also means to offer more “customization” options to better fit individual flight habits.

I would recommend to embrace …

Ideas of Rolling Cache customization (configs) and some “analysis” features (maybe in the Developer menu?)

… and I posted some detailed customization ideas here:

I should revisit the topic of a “100% Full” cache in my next post in more detail. But just briefly … from the perspective of a goose the (hard) engineering question is:

How to design a cache that will not degrade in performance … once it gets big or full?

Sim Update 2

That directly leads me to the information about sim update 2.

I was very glad to hear that the FS2024 team is seeing the data download and data caching topic as one of the focus points for sim update 2. I hope that some information that has (and perhaps will be) provided here in the forum turns out to be useful.

Jorg repeatedly said that they want to take it slow in order to do it right.

And as a goose I would recommend the same. There is no other way. They have to get this right.

8 Likes

Thanks for your efforts! Please forgive my laziness but that’s a lot of reading!

I sent this to support. They never answered or even acknowledged.

I was watching the sim slam my data limits. I went to the cache setting to find it set to 0. I set it to 16G and it shows me a window saying something like, “This could take several minutes, don’t shut me down.”. The window shows 100%. At this point, MSFS2024 will take no more input. 10, then 20 gigs without stopping. I had to shut it down via Task Manager. It appears that the shutdown was harmless other than all that lost data. On rerunning, the cache size is zero again.

I also note that there’s a continuous downloading occurring even though I sit on the ground in the same position.

I would think that the “rolling” cache would simply keep the newest stuff around and replace older stuff with newly downloaded stuff. Isn’t that what a cache is supposed to do?

Any idea what’s happening when I change the cache size?

Thanks!

I changed the cache size from 16 to 64 and moved the cache at the same time.
It then spent maybe an hour transferring something to somewhere, but not downloading from the internet.
After waiting, as asked to, MSFS 2024 is once more working.
Perhaps just let it finish?

1 Like

FWIW (not much) I set my RC to 100GB when I installed FS24 and haven’t touched it since. I’m very happy with the stutter-free performance I get flying around both new and frequently visited places. Is there a very occasional ‘stutter?’
Sure. But it’s more of a very short pause, rather than a true stuttering, which to me behaves more like a lot of dropped frames in a row.

I have personally not had this issue and I sadly can not be of any help.

What I have seen is that Windows File Explorer lags behind the real status of the filesystem when it comes to displaying size information. At one point File Explorer still showed a (cached) size of zero bytes for the RC … while in Ubuntu on the Windows Subsystem for Linux I could already see the correct RC size.

So I fear that there are multiple issues here.

However, I have seen that there is a dedicated thread on that topic over here … and I hope it might help you to find a solution.

1 Like