Affects Version/s: Snapshot 13w10a, Snapshot 13w10b, Minecraft 1.5, Snapshot 13w11a, Minecraft 1.5.1, Snapshot 13w17a, Minecraft 1.6.2, Minecraft 1.7.4, Minecraft 14w08a, Minecraft 14w21b, Minecraft 1.8.3, Minecraft 1.8.8, Minecraft 16w02a, Minecraft 1.9, Minecraft 1.9.1 Pre-Release 1, Minecraft 1.9.1 Pre-Release 2, Minecraft 1.9.1 Pre-Release 3, Minecraft 1.9.1, Minecraft 1.9.2, Minecraft 16w15b, Minecraft 1.10, Minecraft 1.10.2, Minecraft 16w39b, Minecraft 16w39c, Minecraft 16w50a, Minecraft 1.11.2, Minecraft 1.12.1, Minecraft 1.12.2, Minecraft 18w20c
Hello, this report fixes a saving-bug that has been in minecraft for years.
What is the problem? What happens?
Sometimes Chunks in the world wont be saved or overwritten by a newly generated chunk. The information of these chunks will be lost. This even happens partially without any error messages.
When does this happen?
Very rarely. This is mostly a problem of Servers with a high player count and a bigger playable area. At ~100 Players this problem occurs several times a day. With 10 players this problem might never occur.
What causes this?
The background process that saves already unloaded chunk-data to the specific regionfiles (.mca). This is happening because the service handling the currently opened regionfiles hands out references to files. This breaks synchronization-protocol.
Was this reported before?
Yes, several times without the knowledge of why, where or cause. An example would be MC-10976
Why a new report?
This report contains exact information as to why this happens and how to fix this.
Technical background information
I am using the spigot-namings. See my report here for additional information: https://hub.spigotmc.org/jira/browse/SPIGOT-2385
- FileIOThread is the BackgroundService to save already unloaded Chunks to the specific RegionFile. This service is async
- RegionFileCache holds a bunch of cached RegionFiles for loading/unloading data. FileIOThread regularly inserts data via this. Here is the problem, RegionFileCache hands out references to files!
- ChunkRegionLoader uses RegionFileCache for different lookups/loading/unloading. For example it checks if a chunk already exists in a region file or not very frequently.
When does this problem occur?
Exactly when the RegionFileCache is viewed as "full" and a cleanup is triggered. The cleanup will remove all references to regionfiles from the RegionFileCache and close all regionfiles.
FileIOThread tries to saves data every X milliseconds to a regionfile.
If it already started a new try to save data - thus getting a reference to a file from ChunkRegionLoader - and directly after this the RegionFileCache cleanup is started then the data FileIOThread is saving in that moment will be lost. Additionally it is possible the data gets corrupted and minecraft will generate a fresh chunk at that location the next time it is requested.
How to reproduce?
As a developer this is very easy. Change the RegionFileCache limit from 256 to 2. This will heavily increase the frequency this problem occurs. This should be enough to spam the console with saving-errors.
Reprodution of chunk-regenerations
The best way is to change the limit to 2 as seen above
- Create a flat world and generate enough chunks in an area
- Create a normal world
- Copy all the regionfiles from the flat world to the normal world (dont overwrite the level.dat)
- Start up the server and fly around in gamemode 3.
- The console will be full of errors. About every 1-2 minutes there will be a newly generated chunk will appear in the previously flat area.
How to fix the synchronization/reference problem?
One way to fix this to no longer give out references to files that could be unloaded at any time. Instead the service managing the references should be the only one to know about them.
This is possible with relatively low effort.
As the general implementation of RegionFileCache is faulty the method "c" and "d" need to be rewritten. They are the problem as they hand out references to region files. Instead we can change them to hand out the NBTData directly and mark them syncronized. With this setup the syncronization actually works:
Here is an example that has been tested. Only ~10 lines of code need to be changed in total.
Let me know if additional information is needed.