Minecraft 1.12.1, Minecraft 1.12.2, Minecraft 1.13
In the process of testing the fix for
MC-79154, I uncovered another potentially serious but unrelated Minecraft bug. I found it by looking at MCP for 1.11.2, although I'M SURE it affects 1.12.x also.
Here's what I saw in my logs:
(In case you're wondering why the saves are only 96 ticks apart, I think it's because the one on tick 5034170 happened when I logged out or restarted the server. I forget exactly, but it appears I triggered an off-schedule auto-save. This is actually relevant to Rich Crosby's fix for
MC-22147. Briefly, I thought it might be moot, since auto-saves are 45 seconds apart. What are the chances that the I/O system would be backlogged by 45 seconds, triggering that bug? But anything that causes two auto-saves to occur close to each other CAN trigger that bug.)
Anyhow, so, how could we be loading an old version of this data? Well, here's what I found:
There is a time between when AnvilChunkLoader.writeNextIO dequeues a chunk from chunksToRemove and when the I/O write completes when AnvilChunkLoader.loadChunk could come along looking for the same chunk. If that happens, AnvilChunkLoader.loadChunk will load a stale version of the chunk.
The key methods on RegionFile are synchronized, so I'm pretty sure it can't read corrupt data. Sector allocation and low-level file access are all synchronized. The problem is what happens in the period of time when CompressedStreamTools and the various NBTTag objects are are writing out the NBT data to the ChunkBuffer (derived from ByteArrayOutputStream), including the overhead of doing the compression.
I don't have an "easy one-line" fix for this.
Let's say that AnvilChunkLoader.writeNextIO were to pick a chunk out of the chunksToRemove but not actually remove it form that container during I/O. That would be great for the case when the same chunk is reloaded. But what if another save for the same chunk comes along during that time? Then the newer one would replace the older one in chunksToRemove while the older one is being saved, and then when the write was done, the newer one would get deleted from chunksToRemove.
That condition could be eliminated (fixing Crosby's observation in the process) if an auto-save can't even be started if AnvilChunkLoader's I/O queue is not empty.
How to reproduce: Point two hoppers into each other on a chunk border, throw one item in, create lag.