-
Bug
-
Resolution: Fixed
-
Minecraft 1.12.1, Minecraft 1.12.2, Minecraft 1.13
-
None
-
Unconfirmed
In the process of testing the fix for MC-79154, I uncovered another potentially serious but unrelated Minecraft bug. I found it by looking at MCP for 1.11.2, although I'M SURE it affects 1.12.x also.
Here's what I saw in my logs:
// On tick 5034266, a chunk unload happens, causing a hopper I was monitoring to get serialized as NBT data and queued for I/O: Hopper save on tick 5034266 ser=29333 oser=7 Position at co{x=-1, y=56, z=2} data: {TransferCooldown:3,x:-1,y:56,z:2,Items:[],id:"minecraft:hopper",serial_num:7L,Lock:""} // Two game ticks later, the same chunk is reloaded, but we get completely different data: Hopper load on tick 5034268 ser=29341 oser=7 Position at co{x=-1, y=56, z=2} data: {TransferCooldown:7,x:-1,y:56,z:2,Items:[0:{Slot:0b,id:"minecraft:redstone",Count:1b,Damage:0s}],id:"minecraft:hopper",serial_num:7L,Lock:""} // Searching backwards in the log, we find that this is what was written out the previous time this chunk was unloaded: Hopper save on tick 5034170 ser=29323 oser=7 Position at co{x=-1, y=56, z=2} data: {TransferCooldown:7,x:-1,y:56,z:2,Items:[0:{Slot:0b,id:"minecraft:redstone",Count:1b,Damage:0s}],id:"minecraft:hopper",serial_num:7L,Lock:""}
(In case you're wondering why the saves are only 96 ticks apart, I think it's because the one on tick 5034170 happened when I logged out or restarted the server. I forget exactly, but it appears I triggered an off-schedule auto-save. This is actually relevant to Rich Crosby's fix for MC-22147. Briefly, I thought it might be moot, since auto-saves are 45 seconds apart. What are the chances that the I/O system would be backlogged by 45 seconds, triggering that bug? But anything that causes two auto-saves to occur close to each other CAN trigger that bug.)
Anyhow, so, how could we be loading an old version of this data? Well, here's what I found:
There is a time between when AnvilChunkLoader.writeNextIO dequeues a chunk from chunksToRemove and when the I/O write completes when AnvilChunkLoader.loadChunk could come along looking for the same chunk. If that happens, AnvilChunkLoader.loadChunk will load a stale version of the chunk.
The key methods on RegionFile are synchronized, so I'm pretty sure it can't read corrupt data. Sector allocation and low-level file access are all synchronized. The problem is what happens in the period of time when CompressedStreamTools and the various NBTTag objects are are writing out the NBT data to the ChunkBuffer (derived from ByteArrayOutputStream), including the overhead of doing the compression.
I don't have an "easy one-line" fix for this.
Let's say that AnvilChunkLoader.writeNextIO were to pick a chunk out of the chunksToRemove but not actually remove it form that container during I/O. That would be great for the case when the same chunk is reloaded. But what if another save for the same chunk comes along during that time? Then the newer one would replace the older one in chunksToRemove while the older one is being saved, and then when the write was done, the newer one would get deleted from chunksToRemove.
That condition could be eliminated (fixing Crosby's observation in the process) if an auto-save can't even be started if AnvilChunkLoader's I/O queue is not empty.
How to reproduce: Point two hoppers into each other on a chunk border, throw one item in, create lag.
- discovered while testing
-
MC-79154 Hoppers occasionally duplicate items
- Resolved
- is duplicated by
-
MC-117930 Sticky pistons, slime blocks, and other blocks being moved may disappear when chunks load/unload
- Resolved
-
MC-153201 Hoppers can lose items
- Resolved
- relates to
-
MC-22147 Entity duplication bug
- Resolved
-
MC-102348 Order of chunk unloading, executing commands and removing unloaded entities causes problems
- Resolved
- testing discovered
-
MC-134979 Extended sticky piston at border of unloaded chunk gets stuck permanently extended
- Resolved