[BDS-17527] Multiple server crashes due to memory leak when loading chunks Created: 22/Aug/22 Updated: 26/Sep/23 Resolved: 19/Sep/23 |
|
| Status: | Resolved |
| Project: | Bedrock Dedicated Server |
| Affects Version/s: | 1.19.20, 1.19.21 Hotfix, 1.19.30, 1.19.31 Hotfix, 1.19.41, 1.19.63, 1.20.12 Hotfix |
| Fix Version/s: | 1.20.30 |
| Type: | Bug |
| Reporter: | MaladjustedPlatypus | Assignee: | Unassigned |
| Resolution: | Fixed | Votes: | 83 |
| Labels: | crash, server |
| Environment: |
OS = Debian GNU/Linux 10 (Buster) |
||
| Attachments: |
|
| Issue Links: |
|
| Confirmation Status: | Community Consensus |
| ADO: | 881031 |
| Description |
|
Made the bug public and removed the world file since I confirmed the memory leak occurs with a fresh world. Also adding some extra info from my comments below: After further testing, this seems to be a memory leak related to chunk loading. If you check the memory while standing still, not loading chunks, it should be stable. When you load chunks, whether via crossing portals, flying, etc., the memory use increases and does not come back down, which it should not do. Running farms while standing still did not affect the memory usage for us; passive mob farms, redstone mechanisms, etc. did not contribute to the leak. Here's a log of our memory use up until the server crashed again. Crash reports are fairly limited in information, but some of them are attached below. |
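The pattern described above (memory stable while idle, climbing while chunks load, never dropping) can be confirmed from outside the process. The sketch below is purely illustrative and not part of the original report; the process ID and sampling interval are assumptions. On Linux it parses `VmRSS` out of `/proc/<pid>/status` and prints a sample on each tick:

```python
import re
import time

def parse_vm_rss_kib(status_text: str) -> int:
    """Extract VmRSS (resident set size, in KiB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))

def sample_rss(pid: int, interval_s: float = 10.0) -> None:
    """Print the server's RSS every interval_s seconds. A monotonic climb
    while chunks load, with no drop afterwards, matches the leak reported
    here; a healthy process would plateau and eventually release memory."""
    while True:
        with open(f"/proc/{pid}/status") as f:
            rss_kib = parse_vm_rss_kib(f.read())
        print(f"{time.strftime('%H:%M:%S')} RSS = {rss_kib / 1024:.1f} MiB")
        time.sleep(interval_s)
```

Running something like `sample_rss(pid_of_bedrock_server)` while a player flies or crosses portals should reproduce the climbing curve attached to this report.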
| Comments |
| Comment by jtp10181 [ 23/Aug/23 ] |
|
I feel like they just ignored it until people recently started to pester them about it more. Seems like they were very confused about how to replicate the issue up until at least the last Mojang post on 6/16/23. Not sure how, though, since all you had to do was load a server and play on it. Either way, glad a fix is finally coming. It will be interesting to see how it runs after the fix. I had to bump up the RAM allocation on my VM just to deal with this issue. Will have to keep an eye on the release notes to watch for the update. |
| Comment by Luke [ 23/Aug/23 ] |
|
Looks like this issue has been finally fixed in the latest preview as the changelog says! Next stable release should bring the fix to everyone! “Only” took a year, but at least they fixed it. https://feedback.minecraft.net/hc/en-us/articles/18619357250701-Minecraft-Beta-Preview-1-20-30-22 |
| Comment by jtp10181 [ 18/Aug/23 ] |
|
@GoldenHelmet, fair enough. Would be nice to get some sort of an update. Is it actively being worked on? Can they reproduce it? Do they need anything from the users? Even a response acknowledging the problem and saying they cannot figure out what is causing it would be better than totally ignoring everyone. I imagine there are thousands of people experiencing this bug, but they either have no idea why the server is crashing or just have not bothered to search around and find this. I found it with a search trying to figure out if it was something I did wrong and could fix. Pretty sad that this has been on here for just about a year now with only two responses from "Mojang" in that entire time. |
| Comment by [Mod] GoldenHelmet [ 17/Aug/23 ] |
|
jtp10181: the Bedrock team does not use the "Assignee" field on this bug tracker. You can see that the report is being tracked internally by Mojang when it has a number in the ADO field. |
| Comment by mattwelke [ 06/Aug/23 ] |
|
Just chiming in to say I'm experiencing this too. The only difference is that for me, the server hangs instead of crashing while a process called "kswapd" climbs to 100% CPU usage and stays there. Sometimes it resolves itself after a few minutes. Sometimes it doesn't and requires me to restart my cloud VPS. I think this is because they provision the VPSs for me with swap enabled, so I get a hang instead of a crash. From the looks of the comments here, other people have more experience doing debugging steps and providing logs etc than me, so I'll hold off on adding mine until asked. I'll subscribe to this for updates on this issue. Thanks! |
| Comment by jtp10181 [ 03/Aug/23 ] |
|
Now mods are editing and deleting people's comments, while continuing to ignore the issue and giving no responses at all. In a few weeks it will be the one-year anniversary of this being reported. |
| Comment by Matthew Dietrich [ 27/Jul/23 ] |
|
Not sure why this issue is being ignored. People like myself are paying good money for our servers, only to have them ruined by an issue that was reported a long time ago. This is completely unfair to server owners such as myself, as it ruins the overall enjoyment people have. |
| Comment by wuhupoo [ 27/Jul/23 ] |
|
Very serious problem: when we want to travel far, the server soon crashes. |
| Comment by jtp10181 [ 04/Jul/23 ] |
|
@[Mojang] darknavi, @darknavi Do you still need a world to test this? It should really reproduce on any world, but I could upload one if you really need something. Where do I send it? It is ridiculous that this has been going on since August with basically no visible effort to fix it. I am just hosting a small server at home for 4-8 people, and it will tear through 4 GB of RAM every few days and crash. |
| Comment by Matthew Dietrich [ 22/Jun/23 ] |
|
Maybe all server owners should send Mojang/Microsoft a partial server cost bill for each month this goes unresolved; maybe then it will get the proper attention it deserves. |
| Comment by Tasel [ 21/Jun/23 ] |
|
I could upload the 77th's world file if you want, but it's every world we put on the server. We play pure vanilla, no add-ons, no resource packs. Any world that's up (both our old one and the brand new server we created the Saturday after the update) will constantly increase in RAM usage until it hits 100% and then crash. It seems to be related more to chunks loading. The server just never seems to let go of the RAM. |
| Comment by Copitoch [ 18/Jun/23 ] |
|
This is a very old problem; I remember it has existed almost since the release of the server. That's why I had to abandon the idea of using a vanilla server. I fully support the negative comments here and am waiting for the long-awaited fix. (translated) |
| Comment by glenb711 [ 17/Jun/23 ] |
|
You don't even need to go to the Nether. Just the act of teleporting around will cause the memory heap to grow and grow. This is all documented in the comment history for this issue. You can pick two locations, teleport back and forth between them, and watch memory utilization increase. The point is that memory is never freed: BDS keeps allocating new memory every time it needs to load a new chunk but it never frees up old memory. It's as if the server can't remember which chunks it has loaded in memory, and keeps reloading the same chunks into new memory allocations over and over again. And just to be clear, this happens even on brand new worlds, bare servers with no plugins. It has nothing to do with content and can be easily reproduced even on brand new worlds. Create new world, log in, teleport around, watch memory usage. That's it! |
| Comment by Coleton Watt [ 17/Jun/23 ] |
|
@darknavi I could send my server file, however it is really easy to replicate the problem just by going back and forth between the Nether and the Overworld. My hypothesis is that going back and forth causes the chunks in the Nether and Overworld to load again, meaning the server has multiple copies of the same chunks loaded in memory without realizing it. This might be caused by the server writing to storage but never freeing the memory afterwards, i.e. a skipped destructor. This is only worsened by having builds and content beyond the chunks that need to be loaded. |
| Comment by Luke [ 16/Jun/23 ] |
|
@darknavi That's what I thought at first as well, but after removing all content (I uninstalled BDS completely and reinstalled without changing anything, so it creates a brand new fresh world) the issue still remained, though the interval between the crashes seemed to be a bit longer. The issue is definitely not caused by content, but it may be aggravated by it. The mystery is why this doesn't seem to happen to every single person. |
| Comment by [Mojang] darknavi [ 16/Jun/23 ] |
|
It seems like this could be provoked by content. Does anyone here mind uploading their world/packs (probably just the entire server folder) so we can take a look at this? |
| Comment by Tasel [ 13/Jun/23 ] |
|
Confirming the issue is still active on BDS servers (ours is running on Nodecraft if that helps any), on the current 1.20 update. The issue did not seem to happen in our previously generated world, as everything had been explored and built before we restarted for 1.20. The new world crashes multiple times a day, even with the system rebooting twice a day. I have to keep a script running that checks every 5 minutes whether the server is down and starts it back up, to accommodate players when they all get kicked due to the RAM hitting 100% and taking the server down. If I watch, the RAM just slowly creeps up until it hits 100% and crashes. |
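A watchdog like the one described above can be sketched in a few lines. This is illustrative only, not the commenter's actual script: `start_server.sh` is a hypothetical launch script, and for simplicity the check looks for the `bedrock_server` process by name (via `pgrep`) rather than probing the game port.

```python
import subprocess
import time

PROCESS_NAME = "bedrock_server"    # default BDS binary name on Linux
START_CMD = ["./start_server.sh"]  # hypothetical start script for your host
CHECK_INTERVAL_S = 300             # every 5 minutes, as described above

def server_running(name: str = PROCESS_NAME) -> bool:
    """True if a process with exactly this name exists (pgrep exits 0 on a match)."""
    try:
        return subprocess.run(["pgrep", "-x", name],
                              capture_output=True).returncode == 0
    except FileNotFoundError:      # pgrep not installed on this host
        return False

def watchdog() -> None:
    """Restart the server whenever the crash (OOM kill) takes it down."""
    while True:
        if not server_running():
            print("server down, restarting")
            subprocess.Popen(START_CMD)
        time.sleep(CHECK_INTERVAL_S)
```

This only papers over the leak, of course; the process still has to be relaunched each time the RAM fills up.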
| Comment by 0ld guy [ 08/Jun/23 ] |
|
This is still a problem in the 1.20 update. I restart my server 2x a day and I've still had crashes due to excessive memory usage. The server has 8 GB of memory assigned to it; if it takes more than 8 GB of memory to run for 12 hours, that's ridiculous. Listen, I am willing to do whatever it takes: if I need to add a dev to the dashboard so they can watch the server, or if I need to provide downloads of the server files, I can do that. But the current state of the BDS is not acceptable. |
| Comment by Matthew Dietrich [ 02/Jun/23 ] |
|
Are we going to get any type of proper resolution for this? My server crashed 22 times in 4 days. This issue has gone on far too long; it needs to stop being ignored and made a priority. Many of us are spending hard-earned money on our servers, and these crashes are having a major impact on my community as a whole. |
| Comment by Pavel [ 14/May/23 ] |
|
I seem to be having the same issue, however I don't get any crash logs. Just a message that says "Killed" and nothing else. |
| Comment by Matthew Dietrich [ 10/May/23 ] |
| Comment by [Mojang] darknavi [ 10/May/23 ] |
|
Does someone mind uploading a more recent server log? Specifically I am looking for the bits at the end that contain the session ID. |
| Comment by Matthew Dietrich [ 08/May/23 ] |
|
THIS ISSUE IS STILL PRESENT IN 1.19.81!!!! |
| Comment by Luke [ 08/May/23 ] |
|
can anyone confirm if this happens in 1.19.81? |
| Comment by Ivo Burkart [ 16/Apr/23 ] |
|
1.19.73 affected as well |
| Comment by Scotty B [ 03/Apr/23 ] |
|
Our server is still having the issue. The workaround that has helped us is setting the server render distance to 16, and we do 4 restarts a day. Every once in a while the server starts to lag just before the restart, but it is manageable. If we increase render distance or reduce the number of restarts, the server will almost always crash within the restart window. |
| Comment by Luke [ 01/Apr/23 ] |
|
@Matthew Dietrich EXACTLY! As a temporary solution, I suggest moving your BDS server to a Realm. It has a 10-player limitation, but at least there are no crashes that I'm aware of. |
| Comment by Matthew Dietrich [ 01/Apr/23 ] |
|
It is absolutely absurd that things like this are shoved off to the side. People are spending their hard-earned money on their servers and y'all are just ignoring a problem with your software and your game. You people need to get off your lazy ***** and get this resolved, or I think you should have to reimburse every BDS owner part, if not all, of the money for their servers, since this has gotten to the point of sheer neglect by the dev team. |
| Comment by Luke [ 01/Apr/23 ] |
|
Come on Mojang! This has been happening for 8 MONTHS NOW! That is ALMOST A YEAR and it is still not fixed. My server crashes every 1-2 hours with 6-10 players online... If this is not fixed, all the work I put into my server is lost. And moderators: can't you do something, like bumping this issue up? I haven't seen a single comment from a dev in a long time since this issue was opened. |
| Comment by Matthew Dietrich [ 05/Mar/23 ] |
|
This has been happening to me as well with my server host; it crashed on its own 4 times in one day and has been crashing daily. This issue needs to be resolved, as it is having a major effect on my community and their ability to enjoy my server. |
| Comment by Coleton Watt [ 28/Feb/23 ] |
|
The problem continues to exist in the latest update, patch 1.19.63.01. Depending on activity, my self-hosted server crashes more than once a day. |
| Comment by Jack Richard [ 15/Feb/23 ] |
|
This is happening to me, as well. Restarting my server around four times every week or it'll crash on its own. Running latest bedrock server on Ubuntu. |
| Comment by Austin Farmer [ 02/Feb/23 ] |
|
Still happening on 1.19.51.01. I have to restart my server constantly to avoid filling up all the allocated RAM. |
| Comment by TJ Spann [ 15/Jan/23 ] |
|
Happening here as well. 1.19.50 |
| Comment by Paul Bramhall [ 12/Jan/23 ] |
|
I've also been experiencing this more as of late. I've been monitoring CPU/memory usage and can confirm the same pattern as others. I've done some tests on my own BDS server too, which caused it to crash within 5 minutes by simply teleporting between 2 previously generated areas at a 10-second interval. I'll attach the output from pidstat, the server log (with tp commands prefixed with the time at which they occurred), and server.properties. You can clearly see memory utilisation increasing at each teleportation event. |
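The teleport test described above lends itself to automation. A hedged sketch follows; the player name and coordinates are placeholders, not values from this report. It relies only on the fact that BDS accepts console commands on standard input, so a driver process can alternate `tp` commands while memory is watched externally with a tool like `pidstat`:

```python
import subprocess
import time

# Hypothetical values; match them to your own world and install.
SERVER_BIN = "./bedrock_server"
PLAYER = "SomePlayer"
SPOTS = [(0, 80, 0), (2000, 80, 2000)]  # two previously generated areas

def tp_command(player: str, spot) -> str:
    """Format a BDS console teleport command for one destination."""
    x, y, z = spot
    return f"tp {player} {x} {y} {z}"

def teleport_loop(cycles: int = 30, interval_s: float = 10.0) -> None:
    """Launch the server and bounce the player between the two spots.
    Each hop forces chunk loads; under this bug, RSS climbs on every hop."""
    proc = subprocess.Popen([SERVER_BIN], stdin=subprocess.PIPE, text=True)
    for i in range(cycles):
        proc.stdin.write(tp_command(PLAYER, SPOTS[i % len(SPOTS)]) + "\n")
        proc.stdin.flush()
        time.sleep(interval_s)
```

With a 10-second interval, 30 cycles covers the same 5-minute window in which the crash was observed.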
| Comment by Scotty B [ 11/Jan/23 ] |
|
We have dropped our server render distance from 32 to 16 and it has helped a lot, since each client is now asking the server to generate fewer chunks. We have client-side-chunk-generation-enabled set to false as well. |
| Comment by Scotty B [ 09/Jan/23 ] |
|
We are also seeing this on our newly created world running on Ubuntu 20.04 with 4 GB of RAM and bedrock server version 1.19.51.01. Restarting the server fixes the problem for a short while. As members move around the world, block lag is noticed after about 5 hours, and the server has crashed/rebooted almost daily now. We have implemented a 6-hour restart to help minimize the problem. |
| Comment by tyro89 [ 08/Jan/23 ] |
|
This issue is also impacting my recent deployment, fresh world, and fresh ubuntu server. To make things worse, this issue makes deploying bedrock on AWS with EBS-backed storage extremely problematic as the ever-increasing memory usage leads to swapping, which eats up the server's storage IOPS—rendering the whole instance unresponsive. I'm investigating setting hard memory limits, scheduling daily server restarts, and disabling swapping to support hosting a bedrock server. Which indeed is a bit of a bummer to have to deal with. Hopefully, the issue can be looked at soon, and I am happy to help in any way I can. |
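The three mitigations mentioned above (hard memory limits, scheduled restarts, disabling swap) can be expressed declaratively when BDS runs under systemd. The unit below is a sketch under stated assumptions, not official guidance: the unit name, paths, and the 3G ceiling are placeholders, and `MemorySwapMax=` requires the unified cgroup (v2) hierarchy.

```ini
# /etc/systemd/system/bedrock.service -- illustrative example only
[Unit]
Description=Bedrock Dedicated Server
After=network.target

[Service]
WorkingDirectory=/opt/bedrock
ExecStart=/opt/bedrock/bedrock_server
Restart=always
# Hard ceiling: the OOM killer targets only this service, not the whole host
MemoryMax=3G
# Keep the leak from spilling into swap and eating the instance's IOPS
MemorySwapMax=0
# Forced daily restart; with Restart=always the service relaunches itself
RuntimeMaxSec=86400

[Install]
WantedBy=multi-user.target
```

The combination bounds the damage: the leaking process is killed and relaunched before it can make the host unresponsive.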
| Comment by tct1997t.2 [ 06/Jan/23 ] |
|
This issue is ongoing as of 1.19.51. With a clean install of Ubuntu Server and only the Minecraft BDS running with two players, the memory usage started out at around 270 MB on server start, then slowly increased over around 4 hours to 2.4 GB. Both players logged out, and even after thirty minutes logged out, the memory usage was still at 2.4 GB. In my testing the memory usage increases at a faster rate when loading chunks, even chunks that were loaded very recently (riding a horse back and forth over a 500-block distance). However, the usage still continues to climb (at a lower rate) even when just standing there. Only restarting the server released the memory. |
| Comment by DNS Crypt [ 30/Dec/22 ] |
|
I've noticed performance take a huge hit recently on latest updates in 1.19.5 series, chunks slow to load when flying with elytra, game locks up sometimes (server side freezing), portals in and out are way slower to load (world gen). I'm assuming memory usage is higher than normal (checked mine and it's 4.8 GB) with just myself connected, but something is definitely not right. Running BDS on Windows, 1-3 players typically. |
| Comment by RicoXu [ 29/Dec/22 ] |
|
The problem still seems to exist in the latest 1.19.50 update. I am running the server with a new world on an 8 GB RAM Windows machine, but the bedrock server used about 6 GB of RAM after 3-4 players played for a week (I guess mainly because they travel long distances in game and as a result loaded a lot of chunks along the way). I have to manually stop and start the server to make the RAM usage go down again. |
| Comment by timmyc1983 [ 19/Nov/22 ] |
|
Have been having this exact issue since I think 1.19.40. Only way to get the RAM down is to restart the server.
|
| Comment by Foxy No-Tail [ 15/Nov/22 ] |
|
It also affects the other server that I run which resets twice per day |
| Comment by Foxy No-Tail [ 15/Nov/22 ] |
|
This affects 1.19.41 too. |
| Comment by PlodPlod [ 15/Nov/22 ] |
|
I can confirm this affects 1.19.41; in fact it seems significantly worse since we updated our servers to that version. |
| Comment by Omar Berrow [ 14/Oct/22 ] |
|
Now seems to affect Windows as well |
| Comment by NavDaz76 [ 06/Sep/22 ] |
|
We are having the exact same issue. We are running on a server hosted by Nodecraft. Originally set to 2 GB RAM, with 2-3 players online and only working within about 500 blocks of spawn, we would consistently get to 80-90% usage. We upgraded the RAM to 4 GB earlier, and with only 2 people on we were still able to have the server crash due to 100% RAM usage. We are running a primarily vanilla server with some resource and behaviour packs; however, these packs are the same ones that were on the server before the upgrade to 1.19.20, and we have only been experiencing these issues since 1.19.20. ::EDIT::
|
| Comment by PeonyGirl13 [ 29/Aug/22 ] |
|
Same here!!! Random crashes with 100% RAM usage and nobody online. |
| Comment by MaladjustedPlatypus [ 28/Aug/22 ] |
|
The use of the word "should" is with regards to someone trying to reproduce the bug, not that said behavior is what one would expect in normal gameplay. If you were to test for the bug, that is what you "should" be seeing. I'll change the word to avoid any more grammar semantics unrelated to the bug. |
| Comment by glenb711 [ 28/Aug/22 ] |
|
Just by way of follow-up: 14 hours later my completely idle server process released no memory and is still at 705.64MB. I note that the author of the other ticket wrote "When you load chunks, whether it's via crossing portals, flying, etc. the memory use should increase and not come back down." With no disrespect to anyone intended, and with deep apologies, I disagree with that "should". I think that condition is actually a part of the problem. I would suggest instead that when you load chunks [by any method] the memory use should increase, but when chunks are unused/un-accessed for a period of time, they should be written back to disk and released from memory. That is to say, "memory use should increase, and then should come back down after a period of inactivity for unused chunk(s)." If memory is never released, the process runs the risk of growing to infinity, which is the behavior we're seeing here. Proper memory management is essential to the health of the process. |
| Comment by glenb711 [ 28/Aug/22 ] |
|
My ticket was merged here, so I'm adding my comments here as requested: There appears to be a significant memory leak in BDS for Linux. The bedrock_server process continues to grow its memory utilization whenever any activity occurs, and appears to never release that memory. Utilization continues to increase until all available RAM on the host is exhausted, at which point either swap kicks in (resulting in host paging/thrashing), or the process is killed by the OOM killer in the kernel, which of course releases all the memory but terminates the BDS service. Either situation results in all players being forcibly disconnected, and could result in database corruption.

The problem appears to be related to loading chunks. It is exacerbated by teleporting. Teleporting to a distant location can trigger significant memory allocations (on the order of 10 MB per second), making this problem easier to duplicate. It also exposes an additional aspect of this bug, which I will describe below.

Steps to duplicate: I used an Azure server with 2 CPUs, 4 GB of RAM, and 32 GB of disk, but this has also been tested on larger server configurations with the same result. I used Ubuntu 20 LTS as directed, but I also tested this under OpenSuse 15.3, with the same result. Following the above steps, I was, within the space of 10 minutes, able to more than double the process' RAM allocation to above 700 MB. Clearly I could have continued teleporting back and forth until I brought the server down from memory exhaustion.

This highlights a number of points:

1. As C++ has no automated garbage collection mechanism, save for very limited scope-exit recoveries, it is necessary for the program to track and release its own memory. This clearly is not happening. An idle server with no players on it should detect this condition and release unused memory. A chunk which is no longer in use after a period of time should be written back to disk and similarly released from memory. Neither of these things is happening.

2. If you're building an in-memory copy of loaded portions of the database (as is clearly the case here), an in-core index or other similar data structure should be used to track which chunks are already loaded, and point functions back to them. This clearly is also not happening. Only two chunks (or areas) were being visited in my test: 0,0,0 (the spawn point, or near to it) and 2000,80,2000. Yet each visit to those same chunks, in either direction, caused additional RAM to be allocated by the process. Each teleport required, in my case, an additional 10-20 MB of RAM to complete. This suggests that multiple copies of the same chunk(s) were being maintained in RAM (clearly without the process realizing it), which amplifies the leak: not only is RAM not being released, but data is being duplicated in RAM, causing growth to expand at least geometrically. This may be related to why memory is not being freed: if the process isn't tracking which memory it has allocated, and loses the pointer to the allocated memory block(s), it CANNOT release them. That type of thing seems to be happening here.

3. This obviously exposes a potential for a denial-of-service-style attack against a BDS. If a world has either a malicious operator, or has set up (for example) command blocks enabling regular users to teleport, then repeated use of the teleport by players - either maliciously in quick succession, or simply over time if the server process continues to run - is guaranteed to speed up memory consumption and hasten the crashing of the server process itself. Again, this applies to any in-server action: the more action, the faster the RAM exhaustion appears to occur. Note here that this bug is NOT about teleporting itself: memory usage increases whenever new chunks are loaded via ANY method, and such memory appears to never be released. Teleporting simply speeds up the process and the visibility of the problem.

Even with teleporting disabled, BDS on Linux slowly grows in RAM size, and never releases any RAM, eventually exhausting available resources on the server and causing a crash of some kind. I've been off the test instance for an hour as I post this bug, and I firewalled it off so nobody could get in. It's still at 705.64MB - after being unused by anyone for an hour - and will continue that way until it's reset. |
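The in-core index described in glenb711's point 2 is exactly the structure that would prevent the duplication. BDS is closed-source C++, so the following is purely an illustration of the reported failure mode and the proposed fix, written in Python for brevity with stand-in chunk data: one loader allocates a fresh copy on every visit, the other keeps an index keyed by chunk coordinate and returns the copy already in memory.

```python
class LeakyLoader:
    """Mirrors the observed behavior: every visit allocates a fresh copy,
    and old copies are never dropped."""
    def __init__(self):
        self.loaded = []      # copies pile up here forever
        self.allocations = 0

    def load(self, coord):
        self.allocations += 1
        chunk = {"coord": coord, "blocks": [0] * 4096}  # stand-in chunk data
        self.loaded.append(chunk)
        return chunk

class IndexedLoader:
    """The proposed fix: an in-core index keyed by chunk coordinate, so
    revisiting a chunk reuses the copy that is already resident."""
    def __init__(self):
        self.index = {}
        self.allocations = 0

    def load(self, coord):
        if coord not in self.index:
            self.allocations += 1
            self.index[coord] = {"coord": coord, "blocks": [0] * 4096}
        return self.index[coord]

# Teleporting back and forth between two areas, as in the repro above:
trips = [(0, 0), (2000, 2000)] * 10
leaky, indexed = LeakyLoader(), IndexedLoader()
for coord in trips:
    leaky.load(coord)
    indexed.load(coord)
# leaky.allocations == 20 (one per hop); indexed.allocations == 2 (one per area)
```

A real fix would also need the eviction half of point 1 (write cold chunks back to disk and drop them from the index), but even the index alone turns the geometric growth into a bounded working set.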
| Comment by MaladjustedPlatypus [ 27/Aug/22 ] |
|
After further testing this seems to be a memory leak related to chunk loading. If you check the memory while standing still, not loading chunks, it should be stable. When you load chunks, whether it's via crossing portals, flying, etc. the memory use should increase and not come back down. Running farms while standing still did not affect the memory usage for us. Passive mob farms, redstone mechanisms, etc. did not contribute to the memory leak. Here's a log of our memory use up until the server crashed again. syrupy_20220827204110.ps.log
Edit: Tested with a fresh world on the server, issue still persists. |
| Comment by MaladjustedPlatypus [ 22/Aug/22 ] |
|
Sorry for the duplicate comment, can't seem to add another attachment via editing existing comments. Here's a screenshot of the server directory. If anything I'm seeing more files than what is normally present so I'll back up the server files, download a fresh install and try again. |
| Comment by MaladjustedPlatypus [ 22/Aug/22 ] |
|
I don't think this is related to |
| Comment by Maciej Piornik [ 22/Aug/22 ] |
|
Hi, were there any changes made to server.properties? Were any files deleted from the root BDS folder? Have you tried reinstalling BDS from a new download? This ticket will automatically reopen when you reply. |