Today I was using my VirtualBox VM to download some files, and that used up all the disk space on the physical hard drive where the VM was stored. This caused the VMs to halt with an error; basically they just got stuck and couldn't do anything.
"Ok, no problem" I thought, I'll just copy over the VM to my bigger hard drive and resume it. That was a dumb decision in hindsight.
The VirtualBox VM just got stuck doing something, not sure exactly what it was doing but I couldn't shut it down. In the end I used kill -9 to kill it, thinking "I've copied over all the files so it should be fine". But, that was incredibly dumb to do in hindsight.
The vbox files (XML) were somehow truncated. The .vbox-prev files were also truncated in exactly the same way. Oh shit.
Anyway, I did something very stupid, which was to try to start the VM from the previous vbox files. One of them did work, but I believe it ended up overwriting the "latest" state (contained in the "latest" snapshot, which was a VDI file). And I ended up losing multiple months' worth of data.
I also tried to create a new VM using the existing VDI disk image, but that disk image turned out to be super super old, like more than a year old, so that didn't work.
I think the problem is that I used the snapshot feature, which makes restoring VMs a huge pain in the ass. I made a snapshot when I was upgrading the OS to the next version, which I think was sensible at the time, but I didn't know that snapshots are really fragile and prone to causing data loss.
Anyway, so it turns out that all of the new data is stored in snapshots. There was a 62GB snapshot which I had overwritten because I started the VM from an old vbox file. But, luckily, there was another snapshot VDI file which was from the time the VM got stuck, so if I could recover from that then I could basically recover all of the data.
Anyway, recovering from snapshots was not easy.
Actually, first I tried to fix my vbox file by editing the XML, closing the tags and so on. That didn't work, although it gave me some idea of how the information about the disk images and snapshots was organized, where the files were, etc.
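For anyone in the same spot, here is a minimal sketch of reading the disk layout out of a truncated vbox file without repairing the XML first. It assumes the file lists each image as a HardDisk tag with uuid and location attributes, and that snapshot (child) images are nested inside their parent's tag. The function list_disks and both regexes are hypothetical helpers written for this post, not anything VirtualBox ships.

```python
import re

# Matches a complete opening/self-closing <HardDisk ...> tag, or a closing </HardDisk>.
# A tag that was cut off by truncation has no closing ">" and is simply skipped.
TAG_RE = re.compile(r'<HardDisk\b([^<>]*?)(/?)>|</HardDisk\s*>')
ATTR_RE = re.compile(r'([\w:]+)="([^"]*)"')


def list_disks(vbox_text):
    """Return (uuid, location, parent_uuid) for every complete <HardDisk> tag.

    Assumption: child images are nested inside their parent's element,
    so nesting depth tells us which image is the parent.
    """
    disks = []
    stack = []  # UUIDs of the <HardDisk> elements that are currently open
    for match in TAG_RE.finditer(vbox_text):
        if match.group(0).startswith("</"):
            if stack:
                stack.pop()
            continue
        attrs = dict(ATTR_RE.findall(match.group(1)))
        parent = stack[-1] if stack else None
        disks.append((attrs.get("uuid"), attrs.get("location"), parent))
        if match.group(2) != "/":  # not self-closing, so children may follow
            stack.append(attrs.get("uuid"))
    return disks
```

Calling list_disks(open("MyVM.vbox").read()) (the file name is made up) gives a flat list showing which image hangs off which parent, which is the information needed to pick the right snapshot file later on.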
Next, I googled around and read the VirtualBox forums. It turns out someone had encountered the truncated vbox file problem before, though the answers were not encouraging at all.
But anyway, I somehow came across a comment from someone who mentioned the vboxmanage clonehd command, which he said FLATTENS snapshots into a single VDI file! This was exactly what I needed! I just needed to run clonehd on the snapshot that I wanted, and it would (I presumed) automatically resolve all the snapshot's parent snapshots to produce a single VDI image of the entire state at that snapshot.
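As a rough sketch of what that clonehd call looks like when scripted, the helper below builds the command line and runs it. The function names and the example file names are made up for illustration. Newer VirtualBox releases name the same operation clonemedium, so treat the exact subcommand as something to check against your installed version.

```python
import subprocess


def clonehd_command(source, target, fmt="VDI"):
    """Build the argv for cloning a disk or snapshot image into a new file.

    source may be a path or a UUID of a registered image; target is the
    path of the new standalone image.
    """
    return ["VBoxManage", "clonehd", str(source), str(target), "--format", fmt]


def flatten_snapshot(source, target):
    """Run the clone; raises CalledProcessError if VBoxManage exits non-zero."""
    subprocess.run(clonehd_command(source, target), check=True)
```

For example, flatten_snapshot("Snapshots/{snapshot-uuid}.vdi", "recovered.vdi") would be the scripted equivalent of typing the command by hand. Run it against a copy of the snapshot files, never the only copy.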
Anyway, I tried it on the latest snapshot and, as I expected, because I had overwritten the "latest" state by starting from an old vbox file, the resulting VDI image had stale data from around 18 days earlier, so I lost almost 3 weeks' worth of data.
More importantly, that was the time I upgraded the OS version!! I remember upgrading the OS from Ubuntu 21.10 to 22.04 was a huge pain in the ass because all the sources were expired and I had to edit a whole bunch of config files, install packages, modify sources.list, etc. I can't remember exactly, but it was a huge pain and I had to google a lot of stuff. Now I would have to go through all that trouble again, plus I had lost 18 days' worth of browsing history, which was very painful for me, so I was determined to do everything within my power to get that data back.
Anyway, I saw a snapshot file that was last modified at around the time the VM got stuck rather than later (which was when I stupidly overwrote the "latest" state by opening an old vbox file), and I decided to try my luck "flattening" that snapshot into a VDI disk file that I could import into a new VM.
However, when I tried to do that, it gave me an error:
VBoxManage: error: Parent medium with UUID {} of the medium is not found in the media registry (.config/VirtualBox/VirtualBox.xml)
I tried to get around this error by copying and pasting from the truncated vbox file into the media registry file but it didn't work. I think you're not supposed to directly edit the VirtualBox.xml file and if you do it won't work...
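Rather than editing the registry by hand, it may help to first ask VirtualBox which parent each image expects. The sketch below wraps VBoxManage showhdinfo and splits its output into key/value pairs. It is an assumption on my part that the output is a list of "Key: value" lines and that one of them is labeled "Parent UUID", so check this against what your version actually prints. The helper names are hypothetical.

```python
import subprocess


def showhdinfo_command(image):
    """Build the argv that asks VBoxManage to describe one disk image."""
    return ["VBoxManage", "showhdinfo", str(image)]


def parse_key_values(output):
    """Turn 'Key: value' lines into a dict; lines without a colon are ignored."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def image_info(image):
    """Run showhdinfo on an image and return its fields as a dict."""
    result = subprocess.run(
        showhdinfo_command(image), capture_output=True, text=True, check=True
    )
    return parse_key_values(result.stdout)
```

With that, image_info("Snapshots/{snapshot-uuid}.vdi").get("Parent UUID") would show which parent image has to be known to VirtualBox before the clone can succeed, assuming the field name matches.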
Well, how did I get into this mess? Oh right, by opening the old vbox file instead of the latest. No, that wasn't the actual error.
My real error was not making an additional copy of the VM directory before accessing the files.
When VirtualBox opens the vbox file, of course it modifies the existing VDI files: the snapshots, the hard disk image, etc.
I should have made an additional copy of all the files before doing anything with them.
I was too careless - I got complacent even though trying to move VirtualBox files had burned me really badly before.
ALWAYS ALWAYS make an additional copy of the copied-over VM files before you try to open them.
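A minimal sketch of that "copy first" step is below. It copies the whole VM directory and then compares SHA-256 checksums so you know the backup is byte-for-byte identical before you open anything. The function names and directory names are made up for illustration, and it assumes the VM is fully shut down while the copy runs.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte VDI files don't fill up RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_vm_dir(vm_dir, backup_dir):
    """Copy vm_dir to backup_dir and return relative paths whose hashes differ.

    An empty list means every file in the backup matches the original.
    copytree refuses to overwrite an existing backup_dir, which is intended.
    """
    vm_dir, backup_dir = Path(vm_dir), Path(backup_dir)
    shutil.copytree(vm_dir, backup_dir)
    mismatched = []
    for src in sorted(vm_dir.rglob("*")):
        if src.is_file():
            rel = src.relative_to(vm_dir)
            if sha256_of(src) != sha256_of(backup_dir / rel):
                mismatched.append(str(rel))
    return mismatched
```

Something like backup_vm_dir("MyVM", "MyVM-untouched") before the first launch means that whatever VirtualBox overwrites, the untouched copy is still there to try again from.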
Anyway that was the hard lesson learned.
But let's go back to my very lucky, miraculous recovery of the VM.
So how did I manage to resolve the error in the end? I'm not 100% sure. I tried to create a new VM using the parent snapshot of the snapshot that I was trying to clone, but basically nothing happened. But I also used the VBoxManage tool to create an image from that parent snapshot, and after doing that, I was able to run vboxmanage clonehd on the snapshot that I couldn't clone before. I still don't know exactly how I fixed it, but this time it worked, and the cloned (flattened) VDI image had the latest data in it.
So, that was lucky. Although it took me a few hours, I managed to recover all of the data in the end.
Lesson learned: NEVER EVER COPY A VM WHILE IT IS STILL RUNNING. IT WILL 100% BE CORRUPTED.
AND ALWAYS ALWAYS make an additional copy of the copied-over VM files before you try to open them.
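For the first lesson, a small guard like the one sketched below can refuse to copy while VirtualBox still considers the VM running. It relies on VBoxManage list runningvms printing one line per VM in the form "name" {uuid}. The helper names are hypothetical, and whether a stuck or paused VM shows up in that list is something I have not verified.

```python
import re
import subprocess

# One line of `VBoxManage list runningvms` output: "VM name" {uuid}
RUNNING_LINE = re.compile(r'^"(.*)" \{([0-9a-fA-F-]+)\}$')


def parse_running_vms(output):
    """Return (name, uuid) pairs parsed from `list runningvms` output."""
    vms = []
    for line in output.splitlines():
        match = RUNNING_LINE.match(line.strip())
        if match:
            vms.append((match.group(1), match.group(2)))
    return vms


def is_vm_running(name):
    """Ask VBoxManage whether a VM with this exact name is currently running."""
    result = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True, text=True, check=True,
    )
    return any(vm_name == name for vm_name, _ in parse_running_vms(result.stdout))
```

A copy script could call is_vm_running("MyVM") (the name is made up) and bail out if it returns True, instead of trusting that the VM "looks" stopped.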