Tale of a Broken SSD

Broken SSD when the power cut out

Update: Further to the ongoing power outage, when the PC was moved to another room, not a thought was given to which mains power circuit it would end up on.

Unknowingly, the PC was put on the same mains power circuit as a washing machine, a clothes dryer, and a wall heater. With all of those devices switched on, which did happen, the inevitable happened: a power overload.

As usual, I’m tidying up the mess I made, this time the ad-hoc arrangement of the multi-port power plugs. Their placement was poor: on top of my PC tower case.

Looking back at the power failure that caused the OS boot disk to fail completely (luckily all the VM disks were intact), I’m sure the placement was not at fault. All that was clear was that the PC power LED was off and pressing the power button did nothing. Turning my head to look at the mains power distribution board, where the breakers are, I noted that one had tripped. Oh dear. “Oh dear indeed”, I screamed inside.

After the breaker was reset and the PC powered on: “No boot Disk”. Fuck. There was no actual backup of the SATA SSD OS boot disk partitions, or even of the /etc directory of the Debian Linux install.
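In hindsight, even a small scheduled archive of /etc would have saved a lot of guesswork. Below is a minimal sketch of what that could look like; the function name backup_directory and the destination path are my own illustrations and not part of any Proxmox tooling, and reading all of /etc would need root.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source, dest_dir):
    """Write a timestamped .tar.gz of `source` into `dest_dir` and return its path.

    Hypothetical helper for illustration only.
    """
    src = Path(source)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Timestamp in the file name so older archives are not overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{src.name}-{stamp}.tar.gz"

    # arcname keeps paths inside the archive relative, e.g. "etc/hostname".
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)
    return archive
```

Calling backup_directory("/etc", "/mnt/hdd/backups") from a cron job would be one way to use it, where /mnt/hdd/backups is an assumed mount point on a disk other than the boot disk.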

What did have backups were the LXCs and some of the VMs, stored on an HDD. Note, though, that the VM disks are on the NVMe SSDs and were fine, as was the SATA HDD. With some luck we can bring this Proxmox VE hypervisor back to life and get the infrastructure working again.

After replacing the SATA SSD and installing Proxmox VE (essentially Debian Linux), the Web UI showed the LVM groups and volumes on the disks without any assistance. Great, that meant the VM disks were listed (but not those of the LXCs). Now, by creating a new VM with the same ID, one can attach the existing virtual disk and configure it. The actual process per VM wasn’t too hard using the Web UI and some CLI help, namely pct rescan for the Linux Containers (LXC) and qm rescan for the VMs.

As noted, the LXCs were not listed in the LVM groups. That was okay because we had backups, job well done. I found out later that creating a new LXC with the same ID and then running the CLI command pct rescan may have worked too.
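The rescan step can be sketched as follows. This only builds the command lines and leaves running them to the caller; the helper name rescan_command and the IDs in the usage note are made up for illustration, while --vmid and --dryrun are, as far as I know, the options both qm rescan and pct rescan accept. It assumes a VM or container with the same ID has already been recreated.

```python
import subprocess


def rescan_command(kind, vmid, dry_run=True):
    """Build a rescan command for a VM ("qm") or a container ("pct").

    Hypothetical helper: it returns the argument list and runs nothing.
    """
    if kind not in ("qm", "pct"):
        raise ValueError("kind must be 'qm' (VM) or 'pct' (LXC)")

    cmd = [kind, "rescan", "--vmid", str(vmid)]
    if dry_run:
        # Ask the tool to report what it would do without writing the config.
        cmd += ["--dryrun", "1"]
    return cmd


def run_rescan(kind, vmid, dry_run=True):
    """Run the command on the Proxmox VE host itself (needs root)."""
    return subprocess.run(rescan_command(kind, vmid, dry_run), check=True)
```

For example, run_rescan("qm", 100) would do a dry run for a VM with the assumed ID 100. Once that looks right, run_rescan("qm", 100, dry_run=False) should make the existing volumes show up as unused disks on the recreated VM, ready to be attached in the Web UI.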

With a little bit of luck, soon after I realised the PC was down, the laptop was still running with the Proxmox VE Web UI open, so I took a quick screen capture.

Having had this system running for some time, I’d noted most of the changes, but forgot a few too. One example is the use of Open vSwitch, which makes configuring VM/LXC VLANs much simpler by using one bridge for multiple VLANs.