I have an older HP ProBook laptop running Linux Mint Debian Edition (LMDE), normally without any issues. The system starts as expected after powering on. But whenever I invoke a restart, e.g. after installing kernel updates, the system hangs with a “Boot device not found” message.
What could be the origin of this issue? How can I investigate it?
I have to confess that I made some hardware changes to this system.
The original HDD was replaced by a 1TB SSD.
The RAM size was increased to 8 GB.
The SSD cannot be the issue, because the behaviour was the same before, with the original HDD in use.
The CMOS battery is my usual first stop on these issues. Boot into the BIOS (F2, F10, or Esc; it could be any of these) and check that the date and time are correct; that is usually the first point. Depending on the make and model, some CMOS batteries are easy to change (CR2032), but others are soldered onto the motherboard or connected via a short cable and wrapped in a package.
The next step would be to check the startup.
When you boot into LMDE you get a brief screen telling you which key to press to reach the advanced menu. It is usually shown for about 30 seconds, and the key is usually Esc.
This should take you to a short menu where you can run fsck (file system check).
Try running that option.
It will ask you to confirm or fix errors; select yes to these.
There are a couple of other options on this menu. Sorry, I'm not at my desk to explain what each one does; I'm doing this from memory.
You can select any of them, as they all help with boot issues.
Thank you, @callpaul.eu. I’ve checked all your suggestions in advance, with no success. The system time is kept, the FS of the boot drive is intact; otherwise the cold boot would fail as well, I assume.
I booted from a live system to check the FS. It is fine.
The issue has survived several (>5) kernel updates.
I always keep my systems up to date, at least with the security updates.
EDIT:
It’s not LMDE related. I had Archlinux installed earlier, to try something. Same.
That would suggest more of a hardware related issue.
What are your BIOS settings related to Secure Boot?
Something sticks in my mind about HP laptops and a similar issue. I have not been able to find my notes on why; I just remember that a BIOS Secure Boot setting was involved.
Paul’s on the right track. Secure boot should definitely be disabled, but check also boot devices and boot orders to make sure they’re accurate. I have a Toshiba laptop that required the boot order be edited to put USB first so I could install Devuan Peppermint after I had flushed Windows.
I had to install Debian on an HP micro-machine; I don't know what it is really called.
It had an 8th-gen CPU and looked like an Intel NUC. It was intended to be a headless server for a friend.
That one had serious problems with rebooting: when I initiated a reboot from the command line, it shut down the system but never came back up. It got stuck at some point, and I saw only a blinking cursor on a black screen.
After I updated the BIOS (UEFI), everything worked like a charm.
Maybe a BIOS update will solve your problem too.
I can remember that there was a difference between ‘cold boot’ and ‘soft boot’ on old DOS and CP/M systems. Cold boot does a POST and reinitializes all the hardware, and clears memory. There were situations where firmware loaded into hardware became corrupted while running and required a cold boot to recover.
I don't know if this is relevant to your computer today, but it might be a place to start looking.
“Boot device not found” has an obvious meaning: the BIOS could not find the disk. So the disk did not respond to a probe, which suggests the disk controller's firmware may not have been in the correct state. What could corrupt it? Maybe malware, maybe the wrong driver.
Other thoughts:
It would not hurt to run a memory check (e.g. memtest86+).
Check your logs after a reboot attempt.
Is there an option in your BIOS to set the disk detection delay? Make it larger; the disk may not be ready when the BIOS tries to probe it.
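One way to follow the "check your logs" suggestion: after the failed restart and the next cold boot, read the journal of the previous boot and keep only the lines about the shutdown sequence or the SATA disk. This is only a rough Python sketch. It assumes systemd's `journalctl` is available and that the journal is stored persistently (otherwise `-b -1` has nothing to show). The pattern list and the function names are my own choices for illustration.

```python
import re
import subprocess

# Assumed patterns: SATA ports (ata1, ata2, ...), SCSI disks (sda, sdb, ...)
# and words that usually appear in shutdown-related messages.
PATTERNS = re.compile(r"\bata\d+|\bsd[a-z]\b|shutdown|reboot|unmount", re.IGNORECASE)


def filter_lines(lines):
    """Return only the lines that match one of the patterns."""
    return [line for line in lines if PATTERNS.search(line)]


def previous_boot_log():
    """Read the journal of the previous boot (-b -1) as a list of lines."""
    result = subprocess.run(
        ["journalctl", "-b", "-1", "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.splitlines()


# Demonstration on made-up sample lines; on the real machine you would
# call filter_lines(previous_boot_log()) instead.
sample = [
    "kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "systemd[1]: Started Daily apt download activities.",
    "systemd-shutdown[1]: Syncing filesystems and block devices.",
]
for line in filter_lines(sample):
    print(line)
```

The last lines of the previous boot are the interesting ones: they show how far the shutdown got before control went back to the firmware.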
The reboot command seems to disconnect or turn off the drive. In fact, it is no longer listed as a selectable boot device. This isn't a timing issue; the drive remains in its nonexistent state.
What reboot should do is sync the disks, then unmount the filesystems, before initiating the boot.
The disk isn't seen by the BIOS, hence it is not listed as a boot device.
I wonder if the sync is hanging or the unmount fails?
Is it any different if you manually sync and unmount all filesystems before doing the reboot? Hang on, that may not be sensible: I don't think you can do umount /
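The manual sync part can at least be tried without unmounting anything. Here is a small Python sketch, assuming a Linux system with `/proc/meminfo`: it flushes the write buffers with `os.sync()` and then reports how much data the kernel still counts as `Dirty` or under `Writeback`. The function names are made up for illustration, and this only tells you whether the flush completed, not what the firmware does afterwards.

```python
import os


def pending_writeback_kb(meminfo_text):
    """Sum the Dirty and Writeback values (in kB) from /proc/meminfo text."""
    total = 0
    for line in meminfo_text.splitlines():
        field, _, rest = line.partition(":")
        if field in ("Dirty", "Writeback"):
            total += int(rest.split()[0])
    return total


def sync_and_report():
    """Flush buffers to disk, then return the remaining dirty data in kB."""
    os.sync()
    with open("/proc/meminfo") as handle:
        return pending_writeback_kb(handle.read())


# Demonstration on made-up sample text; on the real machine call
# sync_and_report() right before issuing the reboot.
sample = "MemTotal: 8000000 kB\nDirty: 120 kB\nWriteback: 8 kB\nWritebackTmp: 4 kB\n"
print(pending_writeback_kb(sample), "kB still waiting to be written")
```

If the value is at or near zero right before the reboot and the drive still vanishes, a hanging sync becomes a less likely explanation.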
How do you know it is not a timing issue?
The disk has a lot more time to initialize during a cold boot, because the POST takes time. In a reboot you go straight into the BIOS… the disk may not be ready to receive the probe from the BIOS, so the BIOS will not detect it.
Final left-field suggestion:
Try configuring the drive as hot-pluggable in the BIOS.
That will change the way the drive is detected.
But “Boot device not found” seems to me to be a BIOS message too, exactly as @nevj pointed out. So I think the problem arises before the system boots (well, for the second time).
Disk /dev/sda: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: Samsung SSD 870
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x5fcec6ec
Device     Boot      Start        End    Sectors   Size Id Type
/dev/sda1             3906   16800781   16796876     8G 82 Linux swap / Solaris
/dev/sda2  *      16801792   83910655   67108864    32G 83 Linux
/dev/sda3         83910656 1886412799 1802502144 859.5G 83 Linux
/dev/sda4       1886412800 1953521663   67108864    32G 83 Linux
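The numbers in this listing can be cross-checked with a little arithmetic. The Python sketch below uses the values copied from the fdisk output and tests whether each partition's sector count matches its start and end, whether partitions overlap or run past the end of the disk, and whether everything fits into the 32-bit sector fields that a "dos" (MBR) partition table uses. The function names are my own, and a clean result only says the table is self-consistent, nothing about the drive's health.

```python
SECTOR_SIZE = 512  # bytes, from "Units: sectors of 1 * 512 = 512 bytes"
DISK_SECTORS = 1953525168

# (device, start, end, sectors) copied from the fdisk listing
PARTITIONS = [
    ("/dev/sda1", 3906, 16800781, 16796876),
    ("/dev/sda2", 16801792, 83910655, 67108864),
    ("/dev/sda3", 83910656, 1886412799, 1802502144),
    ("/dev/sda4", 1886412800, 1953521663, 67108864),
]


def check_table(partitions, disk_sectors):
    """Return a list of human-readable problems found in the table."""
    problems = []
    previous_end = -1
    for name, start, end, sectors in partitions:
        if end - start + 1 != sectors:
            problems.append(f"{name}: sector count does not match start/end")
        if start <= previous_end:
            problems.append(f"{name}: overlaps the previous partition")
        if end >= disk_sectors:
            problems.append(f"{name}: extends past the end of the disk")
        # MBR ("dos") entries store start and length as 32-bit sector numbers
        if start >= 2**32 or sectors >= 2**32:
            problems.append(f"{name}: does not fit into a 32-bit MBR entry")
        previous_end = end
    return problems


def size_gib(sectors):
    """Convert a sector count to GiB."""
    return sectors * SECTOR_SIZE / 2**30


print(check_table(PARTITIONS, DISK_SECTORS))  # an empty list means no problems found
for name, start, end, sectors in PARTITIONS:
    print(f"{name}: {size_gib(sectors):.1f} GiB")
```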
I’m sure that the disk is fully OK.
It’s more of an assumption. If I increase the time for which the very first BIOS screen (Hit ESC…) is shown, the drive should have enough time to get ready/online, but it doesn’t.
If I reboot into a live system, the drive becomes ready there, even if it wasn’t at the beginning.
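To put a number on "becomes ready there", the kernel log of the live system can be scanned for the moment the SATA link and the disk show up. Below is a rough Python sketch that parses `dmesg`-style lines with their seconds-since-boot timestamps. The marker strings are what such kernel messages typically look like, but treat them as an assumption and adjust them to what your own `dmesg` prints.

```python
import re

# Matches kernel log lines such as "[    1.234567] ata1: SATA link up ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# Message fragments that mark the disk appearing or missing (assumed wording)
MARKERS = ("SATA link up", "SATA link down", "Attached SCSI disk")


def disk_events(dmesg_text):
    """Return (seconds_since_boot, message) for lines that contain a marker."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if match and any(marker in match.group(2) for marker in MARKERS):
            events.append((float(match.group(1)), match.group(2)))
    return events


# Made-up sample; on the live system, feed in the output of `dmesg` instead.
sample = """\
[    0.000000] Linux version 6.1.0 (example)
[    1.812345] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    1.934567] sd 0:0:0:0: [sda] Attached SCSI disk
"""
for seconds, message in disk_events(sample):
    print(f"{seconds:8.3f} s  {message}")
```

Comparing these timestamps between a cold boot and a warm reboot into the live system would show whether the drive really comes up later in the second case.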
Linux Mint offers a system information tool, usually at the bottom right of the screen; it looks like a clipboard. Can you check that and the info it offers?
I notice that the disk is showing as msdos!
Is it worth formatting to a Linux type (ext4) and reinstalling the system?
I have had issues with drives in the wrong format showing strange errors similar to this, related to speed. In principle Linux works with DOS drives, but after a certain size it goes strange; experience says around 1 TB, though that is not logical.
There should be a setting in the BIOS.
You need to change something, just to see if we can get some more information.
Reboot is very difficult to debug. You can't use strace, and you can't try to replicate it stepwise. All you have is the logs (look at dmesg) or changing something in the BIOS.