it was nice to see someone was able to get back to you so quickly. hopefully a re-install does the trick and you don’t end up losing the drive. good luck.
!!! Error is still there !!!
I deleted my partition and reinstalled Ubuntu 20.04, and after about half an hour of use the same error showed up … my system is dual boot, and when I am using Windows there is no error, it works fine. Maybe there are some issues with the new Ubuntu only…
have you tried running any tests on the disk? i believe windows has one called crystaldisk that does pretty much the same as gsmartcontrol does on linux. if you wanted to try gsmartcontrol, you could probably do so from a live usb. i have never run it on an nvme drive, but it seems to read my ssds just fine so i figure it might be an option.
or trying a different distro? to see if the problem is indeed ubuntu specific.
Yeah, I tried some file system commands like fsck, but all looks fine. OK, thanks for sharing, I will check my system again and let you know… There are also some logs and one crash file on my system (_usr_bin_gnome-shell.1000.crash).
fsck checks your file system, which is helpful if you think ubuntu is the problem. something like gsmartcontrol will check the physical drive to make sure it isn’t the problem. my understanding from reading what was posted on your bug report was that the person who looked at your screenshot with the errors believed it was a disk problem.
i understand you are saying that windows isn’t showing the same problems, but it is written to a different physical place on the disk than where you have written ubuntu.
These are my partitions, and I am using this INTEL SSDPEKKF256G7H 256 GB drive.
Two ext4 partitions where I install my ubuntu system.
exactly. since they are on a distinctly different part of your disk from windows, it is entirely possible that windows could run without error while you are having so much trouble with ubuntu. a test like gsmartcontrol would help you possibly come to a better understanding if that part of your system was sound or not.
Even though they are written to different areas of the ‘disc’, doesn’t an SSD write to any unused portion rather than in a circle as on a ‘spinning’ drive?
I thought SSD didn’t overwrite stuff ‘all the time’ in same area but sequentially used all available space so drive lasts longer (where an old style spinning disc would start at middle and work outwards.) This should mean any bad sections will automatically be skipped? A bad area may be small but may show up as a smaller capacity drive?
Being paranoid, I think it’s NSA plot ;o)
my understanding is that within its own partition, an os will do just as you have described. i don’t use windows often, but the last time i looked my ext4 partition wasn’t listed in the disk management program. i think i read once that there were some windows tools to write or read some linux file systems, but i don’t believe it does so natively. the idea of a partition table is that addresses 1 through 50 are set aside for os1 and 51 through whatever for os2 etc.
if my second partition is ext4 and a third is as well, that second may run fstrim on the third if it is mounted at the time and have some interaction like you suggest but i doubt that there is anything like that going between windows and a foreign (ie not ntfs, fat or exfat) file system.
that is one of the things the smart tools look at. how many bad spots are there and how many maybe-bad spots are pending? with enough bad spots, or with necessary file system info written to maybe-bad spots that have yet to be recovered, the os can’t do what it needs to operate properly.
Yes. That is pretty much the case.
Concluding from everything said here, it is unfortunately rather likely that the storage medium has a hardware fault.
The more random the errors are, the more likely it is a hardware fault. Especially with random file corruptions all over the place, without reason. Usually software related issues like that get resolved at least by a full clean upgrade or reinstallation.
You probably need to replace your current storage medium with a newer, working one. If this one is still under warranty, you should make use of that.
You could try using the badblocks utility, however I am not sure how effective it is for solid state drives.
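For reference, a read-only badblocks pass might look like the sketch below. It assumes the badblocks tool (part of e2fsprogs) is installed; scan_readonly is a made-up helper name, and the device path in the usage comment is a placeholder. Without -w or -n, badblocks only reads from the device, but as said above it is unclear how much a clean result tells you about an SSD.

```shell
# scan_readonly is a hypothetical helper, not part of badblocks.
# It refuses anything that is not a block device, then runs a
# read-only scan (-s shows progress, -v is verbose).
scan_readonly() {
    dev="${1:-}"
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    sudo badblocks -sv "$dev"
}

# Example (replace the placeholder with your own disk name):
#   scan_readonly /dev/nvmeX
```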
I just read through the bug thread on launchpad and discovered, that the guy says pretty much the same as I do:
a disclaimer to start: neither gsmartcontrol nor smartctl will end up telling you that your drive is broken beyond repair. both can only run a couple of tests and show you any errors if they exist. the first photo you posted shows journalctl unable to write because your system was set to read-only, which only happens when the file system is healthy enough to not just shut down, but only just, since your system locks up and becomes unusable.
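To confirm whether the root file system really has been remounted read-only, one option is to look at its mount options. This is a minimal sketch, assuming findmnt from util-linux is available; mount_is_readonly is a hypothetical helper name. It matches ro as a whole option, so an option such as errors=remount-ro on a still-writable mount is not mistaken for it.

```shell
# mount_is_readonly is a hypothetical helper.
# $1 is a comma-separated mount option string, e.g. "rw,relatime".
# Returns 0 if "ro" appears as a complete option, 1 otherwise.
mount_is_readonly() {
    case ",${1:-}," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Example on a running system:
#   if mount_is_readonly "$(findmnt -no OPTIONS /)"; then
#       echo "root file system is mounted read-only"
#   fi
```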
if you believe the problem might be ubuntu, you might consider trying a different distro. neither fedora nor opensuse is based on ubuntu (or on debian, for that matter). if you want to try something familiar but upstream from ubuntu, there is always debian. it does look from here like the disk is not healthy though.
that being said, you can sometimes get gsmartcontrol to read a disk by going into the Device menu and selecting Add Device. you will need to know your device name (/dev/nvmeX) which you should be able to get from lsblk. you need the name of the disk (the first line), not the partition on the line below. for example, mine is /dev/sdb not /dev/sdb1. you put that in the “Device name:” field and add -d nvme to the “Smartctl parameters:” field. if that works, it would be helpful to look at the error log but it would be best to run both the short and long tests available under the Self-test tab. if the short test fails, the long test will most likely as well.
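If the lsblk output is hard to read, it can be narrowed to whole disks only. The sketch below assumes lsblk from util-linux; disks_only is a hypothetical helper name. It reads "NAME TYPE" lines and prints a /dev path for each entry whose type is disk, which drops partitions, loop devices and optical drives.

```shell
# disks_only is a hypothetical helper.
# Input: lines of "NAME TYPE" as printed by lsblk -n -o NAME,TYPE.
# Output: one /dev/<name> path per whole disk.
disks_only() {
    awk '$2 == "disk" { print "/dev/" $1 }'
}

# Example on a running system (-d skips partitions, -n skips the header):
#   lsblk -dn -o NAME,TYPE | disks_only
```

The path it prints is what goes into the “Device name:” field, or in place of /dev/nvmeX on the command line.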
if gsmartcontrol still refuses, you may be able to run smartctl on the drive from the command line with:
sudo smartctl -a /dev/nvmeX
again, you will need to get the actual device name to replace nvmeX. if that doesn’t work, you can try with:
sudo smartctl -a /dev/nvmeX -d nvme
if either of those works, the goal is to run both short and long tests with -t short and -t long. that command would look something like:
sudo smartctl -t short /dev/nvmeX
the output will tell you how long the test will take. you would then need to run smartctl -a again to see the result. most short tests take about 2 minutes. for the long test, use:
sudo smartctl -t long /dev/nvmeX
those vary depending on the size and type of disk.
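The test steps can be wrapped up roughly as below. This is only a sketch: run_smart_test is a hypothetical helper name and /dev/nvmeX is still a placeholder. Whether -t works at all on an nvme drive depends on the smartmontools version and the drive itself, so a refusal here is inconclusive rather than a sign of a bad disk.

```shell
# run_smart_test is a hypothetical helper around smartctl.
# $1: device path, $2: test type ("short" or "long", default "short").
run_smart_test() {
    dev="${1:-}"
    kind="${2:-short}"
    if [ -z "$dev" ]; then
        echo "usage: run_smart_test DEVICE [short|long]" >&2
        return 2
    fi
    case "$kind" in
        short|long) ;;
        *) echo "unknown test type: $kind" >&2; return 2 ;;
    esac
    # Starts the self-test; smartctl prints the estimated duration.
    sudo smartctl -t "$kind" "$dev"
}

# Example (replace the placeholder with your own disk name):
#   run_smart_test /dev/nvmeX short
#   # wait for the time smartctl reported, then read the result:
#   sudo smartctl -a /dev/nvmeX
```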
i will repeat part of my disclaimer and add to what @Akito and the person on the bug report said: gsmartcontrol can only show you some numbers. at the end of all of this you still need to decide if you trust this disk with your information and time. the part about the warranty is a good thought and if the disk fails smart tests that may make it easier to file a claim.
SSDs can fail in strange ways. I forgot most of the details, but I had a 240 GB SSD with Windows on it. Windows claimed there were errors on it and wanted to scan the disk at boot time. Scan an SSD for read / write errors? Crystal Disk showed no errors and said it was 98% good. Long story short, I replaced the SSD with a new one and had no more disk problems. BTW, the price of a 240 GB SSD has come way down. 30 US at Best Buy.
I replaced my SSD with a Samsung Evo Plus, but some errors are still there, and now I am using only Ubuntu (no dual boot system).