Partitions lost without warning

It has happened again and I am hugely frustrated.
I went to /dev/sda to find that the partition table is lost. Again!
I strongly suspect there is something wrong with the wiring to that drive (whether mainboard or cabling). So I will spell out the key plot points in this tragedy (comedy?) and ask for insights.

For three years, I ran both Ubuntu and Win7 from another HDD at /dev/sda. Then it failed, and badblocks revealed lots of bad sectors, all at the beginning of the drive.

I copied both systems to a clean drive at /dev/sdb, and things have been running problem-free.
I replaced the bad drive, created 3 partitions on the new drive (sda1 in NTFS for various documents, pictures, pdfs, etc; sda2 in EXT4 for one distro; sda3 in EXT2 for another distro), which worked well for a month.
Three weeks ago, I decided to look into switching over to Mint, which I have been running off a USB pendrive. To prepare, I deleted the two Linux partitions, reformatted as a single EXT2 partition, then installed Mint from the USB.
Since then, I rebooted into that Mint installation twice, opened the NTFS directory a few times searching for documents, and ran DISKS to make sure I had the correct paths. All seemed well, even as late as one week ago.
The last time I looked, four days ago, the entire drive was unallocated.
GParted could find no file systems, testdisk found a few but none that could be recovered, and a data recovery program for Windows (RS-Studio) found some of my old files, but with automatic naming. In other words, all data was lost. I think I may have made one of those mistakes that @Akito warned @meetdilip about. C’est la vie.
I ran badblocks. All is clean. I ran SMART tools. All is good.

So the only programs I ran that even touched that drive during the time of concern were mount, DISKS and Nautilus Files.

I know that the first response will suggest to me that I open a log. I dread the very thought for these reasons:

  • I have been doing a lot on this machine, at least 10 hours every day, so the logs are gonna be long. What string can I search for?
  • Even after opening a log, I have no confidence I can interpret the information usefully. And that is a lot of information.
  • If it is a hardware/electricity issue (as I mentioned at the top), a log will probably not say anything that guides me, right?

Here are the ideas I had for figuring this out.

  1. Take out the hard drive, use a SATA–USB adapter, plug into the USB, and try running the recovery programs.
  2. Stress-test the drive to try to recreate the problem. I am thinking of creating three different Linux partitions, install a different distro in each, then boot and update each one every day.
  3. Just give up, leave sda disconnected and live with Ubuntu on sdb1.

Anything else?


Think you are doing everything right at this time, Cliff. If your ideas fail to produce results, then perhaps other things might be worth trying. Until then, I personally feel other suggestions might stir the pot too much and lead to greater confusion. I would start with your first idea; then, if it doesn’t work, try the others and report back. Sorry, I don’t feel it would be wise to make any further suggestions at this time.

Just a guess, but I think a Windows virus did this to you. If there is ANY way to do it, drop Windows altogether and move to Mint with a large partition where you can try other distros using VMware or something.

Here is an article that explains some things to try when a Windows Virus causes issues: https://www.eassos.com/how-to/how-to-recover-deleted-partition.php


@cliffsloane, sorry to hear about your disk problem. Just last night I lost a Western Digital disk w/o any signs it was going bad. It was not a system disk. Luckily, I had a backup of the data.


After years of frustration, I simply don’t keep anything of any importance on my hard drive. Everything is stored on an external drive or in the cloud (Dropbox or MEGA). At the first sign of a repeating problem, I flush everything and reinstall. I have an inexpensive HP running Windows 10 that is only used for games–my decent machine only runs Linux.


Let me vent a bit about how I am at my wit’s end about this.

I did nothing wrong, and FOUR TIMES partitions just vanished. It has only happened with Linux (two versions of Ubuntu). I never had that happen in Windows, and this is four times in a year with Linux.
The solutions with Linux rob me of time I cannot spare (between my 40 hours of online tutoring and my wife’s requests to help her in the kitchen, etc).
The most valuable files were all very precious music files. I can restore about half of them, but with paths and ID3 information stripped. So I have a few thousand in one flat directory.
In my rage, I am tempted to just forget about the music and become a man without a history (my past writings and lesson plans were the lost documents).

But I cannot help but blame Linux. When I was getting into learning stuff, it was to expand what I knew, not to repair such unnecessary issues.

@wgberninghausen’s suggestion? I had JUST MOVED those files to that new hard drive only two months earlier. And now they’re lost.

If my passions can subside tomorrow, I will retract this tiny polemic. Otherwise, I may just return to Win 7, do that Cloud option and accept my computer-based amnesia.

I have a backup server cluster that hosts several ZFS pools with a lot of redundancy. If any of those disks fails, another one can jump in and I can restore the data without issues. It works a bit like RAID, but it is better (in my opinion).
I have it already set up, so the only work left to do for me is thinking A LOT about what data I need to survive even if everything goes on fire. After that decision, I back up the data. The most important data of all is backed up into several pools, that are already redundant on their own and additionally saved into an emergency cloud.

Last time my big drive’s file system died, I lost a 300GB TrueCrypt container, because the headers were damaged. All this happened because this big drive was very new at the time and I was evidently too slow and undisciplined to force myself to back up all the data I needed right away.
Since this happened, I will never put backups in the back of my schedules. It is the first thing on my list now. Permanently.


Cliff, I feel your pain. I scanned and saved (I thought) a whole wad of kid pictures from the days of film and paper, only to find out that Windows XP had disappeared the whole collection. Dropbox and MEGA, on the other hand, haven’t lost anything, nor have Google or Amazon in the storage facilities I use. It’s hard to not blame Windows for my loss or Linux for yours. Instead, I blame myself for not using the Cloud option. An OS is just a tool; it can’t think for us.


Speaking about clouds:
I think if you set up 2 servers in 2 completely different geographical areas, each with a 4-way mirrored ZFS pool, e.g. 4x 1TB mirrored HDDs each, then this would be an extremely safe solution for pretty much every average user. With such a setup, you probably wouldn’t even need to rely on a 3rd party cloud.

Bill, you are a man with considerably more experience than I have. Can you help me understand?
Why would a relatively new hard drive with three partitions suddenly appear as “unallocated”?
If it be software, how could “mount” “nautilus” and “Disks” (the only programs I launched that accessed that drive) do such a thing?
If it be hardware, how can I test the mainboard and SATA connections for problems?

@cliffsloane, again I’m so sorry about your lost data. You may never know what caused the data loss. I know there are professional data recovery services out there, but I have never used one. At this time, I’m afraid of what could be lost forever. All of my talking does not help you. But if you can, learn from this, so it NEVER happens again. Perform backups! Multiple copies, different hardware. My most important files are 20 years of digital photos of family and my many trips. For these files, I make 4 copies, each copy on a separate piece of hardware.


mount can create issues if, e.g., you mounted the drive in the wrong mode or with a wrongly presumed file system and then did write operations to it. Same for the other 2 mentioned tools, as they are probably using mount internally anyway.

Secondly, I would like to clarify a misconception I noticed. Of course, no (re)-seller would ever tell you that, but the truth is:
there is no such thing as a brand-new drive.
Is this statement exaggerated? A little bit, yes. But not as much as everyone thinks. Every drive, no matter how unused, fresh and new it is, is basically already starting to fail. It’s just that the failing process most of the time takes years and years of usage. Sometimes though, a new hard drive can fail at any moment. If you are strictly looking at it from a technical perspective, you should pretty much expect your hard drive to fail at any time for no apparent reason. It’s not probable. But it definitely happens and the chance is there.
As I know how a hard drive works internally and have opened many of them myself, I have seen with my own eyes why this is the case. Hard drives are very complex structures (complex not in the sense of hard to understand, but hard to physically put together, hold together, etc.), which unavoidably implies that they are fragile. I mean, there is literally a tiny read/write head hovering a few nanometers over a platter rotating at about 5700 RPM or something along those speeds. Imagine the hard drive is moved abruptly just half a centimeter. Boom, this could mean destruction of most data, speaking from the view of an average user.


this discussion (though dated 2011) would seem to indicate testing the sata connector is something that is still reserved for manufacturers and not available to end users. a quick web search for “sata connector test tool” yielded only that result and many others that wanted to express an opinion about a software method to test the drive itself. i’m not even sure how an end user would begin to test a board without extensive knowledge of micro-circuitry and the schematics of the board in question.

this sounds like a loss of a partition table to take out all of the drive and that seems like it would be a symptom of a failing drive. it would be interesting to know (even if not entirely helpful) if this was the same drive that lost ubuntu mate system files. like @Akito has described, drive life can be frustratingly variable. one of the tests i have taken to running on my new drives are the SMART tests you mentioned. even a new (to me) drive showed most attributes as old age and one even as pre-failure. i don’t know if there are better disk tests out there. that is just the one i have come to be familiar with.
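
A note on reading those SMART attribute labels, with a small sketch. As far as I understand the smartctl output, the TYPE column ("Pre-fail" or "Old_age") only says which category an attribute belongs to; it does not by itself mean the drive is failing. An attribute counts as failed when its normalized VALUE drops to or below THRESH, or when WHEN_FAILED shows something other than "-". The Python below parses the table printed by `smartctl -A` and lists only attributes that meet that condition. The sample rows are made up for illustration, and the function names are my own, not part of smartmontools.

```python
# Illustrative sample in the column layout that `smartctl -A` prints.
# The numbers are invented, not taken from any real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1204
"""


def parse_smart_attributes(text):
    """Turn the attribute table from `smartctl -A` into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "type": fields[6],          # category only: Pre-fail or Old_age
            "when_failed": fields[8],   # "-" means it has never failed
        })
    return rows


def failing(rows):
    """Names of attributes that have failed or sit at/below a nonzero threshold."""
    return [
        r["name"]
        for r in rows
        if r["when_failed"] != "-"
        or (r["thresh"] > 0 and r["value"] <= r["thresh"])
    ]
```

With the sample above, `failing(parse_smart_attributes(SAMPLE))` is an empty list even though two rows are typed "Pre-fail", which is the point: the type alone is not a warning.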

i can understand why you would be wary of what to trust between the system itself and the drive. in a lot of years of doing repair mostly as a hobby and sometimes as a gig, i have seen way more drive failures than entire systems.


Cliff, I don’t know if this will help you, and it is not Linux-specific, but “Troubleshooting A Faulty SATA Port - How To Test and Fix It” (TechLogon) might be worth a look. Hope you get things sorted.


@01101111
It is the same system that lost those system files, but a different drive.

The hard drive passed badblocks, fsck and SMART, but to do that, I needed to create a new all-disk partition, thus losing forever my ability to restore the old partitions or directory structure.

What can I look for in a log? And which log? I have the window when this happened. If I have solid evidence of a software issue, I am more than happy to abandon Ubuntu or do any other remedy.
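
One possible starting point for the "what string do I search for" question, offered as a rough sketch rather than a diagnosis. On Ubuntu, kernel messages usually end up in /var/log/kern.log and /var/log/syslog (plus their rotated copies), and a flaky SATA link or failing disk tends to leave messages containing fragments like the ones in the pattern below. The fragment list is my assumption of what is worth looking for and is certainly not complete; the function name is made up.

```python
import re

# Message fragments the kernel commonly prints when a SATA link or disk
# misbehaves. This list is a starting point, not an exhaustive catalogue.
SUSPECT = re.compile(
    r"hard resetting link"
    r"|SATA link down"
    r"|link is slow to respond"
    r"|exception Emask"
    r"|failed command"
    r"|I/O error",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return the log lines that hint at SATA link or disk I/O trouble."""
    return [line.rstrip("\n") for line in lines if SUSPECT.search(line)]


# Typical use (path is an example; rotated logs may cover the dates needed):
#   with open("/var/log/kern.log", errors="replace") as f:
#       for hit in suspicious_lines(f):
#           print(hit)
```

If nothing matches in the period when the partitions disappeared, that does not prove the hardware is fine, but a burst of link resets or I/O errors would point away from the software.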

to do what?

please be very clear, i am not in any way suggesting you write anything else to this drive if you are still hoping to retrieve any of your missing information from it. i have even read that in cases of a drive failure, it is best practice to image the damaged drive onto a known good drive and only attempt data recovery from the good drive. i know not everyone has a few extra drives sitting around. i’m just saying that the process can be laborious and i have zero experience beyond a few single file recovery attempts on windows many moons ago.
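
to show what "image the damaged drive first" means in practice, here is a minimal python sketch of the idea: read the source in chunks, never write to it, and fill any chunk that cannot be read with zeros so the copy keeps the same offsets. this is only an illustration with a made-up function name; a dedicated tool such as GNU ddrescue does this far more carefully (retries, smaller blocks around errors, a map file), and that is what one would normally use.

```python
def image_device(src, dst, total_size, chunk=64 * 1024):
    """Copy total_size bytes from src to dst without writing to src.

    src and dst are binary file objects (src opened read-only).
    Unreadable or short chunks are zero-filled in dst so offsets stay aligned.
    Returns a list of (offset, length) ranges that could not be read.
    """
    bad = []
    offset = 0
    while offset < total_size:
        want = min(chunk, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            data = b""  # treat the whole chunk as unreadable
        if len(data) < want:
            # Record the gap and pad with zeros to keep the image the same size.
            bad.append((offset + len(data), want - len(data)))
            data += b"\x00" * (want - len(data))
        dst.seek(offset)
        dst.write(data)
        offset += want
    return bad
```

recovery tools would then be pointed at the image file on the good drive, so a mistake there costs nothing on the original.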

as far as logs go, i would need more of an explanation of what this means:


After I tried and failed to restore the partition table using testdisk, I had to create a partition, expanded to the whole disk, so I could run those other utilities (badblocks, fsck, SMART)

The “window” I meant was time. The partitions were last verified June 28, and the disk was found to be unallocated July 2.
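
For anyone following along, a sketch of what "the partition table" physically is, assuming the disk used the classic MBR (msdos) scheme; a GPT disk is laid out differently. The whole table is four 16-byte entries at byte offset 446 of the first 512-byte sector, followed by the signature bytes 0x55 0xAA, which is why a single bad write to that one sector can make an entire disk show up as unallocated. The Python below decodes such a sector; the function name and the returned field names are my own.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # first partition entry in an MBR sector
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"  # last two bytes of a valid MBR


def parse_mbr(sector):
    """Decode the four primary partition entries of an MBR sector.

    Returns None if the boot signature is missing, otherwise a list of
    dicts for the non-empty entries.
    """
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least one full 512-byte sector")
    if sector[510:512] != SIGNATURE:
        return None
    parts = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # status (1), CHS start (3, skipped), type (1), CHS end (3, skipped),
        # first LBA (4, little-endian), sector count (4, little-endian)
        _status, ptype, lba, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0x00:  # type 0x00 marks an unused slot
            parts.append(
                {"index": i + 1, "type": ptype, "start_lba": lba, "sectors": count}
            )
    return parts
```

Reading the first sector of a disk image (not the live disk) and passing it to `parse_mbr` shows at a glance whether the table is empty, missing its signature, or intact. Type 0x83 is a Linux partition and 0x07 is NTFS/exFAT.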

thank you for the clarification. what makes you believe it is a software issue?

Well, that is quite the worst mistake. Those tools are only supposed to show you the status quo, and fsck wouldn’t have fixed anything in this situation. The worst thing you can do is perform write operations on the medium under investigation, especially if the write operation directly overwrites the partition table. The very last thing you can try is using QPhotoRec while allowing only certain file types to be restored, e.g. audio data. The data won’t be ordered or recognizable by name, but at least you would get the data back at all, if it works.
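
To illustrate why files recovered this way come back without names or folders: tools of this kind ignore the file system and look for known file headers in the raw data, so all they can learn is "a file of type X starts here". The Python below is a toy version of that idea, not how PhotoRec is actually implemented; the signature table and function name are my own choices, and a 3-byte marker like "ID3" will produce false hits that a real tool filters out with further checks.

```python
# A few well-known audio file markers (magic bytes at the start of a file).
SIGNATURES = {
    b"ID3": "mp3 (ID3v2 tag)",
    b"fLaC": "flac",
    b"OggS": "ogg",
}


def find_signatures(data):
    """Return sorted (offset, kind) pairs for every signature found in data."""
    hits = []
    for magic, kind in SIGNATURES.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, kind))
            pos = data.find(magic, pos + 1)
    return sorted(hits)
```

Run against a chunk of a disk image, this gives offsets where audio files probably begin. The original paths and file names lived in the file system structures, which this kind of scan never sees.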


Here I extracted another interesting point. Did you create the NTFS file system within Windows or Linux?