Clonezilla for backup or just in case

01101111 · 8 January 2019 01:59

there was some discussion of clonezilla kind of buried in an unrelated thread so i thought i would open up a place for some of that info in case anyone is in need or interested and didn’t find it in the other thread.

i still use clonezilla on a monthly basis even though i have timeshift set up to do the daily incrementals. i figure it is a backup for my backup. perhaps not entirely necessary, but i like knowing that i have an extended safety net. in general i don’t mind building the system itself from a fresh install but losing some of the family photos and other personal documents would be a blow.

it does make a full system (or partition) copy or clone, as mentioned in the other thread, though there is some compression (i want to say about 30%. i checked the last time i made a backup, but don’t recall. i will try to come back and update the next time i make one). it took me a few tries with the setup wizard before i could understand all of the steps involved, but there were plenty of tutorials to help after a quick search.

i had read previously that a clonezilla image could be mounted to view or retrieve files, but never got around to trying it. @Rosika posted a great walkthrough of how to do so and i was hoping maybe a copy could end up here as well

https://clonezilla.org/

edit: to add link

Akito · 8 January 2019 08:02

So apparently many people are interested in a good backup solution in general, so I will present the one I mainly use. I personally think it is better than all the other backup solutions I have seen discussed on this forum, combined.

BorgBackup GitHub
BorgBackup Documentation

What is BorgBackup?

BorgBackup (short: Borg) is a deduplicating backup program. Optionally, it supports compression and authenticated encryption.

The main goal of Borg is to provide an efficient and secure way to back up data. The data deduplication technique used makes Borg suitable for daily backups since only changes are stored. The authenticated encryption technique makes it suitable for backups to not fully trusted targets.

Main Features

Space efficient storage

Deduplication based on content-defined chunking is used to reduce the number of bytes stored: each file is split into a number of variable length chunks and only chunks that have never been seen before are added to the repository.

Speed

performance critical code (chunking, compression, encryption) is implemented in C/Cython

local caching of files/chunks index data

quick detection of unmodified files

Data encryption

All data can be protected using 256-bit AES encryption; data integrity and authenticity are verified using HMAC-SHA256. Data is encrypted client-side.

Compression

All data can be optionally compressed:

lz4 (super fast, low compression)

zstd (wide range from high speed and low compression to high compression and lower speed)

zlib (medium speed and compression)

lzma (low speed, high compression)

Off-site backups

Borg can store data on any remote host accessible over SSH. If Borg is installed on the remote host, big performance gains can be achieved compared to using a network filesystem (sshfs, nfs, …).

Backups mountable as filesystems

Free and Open Source Software

security and functionality can be audited independently

licensed under the BSD (3-clause) license, see License for the complete license
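
To make the "content-defined chunking" feature above a little more concrete, here is a minimal Python sketch. It is not Borg's actual chunker: the window size, the mask, the use of CRC32 and the function name `chunk` are all assumptions picked for illustration. The idea it shows is that chunk boundaries are decided by the bytes themselves, so inserting data near the start of a file does not shift every later chunk.

```python
import hashlib
import zlib

WINDOW = 16   # assumed: number of trailing bytes that decide whether a chunk ends
MASK = 0x3F   # assumed: cut when the low 6 bits of the window hash are all zero


def chunk(data: bytes) -> list:
    """Split data at positions chosen by the content, not by fixed offsets."""
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        window = data[end - WINDOW:end]
        # CRC32 stands in for the rolling hash a real chunker would use.
        if zlib.crc32(window) & MASK == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # whatever is left after the last cut
    return chunks


# Deterministic pseudo-random test data (2048 bytes).
original = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(64))
shifted = b"X" + original  # one byte inserted at the very front

chunks_a = chunk(original)
chunks_b = chunk(shifted)
known = set(chunks_b)
reused = [c for c in chunks_a if c in known]
print(f"{len(reused)} of {len(chunks_a)} chunks are reusable after the insert")
```

With fixed-size blocks, the single inserted byte would change every block after it. Here only the first chunk can differ, which is why a deduplicating store only needs to save the part that really changed.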

The catch with BorgBackup is that it takes some time to understand the concept and use it appropriately. It definitely takes some time getting used to, especially for people not familiar with advanced backup solutions. But once you get the gist of it, it will definitely be a pleasure and I am sure a majority would stick to this solution for most backup targets.

Here’s how I personally would explain how it works and how I use it

The program has 2 main functions. The first is creating a repository, the second is creating an archive within a repository.

Repository

The repository is basically the world that stores all the content as archives for you. The special thing about this world is that every single thing exists only once. Everyone can have a simulation of all things, e.g. everyone can have an orange and use it in this world, but actually there is literally only a single orange in the whole world.

Archive

An archive is a certain state of the world, defined by the time the state (basically a snapshot) was captured. Let’s say that yesterday your brother had an orange, but your sister and you had none. Today your brother gives you the orange, so you have one now, but your sister and brother have none. If you create a snapshot, i.e. an archive, every evening, then there will be 2 archives: the first from yesterday, where your brother had an orange but your sister and you didn’t, and the second from today, where you have an orange but your brother and sister have none. Both are entire archives that are basically standalone (you can delete one of them and the other one will remain as it should), yet the space used for both is about the same as the space used by just one. All that changed in the world is that the orange changed its owner, so no new data was added, and the new archive seems to increase the size by 0 because the repository already contains the data it needs.
Now if your sister gets an orange tomorrow, so that you and your sister have one each, then tomorrow’s archive will only increase the size of the repository by a couple of bytes (assuming the record of owning an orange takes a couple of bytes). NOTE: the orange itself does not get duplicated. The only thing that gets saved additionally is that your sister has an orange now; the orange exists only once in the whole world, as explained above.
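
The orange analogy can be sketched in a few lines of Python. This is a toy model, not Borg's real repository format: the class `ToyRepository`, its method names and the whole-file granularity are assumptions made for illustration (real Borg works on chunks, and the small ownership records are not counted in the size here).

```python
import hashlib


class ToyRepository:
    """Hypothetical stand-in for a deduplicating repository."""

    def __init__(self):
        self.chunks = {}    # content id (SHA-256 hex digest) -> bytes, stored once
        self.archives = {}  # archive name -> {path: content id}

    def create_archive(self, name, files):
        manifest = {}
        for path, content in files.items():
            content_id = hashlib.sha256(content).hexdigest()
            self.chunks.setdefault(content_id, content)  # only unseen content is stored
            manifest[path] = content_id
        self.archives[name] = manifest

    def delete_archive(self, name):
        del self.archives[name]
        # Keep only content that some remaining archive still refers to.
        still_used = {cid for m in self.archives.values() for cid in m.values()}
        self.chunks = {cid: data for cid, data in self.chunks.items() if cid in still_used}

    def extract(self, name):
        return {path: self.chunks[cid] for path, cid in self.archives[name].items()}

    def stored_bytes(self):
        return sum(len(data) for data in self.chunks.values())


orange = b"orange" * 1000  # 6000 bytes of "orange"

repo = ToyRepository()
repo.create_archive("yesterday", {"brother/orange": orange})
size_after_first = repo.stored_bytes()

repo.create_archive("today", {"me/orange": orange})  # same orange, new owner
size_after_second = repo.stored_bytes()

repo.delete_archive("yesterday")  # "today" must still be complete afterwards
print(size_after_first, size_after_second, repo.stored_bytes())
```

The second archive adds no stored content because the orange is already in the repository, and deleting the first archive leaves the second fully restorable because the orange is still referenced.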

Now comes the even more interesting part. Let’s say that every day there are major changes in the whole world, but for now the only thing you care about is the orange situation at home. Your very first archive already contains the whole world (e.g. the root directory /). Further backups only take a snapshot of the orange situation (e.g. /home/*/oranges-directory). This directory is part of the root directory, so all of its unchanged data is already in the initial backup and doesn’t need to be stored again. The only thing stored in the newest archive is the changes in the oranges-directory, effectively ignoring all changes in other places.

Real world example

I had a repository containing an initial archive of my root directory /. Yesterday, I created an additional archive of my Downloads folder, i.e. /home/user/Downloads, because I downloaded some .deb files. Today, I updated my Debian archive mirror, so I only backed up the /var/debian folder. Tomorrow, I will back up the whole root directory / once again.

How much space will all this use? I have 2 separate backups of the whole root directory / from 2 separate days, and yet all the space that will be used is pretty much the space needed by the whole root directory / plus the couple of .deb files I downloaded. Nothing else. My Debian archive mirror only updated the packages, it didn’t add any new ones. My system overall didn’t change much, except that I have a couple more .deb files in my Downloads folder. So you can pretty much have 100s of different archives, each saving the state at the time and location the snapshot was taken, but the size won’t increase at all unless you actually add entirely new data. Therefore it already takes almost no space to back up everything you need to back up, and yet you can optionally compress everything, too, so the space needed is EVEN SMALLER.

Real world example from my Raspberry Pi system:

The root directory / of my Raspberry Pi 3B takes about 12-14GB of space on my SD card. The initial Borg archive of the whole SD card takes up about 4GB, using low compression (so you can compress the data even more if you have a more powerful machine).
Now, do you have several Raspberry Pis but don’t want to use ~4GB for each one? No problem, just make archives of all the different Pis in the same repository. If the data on all the Raspberry Pis is more or less the same, then the repository will be maybe ~4.5-5GB in size, despite backing up 4 Raspberry Pis (real world example from my own setup).

I hope I could explain the system well enough to you, since I had to try out BorgBackup several times to finally get the gist of how best to use it.

P.S.: You can also safely encrypt all your backup data. I personally don’t need that option, but it definitely pumps up the value of this backup solution by a whole lot, as well.

anon56357095 · 8 January 2019 11:21

BorgBackup (short: Borg)

Resistance is futile. You will be assimilated

Akito · 8 January 2019 11:26

I was thinking the exact same when my friend told me about BorgBackup!!

01101111 · 9 January 2019 06:43

i had only done a quick search or two trying to find an option (more for the experience than actual need) to back up my debian partition since timeshift (i thought) was only available for ubuntu and its derivatives. turns out i was mistaken about timeshift being limited to ubuntu, but i would still like to be familiar with other options just in case. thank you for this helpful overview of borg

davemerritt · 11 January 2019 17:29

Many thanks Akito! As I generally use Clonezilla to image and I have loads of memory, I’ve never needed to use compression or encryption. But I’ve often wondered about it.

So thanks. I love the way you bring the topic from the abstract to the practical. It seems I always step away from your posts a little wiser!

01101111 · 15 January 2019 12:12

i made a copy so i thought i would fill in this blank. gparted says i have used just under 50 gb of my hdd and the image clonezilla made weighs in at 22.6 gb so there is a decent amount (over 50% in this case) of compression.
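
For anyone who wants to check the arithmetic behind that "over 50%", here is a small Python sketch. The 50 gb figure is rounded up from "just under 50 gb", so the real saving is slightly lower than what this prints; the variable names are just for illustration.

```python
used_gb = 50.0   # "just under 50 gb" per gparted, so treated as an upper bound
image_gb = 22.6  # size of the clonezilla image

# Fraction of the used space that the image did NOT need.
saved_fraction = 1 - image_gb / used_gb
print(f"space saved: {saved_fraction:.1%}")
```

That works out to roughly 55% saved, i.e. the image is a bit under half the size of the data it holds.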

wfallen · 1 May 2019 06:39

I am considering replacing a small hard drive (250GB) on which my system Ubuntu 18.04 runs, with a much larger hard drive (2TB) and would like to know if it is possible to ‘clone’ my system from the small HDD to the larger hard drive and have business as usual after I remove the smaller HDD making the 2TB HDD the replacement?

Akito · 1 May 2019 07:31

Probably the only thing you need to do after running ddrescue successfully is resize the ~250GB partition to the full 2TB, that’s it.

wfallen · 1 May 2019 08:03

Thanks, I will see how it goes.

01101111 · 1 May 2019 23:25

from the linked ddrescue article:

This will take a while, and dd doesn’t really provide any progress info, so be patient. When the process is finished, reboot and you should be good to go.

i have started adding status=progress at the end of my dd commands to keep an eye on how the write process is proceeding.

Akito · 2 May 2019 02:15

ddrescue != dd

ddrescue provides progress information by default and is far superior to classic dd.