Creating Multi-Volume Archives and Checksums

Wednesday, April 30th, 2008

The goal of this article is to help you create multi-volume archives and generate checksums to validate their integrity. Why? You may have data larger than a single CD-R or DVD can hold and need to split it into pieces, or you may have to transfer gigabytes of data over the net and would rather send smaller segments than one giant blob. As an example, I’ll create a multi-volume gzip archive of /home with an MD5 checksum using tar, gzip, split, and md5sum.

Creating Volumes from your Data

1. Create a single TAR archive of all your data using tar to preserve permissions, directory structures, etc.

[root@linux archive]# tar -cf home.tar /home
tar: Removing leading `/' from member names
[root@linux archive]# ls -la home.tar
-rw-r--r-- 1 root root 304220160 Apr 30 13:34 home.tar

2. Compress your TAR archive using gzip (or any other compression program of your choice).

[root@linux archive]# gzip home.tar
[root@linux archive]# ls -la home.tar.gz
-rw-r--r-- 1 root root 284859091 Apr 30 13:34 home.tar.gz
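
If you don’t need the uncompressed home.tar on its own, steps 1 and 2 can usually be collapsed into one command with tar’s -z option. Here is a minimal sketch, assuming GNU tar; it works on a throwaway directory instead of /home, and the directory and file names (data, alice, notes.txt) are made up for the demo.

```shell
#!/bin/sh
set -eu

# Scratch directory so nothing real is touched (all names here are illustrative).
work=$(mktemp -d)
mkdir -p "$work/data/alice"
echo "hello" > "$work/data/alice/notes.txt"

# -c create, -z filter the archive through gzip, -f archive file name.
# -C switches to $work first, so member names are stored relative to it.
tar -czf "$work/data.tar.gz" -C "$work" data

# List the archive members to confirm what was stored.
tar -tzf "$work/data.tar.gz"
```

The result is the same kind of .tar.gz file that the separate tar and gzip steps produce, just without the intermediate .tar on disk.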

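3. Use the split command to chop the compressed archive into smaller segments (I’ll be using 100MB pieces). The -d option numbers the pieces (.00, .01, ...) instead of using letter suffixes, and -b100m sets the piece size.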

[root@linux archive]# split -d -b100m home.tar.gz home.tar.gz.
[root@linux archive]# ls -la
total 556940
drwxr-xr-x 2 root root      4096 Apr 30 13:56 .
drwxr-x--- 6 root root      4096 Apr 30 13:31 ..
-rw-r--r-- 1 root root 284859091 Apr 30 13:34 home.tar.gz
-rw-r--r-- 1 root root 104857600 Apr 30 13:56 home.tar.gz.00
-rw-r--r-- 1 root root 104857600 Apr 30 13:56 home.tar.gz.01
-rw-r--r-- 1 root root  75143891 Apr 30 13:57 home.tar.gz.02
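
The sizes in the listing above follow directly from the piece size: split treats 100m as 100 × 1,048,576 bytes, so every piece except the last is 104857600 bytes and the last one holds the remainder. A quick sketch of that arithmetic, using the byte count of home.tar.gz from the listing (the variable names are mine):

```shell
#!/bin/sh
set -eu

size=284859091                    # bytes in home.tar.gz, from the listing above
piece=$((100 * 1024 * 1024))      # what -b100m means: 104857600 bytes

# Round up to get the number of pieces, then work out the size of the last one.
count=$(( (size + piece - 1) / piece ))
last=$(( size - (count - 1) * piece ))

echo "$count pieces, last piece $last bytes"
# prints: 3 pieces, last piece 75143891 bytes
```

The same calculation is handy for picking a piece size that fits your media before you start splitting.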

4. Create an MD5 checksum (or a SHA1 checksum).

[root@linux archive]# md5sum home.tar.gz* > MD5SUM
[root@linux archive]# cat MD5SUM
cb16175f4acad02f977f74d5c142879b  home.tar.gz
33c745ca49ab6e63b727658ec148cf67  home.tar.gz.00
14e6952b632fbb7f4c0731067afdb46c  home.tar.gz.01
386655357f8553c7730fd792c22fde2a  home.tar.gz.02

The same thing, but creating a SHA1 checksum instead (you don’t need both checksums; I’m just showing both types, so pick one).

[root@linux archive]# sha1sum home.tar.gz* > SHA1SUM
[root@linux archive]# cat SHA1SUM
3858b51622dc9135c192a7c98dec24ccd35c63d6  home.tar.gz
6bc12b26dc1388d70d1a7cc0290dc6c9e8e0f97e  home.tar.gz.00
0683a44538ac65330fe103440e4f2a4a3a652be5  home.tar.gz.01
eb0f65fd0f4b3d98221e3ae8600f1691b536ad1d  home.tar.gz.02
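
Both checksum files above also list the unsplit home.tar.gz, which won’t exist on the receiving side until the pieces are joined. If you would rather have the checksum file cover only the pieces, a narrower glob is one option. A minimal sketch, assuming GNU coreutils; it uses random data in a scratch directory as a stand-in for the real archive.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # scratch directory; the data below is just filler
cd "$work"

# Build a small stand-in for home.tar.gz and cut it into 1MB pieces.
head -c 2500000 /dev/urandom > home.tar.gz
split -d -b 1m home.tar.gz home.tar.gz.

# The trailing dot in the pattern matches only the numbered pieces,
# so the unsplit home.tar.gz stays out of the checksum file.
md5sum home.tar.gz.* > MD5SUM
cat MD5SUM

# Simulate the receiving side, where only the pieces exist.
rm home.tar.gz
md5sum --check MD5SUM
```

The trade-off is that you lose the ready-made checksum for the reassembled file, so this is a matter of taste.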

Restoring your Data from the Volumes

You’ve burned or transferred your volumes and now want to restore them to the original. Here are the steps.

1. Verify the checksum against the volumes (ignore the error on the original file; home.tar.gz is listed in the checksum file but hasn’t been reassembled yet).

[root@linux resurrection]# md5sum --check MD5SUM
md5sum: home.tar.gz: No such file or directory
home.tar.gz: FAILED open or read
home.tar.gz.00: OK
home.tar.gz.01: OK
home.tar.gz.02: OK
md5sum: WARNING: 1 of 4 listed files could not be read

Once again, same deal but with the SHA1SUM file.

[root@linux resurrection]# sha1sum --check SHA1SUM
sha1sum: home.tar.gz: No such file or directory
home.tar.gz: FAILED open or read
home.tar.gz.00: OK
home.tar.gz.01: OK
home.tar.gz.02: OK
sha1sum: WARNING: 1 of 4 listed files could not be read
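
If you run the verification from a script, reading the OK lines by eye isn’t practical, but md5sum also reports the result through its exit status. The sketch below assumes GNU md5sum and uses a small random file (demo.bin, a made-up name) that I deliberately damage to show what a failed check looks like.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # scratch directory; file names are illustrative
cd "$work"

head -c 300000 /dev/urandom > demo.bin
split -d -b 100000 demo.bin demo.bin.
md5sum demo.bin.* > MD5SUM

# Damage one piece by appending a byte, as a stand-in for a bad burn or transfer.
printf 'x' >> demo.bin.01

# --quiet suppresses the per-file OK lines; the exit status is non-zero
# if any listed file fails to verify.
if md5sum --check --quiet MD5SUM; then
    result="all pieces verified"
else
    result="verification failed"
fi
echo "$result"
```

In a real restore script you would stop at the failure branch instead of going on to join the pieces.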

2. Join the volume pieces together using cat. Once you finish, you can run the checksum verification *again* to confirm that the reassembled file passes the integrity check.

[root@linux resurrection]# cat home.tar.gz.* > home.tar.gz
[root@linux resurrection]# ls -la home.tar.gz
-rw-r--r-- 1 root root 284859091 Apr 30 15:23 home.tar.gz
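
To see the join and the second verification end to end, here is a small self-contained sketch. It assumes GNU coreutils and uses a random file called demo.bin (a name I picked for the demo) in place of home.tar.gz.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # scratch directory; file names are illustrative
cd "$work"

# Sending side: make a file, split it, and checksum the original plus the pieces.
head -c 300000 /dev/urandom > demo.bin
split -d -b 100000 demo.bin demo.bin.
md5sum demo.bin demo.bin.* > MD5SUM

# Receiving side: only the pieces and the checksum file arrive.
mkdir restore
mv demo.bin.* MD5SUM restore/
cd restore

# The shell expands the glob in sorted order (.00, .01, .02),
# which is the order split wrote the pieces in.
cat demo.bin.* > demo.bin

# Now every entry in MD5SUM, including the rejoined file, should report OK.
md5sum --check MD5SUM
```

This time the check runs without the “No such file or directory” complaint, because the rejoined file is present.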

3. Decompress and extract the tar.gz file contents and you’re done.

[root@linux resurrection]# tar zxvf home.tar.gz
... verbose file list ...
[root@linux resurrection]# ls -la
total 556952
drwxr-xr-x 3 root root      4096 Apr 30 15:33 .
drwxr-x--- 7 root root      4096 Apr 30 15:02 ..
drwxr-xr-x 4 root root      4096 Apr 30 13:06 home
-rw-r--r-- 1 root root 284859091 Apr 30 15:23 home.tar.gz
-rw-r--r-- 1 root root 104857600 Apr 30 15:04 home.tar.gz.00
-rw-r--r-- 1 root root 104857600 Apr 30 15:04 home.tar.gz.01
-rw-r--r-- 1 root root  75143891 Apr 30 15:04 home.tar.gz.02
-rw-r--r-- 1 root root       193 Apr 30 15:05 MD5SUM
-rw-r--r-- 1 root root       225 Apr 30 15:05 SHA1SUM

The contents are extracted, and there is the home directory. The End.

I got the idea of using the split command from this post on the Ubuntu Forum (how to create multi zip files) because I couldn’t get tar or gzip to create multi-volume archives.
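
Since split reads from standard input when given - as the file name, one variation on the steps above is to pipe tar straight into split and skip the intermediate files altogether. This is only a sketch, assuming GNU tar and coreutils; it round-trips a scratch directory with made-up contents instead of the real /home.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # scratch directory; all names below are illustrative
mkdir -p "$work/src/home/alice" "$work/pieces" "$work/dest"
head -c 2500000 /dev/urandom > "$work/src/home/alice/big.bin"

# Create: tar writes the gzip stream to stdout ("-f -"),
# and split reads it from stdin ("-") and writes 1MB pieces.
tar -czf - -C "$work/src" home | split -d -b 1m - "$work/pieces/home.tar.gz."
ls -la "$work/pieces"

# Restore: cat the pieces in order into tar, which reads the archive from stdin.
cat "$work/pieces"/home.tar.gz.* | tar -xzf - -C "$work/dest"

# Compare one restored file byte for byte with its source.
cmp "$work/src/home/alice/big.bin" "$work/dest/home/alice/big.bin" && echo "round trip OK"
```

You would still want to run md5sum or sha1sum over the pieces afterwards, exactly as in step 4.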