The goal of this article is to help you create multi-volume archives and generate checksums to validate their integrity. Why? Either you have more data than fits on a single CD-R or DVD and need to split it into pieces, or you have to transfer gigabytes of data over the net and would rather send smaller segments than one giant blob. As an example, I’ll create a multi-volume gzip archive of /home with an MD5 checksum using tar, gzip, split, and md5sum.
Creating Volumes from your Data
1. Create a single TAR archive of all your data using tar to preserve permissions, directory structures, etc.
[root@linux archive]# tar -cf home.tar /home
tar: Removing leading `/' from member names
[root@linux archive]# ls -la home.tar
-rw-r--r-- 1 root root 304220160 Apr 30 13:34 home.tar
2. Compress your TAR archive using gzip (or any other compression program of your choice).
[root@linux archive]# gzip home.tar
[root@linux archive]# ls -la home.tar.gz
-rw-r--r-- 1 root root 284859091 Apr 30 13:34 home.tar.gz
3. Use the split command to chop the compressed archive into smaller segments (I’ll be using 100MB pieces).
[root@linux archive]# split -d -b100m home.tar.gz home.tar.gz.
[root@linux archive]# ls -la
total 556940
drwxr-xr-x 2 root root      4096 Apr 30 13:56 .
drwxr-x--- 6 root root      4096 Apr 30 13:31 ..
-rw-r--r-- 1 root root 284859091 Apr 30 13:34 home.tar.gz
-rw-r--r-- 1 root root 104857600 Apr 30 13:56 home.tar.gz.00
-rw-r--r-- 1 root root 104857600 Apr 30 13:56 home.tar.gz.01
-rw-r--r-- 1 root root  75143891 Apr 30 13:57 home.tar.gz.02
4. Create an MD5 checksum (or a SHA1 checksum).
[root@linux archive]# md5sum home.tar.gz* > MD5SUM
[root@linux archive]# cat MD5SUM
cb16175f4acad02f977f74d5c142879b  home.tar.gz
33c745ca49ab6e63b727658ec148cf67  home.tar.gz.00
14e6952b632fbb7f4c0731067afdb46c  home.tar.gz.01
386655357f8553c7730fd792c22fde2a  home.tar.gz.02
Here is the same thing with a SHA1 checksum instead (you don’t need both checksums; I show both only for illustration, so pick one).
[root@linux archive]# sha1sum home.tar.gz* > SHA1SUM
[root@linux archive]# cat SHA1SUM
3858b51622dc9135c192a7c98dec24ccd35c63d6  home.tar.gz
6bc12b26dc1388d70d1a7cc0290dc6c9e8e0f97e  home.tar.gz.00
0683a44538ac65330fe103440e4f2a4a3a652be5  home.tar.gz.01
eb0f65fd0f4b3d98221e3ae8600f1691b536ad1d  home.tar.gz.02
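The four steps above can also be collapsed into one streaming pipeline, so the full-size home.tar and home.tar.gz never have to sit on disk next to the pieces. What follows is only a minimal sketch of that idea, not the procedure used in this article. It assumes GNU tar and GNU coreutils (for split -d and md5sum), and it works on a throwaway directory with small demo files and 1 MiB pieces instead of /home and 100MB pieces. The names work, demo_home, and demo.tar.gz. are made up for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area so the sketch never touches real data.
work=$(mktemp -d)
cd "$work"

# Stand-in for /home: two files of random (incompressible) data, 2.5 MB total.
mkdir -p demo_home/alice demo_home/bob
head -c 1500000 /dev/urandom > demo_home/alice/data.bin
head -c 1000000 /dev/urandom > demo_home/bob/data.bin

# tar writes a gzip-compressed archive to stdout (-f -); split reads it
# from stdin (the lone "-") and cuts it into 1 MiB (1048576-byte) pieces
# with numeric suffixes: demo.tar.gz.00, demo.tar.gz.01, ...
tar -czf - demo_home | split -d -b 1048576 - demo.tar.gz.

# Checksum only the pieces; in this variant no joined file exists on disk.
md5sum demo.tar.gz.* > MD5SUM

ls -l
cat MD5SUM
```

The trade-off is that MD5SUM then has no entry for the joined archive, so after a restore you can verify the individual pieces but not the reassembled file as a whole.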
Restoring your Data from the Volumes
You’ve burned or transferred your volumes and now want to restore them to the original. Here are the steps.
1. Verify the checksum against the volumes. The checksum file also lists the original home.tar.gz, which has not been rebuilt yet, so ignore the error reported for that file.
[root@linux resurrection]# md5sum --check MD5SUM
md5sum: home.tar.gz: No such file or directory
home.tar.gz: FAILED open or read
home.tar.gz.00: OK
home.tar.gz.01: OK
home.tar.gz.02: OK
md5sum: WARNING: 1 of 4 listed files could not be read
Once again, same deal but with the SHA1SUM file.
[root@linux resurrection]# sha1sum --check SHA1SUM
sha1sum: home.tar.gz: No such file or directory
home.tar.gz: FAILED open or read
home.tar.gz.00: OK
home.tar.gz.01: OK
home.tar.gz.02: OK
sha1sum: WARNING: 1 of 4 listed files could not be read
2. Join the volume pieces together using cat (after you finish you can validate the checksum *again* to see if the original file passes an integrity check).
[root@node2 resurrection]# cat home.tar.gz.* > home.tar.gz
[root@node2 resurrection]# ls -la home.tar.gz
-rw-r--r-- 1 root root 284859091 Apr 30 15:23 home.tar.gz
3. Decompress and extract the tar.gz file contents and you’re done.
[root@linux resurrection]# tar zxvf home.tar.gz
... verbose file list ...
[root@linux resurrection]# ls -la
total 556952
drwxr-xr-x 3 root root      4096 Apr 30 15:33 .
drwxr-x--- 7 root root      4096 Apr 30 15:02 ..
drwxr-xr-x 4 root root      4096 Apr 30 13:06 home
-rw-r--r-- 1 root root 284859091 Apr 30 15:23 home.tar.gz
-rw-r--r-- 1 root root 104857600 Apr 30 15:04 home.tar.gz.00
-rw-r--r-- 1 root root 104857600 Apr 30 15:04 home.tar.gz.01
-rw-r--r-- 1 root root  75143891 Apr 30 15:04 home.tar.gz.02
-rw-r--r-- 1 root root       193 Apr 30 15:05 MD5SUM
-rw-r--r-- 1 root root       225 Apr 30 15:05 SHA1SUM
The contents are extracted and the home directory is back. That’s all there is to it.
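The restore steps can be strung together in a small script. The following is a hedged sketch rather than a drop-in tool: it assumes GNU coreutils and GNU tar, and it first builds its own tiny set of volumes in a temporary directory so it can run on its own. The names work, demo_home, and demo.tar.gz are hypothetical. Instead of ignoring the error for the not-yet-joined file, it filters the checksum list down to the numbered pieces with grep before the first check, then verifies the full list once the pieces are joined.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area so the sketch never touches real data.
work=$(mktemp -d)
cd "$work"

# --- Setup: build demo volumes (stands in for the pieces you received) ---
mkdir -p demo_home/alice
head -c 2500000 /dev/urandom > demo_home/alice/data.bin
orig_sum=$(md5sum < demo_home/alice/data.bin)
tar -czf demo.tar.gz demo_home
split -d -b 1048576 demo.tar.gz demo.tar.gz.
md5sum demo.tar.gz demo.tar.gz.* > MD5SUM
rm -r demo_home demo.tar.gz        # only the pieces and MD5SUM remain

# --- Restore ---
# 1. Check only the entries for the numbered pieces, so the joined file
#    that does not exist yet cannot produce an error. "-c -" reads the
#    checksum list from stdin.
grep '\.[0-9][0-9]$' MD5SUM | md5sum -c -

# 2. Join the pieces. The shell expands the glob in sorted order, so
#    .00, .01, .02 are concatenated in sequence.
cat demo.tar.gz.[0-9][0-9] > demo.tar.gz

# 3. Now every entry in MD5SUM, including the joined file, should pass.
md5sum -c MD5SUM

# 4. Decompress and extract, then compare against the original data.
tar -xzf demo.tar.gz
restored_sum=$(md5sum < demo_home/alice/data.bin)
if [ "$orig_sum" = "$restored_sum" ]; then
    echo "restore OK"
fi
```

Because of set -e, the script stops at the first failed check, so a corrupted piece is caught before anything is joined or extracted.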
I got the idea of using the split command from this post on the Ubuntu Forum (how to create multi zip files) because I couldn’t get tar or gzip to create multi-volume archives.
