DAR - Disk ARchive Source Code
For full, incremental, compressed and encrypted backups or archives
Brought to you by: edrusb
PRESENTATION
================

dar is a shell command that backs up a directory tree and its files. It has
been tested under Linux, and now under Windows 95 (using Cygwin). A GUI for
Linux and Windows will also follow, probably as a separate tool.

dar is actually a set of four commands:

    dar
    dar_xform
    dar_slave
    dar_manager

PACKAGE CONTENT
===============

I am looking for a new job, so I currently have less free time for DAR, but
that is temporary. If you are interested in my skills, please have a look at
the "RESUME" file. Thanks.

Otherwise, you can find the following:

    for a functional description, see the "FEATURES" paragraph below
    for current limitations, see the "LIMITATIONS" paragraph below
    for installation instructions, see the INSTALL file
    for a brief tutorial, see the TUTORIAL file
    for complete usage help, see the man page (type 'man dar' after installation)
    for detailed information on miscellaneous topics, see the NOTES file
    for future features, see TODO
    for license information, see LICENSE

FEATURES
================

FILTERS:
dar can back up anything from a whole file system down to a single file.
A filter mechanism lets you exclude or include files, based on their
filename, while backing up or restoring a directory tree. In addition, a
second filter mechanism lets you exclude some branches of a directory tree,
or include only some branches.

DIFFERENTIAL BACKUP:
When making a backup with dar, you can make either a full backup or a
differential backup. A full backup, as expected, backs up all the files
specified on the command line (with or without filters). A differential
backup, by contrast, saves (on top of the filter mechanism) only the files
that have changed since a given reference backup. Additionally, files that
existed in the reference backup and no longer exist at the time of the
differential backup are recorded in the backup.
At recovery time (unless you deactivate it), restoring a differential backup
updates changed files and adds new files, but also removes files that have
been recorded as deleted. Note that the reference backup can be a full backup
or another differential backup. This way you can make a first full backup,
then many differential backups, each taking the last backup made as its
reference.

SLICES:
Dar stands for Disk ARchive. From the beginning it was designed to be able to
split an archive over several removable media, whatever their number and
whatever their size. Thus dar is able to save to old floppy disks, CD-R,
DVD-R, CD-RW, DVD-RW, Zip, Jaz, etc. Dar is not concerned with mounting or
unmounting removable media; it is independent of hardware. Given a size, it
splits the archive into several files (called SLICES), optionally pausing
before creating the next one. This lets the user unmount and mount a medium,
burn the file to CD-R, or send it by email (if your mail system does not allow
huge files in emails, dar can help you here too). By default (no size
specified), dar makes a single slice, whatever its size. Additionally, the
size of the first slice can be specified separately, for example if you want
to fill up a partially filled disk before starting to use empty ones. Lastly,
at restoration time, dar pauses and prompts the user for a slice only if it is
missing.

COMPRESSION:
Lastly, dar can use compression. By default no compression is used. Currently
only the gzip algorithm is implemented, but room has been left for bzip2 and
other compression algorithms. Note that compression is applied before
slicing, which means that using compression with slices will not make the
slices smaller, but will probably result in fewer slices in the backup.

DIRECT ACCESS:
Even when using compression, dar does not have to read the whole backup to
extract one file. This way, if you just want to restore one file from a huge
backup, the process will be much faster than using tar.
Dar first reads the catalogue (i.e. the contents of the backup), then goes
directly to the location of the saved file(s) you want to restore and proceeds
with the restoration. In particular, when using slices, dar asks only for the
slice(s) containing the file(s) to restore.

HARD LINK CONSIDERATION:
Hard links are now properly saved. They are properly restored if possible.
If, for example, the restoration crosses a mounted file system, hard linking
will fail, but dar will then duplicate the inode and file content, issuing a
warning.

EXTENDED ATTRIBUTES:
Support for extended attributes (EA) has to be activated at compilation time
(see INSTALL). Dar is able to save and restore EA, either all of them or just
those of a given namespace (system or user). If no EA have been saved and
restoration occurs over a file that has EA, they will be preserved. But if
they have been saved empty for a given file, any existing EA for that file
will be removed at restoration time, unless -u and/or -U is given on the
command line.

ARCHIVE TESTING
Thanks to CRC (cyclic redundancy checks), dar is able to detect data
corruption in the archive. Only the files in which data corruption occurred
cannot be restored; dar will restore the others, even when compression is
used.

USING PIPES / REMOTE OPERATIONS
dar is now able to write an archive to its standard output or to a named pipe.
It is also able to read an archive through a pair of pipes, to take a remote
archive as reference, or even to restore data from an archive on a remote
host. This way it is now possible to store an archive remotely and in total
security (if using encrypted means, like ssh sessions).

ISOLATION
The catalogue (i.e. the contents of an archive) can be extracted (this
operation is called isolation) to a small file, which can in turn be used as
the reference for a differential archive. There is no longer any need to
provide a full archive to create a differential backup over it; its catalogue
alone is sufficient.
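
The idea that a catalogue alone is enough to drive a differential backup can
be illustrated with a minimal Python sketch. This is not dar's code and does
not reflect dar's real catalogue format: the dictionary layout (path mapped to
a size and modification time) and the function name diff_against_catalogue
are assumptions made only for illustration.

```python
# Hypothetical catalogue: maps a relative path to (size, mtime).
# dar's real catalogue format is not described here; this layout is assumed.

def diff_against_catalogue(reference, current):
    """Return (to_save, deleted) for a differential backup.

    reference -- catalogue isolated from the reference backup
    current   -- catalogue built by scanning the live file system
    """
    # Save files that are new or whose recorded metadata differs.
    to_save = sorted(
        path for path, meta in current.items()
        if reference.get(path) != meta
    )
    # Record files present in the reference that no longer exist.
    deleted = sorted(path for path in reference if path not in current)
    return to_save, deleted


reference = {"a.txt": (10, 100), "b.txt": (20, 100), "c.txt": (30, 100)}
current = {"a.txt": (10, 100), "b.txt": (25, 200), "d.txt": (5, 300)}

to_save, deleted = diff_against_catalogue(reference, current)
print(to_save)   # ['b.txt', 'd.txt']
print(deleted)   # ['c.txt']
```

Note that the sketch never opens the saved file data of the reference backup:
only the metadata is compared, which is why a small isolated catalogue can
stand in for the whole archive.
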

RE-SHAPE SLICES OF AN EXISTING ARCHIVE
The external program named "dar_xform" is able to change the size of the
slices of a given archive. The resulting archive is totally identical to
archives directly created by dar. The source archive can be taken from a set
of slices, from standard input, or even from a named pipe.

USER COMMAND BETWEEN SLICES
Several hooks are provided for dar to call a given command once a slice has
been written or before reading a slice. Several macros allow the user command
or script to know the slice number, path and archive basename.

SCRAMBLING
The archive can be "scrambled" with a pass phrase. The same pass phrase must
be given to retrieve or extract the archive contents. Of course this is not
very strong encryption; it is intended to protect against ordinary users who
do not have much means to crack this scheme.

CONFIGURATION FILE
dar can now read parameters from a file. This is a way to get around the
limited length of the command line. A configuration file can ask dar to read
(to include) other configuration files. A simple but efficient mechanism
forbids a file from including itself, directly or indirectly, and there is no
limit on the depth of recursion for the inclusion of configuration files.

DAR MANAGER
The advantage of differential backups is that they take much less space to
store and less time to complete than always making full backups. But, on the
other hand, you can end up with a lot of them. If you want to restore a
particular file, you may thus spend time finding which backup holds the most
recent version. This is solved using dar_manager. This little command-line
program gathers content information from all your backups. At restoration
time, it calls dar for you to restore the requested file(s) from the proper
backup. dar_manager is currently stable but is in its first release. I thus
expect a lot of comments about its (slow) speed, which I will try to improve
in the future.
But in any case, the more files you have in a backup (whatever the amount of
data), the longer it will take to execute. The advantage of automatic
processing of the restoration still remains: it allows you to do something
else during that time.

LIMITATIONS :
==============

The size of SLICES may be limited by the file system or kernel (the maximum
file size is 2 GByte with Linux kernel 2.2.x). The number of SLICES is only
limited by the length of the filenames: using a basename of 10 chars, and
considering that your file system supports at most 256 chars per filename,
you could already get up to 10^241 SLICES (1 followed by 241 zeros). As soon
as your file system supports bigger files or longer filenames, dar will follow
without change.

dar_manager can gather up to 65534 different backups. This limit should be
high enough not to be a problem.

CONTACT :
============

Denis Corbin
http://dar.linux.free.fr
dar.linux@free.fr