Dr. Dobb's is part of the Informa Tech Division of Informa PLC


Unison: A File Synchronization Tool


SysAdminMag.com

Mihalis Tsoukalos

Unison is an open source file synchronization tool for both text and binary files. Although Unison also has a GUI, this article covers only the command-line version. Unison is convenient when you are working with more than two computers and you want your files synchronized. It can be securely used through the SSH service, but it can also be used through rsh (which is not recommended for security reasons) and works equally well on both Unix (Linux, Solaris, etc.) and Windows (98, 2000, XP) systems.

Unison was inspired by the rsync utility. The two differ in that rsync is a mirroring tool, whereas Unison is a synchronization tool: it identifies the files that have changed since the last synchronization and decides how those changes are going to be propagated.
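
The difference can be sketched as a toy decision rule in shell. This is only an illustration of the idea, not Unison's actual algorithm: the function name decide and the fingerprint values (v1, v2, v3) are made up for the example, and a real tool would use checksums or timestamps recorded at the last synchronization.

```shell
#!/bin/sh
# Toy reconciliation rule (an illustration, not Unison's real algorithm).
# Arguments are content fingerprints (e.g. checksums) of one file:
#   $1 = state recorded at the last synchronization
#   $2 = current state on the local replica
#   $3 = current state on the remote replica
decide() {
    last=$1
    local_now=$2
    remote_now=$3
    if [ "$local_now" = "$remote_now" ]; then
        echo "nothing to do"
    elif [ "$local_now" = "$last" ]; then
        echo "copy remote to local"      # only the remote side changed
    elif [ "$remote_now" = "$last" ]; then
        echo "copy local to remote"      # only the local side changed
    else
        echo "conflict: ask the user"    # both sides changed differently
    fi
}

decide v1 v2 v1   # prints: copy local to remote
decide v1 v1 v3   # prints: copy remote to local
decide v1 v2 v3   # prints: conflict: ask the user
```

A mirroring tool, by contrast, always copies in one fixed direction and needs no record of the previous state.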

Installing Unison

Most Linux distributions have Unison as a package ready for installation so that you do not have to compile it. On my Apple Power Mac G5, which is running Tiger (a.k.a. version 10.4), I had to compile it. However, the compilation was straightforward. The current stable version of Unison is 2.10.2, but this article uses Unison version 2.9.1. Every machine that is part of the synchronization process must have a copy of the command-line version of Unison installed. Additionally, this copy of Unison must be in the default path. I put mine in /usr/bin instead of changing the default PATH shell variable.
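
A quick way to confirm that Unison is in the default path is a check like the following. It is a minimal sketch; the helper name have_cmd is hypothetical and the script only reports what it finds.

```shell
#!/bin/sh
# Report whether a command can be found on the current PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd unison; then
    echo "unison found at: $(command -v unison)"
else
    echo "unison is not on PATH ($PATH)"
fi
```

The same test is worth running on every machine that takes part in the synchronization, because the PATH seen by a non-interactive SSH session may differ from the one in your login shell.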

How to Set Up SSH

The single most timesaving step is to set up SSH so that you will not need to enter your password every time you synchronize your data. The procedure is easy and involves the following two steps:

1. On your local server, run ssh-keygen -t dsa -f ~/.ssh/id_dsa -C "username@remote_machine". You will have to enter a passphrase twice (please remember the passphrase!). Two files will be created: ~/.ssh/id_dsa and ~/.ssh/id_dsa.pub.

2. Copy the contents of the file ~/.ssh/id_dsa.pub on your local server into the file ~/.ssh/authorized_keys on the remote server.
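
Step 2 can be sketched as a small shell function. The name add_key is hypothetical, and the function works on two local file paths so that it can be tried safely; on a real setup the second file lives on the remote server.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already there.
# $1 = public key file, $2 = authorized_keys file (both are local paths here).
add_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -F: fixed string, -x: match the whole line, -q: print nothing
    if ! grep -qxF -e "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"    # keep the file private to its owner
}
```

Across the network, one common way to do the same thing is cat ~/.ssh/id_dsa.pub | ssh username@remote_machine 'cat >> ~/.ssh/authorized_keys', where username and remote_machine are placeholders for your own account and host.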

When you try to connect to the remote machine, you will get something like the following output:

dialup25:~ mtsouk$ ssh [email protected]
Enter passphrase for key '/Users/mtsouk/.ssh/id_dsa':
Last login: Sat Jul 16 11:00:05 2005 from dialup25.chi.sch.gr

[mtsouk@pluto mtsouk]$

As you can see, entering the passphrase from step 1 instead of the real password saves no time. The following two steps avoid this:

1. Run eval `ssh-agent` (for the bash shell). Note the backquotes: the output of ssh-agent must be evaluated by the shell.

2. Run ssh-add ~/.ssh/id_dsa; you will be asked to type the passphrase for the last time for this particular bash shell.

A Basic Unison Setup File

Unison can run from the command line without using configuration files, but having a configuration file available greatly simplifies its use. In this article, I will not deal much with the command-line options of Unison. In the unusual case that you have trouble working with Unison, you may run it using the -debug all command-line option so that you can better trace and resolve errors.

The following is a simple Unison configuration file named SysAdmin.prf, located inside the .unison directory, which is the directory in which Unison does its housekeeping:

big:~ mtsouk$ cat .unison/SysAdmin.prf
# Saturday 25 June 2005
root = /Users/mtsouk
root = ssh://racoon//Users/mtsouk

# Paths to synchronize
path = docs/DSMS
path = docs/article
path = docs/SysAdmin
path = docs/PIK
path = Desktop/Eugenia
path = Sites/MacLand

Lines starting with a # denote comments and are ignored. Lines starting with root = declare the machines that participate in the synchronization process as well as the directories that are considered root directories for each machine. After those important declarations, the directories to be synchronized are listed. In this particular example, we have six directories. The full path of the first one is /Users/mtsouk/docs/DSMS for the local machine (the machine whose declaration does not begin with ssh://) and /Users/mtsouk/docs/DSMS for the machine called racoon. Each remote machine declaration starts with root = ssh://.

The command that must be used to run Unison using the SysAdmin.prf configuration file is "unison SysAdmin", provided that SysAdmin.prf is located inside the .unison directory.

Unison Examples

Unison may be slow the first time you run it, especially if you have many files to synchronize. This happens only once, so subsequent synchronizations will be much faster. The following configuration file is used as a simple, complete example of Unison:

big:~ mtsouk$ cat .unison/PLUTO.prf
# Saturday 25 June 2005
root = /Users/mtsouk
root = ssh://pluto....gr//home/mtsouk

# Paths to synchronize
path = Sites/PHP

# Log file
logfile = /Users/mtsouk/.unison/unison.log

# Backup files
backup = Name *
big:~ mtsouk$
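
In every profile, each path is interpreted relative to each root. The following sketch makes that concrete by printing all root/path combinations of a profile. The function name expand_profile is hypothetical, the parsing handles only the simple "key = value" lines shown in this article, and the demo uses roots and paths taken from SysAdmin.prf.

```shell
#!/bin/sh
# Print every root/path combination declared in a Unison profile.
# $1 = path to a .prf file. Only "root = ..." and "path = ..." lines are read.
expand_profile() {
    awk -F' *= *' '
        /^[[:space:]]*#/ { next }               # skip comment lines
        $1 == "root" { roots[++nr] = $2 }
        $1 == "path" { paths[++np] = $2 }
        END {
            for (i = 1; i <= nr; i++)
                for (j = 1; j <= np; j++)
                    print roots[i] "/" paths[j]
        }
    ' "$1"
}

# Demo with two roots and two paths taken from SysAdmin.prf.
demo=$(mktemp)
cat > "$demo" <<'EOF'
# Paths to synchronize
root = /Users/mtsouk
root = ssh://racoon//Users/mtsouk
path = docs/DSMS
path = docs/article
EOF
expand_profile "$demo"
rm -f "$demo"
```

The demo lists the two local directories first and then the same two directories on racoon.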

If you have never run the command unison PLUTO before, you will see output similar to that of Figure 1. Note that the directory /home/mtsouk/Sites must already exist on the remote server or the synchronization will fail.

Figure 2 shows another example of running Unison, using a configuration file called DSMS.prf. In this example, the profile tells Unison to:
  • Use /Users/mtsouk/.unison/unison.log as its log file.
  • Take backup copies of all the files.
  • During the synchronization process, ignore files with names ending in .DS_Store:
big:~ mtsouk$ cat .unison/DSMS.prf
# Thursday 14 August 2003
root = /Users/mtsouk
root = ssh://racoon//Users/mtsouk

# Paths to synchronize
path = docs/DSMS
path = docs/article
path = Desktop/docs.var
path = Sites/PHP

# Log file
logfile = /Users/mtsouk/.unison/unison.log

# Backup files
backup = Name *
ignore = Name *.DS_Store
ignore = Name .DS_Store
big:~ mtsouk$
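
The effect of the two ignore lines can be approximated in shell. This is a rough sketch: the function name is_ignored is hypothetical, and shell globbing stands in for Unison's own pattern matcher, which may differ in detail. The point it illustrates is that a Name pattern is compared with the last component of a path, not with the whole path.

```shell
#!/bin/sh
# Return success if the last component of path $1 matches one of the
# two Name patterns from DSMS.prf (approximated with shell globbing).
is_ignored() {
    name=${1##*/}    # strip everything up to the last slash
    case "$name" in
        *.DS_Store|.DS_Store) return 0 ;;
        *) return 1 ;;
    esac
}

for p in docs/DSMS/.DS_Store docs/article/article.txt; do
    if is_ignored "$p"; then
        echo "ignored: $p"
    else
        echo "synced:  $p"
    fi
done
```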
    
The backupversions keyword in a configuration file sets how many previous versions of each file are stored. If backupversions is not defined, it defaults to 2, which means that the latest backup plus one earlier version of the file are kept inside the ~/.unison/backup directory. Please note that if you are synchronizing big or huge files, a backupversions value of 4 means that each file, including its backup copies, may exist five times and occupy five times its space.

For a comprehensive tutorial on Unison, type unison -doc topics at the command line of your terminal.

There are rare occasions, usually due to user error, when Unison will not be able to determine whether a file or directory has changed on the local or the remote server. In such situations, Unison asks for our help so that it will not mistakenly proceed using the wrong file or directory. Figure 3 shows this situation as well as another error situation where a file or directory has changed during the synchronization process. Unison outputs the following error message:
The file /Users/mtsouk/docs/article/unison.SysAdmin/article.txt
has been modified during synchronization: transfer aborted

and does not update that particular file in order to avoid further faults. Figure 4 shows the contents of the .unison directory in my local machine.
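
Because the backup copies also live under ~/.unison, it helps to estimate how much room they may need. The arithmetic behind the backupversions remark above is simply the file size multiplied by the number of kept versions plus one. The function name worst_case_bytes is hypothetical and the file size is an example value.

```shell
#!/bin/sh
# Worst-case space for one file: the file itself plus every kept backup.
# $1 = file size in bytes, $2 = backupversions value
worst_case_bytes() {
    echo $(( $1 * ($2 + 1) ))
}

worst_case_bytes 1000000 2   # prints 3000000
worst_case_bytes 1000000 4   # prints 5000000
```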

Unison can also utilize external programs to perform merging on conflicting versions of a file. The keyword merge defines how the merging process is going to happen. Please use this option only if you know exactly what you are doing.

Unison Development

The Unison project was led by Benjamin C. Pierce at the University of Pennsylvania. Unison began as a research project but it is no longer one. Benjamin C. Pierce is now leading the Harmony Project, which is also related to file synchronization. Nevertheless, Harmony is still in its early stages.

The people interested in Unison maintain the following three mailing lists:

1. unison-announce -- New Unison release announcements.

2. unison-users -- General discussion of Unison.

3. unison-hackers -- Informal discussion for developers and experts.

Conclusions

This article described some of the uses of Unison. There are many more things to do with Unison: run it as a cron job at night, synchronize Web servers, keep copies of configuration files (note that Unison cannot replace backup procedures), etc. For non-critical data files, you may run Unison once a day, but for critical data you should run it more often.
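
For the cron idea, a wrapper along the following lines is one possible starting point. It is a sketch under assumptions: the function name nightly_sync is hypothetical, and it relies on Unison's -batch option, which runs without asking questions and skips conflicting changes instead of prompting for them.

```shell
#!/bin/sh
# Run one Unison profile without prompting, for use from cron.
# $1 = profile name (e.g. SysAdmin for ~/.unison/SysAdmin.prf)
nightly_sync() {
    if ! command -v unison >/dev/null 2>&1; then
        echo "unison not found on PATH" >&2
        return 127
    fi
    # -batch makes Unison ask no questions; conflicts are skipped.
    unison "$1" -batch
}

# Example call (left as a comment so that sourcing this file does nothing):
#   nightly_sync SysAdmin
```

A cron job has no terminal in which to type a passphrase, so the SSH key setup described earlier needs some thought before a job like this can run unattended.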

Acknowledgments

I thank Nikos Platis for letting me use his machine for the purposes of this article.

References

Unison home page: http://www.cis.upenn.edu/~bcpierce/unison/

Unison manual: http://www.cis.upenn.edu/~bcpierce/unison/download/stable/latest/unison-manual.html

Harmony Project: http://www.cis.upenn.edu/~bcpierce/harmony/index.html

OpenSSH key management: http://www-128.ibm.com/developerworks/library/l-keyc.html

Mihalis Tsoukalos lives in Greece with his wife, Eugenia, and works as a high school teacher. He holds a B.Sc. in Mathematics and a M.Sc. in IT from University College London. Before teaching, he worked as a Unix systems administrator and an Oracle DBA. Mihalis can be reached at: [email protected].

