CVS HOWTO

Written by T.D.Bishop, Oct 2002.
Content borrowed from i-scream document written by the author.

This document has not yet been completely checked, so please e-mail me with any questions if something is unclear.

Overview

CVS stands for "Concurrent Versions System" and is used to keep plain text files under version control - usually source code, but anything with a plain text structure will do. It can store binaries too, but can't offer the same version comparison features for them, because a line-by-line comparison of binary data is meaningless.

CVS also offers many features for group working, and doesn't lock the files when someone starts editing them. Instead it works by each user having their own "checked out" local copy of the source code (or a portion of it). The user then works on this code until they are happy with it, then runs an update command. This checks to see if someone else has updated the version in the repository, and if so brings their changes down into the user's local copy. This can cause conflicts if both users change the same bit, but this shouldn't really happen - unless group communication is lacking. CVS will do its best to merge these new changes in, but occasionally it will need help from the user. This is just a case of reviewing the file with conflicts and manually resolving them. When CVS is happy that any updates in the repository have been merged it will allow a commit. This puts the user's changes into the repository for all to access.

In essence, that's CVS. Checkout, update, commit. That's all there really is to it.

CVS can be beneficial in project work for two main reasons. Firstly it provides an excellent basis to keep track of changes over the project, allowing all group members to see what the others have been doing. By keeping a record of all changes it is also possible to roll back to older versions of the code if a problem emerges. This can be extremely beneficial when tracking down a bug and wondering how/why this bit of code was changed. The other benefit that will quickly become apparent is how CVS aids group working - it is a lot easier for people to work cooperatively whilst maintaining a central, and more importantly, backed up copy of the code.

Setting up CVS on raptor or swallow

Setting up CVS on a teaching host such as raptor or swallow is fortunately an easy process. First you need to set some things up in your .bashrc (amend accordingly for csh/tcsh) :-

unset CVS_RSH
unset CVSROOT
export CVSROOT=/usr/local/proj/co600/project/groupname/cvsroot

The CVSROOT variable points to the location of the CVS repository. It's a good idea to keep this under a shared project area, such as the one given. You'll then need to create the directory and initialise the CVS repository. This only needs to be done once per group!

mkdir /usr/local/proj/co600/project/groupname/cvsroot
cvs init

If you get no errors all is well. You're now set to start using CVS.
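If you'd like to try the initialisation somewhere harmless first, here's a rough sketch rather than a required procedure. It uses a throwaway directory from mktemp in place of the real project path. The chmod line is my own assumption (it makes the directory group-writable so other group members can commit), and the cvs step is skipped if cvs isn't installed :-

```shell
# Scratch location for experimenting; substitute your group's real path.
CVSROOT="$(mktemp -d)/cvsroot"
export CVSROOT

# Create the repository directory and let the group write to it (assumption:
# everyone who needs to commit is in the directory's Unix group).
mkdir -p "$CVSROOT"
chmod g+rwxs "$CVSROOT"

if command -v cvs >/dev/null 2>&1; then
    # Populate the repository with its administrative files.
    cvs init
    # A successful init leaves a CVSROOT subdirectory inside the repository.
    ls "$CVSROOT/CVSROOT"
else
    echo "cvs is not installed on this machine"
fi
```

If the listing shows files such as modules and loginfo, the repository is ready.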

Using CVS

This next section looks at using CVS from the Unix command prompt. You should be familiar with Unix by now, but if not, maybe now is a good time to get started? Although the commands given here are for the Unix command line version of CVS, the ideas apply directly to any version of CVS, including WinCVS. A full list of commands can be found by typing :-

cvs --help-commands

There are three main CVS operations that you'll use on a regular basis: checkout, update and commit. Then there are the less common commands such as add, remove, release and export. I'll describe them, then give an example. This all assumes you've set up the CVSROOT environment variable as described above.

A quick mention of the version numbers first. Each file has its own version number starting at 1.1 and incrementing to 1.2, 1.3 and so on. It never becomes 2.x, unless you manually change it. These version numbers are independent for every single file, and have no overall bearing on any "release versions" that may be used to describe the product as a whole.

Firstly, the 'checkout' command. This extracts a copy of a section of the CVS Repository to a local working copy. All work (editing) is then carried out on this local copy. The local copy can be changed, completely deleted, or whatever... it'll have no effect on the repository. The command has the following format;

cvs checkout <module>

For example, to checkout the server directory inside the source directory one would type :-

cvs checkout source/server

This would extract a copy of all the files in source/server and all subdirectories below this. Note that CVS control files will be created in 'CVS' subdirectories all over the place. They're best ignored (but not deleted!).

Next the 'update' command. This command ensures that the local copy is up-to-date with the repository. If the checkout was done a few days ago it's possible someone else may have updated something in the main repository since the original checkout. After issuing the update the local copy will be updated. The command is very simple;

cvs update

The exception to this is if the user has changed the local copy in some way. In this case the update still takes place, but the changes from the repository are merged into the modified local copy. No updates are sent to the repository - this is important to remember! This could of course cause problems if the change in the repository and the local change are to the same bit of code. Usually CVS can merge updates, but in this instance it will report a conflict and the user will have to resolve it manually. This is a simple case of checking the code, as the problem areas will be marked out. After the conflicts have been resolved another update can be done to verify it's all OK.
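To give an idea of what "marked out" means, here's a small sketch that recreates a conflicted file by hand and then searches it for the markers CVS inserts. The file name, its contents and the revision number 1.4 are all made up for illustration. No real repository is involved :-

```shell
# Recreate what a conflicted file looks like after "cvs update".
tmpdir=$(mktemp -d)
cat > "$tmpdir/main.java" <<'EOF'
public class main {
<<<<<<< main.java
    int timeout = 30;   // your local change
=======
    int timeout = 60;   // the change from the repository
>>>>>>> 1.4
}
EOF

# Show the line numbers of any conflict markers still in the file.
grep -n -E '^(<<<<<<<|=======|>>>>>>>)' "$tmpdir/main.java"

# Count them, so a script could refuse to commit while any remain.
count=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' "$tmpdir/main.java")
echo "$count conflict marker lines remaining"
```

Resolving the conflict means editing the file so that only the code you want remains and all three marker lines are gone.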

The update command also lets you know the state of files. If it puts an 'M' next to a file (when it lists them) it means you've modified it. A '?' means the file is not known to CVS (you have to add new files). A 'C' means a conflicting file, an 'A' means the file is scheduled to be added, and an 'R' means it is scheduled to be removed. A 'U' or 'P' means the file was brought up to date from the repository. Check "man cvs" for the full details :)
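If the single letters are hard to remember, a small shell function can spell them out. This is only a sketch - the function name explain_status and its wording are my own invention, and the sample input stands in for real output. With a working copy you could pipe "cvs -n -q update" into it instead, where -n asks CVS to report without changing anything :-

```shell
# Translate the one-letter flags printed by "cvs update" into words.
explain_status() {
    # Each input line is a flag, a space, then the file name.
    while read -r flag file; do
        case "$flag" in
            U|P) echo "$file: brought up to date from the repository" ;;
            M)   echo "$file: modified locally" ;;
            A)   echo "$file: scheduled to be added at the next commit" ;;
            R)   echo "$file: scheduled to be removed at the next commit" ;;
            C)   echo "$file: conflict, needs resolving by hand" ;;
            \?)  echo "$file: not known to CVS" ;;
            *)   echo "$flag $file" ;;
        esac
    done
}

# Sample input standing in for real "cvs -n -q update" output.
printf 'M main.java\n? notes.txt\nC server.java\n' | explain_status
```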

Right, the last main command is 'commit'. This puts local changes back into the repository and increments the version number of each changed file. You should run an update first, though, because commit will refuse to go ahead if your copy is out of date with the repository. You will need to enter a comment, and it's good practice to describe roughly the reason for the commit - it helps keep a good audit trail, and lets other users know why you did what you did. In Unix you'll get the default editor, which might be vi - to change this, set the EDITOR environment variable to something different. The commit command is as easy as;

cvs commit

That's all there is to the everyday CVS commands. Now on to the less used ones. Let's start with the 'add' command. Say you've done a checkout of source/server and you've added a file called main.java into the directory. Now let's add it to the repository.

cd source/server
cvs add main.java

The add command will tell you that it's scheduled to be added at the next commit. Simply run an update then commit for the file to be added.

The 'remove' command works in a similar way, except that you must delete the file from your working copy first (or pass the -f flag to have CVS delete it for you) and then run 'cvs remove' on it. Again you'll need to update then commit. A note about deleted files: the file obviously won't be deleted completely, as that would defeat the point of a versioning system. Instead it's moved to an Attic subdirectory in the repository so you can still review its revisions in the future, and even resurrect it.

Next there's the 'release' command. This is used as a tidy way of cleaning up the local files when you've finished with them. To release the working copy from the above example you'd run the following from the directory that contains it;

cvs release source/server

This will ensure that you don't accidentally delete any local changes by informing you first of any uncommitted files. If you add the -d switch it will actually delete the files too;

cvs release -d source/server

Finally, the 'export' command. This has almost the same effect as the checkout command, but it doesn't create any of the CVS control files. The aim of this command is to extract sources for release, so you can zip them up and send them out to a client. Note that export insists on being told which revisions you want, using either a tag (-r) or a date (-D); "-D now" gives the latest revisions. For example;

cvs export -D now source
tar -cvf source.tar source
gzip -v9 source.tar

How CVS handles binary files

CVS can't do full version management of binary files, because there is no meaningful way to compare two revisions of binary data line by line. It will still let you store binary files and commit new versions of them. However, it can't show you the differences between revisions, or merge two sets of changes.

IMPORTANT: You must make sure you tell CVS that a file is binary, otherwise it will not store it correctly and the file will be corrupted.

To add a binary file you just need to add a flag to the add command, just like this :-

cvs add -kb <filename>

Now, a quick explanation of what this "-kb" means. It tells CVS not to do keyword expansion (turning $Revision$ into $Revision: 1.3 $) and not to attempt to convert between Unix and Windows text formats.
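If your group stores a lot of binaries, remembering -kb every time gets tedious. CVS has a cvswrappers file in the CVSROOT administrative module that sets the flag by file name pattern. The patterns below are only examples I've picked for illustration, so adjust them to whatever your project actually contains :-

```
# CVSROOT/cvswrappers
# Treat files matching these patterns as binary when they are added.
*.gif -k 'b'
*.jpg -k 'b'
*.zip -k 'b'
```

To change it, check out the CVSROOT module, edit cvswrappers, and commit as you would any other file.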

Tagging the Repository

Tagging the repository is a way of "marking" all the files at a particular point in time. For example, when "release 1" is finished you might want to mark this in the history of all the files, so you can easily come back to it in the future. As every file that makes up "release 1" may be on a different individual revision, it would be tedious to reproduce the same state of files at a later date.

This is the best example I've seen that clearly shows how a tag is implemented. Imagine the dotted line signifies the point at which you declared the files were "release 1". As you can see, work has continued on the files, but you could easily pull out each file at "release 1".

   File A      File B      File C      File D      File E
   ------      ------      ------      ------      ------
   1.1         1.1         1.1         1.1         1.1
---1.2-.       1.2         1.2         1.2         1.2
   1.3 |       1.3         1.3         1.3         1.3
        \      1.4       .-1.4-.       1.4         1.4
         \     1.5      /  1.5  \      1.5         1.5
          \    1.6     /   1.6   |     1.6         1.6
           \   1.7    /          |     1.7         1.7
            \  1.8   /           |     1.8       .-1.8----->
             \ 1.9  /            |     1.9      /  1.9
              `1.10'             |     1.10    /   1.10
               1.11              |     1.11    |
                                 |     1.12    |
                                 |     1.13    |
                                  \    1.14    |
                                   \   1.15   /
                                    \  1.16  /
                                     `-1.17-'

In fact, if you imagine pulling this line straight, you'd get the following effect which even more clearly shows where "release 1" is.

   File A      File B      File C      File D      File E
   ------      ------      ------      ------      ------
                                       1.1
                                       1.2
                                       1.3
                                       1.4
                                       1.5
                                       1.6
                                       1.7
               1.1                     1.8
               1.2                     1.9
               1.3                     1.10        1.1
               1.4                     1.11        1.2
               1.5                     1.12        1.3
               1.6                     1.13        1.4
               1.7         1.1         1.14        1.5
               1.8         1.2         1.15        1.6
   1.1         1.9         1.3         1.16        1.7
---1.2---------1.10--------1.4---------1.17--------1.8----->
   1.3         1.11        1.5                     1.9
                           1.6                     1.10

Tagging is actually remarkably easy to do. All you do is issue a single command to tag the files at the current point in time. You should make sure all development stops whilst you do it, otherwise a bit of new code could sneak in. Here's how to do it;

cvs tag public-release_1

This would mark the files with the tag "public-release_1". You can then continue with development, knowing that all those files can easily be retrieved. Getting them back is again surprisingly easy. Instead of typing "cvs checkout source" you would type;

cvs checkout -r public-release_1 source

This would check out the source module as it was at the point you tagged. Alternatively the following would update (or downdate?) your working copy to the tagged point;

cvs update -r public-release_1
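To see the tag commands working without touching a real project, here's a self-contained sketch that builds a scratch repository, imports a one-file module and tags it. The scratch paths, the file contents and the "vendor" and "start" import tags are all placeholders of mine, and the whole thing is skipped if cvs isn't installed :-

```shell
if command -v cvs >/dev/null 2>&1; then
    # Throwaway repository, so nothing real is affected.
    scratch=$(mktemp -d)
    export CVSROOT="$scratch/cvsroot"
    mkdir "$CVSROOT" && cvs -Q init

    # Import a one-file module called "source" so there is something to tag.
    mkdir "$scratch/import" && cd "$scratch/import"
    echo "release 1 code" > main.java
    cvs -Q import -m "Initial import" source vendor start

    # Check out a working copy and tag every file in it.
    mkdir "$scratch/work" && cd "$scratch/work"
    cvs -Q checkout source
    cd source
    cvs -Q tag public-release_1

    # The tag should now be listed under "Existing Tags" for the file.
    tag_listing=$(cvs -Q status -v main.java)
    echo "$tag_listing"
else
    echo "cvs is not installed; nothing to demonstrate"
fi
```

One thing to be aware of: after "cvs update -r public-release_1" the tag is "sticky", meaning later updates in that working copy stay at the tagged revisions. Running "cvs update -A" clears the sticky tag and brings you back to the latest revisions.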

Finally, a quick mention of why this would be useful. Imagine the development team was split into two groups, one working on coding, another working on testing. When the coding group have finished "release 1" they can tag it and carry on developing for "release 2". At the same time, the testing team can check out "release 1" and test it, reporting bugs back to the coding team for fixing in "release 2". It could also mean that a customer could still get hold of the first release whilst you're busy coding for the second.

Branching in CVS

So far we've only looked at a single CVS branch, the trunk. There is only ever one "current" revision of each file, and everyone is working on the same files. However, it is possible to branch this out into more than one line of development of the same files. A good example of why this is useful is to continue the idea from the end of the last section.

Imagine the coding team are now busy working on "release 2" and the testing team come up with a major flaw in the older "release 1". Customers have already got this version installed, so it needs fixing straight away. The trick is to make a branch at the point where "release 1" was tagged, and then fix the bugs on this branch. This branch need not carry on, it could just have the few bugs fixed, whilst the main branch (the trunk) has development of "release 2" carrying on. This diagram shows this;

                |
                |    -- tag "release_1"
                |\
                | \
                | |
                | |
                | -  -- tag "release_1-fixed"
                |
                |
                |    -- tag "release_2"

As you can see the customer could then get a copy of "release_1-fixed" and not worry about the unstable code in "release_2".
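The commands behind that diagram can be sketched as follows, again using a scratch repository so nothing real is touched. The branch name "release_1-branch", the scratch paths and the import tags are my own choices for illustration. The sketch stops once the branch is checked out, and is skipped entirely if cvs isn't installed :-

```shell
if command -v cvs >/dev/null 2>&1; then
    # Throwaway repository with a one-file "source" module.
    scratch=$(mktemp -d)
    export CVSROOT="$scratch/cvsroot"
    mkdir "$CVSROOT" && cvs -Q init
    mkdir "$scratch/import" && cd "$scratch/import"
    echo "release 1 code" > main.java
    cvs -Q import -m "Initial import" source vendor start

    # Mark the current state of the module as release_1.
    cvs -Q rtag release_1 source

    # Create a branch rooted at the release_1 tag.
    cvs -Q rtag -b -r release_1 release_1-branch source

    # Check out a working copy that lives on the branch, not the trunk.
    mkdir "$scratch/fixes" && cd "$scratch/fixes"
    cvs -Q checkout -r release_1-branch source
    cd source

    # The status output names the branch as the sticky tag.
    sticky=$(cvs -Q status main.java | grep 'Sticky Tag')
    echo "$sticky"
else
    echo "cvs is not installed; nothing to demonstrate"
fi
```

From there the bug fixes are committed with "cvs commit" as usual, and they go onto the branch rather than the trunk. Running "cvs tag release_1-fixed" in that working copy then marks the fixed state shown in the diagram.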

Branches can get much more complicated than this, but it's beyond the scope of our requirements. Further info can be found on the web.

Keyword Expansion

CVS allows you to put certain keywords into a file which CVS will expand when you check files in and out of the repository. There are quite a few, and they're all pretty useful in places. Here's a quick summary;

$Author$
  The author of the latest change.
    eg. $Author: tdb $

$Date$
  The date/time of the latest change.
    eg. $Date: 2002/10/08 13:10:04 $

$Header$
  Various useful bits of information.
    eg. $Header: /usr/local/proj/co600/project/group/cvs/file.txt,v \
        1.1 2002/10/08 13:10:04 tdb Exp tdb $

$Id$
  Similar to $Header$, but without that unwieldy path.
    eg.
     $Id: cvs-2.txt,v 1.3 2002/10/08 13:10:04 tdb Exp $

$Log$
  Inserts the log message from each commit. Earlier messages are kept, so the list grows with every commit.

$Name$
  Name of the tag.
    eg. $Name: release_1 $

$Revision$
  Revision number of the file.
    eg. $Revision: 1.3 $

$Source$
  Full path to the file in the repository.
    eg. $Source: /usr/local/proj/co600/project/group/cvs/file.txt,v $

The keywords are expanded automatically by CVS, and are updated on every commit even if they have already been expanded. You just put them in and leave them be!

Online CVS Resources

The following websites provide a lot of useful information about CVS - I learnt a lot from them. I'd highly recommend anyone learning CVS take a good look, especially at the first URL.

http://cvsbook.red-bean.com
http://www.cvshome.org