Managing QuantLib source using Bazaar

QuantLib is an open source project providing a comprehensive software framework for quantitative finance. It is developed in the open through discussion on mailing lists, and the source code is managed in a centralised QuantLib SVN repository. In this article I discuss how to efficiently manage non-public modifications to the source code tree of this project. These modifications could, for example, be patches that are not yet ready for public review and integration into the central source code tree, or they could be modifications intended as permanently private extensions of the QuantLib functionality.

The simplest mechanism for managing such changes would be no mechanism at all, that is, to download the source code in the form of compressed archives and make changes directly to it. There are many reasons why this approach is unsatisfactory for all but the smallest changes:

  • It is error prone: there is no automated way to identify which files changed, when they changed and why they were changed
  • It is not possible to layer several changes on top of each other in such a way that each of them can later be separated
  • It is very inefficient for more than one person to work on the same project in this way
  • Working on several unrelated features requires several copies of the complete source tree
  • etc. In fact, all the typical arguments for source control in the first place apply, but translated to incremental changes built on top of a public source tree.

Working on a QuantLib SVN checkout does not solve any of the above problems if you do not want to release your changes to the public as soon as they are complete.

Quilt: a patch management system

Many of the problems described above can be solved through use of the patch management system Quilt: this is a system that keeps track of changes as logical units ("patches") and the order in which they need to be applied.

I will not describe Quilt in detail here, since I think the solution described below using Bazaar is more complete and probably just as simple, or simpler, in terms of the user interface. The Quilt system is, however, attractive in its minimalist approach, and its inner workings are probably simpler to understand. If you are interested in finding out more, here are the Quilt man page and a Quilt PDF manual.

Bazaar: a distributed version control system

I will concentrate on describing a workflow based on Bazaar, which is a distributed, third-generation version control system (DVCS). Other popular DVCSs are Git and Mercurial. As they have a roughly similar feature set to Bazaar, you could probably use them instead if you already prefer them. In this article, however, I will not mention them further.

The key feature of a DVCS is that there is no single central repository to which you have to connect in order to record a change to the source tree. Instead, each user can have a local database of revisions, and these databases can subsequently be merged together efficiently and safely. This feature automatically solves the problem of keeping revisions private until they are ready to be made public.

Setting up Bazaar and the QuantLib source tree

Installing Bazaar

Installing Bazaar is straightforward on both Linux and Windows. If you have Python already installed, Bazaar will run straight from its source tree, without any compilation or installation required. The best method for installation is, however, to use the pre-built packages. On many distributions of Linux, the simplest way is to install through the package manager.

Inter-operation of Bazaar and Subversion

Clearly, a highly desirable feature is that the Bazaar revision management builds on top of the information contained in the Subversion repository. That is, the full history of the main-line project and any local changes should be managed and presented together. A patch management system like Quilt does not provide this feature, since the mechanism for managing patches is completely separate from the mechanism for obtaining the main development line.

Bazaar supports this very well, and in a number of different ways (a full description of the tools is available on the Bazaar Migration page):

1.) The entire Subversion repository can be converted into a Bazaar repository as a one-time, non-incremental operation

This does not fulfil the current objectives since the mainline QuantLib development is expected to continue with Subversion.

2.) Incremental import of the Subversion branch

This could be done using one of the tools listed on the Bazaar Migration page, but it is probably easier to let somebody else handle the work. This is the route that I have taken: I have set up the Launchpad service to import the QuantLib trunk branch and periodically update it with new commits. The branch is available at https://code.launchpad.net/~vcs-imports/quantlib/trunk

3.) Accessing the Subversion branch through the bzr-svn plugin

This solution is based on the bzr-svn plugin and allows a high level of transparency and integration with the Subversion repository.

The main drawback of this approach is that it is difficult to set up: besides the bzr-svn plugin, it is necessary to install a very new or patched version of Subversion as well as the Subversion Python bindings. The details are available on the bzr-svn page.

Getting QuantLib source

Of the options given above, by far the simplest is to use the Launchpad mirror. You only need to have set up plain Bazaar; after this, getting the QuantLib tree can be as simple as

bzr branch lp:quantlib

And that is it! (For a bit more information on this process, see bzr branch.) You now have a complete history of the QuantLib trunk branch on your disk. You also have full write permission to this branch and can add new revisions as you wish.

Basics: accessing the history

Even if you never make any changes to the QuantLib code, the above approach has the benefit that you do not need network access to access history information.

For example, even in completely disconnected operation it is possible to obtain the log using the bzr log command:

cd quantlib
bzr log --short -r -10..
 7840 markjoshi     2008-07-23
      Added first cut at doing derivative of implied vol of a swaption with respect to pseudo-root elements. Also added easy to use HW approximation of swaption vol and a test for it.

 7839 markjoshi     2008-07-22
      added OrthogonalProjections class to allow finding vectors orthogonal to the rest of a collection of vectors

 7838 markjoshi     2008-07-20
      changed market model to use incremental statistics gathering.

 7837 markjoshi     2008-07-16
      pathwise vegas and deltas now have the option to do deflation inside or outside the product, and tests have been added.

 7836 markjoshi     2008-07-16
      pathwise vega tests added and are passed!

 7835 markjoshi     2008-07-16
      vega stuff actually works!

 7834 markjoshi     2008-07-16
      increased commenting

 7833 markjoshi     2008-07-16
      hopefully in final form!

 7832 markjoshi     2008-07-16
      added accessor method for Brownians, this is needed for pathwise vegas

 7831 drjoe 2008-07-13
      revert changes to else

Similarly, it is also possible to obtain annotation information (that is, the names of the people and the revision numbers that last changed each line of a file) using the bzr annotate command:

bzr annotate ql/grid.hpp
4116 lballab | /* -*- mode: c++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */
2271 lballab |
1847 sadreje | /*
2232 lballab |  Copyright (C) 2001, 2002, 2003 Sadruddin Rejeb
1847 sadreje |
             |  This file is part of QuantLib, a free-software/open-source library
             |  for financial quantitative analysts and developers - http://quantlib.org/
             |
4116 lballab |  QuantLib is free software: you can redistribute it and/or modify it
             |  under the terms of the QuantLib license.  You should have received a
             |  copy of the license along with this program; if not, please email
             |  <quantlib-dev@lists.sf.net>. The license is also available online at
6707 lballab |  <http://quantlib.org/license.shtml>.
1847 sadreje |
             |  This program is distributed in the hope that it will be useful, but WITHOUT
             |  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
             |  FOR A PARTICULAR PURPOSE.  See the license for more details.
             | */
2271 lballab |
1847 sadreje | /*! \file grid.hpp
4114 lballab |     \brief Grid constructors
1847 sadreje | */
             |
4114 lballab | #ifndef quantlib_grid_hpp
             | #define quantlib_grid_hpp
1847 sadreje |
6309 lballab | #include <ql/math/array.hpp>
1847 sadreje |
             | namespace QuantLib {
             |
3477 lballab |     Disposable<Array> CenteredGrid(Real center, Real dx, Size steps);
             |     Disposable<Array> BoundedGrid(Real xMin, Real xMax, Size steps);
4325 nando   |     Disposable<Array> BoundedLogGrid(Real xMin, Real xMax, Size steps);
1847 sadreje |
2468 lballab |     // inline definitions
             |
3614 lballab |     inline Disposable<Array> CenteredGrid(Real center, Real dx,
2682 marmar  |                                           Size steps) {
             |         Array result(steps+1);
             |         for (Size i=0; i<steps+1; i++)
             |             result[i] = center + (i - steps/2.0)*dx;
             |         return result;
             |     }
3392 lballab |
3614 lballab |     inline Disposable<Array> BoundedGrid(Real xMin, Real xMax,
2682 marmar  |                                          Size steps) {
             |         Array result(steps+1);
3477 lballab |         Real x=xMin, dx=(xMax-xMin)/steps;
2682 marmar  |         for (Size i=0; i<steps+1; i++, x+=dx)
             |             result[i] = x;
             |         return result;
             |     }
             |
4283 drjoe   |     inline Disposable<Array> BoundedLogGrid(Real xMin, Real xMax,
             |                                             Size steps) {
             |         Array result(steps+1);
             |         Real gridLogSpacing = (std::log(xMax) - std::log(xMin)) /
             |             (steps);
             |         Real edx = std::exp(gridLogSpacing);
             |         result[0] = xMin;
             |         for (Size j=1; j < steps+1; j++) {
             |             result[j] = result[j-1]*edx;
             |         }
             |         return result;
             |     }
1847 sadreje | }
             |
2271 lballab |
1847 sadreje | #endif

Basics: reverting

You can also revert the working copy of the source tree to any previous revision through the bzr revert command. For example, to go back to the last committed version, you invoke:

bzr revert

If you want to go to the revision before last, you can use the revision specifier:

bzr revert -r -2

To go back to the latest revision, you need to invoke again:

bzr revert
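The bzr revert command also accepts file names, so a single file can be looked at as it was in an older revision without touching the rest of the tree. The following is only a sketch: the helper name peek_file_at is my own invention, and it assumes that it is run from inside the branch.

```shell
# Hypothetical helper (not part of Bazaar): show how one file looked at an
# older revision, then restore it to the last committed version.
# Usage: peek_file_at REVISION FILE   (run from inside the branch)
peek_file_at() {
    rev="$1"
    file="$2"
    # Make the working copy of FILE match REVISION.
    bzr revert -r "$rev" "$file" || return 1
    # Show how the old content differs from the last committed version.
    bzr diff "$file"
    # Restore the last committed version; the old content remains in the
    # history, so no backup copy is requested.
    bzr revert --no-backup "$file"
}

# Example: peek_file_at -2 ql/grid.hpp
```

Because only the working tree is touched, the branch history is exactly the same before and after the call.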

Branching

The main motivation for adopting the model of working described here is to be able to track your own changes to QuantLib. The most obvious way of doing this is to create your own private branches. This of course is very easy to do in Bazaar. If you have already obtained the QuantLib trunk as described above, you can issue the bzr branch command as follows:

bzr branch quantlib mymods
Branched 7840 revision(s).

You can now make and record changes to mymods as you wish.

Alternatively you can branch directly from the Launchpad mirror:

bzr branch lp:quantlib mymods
Branched 7840 revision(s).
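Recording a change in the private branch could look roughly like the sketch below. The helper name record_change and the file name in the example are hypothetical; the bzr add, bzr status and bzr commit commands are standard Bazaar.

```shell
# Hypothetical helper (not part of Bazaar): record the current changes in the
# private branch as one new revision.
# Usage: record_change MESSAGE [NEW_FILE...]   (run from inside mymods)
record_change() {
    if [ "$#" -lt 1 ]; then
        echo "usage: record_change MESSAGE [NEW_FILE...]" >&2
        return 2
    fi
    msg="$1"
    shift
    if [ "$#" -gt 0 ]; then
        # Newly created files must be added before they are versioned.
        bzr add "$@" || return 1
    fi
    # Review what is about to be recorded.
    bzr status
    # The new revision is stored in the local branch only.
    bzr commit -m "$msg"
}

# Example (ql/mygrid.hpp is a made-up file name):
#   record_change "Add experimental grid" ql/mygrid.hpp
```

Nothing in this sequence contacts the Launchpad mirror or the QuantLib SVN repository, so the revision stays private.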

Merging

Changes between Bazaar branches can be merged using the bzr merge command.
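As an illustration, new upstream revisions could be brought into a private branch along the following lines. This is a sketch under assumptions: the helper name merge_upstream is hypothetical, the mirror is assumed to have been created with bzr branch (so that bzr pull knows where to fetch from), and the private branch is assumed to have no uncommitted changes.

```shell
# Hypothetical helper (not part of Bazaar): update the local trunk mirror and
# merge the new revisions into a private branch.
# Usage: merge_upstream MIRROR_DIR PRIVATE_DIR
merge_upstream() {
    mirror="$1"
    private="$2"
    # Fetch new revisions from the location the mirror was branched from.
    (cd "$mirror" && bzr pull) || return 1
    # Resolve the mirror to an absolute path before changing directory.
    mirror_abs=$(cd "$mirror" && pwd) || return 1
    # Merge into the private working tree; a failure here (for example,
    # conflicts) stops the helper so the problem can be fixed by hand.
    (cd "$private" && bzr merge "$mirror_abs") || return 1
    echo "Merged. Review with 'bzr status', then run 'bzr commit' in $private."
}

# Example: merge_upstream quantlib mymods
```

The merge only changes the working tree; it becomes part of the private branch history once it is committed.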

Repositories

In the above branching example, each directory with a branch of QuantLib contained its own complete history information. If you have many branches, this of course adds up to a lot of redundant information and therefore wasted disk space (currently, for QuantLib, about 60 MB per branch). Therefore, if you are planning to have a number of branches, it is best to create a repository. The relevant Bazaar command is bzr init-repo; for example:

bzr init-repo ql-repo

This will create a directory ql-repo, which is a Bazaar repository. Any branches created within this repository directory will share the same storage area.

There is more on repositories in the Repository Tutorial.
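A possible way of populating such a repository is sketched below. The helper name setup_ql_repo and the branch names trunk and mymods are my own choices for illustration; the source defaults to the Launchpad mirror.

```shell
# Hypothetical helper (not part of Bazaar): create a shared repository that
# holds a trunk mirror and one private branch.
# Usage: setup_ql_repo REPO_DIR [TRUNK_SOURCE]
setup_ql_repo() {
    repo="$1"
    upstream="${2:-lp:quantlib}"   # defaults to the Launchpad mirror
    bzr init-repo "$repo" || return 1
    (
        # Branches created below this directory share one revision store.
        cd "$repo" || exit 1
        bzr branch "$upstream" trunk || exit 1
        # The second branch adds almost no history data of its own.
        bzr branch trunk mymods
    )
}

# Example: setup_ql_repo ql-repo
```

The directory changes happen in a subshell, so the caller's current directory is left as it was.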

Advanced features

Checkouts and branches without working-trees

QuantLib is both large and relatively slow to compile. Both of these factors tend to make developers reluctant to make topic branches, since in a normal workflow this would create a new source-tree which would then have to be re-compiled from scratch. Both of these problems can be avoided by using the bzr checkout and bzr switch commands, thus making extremely low-cost branching possible.

There are three key concepts in this arrangement:

Checkout

A checkout is a mechanism for synchronising changes (that is, commits) between several locations (further information is available in the Checkout Tutorial and on the bzr checkout help page). For a commit to succeed in a checkout, it must also succeed on the branch that the checkout is bound to.

One of the reasons for using the checkout feature is to reproduce a centralised workflow similar to what CVS and SVN support.

Branches without working-trees

Normally a bzr branch is a directory that contains a ".bzr" sub-directory, which holds the revision control information, together with the other files and directories that you are actually version controlling: this is the working tree. The working tree is not necessary for the purposes of revision control, since it can always be recreated from the history information.

Therefore, it is quite possible to have a Bazaar branch that does not have a working tree: this can be done on an existing branch using the bzr remove-tree command.

The advantage of this becomes apparent if you have very many branches: each QuantLib working tree consumes about 22 MB of disk space, so it can add up. Note, however, that if you have many branches you should definitely also be making use of the repository feature as described above (see bzr init-repo).

The switch command

The bzr switch command associates a checkout with a different branch and updates the working tree according to the new association.

Putting it all together

The idea is to have as many branches as you need, all without working trees. You then do the actual compilation, hacking and testing in one single checkout. When you need to work on a different branch, you can make use of the bzr switch command.

In the above example with a repository containing the trunk branch this would correspond to something like:

bzr branch trunk fix-bug-no1001
# Delete the working tree
(cd fix-bug-no1001; bzr remove-tree)
bzr branch trunk fix-bug-no1002
# Delete the working tree
(cd fix-bug-no1002; bzr remove-tree)

bzr checkout trunk active
cd active
# configure, compile, confirm bug #1001
# ...

# OK, bug is confirmed. Want to make a fix on branch fix-bug-no1001
bzr switch ../fix-bug-no1001

# Now we are working on branch fix-bug-no1001

# make fix....
bzr commit
#  ...

# OK, done. Now bug #1002
bzr switch ../fix-bug-no1002

# Make fix.
# ....

There are several advantages of the above workflow:

  • When switching branches, only the files that actually differ between the branches are modified. Therefore, re-compilation can take dramatically less time than a new compile from scratch.
  • As a consequence of the above, no files are changed at the branch point itself, i.e., there is no cost associated with deciding to make a branch at the current point of development.
  • Only one working tree is present on disk at any one time, reducing disk requirements.

Bazaar Looms

Bazaar looms are a mechanism for layering a chain of dependent changes in a coherent and logical way. They are particularly suitable when tracking a large upstream tree while introducing significant changes of your own.

See the loom how-to and the loom project home page for more information.
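To give a flavour of how a loom is started, here is a rough sketch. It assumes that the bzr-loom plugin is installed; the helper name start_loom and the thread name in the example are hypothetical, and the loom how-to remains the authoritative description of the commands.

```shell
# Hypothetical helper; assumes the bzr-loom plugin is installed.
# Usage: start_loom THREAD_NAME   (run from inside a private QuantLib branch)
start_loom() {
    thread="$1"
    # Name the bottom thread, which will track the upstream tree.
    bzr nick upstream || return 1
    # Convert the branch into a loom.
    bzr loomify || return 1
    # Create a new thread layered on top of the upstream thread.
    bzr create-thread "$thread" || return 1
    # List the threads in the loom and show which one is current.
    bzr show-loom
}

# Example (the thread name is made up): start_loom my-private-extension
```

Commits made after this point are recorded in the new thread, while the upstream thread stays a clean copy of the QuantLib trunk.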