Category: linux

Xrandr solved my linux docking station woes

I had to switch from the proprietary nvidia drivers to the vanilla 'nv' driver, but that enabled xrandr and now I can take my T61 (running Ubuntu Hardy) on and off my docking station -- without restarting X!
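
For anyone wanting to try the same thing, here is a rough sketch of the xrandr calls involved. The output names LVDS and VGA are assumptions (they vary by driver and hardware; `xrandr -q` lists the real ones), and the dock/undock function names are made up for illustration.

```shell
# Assumed output names; override them to match what `xrandr -q` reports.
INTERNAL=${INTERNAL:-LVDS}
EXTERNAL=${EXTERNAL:-VGA}

# Docked: switch the external monitor on at its preferred mode,
# placed to the right of the laptop panel.
dock() {
    xrandr --output "$EXTERNAL" --auto --right-of "$INTERNAL"
}

# Undocked: switch the external output off and keep the panel on.
undock() {
    xrandr --output "$EXTERNAL" --off --output "$INTERNAL" --auto
}
```

Run dock after seating the laptop and undock before lifting it off.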


Git Backstage, Under the Covers and on the Down Low

Don't tell my employer, but I'm an undercover git user. Officially, my company uses Clearcase UCM - but I do almost all my coding outside Clearcase. I use git behind the official version control system. I'll tell you why I risk a scolding from my boss[1] and the IT department for using an unsanctioned tool. But to do that, I'll have to tell you a bit more about both Clearcase[2] and git.

Clearcase Basics

Clearcase is a centrally managed, client-server type system -- a big server hosts the repository which tracks all the project's info: individual files, the directory tree, versions, branches, users, permissions, Social Security numbers, DNA sequences, etc. The client (that's you) gets a working copy of the project files and tools to interact with the server. Checking out, checking in and viewing history all require connecting to the server. You can't do a thing without being tethered to the central repository.

In the Clearcase UCM development model, programmers create new "streams" ("branches" in generic terms), each having an "activity" (which is Clearcase talk for "changeset"). When you check in a file, it's recorded against an activity. Therefore, an activity is a set of checked-in files (and directory changes). Other developers on the project won't see your changes until you "deliver" (merge) your activity (changeset) to the parent stream (the main branch). End of vocabulary lesson.

Where Clearcase Falls Down

My biggest issue with the Clearcase model is that it inhibits both experimental and iterative programming. I assert the following:

  1. Experimental coding requires branching. Let's hope this isn't a controversial statement. The only arguments people make against branching are practical - their VCSs don't support it well. Or they don't support merging well. Clearly, if you had a VCS that could branch and merge well, you'd use it all the time.
  2. Clearcase discourages experimental branching as streams are centralized, limited and public. On my project, creating "unnecessary" experimental streams is discouraged as each stream incurs the maintenance penalty of using more resources on the server.
  3. Merging in Clearcase is also less than perfect. Instead of a patch-based approach, Clearcase restricts merges to streams that have a common baseline. And guess who has to create, maintain and recommend these baselines? You, the user. Want to merge between streams that have diverged - that don't share an exact baseline? Forget it: you have to go outside Clearcase to do it.
  4. Finally, Clearcase activities don't allow logical commits within an activity - a checkin is limited to just one file. You cannot check in four files as a group representing a logical change. This makes it hard to both organize your work and revert multi-file changes if they don't work out.

Let's see if git could help. (Ok, it can - or else why would I continue writing?)

Git Comes (Quietly) to the Rescue

Git is the distributed VCS used by Linus to maintain the Linux kernel. Where Clearcase requires a centralized server, git requires none -- the repository is stored in a single hidden directory at the top of your project's file tree. You can create a git repository from an existing project very easily:

cp -R /my/clearcase/project /my/git/project
cd /my/git/project
git init
git add .
git commit

A git repository is local and under your complete control. You don't need anybody's permission to create one. And obviously you don't need network connectivity to a centralized server. You won't need to file a single TPS report to set up a git repository. And, if needed, you can keep your repo a secret.

What was I complaining about? Oh yes, branching and logical commits.

Branching is simple and lightweight in git. Let's say you're working in your 'bug_fix' branch, and you decide that the fix could be simplified by moving some methods from class Foo to class Bar. Here's how you'd create a new branch called 'foo_refactoring', make some changes in it, and then merge those changes back into the 'bug_fix' branch:

cd /my/git/project
git checkout bug_fix
git checkout -b foo_refactoring
# make some changes to Foo and Bar, compile, and test
git add Foo.java Bar.java
git commit -m 'Refactored Foo'
git checkout bug_fix
git merge foo_refactoring

If you decided the refactoring was unnecessary, you could have skipped the merge -- or even permanently removed the experimental branch. The branch was created quickly -- just to try out some ideas -- and it can be ignored or removed. Branching is up to you, not the sysadmin.
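
Here's a throwaway sketch of the "never mind" case, run in a scratch repository so nothing real gets touched. The temp directory, file contents and demo identity are all made up for the example.

```shell
# Scratch repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # same branch name on any git version
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'class Foo {}' > Foo.java
git add Foo.java
git commit -q -m 'Initial import'

# Branch off bug_fix and try the experiment.
git checkout -q -b bug_fix
git checkout -q -b foo_refactoring
echo 'class Foo { /* experiment */ }' > Foo.java
git commit -q -a -m 'Refactored Foo'

# Change of heart: return to bug_fix and delete the experiment.
git checkout -q bug_fix
git branch -D foo_refactoring    # -D removes a branch even if it was never merged
git branch --list                # only bug_fix and master remain
```

Foo.java is back to its pre-experiment contents as soon as you check out bug_fix; deleting the branch just tidies up.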

Finally, git lets you build logical changesets. As you can see in the foo_refactoring example, a commit can contain any number of file changes. You can build a new feature piece-by-piece, committing chunks of related work together. This is good for both you and your reviewers!

So, how are you going to use git behind your VCS?

Getting it to Git

Getting your Clearcase (or CVS, Subversion, Perforce, etc) code into git is easy: copy your working dir to some local drive space and do the "git init; git add .; git commit" sequence.

Dealing with rebases is easy too. Other users have probably made changes to the codebase (in Clearcase) and you'll need to merge your work with theirs before merging to the main stream. You can handle this by rsyncing the upstream code into a git branch, then using git rebase to replay your development branch on top of that code - fixing any conflicts as necessary. Git rebase basically pops your current commits off your branch, moves the branch to the tip of the requested branch, then re-applies your commits. It helps keep the history of your changes simple. I recommend doing all development work in a sub-branch off master (master is git's default branch) and keeping master for rebases. For example:

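cd /my/clearcase/project
cleartool rebase -recommended   # or 'cvs up' or whatever
cd /my/git/project
git checkout master
# trailing slashes: copy the directory's contents, not the directory itself
rsync -r /my/clearcase/project/ /my/git/project/
git add -A                      # also picks up files that are new upstream
git commit -m 'rebased from clearcase'
git checkout dev_branch
git rebase master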
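
If you want to watch what rebase does before trusting it with real work, this sketch replays the same steps in a scratch repository. The file names and contents are invented, and an edited file stands in for the rsync from Clearcase.

```shell
# Scratch repository standing in for /my/git/project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'upstream v1' > upstream.txt
git add upstream.txt
git commit -q -m 'import from clearcase'

# Local work happens on dev_branch.
git checkout -q -b dev_branch
echo 'my fix' > fix.txt
git add fix.txt
git commit -q -m 'my fix'

# Upstream moves on; this edit stands in for the rsync step.
git checkout -q master
echo 'upstream v2' > upstream.txt
git add -A
git commit -q -m 'rebased from clearcase'

# Replay the local commit on top of the refreshed master.
git checkout -q dev_branch
git rebase -q master
git log --format=%s   # newest first: my fix, rebased from clearcase, import from clearcase
```

After the rebase, dev_branch contains the upstream change with the local fix sitting on top of it, as if the fix had been written after the upstream update.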

Getting it back to Clearcase

Git has made my daily coding much nicer - but if I want my code built into my product, I still have to get it back to Clearcase.

I find the simplest and safest way to do this is by applying a series of patches to the Clearcase controlled working dir. You can ask git to generate a patch for a single commit by diffing it against its parent:

git diff 345983^ 345983 > my-change.patch

But if you give git diff a branch name instead, it generates one patch covering everything that differs between that branch and your checked-out work. (If you'd rather have one patch file per commit, git format-patch does that.) Assuming your master branch represents the rebased Clearcase branch and your dev branch has been 'git rebased' to master, this is exactly what you need!

cd /my/git/project
git checkout bug_fix
git diff master > bug_fix.patch
cd /my/clearcase/project
# checkout any files if necessary
# -p1 strips the a/ and b/ prefixes that git diff puts on file names
patch -p1 < /my/git/project/bug_fix.patch
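
If you'd rather carry your work over one commit at a time, git format-patch writes a numbered patch file per commit. Below is a sketch in scratch directories; the "clearcase" directory is just a plain folder standing in for the Clearcase view, and all names and contents are invented.

```shell
# Scratch layout: a plain "clearcase" dir and a git copy of it.
work=$(mktemp -d)
mkdir "$work/git" "$work/clearcase" "$work/patches"
echo 'line one' > "$work/clearcase/Foo.java"
cp "$work/clearcase/Foo.java" "$work/git/"

cd "$work/git"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git add . && git commit -q -m 'import from clearcase'

# Two separate commits on the bug_fix branch.
git checkout -q -b bug_fix
echo 'line two' >> Foo.java && git commit -q -a -m 'Fix part 1'
echo 'line three' >> Foo.java && git commit -q -a -m 'Fix part 2'

# One numbered patch file per commit that bug_fix has and master lacks.
git format-patch -q -o "$work/patches" master..bug_fix

# Apply them in order to the stand-in Clearcase directory.
cd "$work/clearcase"
for p in "$work"/patches/*.patch; do
    patch -s -p1 < "$p"
done
cat Foo.java
```

Applying the patches one by one makes it easier to see which commit is responsible if something stops building halfway through.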

Build, test and submit in Clearcase - you're done!

Final Thoughts

Having a full understanding of your VCS's data model is essential to using it correctly. Perhaps the root cause of why I prefer git over almost any other system is its simple conceptual model (Git for Computer Scientists does a nice job explaining the data model). Clearcase is typical enterprise software -- its feature sheet is very long and highlights words that CIOs love like "reliable", "maintainable", and "support contract", but the documentation is thrifty when discussing the system's internals. Version control is too essential and too difficult to trust to a system you don't understand. So I use git - behind the scenes if necessary.

Update (2008/8/5): As several commentators have noticed, if you can use Clearcase snapshot views instead of dynamic views, then you can directly init a git repo in the view storage directory. Then you can use git directly out of that directory or clone it. By creating the git repo directly in your snapshot view, you can substitute the rsync step with a "git pull" and the patch step with a "git push". This is a big win. Alas, my employer requires dynamic views due to tool limitations, so I didn't discuss this method in the original entry.


[1] Hi Boss! What I'm discussing here is not really any less safe than having un-committed work in any working dir (which everybody does). But the threat of a knuckle slapping adds some drama to this post - don't you think? ;-)

[2] Almost everything discussed in this post is true of any centralized version control system (CVS, Subversion, etc), not just Clearcase.


How to fix Subversion errors after upgrading your Berkeley DB library

After a routine "apt-get upgrade" of Debian testing, I found myself unable to use my Subversion repository. I got an error message when trying to commit a file:

svn: Berkeley DB error while opening environment for filesystem db:
DB_VERSION_MISMATCH: Database environment version mismatch
svn: bdb: Program version 4.3 doesn't match environment version

A note from the Subversion FAQ had this to say:

After upgrading to Berkeley DB 4.3, I'm seeing repository errors.

Normally one can simply run svnadmin recover to upgrade a Berkeley DB repository in-place. However, due to a bug in the way this command invokes the db_recover() API, this won't work correctly when upgrading from BDB 4.0/4.1/4.2 to BDB 4.3.

Use this procedure to upgrade your repository in-place to BDB 4.3:

  • Make sure no process is accessing the repository (stop Apache, svnserve, restrict access via file://, svnlook, svnadmin, etc.)
  • Using an older svnadmin binary (that is, linked to an older BerkeleyDB):
    1. Recover the repository: 'svnadmin recover /path/to/repository'
    2. Make a backup of the repository.
    3. Delete all unused log files. You can see them by running 'svnadmin list-unused-dblogs /path/to/repository'
    4. Delete the shared-memory files. These are files in the repository's db/ directory, of the form __db.00*

The repository is now usable by Berkeley DB 4.3.

As the instructions note, you need a copy of Subversion linked with a pre-4.3 version of the Berkeley database library. Subversion uses Berkeley via the APR Library. So we need to install appropriate versions of Berkeley, APR and Subversion.

My notes are below. Note that I installed the APR and Subversion software into a local directory (/home/mk/proj/svn_db/local in my case). Also, my Subversion repository is in /data/svnroot.

# export LD_LIBRARY_PATH=/home/mk/proj/svn_db/local/lib:/usr/local/BerkeleyDB.4.2/lib

# wget 'http://downloads.sleepycat.com/db-4.2.52.tar.gz'
# tar -xvzf db-4.2.52.tar.gz
# cd db-4.2.52
# cd build_unix
# ../dist/configure
# make
# make install

# wget 'http://archive.apache.org/dist/apr/apr-0.9.5.tar.gz'
# tar -xvzf apr-0.9.5.tar.gz
# cd apr-0.9.5
# ./configure --prefix=/home/mk/proj/svn_db/local
# make
# make install

# wget 'http://archive.apache.org/dist/apr/apr-util-0.9.5.tar.gz'
# tar -xvzf apr-util-0.9.5.tar.gz
# cd apr-util-0.9.5
# ./configure --prefix=/home/mk/proj/svn_db/local --with-apr=/home/mk/proj/svn_db/local --with-berkeley-db=/usr/local/BerkeleyDB.4.2/

# make
# make install

# wget 'http://subversion.tigris.org/downloads/subversion-1.2.3.tar.bz2'
# tar -xvjf subversion-1.2.3.tar.bz2
# cd subversion-1.2.3
# ./configure --prefix=/home/mk/proj/svn_db/local --with-apr=/home/mk/proj/svn_db/local --with-berkeley-db=/usr/local/BerkeleyDB.4.2/
# make
# make install

# su
# /home/mk/proj/svn_db/local/bin/svnadmin recover /data/svnroot
# tar -cvf ~/svnroot_backup.tar /data/svnroot

Then I executed steps 3 and 4 from the FAQ. At this point, I was able to commit files to my repository again.
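
For the record, steps 3 and 4 can be sketched roughly like this. The function names are made up, the svnadmin path is the locally built one from my notes, and this assumes the backup from step 2 already exists, since both steps delete files.

```shell
# The old svnadmin built above; override OLD_SVNADMIN if yours lives elsewhere.
OLD_SVNADMIN=${OLD_SVNADMIN:-/home/mk/proj/svn_db/local/bin/svnadmin}

# Step 3: delete the log files the old svnadmin reports as unused.
remove_unused_logs() {
    repo=$1
    "$OLD_SVNADMIN" list-unused-dblogs "$repo" | while IFS= read -r log; do
        rm -f -- "$log"
    done
}

# Step 4: delete the shared-memory files (db/__db.00*).
remove_shared_memory_files() {
    repo=$1
    rm -f -- "$repo"/db/__db.00*
}
```

With my layout that means running remove_unused_logs /data/svnroot followed by remove_shared_memory_files /data/svnroot.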


SSH Session Multiplexing

I have a new favorite ssh feature! Not that password-less public key authentication, port forwarding or X11 forwarding weren't really cool. But session multiplexing is really sweet.

Introduced in OpenSSH 3.9, session multiplexing is a faster way to run multiple ssh sessions to a single remote host. When you log in to a machine (call it 'remotehost') the first time, you tell that ssh session to become the "ControlMaster" (-M option).

# ssh -M -S ~/.ssh/remote-mux user@remotehost

Ssh will start a session as usual and also open up a Unix domain socket using the filename you provide in the ControlPath argument (-S option).

To start another ssh session to the same host, you can do:

# ssh -S ~/.ssh/remote-mux user@remotehost

Ssh will skip authentication (as you've already auth'd to remotehost) and will use the existing TCP connection for the second connection. The upshot of this is that logging in a second (or third!) time is instantaneous.

Adding the -S and -M options on the command line is tedious. You can set up your .ssh/config file like this:

Host remotehost
   HostName remotehost
   User user
   ControlMaster yes
   ControlPath ~/.ssh/remote-mux

Host remotehostfast
   HostName remotehost
   User user
   ControlMaster no
   ControlPath ~/.ssh/remote-mux

For your first connection, do:

# ssh remotehost

For your subsequent connections, do:

# ssh remotehostfast
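
Newer OpenSSH releases also accept ControlMaster=auto, which removes the need for two host aliases: the first connection becomes the master and later ones reuse its socket. A sketch, with made-up function names and a socket path of my own choosing:

```shell
# One socket per destination: %r, %h and %p expand to the
# remote user, host and port.
MUX_PATH="$HOME/.ssh/mux-%r@%h:%p"

# Connect, becoming the master if no usable socket exists yet.
mux_ssh() {
    ssh -o ControlMaster=auto -o "ControlPath=$MUX_PATH" "$@"
}

# Ask whether a master connection to the host is still running.
mux_check() {
    ssh -o "ControlPath=$MUX_PATH" -O check "$@"
}

# Tell the master connection to shut down.
mux_exit() {
    ssh -o "ControlPath=$MUX_PATH" -O exit "$@"
}
```

Usage would be mux_ssh user@remotehost for every login, with mux_check user@remotehost and mux_exit user@remotehost for housekeeping.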


Linux App of the Day ( x 2 )

I've discovered two new applications today and I'm in love with them both.

First, Firestarter. It's an elegant GUI application (GTK) to configure and monitor a Linux firewall. It manipulates iptables rules under the hood, of course. I'm finding it most useful on my desktop machines (and laptops) where my firewalling needs are simple.

Second, Vim Outliner. Vim is my long-time favorite text editor. And vimoutliner provides some nice keyboard macros to do simple outlining. It's going to make maintaining my TODO list much quicker. Bonus: it comes with a plugin to add checkboxes to your outlines.


lsof -i

Hot dog! Two unix posts in one day!

I just wanted to share my joy in discovering the -i switch of lsof. The argument to -i can be a protocol (TCP or UDP), a hostname or IP address (prefixed with @), a port number (prefixed with :), or any combination of these. The result: a nice listing of the processes with connections that match your criteria.
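
A few ways to put it to work, wrapped in small helper functions whose names I made up. The -n and -P switches are my own addition; they skip hostname and port-name lookups so the listing comes back faster.

```shell
# Processes with connections on a given TCP port.
on_port() {
    lsof -n -P -i "TCP:$1"
}

# Processes talking to a given host (name or IP address).
to_host() {
    lsof -n -P -i "@$1"
}

# Combination: host, port and protocol (TCP unless told otherwise).
to_host_port() {
    lsof -n -P -i "${3:-TCP}@$1:$2"
}
```

For example, on_port 22 lists anything holding an ssh connection, and to_host_port 192.168.1.10 53 UDP narrows things down to DNS traffic to that (made-up) address.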


You screen, I screen, We all screen for ... console multiplexing

I've recently started using the screen unix program. It lets you multiplex several processes (think shells) in a single window. Even better, you can 'detach' the screen process from your current window and log out of your machine. Upon returning, run screen -r to get your whole screen session back.
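
The detach and reattach cycle looks roughly like this when sessions are named. The function names and the session name in the usage note are invented for the example.

```shell
# Start a command in a new, named session that is detached from the start.
start_detached() {
    name=$1
    shift
    screen -d -m -S "$name" "$@"
}

# List running sessions and whether each is attached or detached.
list_sessions() {
    screen -ls
}

# Reattach to a named session.
reattach() {
    screen -r "$1"
}
```

For instance, start_detached build make all kicks off a long build, and reattach build checks on it later, even after logging out and back in.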

Here's my personal ~/.screenrc

# no startup msg
startup_message off

# new windows go to home dir
chdir

# keep some status lines on bottom of screen
caption always "%{= bb}%{+b w}%n %h %=%t %c"
hardstatus alwayslastline "%-Lw%{= BW}%50>%n%f* %t%{-}%+Lw%<"

activity "Activity in %t(%n)"

# Create 3 more screens.  They will exit once the designated
# program quits.
screen -t misc  0
screen -t mutt  1 mutt
screen -t www   2 lynx -book


# move to the main shell
select 0


Advanced Thunderbird Filtering Extension?

I've recently switched from using Mutt to Thunderbird at work. The only feature I really and truly miss is the ability to run external programs as part of the message filtering process - just like good old procmail used to let me do (or Perl's Mail::Audit). Thunderbird lets me sort messages into folders, but I need to do things like:
  • Modify the body of a message
  • Add a header
  • Run another program based on the contents of the message
  • Bounce the message to another address

Does anybody have an extension for Thunderbird to let me run Perl or Python programs on an incoming message? No? Am I going to have to write this myself?

