I've always been enthralled by those code visualisation videos produced with gource, and recently I had the opportunity to generate one for the code base of our new product at the company I work for. Unfortunately I can't show you that one, but I was inspired to generate one for my own code.
For my video I took a slightly different route. Rather than concentrate on a single project, I modified this script to scan for all Git repositories under the working directory and create a combined log file (with duplicate repos removed) that can be fed into gource (you can find my modified version here). I wanted to see if it would reveal anything about my working habits and my approach to projects (when it comes to my personal projects at least). Here is the result: VIDEO
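The combining step described above can be sketched roughly as follows. This is not the script linked above, just a minimal Python illustration of the same idea. It assumes that `gource --output-custom-log -` writes a repository's history to standard output in gource's pipe-delimited custom format (timestamp|user|type|path), and that two checkouts count as duplicates when they share the same root commit. That duplicate test and all the function names are my own assumptions for illustration.

```python
import os
import subprocess


def find_repos(root):
    """Yield every directory under root that contains a .git entry."""
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames or ".git" in filenames:
            yield dirpath
            dirnames[:] = []  # do not descend into the repository itself


def repo_identity(repo):
    """Assumed duplicate test: clones of one repo share the same root commit(s)."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--max-parents=0", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return tuple(sorted(out.split()))


def gource_log(repo):
    """Ask gource for the repo's history in its pipe-delimited custom format."""
    out = subprocess.run(
        ["gource", "--output-custom-log", "-", repo],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


def prefix_paths(lines, name):
    """Move every file path under /<name> so each repo gets its own branch."""
    result = []
    for line in lines:
        fields = line.split("|")
        if len(fields) < 4:
            continue  # skip anything that is not timestamp|user|type|path
        fields[3] = "/" + name + "/" + fields[3].lstrip("/")
        result.append("|".join(fields))
    return result


def merge_logs(logs):
    """Combine per-repo logs into one timeline ordered by timestamp."""
    combined = [line for log in logs for line in log]
    return sorted(combined, key=lambda line: int(line.split("|", 1)[0]))


def build_combined_log(root):
    """Scan root, skip duplicate repos, and return one merged gource log."""
    seen, logs = set(), []
    for repo in find_repos(root):
        try:
            identity = repo_identity(repo)
        except subprocess.CalledProcessError:
            continue  # e.g. a repository with no commits yet
        if identity in seen:
            continue
        seen.add(identity)
        name = os.path.basename(os.path.abspath(repo))
        logs.append(prefix_paths(gource_log(repo), name))
    return merge_logs(logs)
```

Writing the lines returned by `build_combined_log(".")` to a file, one per line, should give a single log that gource can play back. Prefixing each path with the repository name is what keeps the projects apart in the resulting tree.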
Unfortunately this is not an entirely accurate representation. I've been running private repositories since 2000/2001 or so, and I've moved from CVS to Subversion to Git between then and now. When I imported my CVS repos into Subversion I lost almost all my history. I managed to keep all the history I'd built up in Subversion when I migrated to Git, but there was another problem: whenever I pulled in an external project to use, it was simply imported into the Subversion repository as if it were a completely new set of files, with no historical change information associated with it.
Turns out that gource doesn't really cope well with a log file that includes something the size of the Linux kernel or the Yocto Project suddenly appearing over the space of a few minutes. There were two projects that had large dependencies like this (I was experimenting with building custom Linux distros for the BeagleBoard), so I simply removed them from the tree.
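One way to do that kind of pruning without touching the repositories themselves is to filter the combined log before handing it to gource. The sketch below assumes the log is in gource's pipe-delimited custom format (timestamp|user|type|path). The function name and the directory names in the usage note are hypothetical, not the ones from my own tree.

```python
def drop_paths(lines, excluded):
    """Keep only log lines whose path lies outside every excluded directory."""
    # Normalise each excluded directory so it ends with exactly one slash;
    # this stops "/a/linux" from also matching "/a/linux-notes".
    prefixes = tuple(p.rstrip("/") + "/" for p in excluded)
    kept = []
    for line in lines:
        fields = line.split("|")
        # The path is the fourth field of a custom-format log line.
        if len(fields) >= 4 and (fields[3] + "/").startswith(prefixes):
            continue
        kept.append(line)
    return kept
```

For example, `drop_paths(log_lines, ["/beagle/linux", "/beagle/yocto"])` would drop every entry under those two (made-up) directories and leave the rest of the timeline untouched.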
So, what did this reveal about my habits when it comes to personal projects? Well, it's not pretty. Projects that appear in a sudden flurry of activity only to remain untouched for long periods, sudden changes in direction that result in the deletion of a huge number of files, bouncing from project to project for no obvious reason, and long periods of inactivity. Chaos, in other words.
The periods of inactivity can be explained away by time pressure from my day job. The rest of the behaviour seems to indicate that I use my personal projects more as a learning activity (new technologies, new tools, etc.) than as things with specific goals. This is not necessarily a bad thing (I would recommend that anyone working as a developer spend a lot of time doing the same; being able to apply things I've learnt while playing with them in my own time to my paid job has been a real boon to my career), but I guess it means I'm never going to be an internet millionaire :)
As a comparison I generated a video showing my work on this website, which is at least one project that gets steady work done on it and has been maintained over a reasonable period of time. Here's what it looks like:
At the beginning you can see that I was experimenting with a number of different tools before I finally settled on a static site generated by Nikola, with Twitter Bootstrap handling the theme and layout. After that it settles down into general site updates (new posts, new images) and the occasional tweak to the styles and the site generator itself. A little while ago I went through and purged everything that was left over from my initial experiments, and now I'm left with a nice manageable codebase that does everything I want.
This is how a project should look, I guess: some dancing around while the best solutions are found, until it finally settles into a stable configuration that just requires minor updates from that point on. I'd love to regenerate the video in another six months' time and see how my theory pans out. The last bit of that video should be fairly boring if I'm right, with only minor changes to a well-established codebase.
I highly recommend using gource for visualisation. Apart from the pure aesthetics, there's something about visual presentation of data that makes it easier to spot patterns and understand larger trends. Because gource will accept a generic log format you can convert just about any time-based data into an input for it, not just source repositories. I've already got a few ideas about things I want to try to feed into it (more small projects to bloom and remain idle forever after).
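To give an idea of how little is needed for that conversion, here is a minimal Python sketch that turns arbitrary timestamped events into gource's pipe-delimited custom log format (a Unix timestamp, a user name, a type of A, M or D, a path, and an optional hex colour). The event tuples, the action names and the function names are all made up for illustration; only the output format comes from gource.

```python
from datetime import datetime, timezone

# Map friendly action names (an assumption of this sketch) to gource's types.
ACTIONS = {"added": "A", "modified": "M", "deleted": "D"}


def to_gource_line(when, who, action, path, colour=None):
    """Format one event as timestamp|user|type|path, plus an optional colour."""
    fields = [
        str(int(when.timestamp())),           # Unix timestamp in seconds
        who.replace("|", "/"),                # '|' is the field separator
        ACTIONS[action],
        "/" + path.strip("/").replace("|", "/"),
    ]
    if colour:
        fields.append(colour)                 # hex colour such as "FF0000"
    return "|".join(fields)


def to_gource_log(events):
    """Convert (when, who, action, path) tuples into time-ordered log lines."""
    ordered = sorted(events, key=lambda event: event[0])
    return [to_gource_line(*event) for event in ordered]


# Example events; the names and paths here are purely illustrative.
events = [
    (datetime(2013, 1, 2, tzinfo=timezone.utc), "alice", "modified", "wiki/home"),
    (datetime(2013, 1, 1, tzinfo=timezone.utc), "bob", "added", "wiki/home"),
]
log_lines = to_gource_log(events)
```

The timestamps should be timezone-aware, otherwise Python interprets them in local time. Saved one line per file row, the result can be given to gource in place of a repository; as far as I know the `--log-format custom` option tells it explicitly how to read the file.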