Saturday, August 30, 2008

Build systems, and work flow thoughts

Long time, no post. No worries though, there are new tasks and new thoughts associated with them. Here are a couple.

First is a general problem associated with build systems. For some reason, it appears that dependency resolution and build/package/deploy scripting always get mixed up in build systems. I'm creating one now for a large Java system and the fact is that all the available tools and styles leave me unsatisfied.

A basic system might keep its dependencies in the source tree (libraries and so on committed alongside the code) with ant scripting. This is functional, but for systems with multiple modules, like the one I'm working with, it results in a lack of fine-grained library control, library clashes, or library redundancies, all of which are maintenance inefficiencies. At the ant level you end up with either one large, complicated ant file that is hard to maintain or many sub-build files that may duplicate each other.
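To make the clash and redundancy problem concrete, here's a minimal sketch in Python. It is not part of any real build; the module names, library names and versions are all made up for illustration. It just walks a map of "which module commits which library version" and reports the two kinds of trouble described above.

```python
from collections import defaultdict


def find_library_issues(modules):
    """Report clashes and redundancies among committed libraries.

    modules maps a module name to {library name: version}.
    All names and versions used with this function here are hypothetical.
    """
    # library -> version -> list of modules that commit that version
    usage = defaultdict(lambda: defaultdict(list))
    for module, libs in modules.items():
        for lib, version in libs.items():
            usage[lib][version].append(module)

    clashes = {}       # same library, different versions across modules
    redundancies = {}  # same library and version committed more than once
    for lib, versions in usage.items():
        if len(versions) > 1:
            clashes[lib] = {v: sorted(m) for v, m in versions.items()}
        for version, users in versions.items():
            if len(users) > 1:
                redundancies[(lib, version)] = sorted(users)
    return clashes, redundancies


# Hypothetical three-module system with libraries committed per module.
modules = {
    "core":    {"commons-lang": "2.3", "log4j": "1.2.15"},
    "web":     {"commons-lang": "2.4", "log4j": "1.2.15"},
    "reports": {"commons-lang": "2.4"},
}
clashes, redundancies = find_library_issues(modules)
print(clashes)
print(redundancies)
```

Sharing one library directory across all modules removes the redundancies but gives up per-module version control, which is the trade-off I keep running into.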

A more "modern" system would use maven for dependency resolution and build. The problem as I see it is that while you eliminate the operational inefficiency associated with library placement, you add a large maintenance inefficiency: you were already using a source control system, and now you also have a fair bit of work to do maintaining your maven repository in order to back up your dependency declarations. Further, scripting the actual compilation and packaging of your artifacts can be done in maven but is not elegant, to my eyes.

I have heard that ivy+ant can help, but ivy still resolves declared dependencies against a repository (commonly a maven-style one), so the cost of maintaining a repository remains, and scripting the build in ant brings back at least the scripting cost incurred in the basic build system.

Which all leaves me honestly feeling that, from a pure dependency resolution and artifact production perspective, the basic system with libraries in the source tree and ant scripting for artifact production is still the global optimum for build systems. That just can't be the case, can it? I'd love to hear otherwise.

The only saving grace I'm aware of with maven is that there are enough value-adding plugins (e.g. IDE configuration generators, static analysis tools) that maven proponents assert the value of a maven system is large enough globally to overcome the local added cost of maintaining the repository. While I will grant that there are a large number of value-adding plugins, I'm not convinced that the same value couldn't be captured with what I would wager is a simpler system, substituting equivalent ant tasks for those maven plugins.

The internal debate goes on, though I am committed to using maven for the system. At the least I will come out of this with a thorough understanding of exactly how maven will work on a large project because we are certainly destined to find out.

One other thing I have an ongoing interest in is the general problem of how to efficiently complete tasks where more than one person could do parts of the task, and in general more than one type of specialized skill must be used to complete the task.

I've been watching articles on lean engineering, agile, scrum and XP flow by for a while, and I think something between a pull-based lean system and agile/scrum models how a highly skilled team actually works. Some introspection on the real things that drive such a system would perhaps formalize (and make more efficient) how that kind of team works. That makes a little formal thought around the ideas useful, and I just recently read a great article that does so here:

The idea I see there is a generalization of a "bucket brigade" work team style to any production need, and then a refocus of that style to a specific application of software feature development, mapping in the lean and agile processes where necessary to communicate the idea. It appears it might work.

One thing I'm curious about though is that it appears the work style counts on the links in a chain being fixed - e.g. this one type of skill (or skill overlap) is always needed for this stream of tasks. In reality a stream of tasks is typically much less uniform and contains individual tasks that need a variety of skills (or skill overlaps) - never quite the same set twice in a row. I'm not sure how you would handle that, or even if it is handleable in a general work flow design like that.

Perhaps for each unique set of skills you'd have a different feature brigade, and just hope that in reality there was a finite number of linkages required and that exercising them didn't result in over-utilization of a resource shared between multiple brigades? Unfortunately, that kind of over-utilization seems to show up in reality frequently as well. It shouldn't be too hard to reduce the multiple chains to a single slightly branching chain, though, and still get the inventory/capacity alarms that a kanban board will give you while maintaining flexibility around the skills required for a given task.
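Here's a rough sketch of what I mean by capacity alarms on a slightly branching chain, in Python. The stage names, limits and tasks are all hypothetical, and the rule I'm assuming is the usual kanban one: each stage has a work-in-progress limit, and a stage holding more tasks than its limit raises an alarm.

```python
from collections import Counter


def capacity_alarms(wip_limits, tasks_in_progress):
    """Return the stages whose work in progress exceeds their limit.

    wip_limits: stage name -> maximum number of tasks allowed at once.
    tasks_in_progress: task name -> stage the task currently sits in.
    The result maps each overloaded stage to (current load, limit).
    """
    load = Counter(tasks_in_progress.values())
    return {stage: (load[stage], limit)
            for stage, limit in wip_limits.items()
            if load[stage] > limit}


# Hypothetical board: one chain that branches after "design" into
# "backend" and "ui", then rejoins at "test".
wip_limits = {"design": 2, "backend": 2, "ui": 1, "test": 2}
tasks = {
    "feature-a": "backend",
    "feature-b": "ui",
    "feature-c": "ui",
    "feature-d": "test",
}
alarms = capacity_alarms(wip_limits, tasks)
print(alarms)
```

In this made-up board the "ui" branch is the shared resource that gets overloaded, so it is the only stage flagged; tasks that need a different mix of skills simply take the other branch.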

Either way, if there is potential to have smooth high-velocity feature development it should be examined, and this article definitely has some ideas I'd like to incorporate.