Tuesday, February 19, 2013

SVN: Branch By Abstraction


A lesser-known source-control best practice I’ve been pushing for a number of years is “Branch by Abstraction”. It is not my invention, and it has been practiced for many years, but it is about time it was given a name. The suggestion is that you can convene large sets of developers in a single trunk (Trunk Based Development) and avoid the “short-lived feature branches” that you have to merge back. The problem with feature branches is that the current state of any one of them might be undeployable for a number of weeks while the team gets it right. Those branches just end up running and running…

Provisos

There are some general provisos for the single as opposed to composite trunk design, that coincide with hard-core Agile development:
  • You’ve broken your application into multiple components.
  • Each component lives in its own directory inside the trunk (possibly hierarchical).
  • Each directory has its own self-contained source and its own build (possibly hierarchical).
  • You have a good set of unit tests and consider them important enough to illustrate snippets of example usage.
  • Continuous Integration drives things, even for hundreds of components, and drops items into a Maven-like repository. CruiseControl has a nice directive that allows you to set up the killer CI installation that is branch-ready, meaning you can run without a dedicated CI administrator.
  • Your management are good at release planning.
  • Developers are in the habit of never breaking the build :-)
Your trunk may look like:
  
    trunk/
      foo-components/
        foo-api/
        foo-beans/
        foo-impl/
          build.xml
          src/
            java/
            test/
          cruisecontrol-config-snippet.xml
        remote-foo/
      bar-services/
        bar/
          build.xml
          src/
            java/
            test/
          cruisecontrol-config-snippet.xml
        bar-web-service/

So, back to the problem…

What do you do when (if) your team says they need to shift from Hibernate to iBatis (a hypothetical case)? There could be thousands of classes that depend on Hibernate. The architects might suggest that the build will be broken for weeks, so a separate branch is the best place for this change. Instead, let’s try Branch by Abstraction (BBA) rather than the traditional “Branch by Source Control”. (Stacy Curl coined the name, by the way. I’m trying to shame him into writing a better blog article than this one.)

The steps to living Branch By Abstraction

With your most responsible developers:
  1. Introduce an abstraction over the core bits of the big thing you’re going to change and commit
  2. Update all the bits of code that were formerly using the thing directly to use it via the new abstraction and commit
  3. Make a second implementation of the abstraction, with unit tests that specifically test its core functionality and commit
  4. Update all the code from (2) to use the new implementation (still via the abstraction) and commit
  5. Deprecate the first implementation (or skip to 6 if you don’t want a respectful grace period).
  6. Delete the first implementation (it’s proven that there is no need for you to go back).
  7. Remove the abstraction (if it is inelegant).
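The steps above can be sketched in Java. This is only an illustration under stated assumptions: `OrderStore`, `LegacyOrderStore`, `ReplacementOrderStore` and `OrderService` are hypothetical names, and in-memory maps stand in for the real persistence libraries so that the sketch runs without any dependencies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class BranchByAbstractionSketch {
    public static void main(String[] args) {
        // Step 4: cutting over is a one-line change at the wiring point.
        OrderService service = new OrderService(new ReplacementOrderStore());
        service.place("42", "two widgets");
        System.out.println(service.describe("42"));
    }
}

// Step 1: the abstraction over the big thing being changed (hypothetical name).
interface OrderStore {
    void save(String id, String description);
    Optional<String> find(String id);
}

// The first implementation. In a real codebase this would wrap the old
// persistence library; a map stands in for it here (an assumption).
class LegacyOrderStore implements OrderStore {
    private final Map<String, String> rows = new HashMap<>();
    @Override public void save(String id, String description) { rows.put(id, description); }
    @Override public Optional<String> find(String id) { return Optional.ofNullable(rows.get(id)); }
}

// Step 3: the second implementation, built and unit tested on trunk while
// everything else still runs on the first one. Also a stand-in.
class ReplacementOrderStore implements OrderStore {
    private final Map<String, String> rows = new HashMap<>();
    @Override public void save(String id, String description) { rows.put(id, description); }
    @Override public Optional<String> find(String id) { return Optional.ofNullable(rows.get(id)); }
}

// Step 2: client code depends on the abstraction only, never on an implementation.
class OrderService {
    private final OrderStore store;
    OrderService(OrderStore store) { this.store = store; }
    void place(String id, String description) { store.save(id, description); }
    String describe(String id) { return store.find(id).orElse("unknown order " + id); }
}
```

Because `OrderService` never names a concrete store, steps 5 and 6 (deprecating and then deleting `LegacyOrderStore`) touch nothing outside the persistence component.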

Benefits

  • Only a small team is even bothered by the change.
  • You can go live at any stage – because the larger application works at all times.
  • Management can be adaptive about scheduling.
  • Avoids merge hell.
  • Introducing the abstraction helps increase understanding/modeling of the piece, which is useful in itself.
Of course, BBA is not a panacea. It is just a practice that developers/architects can often adopt when architects with less nerve are suggesting yet another long-running feature branch. Architects should strive to do BBA instead of new feature branches. They should not hope to reach a situation where they can declare at the outset that a new branch is the “only way” to achieve something.

Oh by the way, ClearCase sucks

A buddy was telling me last week about 21 significant branches whose merge order was uncertain at his nameless client. Sucks. He smiled wryly when I guessed ClearCase as their SCM choice. ClearCase, whether in dynamic, static or UCM mode, has no place in Agile development efforts. It is a self-fulfilling prophecy: it requires dozens of administrators, a few black-belt merge-meisters and multiple branches, and it causes long development cycles, waterfall thinking and high staff turnover. The only thing worse is PVCS (who owns it now?). Anyone wanting a good SCM tool for Agile development should be looking at Perforce (a favorite because IntelliJ works very well with it) or Subversion. Subversion will overtake Perforce one year soon, I guess [note: Early 2007 comment].

When do you branch then?

Ideally for release only.
  
    trunk/
    releases/
      rel-1.0/
      rel-1.1/
      rel-1.2.x/
You may branch some days before release, then “production harden” the branch on a staging box. You’re not going to give all developers permissions to that branch, just the couple who are ensuring it’s ready and handling later merges (ones or twos only, if at all). You branch the release from trunk of course, given that CI has proved that trunk was pretty solid at all times.
As well as Stacy Curl, I’m hoping Martin writes an article on this important practice. He is better with words than me.
Written April 26, 2007

Update (May 2nd, 2009)

So the state of the art has shifted some from Subversion to Mercurial, Git and (some burn a candle to it) Bazaar. Prior to this update, the blog article concerns trunk best practice, and is preaching to a multi-branch development team that “trunk based development” can work for you with discipline. That discipline is “Branch by Abstraction” and “little and often” commits. Of course, the team is trying to push towards Agile and an increase in the frequency of deployments to production, with fewer defects each time. ClearCase is often where they are coming from. I had an FX trading client in 2005 that I persuaded to move from multi-branch development to trunk-based. There was a lot of choreography to move from the entangled source tree represented in fifty branches to a trunk metaphor broken out into buildable components as outlined. It took months of course, but left them with a clearer understanding of their workflow and asset control.
Later in 2005, Roy Singham (ThoughtWorks owner) made me fly in the middle of the OSCON conference (!) to CollabNet’s offices to present on Subversion and the importance of trunk based development, as a way for their sales engineers to pitch to corporations that are otherwise invested in ClearCase, StarTeam, PVCS etc. The theory being that Subversion is a sellable piece in its own right, and that the larger Collab stack was not the only product/service of theirs worth talking about to clients.

Multi branch versus trunk diagrams

Here is a diagram of an often encountered development team branching choice: multi branch. Merges are happening in multiple directions all the time. Some branches are long lived, some short. Some branches concern functional enhancements (business value) and others are for non-functional technical reasons (like a shift from an RDBMS to a distributed database). It’s chaos: the department pushes to production from any of the branches that allege that they need to go live and have all of the integrations needed. Here are the bad things associated with multi branch:
  1. For weeks at a time, an individual branch can be in an undeployable state
  2. The development team will report that merging is a major part of their day, and fraught with complexity given (1)
  3. Often there are regressions, as merges are missed, and the business yells at the IT dept.
  4. Labeling and handling labels makes you consider another career
Contrast that with the trunk model:
So here we see concentrated development on the trunk (actually we imply that, see the diagram below). We also see releases exclusively from release branches. We see only bug fixes on the release branches, and merges back to trunk (though we might hope that all bugs are fixed on the trunk, and merged to the release branch). We see something that is not only buildable at all times, but is also deployable from anywhere with a day’s notice. Of course you are not going to deploy from just anywhere, but imagine that as a requirement from the business – “be ready to go live within a day’s notice, and have a high level of confidence”.
A day in the life of two trunk developers..
  1. Checkout : 2 minutes (100BaseT network)
  2. Checkout : 2 minutes
  3. Update/Sync (speculative) : 3 seconds
  4. Update/Sync (speculative) : 3 seconds
  5. Update/Sync (speculative) : 3 seconds
  6. Commit : 10 seconds (10 java files)
  7. Update/Sync (speculative) : 6 seconds (10 java files)
  8. Commit : 10 seconds (6 java files)
  9. Update/Sync (speculative) : 3 seconds
  10. Update/Sync (speculative) : 3 seconds
  11. Commit : 10 seconds (5 java files)
This is just showing regular life in the trunk, and not branch by abstraction commit by commit per se.
Moving to trunk based development on a nimble SCM
Flipping from ClearCase to Subversion or Perforce: some clients take a phased approach, some do a big bang. One automotive client did the latter in a lunch break. However it happens, you will hear reports of personal productivity increasing by 20% to 33%.
Be aware though that ClearCase requires admins, as many as one admin to twenty developers. They’ll want to put up a case for not switching SCM tools. I’m sure that Perforce and Subversion require admins too. As Google apparently uses Perforce, I wonder how many of its twenty thousand staff use ‘p4 admin’ on a daily basis. Not many, I hope.

On Distributed?

Agile teams using distributed SCM tools report productivity improvements over Perforce or Subversion. There’s no doubt that’s true, at least today, but the Subversion team is pushing towards the features of their distributed competitors, one feature at a time at least. I wonder, though, whether a team should be adept with trunk-based development (BBA and “little and often”) before they move to distributed. That is a longer discussion.
[ article published: Apr 26th, 2007 ]

Update: Dec 7th, 2010

Update: May 13th, 2011

Jez Humble linked back to me in his experience report for BBA (thanks Jez). Ironically, he’s performing the change from iBatis to Hibernate, which is the opposite of the hypothetical case that I’d made some years ago.
After reading his article and having seen some field usage of BBA, I wanted to say some more.

Avoid big bang

If you can help it, you should not do the cutover from A to B in a big bang. Imagine you had a set of components that you needed to change in some way, and that the change was going to take some time, perhaps longer than your production release interval. Say the change would take three months, and your team was in the habit of pushing production releases monthly like clockwork.
Strategy A (big bang).
1) You take your starting set of components like so
2) Work over three months to get them all into the finished state. Commit to trunk of course, but provide a feature toggle that is set to ‘old implementation’ for everyone on trunk and for the two production release branches that were made from it, with only yourself and a CI build pipeline seeing the ‘new implementation’. Like so:
3) When finished, and everyone agrees, then flip the switch (toggle) for everyone (including the next live release). Then, when that has been in production for a while, remove the switch and the old implementation (and perhaps the abstraction if it was not useful for unit testing):
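A minimal Java sketch of the single switch that Strategy A relies on, under stated assumptions: the system property `pricing.impl` and the `PricingEngine` types are hypothetical names chosen for illustration. The toggle defaults to the old implementation, so trunk and the release branches are unaffected unless someone (you, or the CI pipeline) turns it on.

```java
class BigBangToggleSketch {
    public static void main(String[] args) {
        // Hypothetical property name. When it is absent the default is "old",
        // which is what everyone else on trunk gets.
        String toggle = System.getProperty("pricing.impl", "old");
        PricingEngine engine = PricingEngines.select(toggle);
        System.out.println(engine.name());
    }
}

// The abstraction that both implementations sit behind (hypothetical).
interface PricingEngine {
    String name();
}

class OldPricingEngine implements PricingEngine {
    @Override public String name() { return "old implementation"; }
}

class NewPricingEngine implements PricingEngine {
    @Override public String name() { return "new implementation"; }
}

// The one place that reads the toggle. Removing the switch later means
// deleting this factory and OldPricingEngine.
final class PricingEngines {
    private PricingEngines() {}

    static PricingEngine select(String toggle) {
        // Anything other than an explicit "new" falls back to the old implementation.
        return "new".equals(toggle) ? new NewPricingEngine() : new OldPricingEngine();
    }
}
```

Run with `-Dpricing.impl=new` to see the new implementation. Flipping the switch for everyone in step 3 amounts to changing the default.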
Strategy B: Iterative instead of Big Bang
Instead, how about an iterative approach where components are turned on as they are completed? You would be forcing the changed implementation on all developers, and on the production releases that go out. You would not need to hold on to the old implementation as long, and the switch/toggle is somewhat less magnificent. This feels safer to me.
The less than magnificent switch/toggle is still there of course, but it only concerns the single component that you’re currently working on. At any moment in time, as viewed by everyone else on trunk, or the folks concerned with releases, some components have the old implementation, and some have the new ones. Over the three months the transition is completed, and everyone gets to use the new implementation as it is progressively rolled out.
Of course, you still need to introduce the abstraction throughout the codebase before you start to change the first implementation.
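One way to picture the iterative rollout in Java, as a sketch only: `CutoverPlan` and `Impl` are hypothetical names, and the component names are borrowed from the example trunk layout earlier in the article. Components that are finished use the new implementation for everyone, and the toggle matters only for the single component currently being worked on.

```java
import java.util.Set;

class IterativeCutoverSketch {
    public static void main(String[] args) {
        // foo-api and foo-beans are done, foo-impl is in progress,
        // remote-foo has not been started.
        CutoverPlan plan = new CutoverPlan(Set.of("foo-api", "foo-beans"), "foo-impl");
        System.out.println(plan.implFor("foo-api", false));    // finished component, toggle irrelevant
        System.out.println(plan.implFor("foo-impl", false));   // what everyone else on trunk sees
        System.out.println(plan.implFor("foo-impl", true));    // what the developer doing the change sees
        System.out.println(plan.implFor("remote-foo", true));  // not started, toggle has no effect
    }
}

enum Impl { OLD, NEW }

final class CutoverPlan {
    private final Set<String> migrated;  // new implementation is live for everyone
    private final String inProgress;     // the only component still behind a toggle

    CutoverPlan(Set<String> migrated, String inProgress) {
        this.migrated = migrated;
        this.inProgress = inProgress;
    }

    // Decide which implementation a component gets, given the caller's toggle setting.
    Impl implFor(String component, boolean toggleOn) {
        if (migrated.contains(component)) {
            return Impl.NEW;
        }
        if (component.equals(inProgress) && toggleOn) {
            return Impl.NEW;
        }
        return Impl.OLD;
    }
}
```

As each component is completed it moves into the `migrated` set and the next one becomes `inProgress`, so the old implementation of a finished component can be deleted straight away rather than after three months.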
The benefits of this approach?
You are able to defuse these allegations:
“you’ll never finish”
“devs in other teams are still making old implementations, therefore your new version is at risk”
“there is too much at risk for the big bang, as it has never been tested properly”

Agile teams and BBA

There’s some responsibility needed when using BBA in Agile teams. Practitioners of XP (and alike) are bound to love refactoring. There’s going to be as much pain from merging to a working copy as there is from merging to any branch (with all SCM tools, to variable degrees, irrespective of workflow). Therefore responsibility is needed. If someone wants to rename every class and package to fit with some new understanding of the application, then you have to be 100% sure the thing is going to merge with whatever other developers have not yet committed. Git (and alike) hit the new high bar of SCM, merge through rename, quite well and so could cope, but lesser SCM tools might not. If you understand that refactoring is not refactoring plus some other changes, then it might be polite to notify developers that you’re about to do a big rename/move thing and that they might want to a) check in the stuff they are doing, if timely, and b) go to lunch.

Ball of Mud vs Hairball

Jez cites Brian Foote’s ball of mud article. I’ve met Brian a couple of times, and he’s engaging and entertaining, but I’m not sure that ball of mud is the right metaphor for the entanglement we see in enterprise application development. I think ‘hairball’ is. Imagine the thing a cat coughs up, with implicit horrific entanglement.

The article above is excerpted from http://paulhammant.com/blog/branch_by_abstraction.html/
