I’ve just started a blog series about how the first causes of software development can help us make better decisions. In previous posts I talked about how we can know something and how to decide what is good. For my very first deep dive I want to talk about Reversibility. It’s an important consideration because Reversibility, or the lack thereof, largely dictates when you have to make a decision and whether or not you can unmake that decision. Why you should care about Reversibility is summed up nicely by this quote from Martin Fowler in his seminal essay Is Design Dead?
“If you can easily change your decisions, this means it’s less important to get them right – which makes your life much simpler.”
Think about that. If you can’t change a decision at a later time, you’ve got to make that decision sooner and you’ve got less leeway to be wrong. We can’t control everything, but in many ways we can apply techniques and choose technologies that give us more Reversibility as a way to make decision making easier. I see Reversibility manifesting itself in three general themes:
- Working and delivering incrementally. I want to work incrementally by building a system one useful feature at a time. There are solid economic reasons to run a project this way. Underlying everything I’m writing about in this series is the desire to maximize the return on investment of our development efforts. There’s no return on investment whatsoever until something gets released into production. In my company’s case, we can make a release after the very first little module is complete and start an additional cashflow. To get that positive cashflow going I need to get the first module out the door, and it doesn’t require the same complexity of infrastructure that the following modules will. I can make the company more money by building out a simple infrastructure for the first release, even though that simple infrastructure will eventually need to become much more complex and robust. I do want the ability to build out the more complex infrastructure later without breaking or rewriting the first module.
There’s also the issue of complexity and the limits of the human mind. We can only consciously deal with so many variables at any one time. It’s easier to develop a system when we can eat the elephant one bite at a time by working on one feature at a time. In order to really work one feature at a time and keep up the pace of continuous delivery, we often make simplistic solutions for some infrastructure elements. That infrastructure might be perfectly good for the features at hand, but insufficient for later features and later system loads. That’s perfectly fine, and I’d argue that it’s desirable, as long as I can replace the simplistic solution with a more complex or robust solution later. Not having to worry about making the infrastructure robust for later features lets me escape the analysis/paralysis trap and keep getting completed features out the door. If I feel like I need to get some piece of infrastructure code exactly right before going any farther with features, I’ve got to pause and complete that infrastructure before I can continue with more features.
- You will make wrong choices and mistakes. I want everybody to think hard about this. Putting yourself in a situation where a wrong early decision is expensive to undo is going to lead to a brittle project. It’s going to be easy to fail because you’re dependent upon every decision being exactly right upfront. On the other hand, if you can adapt the requirements or refactor away problems in the design later without substantial extra costs, you’re going to be more resilient. That implies some attention to finding your mistakes and problem spots in the design before it’s too late, but I’ll save that for a following essay on Feedback. Working adaptively will lead to more consistent success than working only by prediction. What would you rather base your decisions on: what is actually happening as the project progresses, or solely what you think is going to happen? Adaptability is much easier with Reversibility.
- Things will change. Your current requirements are exactly right for today’s business situation, and your current design elegantly satisfies those requirements, but tomorrow is going to come with unexpected surprises. Business drivers will change. Technologies will change. Your understanding of the design will change. You’ll stumble into performance problems that you didn’t foresee. You’ll make assumptions that will blow up in your face. Change will happen. With good Reversibility, you’ll be able to deal with change.
I think it boils down to making good decisions. Having Reversibility makes it easier for you to get decisions right. It makes decision making easier by allowing you to make decisions without the fear that you can never undo that decision, by allowing you to wait longer in the project to make more informed decisions, and by allowing you to back out decisions that turn out to have been wrong. In situations where you have to make irreversible decisions you’re going to be more brittle because you’re forced to make decisions earlier and you have less flexibility to change bad decisions. Since I think it’s impossible to be right every time, put me down for an extra helping of Reversibility.
The Mini Saga of the Clustered Index
Here’s an example from my current project that brings out several of the issues around Reversibility. My boss has a deep background in database design and development, but little in SQL Server. He’s very concerned about the choice of which column in any given database table should be the clustered index. It’s a big, important decision that potentially has a lot of impact on the performance of the system. My boss is concerned, and he wants that decision made right now so that he can rest assured that our baseline performance is going to be good enough. I think it’s an important decision too, and that’s why I think we should put that decision off for at least six months to make sure that our baseline performance is going to be good enough. So who’s right? It basically comes down to how the two of us perceive the Reversibility issue.
I think that decision needs to be put off until much later in the project when we have real performance numbers and usage scenarios to consider. If we tried to design the clustered index solution right now we would just be making an educated guess that’s more than likely going to be wrong anyway, so why bother? I want to wait until the Last Responsible Moment to make that decision. For a refresher course, as stated by Mary Poppendieck, the intent behind Last Responsible Moment is to:
“…delay commitment until the last responsible moment, that is, the moment at which failing to make a decision eliminates an important alternative.”
Roughly put, the Last Responsible Moment calls for you to make better informed decisions by waiting as long as possible to make a decision. The Last Responsible Moment is the point at which you have the most information in your hands with which to make that decision. In six months we’ll have much more information about the performance characteristics of our new system and know where our database performance bottlenecks really are based on empirical performance measurements. In six months I’ll be much better able to determine which database column is the best choice for the clustered index in every table.
Now I have to answer the question “can I get away with waiting to make this decision?” The Last Responsible Moment is also the last point at which you can wait to make a decision before the lack of a decision causes problems. Using the logic of the Last Responsible Moment, I want to wait as long as possible to make well informed, and hence better, decisions. Unfortunately, there’s a limit to how long I can wait to make a decision, and this is where Reversibility comes into the picture. If a decision is easy to reverse or retrofit into the system later, I don’t need to make that decision right now. If a decision is hard or expensive to change later, then I have to spend more upfront analysis and design time to make that decision earlier. Decisiveness can be an admirable trait, but good decision making often requires reflection and feedback. When I can put off a decision for later I gather more insights from the project work that can lead me to better or even simpler solutions than I might choose if I had to make the decision upfront. Oftentimes the Reversibility of a decision is out of your control, but in other cases like my clustered index decision, I can take steps to increase my Reversibility.
My boss’s worry is partially based on how hard he found it to change the clustered index on a table using the SQL Server Management Studio GUI tool. The admin screen promptly put up a message box telling him that he couldn’t change the clustered index on a table without dropping a bunch of other indexes first. That experience made him conclude that changing the clustered index late in the game was going to be too hard. He was afraid that the choice of clustered index is not a reversible decision. One way or another we’re going to have the entire database schema scripted out with all of the DDL in Subversion, with a fully automated way to rebuild the database schema from scratch. Since rebuilding the database from scratch is going to be part of the normal Continuous Integration strategy anyway, all I have to do to change the clustered index on any table later is to modify a single DDL script, check it back in, and voila: I’ve changed the clustered index. My point is that the existence of a good configuration management strategy for the database will allow me to change the database schema at will right up to the point when we’re actually ready to deploy the database to a customer. Add in an automated regression test suite, plus the unit test coverage that falls out of TDD, and I’m much more able to make changes later. By baking in more Reversibility with the database build and test automation strategies, I can push back the Last Responsible Moment and make better decisions about the clustered indexes.
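The build strategy above can be sketched roughly like this. It's an illustrative Python example, not the project's actual scripts: in-memory SQLite stands in for SQL Server (SQLite has no clustered indexes, so a plain index plays the part), and the table and script names are invented. The point is that the whole schema is rebuilt from versioned DDL scripts, so reversing an indexing decision means editing one script and rebuilding:

```python
import sqlite3

# Hypothetical DDL scripts as they might sit in Subversion. Order matters,
# so they're kept as an ordered list of (path, ddl) pairs.
DDL_SCRIPTS = [
    ("tables/orders.sql",
     "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, placed_on TEXT);"),
    # Changing the index later means editing only this one script.
    ("indexes/orders.sql",
     "CREATE INDEX ix_orders ON orders (customer_id);"),
]

def rebuild_schema(scripts):
    """Rebuild the entire schema from scratch, as the CI build would."""
    db = sqlite3.connect(":memory:")
    for _path, ddl in scripts:
        db.executescript(ddl)
    return db

# Reversing the indexing decision: swap one script, rebuild, done.
revised = DDL_SCRIPTS[:-1] + [
    ("indexes/orders.sql", "CREATE INDEX ix_orders ON orders (placed_on);")]
db = rebuild_schema(revised)
```

Because nothing in this scheme depends on hand-editing a live database, the same rebuild runs on every CI cycle and in every developer environment, which is what makes the index choice cheap to revisit.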
Database design isn’t rocket science, so why can’t I take the time to do the analysis and think it through right now? Maybe I could, but the time spent on upfront database design would be taken away from other areas where I see more immediate risk. Performance is a risk for sure, but our biggest risks are delivering a good user experience and getting the right Domain Model in place to support the necessary behavior. I’d rather put my time and focus on those issues now because I see those decisions being harder to reverse. The Last Responsible Moment for the user interface machinery is coming up a lot faster than the database. By making the reconstruction of the database schema highly automated and ruthlessly keeping persistence concerns out of my business logic and user interface code (orthogonality, but I’ll get to that later), the decisions about the database structure are highly reversible. Therefore, I can put my focus on the first complicated screen and Domain Model without worrying if I can make it work with the database or not.
Team Composition Matters
The composition of your team and organization can have a dramatic effect on your Reversibility. If all of your decision making can be done within your team, you’ve got a lot more Reversibility than you do when your decisions have to be coordinated with or made by an external resource. When you need to deal with an external specialist, you have to do more upfront planning to best take advantage of the expert while they are available to you. The expert often adds value to your project, but the availability of the expert decreases your scheduling flexibility. I don’t know if I’ve said it explicitly enough, but I think that a small self-contained team of full time generalists trumps a team of part time specialists every single time.
Earlier in this decade I worked in the manufacturing IT arm of a company that was an interesting juxtaposition of elite business moxie and putrid software development. The company was undeniably one of the best examples of lean manufacturing in the world (the intellectual progenitor of Agile development), but its IT departments were the worst examples of bad Waterfall thinking. Near the end of my captivity there we went to a “consulting” model (our VP was from CapGemini and Accenture). The idea was that we would sharpen a set of very solid specialists and share those specialists across the entire portfolio of projects. Of course, it also meant that those specialists were moved very rapidly from project to project. Business analysts would go onto a project early, write the requirements, hand them over to an architect, and move on to the next project. The architect would take the requirements, document a system design, then (you guessed it) skedaddle on to one of the other 6-7 projects he was responsible for. At no time in the project were developers, analysts, the architect, and the testers ever actively engaged on the project at the same time. It was unbelievably brittle. The whole thing crashed and burned anytime anybody made a mistake. The developers on the team were typically not empowered to make decisions on design direction or the requirements, and it would always take some time to get the original analysts and architects back to the project to consult. The organization wasn’t that great to begin with, but it took a major step backwards because of the “consulting” model. The business finally became angry enough that they stepped in and took over direct control of the IT program management. I just googled the VP who inflicted that insanity upon us. She was named CIO of a Fortune 100 company last year.
At DevTeach Vancouver I heard Greg Young say “I can find an argument to defend everything but scope creep.” I’ll pick up the torch here from Greg, because I can think of a great defense of scope creep. Very frequently the user requirements that come in very late in the game represent some great ideas that would add a lot of value into the system. The most popular feature of the most successful system I ever built was a near last minute user request. Fortunately, I had access to the business folks, analysts, and testers all the way through the project and we could happily take on that user request. If we had needed an external expert (I will NOT refer to any human being as a “resource”) for any part of that feature we wouldn’t have been able to coordinate with the external person fast enough to complete that feature — and the project would have been less successful for not having done that feature.
Sooner or later, you’re going to hit a project that’s too big for the ideal 4-6 developer sized team. In those cases we generally have to divide the team, but how? We could divide the team by:
- Horizontal layers. In my financial development experience, that ended up being a backend Java team and a frontend .Net team.
- Vertical features, but maybe create a centralized team to manage shared services and framework construction.
- Vertical feature teams, and share the responsibility for shared infrastructure among the teams (which doesn’t preclude an architecture team).
I choose #3, and I’ll use the concept of Reversibility to explain my answer. Like I said above, you can most easily reverse the decisions that you own, while decisions you can only make with the participation of people external to your team are harder to reverse. In case #1 you might be dependent upon an external team to build your web services. Every single feature that you build is going to be dependent upon that other team. The Last Responsible Moment is going to come relatively early because you and the backend team need to agree on a web service contract early so that the backend team can start their work and your team knows what signature to code to. In case #2, you’re reliant on infrastructure code that you can’t change yourself. You can ask the central team to change the framework to better suit your needs, but that’s going to involve more friction and resistance than it would in code that’s completely under your control. Again, you don’t want to put yourself in a position where mistakes are hard to reverse because of organizational overhead.
I’m jumping ahead of myself, but choices #1 and #2 also result in less corrective Feedback by splitting the infrastructure builders from the infrastructure users.
- Code Generation: Rob Conery just shared a story about using code generation for an admin site where the database and generated code got out of synch and eventually blew up. If you’re going to use Code Generation, I’d very strongly recommend you use active code generation instead of passive code generation. Make the code generation part of the automated build process. If you absolutely have to use one of those schemes where the object model is codegen’d from the database schema, set something up with the Continuous Integration build that automatically builds the database from the latest source and uses that schema to regenerate the code.
- The Myth of Service Oriented Architecture: I’ll get flamed for this again, but I don’t care. SOA is a means, not a goal. Used in the right circumstances it will make your business more nimble by making the business automation processes easier to change. Great, more Reversibility is good. But think on this as you compose your service boundaries: which can you change faster, an in-process service provider completely contained within one logical system/service boundary, or the same service provider exposed externally and maintained separately by a different team? You do NOT have good Reversibility in changing the contract of a publicly published web service, especially if there is more than one development team involved. I hit this before in a Microcosm of Agile Design, but I don’t think anybody picked up on the Reversibility angle of the two choices. SOA is partially predicated on the assumption that “monolithic” applications are hard to change. I think that good application architecture can frequently eliminate the economic advantage of building distributed SOA style solutions. SOA is most appropriate to me in cases where rapidly changing business processes need to interact with legacy systems that aren’t easy to change themselves.
- Process and Modeling Weight: If you invest a lot of time in creating elaborate UML models and spiffy design documentation upfront, are you really going to be as emotionally able to ditch that design when it’s proven to be wrong or a better solution is discovered? You should, because the design documentation is a sunk cost. However, human nature is going to make that decision harder because of the emotional investment in the bad design. One of the truisms of Agile design lore is to not give yourself any incentive to keep a bad design. If you need to go through some sort of process vetting to make or change a decision you’ve also got less reversibility. A lot of shops will use formal quality gates at various times to “lock down” designs or requirements. That might be a good thing, unless that quality gate process makes it harder to change your design or analysis after the quality gate process is done. I guess that what I’m trying to say is that you never want to give your teams artificial reasons not to do the right thing.
- Project Friction: I’ve been in a couple spots where configuration management practices actually made it difficult to check in code changes. In that situation you don’t want to be making code changes any more often than you possibly have to. More frequently, I’ve been in shops that had very poor configuration management processes around the database structure. If you can’t reliably synchronize database changes across different development environments, or if you have to fill out a lot of paperwork to make those changes, you’ll find yourself bending over backwards to avoid making database changes — often to the detriment of the project’s quality.
- Model First: I do not like the concept of designing the database first then codegen’ing the object model. I think the Agile database community has made strides in the last couple years, but Object Oriented code is much, much easier to evolve and refactor. With refactoring tools like ReSharper and fast running automated tests, OO Domain Models are much “softer” than a database schema can ever be. My strong preference is to use a “Persistence Ignorant” Object Relational Mapper like NHibernate and let the database largely fall out of my Domain Model (or at least make the design of the database run in parallel with the Domain Model). I think a database first approach forces you into doing much more design upfront. The upfront code generation is great for the initial creation, but I think those tools are balky when it comes time to make little changes. When I want to add a property to a Domain Model class I want to go right to it in Visual Studio and add the property. I don’t want to fire up a sluggish code generation tool just to add a single property. I also get irritated with the code generation strategies when I want to change a property name. With ReSharper I hit SHIFT-F6 and do the Rename refactoring. With a codegen tool I modify the code generation model, start the code generation, then compile to find all the places in my code that are now broken because the generated code changed. No thank you.
- VB6 COM versus .Net: You didn’t want to do much evolutionary design with COM components written in VB6. You really wanted to lock in the public signature of the COM classes as early as possible, then warn other developers not to breathe on your IDL signature. Anytime you changed the public signature of a COM class you risked breaking binary compatibility and rendering the newly compiled class useless to its clients. With the advent of .Net, those problems largely went away. As long as the binary signature of an existing method was maintained, you could happily add new things to a .Net assembly without any concern for binary compatibility. It’s no coincidence that the evolutionary practices behind Agile development first came into the Microsoft development world as we switched from Windows DNA to .Net based solutions. Your technology choice greatly impacts your Reversibility.
- Duplication is a Killer: One of the best ways to increase your Reversibility is to reduce duplication in the system. If you Don’t Repeat Yourself, you can change your system much more effectively. An example of this from last week is the “autocomplete” behavior in our screens. My business folks want something more sophisticated than the built in ComboBox, but for reasons of my own, I don’t want to touch the equivalent control in our 3rd party suite. Sometime when I shake free I’m actually going to write my own autocomplete functionality (sigh). I don’t think I need to do this right now because every ComboBox in our user interface is governed by an instance of the same MicroController class. All I need to do to add the autocomplete functionality later is enhance this one class. If I had to make that change manually to each and every ComboBox in every code behind, I’d have to stop and write that code right now.
- Test Automation: A solid safety net of test automation makes changing the code safer. Any time we change an existing system we incur a risk of causing regression errors in existing code. That fear of regression problems prevents many teams from considering improvements or changes to their system. With a good amount of test coverage we can confidently change a system and know if and where we’ve broken some existing code. Regression testing is a major cost over the lifetime of an application. So much so that it often dictates the end of life of many systems. I see test automation as a means to lengthen the lifecycle of a codebase.
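To make the active code generation recommendation from the Code Generation bullet concrete, here's a minimal sketch. It's in Python with invented table and class names, and SQLite stands in for the real database, so treat it as an illustration of the shape of the build, not any particular tool: the build first constructs the database from source, then regenerates the data classes from the resulting schema, so the generated code can never go stale:

```python
import sqlite3

def build_database():
    """Step 1 of the build: create the schema from versioned DDL."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customer (id INTEGER, name TEXT, email TEXT)")
    return db

def generate_class(db, table):
    """Step 2: regenerate the data class from the schema just built.
    Because this runs on every build, code and schema can't drift apart."""
    columns = [row[1] for row in db.execute(f"PRAGMA table_info({table})")]
    body = "\n".join(f"    {col} = None" for col in columns)
    return f"class {table.capitalize()}:\n{body}\n"

db = build_database()
# The generated source is compiled into the build, never edited by hand.
source = generate_class(db, "customer")
```

The key property is that step 2 always runs against the schema produced by step 1 in the same build, which is exactly what passive (one-time) code generation fails to guarantee.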
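To illustrate the Model First bullet, here's a small sketch in Python with hypothetical names rather than the project's actual C#/NHibernate code: the domain class is plain code with no persistence concerns, and the table definition is derived from a separate mapping, so the schema follows the model instead of the model being codegen'd from the schema:

```python
# A persistence-ignorant domain class: no base class, no generated code,
# nothing that a rename refactoring can't handle safely.
class Invoice:
    def __init__(self, number, total):
        self.number = number
        self.total = total

    def apply_discount(self, pct):
        self.total = self.total * (1 - pct)

# The mapping lives beside the model instead of inside it; the table
# definition is derived from the mapping, so the database design can
# evolve in parallel with the Domain Model.
INVOICE_MAPPING = {"table": "invoices",
                   "columns": {"number": "TEXT", "total": "REAL"}}

def ddl_for(mapping):
    """Derive the CREATE TABLE statement from the mapping."""
    cols = ", ".join(f"{n} {t}" for n, t in mapping["columns"].items())
    return f"CREATE TABLE {mapping['table']} ({cols})"
```

Renaming a property here touches the class and the mapping, both plain text under the team's control, rather than a code generation model and a pile of regenerated files.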
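The MicroController idea from the Duplication is a Killer bullet might look roughly like this (an illustrative Python sketch with invented names; the real project is .NET, and its actual MicroController class isn't shown here). Because one controller class governs every ComboBox, autocomplete can be added later by enhancing that single class:

```python
class ComboBoxController:
    """One controller class governs every combo box on every screen,
    so behavior added here shows up across the whole application."""

    def __init__(self, choices):
        self.choices = list(choices)

    def autocomplete(self, prefix):
        """The later enhancement, added in exactly one place."""
        prefix = prefix.lower()
        return [c for c in self.choices if c.lower().startswith(prefix)]

# Each screen wires its combo boxes through the same class, so no
# code-behind has to change when autocomplete is introduced.
states = ComboBoxController(["Texas", "Tennessee", "Utah"])
```

The reversibility win is structural: the decision to ship without autocomplete is cheap to reverse precisely because the behavior lives in one class instead of being duplicated across every screen.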
How does Reversibility relate to the other first causes?
- Orthogonality is a huge contributor to improving your Reversibility. You can most easily reverse prior decisions if you only have to change that one decision. You can’t rapidly change your caching strategy if it means getting into and changing user interface and business logic code as well. Orthogonality helps to contain changes.
- If we’re going to take advantage of making decisions at the Last Responsible Moment, we need to gather more information about our system as we go. Feedback helps guide us in our decisions and informs us when we need to reverse a decision. Another quote from Martin Fowler: “Designing for reversibility also implies a process that makes errors show up quickly.” You need to build in Feedback loops to find problems.
- Code readability/understandability/solubility, whatever you want to call it. In order to make a change in a system, you’ve got to find the place to make that change and understand the impact of your change.
Reversibility makes it easier to make good decisions. You can partially and advantageously control the timing of your decision making by choosing tools that give you more Reversibility. Stealing another saying from the Poppendiecks, you need to decide when to decide. Some decisions cannot be put off and you’ll have to do the best you can with these.
There’ll be more to come sometime next week. I think I’ll do Feedback next.