Previously

This is the third in a series about TDD’s value to humans, following

Learning the domain

The software I was hired to develop acted as user-facing glue to several non-user-facing infrastructural systems in a highly complex environment. On arrival, I didn’t see the whole picture, and knew I couldn’t understand the implications of changing any given bit of code. Whether or not I tried to consider the systems outside ours, I didn’t understand what our software was aiming to do. The Unix command-line tool’s “create” subcommand would have made sense, except that there was also an “insert” which was slightly different. I thought I grokked “update” until I noticed other subcommands for specific kinds of updates. And I was about to believe I had a handle on “assert” when I discovered that “accuse” was a way to speed it up. Yikes!

To (be able to) go fast, (be able to) go well

Making things worse, the code I and my lone teammate had inherited was freshly minted legacy code. If we’d been instructed to go fast, we’d have made dangerous mistakes, and we still wouldn’t be going anywhere near fast enough to make those mistakes tolerable. If we’d been instructed to go safely, we wouldn’t have been able to deliver much, plus some dangerous mistakes would still have slipped through. If we wanted to move quickly and safely, we needed to make a concerted effort to change the properties of our code. In A last-minute feature, we did, and it began to pay off.

Now please go fast

The big refactoring-etc. in that post had been driven in part by the anticipated business need for a graphical UI for non-Unix users, which in turn derived from the demonstrated business need to manage identities for non-Unix platforms in the same way our successful proof of concept was managing them for Unix. For reasons I may never have been privy to, and certainly don’t remember, the need to manage mainframe identities congealed into a hard deadline.

We had 1.5 weeks.

The code was built on Unix- and Kerberos-specific assumptions scattered throughout — some of them hard-coded, some of them invisible, and as yet relatively few of them documented by any kind of automated tests. Discovering and reacting to all those assumptions by building support for our first additional platform, while having it be an unfamiliar platform with its own weird yet-to-be-discovered rules, appeared rather unlikely to be doable at all, and exceedingly unlikely to be doable safely.

And don’t screw up

But maybe. And we had to try. So I notified my teammate and our manager that in order to avoid office distractions and enable strategic napping, I’d be working from home, around the clock, until stated otherwise. At midnight, I emailed them a summary of the day’s progress and my tasks for the following day that, if completed, would prevent the deadline from being obviously un-meetable. Any given day, things might not have gone according to plan, and that would’ve been the end of that; but every night, I kept being able to report that they had, and here’s what has to happen tomorrow to keep the string going. A week later, with a couple days to go, we were still — against all reasonable expectation for a schedule with zero slack — on track.

The next day, I went into the office to knowledge-share with my teammate, who had researched the important platform-specific behavior we’d need to implement. I took my understanding, turned it into unit tests, and asked Robert (as I’d asked Bill about that last-minute feature) to validate that they were testing for the desired behavior and only the desired behavior. He caught and fixed a few mistakes in the assertions. Then I went off and made the tests pass.

What size shoe, and when?

That night, our release scheduled for the following evening, I slept deeply. We’d done everything in our power. We would go into production with a credible effort that might prove good enough. Even so, I was worried that it might not. I knew many of my refactorings had outstripped our still limited test coverage, for which I attempted to compensate by carefully making one small mechanical change at a time, reasoning about what to manually check, and doing lots of manual checking. I was pretty sure the system as a whole wasn’t fundamentally broken, but I was also pretty sure it wouldn’t be thoroughly fine. When we got to production, there was no question whether the other shoe would drop. We hadn’t worked with sufficient discipline to avoid it.

A small shoe, immediately

During the course of the Operations team’s usual post-release system checks, they found a big regression: one of my less rigorous refactorings had broken a common usage in an important third-party client I’d forgotten to check. I immediately recalled a smell I’d ignored while doing that refactoring and knew an expedient way to fix the problem. A half hour later, our updated release indeed fixed the regression, and I’d made myself a note to add a few integration tests with that client.

A big shoe, later

I felt certain we hadn’t seen the last of the fallout. Some months later it hit us. I don’t remember how we found it, but there was one remaining invisible assumption in the code that hadn’t been flushed out and fixed, and it was a potentially very bad one: a SQL DELETE statement with an insufficiently constraining WHERE clause. Not only had we forced Ops to scour several months of application logs and database backups for signs of wrongly deleted data, but also we’d forced the business to operate for several months unknowingly without the data integrity they were relying on. Against all reasonable expectation, again, the damage was very low.
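To make that hazard concrete, here is a minimal sketch in Python with an in-memory SQLite database. Everything in it is an assumption for illustration: the post does not say what the real table, columns, or WHERE clause looked like, so the `identities` schema and the function names are hypothetical. It shows how a DELETE that was adequate under a single-platform assumption can silently remove extra rows once that assumption stops holding.

```python
import sqlite3


def make_db():
    # Hypothetical schema: one row per (name, platform) identity.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE identities (name TEXT, platform TEXT)")
    db.executemany(
        "INSERT INTO identities (name, platform) VALUES (?, ?)",
        [("alice", "unix"), ("alice", "mainframe"), ("bob", "unix")],
    )
    return db


def delete_too_broadly(db, name):
    # Safe while every name had exactly one row; wrong once a second platform exists.
    return db.execute("DELETE FROM identities WHERE name = ?", (name,)).rowcount


def delete_constrained(db, name, platform):
    # The WHERE clause now names everything that identifies the row.
    cursor = db.execute(
        "DELETE FROM identities WHERE name = ? AND platform = ?",
        (name, platform),
    )
    return cursor.rowcount


def remaining(db):
    # List the surviving rows in a stable order.
    return db.execute(
        "SELECT name, platform FROM identities ORDER BY name, platform"
    ).fetchall()
```

Comparing the row counts returned by the two functions against the number of rows you meant to delete is the kind of check an automated test can make cheaply, and that manual checking tends to miss.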

Conclusion

It proved to have been crucial that, by the time this deadline popped up, we’d already greatly improved the health of our code and the extent of our tests and knowledge. If either had been any worse, we’d have missed the deadline, made more costly mistakes, or both. Instead I was able to test-drive sometimes, manually test fairly effectively, and tolerate — as a time-boxed exception — working mostly alone at an unsustainable pace with an unsustainable lack of discipline. And because we had made ourselves able to do those things, we delivered what we understood the business to need, when they needed it.

But only because we lucked out. The unseen, unmanaged risk we incurred by working in this way, even for a short time, could have come back to bite us much harder than it did; since it happened not to, I earned more trust. When I was given responsibility for managing the product half a year later, I was told that this particular outcome had been the one that sealed the deal. In other words, the career opportunity to learn about software product development from an entirely different perspective came as a direct result of having chosen to pursue TDD. And when I landed in that role, I looked for ways to manage risk without having to be so lucky.

Test-driven ways.

Posted Thu Apr 23 11:41:30 2015

In TDD in context #1: Keeping my job, I…

  • Started a job as a software developer after 5+ years away from programming
  • Implemented my first simple new feature in an unfamiliar domain and a legacy codebase
  • Passed code review by one of the software’s original authors
  • Delivered to production, only to discover (in the form of user complaints) that I’d caused several regressions
  • Scrambled to fix my dumb mistakes, while suddenly and vividly recalling the value of Test-Driven Development
  • Obtained management permission to test-drive from then on

Getting to testability

There were a couple very basic end-to-end tests to show that the server could handle concurrent write requests without obviously always corrupting the database. But to avoid the screwup I’d just upscrewed, I’d have needed unit tests around the code I changed. Since they didn’t already exist, I’d have needed to write them; since that was too hard, I needed to make it easier somehow.

Working against me were my ignorance of the problem domain, my inexperience with rescuing legacy code, my lack of recent practice at programming in general and Perl in particular, my limited grasp of the full complexity of the environment, and the fact that this application had come to life as a hack-weekend proof-of-concept — one that had so convincingly succeeded that they hired a dedicated full-time developer (moi) to maintain and extend it.

Working in my favor were the existence of a couple of in-house (now open source) tools that provided network transport and authentication entirely outside our server daemon’s process space, the imminent in-house release of a major upgrade to the protocol library that promised to obviate the need for marshal-and-unmarshal concern-mixing boilerplate in applications, and the ready availability of the two in-house developers who’d built those tools and my application.

Tipping the balance further in my favor was that, arguably, doing that protocol library upgrade would help us meet a business need. A Unix command-line client had been part of the proof-of-concept code, but a web-based GUI for users from other platforms was inevitable. Before we’d build and maintain a second client, though, we wanted a dead simple client API. The protocol library upgrade would solve that problem too.

Sealing the deal, one of the in-house developers was my manager. Bill well understood the reasoning, agreed with it, and arranged for us to have plenty of slack in the schedule for “the SSP 2 port.”

After a month or two of careful, incremental changes with help and supervision from the local experts, we had less code, better structure, a vastly simpler client API, decent confidence that we hadn’t otherwise changed any behavior, and a handful of fast new functional tests that helped me understand common workflows through the system. We ran the new service alongside the old until we’d found and ported all the other clients in the wild to our new API, then retired the old service. The SSP 2 port was complete. But so what?

Getting under test

Now that we had the beginnings of a test suite, for each new feature, I knew what to do: TDD with one extra step. Before writing a new red test, I checked whether any tests covered the current behavior in the area of code I was likely to change. Almost always the answer was “no”, so I’d dig around until I understood enough to add some tests. Only then would I add a red test. (Or, sometimes, take one of the tests I’d just added and change its assertion.)
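The "one extra step" can be sketched in code. What follows is a minimal illustration in Python rather than the Perl the application was written in, and `normalize_login` is a made-up stand-in for some untested legacy routine, not anything from the actual codebase. The first test class pins down what the code already does; only once those tests pass does the second class add the red test for the (assumed) new requirement.

```python
import unittest


def normalize_login(raw):
    # Hypothetical legacy routine: nobody wrote down what it is supposed to do.
    return raw.strip().lower()


class CurrentBehavior(unittest.TestCase):
    # The extra step: characterize what the code does today, before touching it.

    def test_trims_surrounding_whitespace(self):
        self.assertEqual(normalize_login("  alice "), "alice")

    def test_lowercases(self):
        self.assertEqual(normalize_login("Alice"), "alice")


class NewFeature(unittest.TestCase):
    # Only now the red test: blank logins should be rejected (assumed requirement).

    @unittest.expectedFailure  # red for now; remove once the feature exists
    def test_rejects_blank_login(self):
        with self.assertRaises(ValueError):
            normalize_login("   ")


def run_sketch():
    # Run both classes and return the result object for inspection.
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(CurrentBehavior),
        loader.loadTestsFromTestCase(NewFeature),
    ])
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Saved as a file, this also runs with `python -m unittest`. The `expectedFailure` marker is only there so the sketch reports cleanly while the new test is still red; in practice you would watch the test fail, change the code until it passes, and keep the characterization tests green throughout.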

After about half a year, I noticed that the answer was almost always “yes”. We’d arrived at TDD as usual. I felt good about that. But so what?

Dubious business value

Just before a release, a strange feature request came down out of nowhere. It wasn’t a request at all, as we found out when we tried and failed to refuse it on the basis that it wouldn’t actually solve the stated problem (along with having lots of unspecified corner cases). So on release-day morning, Bill apologetically asked if there were any way we could implement the feature before cutting the release.

We had good test coverage around that area of the code, so I said “Yeah, I think so. I’ll write a spec covering all the corner cases I can think of. Come back in an hour and a half ready to review it.” He read thoughtfully through the new tests and agreed: “I think this feature is dumb, but if we have to do it, then this is how it has to work.”

45 minutes later I had the new tests passing, the rest of the suite still green, the dumb feature committed to source control, and the release ready to deploy.

Genuine business value

We were right to feel skeptical about the feature: it never did solve the stated problem. A few years later, we removed it.

I was right to feel confident about having arrived at TDD as usual: it improved our internal collaboration, helped us arrive at early and precise agreement, and allowed us to move quickly and safely — not just for this silly feature, but for dozens of far more valuable ones.

And Bill was right to have told me to do my work however I saw fit. I vividly recall what he said when we shipped that feature:

“I may not have the discipline to write tests first, but I sure am glad you do.”

Me too.

Posted Thu Apr 16 03:23:41 2015

I envision my day job (software development coaching) being rather akin to parenting. If that sounds paternalistic and creepy, consider firstly my complete lack of personal experience with parenting, and secondly my mental model of what raising children is like. My model says, given that humans…

  • Are independent and individual
  • Are always in development
  • Are always influenced by our environments and experiences
  • Exert some influence over our immediate environments and experiences, and those of perhaps a few others

then we’ll know we’ve succeeded as parents when one more human responsibly owns and directs their own continued development. If that’s what success looks like, then we’d want to act every day as the temporary, partial, and grudging guardians we seem to be, and seek every day to direct our love so as to hasten our abdication.

My mental model of parenting, untested though it may be, says “raising children” is an inapposite way to describe the goal of aiding in the development of adult humans. (In my darker moments, the common wording — along with “having a baby” — does nothing to dispel my suspicion that many parents aren’t interested in raising adult humans.) But this post is less about parenting and more about informal aunting and uncling: about what we can be doing, as a colleague who’s already doing it asked, to “involve the next generation of young minds.”

They’re already involved

According to my parenting model, combined with a pretty simple interpretation of observed behavior throughout human history, we could stick with parenting our own kids if any, letting others parent their own kids if any, and getting what we get. Given the importance of parenting to human development, any additional effort might be lost in the noise.

Help children play at their enthusiasms

It seems more likely to me that any additional effort could produce exponential benefit. For instance, standardized schooling as commonly implemented can be notoriously stultifying. Imagine attending your neighborhood public school’s PTA meeting and volunteering to supervise an age-appropriate “maker space” every other week. How likely is it that they’d accept? What kind of impact might you have on the lives of a few children? What kind of second-order impact could that have on all of us?

If you’re finding opportunities to interact with young minds, take your opportunities to provide options that might interest them. Help them see all that is possible. Help them experience the enthusiasm of others. If you have an opinion about the apparent talent of a youngster, be a grownup and do whatever it takes to keep it to yourself. For humans of any age, do not…

  • Steer them toward what you think they should want
  • Manufacture enthusiasm on their behalf

…but especially for kids, who have less practice detecting and deflecting sociopathic behavior.

Humans have their own enthusiasms. Help them find theirs. If you must tell a kid what you think about their activity, tell them how interested they appear to be, or how much fun they appear to be having. If they agree, you’re being helpful so far; find more options along that path to show them. If they don’t agree, you can be helpful by showing them options along some other path. We don’t need humans forcibly arranged into the fields we wish they were in (e.g., more women in STEM fields). We need humans who maintain their capacity for enthusiasm into adulthood, and who have seen and followed ways to turn their enthusiasm into expertise.

Help yourself play at your enthusiasms

If you don’t get to interact with young minds, maybe it’d be good for you. What’s the last wacky new thing you tried?

Help grownups put their enthusiasm to work

Until recently, I’d only ever worked at jobs where I wasn’t sure their values matched mine — or where I was sure they didn’t. I’ve finally found a place where I’m sure they do, where the enthusiasm I’ve managed to retain about software development is reflected and magnified by the kind and capable folks around me. I’m no more deserving of this sort of work-joy than anyone else. I simply had enough advantages in life to be able to keep holding on and hoping. How many people are forced to learn quickly what I managed to refuse to believe? How many are forced to grin and bear it because “Hey, that’s why they call it ‘work’”? How can we grasp the enormity of what is lost when we take intelligent, self-directed, decision-making adults and consign them to an infantilized workaday fate?

If you’re part of systems that influence the development of humans — and as a human, you are unavoidably so — you have some degree of influence over those systems. You can find ways to make them more beneficial, or at least less harmful. Where your influence stops — and as a human, it is unavoidably finite — you can find alternatives to offer. And if you have, against all odds, preserved and nurtured any form of enthusiasm in yourself, then you are uniquely positioned to preserve and nurture it in others.

Imagine taking advantage of your position to create a workplace where your favorite enthusiasms are cherished. What kind of impact might you have on the lives of a few grownups? What kind of second-order impact could that have on all of us?

I couldn’t begin to tell you. But I’ve been given such a gift. I hope I’ve begun to convey how it feels to have received it. May I pass it on.

Posted Wed Apr 8 22:39:10 2015

The following idea popped into my head after discussions with some colleagues. They bear no responsibility for it. If it’s a bad idea, neither do I. ;-)

The problem

It’s hard enough to reach full-throttle agility in a small software company with a single development team. Large organizations make it much harder, because they tend to:

  • Require coordinating across teams
  • Require convincing more people
  • Incentivize counterproductive behavior (albeit unintentionally)
  • Organize around components and projects (not products)
  • Treat IT as a cost center (not a value provider)
  • Specialize in something other than software development
  • Control decision-making

Agility can only happen if the team is empowered to make decisions for themselves and their product. Big companies are afraid to unleash that power because they have much longer risk tails and are much fatter lawsuit targets; not only could one wrong move be extremely costly, but also wrong moves are extremely easy to make. Out of justifiable fear, then, big companies tend to prioritize risk reduction over nearly everything else, up to and including the appearance of common sense.

The priorities of a company — particularly the mistakes it wants to avoid — tend to become policies. The priorities of a self-organizing team may not always coincide with company policy. The more policies exist, the more likely a self-organizing team finds itself forced to decide whether to ignore or fight. If they’re lucky, maybe they’ll be granted blanket exceptions for particular policies.

Best-case scenario: given a relatively enlightened large organization, a team can perhaps hope that someday, relatively few of their decisions won’t be theirs to make.

Wait, that’s the best-case scenario?

The idea

If big organizations truly want the benefits of Agile, they need to provide the preconditions for Agile. But they don’t believe they can vest full decision-making power in product teams, because they’re afraid that one sufficiently bad decision could sink the whole company.

What if it couldn’t?

What if we could limit decision-making liability to the teams, so that we could safely delegate decision-making authority to the teams?

Why not spin off every team as its own very small subsidiary company?

The (presumed) benefits

Worst-case scenario: a subsidiary team makes a terrible decision, they lose all their money and/or assets (perhaps as a result of losing a lawsuit), they lay everyone off and close up shop, and whatever’s left maybe rolls up to the parent company, which was never itself at risk.

Best-case scenario: subsidiary teams make their own decisions — who are our customers? who can be on our team? how does our work work? which software licenses can we accept? which security tradeoffs are we comfortable with? etc. — and they’re sensible, effective, and profitable.

In any case, we might expect to see:

  • Value-delivering teams turning a profit
  • Underperforming teams netting a loss
  • Team performance being easier to appraise (and improve)
  • Annual budgets being simpler to prepare (and adjust)

The drawbacks

IANAL (I Am Not A Lawyer), and I haven’t tried to search for and study existing corporate structures that may be similar, but I feel certain this can’t be an original idea. AYMIAL (Are You Maybe Is A Lawyer)?

Even if you’re also not a lawyer, do you think this scheme could possibly work? What might prevent it from working? If it could work, what might be suboptimal about it?

One possibility: it’s expensive and risky to restructure a large organization. But for sufficiently large organizations sufficiently averse to risk, restructuring in this way might well be less expensive and risky than doing nothing.

Other possibilities?

The upshot

If this approach to delegating risk through legal means can work, then it promises to remove BigCo-induced accidental complexity, leaving only the essential complexity faced by any self-organizing team.

Perhaps the most effective way to scale Agile up is to scale risk down.

Posted Wed Apr 1 10:06:36 2015

As software developers, “technical lead” sounds like a role we’re supposed to want. How can we decide whether we should want it? What exactly is a tech lead, and what does it take to be good at it?

What isn’t a tech lead?

We know by Schmonz’s Theorem that there necessarily exist other roles similar to, yet different from, “tech lead”. By inspection, some such similar roles appear to be:

  • Architect
  • Developer
  • Senior developer
  • System administrator
  • Line manager

Line manager?!? Sort of. Both line managers and tech leads are better positioned to make certain types of decisions.

Sysadmin? Sort of. Sysadmins and tech leads are better positioned to understand the costs of technology decisions.

Developer? For sure. A tech lead who hasn’t been a developer faces an uphill climb to credibility, even if they somehow manage to make good decisions.

Senior developer? Depending on the context-specific meaning of the title, it could be nearly identical with my conception of a tech lead, or quite some distance away. In my experience, usually the latter.

Architect? Again, depends on the architect and the situation, but in my experience, it’s difficult for architects to do what tech leads can do.

What do tech leads do?

They help technical decisions get made.

What do good tech leads do?

They help good technical decisions get made by the team, consistently, at every level of the decision tree.

How do good tech leads do it?

Their knowledge is wide, such that given any technical problem, they know enough to guess well about potential solutions. Their knowledge is a bit deeper in the codebases they live near, so they don’t have to guess as often. And their desire to see teammates make informed decisions is wide and deep, such that they can await the delayed gratification of someone else making the “right” decision, or even tolerate the discomfort of someone making the “wrong” decision when it’s not too irreversibly wrong, if that’s what it takes for them to make a better decision next time.

What makes a great tech lead?

A great tech lead has impeccable judgment with which to make every last technical decision — and instead seeks, above all else, to share that power with teammates. A great tech lead is one who doesn’t think of themselves as “leading”, won’t be seen doing any such thing, and might appear harmless to remove from their team. Don’t go removing them! But if you do, and it turns out to be harmless, that might be because they’ve just finished doing a terrific job.

When is being “technical lead” a good decision?

When you could be an architect, but are afraid to drift from your team and your code; when you could be a senior developer, but are afraid you won’t get to share everything you’ve learned; when you could be a manager, but are afraid folks won’t get to manage themselves; when you’ve got the chops to be “technical lead”, but don’t want anyone thinking of you that way.

Conclusion

It’s a tough job. If you do it well, then (just like a sysadmin) you might not hear appreciation for your work until years later. Until that day comes, for what it’s worth, you’ve got mine.

Posted Fri Mar 13 17:17:26 2015