At the second Agile Alliance Technical Conference, I facilitated a DevOps-themed workshop called “Cross-Platform Full-Stack Deployment with pkgsrc”.

I loved being at the first one last year, and it’s great to be back with a comfortably large number of people who feel — as I do — that Agile software development centers on doing it and helping others do it.


Does your company deliver valuable software? Maybe I can help. Consulting, coaching, training.

Posted April 20, 2017 at 12:00:00 AM EDT Tags:

I’m working on…

Things other than work.

I’m grateful to Pillar for their part in two formative years that have changed how I feel about myself, how I prioritize, and how best to incorporate my work into my life.

I’m spending time on…

Converting health intentions back to health habits.

Publishing guest episodes of Agile in 3 Minutes, most recently 35: Define.

Making guest appearances on other podcasts, most recently Agile for Humans 058 with Llewellyn Falco, 061 with GeePaw Hill, and 062 with Lisa Crispin, as well as S1E1 of The Agile Path and the accompanying retrospective.

Designing material for upcoming conference appearances, starting with this week’s Agile Alliance Technical Conference.

Rediscovering my digital piano.

Preparing, even though I can’t know exactly how, for my priorities to soon change.

I’m looking forward to…

Reaping health benefits from health habits.

Deepening relationships with family, friends, and colleagues.

Doing short-term, high-value consulting, coaching, and training in the New York metropolitan area (or remote).

I’m traveling to…

See my speaker page.


What’s this?

It’s a /now page.

nownownow.com is a directory of people with /now pages. I’m listed there.

Posted April 17, 2017 at 05:46:09 PM EDT Tags:

Previously

Intro

As a sysadmin (“Ops”), I’ve had root responsibilities on NetBSD, FreeBSD, OpenBSD, Mac OS X, Solaris, and weirder systems like IRIX, AIX, and Linux. As a developer (“Dev”), I’ve used a bunch of those, sometimes without root privileges. Across all these contexts, to do my job reliably and productively, I found it often helped to use a cross-platform Unix package manager.

In the course of a couple decades of running mail servers, for pay and for myself, I’ve made a “DevOps” habit of observing and automating. This post is about how I recently synthesized old skills with new ones to make a big improvement in my mail-hosting automation.

Are any of your contexts like mine? Read on. You may conclude that my approach and/or tools could help you in your work. If so, maybe you can come to my hands-on workshop in a few weeks. If not, maybe we can work together another way.

Background

I’ve been running NetBSD for 20 years: from 1.1 in high school to 7.1, released two weeks ago. NetBSD’s continued growth and relevance is one of the reasons I keep choosing to run it.

I’ve been running qmail for almost as long: from 1.03 in my first job to 1.03 (plus a few third-party patches) today. qmail’s lack of continued growth and relevance makes it harder for me to keep choosing to run it. Yet I still do.

A friend wondered, in response to a recent post about refactoring my mail-server configuration, why I continue to make this choice when Postfix is ubiquitous and easy to work with. My friend made a good point. Postfix is the default MTA shipped with many operating systems — including NetBSD — and on the few occasions I’ve needed to tweak Postfix from its default behavior, it’s been easy enough. If it’d existed when my first employer and I needed an alternative to Sendmail, I’m sure I’d be happily running Postfix today.

But qmail 1.03 did exist by 1998. And by force of its compellingly different and better architecture, I found it both necessary and possible to understand SMTP, the nuances of running a production mail server on the open Internet, and the details that go into safely and reliably delivering a single message. qmail’s author, Dan Bernstein, also invented a new Maildir storage format whose desirable runtime behaviors delivered me a message about the significance of finding the right data structure — one of the few lessons I managed to learn in college-the-first-time. (A few years later, Bernstein’s djbdns would teach me very similar lessons about DNS.)

Anywho, I responded to my friend with the offhand rationale that my choice to continue with qmail stems from

A mélange of

As my weekly server update has grown more streamlined, the sharp edges of the qmail package have grown more salient. And a couple weeks ago, they scratched a friend trying to install it on a fresh NetBSD instance. So I wanted to take another stab at making things easier for my friend (and future-me, who I hope to be friends with).

It went well. Now that I’ve composed NetBSD and qmail into their finest two-part harmony yet, I’d like to tell you a tale of software archaeology, a tale replete with flexibility, persuasion, persistence, and very occasional free time. And possibly just a smidge of Sunk Cost Fallacy.

It’s not my shortest story. (Did I mention Sunk Cost Fallacy?) But if the problems I’ve tried to solve sound like problems you’re trying to solve, there’s likely something in it for you.

1997: Build from source

What’s wrong with installing software from source code by manually following its instructions? When you’re trying to learn how it works, not much at all. If you do this with qmail, you’ll learn how it

  • Makes use of several Unix users and groups for security partitioning,
  • Requires that those users and groups be created before it’ll compile,
  • Installs its programs and documentation to locations that need additional configuration to be useful to users, and
  • Expects its daemons to be managed in a way that almost certainly departs significantly from your OS’s usual mechanism (such as NetBSD’s rc.d subsystem).

When you’re trying to learn how qmail works, these are all important things to know. When you’re trying to integrate it into your systems, your habits, and your automations, these are important things to be able to forget.

There’s an abstraction layer for hiding such details: a package manager. NetBSD’s is spelled “pkgsrc” and pronounced “package source”.

2002: Maybe build from pkgsrc

Just about exactly 15 years ago, after I’d submitted a series of patch requests to pkgsrc, I was invited to commit directly to the repository. As a project, pkgsrc was mainly focused on automating dependencies, builds, installs, and uninstalls of third-party code from source code. As a developer and user, so was I.

We had a package of qmail. It was pretty rough. It jumped through hoops to create those users and groups early enough in the build, it didn’t place qmail’s programs or documentation in users’ existing paths, it didn’t offer NetBSD-style rc.d scripts, and if memory serves, uninstalling it didn’t really uninstall anything. And while pkgsrc could be used to generate binary packages, the qmail package very sensibly refused to.

My first commit to qmail (original) was from a future developer’s patch request. He provided an rc.d script that started and stopped qmail-send. Pretty cool.

2004: Definitely build from pkgsrc

Someday I hoped to produce a binary package that worked well. If I could start making our source package work well, I might get there someday.

One of DJB’s conditions for redistributions of qmail was that all files must be addressable in their documented default locations under the /var/qmail hierarchy. One of pkgsrc’s conditions for uninstalling files was that they must be stored under the /usr/pkg hierarchy. And, of course, if the programs and manual pages were installed into /usr/pkg, they’d be instantly available to users. Inspired by an idea from Matthew R. Green on the qmail discussion list, I taught the package to create symlinks under /var/qmail before running qmail’s installer, which then followed the symlinks. For example, a symlink in /var/qmail/bin pointing to /usr/pkg/bin induced qmail programs to install into /usr/pkg/bin as pkgsrc users would expect, while also being accessible via /var/qmail/bin as qmail users would expect.
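A minimal sketch of the symlink-before-install trick, written in Python rather than in pkgsrc's own make machinery. Everything here is illustrative: the scratch directory stands in for the real filesystem root, and the one-file "installer" is a hypothetical stand-in for qmail's real installer.

```python
import os
import tempfile
from pathlib import Path


def install_via_symlink(root: Path) -> Path:
    """Simulate the symlink-before-install trick under a scratch root."""
    pkg_bin = root / "usr" / "pkg" / "bin"   # where pkgsrc wants the files
    qmail_home = root / "var" / "qmail"      # where qmail expects the files
    pkg_bin.mkdir(parents=True)
    qmail_home.mkdir(parents=True)

    # Step 1: before the installer runs, point var/qmail/bin at usr/pkg/bin.
    (qmail_home / "bin").symlink_to(pkg_bin, target_is_directory=True)

    # Step 2: a stand-in installer that only knows about var/qmail/bin.
    installed = qmail_home / "bin" / "qmail-inject"
    installed.write_text("#!/bin/sh\necho stand-in\n")

    # The file physically lives under usr/pkg/bin, reachable by both paths.
    return pkg_bin / "qmail-inject"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        real = install_via_symlink(Path(scratch))
        print(real.exists(), os.path.realpath(real))
```

The installer never learns about /usr/pkg; it follows the symlink it was handed, which is why no path-related patches to the installed software are needed.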

Score! The symlink-before-install trick provided the basis for many contemporaneous improvements (original) and also obviated the need for path-related hacks (original) in a handful of packages depending on qmail. After a lot of careful work, our package suddenly provided a rather high-functioning qmail installation.

It even followed a whole bunch of the recommendations codified by the qmail community elders in the Life with qmail configuration guide. There was one big recommendation, by virtue of its mere existence, that it could not follow: the qmail elders strongly recommended avoiding packages entirely, for reasons like those in the “1997: Build from source” section above. Nonetheless, I asked for their review and feedback, received it, and applied it. The gist: I removed my NetBSD-specific customizations from the qmail package (original) and re-added them as a new qmail-run package (original), along with other tweaks intended to help users distinguish standard stuff that the community could support from custom stuff they should ask me about directly. (I wrote about the experience as a college-the-second-time application essay. Wound up going somewhere else.)

At this point, as I explained at a couple pkgsrcCons, for most “djbware” in pkgsrc, we could generate useful binary packages, but weren’t confident that we could meet DJB’s conditions for distributors.

For qmail in particular, binary packages were less useful. A qmail binary package built on one machine could run on another if and only if the numeric IDs of qmail’s users and groups were the same on both systems, which is not only mathematically exceedingly unlikely to happen at random, but often difficult to arrange for in the wild. FreeBSD Ports, among others, worked around this by permanently allocating a non-configurable set of IDs. Their decision could never have made sense for pkgsrc, where besides NetBSD we run on all manner of operating systems we don’t control. To fix it, we’d really have to fix it. But applying patches that changed visible behavior of the installed software was clearly against the redistribution conditions.

In late 2007, DJB placed qmail in the public domain. It became possible to apply the usual consideration for the costs and benefits of patching our package.

I was busy with college-the-second-time, particularly so with validating my goal to be a composer. For my own needs, I continued using source packages.

2011: Maybe install from a custom binary package

Motivated by the desire to improve the quality of our binary packages, pkgsrc grew an idea of DESTDIR. Source packages that knew about DESTDIR no longer installed directly onto the running system, but rather into a temporary staging directory from which a binary package was generated and then installed. (This was not a new idea; OpenBSD Ports, among others, beat us to it.) Once all our packages had been adapted, we could remove the old way. Then even source-centric developers like myself would notice when our binary packages weren’t working right.
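The staged-install idea can be sketched as three steps: install into a staging directory, archive exactly what was staged, then unpack the archive onto the target. This is a rough Python illustration that uses a plain tar archive as a stand-in for pkgsrc's real package format and tools; the function names and the single "hello" file are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def stage_install(destdir: Path, prefix: str = "usr/pkg") -> None:
    """Stand-in for an install step that writes under DESTDIR, not the live system."""
    bindir = destdir / prefix / "bin"
    bindir.mkdir(parents=True)
    (bindir / "hello").write_text("#!/bin/sh\necho hello\n")


def build_binary_package(destdir: Path, package: Path) -> None:
    """Archive exactly what was staged, and nothing else."""
    with tarfile.open(str(package), "w:gz") as archive:
        archive.add(str(destdir), arcname=".")


def install_binary_package(package: Path, root: Path) -> None:
    """Unpack the archive onto the target root."""
    with tarfile.open(str(package), "r:gz") as archive:
        archive.extractall(str(root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        base = Path(scratch)
        destdir, root = base / "destdir", base / "root"
        destdir.mkdir()
        root.mkdir()
        stage_install(destdir)
        build_binary_package(destdir, base / "hello.tgz")
        install_binary_package(base / "hello.tgz", root)
        print(sorted(str(p.relative_to(root)) for p in root.rglob("*")))
```

One consequence of this design: anything the install step does not place in the staging directory never reaches the target system, so every piece of runtime state has to be either staged or created by an explicit install-time step.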

I liked the sound of that and didn’t want to hold up progress, so I made what I thought were the smallest possible changes to minimally support DESTDIR (original).

Turns out, when the DESTDIR rapture finally came (original) in late 2015, I had not arranged for the qmail binary package to initialize its message queue.

The regression in our binary package occurred because my brain continued to assume it was not worth anyone’s time to try to use it, even though the meaning of the coming rapture was specifically that one day all package installs would be binary package installs.

The missing queue went unnoticed because on the mail server where I would’ve quickly noticed it, it hadn’t been missing! It was still there from before the DESTDIR rapture, when I’d first installed from the old-style source package.

(It also went unnoticed because very few people run qmail from pkgsrc. Even when lots of people ran qmail, pkgsrc’s package wasn’t great, pkgsrc itself was relatively unpopular, and running qmail from any sort of package was, as mentioned previously, strongly discouraged.)

2017: Definitely install from the pre-built 2017Q1 binary package

In late 2014, before everything went DESTDIR, I stopped managing packages from source directly on my server — it ate CPU, I/O, and disk space I preferred to allocate to serving — and started building binaries in a VM on my local development machine. Since I had chosen to generate binary packages, I had to configure qmail to build with numeric user and group IDs matching those already present on my server.

A couple weeks ago, trying to help my friend, I turned up a patch by Paul Fox that taught an early version of qmail to look up its numeric IDs at runtime. Using my growing C skills, I ported it forward and added it to pkgsrc (original).
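The difference between build-time and runtime IDs can be sketched in a few lines. This is not the patch itself (which is C, inside qmail); it is a Python illustration using the standard pwd and grp modules, and the frozen number below is a made-up value.

```python
import grp
import pwd
from typing import Callable

# Hypothetical value: whatever UID the account happened to have on the build machine.
BUILD_TIME_UID = 8004


def uid_at_build_time() -> int:
    """The old approach: the ID is frozen into the binary when it is compiled."""
    return BUILD_TIME_UID


def uid_at_runtime(name: str, lookup: Callable = pwd.getpwnam) -> int:
    """The patched approach: ask the running system's account database by name."""
    return lookup(name).pw_uid


def gid_at_runtime(name: str, lookup: Callable = grp.getgrnam) -> int:
    """Same idea for groups."""
    return lookup(name).gr_gid


if __name__ == "__main__":
    try:
        print(uid_at_runtime("qmailq"))
    except KeyError:
        # pwd.getpwnam raises KeyError when the account does not exist.
        print("no qmailq account on this system")
```

With lookup by name, the same binary works wherever the accounts exist, whatever numbers they were assigned. The injectable lookup parameter exists only so the sketch can be exercised without real accounts.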

With the patch, qmail’s compile process no longer needed any users or groups to exist. To also avoid needing them in the install process, I recorded the permissions qmail’s installer wanted to set, linked the installer with a fake chown() that does nothing, then applied pkgsrc’s mechanism for setting all the desired special permissions when installing a binary package. (I used the trick in Permanent CVS, temporary Git to work on a git branch, experiment freely, and keep only my best commits.)
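The record-now, apply-later idea can be sketched as follows. This is a loose Python illustration, not what the package does: the real fake chown() is C and does nothing at all, and the applying is done by pkgsrc's own mechanism. Here the stand-in also does the recording, and the paths and account names are illustrative.

```python
import os
from typing import Callable, List, Tuple

Request = Tuple[str, str, str]  # (path, owner name, group name)


class RecordingChown:
    """Stands in for chown during staging: remembers requests, changes nothing."""

    def __init__(self) -> None:
        self.requests: List[Request] = []

    def __call__(self, path: str, owner: str, group: str) -> None:
        self.requests.append((path, owner, group))


def apply_recorded(
    requests: List[Request],
    resolve_uid: Callable[[str], int],
    resolve_gid: Callable[[str], int],
    chown: Callable[[str, int, int], None] = os.chown,
) -> None:
    """At install time, on the target system, resolve names and set ownership."""
    for path, owner, group in requests:
        chown(path, resolve_uid(owner), resolve_gid(group))


if __name__ == "__main__":
    staging_chown = RecordingChown()
    staging_chown("var/qmail/queue/lock", "qmailq", "qmail")
    print(staging_chown.requests)
```

Because nothing is chowned while building, the build machine needs neither the accounts nor root privileges; ownership is only resolved where the package is finally installed.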

I thought I was just about done solving my friend’s problem. On a test system, I cleaned out all vestiges of qmail having ever been there, then built and installed it. That’s when I noticed there was no queue.

Huh?

Oh.

I don’t remember slapping my forehead, which might not mean I didn’t.

After over a year of breakage, I finally noticed my nearly six-year-old mistake. Eager to fix it and ship it, and not seeing an elegant solution, I settled for a less elegant one. queue-fix is a third-party tool that reuses qmail’s code to repair or create a queue. I added it as a dependency and scripted it to create the queue if missing.
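"Create the queue if missing" is a small idempotent check. A sketch in Python, where the create callback is a hypothetical stand-in for invoking queue-fix (its real command line is deliberately not reproduced here):

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_queue(queue_dir: Path, create: Callable[[Path], None]) -> bool:
    """Create the queue only if it is missing; return True if we created it."""
    if queue_dir.is_dir():
        return False  # an existing queue (and any mail in it) is left alone
    create(queue_dir)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        queue = Path(scratch) / "queue"
        print(ensure_queue(queue, Path.mkdir))  # first run creates it
        print(ensure_queue(queue, Path.mkdir))  # second run is a no-op
```

Running it twice is safe, which matters for a step that fires on every install or upgrade.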

As far as I can tell, the result of naively installing our qmail binary package now matches file-for-file, permission-for-permission, what you’d get installing it from source by hand. Also, it works exactly as well as it always has for me in production. The triumphant pkgsrc commit message (original) concludes:

A typical binary package should now:

  1. Install on any other system of matching OS and architecture,
  2. Not need matching numeric UIDs and GIDs to do so, and
  3. Be usable in production.

You know, like any other binary package.

The latest soon-to-be-released pkgsrc stable branch will include production-ready qmail binary packages for many popular platforms. Especially if you’re on NetBSD, but also on other platforms, pkg_add qmail-run will make you wonder what all this fuss was ever about.

Exactly the point!

Once 2017Q1 is out, the main reasons I can think of not to get your qmail from our published binary packages are:

  1. You don’t use pkgsrc or NetBSD (or qmail)
  2. You don’t see a binary qmail package for your OS and architecture
  3. You need a qmail with non-default PKG_OPTIONS

With the default PKG_OPTIONS, you’ll get netqmail with Christopher K. Davis’s oversize DNS and Paul Jarc’s realrcptto patches.

2017/04/04 Update: Here’s the 2017Q1 release announcement.

2017/04/10 Update: Binary qmail packages aren’t showing up in the usual places. A dependency on checkpassword — which, not being under any license, is not under one of the licenses pkgsrc deems acceptable by default — is preventing bulk builds from publishing qmail binaries. On the off chance DJB intends to include checkpassword in the list of software he has placed in the public domain, I’ve emailed to request that he do so now. In the meanwhile, I’ve removed the checkpassword dependency (original) and requested the change be applied to the 2017Q1 branch. When it’s been applied, we should start seeing qmail binary packages. Likewise qmail-run and its dependency on ofmipd from mess822, which I made optional, off by default (original).

2017/04/18 Update: Binary packages are here! Running qmail on NetBSD takes two minutes.

Why did I make this effort?

My mail server had been running fine before. As far as I can tell, it’s running exactly as fine now.

My friend who likes qmail is having trouble convincing himself it’s still practical to run his own mail server at all.

My friend who wonders why I stick with qmail, were he to read this, would surely be feeling a renewed sense of wonder right about now.

I’m not. I’ve long been investing myself in NetBSD, pkgsrc, and qmail, and they continue to feel like good investments. Someday there may be a new effect I want to accomplish with qmail whose marginal cost I won’t enjoy paying. When that happens, I’ll reevaluate alternatives and the cost of switching to them.

In the meanwhile, next time I install qmail, I’m happy to say there’s an excellent chance none of the experience and expertise that went into the writing of this post will be called for. Having shipped working code, I get to evict some longstanding want-to-dos and oh-yeah-that-weird-things from residence in my brain (and cruft from my weekly build config), and I get positive reinforcement that further improvements are also worth trying for. Both of these have the effect of increasing my overall area under the curve.

The more I can simplify and automate, the fewer attention-slices I can require of myself, the more likely my desired outcomes persist when my circumstances change. That’s pretty important to me right about now.

What I might do next

qmail lets the names of its users and groups be configured at build time. pkgsrc could pretty easily support this, retaining the current names as defaults.

Some of qmail’s attendant packages can optionally build with patches to add IPv6 support. Enabling IPv6 mail service would require at least an optional patch to qmail itself.

Patches and/or wrapper programs exist to implement Sender Policy Framework, Sender Rewriting Scheme, and DomainKeys Identified Mail. It may be useful to optionally include them.

Update: account names are configurable (original) and SRS is optional (original).

What you might do next

What do you think? Could you benefit from applying some of these DevOps tools and practices? I’m teaching a hands-on workshop at the upcoming Agile Alliance Technical Conference where you can learn them. And/or teach me how I can benefit from letting go of qmail. ;-)

Posted March 28, 2017 at 02:26:58 PM EDT Tags:

There’s a link in the sidebar where you can buy me a fancy coffee, but I’d much rather treat you, dear reader, to the beverage of your choice. Here are some upcoming chances for us to do that:

Boston, April 19-21: Agile Alliance Technical Conference

New York City, April 28-30: Agile Coach Camp

New York City, May 1: Big Apple Scrum Day

Ann Arbor, May 4-5: Agile and Beyond

Hope to see you. If not one of these, then another time.

Posted March 16, 2017 at 09:58:48 AM EDT Tags:

When solving a problem, we often take advantage of known solutions. For sufficiently small and repeatable problems, we can buy solutions at the store. Usually they don’t solve the entire problem all by themselves. Bleach doesn’t clean the toilet; I do, with the help of bleach and other tools.

In software problem-solving parlance, bleach is a “dependency”. It doesn’t have to be. I’m free to try to solve my problem using some other product. But for as long as I believe bleach is what I need — maybe because it gets me there sooner and more reliably than anything else I can think of — then I’m depending on it.

I’m also free to try to solve my problem using no products whatsoever. It might sound like an unqualified bad decision to get in there with my fingernails. But what if bleach and scrub brushes haven’t been invented yet? What if they have, but the store can’t reliably stock them? What if they can, but I can’t reliably get to the store or pay for them? Or what if cleaning toilets is my core business, and my main differentiator is artisanal hand-maintained porcelain?

Whether to add a new dependency, and if so which, are two of the many, many decisions we make every day in software development. We can have reasons for generally preferring to keep the total number small. But however small that is, it will never reach zero. For me to clean the toilet depends, at minimum, on whatever it takes to keep me alive, able, and present. And on the privilege of having a toilet.

Minimize the number

One Agile strategy for reducing dependency risk is to do the simplest thing that could possibly work. Sometimes that can mean copy-pasting a function from Stack Overflow instead of depending on a third-party library for it.

Another Agile strategy for reducing dependency risk is to maximize the amount of work not done. Sometimes we can deliberately decide that a sub-problem isn’t worth solving.

The only way to entirely avoid dependencies is to entirely avoid solving problems.

Then we’d need new jobs, which would be a problem. Sort of a circular dependency.

We need techniques for living with dependencies more effectively.

A dependency isn’t just one more line in a file. It’s

  • An expected interface (adapter pattern)
  • Noticing when it changes (contract tests)
  • And also all of its transitive dependencies (dependencies can have dependencies too), as Amber Conville reminded me.

Amber is the force behind Self.conference, which I’ve spoken at and attended and which just posted its list of sessions and speakers. If you’re a human involved in the making of software, I highly recommend it, and can give you a code good for 10% off your ticket.

In the moment where we decide to add one more dependency, it could look like just one more line in a file specifying names and minimum versions. In the moment where we finish the feature and get it out the door, that could feel like enough.

Dependency now, dependency later

We may think we’ve chosen to depend on something as it is today. That’s true.

We may think we can cheaply remove or replace it tomorrow if we have a better idea. That might be true too.

We may even think we can cheaply remove or replace it in a few months if need be. That’s a lot less likely to be true.

Why? If we’re not careful, our expectations of it will be dispersed throughout our code. If it then changes in a way that fails to meet our expectations, we’ll have to be expensively careful to adjust to it. If our own expectations need to change, we’ll have to be expensively careful about how we change them. And if we hadn’t been careful, the way we’ll find out we suddenly need to be expensively careful might well come at an expensive time: in production.

Maximize control

In software, when we are careful, we can arrange to reduce our dependency risk:

Adapter pattern: Define the calls we want to be able to make, then implement them by backing our interface with the dependency. Lets us adjust faster when something changes. (more on Wikipedia)

Contract tests: For each call our adapter makes to the dependency, write automated tests for the behavior we’re relying on. Tells us sooner when adjustments must be made. (more at Martin Fowler’s Bliki)

I follow similar reasoning when I update all the dependencies on my server every week. The tool I use for this recently underwent a major upgrade. So last week I did my build on Friday, a little early. Over the weekend, I made the fewest possible changes to bring forward my existing configuration, did a build with the new tool, found a regression, did another build, went to production, and reported the fix. If it hadn’t worked, I’d still have another week to figure it out or revert without interrupting my cadence. But since it did work, my dependency on the build tool has been safely managed through some breaking changes. Next week, if I feel like it, I can see about taking advantage of some of the new features.
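To make the adapter pattern and contract tests concrete, here is a minimal sketch. It assumes, purely for illustration, that the dependency is Python's standard json module standing in for a third-party library; the Serializer interface and its method names are hypothetical.

```python
import json
import unittest
from typing import Any, Protocol


class Serializer(Protocol):
    """The calls our own code wants to make, in our own vocabulary."""

    def encode(self, value: Any) -> str: ...

    def decode(self, text: str) -> Any: ...


class JsonSerializer:
    """Adapter: the only place in our code that touches the dependency."""

    def encode(self, value: Any) -> str:
        return json.dumps(value, sort_keys=True)

    def decode(self, text: str) -> Any:
        return json.loads(text)


class JsonContractTest(unittest.TestCase):
    """Contract tests: pin down the dependency behavior the adapter relies on."""

    def test_sorted_keys_give_stable_output(self):
        self.assertEqual(
            json.dumps({"b": 1, "a": 2}, sort_keys=True), '{"a": 2, "b": 1}'
        )

    def test_round_trip_preserves_nested_values(self):
        value = {"to": ["a@example.org"], "size": 3}
        self.assertEqual(json.loads(json.dumps(value, sort_keys=True)), value)
```

Saved to a file, the tests can be run with python -m unittest. If an upgrade of the dependency changes either behavior, these tests fail first, and the only production code that needs to change is the adapter.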

Given these techniques, another Agile risk-mitigation strategy clicks into place: If it hurts, do it more often. Figure out how to get notified when any of your dependencies have been updated. When you get a notification, before you do anything else, update your code to use the new version. If tests break, before you do anything else, fix them. If it’s not clear how to do that, before you do anything else, raise the risk with your team. By uncovering the dependency problem as early as possible, you’ve maximized your options for handling it well.

Once your tests are green, ship it as soon as you can. If a dependency problem somehow slips past your tests into production, it’ll be relatively easy to find, because you’ve narrowed the search space considerably. Roll back first, if you have to. Then test-drive a bugfix and ship it again.

The goal here, given that surprises are inevitable, is to control the influx of entropy into your system. Track updates and put out releases less frequently, and the surprises get larger, take longer to track down, and offer fewer, more expensive options for resolution. Or reduce batch size to get the opposite effects.

Sorry, I only understand toilets

No problem. To keep your delivery schedule from getting clogged, flush regularly.

Here’s how I documented my reasoning for a product I both managed and tech lead-ed:

Why release every month?

  1. Each release contains less total change. Why this matters:
    • Code change is risky. Smaller increments of change help manage the risk.
    • If a bug survives into production, finding it is easier, so fixing it is faster.
    • If a feature doesn’t meet expected requirements, customers will complain earlier, so fixing it can happen earlier.
  2. The next release is always soon. Why this matters:
    • Small features (or bugfixes) don’t have to wait long to get into customers’ hands.
    • Big features can only be delivered via composable solutions, implemented one tractable piece at a time.
    • Each change can be tested well because it’s small. Each change must be tested well because it’s about to ship.
    • The master branch is always production-ready. We can always ship what we have right now.
  3. The previous release is always recent. Why this matters:
    • Release deployment is risky. More frequent practice — and being able to remember what went wrong last time — helps manage the risk.

Why skip a month?

  1. If Operations is fully booked on other product releases and doesn’t have someone available.
  2. If up- or downstream systems are changing and it’s too risky to change ours at the same time.
  3. If a particular big feature simply can’t be decomposed into month-sized chunks of work. (This is very rare.)

When do you notify Operations of new dependencies to package?

By Thursday afternoon, the day before release day, we have a complete or near-complete list. In general, we try to declare each codebase’s dependencies in one place so that we can simply diff it against the previous release to see everything that’s new (updated counts as new). Then we list those dependencies on the release’s wiki page and notify our local Operations representative.
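The "diff it against the previous release" step might look something like this sketch. It assumes dependencies are declared one per line as name==version, which is an illustrative format and not necessarily the one that product used; the package names in the demo are made up.

```python
from typing import Dict, List


def parse_dependencies(text: str) -> Dict[str, str]:
    """Parse 'name==version' lines, ignoring blank lines and # comments."""
    deps: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        deps[name.strip()] = version.strip()
    return deps


def new_or_updated(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Everything that needs packaging: added or changed (updated counts as new)."""
    return sorted(
        f"{name}=={version}"
        for name, version in current.items()
        if previous.get(name) != version
    )


if __name__ == "__main__":
    last_release = parse_dependencies("libfoo==1.0\nlibbar==2.3\n")
    this_release = parse_dependencies("libfoo==1.0\nlibbar==2.4\nlibbaz==0.1  # added\n")
    for item in new_or_updated(last_release, this_release):
        print(item)
```

Keeping each codebase's declarations in one place is what makes this a two-file comparison instead of a search through the source tree.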

Occasionally a last-minute code change will add another dependency or two. Asking Operations to build one or two more last-minute packages isn’t terrible if we don’t do it often. More packages than that means the change probably isn’t a smart last-minute choice.

Why are you always upgrading to the latest available dependencies?

Because we depend on them.

Less elliptical answer: because they’re code we rely on but don’t control. Therefore we’re especially susceptible to changes in them. Therefore we minimize our exposure by staying up to date whenever possible, giving us the easiest possible rollback option when (inevitably) an unexpected problem occurs.

See also “Why release every month?”

Another dependency inversion principle

GeePaw Hill likes to say, “The code works for me, I don’t work for the code.” If you’d like to put dependencies fully at your service, and not the other way around, I invite you to join me next month for my hands-on workshop at the Agile Alliance Technical Conference.

Posted March 8, 2017 at 01:11:42 PM EST Tags: