Product vs Product Operations

When I lived in Amsterdam I owned - like most Dutch people - a bike. I used it to commute to work, go to parties, race to buy groceries just before the store closed; in short, I used it every day. Or at least I did, until my bike stopped booting up.

The bicycle I owned was a first-generation VanMoof - a hip, fancy city bike with a built-in computer. The computer itself is the bike's brain: it connects to your smartphone via Bluetooth, sends out a distress beacon if the bike has been stolen, and turns the integrated lights and alarm on and off. You can't ride the bike without this last feature.

On a late post-lockdown Friday night, I jumped on the bike after a few drinks and, instead of being greeted with the regular electronic boot tunes and the lights flickering to life, I got absolute silence and no unlock. My attempts to boot the on-board computer into recovery mode were unsuccessful: whether I used the app or the physical buttons themselves, all I got from the bike was a weird 'error' tune. It took a weekend trip to the store - where they took the frame apart and completely replaced the hardware inside - to get my bike back into a functioning state.

Simplifying failure modes

In my career I have been thinking about that 'error' tune a lot. The error tune the bicycle played was completely different from every other sound cue the bike used. In short, the bike is aware that something has gone awry - something that can only be fixed by fully replacing the on-board computer assembly - and it tells you to seek assistance with a classic 'hardware fault' tune.

That means someone designed the boot sequence to check whether something has gone irreparably wrong, and give the 'bleep' tune instead of a regular startup sound.

Which of course means someone signed off on a 'play error tune when firmware is irreparably damaged' feature: someone consciously made a product decision to have the bike recognize that the startup sequence has gone wrong and, instead of trying to fix it, simply play an 'error' tune and turn off, stuck in an infinite boot loop.

The knee-jerk reaction as a Product Manager would be to complain about how the product was shipped in an unfinished state, and about the abysmal user experience of having to go to a physical store to get Operations to repair your bike.

I'm just not sure that's the case: in fact, I think that whoever shipped this bike with a simple try/catch block during startup and a distinctive 'error' tune is actually a pretty brilliant product person.

Turning product features into operational processes

When shipping a product it's important to follow the triumvirate of good products: scope well, build well, test well. But, especially when scoping, it's way too easy to overthink every possible edge case. Even more failure cases will pop up when building and testing, and fixing them will quickly turn your strategic product launch into a nasty game of whack-a-mole. As team morale drops and more features start creeping in, you'll start having second thoughts about the product itself.

Whoever designed the bike didn't fall for this, and instead managed to ship the product by aggressively collapsing a very wide class of problems ('something is wrong in either firmware or hardware') into a single user experience loop:

  1. refuse the boot sequence,
  2. play a distinctive 'error' tone to tell the user to seek assistance,
  3. write a process for operations to solve the issue at the store.
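
The first two steps of that loop fit in a few lines of code. What follows is only a sketch of the idea, not the bike's actual firmware: the names boot, run_self_checks, BootFault, play_tune and unlock are all made up for illustration, and the checks are stand-in functions that return True or False.

```python
class BootFault(Exception):
    """Hypothetical: any unrecoverable problem found during startup."""


def run_self_checks(checks):
    """Run each (name, check) pair; a check fails by returning False or raising."""
    for name, check in checks:
        if not check():
            raise BootFault(name)


def boot(checks, play_tune, unlock):
    """Return True if the bike booted, False if it refused to boot."""
    try:
        run_self_checks(checks)
    except Exception:
        # One catch-all for every failure: refuse to boot and play the
        # single distinctive tune. No recovery logic lives here, because
        # the fix is an operations process at the store.
        play_tune("error")
        return False
    play_tune("startup")
    unlock()
    return True


# Usage with stand-in checks (assumed names, not real sensors).
played = []
healthy = [("battery", lambda: True), ("bluetooth", lambda: True)]
broken = [("battery", lambda: True), ("firmware", lambda: False)]

boot(healthy, played.append, lambda: None)
boot(broken, played.append, lambda: None)
```

The deliberately broad except is the whole point of the sketch: every failure, known or unknown, ends in the same tune and the same trip to the store.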

The bike, as a core product, is still operable when this feature is triggered. However, the interface (the lights and sound assembly) clearly signals to the end user that something is wrong. It does so with a very clear and difficult-to-ignore signal (the unique 'error' bleep). When the faulty state is triggered, the customer gets handed over to the operations team with a 'fix' request, which is handled relatively easily by swapping the ~50 € on-board computer in a few minutes, entirely free of charge.

Imagine the cost of developing all kinds of feature flags and reboot sequences for a hardware product that has no possible way of receiving over-the-air updates. Hundreds of possible edge cases, most of which would never be triggered, would need to be scoped and developed. And of course you'd probably end up missing a few - leading to confusion when the bike inevitably breaks in an unexpected way and goes completely silent.

Within the product's constraints (fast development, no way of updating the firmware, a first-generation bike only sold in a small geographical region with easy access to brand stores for repairs), lifting the 'faulty boot sequence' set of edge cases out of the product scope and into the operations/support scope makes total sense, and is a smart, time-effective and cost-effective solution.

Whoever designed this bike turned a whole class of product features into a straightforward and simple process to be performed by Operations, with minimal impact on the user experience and in line with the principle of Least Astonishment. The takeaway here is that it's usually better to leave fault-tolerant design to aerospace engineers, and to focus on delivering products that are 99% functional and solve the remaining 1% of edge cases with a well-defined operations and customer support flow.

The bike that didn't boot did exactly that - hats off to the PM who shipped it.