
In my tenure as co-founder and principal engineer at InVision, I went from never having heard of "Feature Flags" (aka "feature toggles" aka "feature switches"); to seeing them become widely adopted by our engineering team; to witnessing a complete cultural revolution in regard to how our company approached product development. For me, feature flags are as transformational as databases—they are as important as both logs and metrics. I cannot imagine creating another product without them.

I believe that I have a perspective worth sharing. I want to help people see the magic that I see. I want to help teams deliver value to their customers with love and empathy and without fear.


Feature Flags: From Concept to Cultural Revolution

by Ben Nadel

Dedicated to my wife, who tells me that I can do anything;

And, to my dog, who loves me unconditionally.

Table of Contents

  1. Who Should Read This Book
  2. Caveat Emptor
  3. Of Outages And Incidents
  4. The Status Quo
  5. Feature Flags, An Introduction
  6. Key Terms and Concepts
  7. Going Deep on Feature Flag Targeting
  8. The User Experience (UX) of Feature Flag Targeting
  9. Types of Feature Flags
  10. Life-Cycle of a Feature Flag
  11. Use Cases
  12. Server-Side vs. Client-Side
  13. Bridging the Sophistication Gap
  14. Life Without Automated Testing
  15. Ownership Boundaries
  16. The Hidden Cost of Feature Flags
  17. Not Everything Can Be Feature Flagged
  18. Build vs. Buy
  19. Track Actions, Not Feature Flag State
  20. Logs, Metrics, and Feature Flags
  21. A Cultural Revolution
  22. People Like Us Do Things Like This
  23. Building Inclusive Products
  24. An Opinionated Guide to Pull Requests (PRs)
  25. Removing the Cost of Context Switching
  26. Measuring Team Productivity
  27. Increasing Agility With Dynamic Code
  28. Product Release vs. Marketing Release
  29. Getting From No to Yes
  30. What If I Can Only Deploy Every 2-Weeks?
  31. I Eat, I Sleep, I Feature Flag
  32. About The Author

Who Should Read This Book

I have written this book for any team that is building customer-facing software; and, is keen to build a more stable, more rewarding product for said customers. This might be in the context of a consumer product (B2C), an enterprise product (B2B), or an internal intranet in which your coworkers are your "customers".

If you've ever been frustrated by the pace of development, this book is for you. If you've ever felt disconnected from your organization, your mission, or your customers, this book is for you. If you've ever imagined that there must be a better way to approach product development, I can tell you there is; and, this book is for you.

Feature flags are primarily an engineering tool. As such, I am speaking primarily to my fellow engineers. I believe that we engineers inhabit a uniquely potent role within the organization. We exist at the nexus of design, form, function, user experience, and platform stability. We communicate with Support engineers, Sales associates, Designers, Product Managers, Project Managers, Technical writers, and customers. This centrality gives us an opportunity to break down barriers and help heal the cultural problems that plague our companies and our productivity.

That said, this book certainly holds value for non-engineers. Product development is a collaborative process. And, we build our best products when we work together in harmony. The sooner we can all start moving in the same direction with the same priorities, the sooner we can start shipping products with confidence (and without fear). This book will help you reset unhealthy organizational expectations, build psychological safety, and show you a product development strategy that is both iterative and inclusive.

There are no prerequisites here. Your team doesn't have to be a certain size or reach a certain level of engineering complexity before you start using feature flags. In fact, feature flags help bridge the sophistication gap between small, scrappy teams and large, vertically integrated teams. All that you really need is a desire to build better products. And, a belief that great things will happen when you start operating from a place of love and generosity.

If you are in the early stages of product formation and you don't yet have customers, feature flags won't help you all that much. They can still aid in feature optimization and serve to bootstrap certain feature modules. But, until you have customers, your chosen approach to product development is simply less meaningful.

Of course, you should be aiming to pull customers into your development process as soon as possible. And, feature flags will let you do this safely and effectively.

Feature flags work particularly well for web-based software (my area of expertise); which is where this book will be focused. But, I have seen other teams use feature flags with great success in both desktop software and mobile apps.

Caveat Emptor

I have opinions. Often, these opinions are strong; and, in most cases, these opinions are strongly held. But, they are just my opinions. In this book, I will speak with an air of authority because I believe deeply in what I am saying based on what I have seen: a team transformed. But, what I am saying is based on my own experience, context, and organizational constraints. What I say may not always apply perfectly to you and your situation.

You are a discerning, creative person. You are here because you are enrolled in the work of building better products; of building more effective teams; and, of delivering more value to your customers. Do not let that creativity take a back seat as you read this book. Be critical, but open; question my assertions, but do not dismiss them out of hand.

Feature flags are a deceptively simple concept. It can be hard to understand the extent of the impact they have on your team because the implications aren't only technical. If all you learn from this book is how to use feature flags as a means to control flow within your software, this book will be worth reading. However, the true value of what I'm sharing here lies in the holistic cultural change that feature flags can bring to every part of your product development life-cycle.

This book does not represent an all-or-nothing approach to product development. But, I do believe that the more you take from this book, the more you will get.

Of Outages And Incidents

I used to tell my people: "You're not really a member of the team until you crash production".

In the early days of the company, crashing production—or, at least, significantly degrading it—was nothing short of inevitable. And so, instead of wielding each outage as a weapon, I treated it like a rite of passage. This was my attempt to create a safe space in which my people would learn about and become accustomed to our product.

I knew that engineers needed to ship code. This is a matter of self-actualization. Pushing code to production benefits us—and our mental well-being—just as much as it benefits our customers.

But, when our deployments became fraught, our engineers became fearful. They began to overthink their code and under-deliver on their commitments. This wasn't good for the product. And, it certainly wasn't good for the team. Not to mention the fact that it created an unhealthy tension between our Executive Leadership Team (ELT) and—well—everyone else.

The more people we added to our engineering staff, the worse this problem became. Poorly architected code with no discernible boundaries soon led to tightly coupled code that no one wanted to touch, let alone maintain. A change in one part of the application often had a surprising impact in another part of the application.

The whole situation felt chaotic and unstoppable. And, at the time, the best we thought we could do was prepare people for the trauma. We implemented an "IC Certification" program. The "IC"—or, Incident Commander—was responsible for getting the right people in the (Slack) room and then liaising between the triage team and the rest of the organization.

To become IC certified, you had to be trained by an existing IC at the company. You had to run through a mock incident and demonstrate that you understood the job: getting the right people in the room, coordinating the triage effort, and communicating status to the rest of the organization.

This IC training and certification program was mandatory for all engineers. Our issues were very real and very urgent; and, we needed everyone to be prepared.

In general, the certified ICs were good at communicating status, but each IC had their own style—their own way of writing and presenting updates to the team. This led to a lot of fumbling and inconsistency, which ultimately distracted us from the goal at hand.

To address this, I built a small utility that brought order to the output: Incident Commander. This online tool provided ICs with a way to translate status updates into pre-formatted Slack messages which the IC could then copy-and-paste into the #incident channel.

As a team, we became quite adept at responding to each incident. And, in those early days, this coalescing around the chaos formed a camaraderie that bound us together. Even years later, I still look back on those Zoom calls with a fondness for how close I felt to those that were there fighting alongside me.

But, the kindness and compassion that we gave each other internally meant nothing to our customers. The incidental joy that we felt from a shared struggle was no comfort to those that were paying us to provide a stable product.

Our CTO (Chief Technical Officer) at the time understood this. He never measured downtime in minutes, he measured it in lost opportunities. He painted the picture of customers, victimized by our negligence:

"People don't care about SLOs (Service Level Objectives) and SLAs (Service Level Agreements). 30-minutes of downtime isn't much on balance. But, when it takes place during your pitch meeting and prevents you from securing a life-changing round of Venture Capital, 30-minutes means the difference between a path forward and a dead-end."

The CTO put a Root Cause Analysis (RCA) process in place, and personally reviewed every single write-up. Remediating an incident was only a minor victory—preventing the next incident was the real goal. Each RCA included a technical discussion about what happened, how we became aware of the problem, how we identified the root cause, and the steps we intended to take in order to prevent it from occurring again.

The RCA process—and the subsequent Quality Assurance Items (QAI)—did create a more stable platform. But, a platform is merely the foundation upon which the product lives. Most of the work that we were doing took place above the platform, in the ever-evolving user-facing feature-set. To be certain, a stable platform is a necessity. But, as the platform stabilized, the RCA process began to see a diminishing return on investment (ROI). Even as the platform improved, the outages continued to happen.

In a last ditch effort to bring about better outcomes, a Change Control Board (CCB) was put in place. A CCB is a group of subject matter experts (SMEs) who must review and approve all changes being made to the product.

A Change Control Board is the antithesis of worker autonomy. It is the antithesis of productivity. A Change Control Board says, "we don't pay you to think." A Change Control Board says, "we don't trust you to use your best judgement." If workers yearn to find fulfillment in self-direction, increasing responsibility, and a sense of purpose, the Change Control Board seeks to strip responsibility and treat workers as nothing more than a mindless resource.

And still, with the choke-hold of the CCB in place, the incidents continued.

After working on and maintaining the same product for over a decade, I have the luxury of hindsight and experience. I can see what we did right and what we did wrong. The time and effort we put into improving the underlying platform was time well spent. Without a solid foundation on which to build, nothing much else matters.

Our biggest mistake was trying to create a predictable outcome for the product. We slowed down the design process in hopes that every single pixel was in the correct location. We slowed down the deployment pipeline in hopes that every single edge-case and feature had been fully-implemented and tweaked to perfection.

We thought that we could increase quality and productivity by slowing everything down. But, the opposite is true. Quality increases when you go faster. Productivity increases when you work smaller. Innovation and discovery happen at the edge, in the unpredictable, heady space where Product and Customer meet.

Eventually, we learned these lessons. Outages and incidents went from a daily occurrence to a weekly occurrence to a rarity. At the same time, productivity went up, team morale was boosted, and our paying customers finally started to see the value that we promised them.

But, none of it would have been possible without feature flags.

Feature flags changed everything.

The Status Quo

There's no "one way" for organizations to build and deploy a product. Even a single engineer will use different techniques in different contexts. When I'm at work, for example, I use a Slack-based chatbot to trigger new deployments; which, in turn, communicate with Kubernetes; which, in turn, execute an incremental rollout of new Docker containers. But, in my personal life—on side projects—I still use FTP (File Transfer Protocol) in order to manually sync files between my local development environment and my VPS (Virtual Private Server).

No given approach to web development is inherently "right" or "wrong". Some approaches do have advantages. But, everything is a matter of nuance; and, every approach is based on some set of constraints and trade-offs. At work, I get to use a relatively sophisticated deployment pipeline because I stand on the shoulders of brilliant teammates. But, when I'm on my own, I don't have the ability to create that level of automation and orchestration.

Though many different approaches exist, most build and deployment strategies do have one thing in common: when code is deployed to a production server, users start to consume it. Add a new item to your navigation and—immediately—users start to click it. Change a database query and—immediately—users start to execute it. Refactor a workflow and—immediately—users start to explore it.

We're decades beyond the days of shipping floppy disks and CD-ROMs; but, most of us still inhabit a state in which deploying code and releasing code are the same thing. Having users come to us (and our servers) gives us the ability to respond to issues more rapidly; but, fundamentally, we're still delivering a "static product" to our customers.

To operate within this limitation, teams will oftentimes commit temporary logic to their control flow in order to negotiate application access. For example, a team may only allow certain parts of an application to be accessed if the requesting user is an administrator; or, if the request contains a special URL parameter; or, if the request originates from inside the company network.

I've used many of these techniques myself. And, they all work; but, they are all subpar. Yes, they do allow internal users to inspect a feature prior to its release; but, they offer little else in terms of dynamic functionality. Plus, exposing the gated code to a wider audience requires changing the code and deploying it to production. Which means, in turn, that re-gating the code—such as in the case of a major bug or incident—requires the code to be updated or reverted and re-deployed.

Going beyond their hard-coded nature, these techniques often treat deployment as an afterthought. Meaning, they are typically implemented only after a feature has been completed and is now ready for review. This implies that the feature has been articulated and developed in relative isolation.

This is referred to as the "Waterfall Methodology". Waterfall Methodology comprises a set of product development stages that get executed in linear sequence:

  1. Analysis and requirements gathering.
  2. Graphic design and prototyping.
  3. Implementation.
  4. Testing and quality assurance (QA).
  5. Deployment to production.
  6. Maintenance.

Each one of these stages is intended to be completed in turn, with the outputs from one stage acting as the inputs to the next stage.

At first blush, the Waterfall Methodology is attractive because it looks to create structure and predictability. But, this is mostly an illusion. In the best case scenario, the timeline for such a project is grossly underestimated. And, in the worst case scenario, the engineering team builds the wrong product.

Years ago, I worked at a company that always used the waterfall methodology. On one particular project for a data-management tool, the team carried out the requirements gathering, did some graphic design, and then entered the implementation phase as per usual. But, building the product took a lot longer than anticipated (which is standard even in the "best case" scenario). And, naturally, the client was very upset about the slipping release date.

Finally, after many delays and many heated phone calls and much triaging, the team performed their "big reveal" to the client. And, after walking through the product, the client remarked, "This isn't at all what I asked for."

It turns out, there was a large understanding gap in the requirements gathering phase. This understanding gap was then baked into the design process which was subsequently baked into the engineering process.

The client was furious about the loss of time and money (with nothing to show for it). The engineers were furious because they felt that the client hadn't been forthcoming during the analysis phase (classic victim blaming). And, of course, product management was furious because the project failed and reflected poorly on our firm.

This isn't a black swan event. Many of us in the product development space have similar stories. And, as an industry, we've come to mostly agree that the Waterfall Methodology is problematic; and, that "Agile Methodology" is the preferred approach.

The Agile Methodology takes the Waterfall Methodology, shrinks it down in scope, and repeats it over and over again until the product is completed. The cornerstone of Agile is a strong emphasis on "People over processes and tools", maintained through a continual feedback loop between the team and its stakeholders.

If our team had been using an Agile Methodology (in the earlier anecdote), success would have been achieved. The client would have seen early-on that the product was moving in the wrong direction and they would have told us. In turn, the design and engineering teams would have changed course and adapted to the emergent requirements. And, ultimately, all stakeholders would have been happy with the outcome.

The Agile Methodology is fantastic!

At least, it is within a "greenfield" project—one in which no prior art exists. But, as soon as you release a product to the customers, all subsequent changes are being made within a "brownfield" project. This is where things get tricky. Or, expensive. Oftentimes both.

Agile Methodology has clear advantages over Waterfall Methodology; but, eventually, both approaches run into the same fundamental problem: deploying code to production is still a dangerous proposition. And, even with "agile", bad things will happen. And so, the fear creeps in; and soon, even "agile" teams with the best intentions start to fall back into old waterfall tendencies in an effort to protect themselves.

In hopes of leaving nothing to chance, the design process becomes endless. An eroding trust in the engineering team leads to a longer, more tedious QA period. Paranoia about outages means no more deploying on Fridays (or, perhaps, with even much less frequency). Test coverage percentage becomes a target. Expensive staging environments are created and immediately fall out-of-sync. A product manager creates a deployment checklist and arbitrarily makes "load testing" a blocking step.

The whole system—the whole process—bloats; and, starts to creak and moan under the pressure (of time, of cost, of expectation) until, at some point, someone makes the joke:

Work would be so great if it weren't for all the customers.

In an industry where customer empathy builds the foundation of all great products, wanting to work without customers becomes the sign of something truly toxic: cultural death.

This isn't leadership's fault. Or the fault of the engineers or of the managers or of the designers. This doesn't happen because the wrong technology stack was chosen or the wrong management methodology was applied. It has absolutely nothing to do with your people working in-office or being remote. This happens because a small seed of fear takes root. And then grows and grows and grows until it subsumes the entire organization.

Fear erodes trust. And, without trust we don't feel safe. And, if we don't feel safe, the only motivation that we have left is that of self-preservation.

This sounds dire (and it is); but, it isn't without hope. All we have to do is address the underlying fear and everything else will eventually fall into place. This transformation may not come quick and it may not come easy; but, it can be done; and, it starts with feature flags.

Feature Flags, An Introduction

I've been working in the web development industry since 1999; and, before 2015, I'd never heard the term, "feature flag" (or "feature toggle", or "feature switch"). When my Director of Product—Christopher Andersson—pulled me aside and suggested that feature flags might help us with our company's "downtime problem", I didn't know what he was talking about.

Cut to me in 2023—after 8-years of trial, error, and experimentation—and I can't imagine building another product or platform without feature flags. They have become a critical part of my success. I put feature flags in the same category as I do Logs and Metrics: the essential services on which all product performance and stability are built.

But, it wasn't love at first sight. In fact, when Christopher Andersson described feature flags to me in 2015, I didn't see the value. After all, a feature flag is "just" an if statement:

if ( featureIsEnabled() ) {

	// Execute NEW logic.

} else {

	// Execute OLD logic.

}

I already had control flow that looked like this in my applications (see The Status Quo). As such, I didn't understand why adding yet another dependency to our tech-stack would make any difference in our code, let alone have a positive impact on our downtime.

What I failed to see then was the fundamental difference underlying the two techniques. In my approach, changing the behavior of the if statement meant updating the code and re-deploying it to production. But, in the case of feature flags, changing the behavior of the if statement meant flipping a switch.

That's it.

No code updates. No deployments. No latency. No waiting.

This is the magic of feature flags: having the ability to dynamically change the behavior of your application at runtime. This is what sets feature flags apart from environment variables, build flags, and any other type of deploy-time or dev-time setting.

To stress this point: if you can't dynamically change the behavior of your application without touching the code or the servers, you're not using "feature flags". The dynamic runtime nature isn't a nice-to-have, it is the fundamental driver that brings both psychological safety and inclusion to your organization.

This dynamic nature means that in one moment, our feature flag settings can look like this:

User interface (UI) control for a product feature flag currently in the "Off" state.

Which means that our application's control flow operates like this:

if ( featureIsEnabled() /* false */ ) {

	// ... dormant code ...

} else {

	// This code is executing!

}

The featureIsEnabled() function is currently returning false, directing all incoming traffic through the else block.

Then, if we flip the switch on in the next moment, our feature flag settings look like this:

User interface (UI) control for a product feature flag currently in the "On" state.

And, our application's control flow operates like this:

if ( featureIsEnabled() /* true */  ) {

	// This code is executing!

} else {

	// ... dormant code ...

}

Instantly—or thereabouts—the featureIsEnabled() function starts returning true; and, the incoming traffic is diverted away from the else block and into the if block, changing the behavior of our application in (near) real-time.

But, turning a feature flag on is only half the story. It's equally important that—at any moment—a feature flag can be turned off. Which means that, should we need to (in case of emergency), we can instantly disable the feature flag settings:

User interface (UI) control for a product feature flag currently in the "Off" state.

Which will immediately revert the application's control flow back to its previous state:

if ( featureIsEnabled() /* false */ ) {

	// ... dormant code ...

} else {

	// This code is executing (again)!

}

Even with the illustration above, this is still a rather abstract concept. To convey the power of feature flags more concretely, let's dip-down into the actual use-case that opened my eyes up to the possibilities: refactoring a SQL database query.

The efficiency of a SQL query changes over the lifetime of a product. As the number of rows increases and the access patterns evolve, some SQL queries start to slow down. This is why database index design is just as much art as it is science.

Traditionally, a refactoring of this type might involve running an EXPLAIN query locally, looking at the query plan bottlenecks, and then updating the SQL in an effort to better leverage existing table indices. The query code, once updated, is then deployed to the production server. And, what you hope to see is a latency graph that looks like this:

Line graph showing SQL query latency over time. The latency has a marked improvement after the code is deployed.

In this case, the SQL refactoring was effective in bringing the query latency times back down. But, this is the best case scenario. In the worst case scenario, deploying the refactored query leads to a latency graph that looks more like this:

Line graph showing SQL query latency over time. The latency increases greatly after the code is deployed.

In this case, something went terribly wrong! For any number of reasons, the new SQL query that performed well in your local development environment does not perform well in production. The query latency rockets upward, consuming most of the database's available CPU. This, in turn, slows down all queries executing against the database. Which, in turn, leads to a spike in concurrent queries. Which, in turn, starves the thread pool. Which, in turn, crashes the database.

If you see this scenario beginning to unfold in your metrics, you might try to roll-back the deployment; or perhaps, try to revert the code and redeploy it. But, in either case, it's a race against time. Pulling down images, spinning up new nodes, warming up containers, starting applications, running builds, executing unit tests: it all takes time—time that you don't have.

Now, imagine that, instead of completely refactoring your code and deploying it, you design an optimized SQL query and gate it behind a feature flag. Code in your data-access layer could look like this:

public array function generateReport( userID ) {

	// NEW: the optimized query, dormant until the flag is enabled.
	if ( featureIsEnabled() ) {

		return( getData_withOptimization( userID ) );

	}

	// OLD: the existing query remains the default path.
	return( getData( userID ) );

}

In this approach, both the existing SQL query and the optimized SQL query get "deployed" to production. However, the optimized SQL query won't be "released" to the users until the feature flag is enabled. And, at that point, the if statement will short-circuit the control flow and all new requests will use the optimized SQL query.

With this feature flag gating the new query, the worst case scenario looks strikingly different:

Line graph showing SQL query latency over time. The latency increases greatly after the feature is released; but, returns to normal after the feature is disabled.

The same unexpected SQL performance issues exist in this scenario. However, the outcome is very different. First, notice (in the figure) that the "deployment" itself had no effect on the latency of the query. That's because the optimized SQL query was deployed in a dormant state behind the feature flag. Then, the feature flag was enabled, causing traffic to route through the optimized SQL query. At this point, the latency starts to go up; but, instead of the database crashing, the feature flag is turned off, immediately re-gating the code and diverting traffic back to the original SQL query.

You just avoided an outage. The dynamic runtime capability of your feature flag gave you the power to react without delay, before the database—and your application—became overwhelmed and unresponsive.

Are you beginning to see the possibilities?

Knowing that you can disable a feature flag in case of emergency is empowering. This alone creates a huge amount of psychological safety. But, it's only the beginning. Even better is to completely avoid an emergency in the first place. And, to do that, we have to dive deeper into the robust runtime functionality of feature flags.

In the previous thought experiment, our feature flag was either entirely on or entirely off. This is a vast improvement over the status quo; but, this isn't really how feature flags get applied. Instead, a feature flag is normally "rolled-out" incrementally in order to minimize risk.

But, before we can think incrementally, we have to understand a few new concepts: "targeting" and "variants". Targeting is the act of identifying which users will receive a given variant. And, a variant is the value returned by evaluating a feature flag in the context of a given request.

To help clarify these concepts, let's take the first if statement—from earlier in the chapter—and factor-out the featureIsEnabled() call. This will help separate the feature flag evaluation from the subsequent control flow and consumption:

var booleanVariant = featureIsEnabled();

if ( booleanVariant == true ) {

	// Execute NEW logic.

} else {

	// Execute OLD logic.

}

In this example, our feature flag uses a Boolean data type, which can only ever represent two possible values: true and false. These values are the "variants" associated with the feature flag. Targeting for this feature flag then means figuring out which requests receive the true variant and which requests receive the false variant.

Flow diagram showing the relationship between requests, feature flag targeting, and variant selection.

Boolean feature flags are, by far, the most common. However, a feature flag can represent any kind of data type: Booleans, Strings, Numbers, Dates, JSON (JavaScript Object Notation), etc. The non-Boolean data types may compose any number of variants and unlock all manner of compelling functionality. But, for the moment, let's stick to our Booleans.

Targeting—the act of funneling requests into a specific variant—requires us to provide identifying information as part of the feature flag evaluation. There's no "right type" of identifying information—each evaluation is going to be context-specific; but, I find that "User ID" and "User Email" are a great place to start (for user-facing functionality):

var booleanVariant = featureIsEnabled(
	userID = request.user.id,
	userEmail = request.user.email
);

if ( booleanVariant == true ) {

	// Execute NEW logic.

} else {

	// Execute OLD logic.

}

Once we incorporate this identifying information into our feature flag evaluation, we can begin to differentiate one request from another. This is where things start to get exciting. Instead of our feature flag being entirely on for all users, perhaps we only want it to be on for an allow-listed set of User IDs. One implementation of such a featureIsEnabled() function might look like this:

public boolean function featureIsEnabled(
	numeric userID = 0,
	string userEmail = ""
	) {

	switch ( userID ) {
		case 1:
		case 2:
		case 3:
		case 4:
			return( true );
		default:
			return( false );
	}

}

Or, perhaps we only want the feature flag to be on for users with an internal company email address:

public boolean function featureIsEnabled(
	numeric userID = 0,
	string userEmail = ""
	) {

	if ( userEmail contains "@bennadel.com" ) {

		return( true );

	}

	return( false );

}

Or, perhaps we only want the feature flag to be enabled for a small percentage of users:

public boolean function featureIsEnabled(
	numeric userID = 0,
	string userEmail = ""
	) {

	var userPercentile = ( userID % 100 );

	if ( userPercentile < 5 ) {

		return( true );

	}

	return( false );

}

In this case, we're using the Modulo operator to consistently translate the User ID into a value between 0 and 99. This value gives us a way to consistently map users onto a percentile: each additional "remainder" represents an additional 1% of users. Here, we're enabling our feature flag for a consistently-segmented 5% of users.
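
To make the arithmetic concrete, consider two hypothetical User IDs:

// userID 103 -> ( 103 % 100 ) = 3  -> ( 3 < 5 )   -> feature is enabled.
// userID 417 -> ( 417 % 100 ) = 17 -> ( 17 >= 5 ) -> feature is disabled.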

We can even combine several different targeting concepts at once in order to apply more granular control. Imagine that we only want to target internal company users; and, of those targeted users, only enable the feature for 25% of them:

public boolean function featureIsEnabled(
	numeric userID = 0,
	string userEmail = ""
	) {

	// First, target based on email.
	if ( userEmail contains "@bennadel.com" ) {

		var userPercentile = ( userID % 100 );

		// Second, target based on percentile.
		if ( userPercentile < 25 ) {

			return( true );

		}

	}

	return( false );

}

User targeting, combined with a %-based rollout, is an incredibly powerful part of the feature flag workflow. Now, instead of enabling a (potentially) risky feature for all users at one time, imagine a much more graduated rollout using feature flags:

  1. Deploy dormant code to production servers.
  2. Enable feature flag for your user ID.
  3. Test feature in production.
  4. Discover a bug.
  5. Fix bug and redeploy code (still only active for your user).
  6. Examine error logs.
  7. Enable feature flag for internal company users.
  8. Examine error logs and metrics.
  9. Discover bug(s).
  10. Fix bug(s) and redeploy code (still only active for internal company users).
  11. Enable feature flag for 10% of all users.
  12. Examine error logs and metrics.
  13. Enable feature flag for 25% of all users.
  14. Examine error logs and metrics.
  15. Enable feature flag for 50% of all users.
  16. Examine error logs and metrics.
  17. Enable feature flag for 75% of all users.
  18. Examine error logs and metrics.
  19. Enable feature flag for all users.
  20. Celebrate!

Few deployments will need this much rigor. But, when the risk level is high, the control is there; and, (almost) all of the risk associated with your deployment can be mitigated.

Are you beginning to see the possibilities?

So far, for the sake of simplicity, I've been hard-coding the "dynamic" logic within our featureIsEnabled() function. But, in order to facilitate the graduated deployment outlined above, this encapsulated logic must also be dynamic. This is, perhaps, the most elusive part of the feature flags mental model.

The feature flag evaluation process is powered by a "rules engine". You provide inputs, identifying the request context (ex, "User ID" and "User Email"). And, the feature flag service then applies its rules to your inputs and returns a variant.

There is nothing random about this process—it is "pure", deterministic, and repeatable. The same rules applied to the same inputs will always result in the same variant output. Therefore, when we talk about the "dynamic runtime nature" of feature flags, it is in fact the rules, within the rules engine, that are actually dynamic.

Consider the earlier version of our featureIsEnabled() function that ran against the userID:

public boolean function featureIsEnabled(
	numeric userID = 0,
	string userEmail = ""
	) {

	switch ( userID ) {
		case 1:
		case 2:
		case 3:
		case 4:
			return( true );
		default:
			return( false );
	}

}

Instead of a switch statement, let's refactor this function to use a rule data structure that reads a bit more like a "rule configuration". We're going to define an array of values; and then, check to see if the userID "is one of" the values contained within that array:

public boolean function featureIsEnabled(
	numeric userID = 0,
	string userEmail = ""
	) {

	// Our "rule configuration" data structure.
	var rule = {
		input: "userID",
		operator: "IsOneOf",
		values: [ 1, 2, 3, 4 ],
		variant: true
	};

	if (
		( rule.operator == "IsOneOf" ) &&
		rule.values.contains( arguments[ rule.input ] )
		) {

		return( rule.variant );

	}

	return( false );

}

The outcome here is exactly the same, but the mechanics have changed. We're still taking the userID and we're still looking for it within a set of defined values; but, the static values and the resultant variant have been pulled out of the evaluation logic.

At this point, we can move the rule definition out of the featureIsEnabled() function and into its own function, getRuleDefinition():

public boolean function featureIsEnabled(
	numeric userID = 0,
	string userEmail = ""
	) {

	var rule = getRuleDefinition();

	if (
		( rule.operator == "IsOneOf" ) &&
		rule.values.contains( arguments[ rule.input ] )
		) {

		return( rule.variant );

	}

	return( false );

}

public struct function getRuleDefinition() {

	return({
		input: "userID",
		operator: "IsOneOf",
		values: [ 1, 2, 3, 4 ],
		variant: true
	});

}

Here, we've completely decoupled the consumption of our feature flag rule from the definition of our feature flag rule. Which means, if we wanted to change the outcome of the featureIsEnabled() call, we wouldn't change the logic in the featureIsEnabled() function. Instead, we'd update getRuleDefinition().

But, everything is still hard-coded. In order to make our feature flag system dynamic, we need to replace the hard-coded rule definition with one that is loaded from—and kept in sync with—an external feature flag administration.

Which creates an application architecture like this:

Distributed application architecture showing that the feature flag administration makes real-time changes to the running application rules cache.

The implementation details will depend on your chosen solution. But, each approach reduces down to the same set of concepts: a feature flag administration system that can update the active rules being used within the feature flag rules engine that is currently operating in a given environment. This is what makes the dynamic runtime behavior possible.
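
To make that architecture a bit more tangible, here is a minimal sketch of an in-memory rules cache. To be clear, the component and method names here are my own invention—a real implementation would be far more robust:

component {

	// In-memory rules, keyed by feature flag name. This cache is what the
	// rules engine consults on every evaluation—no database calls required.
	variables.rulesCache = {};

	// Called by the administration integration (ex, a polling thread or a
	// streaming connection) whenever a flag's configuration changes.
	public void function setRuleDefinition(
		required string featureName,
		required struct rule
		) {

		variables.rulesCache[ featureName ] = rule;

	}

	// Called by the application during request processing.
	public boolean function featureIsEnabled(
		required string featureName,
		numeric userID = 0,
		string userEmail = ""
		) {

		if ( ! variables.rulesCache.keyExists( featureName ) ) {

			return( false );

		}

		var rule = variables.rulesCache[ featureName ];

		if (
			( rule.operator == "IsOneOf" ) &&
			rule.values.contains( arguments[ rule.input ] )
			) {

			return( rule.variant );

		}

		return( false );

	}

}

The important part is the separation of responsibilities: the administration pushes new rules into the cache; the application only ever reads from it.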

At first blush, it may seem that integrating feature flags into your application logic includes a lot of low-level complexity. But, don't be put-off by this—you don't actually have to know how the rules engine works in order to extract the value. I only step down into the weeds here because having even a cursory understanding of the low-level mechanics can make it much easier to understand how feature flags fit into your product development ecosystem.

The reality is, any feature flags implementation that you choose will abstract-away most of the complexity that we've discussed. All of the variants and the user-targeting and %-based rollout configuration will be moved out of your application into the feature flags administration, leaving you with relatively simple code that looks like this:

var useNewWorkflow = featureFlags.getVariant(
	feature = "new-workflow-optimization",
	context = {
		userID: request.user.id,
		userEmail: request.user.email
	}
);

if ( useNewWorkflow ) {

	// Execute NEW logic.

} else {

	// Execute OLD logic.

}

This alone will have a meaningful impact on your product stability and uptime. But, it's only the beginning—the knock-on effects of a feature-flag-based development workflow will echo throughout your entire organization. It will transform the way you think about product development; it will transform the way you interact with customers; and, it will transform the very nature of your company culture.

Key Terms and Concepts

Feature flags enable a new way of thinking about product development. This introduces some new concepts; and, adds more nuance to existing ideas. As such, it's important to define—and perhaps redefine—some key terms that we use in this book:

Deploying

Deploying is the act of pushing code up to a production server. Deployed code is not necessarily "live" code. Meaning, deployed code isn't necessarily being executed by your users—it may just be sitting there, on the server, in a dormant state. "Deployed" refers only to the location of the code, not to its participation within the application control flow.

A helpful analogy might be that of "commented out" code. If you deploy code that is commented out, that code is living "in production"; but, no user is actually executing it. Similarly, if you deploy code behind a feature flag and no user is being targeted, then no user is actually executing that deployed code.

Releasing

Releasing is the act of exposing deployed code and functionality to your users. Before the advent of feature flags, "Deploying" code and "Releasing" code were generally the same thing. With feature flags, however, these two actions can now be decoupled and controlled independently.

The ability to separate "release" from "deployment" is why we're here—it is the transformative feature of feature flags. It is why everything in your product development life-cycle is about to change.

Feature Flag

A feature flag is a named gating mechanism for some portion of your code. A feature flag typically composes an identifier (ex, new-checkout-workflow), a type (ex, Boolean), a set of variants (ex, true and false), a series of targeting rules, and a rollout strategy. Some of these details may vary depending on your feature flag implementation.

Variant

A variant is one of the distinct values returned when evaluating a feature flag in the context of a given request. All feature flags have at least two variants—with only a single variant, you can't create a dynamic runtime behavior.

Note: This isn't absolutely true; but, it may be true in your particular feature flags implementation.

Each variant value is an instance of the Type represented by the feature flag. For example, a Boolean-based feature flag can only return one of two finite values: true or false. On the other hand, a Number-based feature flag could, in theory, return a variant of any value between -Infinity and Infinity (depending on how numbers are represented in your programming language).

That said, at any given moment, the variants composed within a feature flag are finite, typically few, and predictable. Entropy has no place in a feature flag workflow.

Targeting

Targeting is the mechanism that determines which feature flag variant is served to a given user. Targeting rules include both assertions about the requesting user and a rollout strategy. Targeting rules may include positive assertions, such as "the user role is Admin"; and, they may include negative assertions, such as "the user is not on a Free plan". Compound rules can be created by ANDing and ORing multiple assertions together.

The conditions within the targeting rules can be changed over time; however, at any given moment, the evaluation of the decision tree is repeatable and deterministic. Meaning, the same user will always receive the same variant when applying the same inputs to the same targeting rules.
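
As an illustration, a compound rule—serving the true variant only to Admin users who are not on a Free plan—might be configured with a data structure along these lines. The exact shape is hypothetical, loosely mirroring the rule configuration notation that appears later in this book:

{
	variants: [ false, true ],
	rule: {
		operator: "And",
		assertions: [
			{ input: "userRole", operator: "IsOneOf", values: [ "Admin" ] },
			{ input: "userPlan", operator: "IsNotOneOf", values: [ "Free" ] }
		],
		variant: true
	}
}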

Rollout

Rollout is an overloaded term in the context of feature flags. When we are discussing a feature flag's configuration, the rollout is the strategy that determines which variant is served to a set of targeted users. This is often expressed in terms of percentage. For example, with a Boolean-based feature flag, the rollout strategy may assign the true variant to 10% of targeted users and the false variant to 90% of targeted users.
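
In a hypothetical configuration notation, that split might look like this:

{
	variants: [ false, true ],
	distribution: [ 90, 10 ] // 90% receive false; 10% receive true.
}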

When not discussing a feature flag's configuration, the term rollout is generally meant to describe the timeline over which a feature will be enabled within the product. There are two types of rollouts: Immediate and Gradual.

With an immediate rollout, the deployed code is released to all users at the same time. With a gradual rollout, the deployed code is released to an increasing number of users over time. So, for example, you may start by rolling a feature out to a small group of Beta-testers. Then, once the feature sees preliminary success, you roll it out to 5% of the general audience; and then 20%; and 50%; and so on, until the deployed code has been released to all users.

Roll-Back

Just as with "rollout", roll-back is another overloaded term in the context of feature flags. When we are discussing a feature flag's configuration, rolling back means reverting a recent configuration change. For example, if a targeted set of users is configured to receive the true variant of a Boolean-based feature flag, "rolling back" the feature flag would mean updating the configuration to serve the false variant to the same set of users.

When not discussing a feature flag's configuration, the term rolling back is generally meant to mean removing code from a production server. Before the advent of feature flags, if newly-deployed code caused a production incident, the code was then "rolled back", meaning that the new code was removed and the previous version of the application code was put back into production.

User

In this book, I often refer to "users" as the receiving end of feature flags. But, this is only a helpful metaphor as we often think about our products in terms of customer access. In reality, a feature flag system doesn't know anything about "users"—it only knows about "inputs". Most of the time, those inputs will be based on the requesting user. But, they don't have to be.

We'll get into this more within our use-cases chapter (see Use Cases), but feature flag inputs can be based on any meaningful identifier. For example, we can use a "server name" to affect platform-level features. Or, we can use a static value (such as app) to apply the feature flag state to all requests uniformly.

Progressive Delivery

This is the combination of two concepts: deploying a feature incrementally and releasing a feature incrementally. This is—eventually—the natural state for teams that lean into a feature-flag-based workflow. This becomes "the way" you develop your products.

The mechanics of progressive delivery will be examined in much more depth within the life-cycle chapter (see Life-Cycle of a Feature Flag).

Environment

An environment is the application context in which a feature flag configuration is defined. At a minimum, every application has a "production" environment and a local "development" environment.

A set of feature flags is shared across environments; but, each environment is configured independently. A feature flag which is enabled in the local "development" environment has no bearing on the same feature flag in the "production" environment. This is what allows a feature to be enabled locally (during development) and disabled in production, simultaneously.
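
As a sketch—again, in a hypothetical notation—that independence might look like this:

// The same feature flag, configured independently in each environment.
{
	development: {
		variants: [ false, true ],
		distribution: [ 0, 100 ] // Fully enabled while building locally.
	},
	production: {
		variants: [ false, true ],
		distribution: [ 100, 0 ] // Fully disabled until released.
	}
}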

Feature Flag Administration

This is the application that your Developers, Product Managers, Designers, Data Scientists, etc. will use to create, configure, update, release, and roll-back feature flags. This application is generally separated out from your "product application"; but, it doesn't have to be.

If you're buying a feature flag SaaS (Software as a Service) offering, your vendor will be building, hosting, and maintaining this administration module for you.

Going Deep on Feature Flag Targeting

As web application developers, we generally communicate with the database anytime we need to gather information about the current request. These external calls are fast; but, they do have some overhead. And, the more calls that we make to the database while processing a request, the more latency we add to the response-time.

In order to minimize this cost, feature flag implementations use an in-memory "rules engine" which allows feature flag state to be "queried" without having to communicate with an external storage provider. This keeps the processing time blazing fast! So fast, in fact, that you should be able to query for feature flag state as many times as you need to without having to worry about latency.

Aside: Obviously, all processing adds some degree of latency; but, the in-memory feature flag processing—when compared to a network call—should be considered negligible.

That said, shifting from a database mindset to a rules-engine mindset can present a stumbling block for the uninitiated. At first, it may be unclear as to why you need anything more than a "User ID" (for example) in order to do all the necessary user targeting. After all, in a traditional server-side context, said "User ID" opens the door to any number of database records that can then be fed into any number of control flow conditions.

But, when you can't go back to the database to get more information, all targeting must be done using the information at hand. Which means, in order to target based on a given data-point (such as a user's subscription plan), said data-point must be provided to the feature flag state negotiation process.
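
For example, if targeting depends on the user's subscription plan, the plan has to be handed to the evaluation as part of the request context. A hypothetical evaluation—the flag name and the userPlan input are mine, not a prescription—might look like this:

// The rules engine cannot go back to the database. Every data-point used
// in targeting must be supplied as part of the evaluation context.
var showNewBilling = featureFlags.getVariant(
	feature = "new-billing-page",
	context = {
		userID: request.user.id,
		userEmail: request.user.email,
		userPlan: request.user.plan // Ex, "Free", "Pro", "Enterprise".
	}
);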

I find that a "Pure Function" provides a helpful analogy. A pure function, will always result in the same output when invoked using the same inputs. This is because the result of the pure function is based solely on the inputs (and whatever internal logic exists).

No external calls are made within a pure function. No side-effects are generated within a pure function. Unless there is a bug, a pure function will never error. It has no network calls that can fail. It has no file I/O calls that might lack the proper permissions. It has nothing that incurs a variable latency.
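
As a trivial sketch of the concept:

// A pure function: the output is derived solely from the inputs. Invoking
// it twice with the same arguments always produces the same result.
public numeric function applyTax(
	required numeric price,
	required numeric taxRate
	) {

	return( price * ( 1 + taxRate ) );

}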

Remaining chapter content available in purchased book (get early access).

The User Experience (UX) of Feature Flag Targeting

There are two groups of people affected by feature flag targeting: the people who use your product; and, the people who manage your feature flag system. Most of this book is about the former—I'd like to take a moment and talk about the latter.

When you first experiment with feature flags, you're almost certainly going to start by using your database "primary keys" as the value to target. After all, these database keys are the inputs we use most often in our application request processing; so, it's only natural to extend that mindset into the realm of feature flag targeting.

Which means, if you wanted to target an "organization" (ie, a "semantic container" for a cohort of users), you might use the organization's database ID:

{
	variants: [ false, true ],
	distribution: [ 100, 0 ],
	rule: {
		operator: "IsOneOf",
		input: "companyID",
		values: [ 1728 ],
		distribution: [ 0, 100 ]
	}
}

This will absolutely work. However, it turns out that numbers have very little inherent meaning within your application. So, when a feature flag administrator goes to configure a feature flag and sees something to the effect of:

Remaining chapter content available in purchased book (get early access).

Types of Feature Flags

Feature flags vary in both the type of variants that they serve and in the way they're expected to operate within your application. Short-lived, Boolean-based feature flags are almost certainly going to be the first type that you use. Their on/off state is easy to reason about; and, it closely aligns with the if statements that we commonly use in our application control flow. But, feature flags can be used—and abused—in many different ways.

Data Types

Any type of data that can be serialized for storage can be used in a feature flag. We think about feature flags as being a runtime concern—and they are primarily; but, they also have to be persisted across application restarts and rolling deployments. As such, we can only use data that serializes and deserializes without ceremony.

Aside: "Sets" and "Maps" are common Object-like data-types that are hard to represent in a serialized format. And, therefore, don't work well as feature flag variants.

This rules-out complex runtime objects such as instantiated Classes and Function references. But, it also adds challenges to relatively simple data-types such as Date/Time values. We use Date/Time values so often in programming that we are lulled into viewing them as native data-types. But, they are, in fact, instances of Classes that represent date and time using some underlying value translation (such as a Unix Timestamp).

If we wanted to use Date/Time values as feature flag variants, we'd have to configure them in a serialized format and then handle the deserialization in our application code. For example, we could serialize a Date/Time value using an ISO 8601 formatted String.
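
Here is a minimal sketch of what that might look like; the flag name is hypothetical, and your implementation's parsing needs may differ:

// The variant is stored as an ISO 8601 String (ex, "2024-03-01T09:30:00")
// and deserialized into a native Date/Time value by the application.
var windowStartVariant = featureFlags.getVariant(
	feature = "maintenance-window-start",
	context = { userID: request.user.id }
);

var windowStart = parseDateTime( windowStartVariant );

if ( now() >= windowStart ) {

	// ... show the maintenance banner ...

}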

Remaining chapter content available in purchased book (get early access).

Life-Cycle of a Feature Flag

A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: a complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a simple system.

— John Gall (Gall's Law)

If you're used to building an entire feature locally and then deploying it all at once to production, it can be hard to understand where to start with feature flags. In fact, your current development practices may be so deeply ingrained that the value-add of feature flags isn't obvious—I know that I didn't "get it" at first.

To help illustrate just how wonderfully different feature flags are, I'd like to step through the life-cycle of a single feature flag as it pertains to product development. This way, you can get a sense of how feature flags change your product development workflow; and, why this change unlocks a lot of value.

For this thought experiment, let's assume that we're maintaining a collaborative task management product. Currently, users can create and complete tasks. But, they can only discuss these tasks offline. What we want to do is build a simple comment system such that each task can be backed by a persisted conversation.

My goal here isn't to outline a blueprint that you must follow in your own projects. My goal is only to start shifting your mindset and your perspective on what is possible.

Flesh-out Work Tickets

Traditionally, when building a feature, tasks in your ticketing system represent "work to be done". And, when using feature flags, this is also true. But, instead of arbitrary milestones, you need to start thinking about each ticket as a deployment opportunity.

Remaining chapter content available in purchased book (get early access).

Use Cases

When your feature flags implementation can store JSON variants, it means that there's no meaningful limitation as to what kind of "state" you can represent. Which means, the use cases for feature flags are somewhat unlimited as well. As we saw in the previous chapter, using Boolean-type flags to incrementally build and release a feature is going to be your primary gesture; but, the area of opportunity will continue to expand in step with your experience.

And, to help jump-start your imagination, I'd like to touch upon some of my use cases. This is not intended to be an exhaustive list, only a list of techniques with which I have proven, hands-on experience.
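
For a taste of what JSON variants make possible, consider a hypothetical flag that serves rate-limiting settings as a single configuration object. The flag name and the applyRateLimiting() function are illustrative, not prescriptive:

// A JSON-type feature flag can return an entire configuration object;
// ex, { enabled: true, maxRequests: 100, windowInSeconds: 60 }.
var rateLimitConfig = featureFlags.getVariant(
	feature = "api-rate-limiting",
	context = { userID: request.user.id }
);

if ( rateLimitConfig.enabled ) {

	applyRateLimiting(
		userID = request.user.id,
		maxRequests = rateLimitConfig.maxRequests,
		windowInSeconds = rateLimitConfig.windowInSeconds
	);

}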

Product: Feature Development For the General Audience

Building new features safely and incrementally for your general audience is the bread-and-butter of feature flag use cases. This entails putting a new feature behind a feature flag, building the feature up in the background, and then incrementally rolling it out to all customers within your product. We've already seen this in the previous chapters; so, I won't go into any more detail here.

This type of product feature flag is intended to be removed from the application once the feature has been fully released.

Product: Feature Development For Priority Customers

In an ideal world, new features get used by all customers. But, in a pragmatic world, sometimes you just have to build a feature that only makes sense for a handful of high-priority customers. At work, we identify these customers as "T100". These are the top 100 customers in terms of both current revenue and potential expansion.

The T100 cohort can contain specific individuals (such as the "thought leaders" and "influencers" within your industry). But, more typically, the T100 cohort consists of enterprise customers that represent large teams and organizations.

Remaining chapter content available in purchased book (get early access).

Server-Side vs. Client-Side

In a multi-page application (MPA), the page is newly rendered after each browser navigation event. As such, there's no meaningful difference between a feature flag used on the server-side vs. one used on the client-side. Any changes made to a feature flag's configuration will propagate naturally to the client-side upon navigation.

With single-page applications (SPAs), the client-side code comes with life-cycle implications that require us to think more carefully about feature flag consumption. In a SPA, the client-side code—often referred to as a "thick client"—is composed of a (relatively) large HTML, CSS, and JavaScript bundle. This bundle is cached in the browser; and, navigation events are fulfilled via client-side view-manipulation with API calls to fetch live data from the back-end.

One benefit of loading the SPA code upfront is that—from a user's perspective—subsequent navigation events appear very fast. This is especially true if the client-side logic uses optimistic updates and "stale while revalidate" (SWR) rendering strategies.

One downside of loading the SPA code upfront is that the version of the code executing in the browser may become outdated quickly. Each application is going to have its own usage patterns; but, for a "business" app, it's not uncommon for a SPA to remain open all day. Or, in the case of an email client, to remain open for weeks at a time.
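This is where flag consumption gets interesting in a SPA: the cached bundle may be weeks old, but the flag configuration keeps changing underneath it. One common mitigation, sketched below with a hypothetical onFlagChange() subscription (most hosted SDKs offer something similar via streaming or polling), is to react to flag changes at runtime rather than reading each flag once at boot.

```ts
// Hypothetical client-side subscription surface; substitute your
// vendor's streaming or polling API here.
type FlagListener = (newValue: boolean) => void;

const listeners = new Map<string, FlagListener[]>();

function onFlagChange(flagName: string, listener: FlagListener): void {
	const existing = listeners.get(flagName) ?? [];
	listeners.set(flagName, [...existing, listener]);
}

// The SPA subscribes once; the UI then reacts whenever the flag's
// configuration changes, even if the tab has been open for weeks.
onFlagChange("task-comments", (isEnabled) => {
	document
		.querySelector(".comments-panel")
		?.classList.toggle("hidden", !isEnabled);
});
```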

Remaining chapter content available in purchased book (get early access).

Bridging the Sophistication Gap

As you're reading through the aforementioned use cases, it might occur to you that large, sophisticated companies have different ways of solving the same problems. For example, instead of using feature flags to implement rate limiting, a more sophisticated company might put that logic in an Application Load Balancer (ALB) or a reverse proxy.

Or, instead of using feature flags to implement IP-blocking, a more sophisticated company might add reactive request filtering to a Web Application Firewall (WAF) or dynamically update ingress routing rules.

Or, instead of gating code behind a feature flag, a more sophisticated company might adjust the traffic distribution across an array of Blue/Green deployment environments.

Or, instead of using feature flags to adjust log emissions, a more sophisticated company might change the filtering in their centralized log aggregation pipeline.

Large, sophisticated companies can do large, sophisticated things because they have massive budgets and hordes of highly-specialized engineers who are focused on building specialized systems. This allows them to solve problems at a scale that most of us cannot fathom.

But, this difference in relative sophistication isn't a slight against feature flags. Exactly the opposite! The "difference" is a spotlight that underscores the outsize value of feature flags—that we can use such simple and straightforward techniques to solve the same class of problems (at a fraction of the cost and complexity).
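To ground this, here is what the feature-flag version of IP-blocking might look like. This is a sketch assuming an Express-style middleware signature and a hypothetical getVariant() call that returns the blocked addresses as a JSON variant:

```ts
// Hypothetical SDK call, stubbed for illustration: the JSON variant
// holds the list of blocked IP addresses as live configuration.
function getVariant(flagName: string): string[] {
	return ["203.0.113.7", "198.51.100.22"];
}

// Express-style middleware: reject any request from a blocked address.
function ipBlocking(
	req: { ip: string },
	res: { sendStatus: (code: number) => void },
	next: () => void
): void {
	const blockedIPs = getVariant("operations-ip-blocking");

	if (blockedIPs.includes(req.ip)) {
		res.sendStatus(403); // Forbidden.
		return;
	}

	next();
}
```

Updating the variant blocks a bad actor in seconds; no WAF changes, no ingress changes, no deployment.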

Remaining chapter content available in purchased book (get early access).

Life Without Automated Testing

People who say it cannot be done should not interrupt those who are doing it.

— George Bernard Shaw (misquoted)

I have a dirty little secret: I haven't written a test in years.

And, I deploy code to production anywhere between 5 and 10 times a day.

Now, I'll caveat this by saying that I don't write and distribute "library" code for broad consumption—I maintain a large SaaS product over which the company has full end-to-end control. If your context is different—if the distance between your code and your consumer is vast—this chapter may not apply to you.

To be clear, I enthusiastically support testing! In fact, I thoroughly test every single change that I make to the code. Only, instead of maintaining a large suite of automated tests that run against the entire codebase, I manually test the code that I change (right after I change it).

Both automated testing and manual testing seek to create "correct" software. But, they accomplish this in two different ways. Automated testing uses tools and scripts to programmatically run tests on your application—without human intervention—comparing actual outcomes against expected results. Manual testing is the process of having a living, breathing, emotional human interact with your application in order to identify bugs, usability issues, and other points of concern that may or may not be expected.

There is a common misconception in the engineering world that automated tests and manual tests are at opposite ends of the testing spectrum. Some engineers might even go so far as to suggest that automated testing is intended to replace manual testing.

Remaining chapter content available in purchased book (get early access).

Ownership Boundaries

Feature flags are an implementation detail for engineering teams. As such, each feature flag belongs to the engineering team that created it. If we were talking about any other low-level algorithmic decision, this sense of ownership would be self-evident; and, we wouldn't need to have this discussion. However, since feature flags can be both observed and configured outside of the application itself, it's easy for people to misunderstand where the boundaries must be drawn.

Who Can Manage a Feature Flag

Simple—the engineering team that implemented the feature flag is the only team that can manage the feature flag. Since feature flags are an implementation detail, only the implementing engineers have the necessary understanding of how changes to the configuration will impact the production system. These engineers know which database queries are involved; which API calls are being made; which areas of the application are being touched; and, most importantly, which error logs and performance dashboards need to be monitored during a release.

Your product manager doesn't have this information. Your designer doesn't have this information. Your data scientist doesn't have this information. Your growth engineers don't have this information. Your security and compliance officers don't have this information. No one other than the implementing engineers has a fully integrated understanding of the implications; and, therefore, no one other than the implementing engineers should be touching the feature flag.

At the beginning of this book, we talked about feature flags as being little more than dynamic if statements. This code-oriented perspective can help bring the ownership boundaries into focus. The people who have permission to configure a feature flag are the same people who have permission to open your code and edit your if statements. And, if you wouldn't be comfortable with a given person jumping in and editing your code, you shouldn't be comfortable with that person jumping in and editing your feature flag configuration.
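If it helps, here is that analogy restated in code. The names here are hypothetical; what matters is that changing the flag's configuration is functionally the same as editing the condition below:

```ts
type User = { id: string };

// Stand-in for a real flag evaluation; the configuration that drives
// this answer lives outside the code, in the feature flags system.
function userCanSeeComments(user: User): boolean {
	return user.id.endsWith("7"); // Stubbed targeting rule.
}

function renderCommentsPanel(): void {
	console.log("Rendering comments panel.");
}

function renderLegacyExperience(): void {
	console.log("Rendering legacy experience.");
}

// A feature flag is a dynamic if statement.
const user: User = { id: "user-1337" };

if (userCanSeeComments(user)) {
	renderCommentsPanel();
} else {
	renderLegacyExperience();
}
```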

The exception to this rule involves open-ended, longer-term feature flags. For example, if a feature flag is being used to gate "T100" features or to act as a makeshift paywall (see Use Cases), targeting of the feature flag can be safely managed by customer-facing employees.

Remaining chapter content available in purchased book (get early access).

The Hidden Cost of Feature Flags

Programmers know the benefits of everything and the trade-offs of nothing.

— Rich Hickey (creator of Clojure)

I believe that feature flags are an essential part of modern product development. But, every architectural choice represents a trade-off; and, so do feature flags. While they do create a lot of value, that value comes at a cost—both literal and figurative.

Whether you're building your own feature flags implementation or you're using a 3rd-party vendor, feature flags cost money—actual dollars-and-cents. You're either paying for development time and hosting (if you build your own implementation); or, you're paying for usage and seats (if you buy a managed solution).

But, we'll talk more about the financial trade-offs later (see Build vs. Buy). For now, I want to focus on the more hidden cost of feature flags.

In terms of developer ergonomics, feature flags create additional complexity in your code. By definition, they are spreading the business logic across two different systems: the application code and the runtime configuration. This makes it much more difficult to understand why the application is exhibiting a certain behavior. And, there's no possible way to know—at a glance—which conditional branch in the code should be executing.

Remaining chapter content available in purchased book (get early access).

Not Everything Can Be Feature Flagged

Once you begin weaving feature flags into your product development life-cycle, it can become hard to imagine any other way of working. But, not every type of change can—or should—be put behind a feature flag. Sometimes, the level of effort just isn't worth the added complexity.

At the end of the day, anything is possible when given the necessary time and resources. But, not everything is worth the cost of implementation. Feature flags should be relatively easy to use—they should be reducing the amount of work you have to do in order to make changes in a production application. If the path forward isn't clear, it's likely that the change you're making isn't a good match for feature flags.

Remaining chapter content available in purchased book (get early access).

Build vs. Buy

As engineers, we love to build! As such, it's completely natural to learn about feature flags and then think, "Yeah, I could build that." And, in fact, we did build a very simple implementation earlier in this book (see Going Deep on Feature Flag Targeting). But, building your own feature flags system is probably not a great use of your time (or your company's time).

Feature flags are deceptively simple. So much of what we see—the branching logic within our application code—is just the tip of the iceberg. There's much more below the surface that goes into keeping your feature flags available, reflecting configuration changes in real time, and allowing your team to manage feature flag state across multiple environments.

Assume that the cost of building your own implementation is going to be high—higher than you anticipate. And then, ask yourself whether this kind of work creates a competitive difference for your company. Meaning, does building your own implementation give you a competitive advantage when compared to buying an existing implementation?

Remaining chapter content available in purchased book (get early access).

Track Actions, Not Feature Flag State

If you graph the numbers of any system, patterns emerge. Therefore, there are patterns everywhere...

— Maximillian Cohen (Pi, 1998)

Data Scientists and Growth Teams love to track everything. The assumption being that with enough data, user behaviors can be predicted and then manipulated. As such, when feature flags are introduced to an application, some people will want feature flag state to be included in the existing analytics data.

In my experience, this is a bad idea. For starters, it adds even more complexity to the code, especially if tracking happens in both the front-end and the back-end context. But, more than that, tracking feature flag state cuts against the grain of a feature flag's expected life-cycle.
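A small sketch of the distinction, using a placeholder track() function (substitute your own analytics client): record the user's action, which remains meaningful forever; not the flag's state, which becomes dead weight once the flag is deleted.

```ts
// Placeholder analytics call; substitute your own client here.
function track(eventName: string, properties: Record<string, string>): void {
	console.log(eventName, properties);
}

// Prefer this: the event outlives the feature flag.
track("task-comment-added", { taskID: "task-42" });

// Avoid this: the flag property goes stale when the flag is removed.
track("task-comment-added", {
	taskID: "task-42",
	flagTaskComments: "true",
});
```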

Remaining chapter content available in purchased book (get early access).

Logs, Metrics, and Feature Flags

In the early days of my career, my error-handling approach consisted of catching errors and then sending myself individual emails containing the error content. This was OK initially; but, it didn't scale well. Which, I learned the hard way one morning when I arrived at my desk and found over 47,000 errors in my Inbox (due to a failing database connection).

Eventually, I learned about error logging and log aggregation services. And, suddenly, the idea of emailing myself an error message felt so naive and antiquated. The more I used logs, the more I came to realize just how powerful they were. And, soon thereafter, I started to view logging as a critical component in my application architecture.

Over time, I realized that I could use logs for more than just error messages. In fact, I could write anything I wanted in a log message. And so, I started to use logging as a way to record key user actions. When a user signed up for an account, I logged it. When a user upgraded to a paid plan, I logged it. When a user sent an invitation for collaboration, I logged it. I could then query the logs and get a sense of how the system was being consumed.

Then, I attended a web development meet-up group and learned about something called "StatsD" (created by Etsy). StatsD allowed an application to emit numeric values at a high rate without any added latency (due to the use of UDP as the underlying transport layer). These values could then be aggregated and graphed on a timeline, creating a visual heartbeat for any workflow within an application.
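For a sense of the mechanics, here is a minimal sketch using the hot-shots StatsD client for Node.js (any StatsD client exposes a similar increment() call). Because the transport is UDP, the emission is fire-and-forget and adds no meaningful latency to the request:

```ts
import StatsD from "hot-shots";

// Fire-and-forget UDP client pointed at a local StatsD daemon.
const statsd = new StatsD({ host: "localhost", port: 8125 });

// Emit a counter for each key user action; the aggregation layer
// turns these values into a visual heartbeat on a timeline.
statsd.increment("user.signup");
statsd.increment("user.plan-upgrade");
statsd.increment("user.invitation-sent");
```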

Remaining chapter content available in purchased book (get early access).

A Cultural Revolution

At this point, I hope that you can see how feature flags make engineering work safer and more effective. In fact, I'm hoping that you're downright eager to start using feature flags as soon as possible! And, I'm confident that once you start, your whole perspective on product development will be—forever—changed.

But, the "technical" benefits of feature flags are only the beginning. If you work on a team with other engineers, product managers, and designers, there's much more value to be had.

The rest of this book deals with the inter-personal aspects of feature flags. And, I do not exaggerate when I say that feature flags can lead to a full-on cultural revolution within your company.

Are you ready to see just how deep the rabbit hole goes?

People Like Us Do Things Like This

I can tell you I love you as many times as you can stand to hear it. And all that does—the only thing—is remind us that Love is not enough. Not even close.

— George Monroe (movie, Life as a House)

Company culture isn't defined by a set of principles or by a mission statement; it isn't defined by the founder's vision; and, it cannot be mandated from the top. Company culture is defined by action; it is the byproduct of movement; and, it is born of guerrilla warfare.

It starts when a small group of people, often at the bottom, stand up and say that to be here—to be on this team—means something. And then, they act accordingly; demonstrating, with consistency, that people like us do things like this (1).

In a product company, I believe that our primary goal is to serve the customer. As engineers in a product company, I believe that people like us achieve this goal by shipping code. And, I believe that in order to ship code effectively, we must operate from a place of love and generosity.

Generosity means bringing our whole self to the work; and, leaning on our past experience in order to draw meaningful insight. It means trusting others to do the same. It means thinking "right" about our customers; and meeting them where they are, not only where we want them to be. It means working small and prioritizing throughput. It means unblocking each other and deploying to production, even when it scares us.

Generosity means saying, "this might not work"; and then, doing it anyway because something beautiful might happen; and because we believe that the possibility of beauty is far more important than fear of failure and judgement.

This is not an easy way to exist. Which means that leading by example is also an act of generosity. As we make manifest our beliefs, "we unconsciously give other people permission to do the same" (2). Our actions beget a movement; and, this movement creates the culture.

People like us do things like that.

1 From "People Like Us Do Things Like This" by Seth Godin.
2 From "Our Deepest Fear" by Marianne Williamson.

Building Inclusive Products

The hard part about the idea of "minimum viable product," for me, is you don't know what "minimum" is, and you don't know what "viable" is.

—Ben Silbermann (co-founder of Pinterest)

In the digital product space, an EPD (Engineering, Product, Design) organization is often described as a "partnership". At work, we even referred to this as the "three-legged stool"—a metaphor which depicts each leg/department as being an equally important part of the overall product design process.

But, this is an idealized version of reality. More often, the EPD organization is built like a totem pole; with Product at the top, Design in the middle, and lowly Engineering down at the bottom. This is much less like a "partnership" and much more like an "assembly line".

Caveat: In my career as an engineer, the interplay between Product and Design has always been a bit of a black box. As such, I mostly speak about this process from the engineering perspective. And, I tend to think about Product and Design as being a single unit.

In this model, Engineering is handed a set of specifications (hopefully) in the form of an interactive prototype. Engineering then comes up with an estimate as to how long an implementation might take. Product then argues with the Engineering team, asserting that said estimate is completely absurd and needs to be cut in half. The Engineering team capitulates in an effort to keep the peace (with rationalizations about it only being an "estimate"). And then, when the Engineering team inevitably fails to meet the estimated due-date, they are reprimanded. And, the Product team becomes increasingly convinced that the Engineering team is the cause of all the company's problems.

If you're not familiar with this story, consider yourself lucky—you are in the minority. These are the trying conditions under which many teams operate.

Remaining chapter content available in purchased book (get early access).

An Opinionated Guide to Pull Requests (PRs)

Everything in this chapter is predicated on the notion that getting our code into production is the single most important thing that we do as product engineers. Our code creates value for the customer; it sets up a flywheel of feedback that leads to more inclusive products; and, it provides the psychological safety that the EPD organization needs in order to iterate effectively. But, this value is only ever realized after our code reaches production and becomes consumable.

Deploying code is, therefore, an act of love and generosity; both for our internal team and for the external customers. And, the entire process preceding deployment must be optimized in order to maximize the value that we deliver—as engineers—to the people that we're serving.

There are many steps that need to be completed when taking a project from conception to deployment. If any one of these steps creates a bottleneck, the entire process is delayed; and, as a result, the customers suffer.

More often than not, the slowest and most unpredictable step in this process is the Pull Request (PR) review. It's not uncommon for a PR to remain open for days; and, every now and then, I hear from an engineer who waited weeks for their code to be reviewed and approved.

Frankly, this is unacceptable. If a PR remains open for more than 15 minutes, your process has failed you. And, more to the point, your process has failed the customer.

Remaining chapter content available in purchased book (get early access).

Removing the Cost of Context Switching

In software design, there are two types of complexities: "essential" and "accidental". Essential complexity represents the fundamental nature of a problem. There is no way to get around or reduce this complexity as it is born of the problem itself. Accidental complexity, on the other hand, is self-imposed. It is the complexity that we unnecessarily add to the software through inexperience, insufficient planning, technical limitations, and "résumé-driven development".

When working within the confines of the waterfall methodology, the scope of the project becomes part of the project's essential complexity. Since every aspect of a waterfall project must be developed, tested, revised, tweaked, polished, and approved before it can be deployed, product engineers are forced to hold a tremendous amount of information in their head at all times.

This "complexity of scope" isn't necessarily "essential", even in the context of "waterfall" planning. However, when engineers don't have the technological means to safely and incrementally deploy work (via feature flags), the scope of the project takes on a de facto gravity that gives it an essential expression within the overall complexity.

Remaining chapter content available in purchased book (get early access).

Measuring Team Productivity

Fuzzy math over time still shows trends.

—Mike Maughan (podcast, No Stupid Questions)

When your team uses feature flags to work small and build incrementally, what you end up with is a culture of shipping. In this culture, deployments become the heartbeat of the system; and, a bellwether of health. When a team is functioning well, its engineers ship often. And, when a team is having trouble, shipping slows down (or becomes a rarity).

Deployments, therefore, act as a metric that can proxy team health and productivity. There's no one value that indicates health; but, deviations from the baseline will reveal truths about the team, its engineers, and their relationship to the rest of the organization.

For example, if an engineer hasn't shipped code in a day or two, it might be an indication that they are stuck. Perhaps the project requirements are unclear; and, they're afraid to ask questions. Or, perhaps the work is too complex; and, they're having trouble decomposing the work into small, incremental steps.

Remaining chapter content available in purchased book (get early access).

Increasing Agility With Dynamic Code

In traditional project management, each feature is developed entirely within a local development environment. And, each feature is only ever released once it is 100% complete, reviewed, polished, and approved. This has a negative impact on the quality of the feature; but, it also has a negative impact on the team itself.

Until a feature is released, it represents only potential value for the customer. Which is a pleasant way of saying that the feature is entirely worthless until it is deployed to production.

This creates a challenging dynamic for the team because it means that the engineers on the team can only demonstrate value once the feature is completed and deployed. In such an environment, all interruptions come with real psychological discomfort: every time an engineer stops what they're doing, they see the finish-line—and their moment to shine—slipping further and further into the future.

And so, the team begins to view work as a competitive, zero-sum game. Cross-team collaboration—once viewed as an act of generosity—is now met with a residue of resentment. To protect itself, the team doubles down on process and uses the "existing roadmap" as a shield that deflects all incoming requests. Soon, the team becomes so sclerotic—so unwilling to adapt—that even trivial product changes must be scheduled months in advance.

Remaining chapter content available in purchased book (get early access).

Product Release vs. Marketing Release

One of the most frustrating mistakes that I repeatedly see within an organization is the conflation of a "product release" with a "marketing release". A "product" release is the exposure of new functionality to users within the product. A "marketing" release is the announcement of product updates to the rest of the world.

There is no reason that these two releases need to happen at the same time. In fact, attempting to combine the product release and the marketing release is misguided.

Consider the amount of information that your team has about a feature that it is building. As we discussed in the life-cycle chapter (see Life-Cycle of a Feature Flag), we continue to gather new information throughout the product development life-cycle. This leads up to and includes the release of the feature to the users.

Which means, the "product release" will necessarily represent an incomplete version of the feature. This is OK when your goal is to co-create an inclusive product with customers that are already enrolled in the product journey. But, this is not great if you're about to share your new wares with the rest of the world.

Instead, you should be "releasing product" continually; allowing individual features to "evolve" in production with your customers; and then, at some point in the future, release a "set" of features to the rest of the world in some razzle-dazzle-infused marketing moment.

Which is to say that feature flags are not in conflict with marketing releases; because, product releases are not in conflict with marketing releases. In fact, you can use feature flags for both product releases and marketing releases (so long as they are different feature flags owned by different teams).

As product engineers, we should not expect the marketing team to think about this. The marketing team's job is to market the product—not to manage the product engineers. It is the sole responsibility of the product engineers to enforce a process internally that ensures the success of the product.

This may mean pushing back against deadlines. Or, at the very least, negotiating over scope. This can be uncomfortable. But, remember that what you're doing is choosing to build a culture that does right by the customer and creates psychological safety for the team. This is an act of love and generosity.

Getting From No to Yes

It's not enough; but not enough is better than nothing.

— Phillip Atiba Goff

In an organization where features must be fully completed before being deployed, the people who say "No" to a deployment might characterize themselves as "protecting" the customer and the customer experience. By holding back until a feature is "perfect", "lovable", and beyond reproach, these gatekeepers might even position themselves as being the true champions of the customer; and, the ones keeping the rest of the organization under control.

But, saying "No" starts to feel very different when you're using feature flags to incrementally deploy a feature to production. Unlike a "feature deployment", which represents a potentially large amount of coordinated work, a "feature release"—using feature flags—is little more than a flip of the switch. This alters the social dynamic within the organization.

Remaining chapter content available in purchased book (get early access).

What If I Can Only Deploy Every 2-Weeks?

Creating a culture of shipping is going to be a problem if your company only allows for a single, coordinated deployment every N weeks. If that's the case, I would suggest taking time to understand where that constraint is coming from. It might be fear-based. But, it might be part of some contract; or, part of some security and compliance requirement.

If the constraint is fear-based, you have some wiggle room. You have an opportunity to demonstrate that building products with feature flags—and working small and shipping more often—is actually safer. This won't be an easy argument to make (given the context); but, if you're reading this book, I suspect that you're the right type of person to lead this effort.

Perhaps you can start by getting permission to experiment with work that is outside of the main product-suite. Or, perhaps you can apply this approach to internal dashboards where "bugs" aren't considered a mission-critical issue.

Remaining chapter content available in purchased book (get early access).

I Eat, I Sleep, I Feature Flag

When I first started toying around with web development in the 90s, I was uploading static HTML files to GeoCities. It was thrilling to take an idea that was only in my head and turn it into something that I could share with the world. I felt connected to the human experience in a way that didn't previously seem possible.

After a while, however, I came to understand that operating in isolation was very limiting. From my experience using online message boards, I knew that the real magic happened through human interaction. But, I didn't know how to create connections using static HTML. Even with JavaScript sprinkles, the best I could do was move objects around the screen and perform dynamic image roll-overs.

That said, I was hooked! I knew that this was what I wanted to do with my life. And so, in 1999, I got my first web development internship at Koko Interactive, a digital agency in NYC that created online products and enterprise extranets. It was there that I learned about ColdFusion (CFML) and databases.

Having a dynamic back-end with data persistence was transformative! It opened up a whole new world of opportunity—not only to foster human connection but also to provide services that literally change the way people live and work.

I share this story with you in order to find some common ground. Since you're reading this book, I'm going to assume that some of my origin story matches yours. And, that you too have experienced the "step function" which is data persistence. And, that—like me—you can't really imagine building a robust product offering without a back-end database.

Holding that thought in mind, I want to say that feature flags represent the same kind of "step function". They work through a different mechanism; but, the transformative power of feature flags is similar in magnitude to the transformative power of databases. And, in that same vein, I can no longer imagine building a robust product offering without feature flags.

They really do change everything, at every level of the organization. And, I truly believe that the products you develop with feature flags will be more inclusive, more stable, more focused, and more successful.

About The Author

Ben Nadel lives in Rhinebeck, New York with his wife (Mary Kate) and his little dog (Lucy). He's been in the web development industry since the late '90s and has been blogging about technology since 2006 (see www.bennadel.com). This is his first foray into long-form writing.
