Originally published on medium.

The Global Day of Code Retreat is coming soon. Taking a trip down memory lane through Coding Katas, Test-Driven Development, and the related workshops I had the opportunity to attend or host, I was reminded of the long-term value of these exercises.

This post showcases strategies for shipping software in iterations that build on the fundamental practices introduced in these workshops. Just like the small steps in coding katas, the strategies outlined here may seem obvious or too simple to leverage in a real-life project, until one encounters the right project to apply them to.

Introduction

Code Retreat workshops

The Code Retreat format promotes Test-Driven Development through coding katas and injects added fun through pairing, learning from one another, and making the exercises harder through constraints (e.g. no loops). While katas may feel like simple, repetitive exercises that help with getting into flow or learning a new programming language or construct, there is an implicit concept they teach us: the simple iterations and sequences of tests that one defines in order to complete the set of requirements in an iterative way. The important bit is that the complexity level may only increase gradually. In the meantime, this concept has been codified as the Transformation Priority Premise, which to the familiar eye will look like an extension of the intuitive evolution of null → single result → two results → loop, etc.

Elephant Carpaccio

The Elephant Carpaccio exercise takes the concept of developing in iterations to the next level. Its aim is to teach how to slice a feature into very small iterations spanning all layers. One is asked to define a detailed iteration plan and set of acceptance criteria for each increment. The caveat is that each iteration is to be delivered in minutes! (without Copilot / GPT-3 support ;-)) While tricky at first, with some practice during the workshop this gets rather fun quickly. Try it in your team if you haven’t yet!

Ship early and ship often! (single feature)

More often than not, developers have the tendency to ship features only when fully ready, delaying integration and increasing pull request size (and as a result the lead time). For the team, it's hard to gauge the quality of and progress on the feature, as the code is never deployed and never used in production. As most problems happen in production, this doesn't sound like a good strategy, right?

A common technique in teams practicing trunk-based development is to hide the feature behind a feature flag. This allows for frequent code integration and avoids unintended production usage. It's a step forward, but it misses out on the value the code gains from actually being used in production continuously, even when the feature is not fully finished (or not a single line of business logic has been written).
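To make the idea concrete, here is a minimal sketch of such a gate. The flag store, the flag name, and the handler functions are all hypothetical; a real team would typically back this with a feature-flag service or config store rather than an in-process dict.

```python
# Minimal feature-flag sketch (assumed names; not a real flag service).
flags = {"new-recommendations": False}

def is_enabled(flag_name: str) -> bool:
    """Look up a flag; unknown flags default to off."""
    return flags.get(flag_name, False)

def legacy_recommendations(user_id: int) -> list:
    # Current production path.
    return ["default-item"]

def new_recommendations(user_id: int) -> list:
    # Unfinished code path, safely integrated but dark.
    return []

def handle_request(user_id: int) -> list:
    if is_enabled("new-recommendations"):
        return new_recommendations(user_id)
    return legacy_recommendations(user_id)
```

The unfinished path is merged to trunk and deployed, but only executes when the flag is flipped, e.g. for internal testing.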

Testing latency assumptions

Whenever new service calls or complex calculations are added to an application, the overall response time of the application may increase. Typically, services will have a defined latency SLO for their operations, which clients rely on and use to define timeouts. In other cases, increased response time has immediate business implications: users interact with the software less frequently as it gets slower, or abandon the application along the journey. It's possible to use A/B experiments to validate the latency buffer a service has, but these may take weeks to reach statistical significance. Hence, there is value in learning early in a project whether the pure added latency triggers undesired business impact for users.

Instead of waiting for the feature to be fully implemented, shipping code that's a no-op in terms of business logic, but simulates the expected processing time, makes it possible to verify the latency impact of the new functions. When the code runs in production with the added latency, the impact on processes and end-user KPIs becomes immediately visible, which helps to validate early in the development cycle that the chosen design approach is viable in production. As clients spend more CPU time processing the responses, this also provides early insights for capacity planning.

To prepare for adding business logic, we add an execution budget after which the function terminates automatically and returns a fallback value. Lastly, business logic can be added in multiple iterations, relying on the guaranteed execution time to cover for performance inefficiencies. At this last step, the added latency is optional and can be dropped as soon as the function returns values for all input combinations.
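A sketch of these two steps might look as follows. The budget value, the fallback, and the simulated function are assumptions for illustration; note also that with plain threads an overrunning call keeps running in the background, so real enforcement needs cooperative deadline checks or process-level isolation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

FALLBACK = None        # hypothetical fallback value for the operation
BUDGET_SECONDS = 0.05  # assumed execution budget derived from the latency SLO

_executor = ThreadPoolExecutor(max_workers=4)

def simulated_logic(payload):
    """Early iteration: no business logic yet, just simulate processing time."""
    time.sleep(BUDGET_SECONDS * 0.8)  # stand-in for future computation
    return FALLBACK

def call_with_budget(fn, payload):
    """Run fn, but return the fallback once the execution budget is exhausted."""
    future = _executor.submit(fn, payload)
    try:
        return future.result(timeout=BUDGET_SECONDS)
    except FutureTimeout:
        # Budget exceeded: the caller gets a predictable answer on time.
        return FALLBACK
```

As business logic replaces `simulated_logic` iteration by iteration, the budget keeps the worst-case latency bounded regardless of how inefficient an intermediate version is.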

The diagram below demonstrates the iterative evolution of the function code added to an example application. Each iteration ends with a production deployment. The 4th step should have additional iterations to develop the business logic.

A diagram visualizing the iterative evolution of function code, starting with a simple no-op, through adding latency that’s later incrementally replaced with business logic.

Iterative evolution of function code

Verifying inputs and discovering edge cases

When business logic needs to process many input parameters, it's helpful to first learn more about those parameters and the data access patterns. Frequently, this cannot be done based on data dumps alone.

Shipping code that merely inspects the input parameters makes it possible to calculate the distribution of inputs, estimate the likelihood of certain edge cases occurring in production (and these tend to occur more frequently than expected), or record access patterns based on combinations of inputs or frequency of use. This helps in selecting the right data structures for efficient processing, defining strategies for populating caches, etc. Long-term, the already developed input verification procedures can help detect data skew or signal that new use cases were added in production, which may invalidate assumptions taken during development.

The function can store request statistics in memory and/or log them with every n-th call. To minimize impact on the main execution flow of the application, incoming traffic can be duplicated into a separate application that logs the required statistics over time.

A diagram visualizing the concept of early data verification through calculating statistics on the input data and emitting these via logs.

Verifying input data to a function early in development

Shipping logic in iterations

Just like in TDD, where passing (acceptance) tests demonstrate real progress, the iterations of the code you shipped showcase progress that's verified in production. This makes it possible to pinpoint which iteration triggers problems in production. As soon as the first use cases are ready for production use, the service can invoke the logic for these asynchronously and log the results for offline verification.

For simple cases, it's sufficient to log the input and output to perform offline verification of the results. For complex cases, like rewrites or migrations (see picture below), it's advised to record the result of both operations and to perform an additional asynchronous result comparison, which helps measure and log the correctness of the performed operations (often referred to as the Parallel Run Pattern). These comparisons also showcase the incremental progress of the development work, which is much needed in migrations.
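A minimal sketch of the comparison step, with hypothetical legacy and new implementations of the same operation. For brevity the comparison happens inline here; as described above, a production system would record both results and compare them asynchronously, off the critical path.

```python
import logging

logger = logging.getLogger("parallel-run")

def legacy_tax(amount: float) -> float:
    # Current source of truth.
    return round(amount * 1.19, 2)

def new_tax(amount: float) -> float:
    # Hypothetical new implementation being migrated to.
    return round(amount + amount * 0.19, 2)

def compute_tax(amount: float) -> float:
    """Parallel Run: serve the legacy result, compare the new one on the side."""
    expected = legacy_tax(amount)
    try:
        actual = new_tax(amount)
        if actual != expected:
            logger.warning("mismatch for %s: legacy=%s new=%s",
                           amount, expected, actual)
    except Exception:
        # A failing new implementation must never break the request.
        logger.exception("new implementation failed for %s", amount)
    return expected  # legacy stays the source of truth
```

The mismatch rate over time is exactly the migration-progress metric the text mentions.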

A diagram showing how two implementations of a function are invoked in parallel in order to compare results and measure migration progress/completeness.

Invoking two implementations in parallel for result comparison

Given sufficient confidence for a subset of use cases, the source of truth of the operation can be shifted to the new implementation, which is the essence of the Strangler Fig Pattern.
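That shift can be as simple as routing per use case, sketched below with a hypothetical allowlist of use cases that have already proven themselves through result comparisons.

```python
# Hypothetical allowlist of use cases where the new implementation
# has earned enough confidence to become the source of truth.
MIGRATED_USE_CASES = {"invoice", "refund"}

def legacy_handler(use_case: str, payload: dict) -> str:
    return "legacy:" + use_case

def new_handler(use_case: str, payload: dict) -> str:
    return "new:" + use_case

def handle(use_case: str, payload: dict) -> str:
    """Strangler Fig: shift the source of truth one use case at a time."""
    if use_case in MIGRATED_USE_CASES:
        return new_handler(use_case, payload)
    return legacy_handler(use_case, payload)
```

Growing the allowlist use case by use case strangles the legacy path until it can be deleted.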

Being responsible!

Production deployments of the described patterns require high discipline. For high-traffic use cases, it is advised to invoke early-stage code infrequently, e.g. for a fraction of incoming requests or by capping the number of executions per time unit. The mentioned technique of leveraging latency budgets by capping the execution time of the code helps ensure that early-stage code has predictable runtime implications. Impact on memory footprint needs to be managed carefully as well; this is easy to achieve by limiting the size of any intermediate results or statistics computed, stored, and logged. Unsurprisingly, these are practices that come in handy regardless and would be built up eventually over time, often as a result of production hardening or (worst case) in response to incidents.
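A sketch of such a cap, using a simple fixed-window counter with assumed limits; a production system would more likely reuse an existing rate limiter.

```python
import time

class ExecutionCap:
    """Cap how often early-stage code runs per time window (a sketch)."""

    def __init__(self, max_per_window: int, window_seconds: float = 1.0):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._window_start = time.monotonic()
        self._count = 0

    def allow(self) -> bool:
        now = time.monotonic()
        if now - self._window_start >= self.window_seconds:
            # New window: reset the counter.
            self._window_start = now
            self._count = 0
        if self._count < self.max_per_window:
            self._count += 1
            return True
        return False
```

Early-stage code then only executes when `cap.allow()` returns true, keeping its production footprint bounded and predictable.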

Untangling dependencies! (cross-team projects)

The strategies described so far apply mostly to single functions or applications. In cross-team projects, dependencies are typically the limiting factor for successful project delivery on time. Dependencies in larger projects cannot be avoided and need to be managed. Most likely, you will be familiar with one of the following situations:

  • Frontend teams waiting for a backend API to be ready, so that they can start building UI widgets and wiring them to the provided data.
  • Backend teams waiting for dependencies on APIs providing the data/fields they need to calculate results.
  • Analysts delaying building dashboards until data is available in production. Unless built into the UI frameworks, interaction data tends to be de-scoped until the very end.
  • Teams waiting for UI components to be ready and shown in production, so that they can verify the end-to-end user journey by clicking a button to initiate an action as part of their tests.

The waiting time in each of these scenarios is the factor to focus on when improving project delivery speed. Here, lessons from the Elephant Carpaccio exercise come in handy and trigger thinking in end-to-end iterations. Coupled with addressing identified dependencies early in the project, it's a powerful combo.

Shipping the simplest slice

The more dependencies, the more important it is to manage scope and optimize for early integration. Just like the slices defined in the exercise, the value of small iterations shipped to production cannot be overstated. More often than not, the simplest slice will be counter-intuitive compared to what one is used to.

Building a complex UI widget with a CTA button? From a dependency point of view, it's the button that really matters, as it links data with the destination. Shipping a UI component with just the button unblocks the teams building and testing the triggered action, along with providing the click events/data. When looked at visually, the widget may feel odd, but since it would not be shown to your users, reducing dependencies this way enables the team to focus on further iterations of the UI component itself. It's also a great starting point to reserve screen real estate for the full component and observe its incremental evolution. Naturally, such a change needs to be hidden behind a feature toggle, so that the button is shown only when desired during testing.

API first

While API-first is widely practiced through IDLs or OpenAPI specifications that help auto-generate API clients, the specs alone may not be enough. To untangle dependencies, beyond agreement on the API contracts, it's essential to provide a set of sample responses served when the code is invoked in production. Such responses can then be returned by API endpoints serving static content, alternating through sample responses or switching responses based on input context. In case a new service needs to be built, a simple web server serving static files may be enough to start with (even if later replaced by a different application). Re-using any contract tests is helpful here as well, as long as the responses are used in early-stage development, in production.
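A sketch of such a stub endpoint using only the standard library; the sample payloads and the endpoint are assumptions standing in for responses agreed on with the client teams.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from itertools import cycle

# Hypothetical sample responses agreed on with client teams;
# the stub simply alternates through them on each call.
SAMPLE_RESPONSES = cycle([
    {"status": "ok", "items": [{"id": 1, "price": 9.99}]},
    {"status": "ok", "items": []},
])

class StubApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(next(SAMPLE_RESPONSES)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def make_stub_server(port: int = 8080) -> HTTPServer:
    return HTTPServer(("", port), StubApiHandler)

# make_stub_server().serve_forever()  # run the stub until real logic exists
```

Client teams can integrate against this stub in production-like conditions long before the real implementation exists.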

Early end-to-end integration

The earlier features built by multiple teams are integrated with one another, the better. It’s helpful to align on a small end-to-end deliverable across all teams early in the project. In this iteration, API endpoints with sample responses and clients for those APIs are developed to invoke APIs and process (at least) a few API parameters (it’s worth noting that events are also APIs).

When the integration in production succeeds, it ensures that every piece of the puzzle can be further iterated on with relaxed dependencies. Early deployment to production enables everyone to observe incremental progress, incl. interaction events in analytics systems. Moreover, when the features are treated as production-grade from the start, the team continuously learns about daily operations of the system.

Lastly, through simulation of system (end-user) activity, non-functional requirements can be verified, uncovering performance bottlenecks. Applications generating such simulated traffic (be it UI interactions, API calls, or events pushed into the system) can be leveraged in CI/CD pipelines as smoke tests executed ahead of releasing a change. Additionally, such a simulator can serve as an end-to-end probe continuously verifying that the system works in production.
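A sketch of such a simulator, with assumed request shapes; the system under test is injected as a callable so the same code can back a CI/CD smoke test (against a staging deployment) or a continuously running production probe (with a real HTTP client).

```python
import random

def simulate_traffic(call_api, num_requests: int = 10) -> dict:
    """Generate synthetic requests and verify the responses end to end.

    call_api is whatever invokes the system under test (an HTTP client
    in a real setup); injecting it keeps the simulator reusable as both
    a smoke test and an end-to-end probe.
    """
    results = {"ok": 0, "failed": 0}
    for _ in range(num_requests):
        # Hypothetical synthetic request shape.
        request = {"user_id": random.randint(1, 1000), "action": "view"}
        try:
            response = call_api(request)
            if response.get("status") == "ok":
                results["ok"] += 1
            else:
                results["failed"] += 1
        except Exception:
            results["failed"] += 1
    return results
```

Failing the pipeline when `results["failed"]` exceeds a threshold turns this into the smoke-test gate described above.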

A diagram showing an application with two synchronous dependencies and a simulator application that simulates input events, API calls and verifies the results/events of these operations.

Input data simulation and result verification

Summary

The strategies outlined in this post can help reduce the risk of software delivery by shipping software to production early, in small, well-defined iterations. Applied to the right problem, they will result in more frequent code integration and deployment to production. Teams are enabled to gather information about the complexity of the problem at hand at the design stage of feature delivery, untangle dependencies, and provide more transparency on the progress of the project.

Shipping to production should be frequent and fun! If it’s not, do it more often and address the factors leading to toil or anxiety, be it through automation of lengthy and manual procedures, by adding gradual deployments with automatic rollbacks, or canary builds to reduce the blast radius of failures. In cloud-native environments, there are plenty of tools making it easy to adopt these deployment practices.