The Engineeringfest

This Sunday, we at Next Insurance kick off our 7th Engineeringfest, so I thought it would be a good opportunity to talk about what it is and why it’s worth doing in your software organization too.

So first off, what’s an Engineeringfest? Well, it’s the term we coined for a week or two where the engineering organization focuses on itself. This round will be held over two weeks.

What do we focus on? Technical debt (A.K.A. the technical backlog).

During these two weeks, we will work on delivering tasks that benefit engineering and, by extension, the whole company.

Once upon a time

Engineeringfest is the result of a bottom-up initiative that started 2 years ago.

Next Insurance is a startup that has grown at breakneck speed in the past few years in all dimensions – headcount, product features and sales, to name a few. This is super exciting but obviously also stressful. Part of the stress comes from having to make hard choices between delivery and “correctness” of the system. You sometimes have to choose a faster time to deliver at the expense of doing things the optimal way.

That is OK, as long as it’s a conscious choice that you and your managers are aware of, because, most likely, somewhere along the line, you will have to go back and put in the time to make up for these choices.

Back in late 2018 we reached such a point. We had grown and the strain on the system was showing. Changes to the system were getting harder and harder to make. There were lots of “todos” scattered across the code reminding us of the things we meant to do vs. the things we ended up doing.

We decided to use the upcoming Christmas holiday in the U.S. to work on paying back some of our technical debt. Mind you, our engineering is based mostly outside the U.S. with no Christmas to celebrate, so we figured we would use the last 2 weeks of the year for this effort while non-engineering people in the U.S. office were busy decorating trees and wearing strange sweaters (or whatever people do based on Christmas movies).

Pretty much all of the engineers wanted to be in on this initiative, so we decided to make it an engineering-wide effort. The result was extremely successful. We made a significant impact on our systems, doing things that we would not have been able to do during routine feature development. It helped us maintain velocity during 2019, and with buy-in from management and enthusiasm from the engineers, Engineeringfest became part of Next’s culture.

The goal

There’s a very concrete goal to Engineeringfest: Do what needs to be done in order to improve engineering and serve the goal of moving faster. The main themes for these improvements are usually:

  • Improvements to quality which lead to less development churn and downtime.
  • Improvements to our workflow process (not just code!).
  • Work on the build & testing pipeline.
  • Code refactoring that will make it easier to change the system in the future.

When an engineer proposes a task to work on, it is evaluated in this context. For example, looking at the planned tickets for this fest, here’s a very partial list of things we are going to do, in no particular order (the full list is over 100 tasks long!):

  • Break down a large table that has become a “repository for everything” into smaller, better defined tables.
  • Improve the process of handling build failures so that we get back to a green build faster.
  • Unify a succession of complex API calls into one call.
  • Spin off some services from a service that has become too large.
  • Create a simple self-serve data comparison tool for the product managers so that they no longer have to consult engineers to get information.
  • Create new custom linting rules for catching antipatterns in our code.
  • Add a new abstraction layer to a part of our data. This will limit the blast radius of future changes to that data.
  • Break down a dependencies choke point that’s slowing our build.
  • Add audit data for various entities for easier future debugging.
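
To make the custom linting item a bit more concrete, here is a minimal sketch of what such a rule could look like. It is purely illustrative and not one of our actual rules: the choice of Python, the function name find_bare_excepts and the antipattern itself (a bare "except:" clause that swallows every error) are all assumptions made for the example.

```python
import ast


def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of bare `except:` clauses in Python source.

    Hypothetical example rule: a bare `except:` catches everything,
    including errors we would rather see, so we flag it.
    """
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        # An ExceptHandler with no exception type is a bare `except:`.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


# Small sample to lint; `risky` is never called, the code is only parsed.
sample = """\
try:
    risky()
except:
    pass
"""

for line_number in find_bare_excepts(sample):
    print(f"line {line_number}: bare 'except:' hides errors")
```

A rule like this is small enough to fit a fest task, and once it runs in the build it keeps catching the antipattern long after the fest is over.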

How does it work?

We have 4 weeks of Engineeringfest every year. They take place for 1 week at the end of Q1 and Q2 and for 2 weeks at the end of Q4.

Since it’s a bottom-up event, the engineers get to take their wish list out of their drawers (or pull up their spreadsheets) for things they would like to work on in the fest. Just like any dev effort, there’s a process of planning and demonstrating the value of the tasks you would like to work on, but the requirements and creativity come from within the engineering org, not from the outside.

Why does it work?

Having a dedicated period of self-work gives us many benefits:

Efficiency and workflow

The rest of Next Insurance knows that engineering is not expected to work on other things during this time. Unless it’s an emergency, we will get back to you after the fest. This reduces our context switching.

Even if you happen to be THE go-to person for your system or domain, you will probably not be interrupted by other engineers that need to meet their product delivery deadline. They are also busy working on their own fest task.

We cancel all the routine meetings – dailies, weeklies. This is the only time where even the sacred 1-1 meeting with your manager is skipped. Anything to help people get into the zone of “deep work”.

People and culture

There are many types of engineers. Some live for the internal technical stuff and some really like to deliver features. For the technically oriented engineers, this is the time where they get to really focus on their favorite kind of work. For the delivery engineers, it’s a chance to do something different and touch areas of the system that they would not normally handle.

The fest is as inclusive as it gets! Everyone gets to work on tech debt. Tasks will vary in scope, complexity and level of supervision (if required), but even if you are a graduate just 3 months out of college, you still get to work on a fest task. This works against engineering cultures where refactoring is seen as the responsibility of only seniors and tech leads and helps to instil the habit of looking at the code proactively and continuously thinking about what can be improved.

Planning

A fest task is no different than any other task. When planning we try to maximize the chance of success. If the fest is 10 working days, the task should take at most 7-8 days of work, leaving a small buffer. We also break a task of this size down into smaller deliveries. We try to select tasks that are suited for individual work. This maximizes the chances of successful delivery.
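
The sizing rule above can be sketched as a tiny check. This is only an illustration, not a tool we actually use: the function name fits_in_fest, the 2-day default buffer and the sample estimates are all assumptions.

```python
def fits_in_fest(delivery_estimates: list[int],
                 fest_days: int = 10,
                 buffer_days: int = 2) -> bool:
    """Check that a task, split into smaller deliveries, leaves a buffer.

    delivery_estimates: estimated working days for each smaller delivery.
    fest_days: length of the fest in working days (10 for a two-week fest).
    buffer_days: days kept free for surprises (assumed value, not a rule).
    """
    total_days = sum(delivery_estimates)
    return total_days <= fest_days - buffer_days


# A task broken into three deliveries of 3, 2 and 3 days uses 8 of 10 days.
print(fits_in_fest([3, 2, 3]))

# The same check for a one-week fest of 5 working days with a 1-day buffer.
print(fits_in_fest([2, 3], fest_days=5, buffer_days=1))
```

The first task fits because 8 days still leaves the 2-day buffer. The second does not, because 5 days of work would eat the whole week.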

Although each team decides on its scope of work, we also coordinate between teams to prevent collisions or overlapping efforts. If another group in the organization, such as DevOps or BI, needs to be aware of changes we are going to make, we notify them ahead of the fest to make sure they are able to support the effort if needed.

The result is a super-effective week (or two).

What is it not?

Given this effort, it’s also important to realize that there are some things that Engineeringfest is not:

Engineeringfest is not a substitute for routine gardening and housekeeping of your systems. If you only work on your technical backlog during the fest, you are doing it wrong!

Whenever we see some unexpected non-delivery work that needs to be done, it’s very tempting to just throw it into the Engineeringfest bucket and say “I’ll deal with it then”. You might find yourself reaching the fest with a huge list you can never hope to handle. Don’t let it be the excuse to not handle issues!

Not all types of work are suitable for Engineeringfest.

  • Some work needs to take place over months. Engineeringfest might be a good point to kick off, but it’s not going to be enough.
  • Some work requires many dependencies between teams and people and is not suited to independent work.
  • Some work requires time between iterations to see how it affects the system. Engineeringfest is too short for this type of work.

The purpose of Engineeringfest is to improve velocity. The goal is not to try out new stuff for the sake of innovation or for playing with new technologies, unless they contribute to the goal of velocity. In fact, adding new technology oftentimes adds complexity and overhead which is the exact opposite of our goal! So think hard and well before adding new and strange cogs to the machine.

This also leads us to the risk of misaligned expectations: Engineeringfest is not a hackathon, and there is a risk that if you do not communicate this clearly, people will want to use the time for unrelated experiments or projects, which have their place, but not as part of the fest.

Business not as usual

As part of making Engineeringfest feel non-routine, we’ve also developed traditions around the process:

Normally, we work with Jira for tickets. In Engineeringfest we still do that but there’s also a visual element to the work: Each developer takes an A4 sized sheet of paper for each feature they will work on. The feature is written down with the occasional illustration (as can be seen in this post’s banner). We hang them on a “todo” wall, decorating the office. Each time something is done, the paper is moved to the “done” wall accompanied by the ring of a cowbell or gong. The sense of team accomplishment and individual recognition is much more pronounced compared to a status change in Jira 🙂

This year with COVID, the gong will be replaced by a dedicated Slack channel.

For many of the engineers, this Engineeringfest will be their first, and the fest has a strong social component that helps them get to know people outside their immediate circle of contacts. We always set up a round or two of social games, whether it’s a board game evening or a things-you-didn’t-know-about quiz. This year, obviously, it will all be held online, with social games over Zoom.

Conclusion

While it’s generally agreed that every engineering org should devote at least 20% of its time to the technical backlog (and oftentimes more than that), come crunch time, these tasks are usually the first to go out the window. The result is an organization with an unbalanced ratio between delivery and housekeeping. The fest helps keep the ratio in balance.

I think the reason that I like Engineeringfest so much is that it’s not only about the work that’s done. It’s a reflection of a kind of engineering culture that I aspire to be part of.

Internally within engineering, it’s about pride in our work, caring about how our systems are designed and coded, not just about the raw outcome and delivery. It encourages everyone in the organization to come up with ideas for improvement and allows them time to implement them.

The fest itself is not only about raw metrics and velocity. It’s a statement that the agenda that’s important to engineering is also important to the company as a whole and that success of this agenda contributes to the success of the company.