Build in the age of AI – Learnings from a failed integration

Today’s story is about build vs. buy: how I chose wrong, and what that failure taught me about making this choice in the age of LLMs.

I was brought into a team that’s building a voice bot, with the task of building their observability. When building a bot, it’s vital to be able to see interactions so that we can audit them, tag them and close the feedback loop to improve quality.

When evaluating the options, I faced several constraints. The system will contain production data, so it has to be secure. It has to integrate with our current user role management. It’s not just read-only observability: users will annotate the inputs. And it has to support a streaming architecture, not just HTTP calls.

For reasons of simplicity, we decided not to build our own solution from scratch, since that would require bringing in another team with knowledge of building our backoffice frontend apps. I was also very concerned about the TCO of such a system. It’s not just the initial development: we would have to fix bugs, deal with security and performance, and incrementally add more capabilities, all of it coordinated with this other team.

I decided that it would be better to pay the upfront cost of integrating an existing platform instead. We looked for a self-hosted open source solution and settled on Langfuse, which the team had already evaluated in the past as a potential LLM management platform.

The implementation turned out to be anything but smooth. Everything was a struggle, from forcing Langfuse to align with our event stream architecture, to the GRC & security hoops we needed to jump through, to what turned out to be a hyper-sensitive configuration of all of the DBs under the hood (Langfuse requires Postgres, Redis and, most painfully, ClickHouse), managed with Terraform that deploys to K8s.

My breaking point turned out to be a full working day where I tried to harden the system by closing a non-secure port and forcing the system to use the secure one. I kept reaching chicken-and-egg situations where the only way to fix a broken configuration was to first reach a non-broken state, at which point I would be able to fix the broken state (makes sense, right? 🤦🏻‍♂️). All this for a platform that was not a 100% natural fit for our use case, but just a good-enough solution for the time budget that we had.

I raised a red flag (hopefully early enough) to management and had to admit that we were off-track and barking up the wrong tree.

In parallel, another team member had prototyped an internal page that met exactly the requirements we had: viewing a transcript, and even being able to listen in on the conversation and visually assess pauses in speech. Yes, it’s a dreaded “vibe coded” page, but it works!
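To give a sense of how little code the core of “assessing pauses” needs, here is a minimal sketch. It is not the prototype’s actual code: the Segment shape, the speaker labels and the one-second threshold are all assumptions made up for illustration. It takes timestamped transcript segments and reports the silent gaps between them.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One utterance in a call transcript (hypothetical shape, for illustration)."""
    speaker: str   # e.g. "bot" or "caller" (assumed labels)
    start: float   # seconds from the start of the call
    end: float
    text: str


def find_pauses(segments, min_gap=1.0):
    """Return (gap_start, gap_seconds, next_speaker) for each silence >= min_gap.

    Assumes segments do not overlap; overlapping speech is not handled here.
    """
    ordered = sorted(segments, key=lambda s: s.start)
    pauses = []
    # Compare each segment with the one that follows it.
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt.start - prev.end
        if gap >= min_gap:
            pauses.append((prev.end, round(gap, 3), nxt.speaker))
    return pauses


transcript = [
    Segment("bot", 0.0, 2.5, "Hi, how can I help?"),
    Segment("caller", 3.0, 6.0, "I want to check my policy."),
    Segment("bot", 8.5, 10.0, "Sure, one moment."),
]

for start, gap, speaker in find_pauses(transcript):
    print(f"{gap:.1f}s pause at {start:.1f}s before {speaker} speaks")
```

A page like the one described could render these gaps on a timeline next to the audio player; the threshold would need tuning against real calls.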

We decided to use this initial prototype internally within the team. At this point we had just completed the rollout of Cursor to all devs at Next, and when we discussed the cost of expanding the prototype and integrating it into our existing backoffice platform, I was surprised to hear that the effort estimate was lower than I expected. We ended up planning a roadmap that starts with the page as an internal tool and gradually ports it to the backoffice app (which already handles user authentication and authorization). With Cursor, we will probably be able to maintain this app with much less support from the backoffice team in the future.

To paraphrase, I was given a choice between building my own system and integrating a solution. I chose integration, and I also had to build 🙂

In retrospect, the main error in my assessment was the cost/benefit estimate for building my own solution. That cost has been dramatically reduced by Cursor (and other LLM tools), and this will shift my evaluations going forward. Yes, integrations have also come down in cost, with LLMs being able to do a lot of the configuration dirty work, but I think that building stuff has come down in cost much more.

I also suspect that companies with dev tools offerings will start facing headwinds as LLM tools gain more traction and teams gain the confidence to build end-to-end internal solutions instead of “throwing money at the problem”.