Build the perfect internal developer platform for your teams

We have already seen that an internal developer platform (IDP) is necessary to scale DevOps initiatives. Otherwise, software developers will suffer from cognitive overload, which will negatively impact their productivity and well-being. Building an IDP is a must, and everybody should be doing it right now.

So what about just doing it? What could go wrong?

As with many other projects, the most difficult part is getting started—and starting the right way. But that’s not enough: a platform must be completed and used against all odds.

In this blog post, I’ll be answering these questions:

How do you start building a platform?
How do you shape its architecture?
What challenges will you face? How can they be overcome?

How to start building a platform

So, how do you make a platform a reality? Establishing an IDP involves three main steps:

1. Define the architecture and the software tools

To create a platform, you need to make a few key architectural and technological choices. In addition to the developer portal, most platforms revolve around Kubernetes, which sets the stage for the abstractions and paradigms used by the different software components.

2. Integrate components and add functionality

Once you have the platform's core in place, it’s time to consolidate the different components. Integration goes beyond connecting the source code to build pipelines. It’s also about establishing access controls and enforcing compliance. For example, you often need to embed security scanning and automated quality gates within your IDP.
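To make the idea of an automated quality gate concrete, here is a minimal sketch in Python. The report fields, the policy thresholds, and all names (ScanReport, GatePolicy, evaluate_gate) are assumptions made for illustration; a real platform would take these values from whatever scanners and test tools it integrates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanReport:
    """Hypothetical summary that pipeline steps produce for one build."""
    critical_vulnerabilities: int
    test_coverage: float  # fraction between 0.0 and 1.0


@dataclass(frozen=True)
class GatePolicy:
    """Hypothetical thresholds; real values depend on the organization."""
    max_critical_vulnerabilities: int = 0
    min_test_coverage: float = 0.8


def evaluate_gate(report: ScanReport, policy: GatePolicy) -> list[str]:
    """Return the violated rules; an empty list means the gate passes."""
    violations = []
    if report.critical_vulnerabilities > policy.max_critical_vulnerabilities:
        violations.append(
            f"{report.critical_vulnerabilities} critical vulnerabilities "
            f"(allowed: {policy.max_critical_vulnerabilities})"
        )
    if report.test_coverage < policy.min_test_coverage:
        violations.append(
            f"test coverage {report.test_coverage:.0%} "
            f"(required: {policy.min_test_coverage:.0%})"
        )
    return violations


if __name__ == "__main__":
    problems = evaluate_gate(ScanReport(2, 0.5), GatePolicy())
    print("gate passed" if not problems else "gate failed: " + "; ".join(problems))
```

Returning the list of violations, rather than a plain pass or fail, lets the pipeline tell developers exactly what to fix.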

3. Make developers part of the platform

With help from guides and tutorials, developers can start using the platform as soon as it’s mature enough. They can provide early feedback on the functionality and request features that benefit their workflow. A platform is successfully adopted when developers actively contribute, for instance, by creating templates, writing documentation, and adding custom views to the dashboard.

Architecting the platform

Defining the platform's architecture is complicated, and one could be tempted to repurpose existing tooling and services as a head start. It may be a bad idea in many cases, though. Thinking of the platform as it should be, as a greenfield project, is a valuable approach to get rid of legacy and embrace modern technologies. But how do you deal with complexity? It’s good to start with solid foundations and then consider what is at the platform's core; “reference” implementations also help to start.

Cloud-native principles

The foundations of a platform are the technical approaches underlying it and the basic components that make it work. Modern cloud-native technologies are purposely built to take advantage of the cloud paradigm and specifically target scalable systems with loosely coupled components that are resilient and secure. A cloud-native approach to building IDPs is based on a few key principles.

Kubernetes. Besides being an orchestrator, Kubernetes establishes how modern applications work in practice: a set of stateless microservices implemented with software containers. This also allows for easy application scaling in cloud environments.
Everything as code. A system is defined as code for all its components, including the underlying infrastructure. This code is human-readable and versioned, typically as YAML configuration files stored in a version control system.
Security everywhere. Access to the platform is regulated by role-based access control. Mutual authentication is employed, and all data is encrypted in transit. Tokens and certificates are short-lived and automatically generated.
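The "security everywhere" principle can be illustrated with a small Python sketch that combines role-based access control with short-lived tokens. The roles, permission strings, and 15-minute lifetime are hypothetical, and the token is a plain in-memory object; a real platform would delegate all of this to its identity provider rather than implement it by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "developer": {"catalog:read", "service:deploy"},
    "viewer": {"catalog:read"},
}


@dataclass(frozen=True)
class Token:
    subject: str
    role: str
    expires_at: datetime


def issue_token(subject: str, role: str, now: datetime,
                ttl: timedelta = timedelta(minutes=15)) -> Token:
    """Issue a token that expires after a short, fixed lifetime."""
    return Token(subject=subject, role=role, expires_at=now + ttl)


def is_allowed(token: Token, permission: str, now: datetime) -> bool:
    """Allow the action only if the token is still valid and the role grants it."""
    if now >= token.expires_at:
        return False
    return permission in ROLE_PERMISSIONS.get(token.role, set())


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    token = issue_token("alice", "developer", now)
    print(is_allowed(token, "service:deploy", now))                       # valid token
    print(is_allowed(token, "service:deploy", now + timedelta(hours=1)))  # expired token
```

Passing the current time in explicitly keeps the check deterministic and easy to test.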

Core components

The principles we’ve seen are the fundamental abstractions for building platforms, and the platforms built on them all share the same core components.

Identity and access. Allows users to authenticate and access the platform with the rights corresponding to their role. It generally employs identity federation and centralized access management.
Infrastructure provisioning. Deploys and configures the infrastructure supporting the platform's services. Typical infrastructure includes Kubernetes clusters, databases, and network resources.
Version control system. Stores code and tracks changes over time by keeping a history. Git is the de facto standard nowadays, used for code, configuration files, and documentation.
Continuous integration and delivery. Builds an application, runs different tests, and deploys the application to a target environment.
Developer portal. This includes a service catalog, software templates, and technical documents through a unified view. It is usually a dashboard that reports the status of different services and offers self-service capabilities.

Reference architectures

Now that we know the basic components of a platform, we can pick specific software or services for each of them and add the rest. There is no shortage of options; if anything, there are too many. The Cloud Native Computing Foundation (CNCF) maintains a landscape of open source projects and commercial products. The landscape can be used as a map to explore the available options, conveniently organized into categories. Still, the landscape includes hundreds of solutions. Picking the “right ones” for your platform may be daunting.

Cloud Native Computing Foundation (CNCF)

The Cloud Native Computing Foundation (CNCF) is an initiative to foster adoption of modern technologies that enable running scalable applications in the cloud—software containers, microservices, and declarative application programming interfaces, to name a few. The CNCF also supports an ecosystem of open-source, community-based projects. In particular, it maintains a map of these projects and defines metrics to assess their maturity.

Platforms should be tailored to each organization, but are there any “templates” to build upon? Cloud Native Operational Excellence (CNOE) aims to provide a reference architecture by consolidating toolchains and best practices for building an IDP. CNOE emphasizes open source solutions and targets IDPs that can be realized on top of different cloud providers. CNOE’s technological choices are the following:

Keycloak for identity and access management.
External Secrets Operator to interface with third-party vaults and secret managers.
Crossplane and Terraform for infrastructure as code.
Argo Workflows and Tekton for continuous integration.
Argo CD and Flux for continuous delivery.
Backstage as the developer portal.
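One way to reason about a reference stack is to map each core component from this post to the tools chosen for it and check for gaps. The Python sketch below does that with the tools listed above; the component names, the split of CI and CD into separate entries, and the "secrets management" category are my own labels for illustration, not part of CNOE.

```python
# Core components as described in this post (CI and CD split for clarity).
CORE_COMPONENTS = [
    "identity and access",
    "infrastructure provisioning",
    "version control system",
    "continuous integration",
    "continuous delivery",
    "developer portal",
]

# Tools from the list above, grouped under illustrative category names.
REFERENCE_STACK = {
    "identity and access": ["Keycloak"],
    "secrets management": ["External Secrets Operator"],
    "infrastructure provisioning": ["Crossplane", "Terraform"],
    "continuous integration": ["Argo Workflows", "Tekton"],
    "continuous delivery": ["Argo CD", "Flux"],
    "developer portal": ["Backstage"],
}


def uncovered_components(stack: dict[str, list[str]]) -> list[str]:
    """Return the core components for which the stack names no tool."""
    return [component for component in CORE_COMPONENTS if not stack.get(component)]


if __name__ == "__main__":
    for component in uncovered_components(REFERENCE_STACK):
        print(f"no tool selected yet for: {component}")
```

With the list as given here, the check reports the version control system as the one component left for you to choose.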

Cloud Native Operational Excellence (CNOE)

Cloud Native Operational Excellence (CNOE) is an open source initiative for building internal developer platforms (IDPs), led by companies including Adobe, Amazon Web Services, Autodesk, Salesforce, and Twilio. CNOE is not a premade platform solution; it’s an effort to share developer tooling and patterns that organizations can adopt for creating their IDPs. As a community, CNOE also contributes tools and reference implementations supporting different cloud providers.

The challenges you will run into

Despite the benefits, establishing a platform is not easy, and you should expect to run into the following challenges.

Managing complexity

Developing a platform is challenging. Many components need to be carefully integrated for them to work as intended, and integration often relies on configuration files under version control. This approach is powerful but also somewhat inconvenient. These files are not that human-friendly and may require pre-processing with specialized tooling.

One option to overcome this issue is to extend the platform's self-service capabilities, which are available through the dashboard, so that developers can create resources, and not just visualize them, from a user-friendly and familiar web interface.
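As a rough sketch of what such a self-service action could do behind the scenes, the Python function below turns three form fields into a Kubernetes Deployment manifest, so developers never edit the configuration file by hand. The function name, the form fields, the example image, and the validation rules are assumptions for illustration; the manifest is emitted as JSON, which Kubernetes accepts alongside YAML.

```python
import json
import re

# Conservative name check: lowercase alphanumerics and hyphens, up to 63 characters.
_NAME_PATTERN = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def render_deployment(name: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal apps/v1 Deployment manifest from simple form inputs."""
    if len(name) > 63 or not _NAME_PATTERN.match(name):
        raise ValueError(f"invalid service name: {name!r}")
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values a developer might enter in the portal form.
    manifest = render_deployment("orders", "registry.example.com/orders:1.0", replicas=2)
    print(json.dumps(manifest, indent=2))
```

The generated manifest can still be committed to version control, so the convenience of the web form does not come at the expense of the "everything as code" principle.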

Balancing freedom and governance

Developers may perceive the platform as a threat, a structure limiting their freedom and disrupting their workflow. This generally happens when the platform doesn’t have the expected features.

In such cases, developers will start finding ways to circumvent the platform, eventually resorting to building their own. Therefore, you need to involve developers in the early stages of building the platform. Extensive feedback from multiple teams is extremely valuable for making informed decisions and scoping functionality.

Making the platform sustainable

Once available, your platform must be constantly updated, expanded, and operated.

A common approach is treating it as a product and establishing a platform team to support it. This approach seems natural, but you risk going against the DevOps principles. You are separating development and operations and possibly creating a platform silo.

One option to overcome this threat is to allow developers to contribute to the platform, for instance, through innersourcing—a software development methodology that applies best practices from open source projects within an organization.

Closing thoughts

Creating an internal developer platform (IDP) is essential for scaling DevOps initiatives without overwhelming developers, but starting and maintaining one poses challenges. My goal with this blog post was to give an overview of the process of building an IDP, from defining its architecture to overcoming the obstacles that inevitably arise.

You have the knowledge. Now, it’s time for you to get started!

Published: Nov 29, 2024

DevOps Cloud Platform engineering