
Internal Developer Platform: Insights from over 100 conversations

December 4, 2024 | January 14th, 2025 | 14 min read

What have I learned and what do I think about the hype surrounding the Internal Developer Platform (IDP)?

I'll come straight to the conclusion from my discussions. I have already written an article about what constitutes an internal developer platform (IDP) and how everything fits together: "Internal developer platforms: A real innovation or just a trend?". Here are some insights into the internal developer platform that I have gained.

1. Internal developer platforms can be everything and nothing

That's right. There is no clear definition of what an internal developer platform is. Many have tried to create a maturity model for an IDP - based on features, level of automation and the value it provides.

I'll make it simple and show you what different companies mean by an IDP.

An IDP can simply be documentation or a guide with a blueprint for other teams. In this context, companies are not talking about Terraform modules, Helm charts or packaging tools like APT. What they really mean is something like:

Figure: Internal developer platform based on documents

  • Platform Engineer: creates and defines the structure, tools and processes for the platform.
  • Documentation/Blueprint: this information is recorded in documentation or as a blueprint, which is then passed on to developers.
  • Developers: copy the information from the documentation or blueprint into their working environment, then search, replace and test before executing the code on the platform. If errors occur, feedback is provided to the developer.
  • Platform: the target on which the code is ultimately executed.

Yes, you read that correctly. There are companies that say: "If we provide a blueprint with placeholders that can be used by different developers, then that qualifies us as an IDP." I partially agree with this perspective: Team X provides a template to one or more teams, along with instructions on how to consume a service as a self-service.
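To make the "blueprint with placeholders" idea concrete, here is a minimal sketch in Python. The blueprint content, the placeholder names (`team`, `environment`) and the function name are hypothetical and only illustrate the search-and-replace step a developer would otherwise do by hand; they are not taken from any specific company's documentation.

```python
from string import Template

# Hypothetical blueprint, as a platform team might publish it in its docs.
# ${team} and ${environment} are the placeholders developers must replace.
BLUEPRINT = Template("""\
apiVersion: v1
kind: Namespace
metadata:
  name: ${team}-${environment}
  labels:
    owner: ${team}
""")


def render_blueprint(team: str, environment: str) -> str:
    """Fill in the placeholders of the blueprint.

    Template.substitute() raises KeyError for any placeholder left
    unfilled, which catches the classic "forgot to replace a value" error.
    """
    return BLUEPRINT.substitute(team=team, environment=environment)


if __name__ == "__main__":
    print(render_blueprint("payments", "dev"))
```

Even in this document-based form, the self-service character is visible: the platform team owns the blueprint, and the consuming team only supplies its own values.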

An IDP can also consist of Terraform modules that a team member configures and deploys locally, based on a guide for the other users. It could look like this:

Figure: Internal developer platform based on Terraform modules

  • Platform Engineer: provides Terraform modules that serve as building blocks for the platform.
  • Terraform modules: preconfigured templates that are made available to developers to build infrastructure or platform components. They are shown in the image as red building blocks, marked with the Terraform logo.
  • Developers: use these Terraform modules to configure and build infrastructure locally. They integrate the modules into their workflows and build on them.
  • Platform: the combined use of the Terraform modules leads to the creation of the platform, shown as a pyramid of building blocks with Kubernetes at the top. This symbolizes the successful implementation of an infrastructure or system based on the provided Terraform modules.

This is more in line with my understanding of what an IDP is. You provide Infrastructure as Code or Configuration as Code, and only user-defined configurations need to be set up.

An IDP can also be a portal that has achieved a relatively high level of automation. This means that I can request a template in a certain "t-shirt size" with one click or via an API and everything is provided automatically. This refers to something like:

Figure: Portal based on the internal developer platform

  • Platform Engineer: creates and maintains the portal, a centralized location where templates and services are made accessible.
  • Portal: provides the automation logic and infrastructure templates. It allows developers to select specific configurations (e.g. "t-shirt sizes") to create a scalable platform.
  • Developer: interacts with the portal, selects the appropriate "t-shirt size" (a standardized template or configuration) and starts the deployment.
  • Platform: once a size is selected, the platform is deployed automatically. This is represented in the diagram by the Kubernetes icon, which stands for the target platform.
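The "t-shirt size" idea can be sketched as a simple lookup from a size label to a predefined template. The size names, the numbers and the `request_platform` function below are assumptions made up for illustration; a real portal would hand the resolved request to its own automation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterTemplate:
    """A standardized configuration behind one t-shirt size."""
    nodes: int
    cpu_per_node: int
    memory_gib_per_node: int


# Hypothetical catalogue; real sizes are whatever the platform team defines.
TSHIRT_SIZES = {
    "S": ClusterTemplate(nodes=3, cpu_per_node=2, memory_gib_per_node=8),
    "M": ClusterTemplate(nodes=5, cpu_per_node=4, memory_gib_per_node=16),
    "L": ClusterTemplate(nodes=9, cpu_per_node=8, memory_gib_per_node=32),
}


def request_platform(team: str, size: str) -> dict:
    """Resolve a t-shirt size into a full deployment request.

    The developer only picks a size; every other value comes from the
    template that the platform team maintains.
    """
    key = size.upper()
    try:
        template = TSHIRT_SIZES[key]
    except KeyError:
        raise ValueError(
            f"unknown size {size!r}; choose one of {sorted(TSHIRT_SIZES)}"
        ) from None
    return {
        "team": team,
        "size": key,
        "nodes": template.nodes,
        "cpu_per_node": template.cpu_per_node,
        "memory_gib_per_node": template.memory_gib_per_node,
    }


if __name__ == "__main__":
    print(request_platform("payments", "m"))
```

The same function could sit behind a button in a portal or behind an API endpoint; the point is that the developer supplies one value and the template supplies the rest.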

You can see that different companies have different views on this, and there are also some understandable reasons for these differences. I will come back to this later.

In the following, I have tried to illustrate the different stages of development of the companies I spoke to.

2. Degree of maturity of automation

In this section, we will look at the different states of the companies. This is not about the quality of the respective level, but rather a classification of where you see yourself or where you operate.

The CNCF WG Platforms, which contributed to the whitepaper, developed an excellent graphic titled "Capabilities of Platforms":

Figure: Capabilities of platforms

  1. Product and application teams (top level): the end users of the platform, who work directly with the tools and services provided. Their focus is on the development and provision of products and applications.
  2. Platform interfaces (middle level): the interfaces between the product teams and the platform capabilities.
     • Documentation and search: enables access to information and instructions.
     • User interfaces (web portals): interactive interfaces for interacting with the platform.
     • Project and environment templates: pre-built templates that developers can use.
     • APIs and CLIs: programming interfaces and command line tools for automation and direct integration.
  3. Platform capabilities (core of the platform): the core services of the platform that support various tasks and automations.
     • Provisioning environments and resources: infrastructure (compute, network and storage), data (databases, caches, buckets) and messaging (message queues and brokers).
     • Identification and authorization: control of users and services.
     • Scanning and policies: checking artifacts and compliance with policies.
     • Automation: automated build, test and delivery.
     • Storing artifacts: storage in registries and repositories.
     • Tying services: linking services to workloads (e.g. via Secrets).
     • Monitoring workloads: monitoring applications and services.
  4. Capability and service providers (lowest level): the infrastructure and service providers that form the foundation for all platform services. These include cloud providers, databases, networks and other underlying technologies.

If you understand the capabilities of platforms, you probably have a broader perspective than many small and medium-sized companies whose core business is not software or product development.

For this reason, I have tried to abstract the concept in order to simplify it. I have created a stack that you will probably be familiar with. In the next step, we will look at the different levels of automation.

Figure: Maturity level of automation

  • Level 5: IDP + Portal
  • Level 4: Modules, Charts
  • Level 3: IaC + Pipelines
  • Level 2: IaC, CaC
  • Level 1: Scripting
  • Level 0: ClickOps

Level 0: ClickOps

There are still many organizations that prefer ClickOps, whether on-premises or in the cloud, because they believe it's faster. I'm not going to judge this approach; it's just a fact.

Level 1: Scripting: Bash, Python or PowerShell

Many companies consider the execution of scripts to be automation. Since this is not done via clicks, it is considered automated. Again, I will not give a rating.

Level 2: Infrastructure as Code and Configuration as Code

In my opinion, the next step after scripting is the use of tools such as Terraform to provide infrastructure and Ansible for configuration.

Level 3: Pipelines: IaC + CI/CD or operators with CRDs

One step further: IaC is no longer executed locally on a client device, but via a pipeline, or a tool such as Crossplane is used, which automatically provides the corresponding resources.

Level 4: Terraform modules, Helm charts and GitOps

With increasing professionalization, recurring infrastructure parts are packed into Terraform modules, for example to provide infrastructure or Kubernetes clusters. A GitOps approach is then used to deliver infrastructure as an application to the respective clusters. The level of automation here is quite high. "Quite high" means:

  1. Can I scale with growing projects?
  2. Can I also scale maintenance and operations to avoid technical debt?
  3. Can I scale the setup without increasing the number of employees?

This is still done by people, more precisely by the platform team.

Level 5: Replace the human with a portal

The next stage would be to replace the human component in level 4 with an abstraction layer. This does not mean that the platform teams disappear; someone still needs to create Terraform modules, Helm charts, pipelines, etc. so that they can be rolled out via a template.

Where do you stand?

I think it's important to understand what level you are at, as I often relate this level to the skills and resources within an organization. From my observations, there is a correlation between a low level of automation and heterogeneous infrastructures, which in turn often goes hand in hand with resource scarcity and scaling through additional staff.

This often reflects the level of competence. However, this does not mean that people are poorly qualified - quite the opposite. Rather, it shows where my company is currently on the cloud-native roadmap (do we use Git, containers, CI/CD, do we have IaC and CaC, etc.).

I have tried to illustrate this, and I think many will be able to understand it. First, let's look at some key competency points on the cloud-native roadmap. Low is bad, and high is good.

Figure: Skills: Cloud Native Roadmap (Imperative, Containerization, Declarative Approaches, API-driven, Package Management, Orchestration)

Let's now map these points onto the automation levels.

Imperative refers to an instruction or command that asks someone to perform a specific action or task.

Figure: Imperative (marked red) on the automation stack, from Level 5: IDP + Portal down to Level 0: ClickOps

Containerization is the process of bundling applications and their dependencies into containers so that they can be run consistently in different computing environments.

Figure: Containerization like with Docker (marked red) on the automation stack, from Level 5: IDP + Portal down to Level 0: ClickOps

Declarative refers to a programming approach in which the desired result is specified without explicitly specifying the steps to achieve this result. The system automatically takes over the details of the implementation.

Figure: Declarative (marked red) on the automation stack, from Level 5: IDP + Portal down to Level 0: ClickOps
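The difference between the two approaches can be sketched in a few lines of Python. In the declarative style, the user only states the desired result, and the system works out the steps. The `reconcile` function and the workload names below are hypothetical; they are a toy model of the idea, not the behavior of any specific tool.

```python
def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Derive the steps that move `actual` towards `desired`.

    Both arguments map a workload name to its replica count. The caller
    declares only the desired result; the steps are computed here.
    """
    actions = []
    for name, replicas in sorted(desired.items()):
        if name not in actual:
            actions.append(f"create {name} with {replicas} replicas")
        elif actual[name] != replicas:
            actions.append(f"scale {name} from {actual[name]} to {replicas}")
    # Anything that exists but is no longer declared gets removed.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    desired_state = {"web": 3, "api": 2}
    actual_state = {"web": 1, "old-job": 1}
    for step in reconcile(desired_state, actual_state):
        print(step)
```

An imperative script would instead hard-code those steps in order, and running it a second time would repeat them. Here, calling `reconcile` again once the states match returns an empty list.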

API-Driven refers to a design approach that prioritizes the use of application programming interfaces (APIs) as the primary method of interaction between different software components, enabling seamless communication and integration between systems.

Figure: API-driven (marked red) on the automation stack, from Level 5: IDP + Portal down to Level 0: ClickOps

Package management is the process of automating the installation, updating, configuration and removal of software packages to ensure that software dependencies are correctly managed and maintained in different environments (e.g. Helm charts, APT).

Figure: Package management like Helm Charts (marked red) on the automation stack, from Level 5: IDP + Portal down to Level 0: ClickOps

Orchestration refers to the automated coordination and management of complex processes or workflows, often involving multiple services and systems, to ensure that they work together efficiently and effectively (e.g. Kubernetes).

Figure: Orchestration like Kubernetes (marked red) on the automation stack, from Level 5: IDP + Portal down to Level 0: ClickOps

As you can see, there is a correlation that can be summarized as follows:

  • Level 0-1: Predominantly imperative approaches, without containerization or orchestration.
  • Level 2-3: Transition from imperative to declarative approaches, first steps towards containerization and API integration.
  • Level 4-5: Strong focus on declarative approaches, extensive use of containerization, orchestration and API-driven environments as well as package management.

3. How IDPs fit and what they offer

Most solutions I have seen are structured in a logically similar way. A portal is provided, typically based on an internal developer platform (IDP), which is usually held together by an operator. This means that there is a way to convert infrastructure into a template to make it usable. In other words, there is a level of abstraction that makes it easier to use.

So there are usually at least two sides involved. For example, there is a platform team that defines templates so that developers can use them more easily. The developers, in turn, become users of these templates or can use the same tools to abstract their own applications. But what does abstraction mean in this context?

Let's take a look at the following diagram:

  • Platform Engineer: defines the components of a web application (Deployment, Service, Ingress, ServiceAccount).
  • Magic Operator: translates the abstraction (webapp) into concrete resources and executes the deployment.
  • Developer: configures only the necessary parameters (e.g. image, hostName) via a user-defined resource (CR) without dealing with the details.

It becomes clear that the platform team abstracts and defines everything needed to deploy a web application service. The developer only has to define 2-3 values instead of dealing with the entire Kubernetes manifests, and an operator takes over the deployment of the web application.
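The translation step that the operator performs can be sketched as a plain function: a handful of developer-facing values go in, and the full set of Kubernetes manifests comes out. The function name, the default container port and the `app` label key are assumptions for illustration; a real operator would additionally watch the custom resource and apply the result to the cluster.

```python
def expand_webapp(name: str, image: str, host_name: str, port: int = 8080) -> list[dict]:
    """Expand a minimal web app definition into four Kubernetes manifests.

    `port` (the container port) and the "app" label are illustrative
    defaults chosen by the hypothetical platform team.
    """
    labels = {"app": name}
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": name},
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "serviceAccountName": name,
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ],
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels, "ports": [{"port": 80, "targetPort": port}]},
    }
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            "rules": [
                {
                    "host": host_name,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {"name": name, "port": {"number": 80}}
                                },
                            }
                        ]
                    },
                }
            ]
        },
    }
    return [service_account, deployment, service, ingress]


if __name__ == "__main__":
    for manifest in expand_webapp("shop", "registry.example.com/shop:1.0", "shop.example.com"):
        print(manifest["kind"], manifest["metadata"]["name"])
```

The developer-facing surface is the function signature: three values instead of four manifests. Everything else is a decision the platform team makes once and maintains centrally.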

This is a simplified view, but a portal based on an IDP essentially does the same thing, just in a more complex way. If you replace the simple diagram above with Humanitec's Orchestrator, you will see some similarity in the logic:

Figure: Humanitec Score

  • Developer: defines the application in score.yaml (e.g. product-service, resources such as PostgreSQL, DNS).
  • Infrastructure & Operations: defines resources per context (e.g.

4. Who is an IDP suitable for?

Before I try to answer this question, I would like to show you the following diagram. Please take a minute to think about it:

Figure: Skills: Cloud Native Roadmap. Scale: Low → High. Levels: Imperative, Containerization, Declarative, Package Management, API-driven, Orchestration. Marking:

Personally, I believe that before you start looking at internal developer platforms (IDPs) and portals, you should first assess the degree of automation. The term "degree of automation" should not be taken literally here, but rather as a synonym for understanding where you currently stand and why.

If you are at levels 0-1, I personally wonder what you want to integrate into the portal - scripts? If you are at levels 2-3, you might want to invest more to reach levels 3-4 first before tackling the IDP issue. It's better to close gaps, including skills gaps, to build a solid foundation for an IDP with Portal.
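The rule of thumb above can be written down as a small check. This is only a sketch of my personal assessment, not a formal model; the function name and the wording of the returned messages are made up for illustration.

```python
def idp_readiness(level: int) -> str:
    """Map an automation level (0-5) to the rule of thumb from the text."""
    if level not in range(6):
        raise ValueError("level must be an integer between 0 and 5")
    if level <= 1:
        # ClickOps or scripting: there is little a portal could integrate.
        return "not yet: build up IaC and CaC before thinking about a portal"
    if level <= 3:
        # IaC exists, but gaps (including skills gaps) should be closed first.
        return "not yet: invest in reaching levels 3-4 first"
    # Levels 4-5: modules, charts and GitOps are foundational skills.
    return "ready: the foundation for an IDP with a portal is in place"


if __name__ == "__main__":
    for level in range(6):
        print(level, idp_readiness(level))
```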

Most of the companies I've seen that have built an IDP, for example based on Backstage, have been at levels 4-5. For them, Docker, CI/CD, IaC, Kubernetes, etc. have become foundational skills, allowing them to evolve with other topics.

When you start learning math, you start with basic math and don't jump right into advanced math at the university level. The same goes for an IDP + portal in my opinion. It's better to build a solid foundation in the business instead of following trends you can't afford.

5. Do it yourself or buy it?

Most organizations I've spoken to, especially those listed below as non-vendors, prefer SaaS solutions or self-hosted options, with SaaS being the preferred choice. Many organizations are hesitant to use an IDP because they don't want to replace the human component of platform engineering.

The most common statements (against MAKE OR BUY) that I heard were:

  • Providers offer managed services, and their internal teams or external customers have smaller teams, making the effort of implementing an IDP not worthwhile.
  • Many companies prefer to professionalize their various layers first before starting to build an IDP in order to create a solid foundation.
  • Some companies say they are much faster with platform engineering and the human component replacing the portal. You don't need a MAKE OR BUY decision for an IDP/portal.
  • The complexity is already high, and another level would not help the company move forward.


Personal assessment of MAKE OR BUY:

  • Companies at level 0-3 have other challenges than the introduction of an IDP/portal. ❌
  • Service providers should → BUY or MAKE (innovation, new product, etc.).
  • In-house IT companies with fewer experts, but at levels 4-5 → BUY.
  • Companies with few platform engineers and small development teams should → BUY or do platform engineering.
  • Companies with 10-15 platform engineers and 500-1000 developers should → BUY.

It's not an easy decision. I always try to think in terms of values and ask myself: what added value does it bring to the company?

  • Innovation?
  • Better time-to-market?
  • Scalability through self-service?
  • Reducing the cognitive load on platform engineers?
  • etc.

However, there is one point that concerns me greatly in all of these topics, which I will address briefly in the next section.

6. Why is there such a strong focus on developers?

I don't understand why people always talk about developer empowerment; sometimes it feels like treating developers like little kids learning to ride a bike without training wheels. Are IT companies really only made up of developers? To be honest, it bothers me that the rest of the IT professionals in organizations are often overlooked. Is the goal to divide the culture and then bring them back together through DevOps 3.0?

With DevOps, we failed to create a culture between developers, operations and other departments, and now there is platform engineering.

To illustrate my frustration, I'd like to share a development that didn't come about through platform engineering - even though it may seem that way - but through GitOps with tools like Argo CD. This year, a new culture evolved in a large organization, and I was directly involved.

In the past, Platform Engineers/Ops worked more closely with developers than with other teams.

  • PLATFORM: the platform team delivers services.
  • Kubernetes container ship: the infrastructure/platform.
  • FEATURES: the platform enables developers to create features.
  • DEVELOPER: the developer uses the provided features and services.
  • Arrows: illustrate the flow of services (platform → infrastructure) and features (infrastructure → developer).

Now, however, a form of collaboration is emerging that I never thought would happen.

Actors:

  • Service Owner: uses dashboards.
  • Platform: delivers services.
  • Developer: uses features.
  • Provider: responsible for alerting.
  • Kubernetes container ship: the platform infrastructure.

I see service owners taking an active role in learning how to manage Grafana dashboards and Prometheus alerting as code and deploying them across different clusters using Argo CD. This increases the quality of the service as they understand how the service should be operated. It improves collaboration with developers and operations as they suddenly speak a common language (YAML).

In addition, there are service providers who provide a service across multiple clusters and as product operators now provide their external, customized alerting using the same GitOps practices (with multi-tenancy separation).

If IDPs continue to be built with a developer-centric focus, I fear that the emerging culture will crumble and we will need DevOps 3.0 to rebuild it in the future.

You should definitely take a look at the Platform Engineering Maturity Model!

Next, I'll show you the most popular portal providers, both as SaaS and self-hosted solutions, with whom I've had many conversations or received input.

What portals are there?

Here is an overview of the portals I know from my exchange:

If something is missing, please let us know!

Would you like to make a contribution to this topic? Then go ahead! → https://tag-app-delivery.cncf.io/


Artem Lajko

Artem Lajko, certified Kubestronaut and Platform Engineer at iits-consulting, specializes in GitOps and Kubernetes scalability. He's a published author of the book "Implementing GitOps with Kubernetes", co-founder of connectii.io, and IT freelancer, writing for ITNEXT on Medium. Dedicated to Open Source, Artem helps companies select suitable products, promoting tech adoption and innovation.