Key Takeaways
- A unified configuration layer abstracts infrastructure, CI/CD, and operational complexity, allowing developers to concentrate on application development.
- A single configuration model per service enables shift-left FinOps by validating resource limits at YAML authoring time.
- Independent CI pipelines feeding a centralized CD pipeline balance team autonomy with consistent deployment practices.
- Centralizing application and infrastructure intent in one configuration makes reviews more effective and predictable.
- This approach delivers visibility and enables a customized internal developer platform aligned with organizational compliance requirements.
Developers today have to deal with an overwhelming number of different, complicated tools.
Managing Kubernetes, cloud resources, security checks, and deployments across different environments requires significant time and expertise. Platform engineering aims to address this problem by making infrastructure more straightforward to use.
The Problem: Too Much for Developers to Learn
Modern application deployment forces developers to learn many different tools and concepts, including:
- Writing Kubernetes manifests for deployments, services, ingress, and autoscaling.
- Creating cloud resources using SDKs, APIs, or Infrastructure as Code tools like Terraform requires knowledge of cloud services, security models, networking, and cost implications.
- Setting up CI/CD pipelines with build, test, security, and promotion stages.
- Managing secrets and credentials consistently across environments.
Each of these areas is individually manageable, but together they create a steep learning curve. Developers must constantly context-switch between application logic and infrastructure concerns, which slows delivery and increases the likelihood of misconfiguration.
Without centralized guardrails, teams often compensate by over-allocating resources “to be safe”, leading to inconsistent environments and unnecessary cloud spend that is only discovered after deployment.
Bridging the Gap: Why Abstraction Is Necessary
The core challenge is not a lack of tools. Modern platforms provide many ways to build applications, provision cloud resources, configure CI/CD pipelines, manage secrets, and deploy to Kubernetes. The challenge lies in the fragmentation of these responsibilities across different tools, files, and layers of abstraction.
SDKs, APIs, Terraform modules, pipeline definitions, Kubernetes manifests, and environment-specific configurations are all powerful in isolation. However, they expose low-level details and require context across the entire delivery lifecycle. Expecting every application developer to understand and coordinate all of these concerns, alongside writing and testing application code, does not scale, particularly in organizations operating large numbers of microservices.
What is missing is a developer-friendly abstraction that brings these related concerns together. Developers need a way to express intent (not only what infrastructure is required, but also how the application should be built, deployed, configured across environments, secured, and sized) without having to implement the mechanics of each underlying system.
From a platform engineering perspective, this abstraction represents the core of an internal developer platform and can be implemented as a lightweight Python-based platform framework.
A Possible Solution: A Declarative Platform Framework
So what does this abstraction look like in practice?
Rather than introducing another portal or asking developers to learn yet another API, this approach starts from a simpler observation: Developers already work with configuration files every day. The question is whether that familiarity can be extended to unify how applications are built, deployed, configured, and operated.
In this model, a single declarative configuration becomes the primary interface between developers and the delivery system. It captures application intent across the full lifecycle, CI/CD behavior, environment-specific configuration, secrets integration, resource sizing, autoscaling, and Kubernetes placement, while leaving the mechanics of execution to the platform engineering layer.
Behind the scenes, the platform framework consumes this configuration and coordinates the necessary actions across CI/CD pipelines, infrastructure automation, and Kubernetes deployments. For the developer, the interaction remains focused and predictable; for the organization, delivery becomes consistent, reviewable, and policy-aware.
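As a rough illustration only, the entry point of such a framework might look like the following Python sketch. It assumes PyYAML is available; the field names mirror the example configuration shown later in this article, and the handler names are purely illustrative, not part of any specific product.

# platform_cli.py - illustrative sketch of a config-driven delivery framework.
# Assumes PyYAML (pip install pyyaml); all action/handler names are hypothetical.
import sys
import yaml


def load_config(path: str) -> dict:
    """Read the service's declarative configuration file."""
    with open(path) as fh:
        return yaml.safe_load(fh)


def plan_actions(config: dict) -> list:
    """Translate declared intent into the platform actions to run."""
    app = config["application"]
    actions = [f"build-and-test:{app['name']}:{app['runtime']}"]
    resources = config.get("resources", {})
    if "azure" in resources:
        actions.append("terraform-apply:azure-resources")
    if "kubernetes" in resources:
        actions.append("helm-upgrade:kubernetes-deployment")
    if config.get("deployment", {}).get("tool") == "puppet":
        actions.append("puppet-run:environment-configuration")
    return actions


if __name__ == "__main__":
    cfg = load_config(sys.argv[1] if len(sys.argv) > 1 else "service.yaml")
    for action in plan_actions(cfg):
        print(action)  # in a real pipeline these would trigger CI/CD jobs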
Let’s discuss an example in which a developer needs to deploy a microservice that requires Azure Storage, secure credentials, specific memory and CPU in Kubernetes, auto-scaling rules, dedicated node pools, and a custom web address.
In many organizations, these concerns are implemented across separate repositories and codebases. Cloud resources are defined in dedicated Terraform repositories, Kubernetes deployments and autoscaling rules are maintained in Helm chart repositories, CI/CD behavior is controlled through pipeline definitions, and environment-specific deployment logic may live in separate configuration repositories or deployment tooling such as Puppet modules. Secrets are often managed independently through external secret stores or pipeline variables.
A practical way to make this manageable is to centralize the developer’s input in a single YAML file that lives alongside the service code. That file becomes the authoritative description of the service across environments. It captures infrastructure dependencies implemented in Terraform repositories, Kubernetes runtime settings rendered as Helm configuration, deployment behaviors implemented through Puppet modules, and CI/CD stages executed via shared pipeline templates. Instead of asking developers to modify each of those repositories directly, the YAML acts as a single entry point that the platform tooling interprets and applies consistently.
application:
  name: payment-service
  runtime: python:3.11
resources:
  kubernetes:
    cpu: 500m        # Maximum: 2000m (validated by schema)
    memory: 1Gi      # Maximum: 4Gi (validated by schema)
    replicas: 3
    autoscaling:
      enabled: true
      minReplicas: 3
      maxReplicas: 10
      targetCPUUtilization: 70
    nodePool: frontend
  azure:
    storage:
      - name: payment-receipts
        type: blob
        tier: hot
    keyvault:
      secrets:
        - name: stripe-api-key
          source: ENV_STRIPE_KEY
networking:
  hostname: payments.example.com
  ingress:
    tls: enabled
deployment:
  tool: puppet
  environments:
    - development
    - staging
    - production
YAML should be the developer-facing interface because it offers a pragmatic balance of familiarity, readability, and automation. Most developers already work with YAML through Kubernetes and CI/CD systems, making it a low-friction way to express application intent that fits naturally into version control and code review workflows. Alternatives such as self-service portals (e.g., internal service catalogs or UI-based provisioning tools), custom APIs, or more expressive configuration languages can provide stronger typing or richer abstractions, but often come with higher adoption costs or reduced transparency. YAML also has drawbacks, some of which we discuss later, including the risk of configuration sprawl and limited native validation. That is why schema checks and explicit versioning are essential: they allow teams to track changes, reason about impact, and evolve the platform safely over time.
This one file is the single source of truth. It drives automated pipelines that handle everything from building and testing code to creating infrastructure and deploying the application in one coordinated run. Because everything is deployed to Kubernetes, managing multiple microservices becomes straightforward: each service gets its own configuration file with appropriate resource limits, scaling rules, and node pool assignments.
Platform Architecture
The platform comprises several interconnected components. GitLab pipelines coordinate everything, pulling code from repositories, building and unit testing applications (with tests written by developers), checking security, creating cloud infrastructure with Terraform/IaC, and deploying to Kubernetes clusters with Puppet configuration management. The configuration YAML file controls all of this, telling each component what to do.
The architecture clearly separates concerns: the CI pipeline handles code building, testing, and vulnerability scanning, while the CD pipeline handles deployment: creating cloud resources, updating Kubernetes, and configuring environments. Schema validation runs first, at YAML authoring time, catching configuration errors and functional issues such as resource over-allocation immediately (shift left).
Figure 1: Product life cycle from commit to deployment
Why Kubernetes Makes this Better
Deploying everything to a Kubernetes cluster provides several key advantages:
- Easier Microservice Management
Kubernetes is designed to run many microservices together. Each microservice can be managed independently with its own configuration while still working together as part of a complete application.
- Automatic Scaling
When traffic changes, Kubernetes scales workloads automatically, optimizing costs. All of this is controlled right in the configuration file (see the sketch after this list), which brings FinOps practice into the workflow.
- Dedicated Node Pools
Some services need more memory, while others need more CPU. Services and applications can be assigned to specific groups of servers (node pools) that match their needs. For example, a memory-hungry back-end application can run on memory-optimized node SKUs, while a web API runs on standard nodes. This approach also extends to spot node pools (cheaper nodes) in lower environments, where application reliability is less critical and nodes may be reclaimed by the cloud provider with little warning, potentially saving thousands of dollars.
- Better Code Reviews
When everything is in a single file, reviewers can easily see if someone requested too much memory or CPU. They can check scaling settings, node pool assignments, and all infrastructure in one place instead of hunting through multiple files.
- Cost Control Through Schema Validation
The platform includes schema checks that stop developers from requesting more resources than allowed. If someone tries to request 10 GiB of memory when the maximum is 4 GiB, the validation fails immediately when the file is created. This shift-left approach catches resource waste before it happens, making FinOps part of the standard development process.
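To make the autoscaling and node pool items concrete, here is a hedged sketch of how a platform layer might render those fields into Kubernetes objects. It assumes PyYAML; the API versions follow upstream Kubernetes, while the node label key (agentpool, common on AKS) and the exact mapping are illustrative rather than this platform's prescribed output.

# Illustrative rendering of the autoscaling and nodePool fields into
# Kubernetes objects; assumes PyYAML. Not the only possible mapping.
import yaml

k8s_cfg = {
    "cpu": "500m", "memory": "1Gi", "replicas": 3, "nodePool": "frontend",
    "autoscaling": {"enabled": True, "minReplicas": 3,
                    "maxReplicas": 10, "targetCPUUtilization": 70},
}

def render_hpa(name: str, cfg: dict) -> dict:
    """Build a HorizontalPodAutoscaler from the declared autoscaling block."""
    a = cfg["autoscaling"]
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1",
                               "kind": "Deployment", "name": name},
            "minReplicas": a["minReplicas"],
            "maxReplicas": a["maxReplicas"],
            "metrics": [{
                "type": "Resource",
                "resource": {"name": "cpu",
                             "target": {"type": "Utilization",
                                        "averageUtilization": a["targetCPUUtilization"]}},
            }],
        },
    }

# nodePool becomes a nodeSelector on the Deployment's pod template
# (the label key is cloud/cluster specific; "agentpool" is common on AKS).
node_selector = {"agentpool": k8s_cfg["nodePool"]}

print(yaml.safe_dump(render_hpa("payment-service", k8s_cfg), sort_keys=False))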
How the Pipeline Works: CI and CD Separation
The platform splits CI and CD using multi-project pipelines. This separation gives several benefits:
- CI Pipeline Job
The CI pipeline only works with code. It builds the application, runs tests, checks for security problems, and creates a packaged artifact. The result is a tested, secure, versioned container image ready to deploy anywhere.
- CD Pipeline Job
The CD pipeline takes what CI created and handles deployment and infrastructure. It reads the configuration file and makes the necessary changes: creating cloud resources if needed, deploying the application to the Kubernetes cluster using Helm charts, and using a configuration management tool like Puppet for environment setup and deployment.
This split helps each pipeline perform its job better. CI pipelines run fast on every code change, giving quick feedback. CD pipelines run less often, sometimes needing approvals, and focus on making infrastructure changes across environments.
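As one possible implementation of the hand-off between the two projects, the CI pipeline can trigger the downstream CD project through GitLab's pipeline trigger API. The sketch below assumes the requests library and a trigger token exposed to the job; the CD_PROJECT_ID and CD_TRIGGER_TOKEN variable names are examples, not something GitLab prescribes.

# Illustrative hand-off from CI to the downstream CD project using
# GitLab's pipeline trigger API. Assumes `requests` is installed and a
# trigger token is exposed to the job; variable names are examples only.
import os
import requests

GITLAB_URL = os.environ.get("CI_SERVER_URL", "https://gitlab.example.com")
CD_PROJECT_ID = os.environ["CD_PROJECT_ID"]      # numeric ID of the CD project
TRIGGER_TOKEN = os.environ["CD_TRIGGER_TOKEN"]   # pipeline trigger token

resp = requests.post(
    f"{GITLAB_URL}/api/v4/projects/{CD_PROJECT_ID}/trigger/pipeline",
    data={
        "token": TRIGGER_TOKEN,
        "ref": "main",
        # pass the tested image tag so CD deploys exactly what CI produced
        "variables[IMAGE_TAG]": os.environ.get("CI_COMMIT_SHORT_SHA", "latest"),
        "variables[SERVICE_CONFIG]": "service.yaml",
    },
    timeout=30,
)
resp.raise_for_status()
print("Triggered CD pipeline:", resp.json().get("web_url"))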
Schema Validation: Catching Problems Early
One of the most powerful features is schema validation, which runs before anything else. The schema defines rules like:
- Maximum CPU: 2000m (2 cores)
- Maximum Memory: 4Gi
- Maximum replicas: 20
- Allowed node pools: standard, high-memory, high-cpu
- Required fields: application name, runtime, resource limits
When a developer creates or updates their configuration file, the schema validation runs immediately. If they try to request 5GiB of memory (more than the 4 GiB limit), the validation fails with a clear error message. This validation happens during YAML creation, not during deployment, saving time and preventing waste.
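A minimal sketch of such validation, assuming the PyYAML and jsonschema packages, might look like the following. The limits mirror the rules listed above; any real organization's schema will differ, and the unit-parsing helpers are illustrative.

# Minimal schema-validation sketch (assumes `jsonschema` and PyYAML).
# Structural rules live in the schema; CPU/memory ceilings need a small
# parser because they are written as Kubernetes quantities.
import sys
import yaml
from jsonschema import validate, ValidationError

SCHEMA = {
    "type": "object",
    "required": ["application", "resources"],
    "properties": {
        "application": {"type": "object", "required": ["name", "runtime"]},
        "resources": {
            "type": "object",
            "required": ["kubernetes"],
            "properties": {
                "kubernetes": {
                    "type": "object",
                    "required": ["cpu", "memory"],
                    "properties": {
                        "replicas": {"type": "integer", "maximum": 20},
                        # allowed pool names are organization-specific;
                        # this list mirrors the rules above
                        "nodePool": {"enum": ["standard", "high-memory", "high-cpu"]},
                    },
                },
            },
        },
    },
}

def cpu_millicores(value: str) -> int:
    return int(value[:-1]) if value.endswith("m") else int(float(value) * 1000)

def memory_mebibytes(value: str) -> int:
    units = {"Mi": 1, "Gi": 1024}
    for suffix, factor in units.items():
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * factor)
    raise ValueError(f"unsupported memory unit in {value!r}")

def check_limits(k8s: dict) -> None:
    """Enforce numeric ceilings that the schema alone cannot express cleanly."""
    if cpu_millicores(k8s["cpu"]) > 2000:
        raise ValueError("CPU request exceeds the 2000m limit")
    if memory_mebibytes(k8s["memory"]) > 4096:
        raise ValueError("memory request exceeds the 4Gi limit")

with open(sys.argv[1]) as fh:
    config = yaml.safe_load(fh)
try:
    validate(instance=config, schema=SCHEMA)
    check_limits(config["resources"]["kubernetes"])
except (ValidationError, ValueError) as err:
    sys.exit(f"Configuration rejected: {err}")
print("Configuration accepted")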
Smart Infrastructure Creation
When developers specify cloud resources in their configuration file, the CD pipeline checks if these resources exist and creates or updates them as needed. For example, if a developer adds Azure Storage:
azure:
  storage:
    - name: user-uploads
      type: blob
      tier: hot
      retention_days: 90
The pipeline’s Terraform part:
- Checks whether the storage account already exists in Azure (state tracking that Terraform provides out of the box).
- Creates or updates it based on the settings specified.
- Saves connection information, such as the storage account connection string or access key, in Azure Key Vault (again, controlled from the YAML).
This approach removes the need for developers and operations to coordinate manually. Developers write what they need; the platform orchestrates it for them.
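As a hedged sketch of this step, the CD pipeline could translate the azure.storage block into Terraform input variables and let Terraform converge the real state. This assumes the terraform CLI is available on the runner and that a module accepting a storage_accounts variable already exists; both the variable name and the file names are illustrative.

# Illustrative CD step: turn the declared azure.storage block into
# Terraform input variables and let Terraform converge actual state.
# Assumes the terraform CLI is on the runner and an existing module
# that accepts a `storage_accounts` variable; names are examples only.
import json
import subprocess
import yaml

with open("service.yaml") as fh:
    config = yaml.safe_load(fh)
storage = config["resources"].get("azure", {}).get("storage", [])

with open("service.auto.tfvars.json", "w") as fh:
    json.dump({"storage_accounts": storage}, fh, indent=2)

# Terraform itself decides whether to create or update the resources;
# `plan` surfaces the diff for review before `apply` runs in the pipeline.
subprocess.run(["terraform", "init", "-input=false"], check=True)
subprocess.run(["terraform", "plan", "-input=false", "-out=tfplan"], check=True)
subprocess.run(["terraform", "apply", "-input=false", "tfplan"], check=True)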
Built-In Security Checks
Security is not optional; it runs in every pipeline. The CI stage includes:
- Checking Dependencies
Automatically scans third-party packages for known security problems. If problems are found, the pipeline stops and won’t create the deployment package.
- Code Security Testing
Analyzes code to find security issues, hardcoded passwords, and potential vulnerabilities.
- Container Image Scanning
Checks container images for operating system security vulnerabilities and ensures base images are safe.
By automating these checks, security becomes part of the normal development process rather than a separate approval step. This is especially critical in regulated industries, where early, consistent enforcement of security controls during CI reduces audit risk, shortens review cycles, and prevents non-compliant changes from ever reaching shared or production environments.
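As an illustration only, these scans could be wired into the CI stage roughly as follows, assuming pip-audit (dependency scanning) and trivy (image scanning) are installed on the runner; the specific tools, image name, and severity thresholds are examples rather than a prescription.

# Illustrative CI security gate. Assumes `pip-audit` and `trivy` are
# installed on the runner; tools and thresholds are examples only.
# Any non-zero exit code fails the pipeline stage.
import subprocess
import sys

IMAGE = "registry.example.com/payment-service:latest"  # hypothetical image

checks = [
    # scan Python dependencies for known vulnerabilities
    ["pip-audit", "--requirement", "requirements.txt"],
    # scan the built container image, failing on high/critical findings
    ["trivy", "image", "--exit-code", "1", "--severity", "HIGH,CRITICAL", IMAGE],
]

for cmd in checks:
    print("Running:", " ".join(cmd))
    if subprocess.run(cmd).returncode != 0:
        sys.exit(f"Security check failed: {cmd[0]}")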
Simplifying Kubernetes
For developers, Kubernetes can be overwhelming; this solution hides that complexity while keeping the flexibility.
For microservices, this is especially powerful. A team managing ten microservices can keep ten simple configuration files instead of dozens of Kubernetes manifest files. Each microservice clearly states its needs:
# Service 1: Web API - standard resources
resources:
  kubernetes:
    cpu: 250m
    memory: 512Mi
    replicas: 5
    autoscaling:
      enabled: true
      minReplicas: 5
      maxReplicas: 20
    nodePool: standard

# Service 2: Data processor - high memory
resources:
  kubernetes:
    cpu: 1000m
    memory: 4Gi
    replicas: 2
    autoscaling:
      enabled: true
      minReplicas: 2
      maxReplicas: 8
    nodePool: high-memory
For web routing, developers just need to specify a hostname:
networking:
  hostname: api.payments.example.com
The platform’s deployment pipeline:
- Creates Kubernetes Ingress resources
- Sets up TLS certificates
- Updates DNS records if connected to cloud DNS
- Applies traffic rules based on company standards
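For illustration, the rendered Ingress might look roughly like the following sketch (assuming PyYAML). The ingress class, TLS secret name, backend service name, and port are organization-specific placeholders, not values the platform mandates.

# Illustrative rendering of the `networking` block into an Ingress
# manifest (assumes PyYAML). Class, TLS secret, and port are placeholders.
import yaml

networking = {"hostname": "api.payments.example.com", "ingress": {"tls": "enabled"}}
service_name = "payment-service"

ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": service_name},
    "spec": {
        "ingressClassName": "nginx",
        "tls": [{"hosts": [networking["hostname"]],
                 "secretName": f"{service_name}-tls"}],
        "rules": [{
            "host": networking["hostname"],
            "http": {"paths": [{
                "path": "/",
                "pathType": "Prefix",
                "backend": {"service": {"name": service_name,
                                        "port": {"number": 80}}},
            }]},
        }],
    },
}
print(yaml.safe_dump(ingress, sort_keys=False))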
Deploying to Multiple Environments
The platform supports deploying to multiple environments, such as development, staging, and production, with environment-specific configuration. In this example, Puppet is used to apply those settings, but the same pattern works equally well with GitOps-based solutions (e.g., GitLab CD) or other configuration management tools (e.g., Ansible, Chef). The approach helps solve a common challenge: maintaining consistency across environments while still allowing necessary differences, such as database endpoints, API credentials, and scaling parameters.
When the CD pipeline deploys to an environment, it:
- Runs the Puppet module, which is parameterized to take environment-specific values.
- Passes application settings from the configuration file.
- Allows Puppet to handle system settings and static fields, such as the Kubernetes cluster name, Azure Key Vault name, and runtime arguments for deployments.
- Keeps each environment separate while reusing the deployment logic and maintaining its desired state.
This approach uses the best of both tools. Kubernetes handles application scaling and management, while Puppet makes sure system settings are consistent across all servers.
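A simple sketch of that environment overlay, with purely illustrative values, could be a plain dictionary merge performed before the Puppet run; the resulting settings would then be passed to Puppet as parameters or Hiera-style data rather than printed.

# Illustrative environment overlay: shared defaults merged with
# per-environment overrides before invoking the deployment tool.
# Structure and values are examples only. Assumes PyYAML.
import yaml

defaults = {
    "cluster": "aks-shared",
    "keyvault": "kv-shared",
    "replicas": 3,
}

overrides = {
    "development": {"cluster": "aks-dev", "keyvault": "kv-dev", "replicas": 1},
    "production": {"cluster": "aks-prod", "keyvault": "kv-prod"},
}

def resolve(environment: str) -> dict:
    """Return the effective settings for one environment."""
    merged = dict(defaults)
    merged.update(overrides.get(environment, {}))
    return merged

for env in ("development", "staging", "production"):
    print(env, "->", yaml.safe_dump(resolve(env), default_flow_style=True).strip())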
Challenges
While the approach described in the article helps a lot, there are some challenges:
- Limits of Simplification
Very specialized applications may require settings beyond those provided by the simple configuration file. Platform teams need to balance simplicity with customization for advanced or exceptional use cases. The configuration file can evolve in response to new infrastructure requirements or developer demands.
- Schema Maintenance
As needs evolve, the validation schema must be updated. If it is too restrictive, developers will feel blocked. If it is too loose, cost control suffers and FinOps practices may take a hit. Finding the right balance requires ongoing refinement.
- Pipeline Complexity
Multi-stage pipelines that create infrastructure and handle deployments can become complex as more applications are added. One needs to ensure that these pipelines can scale and that the pipeline infrastructure (e.g., GitLab runners) has the capacity to accommodate them. The deployment pipelines are the busiest in this solution, so they need to be revisited periodically to optimize performance and prevent slowdowns. Good error messages, detailed logs, and troubleshooting guides are also essential.
- Secret Safety
While managed secret stores provide a secure foundation by ensuring secrets are rotated regularly, accessed safely, and never exposed in logs, careful CI/CD and deployment design is still required.
Measuring Success
Platform engineering should be measured by how it improves the developer experience:
- Time to First Deployment: How long does it take new developers to deploy their first application?
- Deployment Speed: Are teams deploying more often?
- Developer Happiness: Regular surveys to see if the platform actually makes work easier.
- Cost Efficiency: Are teams using resources more efficiently? Are over-allocation incidents decreasing?
- Review Speed: Did consolidating the configuration into a single file reduce code review time?
- Scalability: Did the approach scale as adoption grew, maintaining performance as more teams used the platform?
In my experience, deployment times went from hours to minutes, and developers shipped features about forty percent faster after adopting the platform. Resource over-allocation dropped by sixty percent through schema validation, directly improving cloud costs.
Conclusion
As companies grow their cloud operations, this model becomes more effective. Instead of every team solving infrastructure and deployment problems separately, platform engineering provides a “golden path”: a tested, automated, continuously improved way of working that helps the whole organization deliver faster while controlling costs.
The future isn’t about making every developer an infrastructure expert; it’s about building platforms that make infrastructure invisible, operations automatic, and cost control built in, so developers can focus on writing great code.
