Frank Ittermann
Senior Platform Engineer
medium.com/@frank.ittermann_46267
linkedin.com/in/frank-ittermann
Career Profile
Platform engineer with 15+ years of experience designing and operating cloud-native infrastructure, Kubernetes platforms, and distributed systems at scale. Focused on developer productivity, self-service platforms, and site reliability. Creator of engine-ci (container-native CI/CD engine) and DuneBot (GitHub App for automated dependency management). Writes about CI/CD, platform engineering, and developer tooling on Medium.
Experience
Jul, 2024 – present
Founder & Open Source Developer
Building CI/CD tooling for container-native development.
- engine-ci — Reduced per-project CI setup from hours to minutes by building a container-native pipeline engine in Go (103 releases, 7 GitHub stars) that runs identically locally and in CI via Docker or Podman
- DuneBot — Eliminated 10+ hours/week of manual Dependabot PR triage by building a GitHub App (20 releases) that automates approvals and dependency merges
- Authored a 4-part tech blog series (12K+ reads) documenting the journey from GitHub Actions frustrations, through Dagger.io evaluation, to building engine-ci
- go-file — Published a Go file abstraction package with lazy initialization, buffer I/O, and test-friendly error injection for simplified file testing
Mar, 2023 – present
Flink SE
Senior Platform Engineer
Overseeing the production infrastructure of a cloud-native, event-driven platform serving 1M+ users across Europe on Google Kubernetes Engine (GKE) and GCP.
- Zero-downtime GKE upgrade across 15+ production clusters spanning v1.26 to v1.29 within 3 months, reducing security vulnerabilities by 40% through automated rollout strategies
- Cut infrastructure provisioning from days to minutes by designing Infrastructure as Code for 200+ GCP resources using Terraform and Config Connector
- 5x faster deployment velocity for 30+ developers by architecting a self-service platform on Argo CD and Temporal, enabling independent microservice deployments
- Saved 10+ engineering hours per week by building DuneBot, a GitHub App automating Dependabot PR approvals and dependency merges
- 50+ daily deployments across dev, staging, and production by engineering CI/CD pipelines with GitHub Actions and automated quality gates
- 35% faster incident response through improved Prometheus/Grafana monitoring dashboards and structured on-call runbooks
Feb, 2022 – Feb, 2023
Planetly GmbH
Senior Site Reliability Engineer
Managed AWS and Azure infrastructure for an enterprise-scale carbon management platform. Focused on reliability, cost optimization, and cloud migration.
- 99.95% platform uptime and 20% cloud cost reduction by managing 50+ AWS resources via Terraform with reserved instance optimization and right-sizing
- 60% faster infrastructure deployments by designing Terraform CI/CD pipelines with CircleCI and automated provisioning workflows
- Zero-data-loss migration of 30+ production workloads from AWS to Azure within 4 months, with less than 2 hours of total downtime
- Reduced mean time to detection from 30 minutes to under 5 minutes by integrating Datadog APM with custom alerting thresholds
- Standardized build environments across 4 development teams by containerizing CI/CD pipelines with Docker
Jan, 2020 – Oct, 2021
Data4Life
Team Lead / Senior Site Reliability Engineer
Led a 4-person SRE team for a health-data platform serving 500K+ users. Transitioned from IC to team lead, managing people growth, ISO 27001 compliance, and infrastructure reliability across OpenStack, Kubernetes, and hybrid-cloud.
- Zero major findings in ISO 27001 re-audit by leading a 4-person SRE team through German BSI certification, implementing compliance controls across the infrastructure stack
- 95% of planned OKRs delivered on schedule across 3 quarters by defining team roadmap, resource plans, and cross-team dependencies
- Halved junior SRE onboarding time from 6 to 3 months through a structured mentorship program covering tooling, runbooks, and incident response
- PostgreSQL cluster provisioning cut from 4 hours to 30 minutes by designing automated cluster management with Ansible playbooks
- Eliminated manual documentation drift across 200+ infrastructure components by building a CI/CD-driven living documentation system for OpenStack resources
- 25% faster incident resolution by improving Grafana dashboards and alerting rules with SLO-based monitoring
- Open source contribution: upstream fix to HashiCorp Packer that allows the OpenStack Glance image upload to be skipped (merged)
Feb, 2011 – Aug, 2017
QualityPark — AWIN AG (formerly zanox AG)
Senior Software Engineer
Led the migration of core services from legacy C/C++ to Java EE at AWIN (10M+ API requests/month); later developed enterprise requirements management solutions at QualityPark.
- 10M+ API requests/month served after migrating core services from legacy C/C++ to a Java EE platform at AWIN, enabling horizontal scaling and modern CI/CD
- 5K+ requests/second at peak as technical lead for public REST/SOAP API team, architecting OAuth-based authentication (zanox connect) for 100+ B2B/B2C integrations
- 60% reduction in legacy system dependency at QualityPark by reverse engineering database access and web service APIs from C/C++ to Java
- Increased deployment frequency from monthly to weekly by designing a CI/CD pipeline with Jenkins and automated testing
Skills
Cloud & Infrastructure:
Google Cloud Platform (GCP), Amazon Web Services (AWS), Microsoft Azure, OpenStack, Cloud-Native Architecture
Container & Orchestration:
Kubernetes (GKE, EKS, AKS), Docker, Helm, Argo CD, Kustomize
Infrastructure as Code:
HashiCorp Terraform, GCP Config Connector, Pulumi, Ansible, Packer, Vagrant
CI/CD & Automation:
GitHub Actions, CircleCI, Jenkins, Travis CI, Dependabot, DuneBot
Programming Languages:
Go (Golang), Python, Bash, Java, TypeScript / JavaScript, Zig
Observability & Monitoring:
Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack
Platform Engineering:
Internal Developer Platform (IDP), Self-Service Infrastructure, Developer Experience (DX), Temporal
Security & Compliance:
ISO 27001, HashiCorp Vault, Secrets Management, Supply Chain Security
Databases & Messaging:
PostgreSQL, MongoDB, MSSQL, Redis / Memcached
Leadership & Mentoring:
Team Leadership (4+ person teams), Technical Mentorship, Cross-Functional Collaboration, Roadmap Planning, Agile / Scrum
Education
Diploma, Applied Computer Science, Fachhochschule für Technik und Wirtschaft Berlin HTW (formerly FHTW), 2001 - 2006