Principal Platform Operations Engineer
Overview
Our client is seeking a hands-on Principal Platform Operations Engineer to lead the transition from on-premises infrastructure to Azure while improving platform reliability, CI/CD maturity, and overall engineering velocity.
This is a player-coach role responsible for stabilizing environments immediately while building the long-term foundation for scalable, secure, and modern cloud-based platform operations. Cloud migration will be delivered incrementally through operational improvements such as telemetry, CI/CD, and API gateway modernization rather than as a standalone lift-and-shift project.
This individual will initially operate as a senior individual contributor (IC) while progressively building and leading a small team across cloud infrastructure, security engineering, and database operations. The security engineering function will report into this role, with that hire made once both this role and the Principal Architect role are filled.
Key Responsibilities
Telemetry & Observability (First Priority)
• Stand up telemetry and observability as the first cloud workload using Azure Monitor, Application Insights, and Log Analytics
• Implement correlation IDs across critical workflows for cross-system traceability
• Build operational dashboards and alerting for critical processes
• Establish cross-system event conventions, severity standards, and runbook conventions
Cloud Transformation & Infrastructure
• Drive Azure-first infrastructure strategy where cloud adoption is an enabler for operational improvement
• Design scalable cloud architecture patterns supporting application and data platforms
• Implement infrastructure as code standards and reusable cloud patterns
• Establish cloud governance, cost optimization, and security frameworks
• Manage environment stability across QA, staging, and production during a period where multiple architecture generations coexist in business-critical workflows
Security Execution
• Own security execution for Product & Engineering, including pen test remediation, authentication consolidation, DR/failover testing, and compliance monitoring
• Provide Product & Engineering oversight of IT infrastructure operations and hold IT accountable for security posture and operational reliability
• Define the security engineer role profile and lead that hire once the Principal Architect is also in seat
• Drive authentication consolidation roadmap across identity domains
• Lead database optimization and security remediation efforts across production database environments
Platform & DevOps Engineering
• Build and standardize CI/CD pipelines using Azure DevOps with promotion controls across engineering teams
• Implement monitoring, observability, and alerting capabilities
• Automate operational processes to reduce manual overhead
• Define and implement API gateway and edge controls using Azure API Management for ingress validation, throttling, and authentication
Engineering Enablement & Collaboration
• Partner with engineering teams to improve development workflows and platform usability
• Work closely with the Principal Architect on infrastructure patterns supporting the target-state architecture
• Collaborate with the Data & Analytics Practice Leader on infrastructure supporting the data platform
• Define platform standards supporting reliability, performance, and scalability
• Provide architectural guidance across infrastructure and deployment design decisions
Leadership Responsibilities
• Act as a player-coach while building a small platform engineering function across cloud infrastructure, security engineering, and database operations
• Establish best practices for DevOps maturity across the organization
• Help define long-term org structure across cloud, infrastructure, and platform engineering
• Establish ITSM and incident management processes, including operational runbooks for critical systems
Ideal Background
• Deep Azure cloud experience including migration from on-prem environments
• Strong hands-on DevOps engineering experience in CI/CD, infrastructure automation, and infrastructure as code
• Experience building telemetry and observability foundations using Azure Monitor, Application Insights, or equivalent tools
• Experience improving environment reliability and deployment pipelines
• Experience with security execution in regulated environments, including pen test remediation, authentication consolidation, and DR/failover testing
• Experience with ITSM and incident management tooling, along with establishing operational runbooks
• Ability to operate both strategically and tactically
• Experience bridging engineering and IT operations organizations
• Experience building or scaling platform engineering teams
• Proven ability to manage environment stability across legacy and modern applications running concurrently in production
• Experience with database administration and operations on SQL Server, PostgreSQL, or equivalent, including performance optimization and security hardening
• Background working in regulated environments preferred, such as HIPAA or SOC 2

