Lead, DevOps Support Engineering - Troy, MI
Posted: 05/11/2025
We are seeking a Lead DevOps Support Engineer to drive automation, system integration, troubleshooting, and monitoring improvements. As a Project Lead, you will play a key role in optimizing DevOps workflows, enhancing system observability, and ensuring seamless incident resolution.
This is a technical leadership role (not a managerial position) that requires hands-on expertise in automating support processes, integrating infrastructure components, improving monitoring dashboards, and creating and tracking tasks for L2 DevOps support engineers. #LI-Hybrid
Key Responsibilities:
Automation & Integration
- Design and implement automated CI/CD pipelines tailored to embedded software workflows
- Integrate build systems (e.g., Make, CMake, Bazel) into CI pipelines
- Configure pipelines for cross-compilation targeting various hardware architectures (ARM, RISC-V, etc.)
- Automate firmware packaging and secure signing
- Set up deployment mechanisms for over-the-air (OTA) or USB/SD card updates
- Validate firmware integrity post-deployment using checksums or digital signatures
- Automate L2 support processes, incident resolution, and infrastructure management
- Develop and maintain scripts and automation tools to enhance efficiency and reduce manual work
- Ensure seamless integration between infrastructure, CI/CD pipelines, and monitoring solutions
- Optimize deployment processes and automate recurring operational tasks
Troubleshooting & Support
- Lead DevOps L2 incident response, diagnosing and resolving infrastructure and application issues
- Perform root cause analysis and implement proactive fixes to prevent recurring incidents
- Work closely with L1 and L3 teams to streamline support escalations and improve response times
- Troubleshoot Kubernetes, cloud infrastructure, networking, and deployment failures
Monitoring & Dashboards
- Design, configure, and optimize monitoring and logging dashboards (Prometheus, Grafana, ELK, etc.)
- Improve alerting mechanisms to enhance observability and reduce noise
- Ensure system performance metrics are effectively tracked and visualized for proactive incident management
Process Optimization & Escalation Management
- Define and optimize support workflows for efficient issue resolution
- Establish escalation routes to ensure timely handling of critical incidents
- Evaluate risks associated with deployments and infrastructure changes, implementing mitigation strategies
- Assist in QA validation of infrastructure changes and automation scripts
Required Skills & Experience:
- 5+ years of experience in DevOps, SRE, or L2 technical support roles
- Experience with automated CI/CD pipelines tailored to embedded software workflows
- Experience creating and tracking tasks for L2 DevOps engineers to drive operational efficiency
- Strong expertise in automating support processes and troubleshooting complex systems
- Proficiency in scripting (Bash, Python, or similar) for automation and monitoring
- Hands-on experience with monitoring and logging tools (Prometheus, Grafana, ELK, Datadog, etc.)
- Solid understanding of CI/CD pipelines, infrastructure components, and cloud services (AWS, GCP, or Azure)
- Experience with containerized environments (Docker, Kubernetes) and troubleshooting containerized applications
- Strong analytical skills for root cause analysis, incident resolution, and risk assessment