Tech Stack
Responsibilities
- Design and evolve reliability architecture for distributed and cloud-hosted systems.
- Define and implement SRE best practices, including SLIs, SLOs, error budgets, and capacity planning.
- Lead incident response processes including on-call rotations, escalation, and post-incident reviews.
- Design and maintain observability systems for metrics, logging, tracing, and alerting.
- Build automation to improve system reliability, deployment safety, and recovery processes.
Soft Skills
- Incident Response
- Cross-Functional Collaboration
- Chaos Engineering
Benefits
- Health Insurance
- Dental
- Vision
- Life Insurance
- 401k
- Unlimited PTO
- Equity
- Parental Leave
- Remote Stipend
- Gym/Wellness
Culture
- Mission-Driven
- Innovation
- Integrity
- Transparent Leadership
- Work-Life Balance
Requirements
Regions: US
About HavocAI
Industry: defense
Size: startup
HavocAI is a leader in collaborative autonomy, specializing in autonomous surface vessels for defense and commercial maritime missions, focused on innovation to prevent conflict and save lives.
Compensation
Equity: Equity Package