Staff Site Reliability Engineer

Modern Health 

Modern Health is a mental health benefits platform for employers. We are the first global mental health solution to offer employees access to one-on-one, group, and self-serve digital resources for their emotional, professional, social, financial, and physical well-being needs—all within a single platform. Whether someone wants to proactively manage stress or treat depression, Modern Health guides people to the right care at the right time. We empower companies to help all their employees be the best version of themselves, and believe in meeting people wherever they are in their mental health journey.

We are a female-founded company backed by investors like Kleiner Perkins, Founders Fund, John Doerr, Y Combinator, and Battery Ventures. We partner with 500+ global companies like Lyft, Electronic Arts, Pixar, Clif Bar, Okta, and Udemy that are taking a proactive approach to mental health care for their employees. Modern Health has raised more than $170 million in less than two years with a valuation of $1.17 billion, making Modern Health the fastest entirely female-founded company in the U.S. to reach unicorn status. 

We tripled our headcount in 2021, and as a hyper-growth company with a fully remote workforce, we prioritize our people-first culture (winning awards including Fortune’s Best Workplaces in the Bay Area 2021). To protect our culture and help our team stay connected, we require overlapping hours for everyone. While many roles may function from anywhere in the world (see the individual job listing for details), team members who live outside the Pacific time zone must be comfortable working early in the morning or late at night; all full-time employees must work at least six hours between 8 am and 5 pm Pacific time each workday.

We are looking for driven, creative, and passionate individuals to join us in our mission. An inclusive and diverse culture is a key component of mental well-being in the workplace, and that starts with how we build our own team. If you’re excited about a role, we’d love to hear from you!

The Role

In this role, you’ll be given lots of responsibility and the opportunity to have true ownership as we build out the product. This is a unique opportunity to use your engineering powers to make a direct impact on people’s lives. We need a Staff Site Reliability Engineer who is enthusiastic about building reliable, scalable, and flexible systems to support our growing team, product, and user base. You’ll work with other engineers to reliably release and maintain services, and help define and meet internal and customer-facing SLAs and SLOs.

This position is not eligible to be performed in Hawaii.

What You’ll Do

  • Manage and orchestrate Cloud Resource (AWS) configuration using Infrastructure as Code (Terraform) to empower engineering staff to embrace a DevOps culture of Self-Service Ownership
  • Develop and govern Observability (Datadog) best practices for tracking platform performance and health trends to meet customer SLAs and lead technical decisions with strong supporting evidence
  • Create solutions that dynamically scale based on demand, with enough flexibility to pivot for fast-changing project requirements while maintaining a balance of good versus perfect
  • Communicate technical progress and blockers clearly and consistently to keep stakeholders informed, and document technical designs to spread knowledge and reduce information silos
  • Participate in the 24/7 on-call rotation, respond to critical alerts, and follow documented incident investigation procedures to restore customer-facing feature availability
  • Maintain HIPAA, GDPR, and SOC 2 compliance and general security through best practice implementation

Who You Are

  • 8+ years of experience in software engineering with 4+ years of experience in DevOps
  • Cloud Provider (AWS, GCP, Azure) experience managing resources through Infrastructure as Code (Terraform)
  • Container Orchestration (ECS or K8s) experience to confidently build, test, and release containerized applications for multiple environments and regions
  • Knowledge of Observability best practices across common cloud resources (EC2, ECS, RDS, DynamoDB, S3, SQS, EventBridge) with experience rolling out enhancements across a distributed platform with scale in mind
  • Experience with shell scripting for *nix systems
  • Experience with Networking for web applications
  • Effective at communicating ideas through writing and diagramming
  • Comfortable working with a distributed development and ops team
  • Familiarity with AWS: ECS and cloud hosting; GitLab: CI/CD; Python: Django, Flask, aiohttp; Bash; Data: PostgreSQL, Redis; Monitoring: Datadog and Sentry; IaC: Terraform, Packer

Benefits

Fundamentals:

  • Medical / Dental / Vision / Disability / Life Insurance 
  • High Deductible Health Plan with Health Savings Account (HSA) option
  • Flexible Spending Account (FSA)
  • Access to coaches and therapists through Modern Health’s platform
  • Generous Time Off 
  • Company-wide Collective Pause Days