Site Reliability Engineer Remote Jobs

49 Results

2d

Site Reliability Engineer

Cruise
US Remote
Bachelor's degree, terraform, postgres, Design, ansible, azure, c++, docker, elasticsearch, typescript, kubernetes

Cruise is hiring a Remote Site Reliability Engineer

We're Cruise, a self-driving service designed for the cities we love.

We’re building the world’s most advanced self-driving vehicles to safely connect people to the places, things, and experiences they care about. We believe self-driving vehicles will help save lives, reshape cities, give back time in transit, and restore freedom of movement for many.

In our cars, you’re free to be yourself. It’s the same here at Cruise. We’re creating a culture that values the experiences and contributions of all of the unique individuals who collectively make up Cruise, so that every employee can do their best work. 

Cruise is committed to building a diverse, equitable, and inclusive environment, both in our workplace and in our products. If you are looking to play a part in making a positive impact in the world by advancing the revolutionary work of self-driving cars, come join us. Even if you might not meet every requirement, we strongly encourage you to apply. You might just be the right candidate for us.

What you’ll be doing:
 

  • Build platforms, services, or tools that enable engineers to deliver operationally mature services in production

  • Help teams with automating tedious tasks and enable them to quickly launch new services and execute optimally

  • Work with service owners to have a proactive approach to designing tests, observing results and creating fixes for complex failure scenarios.

  • Help service owners identify and instrument Service Level Objectives and design alerts that follow best practices

  • Facilitate blameless postmortems and drive effective action items
     

What you must have: (These are must-haves; skills that someone must have entering the role on day 1 and required for hiring)
 

  • Senior-level experience as a Systems Engineer, Site Reliability Engineer, or Production Engineer

  • Significant experience with cloud platforms such as Google Cloud Platform, Microsoft Azure, or Amazon Web Services

  • Fluent with one or more programming languages such as Go, C#, or Python.

  • Excellent ability to debug and optimize code 

  • Experience with Service ecosystem or Developer Platform like Backstage

  • Ability to coordinate and manage incident response

  • Skills in defining and instrumenting SLOs and SLIs using query languages and observability tooling

  • Automate tasks and processes with open source tools

  • Experience with streaming and database technologies such as Postgres, Kafka, Cassandra, Elasticsearch, etc.

Bonus points! 

  • Previous experience as an SRE or System Engineer

  • Previous Experience with Backstage

  • Previous experience with FireHydrant or another incident management tool

  • Experience with TypeScript

  • Familiarity with Chef, Puppet, Ansible or other Configuration Management Tooling

  • Familiarity with Kubernetes, Docker, Go, Istio, Terraform, Vault, Google Cloud

Why Cruise?

Our benefits are here to support the whole you:

  • Competitive salary and benefits 
  • Medical / dental / vision, Life and AD&D
  • Subsidized mental health benefits
  • Paid time off and holidays
  • Paid parental, medical, family care, and military leave of absence
  • 401(k) Cruise matching program 
  • Fertility benefits
  • Dependent Care Flexible Spending Account
  • Flexible Spending Account & Health Saving Account
  • Perks Wallet program for benefits/perks
  • Pre-tax Commuter benefit plan for local employees
  • CruiseFlex, our location-flexible work policy. (Learn more about CruiseFlex).

We’re Integrated

  • Through our partnerships with General Motors and Honda, we are the only self-driving company with fully integrated manufacturing at scale.

We’re Funded

  • GM, Honda, Microsoft, T. Rowe Price, and Walmart have invested billions in Cruise. Their backing for our technology demonstrates their confidence in our progress, team, and vision and makes us one of the leading autonomous vehicle organizations in the industry. Our deep resources greatly accelerate our operating speed.

Cruise LLC is an equal opportunity employer. We strive to create a supportive and inclusive workplace where contributions are valued and celebrated, and our employees thrive by being themselves and are inspired to do the best work of their lives. We seek applicants of all backgrounds and identities, across race, color, caste, ethnicity, national origin or ancestry, age, citizenship, religion, sex, sexual orientation, gender identity or expression, veteran status, marital status, pregnancy or parental status, or disability. Applicants will not be discriminated against based on these or other protected categories or social identities. Cruise will consider for employment qualified applicants with arrest and conviction records, in accordance with applicable laws.

Cruise is committed to the full inclusion of all applicants. If reasonable accommodation is needed to participate in the job application or interview process, please let our recruiting team know or email HR@getcruise.com.

We proactively work to design hiring processes that promote equity and inclusion while mitigating bias. To help us track the effectiveness and inclusivity of our recruiting efforts, please consider answering the following demographic questions. Answering these questions is entirely voluntary. Your answers to these questions will not be shared with the hiring decision makers and will not impact the hiring decision in any way. Instead, Cruise will use this information not only to comply with any government reporting obligations but also to track our progress toward meeting our diversity, equity, inclusion, and belonging objectives.

Candidates applying for roles that operate and remotely operate the AV: Licensed to drive a motor vehicle in the U.S. for the three years immediately preceding your application, currently holding an active in-state regular driver’s license or equivalent, and no more than one point on driving record. A successful completion of a background check, drug screen and DMV Motor Vehicle Record check is also required.

Note to Recruitment Agencies: Cruise does not accept unsolicited agency resumes. Furthermore, Cruise does not pay placement fees for candidates submitted by any agency other than its approved partners.

Apply for this job

7d

Senior Site Reliability Engineer (SRE)

CLEAR - Corporate
New York, New York, United States (Hybrid)
Design, java

CLEAR - Corporate is hiring a Remote Senior Site Reliability Engineer (SRE)

Today, CLEAR is well-known as a leader in digital and biometric identification, reducing friction for our members wherever an ID check is needed. We’re looking for a Senior Site Reliability Engineer (SRE) to establish our SRE function. You will join us to accelerate building and scaling our innovative systems that support our growing identity platform. You will drive on SLOs, using them to find and fix gaps in our observability and our overall systems. You will lead reliability-focused practices such as load testing, capacity planning, game days, chaos testing, and incident post-mortems. You will work hand-in-hand with the Software Engineering and Product team on the design, architecture, and implementation of new systems and services.


What You Will Do:

  • Embed within an Engineering and Product pillar to deeply understand the product and implement observability across all key flows
  • Facilitate and build load testing cases, ensuring we understand the limits and scaling factors of our services and systems
  • Contribute to architecture and design of new services and systems, ensuring highly reliable and scalable concepts are implemented
  • Work closely with Infrastructure, Developer Experience, Networking, and other teams to ensure Product Engineering requirements are met on future roadmaps and technical implementations
  • Build and lead practices such as game days, chaos engineering, and failure analysis
  • Build long-term capacity plans, with an eye toward reliability and cost-efficiency

Who You Are:

  • A software engineer who has worked as an embedded Site Reliability Engineer
  • Experience writing production-grade software in a modern language, such as Java
  • Strong knowledge of distributed systems concepts (think CAP theorem), microservices architecture, and distributed tracing 
  • Experience with modern observability systems such as Datadog
  • Experience with performance debugging tools and patterns. You should be able to read a flame graph
  • A strong product and user-centric mindset
  • Desire to continuously improve systems and environments

How You'll be Rewarded:

At CLEAR we help YOU move forward - because when you’re at your best, we’re at our best. You’ll work with talented team members who are motivated by our mission of making experiences safer and easier. Our hybrid work environment provides flexibility. In our offices, you’ll enjoy benefits like meals and snacks. We invest in your well-being and learning & development with our stipend and reimbursement programs.

We offer holistic total rewards, including comprehensive healthcare plans, family building benefits (fertility and adoption/surrogacy support), flexible time off, free OneMedical memberships for you and your dependents, and a 401(k) retirement plan with employer match. The base salary range for this role is $175,000 - $215,000, depending on levels of skills and experience.

The base salary range represents the low and high end of CLEAR’s salary range for this position. Salaries will vary depending on various factors which include, but are not limited to location, education, skills, experience and performance. The range listed is just one component of CLEAR’s total compensation package for employees and other rewards may include annual bonuses, commission, Restricted Stock Units

About CLEAR

Have you ever had that green-light feeling? When you hit every green light and the day just feels like magic. CLEAR's mission is to create frictionless experiences where every day has that feeling. With more than 22 million passionate members and hundreds of partners around the world, CLEAR’s identity platform is transforming the way people live, work, and travel. Whether it’s at the airport, stadium, or right on your phone, CLEAR connects you to the things that make you, you - unlocking easier, more secure, and more seamless experiences - making them all feel like magic.

CLEAR provides reasonable accommodation to qualified individuals with disabilities or protected needs. Please let us know if you require a reasonable accommodation to apply for a job or perform your job. Examples of reasonable accommodation include, but are not limited to, time off, extra breaks, making a change to the application process or work procedures, policy exceptions, providing documents in an alternative format, live captioning or using a sign language interpreter, or using specialized equipment.

See more jobs at CLEAR - Corporate

Apply for this job

7d

Principal Site Reliability Engineer

Brightspeed
Charlotte, NC, Remote
DevOPS, Master’s Degree, terraform, ansible, docker, kubernetes, AWS

Brightspeed is hiring a Remote Principal Site Reliability Engineer

Job Description

We are currently looking for a Principal Site Reliability Engineer to join our growing team. In this role, you will implement and maintain monitoring systems to track the performance and availability of business-critical systems and infrastructure using metrics to identify trends and potential issues. You will also work closely with development teams, operations, and other stakeholders to ensure that new services and features are reliable and scalable.

As a Principal Site Reliability Engineer, your duties and responsibilities will include:

  • Implement and maintain monitoring systems to track the performance and availability of Business-critical systems and infrastructure. Use metrics to identify trends and potential issues.
  • Respond to system outages and performance issues, performing root cause analysis to prevent recurrence
  • Develop scripts and tools to automate repetitive tasks, such as deployment, scaling, and monitoring
  • Work closely with development teams, operations, and other stakeholders to ensure that new services and features are reliable and scalable
  • Work on reducing latency and improving the speed of data transmission across the network
  • Define and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure services meet required performance and availability targets
  • Conduct postmortems after incidents to identify what went wrong and what can be improved
  • Work with Lead Application owners and internal Change Management to review code changes and support deployments
  • Lead the team of site reliability engineers onshore/offshore, mentor them for support activities required for system reliability
  • Must have ability to communicate and abstract the messaging to multiple target audiences including Sr business & IT leadership, technology, and business teams.

Qualifications

WHAT IT TAKES TO CATCH OUR EYE:

  • Master’s degree in computer science, telecommunications, or similar areas, with a minimum of 10 years software engineering experience, including a minimum of 5 years as a site reliability engineer
  • Proven track record of managing mission critical customer facing applications for reliability
  • 5+ years of experience supporting operations and maintenance for cloud-native applications in production that are fault-tolerant, self-healing, scalable and highly available
  • Excellent troubleshooting and problem-solving skills, with a keen attention to detail to identify and resolve complex production issues
  • Deep understanding of cloud computing platforms (GCP) and containerization technologies (e.g., Docker, Kubernetes)
  • Solid experience with core Kubernetes concepts such as Pods, Workloads, Services, Ingress/Egress, Deployments, ConfigMaps, HPA, Liveness Probe, and Secrets
  • Strong knowledge of infrastructure as code tools (e.g., Terraform, Ansible, ArgoCD) and CI/CD pipelines
  • Strong experience working with integration of code quality tool (SonarQube or Checkmarx) with CI/CD pipeline
  • Strong experience with monitoring, logging, and observability tools such as Splunk, GCP log, Dynatrace, etc.
  • Ability to work independently and as part of a collaborative team, effectively communicating technical concepts to both technical and non-technical stakeholders
  • Must have proven written and verbal communication skills, including presentations using tools like PowerPoint
  • Must have ability to communicate and abstract the messaging to multiple target audiences including Sr business & IT leadership, technology and business teams

BONUS POINTS FOR:

  • Certifications such as Google Professional Cloud DevOps Engineer or AWS Certified DevOps Engineer 

 

See more jobs at Brightspeed

Apply for this job

10d

Copy of Senior Site Reliability Engineer - Brazil

Podium
Remote, Brazil
Bachelor's degree, terraform, Design, ansible, azure, ruby, docker, kubernetes, linux, python, AWS

Podium is hiring a Remote Copy of Senior Site Reliability Engineer - Brazil

At Podium, our mission is to help local businesses win. Our lead conversion platform, powered by AI and integrations, helps local businesses convert leads faster, communicate easier, and make more sales. Every day, thousands of local businesses utilize our review management, communication, marketing, and payments products. 

Our work and focus on helping local businesses thrive has been recognized across the industry, including Forbes’ Next Billion Dollar Startups, Forbes’ Cloud 100, the Inc. 5000, and Fast Company’s World’s Most Innovative Companies.

At Podium, we believe in fostering a culture that thrives on hiring and developing exceptional talent. Our operating principles serve as a compass, guiding daily behavior and decision-making, and ensure we hire people who will thrive at Podium. If you resonate with our operating principles and are energized by our mission, Podium will be a great place for you!

The Role:

A Site Reliability Engineer borders the worlds of software engineering and systems engineering. At Podium, the SRE team drives our products to success by building a stable, scalable, sustainable, and slick system. We permanently sit and sup with the product engineering teams to address all of their needs, and work as an SRE guild to build a world-class platform for our products to run on. We're currently targeting a senior SRE to come in and deliver impact from day one.

What you will be doing: 

  • Working with the following technologies: Kubernetes, Helm, Docker, AWS, Terraform, Datadog, Prometheus, Ansible, StrongDM, Python, Go, Ruby, GitLab and GitLab CI.
  • Engaging with Podium's engineering community to identify potential areas of improvement or pain points and making Podium's systems safer and more pleasant to operate.
  • Participating in an on-call rotation for the services the team owns, triaging and addressing production as well as development issues.
  • Working cross-functionally with different teams to make sure that there is no downtime for our products.
  • Mentoring junior engineers on the team.

What you should have: 

  • Bachelor’s degree in a technical field or relevant work experience.
  • 4+ years of experience working alongside a production system in either a software engineer or systems engineer type role
  • 3+ years deploying, operating and debugging server software on Linux
  • Curiosity and the desire to learn
  • Ability to take a rotating on-call shift

What we hope you have: 

  • Experience with distributed systems and microservices
  • Practical knowledge of system design
  • Cloud computing, such as AWS, GCP, or Azure
  • SOC2, HIPAA, PCI, or other regulatory or compliance standards
  • Building and maintaining a CI/CD pipeline
  • Heavy Infrastructure experience 

See more jobs at Podium

Apply for this job

10d

Junior Site Reliability Engineer (Azure)

Medfar
Montréal, Canada, Remote
DevOPS, 2 years of experience, terraform, sql, azure, c++, .net

Medfar is hiring a Remote Junior Site Reliability Engineer (Azure)

Job Description

As a Junior Site Reliability Engineer (SRE) you will play a crucial role within the R&D and Innovation department. You will collaborate with the Plexia product-aligned and core architecture teams. Because health and medical systems are highly sensitive, the availability and reliability of our systems are of paramount importance to MEDFAR.

The goal of the Site Reliability Engineering (SRE) team is to enable the Plexia team to deliver work with substantial autonomy, therefore they will be collaborating with team members across the company to help them achieve better outcomes and to provide them with the necessary tools and technologies to deliver them. As part of the SRE team, you will be joining the team accountable for the operation, resilience and backup of the organization’s tools, products, data and services.

What you will be working on: 

  • Refining and extending current monitoring capabilities to track essential service-level indicators and ensure visibility of these metrics.

  • Improving our infrastructure and software by collaborating extensively with the core architecture and product-aligned teams to identify and deliver improvements that enhance site availability through scalable, secure, and resilient architectures.

  • Defining and executing test plans that aim to ensure the robustness and resilience of our infrastructure and software systems.

  • Managing incidents and emergency response, tracking outages, ensuring data integrity and participating in release management to promote safe, efficient and rapid deployments.

Qualifications

Contribute to our team with your strengths:

  • 1-2 years of experience working in site reliability engineering-related projects (required) plus additional experience in system administration, DevOps or software engineering roles (an asset)

  • Knowledge of Microsoft Azure specifically with high-reliability architecture and security hardening.

  • Experience with CI/CD processes and Azure DevOps pipelines.

  • Proficient in PowerShell.

  • Experience with Windows and Network setup and management

  • Experience in C#, .NET frameworks, and SQL programming

  • Experience in SQL Database Management

  • Strong ability and rigor in documenting tasks and procedures with detail

  • Experience working with Terraform or another IaC framework (an asset)

  • Bilingual (FR/EN). The ability to communicate in English is required as many team members are located in BC.  

Working conditions:

  • Full-time permanent role, 40 hours per week schedule. 
  • 'Emergency working hours' may occasionally be necessary to ensure system stability and address critical issues promptly.
  • Flexibility in working hours is important to collaborate with team members in the Pacific Standard Time zone. 

See more jobs at Medfar

Apply for this job

28d

Junior Site Reliability Engineer

Nextiva
Poland (Remote)
DevOPS, sql, oracle, Design, java, linux

Nextiva is hiring a Remote Junior Site Reliability Engineer

Redefine the future of customer experiences. One conversation at a time.

We’re changing the game with a first-of-its-kind, conversation-centric platform that unifies team collaboration and customer experience in one place. Powered by AI, built by amazing humans.

Our culture is forward-thinking, customer-obsessed and built on an unwavering belief that connection fuels business and life; connections to our customers with our signature Amazing Service®, our products and services, and most importantly, each other. Since 2008, 100,000+ companies and 1M+ users rely on Nextiva for customer and team communication.

If you’re ready to collaborate and create with amazing people, let your personality shine and be on the frontlines of helping businesses deliver amazing experiences, you’re in the right place. 

Build Amazing - Deliver Amazing - Live Amazing - Be Amazing

 

We are looking for an Operations Site Reliability Engineer to enhance, support, and troubleshoot our SaaS and VOIP platforms for our Business Technology program. We’re looking for someone with a wide breadth of knowledge, experience, and interest in a range of technology domains. This role will ensure the continued stability of our production applications while improving automation, alerting, and monitoring. We deal with many different technologies; a desire to learn and a hunger to work on challenging projects is a must.

Key Responsibilities:

  • Triage, troubleshoot, and fix production problems in every layer of the stack, with a focus on Oracle and billing systems
  • Design, develop, improve, and tune logging, monitoring, and alerting
  • Create actionable alerts to fix system outages before they occur
  • Write software to improve reliability and recoverability of production systems
  • Identify manual work, document the fix in the form of a runbook, then automate it away
  • Perform and automate system administration tasks
  • Participate in 24/7 on-call rotation supporting production systems

Qualifications:

  • Bachelor’s degree in Computer Science or related field, or equivalent work experience
  • 0-2 years of Oracle systems experience
  • 0-2 years of software development experience
  • 0-2 years of Linux system administration experience
  • 0-2 years of performance engineering experience
  • Understanding and experience working with RESTful APIs
  • Experience with triaging and troubleshooting complex systems
  • Experience working with source control
  • Experience with containerization and container orchestration
  • Experience with application performance monitoring
  • Experience with web technology components including relational and SQL Databases, Apache, Tomcat, Java, packet monitoring
  • Experience with microservice environments and distributed systems
  • Familiarity with front-end technologies
  • Ability to clearly communicate technical concepts
  • Understanding of general SRE concepts and DevOps principles
  • Familiarity with SIP concepts and troubleshooting

Nextiva Core Competencies / DNA:

  • Drives Results:  The successful candidate will be action oriented, with a passion for solving problems.  They will bring clarity and simplicity to ambiguous situations.  This individual will challenge the status quo; asking what we can do differently and finding ways to create and build more success.  S/he is a change agent, prepared to lead and drive changes as we transform. 
  • Critical Thinker:  The successful candidate is fact based and data driven, able to understand and articulate the “why,” identifying key drivers and learning from the past.  They are forward-thinking, anticipating problems before they arise.  They’ll recommend and action well thought out solutions, understanding the risks and dependencies. 
  • Right Attitude:  The successful candidate will be team-oriented, collaborative and competitive with a winning mindset; they’re resilient and able to easily bounce back from setbacks.  S/he will be able to zoom in / out, willing to be hands-on to help solve important problems while being a motivating figure for the team along the way.  S/he will embrace a culture of service and learning with a focus on caring, supporting and respecting our customers and team members.

Rewards & Benefits:

Nextiva provides a comprehensive employee benefits package that includes a highly competitive salary, medical and life insurance after probation, paid parental leave as per Company policy, employee recognition initiatives, various employee wellness programs and loads of learning and development opportunities which are coupled with career paths to last a lifetime. A great opportunity to work and build a career in an international environment is supplemented by a friendly atmosphere and a professional team.

Apply for this job

28d

Junior Site Reliability Engineer

Nextiva
United States (Remote)
DevOPS, sql, oracle, Design, java, c++, linux

Nextiva is hiring a Remote Junior Site Reliability Engineer

Redefine the future of customer experiences. One conversation at a time.

We’re changing the game with a first-of-its-kind, conversation-centric platform that unifies team collaboration and customer experience in one place. Powered by AI, built by amazing humans.

Our culture is forward-thinking, customer-obsessed and built on an unwavering belief that connection fuels business and life; connections to our customers with our signature Amazing Service®, our products and services, and most importantly, each other. Since 2008, 100,000+ companies and 1M+ users rely on Nextiva for customer and team communication.

If you’re ready to collaborate and create with amazing people, let your personality shine and be on the frontlines of helping businesses deliver amazing experiences, you’re in the right place. 

Build Amazing - Deliver Amazing - Live Amazing - Be Amazing

 

We are looking for an Operations Site Reliability Engineer to enhance, support, and troubleshoot our SaaS and VOIP platforms for our Business Technology program. We’re looking for someone with a wide breadth of knowledge, experience, and interest in a range of technology domains. This role will ensure the continued stability of our production applications while improving automation, alerting, and monitoring. We deal with many different technologies; a desire to learn and a hunger to work on challenging projects is a must.

Key Responsibilities:

  • Triage, troubleshoot, and fix production problems in every layer of the stack, with a focus on Oracle and billing systems
  • Design, develop, improve, and tune logging, monitoring, and alerting
  • Create actionable alerts to fix system outages before they occur
  • Write software to improve reliability and recoverability of production systems
  • Identify manual work, document the fix in the form of a runbook, then automate it away
  • Perform and automate system administration tasks
  • Participate in 24/7 on-call rotation supporting production systems

Qualifications:

  • Bachelor’s degree in Computer Science or related field, or equivalent work experience
  • 0-2 years of Oracle systems experience
  • 0-2 years of software development experience
  • 0-2 years of Linux system administration experience
  • 0-2 years of performance engineering experience
  • Understanding and experience working with RESTful APIs
  • Experience with triaging and troubleshooting complex systems
  • Experience working with source control
  • Experience with containerization and container orchestration
  • Experience with application performance monitoring
  • Experience with web technology components including relational and SQL Databases, Apache, Tomcat, Java, packet monitoring
  • Experience with microservice environments and distributed systems
  • Familiarity with front-end technologies
  • Ability to clearly communicate technical concepts
  • Understanding of general SRE concepts and DevOps principles
  • Familiarity with SIP concepts and troubleshooting

Nextiva Core Competencies / DNA:

  • Drives Results:  The successful candidate will be action oriented, with a passion for solving problems.  They will bring clarity and simplicity to ambiguous situations.  This individual will challenge the status quo; asking what we can do differently and finding ways to create and build more success.  They are a change agent, prepared to lead and drive changes as we transform. 
  • Critical Thinker:  The successful candidate is fact based and data driven, able to understand and articulate the “why,” identifying key drivers and learning from the past.  They are forward-thinking, anticipating problems before they arise.  They’ll recommend and action well thought out solutions, understanding the risks and dependencies. 
  • Right Attitude:  The successful candidate will be team-oriented, collaborative and competitive with a winning mindset; they’re resilient and able to easily bounce back from setbacks.  They will be able to zoom in / out, willing to be hands-on to help solve important problems while being a motivating figure for the team along the way.  They will embrace a culture of service and learning with a focus on caring, supporting and respecting our customers and team members.

Compensation, Rewards & Benefits:

The salary or hourly wage offered by Nextiva to external candidates considers a wide range of factors, including but not limited to skill sets, experience, training, licensure and certifications, etc. Our compensation decisions are dependent on the facts and circumstances of each case. Our estimate of the expected hiring range for the position as posted is $57,000 - $84,650. A different level in the job hierarchy may apply to a specific candidate, resulting in a different hiring range.

Nextiva provides a comprehensive employee benefits package that includes medical (including supplemental plans for accident, hospitalization and critical illness), telemedicine, dental, vision, disability, life insurance, legal assistance, an Employee Assistance Plan, paid parental bonding leave, PTO for hourly employees and Flexible Time Off (FTO) for salaried employees, an employee long-term savings plan (401k) through Fidelity with Nextiva matching, comprehensive employee wellness programs and loads of learning and development opportunities which are coupled with career paths to last a lifetime.

Interested in joining our amazing team at Nextiva HQ? Apply today as we launch the future of business conversations!

Established in 2008 and headquartered in Scottsdale, Arizona, Nextiva secured $200M from Goldman Sachs in late 2021, valuing the company at $2.7B. To see what’s going on at Nextiva, check us out on Instagram, Instagram (MX), YouTube, LinkedIn, and the Nextiva blog.

Nextiva is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We prohibit discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. Nextiva participates in the E-Verify Program where and as required by law. For additional information about E-Verify, visit USCIS.

#LI-RQ1 #LI-Remote

See more jobs at Nextiva

Apply for this job

29d

Site Reliability Engineer - Brazil

Podium | Remote, Brazil
Bachelor's degree, terraform, Design, ansible, azure, ruby, docker, kubernetes, linux, python, AWS

Podium is hiring a Remote Site Reliability Engineer - Brazil

At Podium, our mission is to help local businesses win. Our lead conversion platform, powered by AI and integrations, helps local businesses convert leads faster, communicate easier, and make more sales. Every day, thousands of local businesses utilize our review management, communication, marketing, and payments products. 

Our work and focus on helping local businesses thrive has been recognized across the industry, including Forbes’ Next Billion Dollar Startups, Forbes’ Cloud 100, the Inc. 5000, and Fast Company’s World’s Most Innovative Companies.

At Podium, we believe in fostering a culture that thrives on hiring and developing exceptional talent. Our operating principles serve as a compass, guiding daily behavior and decision-making, and ensure we hire people who will thrive at Podium. If you resonate with our operating principles and are energized by our mission, Podium will be a great place for you!

The Role:

A Site Reliability Engineer borders the worlds of software engineering and systems engineering. At Podium, the SRE team drives our products to success by building a stable, scalable, sustainable, and slick system. We permanently sit and sup with the product engineering teams to address all of their needs, and work as an SRE guild to build a world-class platform for our products to run on. We're currently targeting a senior SRE to come in and deliver impact from day one.

What you will be doing: 

  • Working with the following technologies: Kubernetes, Helm, Docker, AWS, Terraform, Datadog, Prometheus, Ansible, StrongDM, Python, Go, Ruby, GitLab and GitLab CI.
  • Engaging with Podium's engineering community to identify potential areas of improvement or pain points and making Podium's systems safer and more pleasant to operate.
  • Participating in an on-call rotation for the services the team owns, triaging and addressing production as well as development issues.
  • Working cross-functionally with different teams to make sure that there is no downtime for our products.
  • Mentoring junior engineers on the team.

What you should have: 

  • Bachelor’s degree in a technical field or relevant work experience.
  • 4+ years of experience working alongside a production system in either a software engineer or systems engineer type role
  • 3+ years deploying, operating and debugging server software on Linux
  • Curiosity and the desire to learn
  • Ability to take a rotating on-call shift

What we hope you have: 

  • Experience with distributed systems and microservices
  • Practical knowledge of system design
  • Cloud computing, such as AWS, GCP, or Azure
  • SOC2, HIPAA, PCI, or other regulatory or compliance standards
  • Building and maintaining a CI/CD pipeline
  • Heavy Infrastructure experience 

See more jobs at Podium

Apply for this job

30d

Site Reliability Engineer - II

LivePerson | Hyderabad, Telangana, India (Remote)
terraform, nosql, postgres, sql, ansible, mongodb, azure, elasticsearch, MySQL, kubernetes, linux, jenkins, AWS

LivePerson is hiring a Remote Site Reliability Engineer - II

LivePerson (NASDAQ: LPSN) is the global leader in enterprise conversations. Hundreds of the world’s leading brands — including HSBC, Chipotle, and Virgin Media — use our award-winning Conversational Cloud platform to connect with millions of consumers. We power nearly a billion conversational interactions every month, providing a uniquely rich data set and safety tools to unlock the power of Conversational AI for better customer experiences.

At LivePerson, we foster an inclusive workplace culture that encourages meaningful connection, collaboration, and innovation. Everyone is invited to ask questions, actively seek new ways to achieve success, and reach their full potential. We are continually looking for ways to improve our products and make things better. This means spotting opportunities, solving ambiguities, and seeking effective solutions to the problems our customers care about.

Overview:

LivePerson is looking for a Site Reliability/DevOps Engineer for the GPT (Global Product & Technology) Division. You will be part of the LivePerson SRE team building and managing highly available, distributed systems. You will have the opportunity to be part of a strong team and enjoy the work environment of a start-up, with a robust product and the benefits of a leading company in its field.

You will: 

  • Ensure high product uptime and reliability 24x7.
  • Manage Linux servers in a multi-cloud environment
  • Manage high availability Kubernetes resources using Helm charts
  • Assist with deploying upgrades and patches using Puppet/Ansible/Chef/Helm
  • Monitor and troubleshoot warnings and alerts related to the reporting platform’s performance
  • Develop monitoring resources and alerting systems using tools such as Grafana, Prometheus, Kibana, DataDog and PagerDuty
  • Coordinate with DBA and developers to manage SQL and NoSQL database systems, including MongoDB, ElasticSearch, Postgres, MySQL and others
  • Manage message bus systems such as Kafka and Pulsar

You have:

  • 3+ years of experience managing cloud-based production environments (AWS, GCP, Azure, etc.)
  • Highly experienced working in the Linux environment, with good scripting skills in Bash / Python.
  • Highly experienced working with configuration management systems like Puppet, OpsCode Chef, Ansible, etc.
  • Strong experience in Terraform, CloudFormation or other IAC
  • Experienced in SQL, including DDL and complex queries
  • Experienced working in the Kubernetes platform
  • Experience working in a microservices architecture using a message bus
  • Good knowledge of CI/CD pipeline orchestrators like TeamCity, Jenkins, GitLab.
  • Highly motivated and independent.
  • Team player with excellent interpersonal skills.
  • Excellent written and verbal communication skills.
  • BS in Computer Science or a related field, or equivalent work experience.
  • A strong background in cloud, network and application security and compliance
  • Experience with GPT or other LLMs is a strong advantage

Benefits

  • Health: Medical, Dental, and Vision
  • Time away: Vacation and holidays
  • Development: Generous tuition reimbursement and access to internal professional development resources.
  • Equal opportunity employer

Why You’ll Love Working Here

As leaders in enterprise customer conversations, we celebrate diversity, empowering our team to forge impactful conversations globally. LivePerson is a place where uniqueness is embraced, growth is constant, and everyone is empowered to create their own success. And, we're very proud to have earned recognition from Fast Company, Newsweek, and BuiltIn for being a top innovative, beloved, and remote-friendly workplace.

Belonging At LivePerson

We are proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants with criminal histories, consistent with applicable federal, state, and local law.

We are committed to the accessibility needs of applicants and employees. We provide reasonable accommodations to job applicants with physical or mental disabilities. Applicants with a disability who require reasonable accommodation for any part of the application or hiring process should inform their recruiting contact upon initial connection.

Apply for this job

+30d

Site Reliability Engineer

Braze | Remote - Ontario
Sales, DevOPS, redis, terraform, postgres, mongodb, ruby, docker, kubernetes, linux

Braze is hiring a Remote Site Reliability Engineer

At Braze, we have found our people. We’re a genuinely approachable, exceptionally kind, and intensely passionate crew.

We seek to ignite that passion by setting high standards, championing teamwork, and creating work-life harmony as we collectively navigate rapid growth on a global scale while striving for greater equity and opportunity – inside and outside our organization.

To flourish here, you must be prepared to set a high bar for yourself and those around you. There is always a way to contribute: Acting with autonomy, having accountability and being open to new perspectives are essential to our continued success. Our deep curiosity to learn and our eagerness to share diverse passions with others gives us balance and injects a one-of-a-kind vibrancy into our culture.

If you are driven to solve exhilarating challenges and have a bias toward action in the face of change, you will be empowered to make a real impact here, with a sharp and passionate team at your back. If Braze sounds like a place where you can thrive, we can’t wait to meet you.

Site Reliability Engineers (SREs) are responsible for keeping all internal-facing services and platforms running smoothly. In a nutshell, SREs ensure site uptime. SREs are a blend of system administrator and software engineer who apply sound engineering principles, operational discipline, and mature automation to the environments and infrastructure services we provide. We specialize in systems, whether that is networking, the Linux kernel, or a more specific interest in scaling, algorithms, or distributed systems.

Our team improves automation and infrastructure reliability, and empowers Braze’s other engineering teams to easily leverage the infrastructure products and platforms we create. Braze operates at a massive scale with over 3.3 billion monthly active users across our customers, collecting hundreds of billions of data points each month, and sending billions of messages to end-users daily. We use a diverse technology stack rooted in Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more. As a Site Reliability Engineer at Braze, you will collaborate with your team and consumer engineering teams to continuously improve the infrastructure, automation, and tooling that build internal products from these technologies.

WHAT YOU'LL DO

  • Partner with Braze’s engineering teams on:
    • Architecting products to effectively utilize infrastructure platforms in a scalable, reliable manner
    • Debugging reliability and scalability issues across all stack layers, including the products built using our infrastructure platforms
    • Making sure monitoring and alerting fire on symptoms and not on outages
    • Ensuring that Braze meets our strict enterprise-grade SLAs with customers
  • Develop Braze’s internal platform infrastructure:
    • Create Infrastructure as code using Chef, Terraform, and Kubernetes
    • Develop deployment pipelines for applications in multiple languages using Docker, Kubernetes, etc.
    • Provide centralized/common tooling, services, and automation frameworks that are critical for scaling operations, capacity management, reducing operational pain, and improving the day-to-day workflow of Braze’s engineering teams
  • Manage incidents:
    • Be on a PagerDuty rotation to respond to availability incidents and provide support for other engineers
    • Use your on-call shift to prevent incidents from ever happening
    • Retrospect everything that happens to turn lessons into system improvements/changes, automation, etc.

WHO YOU ARE

  • 3+ years of experience as a Software, DevOps, or Site Reliability Engineer
  • You think about systems - interfaces, boundaries, edge cases, failure modes, behaviors, specific implementations
  • Have an urge to collaborate, document, and deliver quickly
    • Collaborating across the global remote teams, often working asynchronously
    • Documenting everything so you don't need to learn the same thing (or plan the same work) twice
    • Delivering fast to delight our customers–even internal ones
  • Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it
  • Have a desire to solve everyday challenges facing software engineers and automate their toil away
  • Have an excellent ability to manage multiple tasks and expectations at once
  • Know your way around Linux and Unix Shell
  • Have strong programming skills - Ruby and/or Go preferred
  • Have experience with Docker, Kubernetes, Terraform, or similar IaC technologies
  • Have experience with MongoDB, Redis, Kafka, Postgres, or similar data technologies

WHAT WE OFFER

Details of these benefit plans will be provided if a candidate receives an offer of employment. Benefits may vary by location.

From offering comprehensive benefits to fostering flexible environments, we’ve got you covered so you can prioritize work-life harmony.

  • Competitive compensation that may include equity
  • Retirement and Employee Stock Purchase Plans
  • Flexible paid time off
  • Comprehensive benefit plans covering medical, dental, vision, life, and disability
  • Family services that include fertility benefits and equal paid parental leave
  • Professional development supported by formal career pathing, learning platforms, and tuition reimbursement
  • Community engagement opportunities throughout the year, including an annual company wide Volunteer Week
  • Employee Resource Groups that provide supportive communities within Braze
  • Collaborative, transparent, and fun culture recognized as a Great Place to Work®

ABOUT BRAZE

Braze is a leading customer engagement platform that powers lasting connections between consumers and brands they love. Braze allows any marketer to collect and take action on any amount of data from any source, so they can creatively engage with customers in real time, across channels from one platform. From cross-channel messaging and journey orchestration to AI-powered experimentation and optimization, Braze enables companies to build and maintain absolutely engaging relationships with their customers that foster growth and loyalty.

Braze is proudly certified as a Great Place to Work® in the U.S., the UK and Singapore. We ranked #3 on Great Place to Work UK’s 2024 Best Workplaces (Large), #3 on Great Place to Work UK’s 2023 Best Workplaces for Wellbeing (Medium), #4 on Great Place to Work’s 2023 Best Workplaces in Europe (Medium), #10 on Great Place to Work UK’s 2023 Best Workplaces for Women (Large), #19 on Fortune’s 2023 Best Workplaces in New York (Large). We were also featured in Built In's 2024 Best Places to Work, U.S. News Best Technology Companies to Work For, and Great Place to Work UK’s 2023 Best Workplaces in Tech.

You’ll find many of us at headquarters in New York City or around the world in Austin, Berlin, Chicago, Jakarta, London, Paris, San Francisco, Singapore, Sydney and Tokyo – not to mention our employees in nearly 50 remote locations.

BRAZE IS AN EQUAL OPPORTUNITY EMPLOYER

At Braze, we strive to create equitable growth and opportunities inside and outside the organization.

Building meaningful connections is at the heart of everything we do, and that includes our recruiting practices. We're committed to offering all candidates a fair, accessible, and inclusive experience – regardless of age, color, disability, gender identity, marital status, maternity, national origin, pregnancy, race, religion, sex, sexual orientation, or status as a protected veteran. When applying and interviewing with Braze, we want you to feel comfortable showcasing what makes you you.

We know that sometimes different circumstances can lead talented people to hesitate to apply for a role unless they meet 100% of the criteria. If this sounds familiar, we encourage you to apply, as we’d love to meet you.

Please see our Candidate Privacy Policy for more information on how Braze processes your personal information during the recruitment process and, if applicable based on your location, how you can exercise any privacy rights.

See more jobs at Braze

Apply for this job

+30d

Sr. Site Reliability Engineer II

Life360 | Remote, Canada
Bachelor's degree, remote-first, terraform, scala, Design, mobile, ansible, azure, api, java, c++, python, AWS, backend, PHP

Life360 is hiring a Remote Sr. Site Reliability Engineer II

About Life360

Life360’s mission is to keep people close to the ones they love. Our category-leading mobile app and Tile tracking devices empower members to protect the people, pets, and things they care about most with a range of services, including location sharing, safe driver reports, and crash detection with emergency dispatch. Life360 serves approximately 66 million monthly active users (MAU) across more than 150 countries. 

Life360 delivers peace of mind and enhances everyday family life with seamless coordination for all the moments that matter, big and small. By continuing to innovate and deliver for our customers, we have become a household name and the must-have mobile-based membership for families (and those friends that basically are family). 

Life360 has more than 500 (and growing!) remote-first employees. For more information, please visit life360.com.

Life360 is a Remote First company, which means a remote work environment will be the primary experience for all employees. All positions, unless otherwise specified, can be performed remotely (within Canada) regardless of any specified location above.  

About The Team

The Location Cloud team develops and maintains the core backend services critical to delivering real-time location functionality to the Life360 app. Our distributed systems are optimized for durability, high availability, low latency, and internet-scale. The Location Cloud team is a part of our Location Operating Group, which focuses on all the location-based features the Life360 app offers our millions of users.

About the Job

As an SRE in the Location Engineering group, you will help build and operate scalable services powering the Life360 product. Our cloud team ensures that our APIs are able to process hundreds of thousands of requests a second with the ability to scale 10x. You'll be a very active contributor to the design and operation of the core services. You will use, develop, and improve automation tools as often as possible to increase the efficiency of the team and your work. You are comfortable dealing with very large amounts of traffic to the tune of billions of daily API requests.

The Canada-based salary range for this position is $170,000 to $210,000 CAD. We take into consideration an individual's background and experience in determining final salary; therefore, base pay offered may vary considerably depending on geographic location, job-related knowledge, skills, and experience. The compensation package includes a wide range of medical, dental, vision, financial, and other benefits, as well as equity.

What You’ll Do

  • Engage with product and engineering teams to design, build and maintain the system / software for high availability and resiliency.
  • Manage SLOs / Error Budgets for service teams
  • Write software layers, scripts, deployment frameworks, tracers, monitors, self-healing/auto remediation tools to automate the processes.
  • Build and maintain software modules for use and reuse in cloud systems automation.
  • Build and maintain network border layer for applications (CDN / DNS / Load Balancing / etc)
  • Troubleshooting and root-cause analysis of issues regardless of tool, provider, platform, or language.
  • Participate in shared on-call rotation
  • Estimate schedules, breaking tasks down to reasonable 1-3 day tasks.

What We’re Looking For

  • Bachelor's degree in Computer Science or equivalent discipline with at least 5 years experience in operations and exposure to software engineering.
  • 7+ years of experience as an SRE
  • 3+ years as a Senior SRE with programming experience with one or more relevant languages: Java, Python, PHP, Scala, etc.  
  • Previous experience working remotely 
  • Experience with Infrastructure as code tools: Terraform, CloudFormation; config management/provisioning tools: Ansible, Chef, etc.
  • Proficient in multi-threaded design and implementation.
  • Troubleshooting and system engineering exposure in UNIX/Linux production environments.
  • Developing, running, and/or consuming cloud technologies such as AWS, Azure, Docker/Kubernetes, etc.
  • Experience with existing open source projects such as Consul, Kafka, Cassandra, Docker.
  • Ability to quickly learn and apply complex subjects and technologies.
  • Experience desired with Big Data, streaming technologies, SaaS based environments, Web Analytics
  • Excellent interpersonal skills. Excellent English verbal and written communication skills. Highly collaborative working style.

Our Benefits

  • Competitive pay and benefits
  • Medical, dental, vision, life and disability insurance plans 
  • RRSP plan with DPSP company matching program
  • Employee Assistance Program (EAP) for mental well being
  • Flexible PTO, several company wide days off throughout the year
  • Winter and Summer Week-long Synchronized Company Shutdowns
  • Learning & Development programs
  • Equipment, tools, and reimbursement support for a productive remote environment
  • Free Life360 Platinum Membership for your preferred circle
  • Free Tile Products

Life360 Values

Our company’s mission driven culture is guided by our shared values to create a trusted work environment where you can bring your authentic self to work and make a positive difference 

  • Be a Good Person - We have a team of high integrity people you can trust. 
  • Be Direct With Respect - We communicate directly, even when it’s hard.
  • Members Before Metrics - We focus on building an exceptional experience for families. 
  • High Intensity High Impact - We do whatever it takes to get the job done. 

Our Commitment to Diversity

We believe that different ideas, perspectives and backgrounds create a stronger and more creative work environment that delivers better results. Together, we continue to build an inclusive culture that encourages, supports, and celebrates the diverse voices of our employees. It fuels our innovation and connects us closer to our customers and the communities we serve. We strive to create a workplace that reflects the communities we serve and where everyone feels empowered to bring their authentic best selves to work.

We are an equal opportunity employer and value diversity at Life360. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, disability status or any legally protected status.  

We encourage people of all backgrounds to apply. We believe that a diversity of perspectives and experiences creates a foundation for the best ideas. Come join us in building something meaningful. Even if you don’t meet 100% of the below qualifications, you should still seriously consider applying!

 

#LI-Remote

See more jobs at Life360

Apply for this job

+30d

Senior Site Reliability Engineer

Cruise | US Remote
Bachelor's degree, terraform, postgres, Design, ansible, azure, java, c++, docker, elasticsearch, kubernetes, python

Cruise is hiring a Remote Senior Site Reliability Engineer

We're Cruise, a self-driving service designed for the cities we love.

We’re building the world’s most advanced self-driving vehicles to safely connect people to the places, things, and experiences they care about. We believe self-driving vehicles will help save lives, reshape cities, give back time in transit, and restore freedom of movement for many.

In our cars, you’re free to be yourself. It’s the same here at Cruise. We’re creating a culture that values the experiences and contributions of all of the unique individuals who collectively make up Cruise, so that every employee can do their best work. 

Cruise is committed to building a diverse, equitable, and inclusive environment, both in our workplace and in our products. If you are looking to play a part in making a positive impact in the world by advancing the revolutionary work of self-driving cars, come join us. Even if you might not meet every requirement, we strongly encourage you to apply. You might just be the right candidate for us.

What you’ll be doing:
 

  • Build platforms, services, or tools that enable engineers to deliver operationally mature services in production

  • Help teams with automating tedious tasks and enable them to quickly launch new services and execute optimally

  • Work with service owners to have a proactive approach to designing tests, observing results and creating fixes for complex failure scenarios.

  • Help service owners  identify and instrument Service Level Objectives and design alerts that follow best practices

  • Facilitate blameless postmortems and drive effective action items
     

What you must have: (These are must-haves: skills that someone must have entering the role on day 1 and that are required for hiring)
 

  • Senior level experience as a Systems Engineer, Site Reliability Engineer, or Production Engineer

  • Significant Experience with Cloud Platforms such as Google Cloud Platform, Microsoft Azure or Amazon Web Services

  • Fluent with one or more programming languages such as Go, Python or Java

  • Ability to debug and optimize code 

  • Experience with Incident Management platforms like FireHydrant

  • Ability to coordinate and manage incident response

  • Skills in defining and instrumenting SLOs and SLIs using query languages and observability tooling

  • Automate tasks and processes with open source tools

  • Streaming and Database technologies such as Postgres, Kafka, Cassandra, ElasticSearch, etc.


 

Bonus points! 

  • Previous experience as an SRE or System Engineer

  • Previous Experience with FireHydrant

  • Previous experience with Backstage or another Developer Experience tool

  • Familiarity with Chef, Puppet, Ansible or other Configuration Management Tooling

  • Familiarity with Kubernetes, Docker, Go, Istio, Terraform, Vault, Google Cloud

The salary range for this position is $122,400 - $180,000. Compensation will vary depending on location, job-related knowledge, skills, and experience. You may also be offered a bonus, long-term incentives, and benefits. These ranges are subject to change.

Why Cruise?

Our benefits are here to support the whole you:

  • Competitive salary and benefits 
  • Medical / dental / vision, Life and AD&D
  • Subsidized mental health benefits
  • Paid time off and holidays
  • Paid parental, medical, family care, and military leave of absence
  • 401(k) Cruise matching program 
  • Fertility benefits
  • Dependent Care Flexible Spending Account
  • Flexible Spending Account & Health Saving Account
  • Perks Wallet program for benefits/perks
  • Pre-tax Commuter benefit plan for local employees
  • CruiseFlex, our location-flexible work policy. (Learn more about CruiseFlex).

We’re Integrated

  • Through our partnerships with General Motors and Honda, we are the only self-driving company with fully integrated manufacturing at scale.

We’re Funded

  • GM, Honda, Microsoft, T. Rowe Price, and Walmart have invested billions in Cruise. Their backing for our technology demonstrates their confidence in our progress, team, and vision and makes us one of the leading autonomous vehicle organizations in the industry. Our deep resources greatly accelerate our operating speed.

Cruise LLC is an equal opportunity employer. We strive to create a supportive and inclusive workplace where contributions are valued and celebrated, and our employees thrive by being themselves and are inspired to do the best work of their lives. We seek applicants of all backgrounds and identities, across race, color, caste, ethnicity, national origin or ancestry, age, citizenship, religion, sex, sexual orientation, gender identity or expression, veteran status, marital status, pregnancy or parental status, or disability. Applicants will not be discriminated against based on these or other protected categories or social identities. Cruise will consider for employment qualified applicants with arrest and conviction records, in accordance with applicable laws.

Cruise is committed to the full inclusion of all applicants. If reasonable accommodation is needed to participate in the job application or interview process, please let our recruiting team know or email HR@getcruise.com.

We proactively work to design hiring processes that promote equity and inclusion while mitigating bias. To help us track the effectiveness and inclusivity of our recruiting efforts, please consider answering the following demographic questions. Answering these questions is entirely voluntary. Your answers to these questions will not be shared with the hiring decision makers and will not impact the hiring decision in any way. Instead, Cruise will use this information not only to comply with any government reporting obligations but also to track our progress toward meeting our diversity, equity, inclusion, and belonging objectives.

Candidates applying for roles that operate and remotely operate the AV: Licensed to drive a motor vehicle in the U.S. for the three years immediately preceding your application, currently holding an active in-state regular driver’s license or equivalent, and no more than one point on driving record. A successful completion of a background check, drug screen and DMV Motor Vehicle Record check is also required.

Note to Recruitment Agencies: Cruise does not accept unsolicited agency resumes. Furthermore, Cruise does not pay placement fees for candidates submitted by any agency other than its approved partners.

Apply for this job

+30d

Sr Site Reliability Engineer

Mozilla | Remote Canada
Full Time, DevOPS, terraform, Design, jenkins, python, AWS

Mozilla is hiring a Remote Sr Site Reliability Engineer

Why Thunderbird?

MZLA Technologies Corporation (MZLA), a wholly-owned subsidiary of Mozilla Foundation, runs the Thunderbird Project and develops related software and services. Thunderbird is a global, free, and open source email client that has grown significantly in donations, staff, and aspirations since its launch 20 years ago. We are expanding our team as we broaden our product and service offerings, committed to delivering best-in-class productivity solutions independent of big tech influence. This new role is an opportunity for an experienced engineer who is excited to design and implement infrastructure & automation to support Thunderbird’s ongoing growth.

The Opportunity:

Thunderbird is looking for a multi-skilled self-starter to work on site reliability engineering. As a Senior SRE, you will play a critical role in ensuring the reliability, scalability, and performance of our systems. You will work closely with cross-functional teams to design and implement solutions that enhance our infrastructure and optimize our processes. You will be a foundational member behind Thunderbird's exciting new web services. You bring production-hardened IaC knowledge and experience to plan and implement the processes to deliver our new product offerings. The ideal candidate will excel in collaboration and possess strong communication skills, ensuring clear and open dialogue across all levels of the organization. Additionally, you will have a proven track record of successfully completing projects from start to finish.

You will be working in close cooperation with our current SDEs and SREs, other staff, and community members. The Sr Site Reliability Engineer is an individual contributor and will report directly to the Manager, Web Services.

We’re committed to creating an amazing experience for our users, and you’ll play a key part in this effort. You will be working with our existing staff and community members from all over the globe to support the mission and objectives of MZLA Technologies Corp and the Thunderbird Project.

This is a remote, full-time position. We expect excellent written communication skills so as to foster strong work coordination over email, video conferencing, Matrix, and GitHub issues.

What you’ll do: 

  • Set up and deploy the infrastructure and monitoring for emerging, long-running projects.
  • Design and develop the CI/CD systems developers use and the infrastructure for all current and future websites and services.
  • Diagnose and debug production incidents and then improve systems to prevent the problem from recurring.
  • Collaborate with SDEs and fellow SREs to ship, maintain and monitor new builds of our websites and services.
  • Occasionally assist with Thunderbird desktop CI/CD and releases.
  • Work with a geographically-distributed development team.

What you’ll bring: 

  • Seasoned professional with 10+ years of work experience
  • Minimum 5 years professional experience in a tech infrastructure role, ideally in cloud-scale environments.
  • Minimum 2 years of experience in a senior DevOps or SRE role and experience acting as a technical lead, team lead or line manager.
  • Ability to self-direct work, handle less structured environments, and communicate effectively with staff and community members in many different roles.
  • Professional experience programming in Python, shell scripting, etc.
  • Experience setting up reliable infrastructure-as-code deployments in one of the major cloud platforms such as AWS or GCP, using tools like Terraform, Helm, CloudFormation, or Ansible.
  • Experience with industry standard web development CI/CD tools such as Jenkins, CircleCI, TeamCity, GitHub Actions, etc.
  • Excellent English written and verbal communication, with the ability to clearly and concisely interact with an international audience.
  • Proven track record of scoping and finishing projects.
  • A mission of making a concrete positive impact on the day to day communication experience for tens of millions of users.
  • Commitment to our values:
    • Passionate about fostering openness and transparency within an open-source community
    • Demonstrates a collaborative and team-oriented approach
    • Motivated by curiosity and creativity
    • Embraces and champions diversity
    • Brings a hearty dose of scrappy grit and resilience to our lively and spirited team.

Bonus points for:

  • Experience with database administration and performance optimization.
  • Experience with data science & analytics software such as Redshift, Presto, EMR, Kinesis, etc.
  • Experience with web development.
  • Knowledge of email protocols and/or experience running email servers (SMTP, IMAP).
  • Previous experience with an Open Source project, or participation in an Open Source community.
  • Dedication to open source and open standards.
  • Passionate about our mission - you care deeply about user privacy and control over one’s data

What you’ll get:

We benchmark our base salaries to local markets and target the 60th percentile of the peer market. The salary ranges for this role are:

  • Canada: $96,000 - $115,000 CAD

In addition to competitive salaries, we offer a comprehensive benefits package designed to support your whole self.

Work & Career

  • Fully remote work & schedule flexibility
  • Latest Laptop and accessories 
  • Annual Remote Work Stipend
  • Monthly Internet Stipend
  • Professional Development Stipend
  • Industry Conferences
  • Annual Global Team Offsite

Rest & Play

  • 24 days PTO per year (prorated) 
  • Your Birthday
  • Year-end Company Shutdown
  • Pilot 4 Day Work Week (July & August 2024)
  • Public Holidays
  • Other Paid Leave
  • Wellbeing Stipend for Personal / Family Activities

Health & Family

  • RRSP Contributions
  • Health, Dental, & Vision Insurance
  • Disability/Income Protection Insurance
  • Life Insurance
  • Employee Assistance Program 
  • Paid Parental Leave
  • Paid Sick Days 

*Applicants must reside in and have work authorization for one of the country locations specified above. We are unable to consider applicants outside of these markets at this time, and we are unable to provide visa sponsorship.

About Mozilla 

At Mozilla, we have big ambitions for the future, we want to build impactful products that are different — that are built with more respect for the people using them and help us explore new forms of openness. It’s going to take hard work that Mozilla is uniquely suited to take on. It’s why we’re here. It’s who we are. And it’s our future.

Bring your passion, your creativity, your big ideas, and your new perspectives to make the difference we’re aiming for.

MZLA Technologies Corporation (MZLA) Commitment to diversity, equity and inclusion

Mozilla believes in the value of diverse creative practices and forms of knowledge, and knows diversity, equity and inclusion are crucial to and enrich the company’s core mission. We encourage applications from everyone, including members of all equity-seeking communities, such as (but not limited to) women, racialized and Indigenous persons, persons with disabilities, persons of all sexual orientations, gender identities and expressions.

We are an equal opportunity employer. We do not discriminate on the basis of race (including hairstyle and texture), religion (including religious grooming and dress practices), gender, gender identity, gender expression, color, national origin, pregnancy, ancestry, domestic partner status, disability, sexual orientation, age, genetic predisposition, medical condition, marital status, citizenship status, military or veteran status, or any other basis covered by applicable laws. Mozilla will not tolerate discrimination or harassment based on any of these characteristics or any other unlawful behavior, conduct, or purpose. 

We will ensure that qualified individuals with disabilities are provided reasonable accommodations to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment, as appropriate. Please contact us at hiringaccommodations@thunderbird.net to request accommodation.

See more jobs at Mozilla

Apply for this job

+30d

Sr. Site Reliability Engineer

Signify Health | Dallas TX, Remote
terraform, airflow, mobile, azure, c++, kubernetes, python, AWS

Signify Health is hiring a Remote Sr. Site Reliability Engineer

How will this role have an impact?

Signify Health is looking for a passionate Site Reliability Engineer (SRE) to enhance our dynamic SRE team. Reporting to the Sr Director of Cloud Operations and SRE, we welcome individuals from different technical backgrounds, especially software engineers aspiring to transition into SRE/DevOps roles. 

At Signify Health, we appreciate and respect the unique experiences and perspectives that each team member brings. We are committed to providing an environment where everyone feels welcomed, respected, and empowered. So, no matter what your background is, we invite you to join us and help shape the future of healthcare while refining your skills in the SRE domain.

Diversity and Inclusion are core values at Signify Health, and fostering a workplace culture reflective of that is critical to our continued success as an organization.

What will you do?

  • Develop and implement strategies that improve the stability, scalability, and availability of our products
  • Maintain and deploy observability solutions for infrastructure and applications to ensure optimal performance
  • Participate in real-time service management, including crafting monitoring systems, alerts, playbooks, and runbooks in collaboration with our development teams
  • Utilize your on-call rotation to proactively prevent incidents and maintain uninterrupted operations
  • Work alongside colleagues from various disciplines to optimize operational processes
  • This is a remote role with some occasional travel required to Dallas, TX


Basic Requirements

  • Minimum of 4 years of relevant technical experience, with an emphasis on SRE/DevOps
  • Experience creating Python scripts to solve operational challenges
  • Experience with Pipeline orchestration tooling such as Airflow, Dagster, etc.
  • Experience with ELT tooling such as Azure Data Factory
  • Experience with Databricks interface/tools
  • Practical experience with Azure or AWS, and Terraform
  • Working knowledge of Kubernetes (AKS/EKS preferred)
  • Familiarity with the deployment of CI/CD systems and practices


About Us:

Signify Health is helping build the healthcare system we all want to experience by transforming the home into the healthcare hub. We coordinate care holistically across individuals’ clinical, social, and behavioral needs so they can enjoy more healthy days at home. By building strong connections to primary care providers and community resources, we’re able to close critical care and social gaps, as well as manage risk for individuals who need help the most. This leads to better outcomes and a better experience for everyone involved.

Our high-performance networks are powered by more than 9,000 mobile doctors and nurses covering every county in the U.S., 3,500 healthcare providers and facilities in value-based arrangements, and hundreds of community-based organizations. Signify’s intelligent technology and decision-support services enable these resources to radically simplify care coordination for more than 1.5 million individuals each year while helping payers and providers more effectively implement value-based care programs.

We are committed to equal employment opportunities for employees and job applicants in compliance with applicable law and to an environment where employees are valued for their differences.

To learn more about how we’re driving outcomes and making healthcare work better, please visit us at www.signifyhealth.com

See more jobs at Signify Health

Apply for this job

+30d

Junior Site Reliability Engineer

Medfar | Montréal, Canada, Remote
DevOPS, 2 years of experience, terraform, sql, azure, c++, .net

Medfar is hiring a Remote Junior Site Reliability Engineer

Job Description

As a Junior Site Reliability Engineer (SRE) you will play a crucial role within the R&D and Innovation department. You will be called upon to collaborate with the Plexia product-aligned and core architecture team. Because health and medical systems are highly sensitive, the availability and reliability of our systems are of paramount importance to MEDFAR.

The goal of the Site Reliability Engineering (SRE) team is to enable the Plexia team to deliver work with substantial autonomy. To that end, the SRE team collaborates with team members across the company to help them achieve better outcomes and provides them with the necessary tools and technologies to deliver them. As part of the SRE team, you will be joining the team accountable for the operation, resilience and backup of the organization’s tools, products, data and services.

What you will be working on: 

  • Refining and extending current monitoring capabilities to track essential service-level indicators and ensure visibility of these metrics.

  • Improving our infrastructure and software by collaborating extensively with the core architecture and product-aligned teams to identify and deliver improvements that enhance site availability through scalable, secure, and resilient architectures.

  • Defining and executing test plans that aim to ensure the robustness and resilience of our infrastructure and software systems.

  • Managing incidents and emergency response, tracking outages, ensuring data integrity and participating in release management to promote safe, efficient and rapid deployments.

Qualifications

Contribute to our team with your strengths:

  • 1-2 years of experience working in site reliability engineering-related projects (required) plus additional experience in system administration, DevOps or software engineering roles (an asset)

  • Knowledge of Microsoft Azure specifically with high-reliability architecture and security hardening.

  • Experience with CI/CD processes and Azure DevOps pipelines.

  • Proficient in PowerShell.

  • Experience with Windows and Network setup and management

  • Experience in C#, .NET frameworks, and SQL programming

  • Experience in SQL Database Management

  • Strong ability and rigor in documenting tasks and procedures with detail

  • Experience working with Terraform or another IaC framework, an asset 

  • Bilingual (FR/EN). The ability to communicate in English is required as many team members are located in BC.  

Working conditions:

  • Full-time permanent role, 40 hours per week schedule. 
  • 'Emergency working hours' may occasionally be necessary to ensure system stability and address critical issues promptly.
  • Flexibility in working hours is important to collaborate with team members in the Pacific Standard Time zone. 

See more jobs at Medfar

Apply for this job

+30d

Junior Site Reliability Engineer (SRE)

Medfar | Montréal, Canada, Remote
DevOPS, 2 years of experience, terraform, sql, azure, c++, .net

Medfar is hiring a Remote Junior Site Reliability Engineer (SRE)

Job Description

As a Junior Site Reliability Engineer (SRE) you will play a crucial role within the R&D and Innovation department. You will be called upon to collaborate with the Plexia product-aligned and core architecture team. Because health and medical systems are highly sensitive, the availability and reliability of our systems are of paramount importance to MEDFAR.

The goal of the Site Reliability Engineering (SRE) team is to enable the Plexia team to deliver work with substantial autonomy. To that end, the SRE team collaborates with team members across the company to help them achieve better outcomes and provides them with the necessary tools and technologies to deliver them. As part of the SRE team, you will be joining the team accountable for the operation, resilience and backup of the organization’s tools, products, data and services.

What you will be working on: 

  • Refining and extending current monitoring capabilities to track essential service-level indicators and ensure visibility of these metrics.

  • Improving our infrastructure and software by collaborating extensively with the core architecture and product-aligned teams to identify and deliver improvements that enhance site availability through scalable, secure, and resilient architectures.

  • Defining and executing test plans that aim to ensure the robustness and resilience of our infrastructure and software systems.

  • Managing incidents and emergency response, tracking outages, ensuring data integrity and participating in release management to promote safe, efficient and rapid deployments.

Qualifications

Contribute to our team with your strengths:

  • 1-2 years of experience working in site reliability engineering-related projects (required) plus additional experience in system administration, DevOps or software engineering roles (an asset)

  • Knowledge of Microsoft Azure specifically with high-reliability architecture and security hardening.

  • Experience with CI/CD processes and Azure DevOps pipelines.

  • Proficient in PowerShell.

  • Experience with Windows and Network setup and management

  • Experience in C#, .NET frameworks, and SQL programming

  • Experience in SQL Database Management

  • Strong ability and rigor in documenting tasks and procedures with detail

  • Experience working with Terraform or another IaC framework, an asset 

  • Bilingual (FR/EN). The ability to communicate in English is required as many team members are located in BC.  

Working conditions:

  • Full-time permanent role, 40 hours per week schedule. 
  • 'Emergency working hours' may occasionally be necessary to ensure system stability and address critical issues promptly.
  • Flexibility in working hours is important to collaborate with team members in the Pacific Standard Time zone. 

See more jobs at Medfar

Apply for this job

+30d

Senior Site Reliability Engineer

Multi Media | United States, Remote
DevOPS, metal

Multi Media is hiring a Remote Senior Site Reliability Engineer

Multi Media LLC is a forward-thinking innovator in the content-creator community. Chaturbate is our industry-leading flagship product that serves 600 million users through live broadcasts daily. We commit to delivering a safe and engaging online experience to our diverse community.

We are seeking a remote Senior Site Reliability Engineer for Multi Media LLC who will elevate our infrastructure resilience and optimize system performance. As we advance into our next phase of growth, we are searching for someone passionate about leading the enhancement of both physical and cloud-based systems, fostering innovation and efficiency across our platforms.

What you'll do:

  • Performance analysis to identify sources of instability using data from APM and distributed telemetry data tools
  • Analyze complex systems to identify operational surprises and minimize downtime.
  • Software engineering and patching to incrementally improve performance, scalability, and reliability
  • Infrastructure modifications in both a data center metal environment with advanced routing/switching and in the public cloud
  • Predictive failure analysis and disaster planning
  • Author new tools and automation to streamline the DevOps pipeline 
  • Collaborate with other engineering teams 
  • Database and kv store administration and configuration with a focus on uptime and performance
  • Incident response and postmortem reports

What you bring: 

  • STEM degree and relevant experience as a Site Reliability Engineer
  • Exceptional problem solving skills
  • High proficiency in one of the following: C, C++, Java, Python, Go, etc.
  • High proficiency in Unix/Linux environment, excellent knowledge of internals (e.g., filesystems, system calls)
  • Networking knowledge (e.g., routing, switching, TCP stack) for both metal and cloud (VPC, Security Groups) environments
  • Experience in database administration and configuration 
  • Experience with DevOps tools such as Terraform, Ansible, Docker, Kubernetes
  • On-call response to monitoring and alerting for core website functions as needed

What you’ll gain:

  • A strong team of A-players
  • A robust engineering culture
  • Opportunity to make an impact on a highly popular product
  • Freedom to bring ideas to the table and to make technical decisions
  • Support and guidance of a highly professional and knowledgeable team
  • Flexible working environment

Perks & Benefits

  • Fully remote option and flexible work schedule.
  • Health, Vision, Dental, and Life Insurance for you and any dependents, with policy premiums covered by the Company.
  • 401k plan with 5% matching.
  • Long & Short term disability insurance.
  • Unlimited PTO.
  • Annual Year-End Company Closure.
  • 12 Paid Holidays.
  • $125/week food and grocery stipend via Sharebite.
  • Employee wellness programs via Holisticly.
  • EAP and Employee Recognition Programs.
  • And much more!

The Base Salary range for this position is $161,000 - $180,000 annually. This range reflects base salary only and does not include additional compensation or benefits. The range displayed reflects the minimum and maximum range for a new hire across the US for the posted position. A candidate's specific pay will be determined on a case-by-case basis and may vary based on the candidate's job-related skills, relevant education, training, experience, certifications, and abilities of the candidate, as well as other factors unique to each candidate.

Multi Media, LLC is an equal opportunity employer and strives for diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We encourage people from underrepresented groups to apply!

See more jobs at Multi Media

Apply for this job

+30d

Junior Site Reliability Engineer

Podium | Remote, US
Bachelor's degree, terraform, Design, ansible, azure, ruby, docker, kubernetes, linux, python, AWS

Podium is hiring a Remote Junior Site Reliability Engineer

At Podium, our mission is to help local businesses win. Our lead conversion platform, powered by AI and integrations, helps local businesses convert leads faster, communicate easier, and make more sales. Every day, thousands of local businesses utilize our review management, communication, marketing, and payments products. 

Our work and focus on helping local businesses thrive has been recognized across the industry, including Forbes’ Next Billion Dollar Startups, Forbes’ Cloud 100, the Inc. 5000, and Fast Company’s World’s Most Innovative Companies.

At Podium, we believe in fostering a culture that thrives on hiring and developing exceptional talent. Our operating principles serve as a compass, guiding daily behavior and decision-making, and ensure we hire people who will thrive at Podium. If you resonate with our operating principles and are energized by our mission, Podium will be a great place for you!

A Site Reliability Engineer borders the worlds of software engineering and systems engineering. At Podium, the SRE team drives our products to success by building a stable, scalable, sustainable, and slick system. We permanently sit and sup with the product engineering teams to address all of their needs, and work as an SRE guild to build a world-class platform for our products to run on. We're currently targeting a junior SRE to come in and deliver impact from day one.

What you will be doing: 

  • Working with the following technologies: Kubernetes, Helm, Docker, AWS, Terraform, Datadog, Prometheus, Ansible, StrongDM, Python, Go, Ruby, GitLab and GitLab CI.
  • Engaging with Podium's engineering community to identify potential areas of improvement or pain points and make Podium's systems more secure and pleasant to operate.
  • Participating in an on-call rotation for the services the team owns, triaging and addressing production as well as development issues.
  • Working cross-functionally with different teams to make sure that there is no downtime for our products.

What you should have: 

  • Bachelor’s degree in a technical field or relevant work experience.
  • 1-3 years of experience working alongside a production system running on Kubernetes
  • 1-3 years of experience deploying, operating and debugging server software on Linux
  • Curiosity and the desire to learn
  • Ability to take a rotating on-call shift

What we hope you have: 

  • Experience with distributed systems and microservices
  • Practical knowledge of system design
  • Cloud computing, such as AWS, GCP, or Azure
  • SOC2, HIPAA, PCI, or other regulatory or compliance standards
  • Building and maintaining a CI/CD pipeline

BENEFITS

  • Open and transparent culture - Check out this video to see what it’s like to work at Podium
  • Life insurance, long and short-term disability coverage
  • Paid maternity and paternity leave
  • Fertility Benefits
  • Generous vacation time, plus three 4-day summer holiday weekends
  • Excellent medical, dental, and vision benefits
  • 401k Plan
  • Bi-annual swag drops with cool Podium gear and apparel 
  • A stellar HQ (Utah) gym with local professional coaches and classes offered
  • Onsite HQ (Utah) child care center, subsidized for employees
  • Additional benefits for fully remote employees

Podium is an equal opportunity employer. Podium provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, national origin, sexual orientation, gender identity or expression, age, disability, genetic information, marital status or veteran status.

See more jobs at Podium

Apply for this job

+30d

Staff Site Reliability Engineer

Modern Health | Remote - US
DevOPS, Django, S3, SQS, EC2, redis, terraform, Design, azure, postgresql, python, AWS

Modern Health is hiring a Remote Staff Site Reliability Engineer

Modern Health 

Modern Health is a mental health benefits platform for employers. We are the first global mental health solution to offer employees access to one-on-one, group, and self-serve digital resources for their emotional, professional, social, financial, and physical well-being needs, all within a single platform. Whether someone wants to proactively manage stress or treat depression, Modern Health guides people to the right care at the right time. We empower companies to help all their employees be the best version of themselves, and believe in meeting people wherever they are in their mental health journey.

We are a female-founded company backed by investors like Kleiner Perkins, Founders Fund, John Doerr, Y Combinator, and Battery Ventures. We partner with 500+ global companies like Lyft, Electronic Arts, Pixar, Clif Bar, Okta, and Udemy that are taking a proactive approach to mental health care for their employees. Modern Health has raised more than $170 million in less than two years with a valuation of $1.17 billion, making Modern Health the fastest entirely female-founded company in the U.S. to reach unicorn status. 

We tripled our headcount in 2021 and as a hyper-growth company with a fully remote workforce, we prioritize our people-first culture (winning awards including Fortune's Best Workplaces in the Bay Area 2021). To protect our culture and help our team stay connected, we require overlapping hours for everyone. While many roles may function from anywhere in the world—see individual job listing for more—team members who live outside the Pacific time zone must be comfortable working early in the morning or late at night; all full-time employees must work at least six hours between 8 am and 5 pm Pacific time each workday. 

We are looking for driven, creative, and passionate individuals to join in our mission. An inclusive and diverse culture is a key component of mental well-being in the workplace, and that starts with how we build our own team. If you're excited about a role, we'd love to hear from you!

The Role

In this role, you'll be given lots of responsibility and the opportunity to have true ownership as we build out the product. This is a unique opportunity to use your engineering powers to make a direct impact in people's lives. We need a Staff Site Reliability Engineer who is enthusiastic about building reliable, scalable, and flexible systems to support our growing team, product, and user base. You'll work with other engineers to reliably release and maintain services, and help define and meet internal and customer-facing SLAs and SLOs.

This position is not eligible to be performed in Hawaii.

What You’ll Do

  • Manage and orchestrate Cloud Resource (AWS) configuration using Infrastructure As Code (Terraform) to empower engineering staff to embrace a DevOps culture of Self Service Ownership
  • Develop and govern Observability (Datadog) best practices for tracking platform performance and health trends to meet customer SLAs and lead technical decisions with strong supporting evidence
  • Create solutions that dynamically scale based on demand with enough flexibility to pivot for fast changing project requirements while maintaining a balance of good versus perfect
  • Provide strong and consistent communication updates on technical progress or blockers to keep stakeholders informed while additionally creating appropriate documentation on technical design to spread knowledge and reduce information silos
  • Participate in the 24/7 on-call rotation, respond to critical alerts, and follow documented incident investigation procedures to reestablish customer-facing feature availability
  • Maintain HIPAA, GDPR, SOC-2 compliance and general security through best practice implementation

Who You Are

  • 8+ years of experience in software engineering with 4+ years of experience in DevOps
  • Cloud Provider (AWS, GCP, Azure) experience managing resources through Infrastructure As Code (Terraform)
  • Container Orchestration (ECS or K8s) experience to confidently build, test, and release containerized applications for multiple environments and regions
  • Knowledge of Observability best practices across common cloud resources (EC2, ECS, RDS, DynamoDB, S3, SQS, Eventbridge) with experience on rolling out enhancements across a distributed platform with scale in mind
  • Experience with shell scripting for *nix systems
  • Experience with Networking for web applications
  • Effective at communicating ideas through writing and diagramming
  • Comfortable working with a distributed development and ops team
  • Familiarity with AWS: ECS and cloud hosting, GitLab: CI/CD, Python: Django, Flask, aiohttp, Bash, Data: PostgreSQL, Redis, Monitoring: Datadog and Sentry, IaC: Terraform, Packer

Benefits

Fundamentals:

  • Medical / Dental / Vision / Disability / Life Insurance 
  • High Deductible Health Plan with Health Savings Account (HSA) option
  • Flexible Spending Account (FSA)
  • Access to coaches and therapists through Modern Health's platform
  • Generous Time Off 
  • Company-wide Collective Pause Days 

Family Support:

  • Parental Leave Policy 
  • Family Forming Benefit through Carrot
  • Family Assistance Benefit through UrbanSitter

Professional Development:

  • Professional Development Stipend

Financial Wellness:

  • 401k
  • Financial Planning Benefit through Origin

But wait there’s more…! 

  • Annual Wellness Stipend to use on items that promote your overall well being 
  • New Hire Stipend to help cover work-from-home setup costs
  • ModSquad Community: Virtual events like active ERGs, holiday themed activities, team-building events and more
  • Monthly Cell Phone Reimbursement

Equal Pay for Equal Work Act Information

Please refer to the ranges below to find the starting annual pay range for individuals applying to work remotely from the following locations for this role.


Compensation for the role will depend on a number of factors, including a candidate’s qualifications, skills, competencies, and experience and may fall outside of the range shown. Ranges are not necessarily indicative of the associated starting pay range in other locations. Full-time employees are also eligible for Modern Health's equity program and incredible benefits package. See our Careers page for more information.

Depending on the scope of the role, some ranges are indicative of On Target Earnings (OTE) and includes both base pay and commission at 100% achievement of established targets.

  • San Francisco Bay Area: $160,700 - $189,000 USD
  • All Other California Locations: $160,700 - $189,000 USD
  • Colorado: $136,600 - $160,700 USD
  • New York City: $160,700 - $189,000 USD
  • All Other New York Locations: $144,700 - $170,000 USD
  • Seattle: $160,700 - $189,000 USD
  • All Other Washington Locations: $144,700 - $170,000 USD

Below, we are asking you to complete identity information for the Equal Employment Opportunity Commission (EEOC). While we are required by law to ask these questions in the format provided by the EEOC, at Modern Health we know that gender is not binary, and we recognize that these categories do not reflect our employees' full range of identities.

See more jobs at Modern Health

Apply for this job

+30d

Senior Site Reliability Engineer (m/f/x)

commercetools | Europe (Remote)
golang, terraform, scala, Design, azure, graphql, kubernetes, AWS

commercetools is hiring a Remote Senior Site Reliability Engineer (m/f/x)

commercetools - we are:

  • Engaged: We didn't become the fastest-growing, highest-ever-valued SaaS software company in digital commerce with nearly 100% year-over-year growth by sitting on the sidelines.
  • Inspired: We continually explore what's possible. As the founder of the headless commerce concept, the leader in true composable commerce, and the visionaries behind MACH® — our patented tech has radically disrupted the world of enterprise ecommerce software. And we are just getting started!
  • Valued: Intelligent, resilient, passionate individuals hailing from over 50 countries across the globe, speaking over 43 languages, and collectively embracing diversity, encouraging inclusion, and fostering a culture of care.

 *We can only consider applicants within a commutable distance to our offices in Amsterdam, Berlin, London, Munich, or Valencia.

The Opportunity:

commercetools represents the collective work of numerous teams; each team building a fraction of the overall platform to create a singular, powerful platform for our users. The Special Delivery team focuses our energy on enabling all these teams, building in their own way, to deliver high quality software to the world.

Your Mission:

  • Communicate decisions and actions effectively and asynchronously to the team
  • Assist team members proactively and with priority
  • Divide work tickets into achievable tasks and milestones for the rest of the team
  • Act as a consistent source of knowledge and counsel for other engineers
  • Foster a culture in which the team feels psychologically safe to openly share their opinions
  • Start and execute a technical Request for Comments (RFC) process to evaluate several alternatives, lead a decision, and collect feedback along the way
  • Take leadership roles as part of an incident management team
  • Use systematic debugging to diagnose all issues within the scope of your domain

What you need to succeed:

  • 5+ years of SRE experience 
  • 2+ years of experience mentoring and supporting team members 
  • Experience writing automation tooling in Golang
  • Experience using IaC tooling (Terraform or Kubernetes)
  • Experience running production workloads in a major cloud provider (AWS, GCP or Azure)
  • Familiarity with driving architectural discussions and initiatives across teams
  • Experience providing high-quality code reviews to peers and junior engineers, both on and off the team, for development efforts critical to the team
  • Strong time management skills
  • Written and spoken English communication skills

Team Values:

Positivity. Negativity is the enemy of progress.

Trust & Transparency. Promote direct and continuous feedback.

Learning. Be proud if you’ve failed at something. Think big, start small, learn fast!

 

Tech at commercetools:

We Are Open Source And Innovative By Design

We make rapid progress by being early adopters of React, Scala, and GraphQL

We share & contribute to the open source community: https://github.com/sangria-graphql

⚙️ We <3 Automation and Machine Learning

 

We care about your Growth and Well-being

Competitive compensation package: Generous compensation structure consisting of salary, competitive stock option package, various benefits and perks

☀️ Remote Work: Up to 60 days/year from a country different from your base country

Open Learning & Development Budget

ct Academy: Regular internal training sessions

⌚️ Flexibility: Morning person or night owl? We believe in outcome and motivated employees

Mindset & Growth: A diverse, creative workspace with an international culture & learning environment

 

Are you ready? Come grow with us!

Are you looking for something else? Check out our Career Page and our Website for more information.

 

We are all different and that is what makes us stronger! We hire great people from a wide variety of backgrounds, not just because it’s the right thing to do, but because it makes our company better.

commercetools celebrates being a diverse environment and is proud to be an equal opportunities employer. If your professional profile aligns with our specific hiring requirements and company culture, then we encourage you to apply. We will assess your competencies, future potential, approach to learning and self-development, and passion, and not your age, color, national origin, religion, gender, gender identity or expression, sexual orientation, familial status, genetics, or disability.






See more jobs at commercetools

Apply for this job