Site Reliability Engineer Remote Jobs

+30d

Signify Health - Dallas, TX, Remote

Design ● mobile ● azure ● c++ ● kubernetes ● python ● AWS

Signify Health is hiring a Remote Site Reliability Engineer

Site Reliability Engineer (SRE) at Signify Health

Join Signify Health's vibrant Site Reliability Engineering team as a Site Reliability Engineer. We’re seeking passionate individuals from diverse technical backgrounds. The role reports to the Manager of Site Reliability Engineering. We offer a collaborative environment that values each team member's unique contribution and fosters an inclusive culture.

Your Role:

Develop strategies to improve the stability, scalability, and availability of our products.
Maintain and deploy observability solutions to optimize system performance.
Collaborate with cross-functional teams to enhance operational processes and service management.
Design, build, and maintain application stacks for product teams.
Create sustainable systems and services through automation.

Skills We’re Seeking:

An eagerness to grow and collaborate in the field of Site Reliability Engineering.
Intermediate understanding of scripting languages, such as Python or Bash.
Strong familiarity with cloud environments (Azure, AWS, or GCP) and a desire to develop further expertise.
Intermediate grasp of infrastructure as code, preferably with exposure to Terraform.
Intermediate understanding of Kubernetes and containerization technologies.
Intermediate understanding of CI/CD principles and willingness to guide and enforce best practices.
A proactive approach to identifying problems, performance bottlenecks, and areas for improvement.

The base salary hiring range for this position is $92,000 to $160,000. Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for incentive compensation, equity, and benefits.

In addition to your compensation, enjoy the rewards of an organization that puts our heart into caring for our colleagues and our communities. Eligible employees may enroll in a full range of medical, dental, and vision benefits, 401(k) retirement savings plan, and an Employee Stock Purchase Plan. We also offer education assistance, free development courses, paid time off programs, paid holidays, a CVS store discount, and discount programs with participating partners.

About Us:

Signify Health is helping build the healthcare system we all want to experience by transforming the home into the healthcare hub. We coordinate care holistically across individuals’ clinical, social, and behavioral needs so they can enjoy more healthy days at home. By building strong connections to primary care providers and community resources, we’re able to close critical care and social gaps, as well as manage risk for individuals who need help the most. This leads to better outcomes and a better experience for everyone involved.

Our high-performance networks are powered by more than 9,000 mobile doctors and nurses covering every county in the U.S., 3,500 healthcare providers and facilities in value-based arrangements, and hundreds of community-based organizations. Signify’s intelligent technology and decision-support services enable these resources to radically simplify care coordination for more than 1.5 million individuals each year while helping payers and providers more effectively implement value-based care programs.

To learn more about how we’re driving outcomes and making healthcare work better, please visit us at www.signifyhealth.com

Diversity and Inclusion are core values at Signify Health, and fostering a workplace culture reflective of that is critical to our continued success as an organization.

We are committed to equal employment opportunities for employees and job applicants in compliance with applicable law and to an environment where employees are valued for their differences.

See more jobs at Signify Health

+30d

Celonis - Remote, Spain

Master’s Degree ● nosql ● postgres ● RabbitMQ ● Design ● azure ● java ● docker ● elasticsearch ● python ● AWS ● javascript

Celonis is hiring a Remote Senior Site Reliability Engineer

We're Celonis, the global leader in Process Mining technology and one of the world's fastest-growing SaaS firms. We believe there is a massive opportunity to unlock productivity by placing data and intelligence at the core of business processes - and for that, we need you to join us.

The Team

As part of our scaling Actions Platform team, you'll have a huge impact on helping teams and engineers build and operate resilient, reliable and scalable systems. You'll have ownership over our product's health, ensuring end-to-end availability and peak performance.

We are on the path to providing a first-class service, so we want our product to be super healthy and reliable at all times!

The Role

Collaboration is a huge part of our Celonis culture! Within this role, you’ll help teams catch issues before they affect customers, and tie reliability to business outcomes.

By helping our Product teams understand the reliability of their services and how they can improve it, our teams and engineers will be able to build, deliver, and operate resilient, reliable systems.

The work you’ll do

Take ownership of complex issues related to performance, reliability, and scalability, from idea inception to production, including all required technical and organizational improvements.
Help our engineering teams gain full control over the stability and performance of their services.
Support and drive the investigation and resolution of incidents and issues in production.
Monitor and maintain object and data storage solutions.
Lead postmortems and root cause analysis to facilitate continuous improvement.
Design, write, and deliver software that enhances the availability, scalability, and efficiency of our products.
Proactively identify, plan, and execute improvement opportunities to minimize risks, address recurrent issues, automate manual processes, improve quality, and streamline our software deliveries.
Provide technical leadership on reliability to engineers, managers, and product managers.
Improve our monitoring, metrics, and KPIs, as well as define and implement missing SLOs.
Implement processes and automation to prevent problem recurrence.
Share acquired knowledge and document accordingly while implementing SRE best practices.
Guide a technical roadmap for reliability to enable the planning and building of reliable solutions using our infrastructure and developer productivity platform.

The Qualifications You Need

Typically 5+ years of experience in software engineering roles.
Master’s degree in Computer Science or equivalent experience and skill set.
Experience in developing and running large-scale production services with Docker and Kubernetes.
Experience working with in-memory data stores (e.g., Redis), RDBMS (e.g., Postgres), AMQP (e.g., RabbitMQ), and NoSQL (e.g., Elasticsearch).
Experience working with various public cloud providers (AWS, Azure, or GCP) and modern cloud monitoring and observability frameworks (e.g., Datadog).
Solid knowledge of scripting languages (e.g., Bash, Python, Ruby).
Experience with Java, JavaScript, or the Spring framework would be a plus.
Proven problem-solving skills and the ability to troubleshoot complex technical issues.
Deep commitment to maintaining high system reliability and availability.
Experience in supporting or mentoring other developers in running services reliably in production.
Excellent communication and collaboration skills to work effectively with cross-functional teams.

What Celonis can offer you:

The unique opportunity to work with industry-leading process mining technology
Investment in your personal growth and skill development (clear career paths, internal mobility opportunities, L&D platform, mentorships, and more)
Great compensation and benefits packages (equity (restricted stock units), life insurance, time off, generous leave for new parents from day one, and more)
Physical and mental well-being support (subsidized gym membership, access to counseling, virtual events on well-being topics, and more)
A global and growing team of Celonauts from diverse backgrounds to learn from and work with
An open-minded culture with innovative, autonomous teams
Business Resource Groups to help you feel connected, valued and seen (Black@Celonis, Women@Celonis, Parents@Celonis, Pride@Celonis, Resilience@Celonis, and more)
A clear set of company values that guide everything we do: Live for Customer Value, The Best Team Wins, We Own It, and Earth Is Our Future

About Us

Since 2011, Celonis has helped thousands of the world's largest and most valued companies deliver immediate cash impact, radically improve customer experience and reduce carbon emissions. Its Process Intelligence platform uses industry-leading process mining technology and AI to present companies with a living digital twin of their end-to-end processes. For the first time, everyone in an organisation has a common language about how the business works, visibility into where value is hidden and the ability to capture it. Celonis is headquartered in Munich (Germany) and New York (USA) and has more than 20 offices worldwide.

Join us as we make processes work for people, companies and the planet.

Celonis is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. Different makes us better.

See more jobs at Celonis

Sr. Site Reliability Engineer

+30d

hims & hers - Remote

kotlin ● terraform ● sql ● Design ● ansible ● git ● java ● c++ ● docker ● postgresql ● mysql ● kubernetes ● python

hims & hers is hiring a Remote Sr. Site Reliability Engineer

Hims & Hers Health, Inc. (better known as Hims & Hers) is the leading health and wellness platform, on a mission to help the world feel great through the power of better health. We are revolutionizing telehealth for providers and their patients alike. Making personalized solutions accessible is of paramount importance to Hims & Hers and we are focused on continued innovation in this space. Hims & Hers offers nonprescription products and access to highly personalized prescription solutions for a variety of conditions related to mental health, sexual health, hair care, skincare, heart health, and more.

Hims & Hers is a public company, traded on the NYSE under the ticker symbol “HIMS”. To learn more about the brand and offerings, you can visit hims.com and forhers.com, or visit our investor site. For information on the company’s outstanding benefits, culture, and its talent-first flexible/remote work approach, see below and visit www.hims.com/careers-professionals.

About the Role:

We are seeking a Site Reliability Engineer to help build a reliable web experience for our users. We believe that moving fast is our competitive advantage, and enables us to better serve our users. We also know that the faster we move, the more likely we are to break things.

You Will:

Design and implement SRE practices ensuring availability, scalability and observability of production systems with a strong focus on excellent customer experience
Actively seek and identify opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation.
Use automation extensively to design, configure, manage, and monitor systems in support of our product development teams
Manage Infrastructure through automation (Infrastructure as Code)
Manage incidents and emergency response, track outages, ensure data integrity and engineer releases to promote safe, efficient and rapid deployments
Handle emergency response either by being on-call or by reacting to symptoms according to monitoring and escalation when needed
Improve the codebase by resolving logic issues, deprecating unused code, etc.
Implement monitoring, logging, alerting and SLO reporting
Identify Service Level Indicators (SLIs) that will align the team to meet the availability and performance objectives.
Run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent incident recurrence.

You Have:

8+ years as a software engineer, shipping production code.
5+ years of experience as a Site Reliability Engineer.
Experience with service-oriented architectures and microservices at scale
Strong proficiency with RDBMS databases (PostgreSQL, MySQL, SQL Server, etc.)
Strong proficiency in SQL scripting
Proficiency developing in one or more languages such as Java, Kotlin, Python, and/or others
Ability to use containers and orchestration frameworks (Kubernetes, Docker, Container registries etc.)
Proficiency in Git or other VCS
Experience with configuring, customizing, and extending monitoring tools (Datadog, Prometheus, New Relic etc.)
Excellent debugging and troubleshooting skills
Strong technical competency, with a data-driven analytical approach towards solving complex challenges
A systematic problem-solving approach, coupled with strong and effective communication skills and a sense of drive
Nice-to-have: Experience with Terraform or other IAC tools such as Chef, Puppet or Ansible

Our Benefits (there are more but here are some highlights):

Competitive salary & equity compensation for full-time roles
Unlimited PTO, company holidays, and quarterly mental health days
Comprehensive health benefits including medical, dental & vision, and parental leave
Employee Stock Purchase Program (ESPP)
Employee discounts on hims & hers & Apostrophe online products
401k benefits with employer matching contribution
Offsite team retreats

Outlined below is a reasonable estimate of H&H’s compensation range for this role for US-based candidates. If you're based outside of the US, your recruiter will be able to provide you with an estimated salary range for your location.

The actual amount will take into account a range of factors that are considered in making compensation decisions including but not limited to skill sets, experience and training, licensure and certifications, and location. H&H also offers a comprehensive Total Rewards package that may include an equity grant.

Consult with your Recruiter during any potential screening to determine a more targeted range based on location and job-related factors. We don’t ever want the pay range to act as a deterrent from you applying!

An estimate of the current salary range for US-based employees is $135,000 to $150,000 USD.

We are focused on building a diverse and inclusive workforce. If you’re excited about this role, but do not meet 100% of the qualifications listed above, we encourage you to apply.

Hims is an Equal Opportunity Employer and considers applicants for employment without regard to race, color, religion, sex, orientation, national origin, age, disability, genetics or any other basis forbidden under federal, state, or local law. Hims considers all qualified applicants in accordance with the San Francisco Fair Chance Ordinance.

Hims & Hers is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, you may contact us at accommodations@forhims.com. Please do not send resumes to this email address.

For our California-based applicants – Please see our California Employment Candidate Privacy Policy to learn more about how we collect, use, retain, and disclose Personal Information.

See more jobs at hims & hers

Site Reliability Engineer II

+30d

Classy - Remote, US

terraform ● api ● c++ ● docker ● python ● AWS

Classy is hiring a Remote Site Reliability Engineer II

Classy, an affiliate of GoFundMe, is a Public Benefit Corporation and giving platform that enables nonprofits to connect supporters with the causes they care about. Classy's platform provides powerful and intuitive fundraising tools to convert and retain donors. Since 2011, Classy has helped nonprofits mobilize and empower the world for good by helping them raise over $7 billion. Classy also hosts the Collaborative conference and the Classy Awards to spotlight the innovative work nonprofits are implementing around the globe. For more information, visit www.classy.org.

Classy's Engineering team is hiring a Site Reliability Engineer II to join a team of engineers to support and extend our product infrastructure and events management platform to achieve 99.999% availability. The ideal candidate is comfortable leading technical projects and supporting a best-in-class infrastructure and reliability framework. As a member of our team, you will have the opportunity to keep the Classy online fundraising platform for nonprofits running smoothly without interruption.

What you’ll do:

Play a critical role in contributing to and maintaining a robust, fault-tolerant, global payments platform processing billions of dollars annually.
Maintain and enhance playbooks and runbooks
Work across engineering to improve SLO/SLI framework
Work to improve our incident management process
Perform daily assessments of Classy’s stack, a distributed public cloud environment, for a comprehensive understanding of capacity trending, vulnerabilities, and stability
Participate in an engineering culture of “always be learning” where sharing and learning from failures is celebrated and giving and receiving constructive, candid feedback is highly encouraged.

What you bring (Required):

Bachelor’s Degree in Computer Science or a related field, or equivalent work experience.
3+ years of experience implementing AWS services (ECS, EKS, EC2, IAM, CloudWatch, Lambda, etc.)
3+ years of experience using scripting languages such as Python, Bash, or Go.
3+ years of experience using APM tools such as New Relic, Datadog, Splunk, etc.
Solid understanding of Docker and container orchestration
Solid understanding of microservice architectures, database servers, and API services.
Solid understanding of distributed data models with experience debugging distributed systems with high data loads.
Ability to understand product requirements and translate them into technical subtasks
A deep sense of quality and sharp engineering skills with strong computer science fundamentals.
A detail-oriented mindset with the ability to rapidly learn new concepts and technologies and communicate complex ideas
Experience with Scrum/Agile development methodologies.
Ability to participate in on-call rotation for after-hours needs

What would be awesome to have (Preferred):

Experience building PCI compliant systems
Experience in payment processing systems
Experience developing high-volume transaction systems
Experience with Blue/Green deployment infrastructure
Experience with Terraform
Passion for building fault tolerant and secure platforms

Why you’ll love it here:

Market competitive pay
Rich healthcare benefits including employer paid premiums for medical/dental/vision (100% for employee only plans and 85% for employee + dependent plans) and employer HSA contributions.
401(k) retirement plan with company matching
Hybrid workplace with fully remote flexibility for many roles
Monetary support for new hire setup, hybrid work & wellbeing, family planning, and commuting expenses
A variety of mental and wellness programs to support employees
Generous paid parental leave and family planning stipend
Supportive time off policies including vacation, sick/mental health days, volunteer days, company holidays, and a floating holiday
Learning & development and recognition programs
Gives Back Program where employees can nominate a fundraiser every month for a donation from the company
Inclusion, diversity, equity, and belonging are vital to our priorities and we continue to evolve our strategy to ensure DEI is embedded in all processes and programs at GoFundMe. Our Diversity, Equity, and Inclusion team is always finding new ways for our company to uphold and represent the experiences of all of the people in our organization.
Employee resource groups
Your work has a real purpose and will help change lives on a global scale.
You’ll be a part of a fun, supportive team that works hard and celebrates accomplishments together.
We live by our core values: consider everything, do the right thing, spread empathy, delight the customer, and give back.
We are a certified Great Place to Work, are growing fast and have incredible opportunities ahead!
Our commitment to Sustainability. Classy exists to create a sustainable world for all.

Dedication to Diversity

Classy is working toward building a more diverse and inclusive environment that is representative of individuals of all backgrounds, experiences, and lifestyles, allowing all employees to feel comfortable being their true, authentic selves in a space that enables productivity and meaningful work.

The total annual salary for this full-time position is $110,000 - $150,000 + equity + benefits. As this is a remote position, the salary range was determined by role, level, and possible location across the US. Individual pay is determined by work location and additional factors including job-related skills, experience, and relevant education or training.

Your recruiter can share more about the specific salary range based on your location during the hiring process.

If you require a reasonable accommodation to complete a job application or a job interview or to otherwise participate in the hiring process, please contact us at accommodationrequests@gofundme.com.

Global Data Privacy Notice for Job Candidates and Applicants:

Depending on your location, the General Data Protection Regulation (GDPR) or certain US privacy laws may regulate the way we manage the data of job applicants. Our full notice outlining how data will be processed as part of the application procedure for applicable locations is available here. By submitting your application, you are agreeing to our use and processing of your data as required.

Learn more about GoFundMe:

For recent company news and announcements, visit our Newsroom.

See more jobs at Classy

+30d

Podium - Remote, US

Bachelor's degree ● terraform ● Design ● ansible ● azure ● ruby ● docker ● kubernetes ● linux ● python ● AWS

Podium is hiring a Remote Site Reliability Engineer

At Podium, our mission is to help local businesses win. Our lead conversion platform, powered by AI and integrations, helps local businesses convert leads faster, communicate easier, and make more sales. Every day, thousands of local businesses utilize our review management, communication, marketing, and payments products.

Our work and focus on helping local businesses thrive has been recognized across the industry, including Forbes’ Next Billion Dollar Startups, Forbes’ Cloud 100, the Inc. 5000, and Fast Company’s World’s Most Innovative Companies.

At Podium, we believe in fostering a culture that thrives on hiring and developing exceptional talent. Our operating principles serve as a compass, guiding daily behavior and decision-making, and ensure we hire people who will thrive at Podium. If you resonate with our operating principles and are energized by our mission, Podium will be a great place for you!

The Role:

A Site Reliability Engineer borders the worlds of software engineering and systems engineering. At Podium, the SRE team drives our products to success by building a stable, scalable, sustainable, and slick system. We permanently sit and sup with the product engineering teams to address all of their needs, and work as an SRE guild to build a world-class platform for our products to run on. We're currently targeting a senior SRE to come in and deliver impact from day one.

What you will be doing:

Working with the following technologies: Kubernetes, Helm, Docker, AWS, Terraform, Datadog, Prometheus, Ansible, StrongDM, Python, Go, Ruby, GitLab and GitLab CI.
Engaging with Podium's engineering community to identify potential areas of improvement or pain points and making Podium's systems safer and more pleasant to operate.
Participating in an on-call rotation for the services the team owns, triaging and addressing production as well as development issues.
Working cross-functionally with different teams to make sure that there is no downtime for our products.
Mentoring junior engineers on the team.

What you should have:

Bachelor’s degree in a technical field or relevant work experience.
4+ years of experience working alongside a production system in a software engineering or systems engineering role
3+ years deploying, operating and debugging server software on Linux
Curiosity and the desire to learn
Ability to take a rotating on-call shift

What we hope you have:

Experience with distributed systems and microservices
Practical knowledge of system design
Cloud computing, such as AWS, GCP, or Azure
SOC2, HIPAA, PCI, or other regulatory or compliance standards
Building and maintaining a CI/CD pipeline
Heavy Infrastructure experience

BENEFITS

Open and transparent culture - Check out this video to see what it’s like to work at Podium
Life insurance, long and short-term disability coverage
Paid maternity and paternity leave
Fertility Benefits
Generous vacation time, plus three 4-day summer holiday weekends
Excellent medical, dental, and vision benefits
401k Plan
Bi-annual swag drops with cool Podium gear and apparel
A stellar HQ (Utah) gym with local professional coaches and classes offered
Onsite HQ (Utah) child care center, subsidized for employees
Additional benefits for fully remote employees

Podium is an equal opportunity employer. Podium provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, national origin, sexual orientation, gender identity or expression, age, disability, genetic information, marital status or veteran status.

See more jobs at Podium

Mid Site Reliability Engineer

+30d

Remote - Remote-Australia

terraform ● java ● docker ● kubernetes ● linux ● jenkins ● python ● AWS

Remote is hiring a Remote Mid Site Reliability Engineer

About Remote

Remote is solving global remote organizations’ biggest challenge: employing anyone anywhere compliantly. We make it possible for businesses big and small to employ a global team by handling global payroll, benefits, taxes, and compliance. Check out remote.com/how-it-works to learn more or if you’re interested in adding to the mission, scroll down to apply now.

Please take a look at remote.com/handbook to learn more about our culture and what it is like to work here. Not only do we encourage folks from all ethnic groups, genders, sexuality, age and abilities to apply, but we prioritize a sense of belonging. You can check out independent reviews by other candidates on Glassdoor or look up the results of our candidate surveys to see how others feel about working and interviewing here.

All of our positions are fully remote. You do not have to relocate to join us!

What this job can offer you

Managing and improving our existing infrastructure
Helping us build the next generation of our platform using tools like Kubernetes, Terraform and Docker
Streamlining and automating our deployment processes
Working closely with our Security team to keep on top of potential threats/patches
Supporting our engineers and product teams to improve overall scalability, stability and reliability

What you bring

Kubernetes
AWS
Terraform
CI/CD (GitLab, Github, Jenkins or similar)
Docker
Bash scripting

Nice to have

Experience with one back-end programming language (Elixir, Clojure, Java, Node.js, Python, etc.)
Experience running and configuring Linux systems in a non-cloud environment
Excellent communication and interpersonal skills
Holistic debugging skills
Security knowledge and capabilities from a defensive and offensive standpoint

Practicals

You'll report to: Engineering Manager
Team: Engineering
Location: remotely within Australia
Employment type: Full time, employee
Application open date: 30th January 2024
Application closing date: 27th February 2024
Start date: As soon as possible

Remote Compensation Philosophy

Remote's Total Rewards philosophy is to ensure fair, unbiased compensation and fair equity pay along with competitive benefits in all locations in which we operate. We do not agree to or encourage cheap-labor practices and therefore we make sure to pay above in-location rates. We hope to inspire other companies to support global talent-hiring and bring local wealth to developing countries.

Application process

Interview with recruiter
Interview with future manager
Interview with team members (no managers present)
Prior employment verification check

Benefits

Our full benefits & perks are explained in our handbook at remote.com/r/benefits. As a global company, each country works differently, but some benefits/perks are for all Remoters:

work from anywhere
unlimited personal time off (minimum 4 weeks)
quarterly company-wide day off for self care
flexible working hours (we are async)
16 weeks paid parental leave
mental health support services
stock options
learning budget
home office budget & IT equipment
budget for local in-person social events or co-working spaces

How you’ll plan your day (and life)

We work async at Remote which means you can plan your schedule around your life (and not around meetings). Read more at remote.com/async.

You will be empowered to take ownership and be proactive. When in doubt you will default to action instead of waiting. Your life-work balance is important and you will be encouraged to put yourself and your family first, and fit work around your needs.

If that sounds like something you want, apply now!

How to apply

Please fill out the form below and upload your CV in PDF format.
We kindly ask you to submit your application and CV in English, as this is the standardised language we use here at Remote.
If you don’t have an up to date CV but you are still interested in talking to us, please feel free to add a copy of your LinkedIn profile instead.

We will ask you to voluntarily tell us your pronouns at interview stage, and you will have the option to answer our anonymous demographic questionnaire when you apply below. As an equal employment opportunity employer it’s important to us that our workforce reflects people of all backgrounds, identities, and experiences and this data will help us to stay accountable. We thank you for providing this data, if you choose to.

See more jobs at Remote

Senior Site Reliability Engineer (Brazil)

+30d

Sezzle - Colombia Remote

terraform ● sql ● Design ● c++ ● docker ● kubernetes ● linux ● python ● AWS

Sezzle is hiring a Remote Senior Site Reliability Engineer (Brazil)

This is a remote role for candidates based in Latin America.

About the Role:

We are looking for a Site Reliability Engineer to work on our core Infrastructure and Security team, to assist us with designing, building, running, improving and scaling the infrastructure that engineering and data teams use to power their services. Your duties will include the development, testing, and maintenance of our serving and data platforms, using a combination of cloud products, open source tools and internal applications. Your duties will blend software development and operations in order to continuously automate our environments. You should be able to build high-quality, scalable solutions for a variety of problems.

Our Company:

Sezzle is a cutting-edge fintech company whose long-standing mission is to financially empower the next generation. Sezzle has built a payment platform that increases purchasing power for consumers by offering interest-free installment plans. This increase in purchasing power for consumers leads to increased sales and basket sizes for the numerous eCommerce merchants that currently work with Sezzle.

What Makes Working at Sezzle Awesome?

At Sezzle, we are more than just brilliant engineers, passionate data enthusiasts, out-of-the-box thinkers, and determined innovators; we are skilled musicians, yogis, cyclists, chefs, golfers, dog-lovers, and rock-climbers. We believe in surrounding ourselves with not only the best and the brightest individuals, but those that are unique and purpose-driven in all that they do. Our culture is not defined by a certain set of perks designed to give the illusion of the traditional startup culture, but rather, it is the visible example living in every employee that we hire.

Responsibilities:

Design, build and maintain scalable infrastructure for running our systems, based on Kubernetes, Redshift and additional AWS services and products.
Help the product teams quickly build out MVP products to test new solutions on the market.
Maintain and develop monitoring and alerting solutions to improve the on-call experience.
Assist product developers in debugging and triaging production issues.
Be the first line of defense for our operational environments, triaging and resolving problems as they occur. You will be on an on-call rotation.
Design and scale platform and data architectures to sustain rapid user growth.
Level up the teams through pairing, code review, and mentoring.
Bring and share with our team extensive experience with industry best practices in software development.

Minimum Requirements:

Bachelor's in computer science (preferred) or equivalent related experience
5+ years of overall software, data, deployments and platform infrastructure experience.

Ideal Skills & Experience:

Experience with building and/or serving REST APIs using Go or a similar language.
Experience with Relational Databases, SQL and ORM technologies.
Strong overall Linux knowledge.
DevOps experience with CI/CD pipelines, Docker and Kubernetes, and cloud computing platforms like AWS.
Experience with deployment/provisioning tools like Terraform, Helm, Ansible.
Experience with implementing and maintaining observability and monitoring tools - Prometheus, Datadog, NewRelic, Grafana, Loki or similar.
Experience in ETL/ELT pipelines using Python and Open-source tools such as DBT.
Proficiency in building and maintaining large-scale data warehousing technologies such as Redshift.

About You:

A+ character. We are team-first here at Sezzle.
A hard-working mentality. It’s early and there is still a lot to build.
An excellent communicator.
A fun attitude. Life’s too short. We can have fun while we work hard on cool things.
Smarts. We need people that are smart enough to make decisions on their own and also smart enough to know when they need input from others.

See more jobs at Sezzle

Senior Site Reliability Engineer (Costa Rica)

+30d

SezzleColombia Remote

terraform ● sql ● Design ● c++ ● docker ● kubernetes ● linux ● python ● AWS

Sezzle is hiring a Remote Senior Site Reliability Engineer (Costa Rica)

This is a remote role for candidates based in Latin America.

Responsibilities:

Design, build and maintain scalable infrastructure for running our systems, based on Kubernetes, Redshift and additional AWS services and products.
Help the product teams quickly build out MVP products to test new solutions on the market.
Maintain and develop monitoring and alerting solutions to improve the on-call experience.
Assist product developers in debugging and triaging production issues.
Be the first line of defense for our operational environments, triaging and resolving problems as they occur. You will be on an on-call rotation.
Design and scale platform and data architectures to sustain rapid user growth.
Level up the teams through pairing, code review, and mentoring.
Bring and share with our team extensive experience with industry best practices in software development.

Minimum Requirements:

Bachelor's in computer science (preferred) or equivalent related experience
5+ years of overall software, data, deployments and platform infrastructure experience.

Ideal Skills & Experience:

Experience with building and/or serving REST APIs using Go or a similar language.
Experience with Relational Databases, SQL and ORM technologies.
Strong overall Linux knowledge.
DevOps experience with CI/CD pipelines, Docker and Kubernetes, and cloud computing platforms like AWS.
Experience with deployment/provisioning tools like Terraform, Helm, Ansible.
Experience with implementing and maintaining observability and monitoring tools - Prometheus, Datadog, NewRelic, Grafana, Loki or similar.
Experience in ETL/ELT pipelines using Python and Open-source tools such as DBT.
Proficiency in building and maintaining large-scale data warehousing technologies such as Redshift.

About You:

A+ character. We are team-first here at Sezzle.
A hard-working mentality. It’s early and there is still a lot to build.
An excellent communicator.
A fun attitude. Life’s too short. We can have fun while we work hard on cool things.
Smarts. We need people that are smart enough to make decisions on their own and also smart enough to know when they need input from others.

See more jobs at Sezzle

+30d

AristaPune, India, Remote

nosql ● mongodb ● docker ● postgresql ● kubernetes ● jenkins ● python

Arista is hiring a Remote Site Reliability Engineer

Job Description

Responsibilities:

Ensure the scalability, performance, and resilience of our suite of products
Work with the development and product team to establish the right monitoring and alerting strategy
Develop build, test, and deployment automation that seamlessly targets multiple cloud regions
Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks
Optimize telemetry platform to identify customer-impacting events while providing relevant data to drive debugging
Partner with the engineering team to optimize the performance of services for cloud architecture
Debug Live Site events and conduct follow-up post-mortem and RCA analysis

Qualifications

B.E/B.Tech in Computer Science or equivalent
5 to 7 years of relevant experience
Scripting languages like Bash, Python, etc.
Exposure to operational knowledge of managing applications in AWS/GCP
Experienced in automating software build, deployment, and server configuration management using tools such as Puppet, Chef, and Jenkins
Hands-on experience with Linux/Unix Administration
Good understanding of containerization concepts - docker, ECS, EKS, Kubernetes
Experience with build tools such as Jenkins
Working experience with NoSQL and relational databases such as MongoDB and PostgreSQL
Understanding of basic networking concepts

+30d

Modern HealthRemote - US

terraform ● Design ● azure ● postgresql ● python ● AWS

Modern Health is hiring a Remote Senior Site Reliability Engineer

Modern Health

Modern Health is a mental health benefits platform for employers. We are the first global mental health solution to offer employees access to one-on-one, group, and self-serve digital resources for their emotional, professional, social, financial, and physical well-being needs, all within a single platform. Whether someone wants to proactively manage stress or treat depression, Modern Health guides people to the right care at the right time. We empower companies to help all their employees be the best version of themselves, and believe in meeting people wherever they are in their mental health journey.

We are a female-founded company backed by investors like Kleiner Perkins, Founders Fund, John Doerr, Y Combinator, and Battery Ventures. We partner with 500+ global companies like Lyft, Electronic Arts, Pixar, Clif Bar, Okta, and Udemy that are taking a proactive approach to mental health care for their employees. Modern Health has raised more than $170 million in less than two years with a valuation of $1.17 billion, making Modern Health the fastest entirely female-founded company in the U.S. to reach unicorn status.

We tripled our headcount in 2021 and as a hyper-growth company with a fully remote workforce, we prioritize our people-first culture (winning awards including Fortune's Best Workplaces in the Bay Area 2021). To protect our culture and help our team stay connected, we require overlapping hours for everyone. While many roles may function from anywhere in the world—see individual job listing for more—team members who live outside the Pacific time zone must be comfortable working early in the morning or late at night; all full-time employees must work at least six hours between 8 am and 5 pm Pacific time each workday.

We are looking for driven, creative, and passionate individuals to join in our mission. An inclusive and diverse culture is a key component of mental well-being in the workplace, and that starts with how we build our own team. If you're excited about a role, we'd love to hear from you!

The Role

You'll be given lots of responsibility and the opportunity to have true ownership as we build out the product. This is a unique opportunity to use your engineering powers to make a direct impact in people's lives. We need a Site Reliability Engineer who is enthusiastic about building reliable, scalable, and flexible systems to support our growing team, product, and user base. You'll work with other engineers to reliably release and maintain services, and help define and meet internal and customer-facing SLAs and SLOs.

This position is not eligible to be performed in Hawaii.

What You’ll Do

Manage and orchestrate Cloud Resource (AWS) configuration using Infrastructure As Code (Terraform) to empower engineering staff to embrace a DevOps culture of Self Service Ownership
Develop and govern Observability (Datadog) best practices for tracking platform performance and health trends to meet customer SLAs and lead technical decisions with strong supporting evidence
Create solutions that dynamically scale based on demand with enough flexibility to pivot for fast changing project requirements while maintaining a balance of good versus perfect
Provide strong and consistent communication updates on technical progress or blockers to keep stakeholders informed while additionally creating appropriate documentation on technical design to spread knowledge and reduce information silos
Participate in the 24/7 on-call rotation, respond to critical alerts, and follow documented incident investigation procedures to reestablish customer-facing feature availability
Maintain HIPAA, GDPR, SOC-2 compliance and general security through best practice implementation

Who You Are

3+ years of experience in software engineering, with 2 years experience in DevOps
Cloud Provider (AWS, GCP, Azure) experience on managing resources through Infrastructure As Code (Terraform)
Container Orchestration (ECS or K8s) experience to confidently build, test, and release containerized applications for multiple environments and regions.
Knowledge of Observability best practices across common cloud resources (EC2, ECS, RDS, DynamoDB, S3, SQS, Eventbridge) with experience on rolling out enhancements across a distributed platform with scale in mind.
Experience with shell scripting for *nix systems
Experience with Networking for web applications
Effective at communicating ideas through writing and diagramming
Comfortable working with a distributed development and ops team
Familiarity with: AWS (ECS and cloud hosting); GitLab (CI/CD); Python (Django, Flask, aiohttp); Bash; data stores (PostgreSQL, Redis); monitoring (Datadog and Sentry); IaC (Terraform, Packer)

Benefits

Fundamentals:

Medical / Dental / Vision / Disability / Life Insurance
High Deductible Health Plan with Health Savings Account (HSA) option
Flexible Spending Account (FSA)
Access to coaches and therapists through Modern Health's platform
Generous Time Off
Company-wide Collective Pause Days

Family Support:

Parental Leave Policy
Family Forming Benefit through Carrot
Family Assistance Benefit through UrbanSitter

Professional Development:

Professional Development Stipend

Financial Wellness:

401k
Financial Planning Benefit through Origin

But wait there’s more…!

Annual Wellness Stipend to use on items that promote your overall well being
New Hire Stipend to help cover work-from-home setup costs
ModSquad Community: Virtual events like active ERGs, holiday themed activities, team-building events and more
Monthly Cell Phone Reimbursement

Equal Pay for Equal Work Act Information

Please refer to the ranges below to find the starting annual pay range for individuals applying to work remotely from the following locations for this role.

Compensation for the role will depend on a number of factors, including a candidate’s qualifications, skills, competencies, and experience and may fall outside of the range shown. Ranges are not necessarily indicative of the associated starting pay range in other locations. Full-time employees are also eligible for Modern Health's equity program and incredible benefits package. See our Careers page for more information.

Depending on the scope of the role, some ranges are indicative of On Target Earnings (OTE) and include both base pay and commission at 100% achievement of established targets.

San Francisco Bay Area

$138,500—$162,900 USD

All Other California Locations

$138,500—$162,900 USD

Colorado

$117,725—$138,465 USD

New York City

$138,500—$162,900 USD

All Other New York Locations

$124,450—$146,610 USD

Seattle

$138,500—$162,900 USD

All Other Washington Locations

$124,450—$146,610 USD

Below, we are asking you to complete identity information for the Equal Employment Opportunity Commission (EEOC). While we are required by law to ask these questions in the format provided by the EEOC, at Modern Health we know that gender is not binary, and we recognize that these categories do not reflect our employees' full range of identities.

See more jobs at Modern Health

Site Reliability Engineer - SRE

+30d

Now1Atlanta, GA, Remote

Design ● kubernetes ● linux ● jenkins ● python

Now1 is hiring a Remote Site Reliability Engineer - SRE

Job Description

Role: Site Reliability Engineer
Location: Atlanta, GA OR Dallas OR Austin, TX
Duration: Long Term or 6+ Months contract to Hire

Note: Remote Possible, however candidates will move to work onsite/Hybrid eventually. Please make sure you are comfortable with this.

EAD holders who can work on W2 are allowed. C2C is an option for candidates who have their own corporation.

Job description:

3-5 years of Site Reliability Engineer experience on the Google platform.

Strong experience in Google Cloud platform.

As a Staff Software Engineer, you will be a core player on the product team and are expected to build and grow the skillsets of the more junior Engineers. As a Staff Site Reliability Engineer, you will be responsible for building and supporting the platform/application infrastructure of one of the largest retailers in the world. This will require you to maintain high site uptime/availability while embracing rapid change and growth using a strong DevOps mindset of continuous delivery and site automation.

Qualifications

Preferred Qualifications:

3-7 years of professional experience in engineering
Hands on experience in Site Reliability Engineering and solving problems through automation and instrumentation
Experience with Jenkins for CI/CD pipeline creation and CI/CD automation
Experience with Kubernetes implementation with Google
Proficient in a Linux or Unix based environment.
Proficiency in supporting a 24x7 operation.
Experience in a cloud computing platform and the associated automation patterns it provides, preferably GCP.
Deep understanding of an object-oriented language, preferably the latest version of Java.
Proficient in a modern scripting language like Go or Python
Proficient in production systems design including High Availability, Disaster Recovery, Performance, Efficiency, and Security user, application performance, system, log, time-series, and dashboarding.
Proficient in a modern infrastructure automation toolkit such as Terraform/Helm

See more jobs at Now1

+30d

ClassyRemote, US

Design ● api ● c++ ● python ● AWS

Classy is hiring a Remote Senior Site Reliability Engineer

Classy, an affiliate of GoFundMe, is a Public Benefit Corporation and giving platform that enables nonprofits to connect supporters with the causes they care about. Classy's platform provides powerful and intuitive fundraising tools to convert and retain donors. Since 2011, Classy has helped nonprofits mobilize and empower the world for good by helping them raise over $5 billion. Classy also hosts the Collaborative conference and the Classy Awards to spotlight the innovative work nonprofits are implementing around the globe. For more information, visit www.classy.org.

About the role:

Classy's Engineering team is hiring a Senior Site Reliability Engineer to join a team of engineers to support and extend our product infrastructure and events management platform to achieve 99.999% availability. The ideal candidate is comfortable leading technical projects and supporting a best-in-class infrastructure and reliability framework. As a member of our team, you will have the opportunity to keep the Classy online fundraising platform for nonprofits running smoothly without interruption.

What you’ll accomplish:

Play a critical role in contributing to and maintaining a robust, fault-tolerant, global payments platform processing billions of dollars annually.
Maintain and enhance playbooks and runbooks
Work across engineering to improve SLO/SLI framework
Work to improve our incident management process
Perform daily assessments of Classy’s stack and distributed public cloud environment for a comprehensive understanding of capacity trending, vulnerabilities, and stability
Mentor engineers to become proficient developers using best software development practices and processes.
Participate in an engineering culture of “always be learning” where sharing and learning from failures are celebrated and giving and receiving constructive, candid feedback is highly encouraged.

What you bring (Required):

Bachelor’s Degree in Computer Science, or a related field, or equivalent work experience.
5+ years building highly scalable projects involving cloud-based infrastructure design and implementation.
5+ years of experience using scripting languages such as Python, Bash, or Javascript.
Strong APM experience using tools such as NewRelic, DataDog, Splunk, etc.
Solid understanding of microservice architectures, database servers, and API services.
Solid understanding of distributed data models with experience debugging distributed systems with high data loads.
High-level proficiency in implementing AWS services (ECS, EKS, EC2, IAM, Cloudwatch, etc.)
High-level understanding of security practices for applications and distributed environments, and of edge protection and observability tools such as Cloudflare.
High-level understanding of infrastructure as code (IaC)
Experience with Blue/Green deployment infrastructure.
Ability to understand product requirements and translate them into technical subtasks.
Experience with Scrum/Agile development methodologies.
Ability to participate in an on-call rotation for after-hours needs.

What would be awesome to have (preferred):

Experience building PCI compliant systems
Experience in payment processing systems
Experience developing high-volume transaction systems
Passion for building fault tolerant and secure platforms

Why you’ll love it here:

Market competitive pay
Rich healthcare benefits including employer paid premiums for medical/dental/vision (100% for employee only plans and 85% for employee + dependent plans) and employer HSA contributions.
401(k) retirement plan with company matching
Hybrid workplace with fully remote flexibility for many roles
Monetary support for new hire setup, hybrid work & wellbeing, family planning, and commuting expenses
A variety of mental and wellness programs to support employees
Generous paid parental leave and family planning stipend
Supportive time off policies including vacation, sick/mental health days, volunteer days, company holidays, and a floating holiday
Learning & development and recognition programs
Gives Back Program where employees can nominate a fundraiser every month for a donation from the company
Inclusion, diversity, equity, and belonging are vital to our priorities and we continue to evolve our strategy to ensure DEI is embedded in all processes and programs at GoFundMe. Our Diversity, Equity, and Inclusion team is always finding new ways for our company to uphold and represent the experiences of all of the people in our organization.
Employee resource groups
Your work has a real purpose and will help change lives on a global scale.
You’ll be a part of a fun, supportive team that works hard and celebrates accomplishments together.
We live by our core values: consider everything, do the right thing, spread empathy, delight the customer, and give back.
We are a certified Great Place to Work, are growing fast and have incredible opportunities ahead!
Our commitment to Sustainability. Classy exists to create a sustainable world for all.

Dedication to Diversity

The total annual salary for this full-time position is $130,000 - $175,000 + equity + benefits. As this is a remote position, the salary range was determined by role, level, and possible location across the US. Individual pay is determined by work location and additional factors including job-related skills, experience, and relevant education or training.

Your recruiter can share more about the specific salary range based on your location during the hiring process.

If you require a reasonable accommodation to complete a job application or a job interview or to otherwise participate in the hiring process, please contact us at accommodationrequests@gofundme.com.

See more jobs at Classy

+30d

PlaysonRemote job, Remote

jira ● terraform ● B2B ● Design ● git ● docker ● elasticsearch ● kubernetes ● jenkins ● python ● AWS

Playson is hiring a Remote Site Reliability Engineer

Playson is a B2B game provider with 11 years of experience on the market. Since 2012 we have ambitiously developed worldwide recognition in the industry. Nowadays, our main focus is on regulated European Markets and we operate in 20+ different jurisdictions. As of 2023, we are continuously working on enhancing our portfolio, encompassing best practices in order to meet the highest standards of technology, design, support and interoperability.

We are looking for a Site Reliability Engineer. It is a position in the Platform Tribe, SRE Stream, FireX Squad, responsible for the automation and high-load infrastructure maintenance.

To succeed in the advertised role, you have:
― Strong experience with issues processing (RCA, Postmortems practices).
― Strong understanding of Kubernetes (K8s) — Including deployment, scaling, troubleshooting, and managing containerized applications.
― Proficiency in AWS services — Specifically, expertise in Amazon Elastic Kubernetes Service (EKS), EC2, RDS, CloudFront, and other relevant services.
― Infrastructure as Code (IaC) — Terraform is a must-have.
― Containerization technologies — Knowledge of Docker, including creating and managing Docker images and containers.
― CI/CD — Familiarity with continuous integration and continuous deployment tools like Jenkins, GitLab CI/CD, or GitHub Actions.
― Monitoring and observability — Experience with monitoring tools like DataDog, Prometheus, Grafana, and logging solutions like Elasticsearch, Logstash, and Kibana (ELK Stack) or AWS CloudWatch.
― Networking — Strong understanding of network concepts like DNS, load balancing, and firewalls, as well as network protocols like TCP/IP, HTTP, and HTTPS; gRPC is a big plus.
― Scripting and programming languages — Proficiency in at least one scripting language (e.g., Python, NodeJS, Go).
― Configuration management — Experience with tools like FluxCD/ArgoCD.
― Version control systems — Proficiency in using Git or other version control systems.
― Incident management — Familiarity with incident response and management tools like PagerDuty, Opsgenie, or VictorOps.
― Strong problem-solving and troubleshooting skills — The ability to diagnose and resolve complex technical issues.
― Strong ownership, proactiveness, persistence, and passion for maintaining one of the biggest online gambling platforms

Would be beneficial to know:
― FluxCD/ArgoCD
― Ticket systems: Jira
― Understanding of event-driven architecture
― Understanding of ITIL Frameworks
― Security best practices — Knowledge of security principles, including securing applications, infrastructure, and data
― Cilium
― Terragrunt experience

See more jobs at Playson

+30d

Evertz Microsystems LimitedRemote

2 years of experience ● agile ● ui ● java ● typescript ● linux ● angular ● jenkins ● python ● AWS

Evertz Microsystems Limited is hiring a Remote Site Reliability Engineer

Site Reliability Engineer - Evertz Microsystems Limited - Career Page

See more jobs at Evertz Microsystems Limited

+30d

Personio+10 more Dublin, Munich, Madrid, Remote Germany, London, Remote Spain, Remote Ireland, Amsterdam, Remote Netherlands, Remote, Remote UK, Berlin, Barcelona, Remote Berlin, Remote Barcelona

4 years of experience ● agile ● Design ● python ● AWS

Personio is hiring a Remote Senior Site Reliability Engineer

The Role

At Personio we are on the amazing journey of becoming the leading HR Platform in Europe!

Join our international agile Product, Design & Engineering team and take an active role in shaping our engineering culture and the future of thousands of HR teams and organizations across Europe. At Personio you will have a direct impact on our product, our users, our organization, and our engineering practices.

The Technology Platform domain's mission is to ensure Personio builds and scales as a high-performing engineering organization. We drive engineering excellence in Personio, for the benefit of our engineers, our product, and our customers by comprising teams dedicated to excellence in specific areas of software engineering. We measure the core metrics of software development delivery and we use them to drive initiatives that enable teams to build, test, deliver, and operate their software to the highest industry standards. This makes our engineers happy and empowered to do great work. Our measurable contribution to the success of Personio lies in empowering engineering excellence positively impacting Product Development and leading to increased customer satisfaction.

Like all our domains we are a group of engineers from many different nationalities and backgrounds, spread across different locations and we continue to grow in 2023. Join us and help shape the Technology Platform journey for Personio.

Personio is aiming to keep growing the infrastructure teams that will provide tools and infrastructure as code used by our whole engineering department, our BI teams, and within our own infrastructure teams.

We need you to solve our main challenges for our product engineers like the ability to create and operate infrastructure autonomously, facilitate data access, and excel in monitoring.

At the same time, Personio aims to serve more and more customers, leading to new scalability and architectural challenges that will need to be addressed.

In the future, we aim to have a platform that is easy to scale, observable, and cost-optimized while being easily accessible by all our stakeholders.

Responsibilities

Designing and supporting a variety of infrastructure solutions with the product teams (from AWS account management to compute & database layers)
Engineering APIs and tools to allow engineers to manipulate their resources
Enabling engineering teams with self-service infrastructure as code to live by the “you build it you run it” motto
Working hand in hand with the other teams to create a coherent data platform
Documenting and running internal training

What you need to succeed

Over 4 years of experience in contributing toward the architecture and design (architecture, design patterns, reliability, and scaling) of new and existing systems with AWS
You have a good knowledge of AWS APIs and are at ease with serverless cloud technologies
You have experience with DevOps (CI/CD) concepts & tooling in the cloud and master at least one infrastructure-as-code tool
You regularly use at least one programming language (Python or Golang ideally)
You have excellent written and spoken English
You have experienced, or are willing to experience, rapid growth and “building our plane while flying it”. So bring your agile mindset to the table!
You embrace feedback - no one is perfect, and neither are we. So let’s make this an opportunity to praise and learn from each other.
Finally, you have a good sense of humour. Have fun with us, learn with us from our mistakes and bring your good vibes!

Why Personio

Aside from our people, culture, and mission, there are a variety of additional benefits that help make Personio a great place to work! Work with us and receive:

A competitive compensation package that includes salary, benefits, and pre-IPO equity
28 days of paid vacation, plus an additional day after 2 and 4 years (because we love what we do, but we also love vacation!)
2 Impact Days you can use to have an impact on the environment and society – one for an individual project of your choice and one for a company-wide initiative! #SocialResponsibility
Find your best way to work with our office-led, remote-friendly PersonioFlex! We offer a roughly 50% remote, 50% in-office working framework to suit your needs
Annual personal development budget of €1,000 for conferences, courses, books, career coach, etc.
Regular company and team events.
High-impact working environment with flat hierarchies and short decision-making processes
Receive generous family leave, child support, mental health support, and sabbatical opportunities with PersonioCares
Save money with corporate discounts across brands like Adidas, LG, Bosch, Apple, and more
Comprehensive healthcare and dental coverage for each permanent employee (excluding taxes)
Invest in your retirement via the Personio Pension Scheme, with a Personio match of up to 5%
Access multiple fitness studios and sports facilities across Ireland for €30 per month with a subsidized Gympass membership
A vast choice of working locations: Munich, Berlin, Dublin, Madrid, Barcelona, Amsterdam… All with amenities like professional espresso machines, free drinks and snacks, and indoor and outdoor break spaces

About us

Bring your best. Make your mark. We’re using technology to revolutionize the way HR operates so that we can transform the way millions of people experience work every day. We move fast, challenge the status quo, and support our people as they shape their careers.

With over 10,000 customers and a team of 1,800 in seven offices across Europe, now is the perfect time to join! We believe in hiring driven people who want to make an impact. So bring your best, and let’s build the future of HR technology together.

Personio is an equal opportunities employer, committed to building an integrative culture where everyone feels welcomed and supported. We #EmbraceUniqueness and understand that our diverse, values-driven culture makes us stronger. We are proud to have an inclusive workplace environment that will foster your development no matter your gender, civil status, family status, sexual orientation, religion, age, disability, education level, or race.

See more jobs at Personio

Senior Site Reliability Engineer - Observability

+30d

Personio+10 more Munich, Madrid, Remote Germany, London, Dublin, Remote Spain, Remote Ireland, Amsterdam, Remote Netherlands, Remote, Remote UK, Berlin, Barcelona, Remote Berlin, Remote Barcelona

agile ● Design ● python ● AWS

Personio is hiring a Remote Senior Site Reliability Engineer - Observability

Personio

At Personio we are on the amazing journey of becoming the leading HR Platform in Europe!

Join our international agile Product & Technology department and take an active role in shaping our engineering culture and the future of thousands of HR teams and organizations across Europe. At Personio you will have a direct impact on our product, our users, our organization, and our engineering practices.

The Role

Personio is aiming to build a set of teams that will provide tools and infrastructure as code used by our whole engineering department. We are aiming to make our engineers autonomous and able to operate and run their own infrastructure in AWS.

The Observability team is part of Infrastructure engineering. Our goal is to deploy and scale our Observability infrastructure, equipping our development teams with the infrastructure, tools, and knowledge to diagnose operational issues in their systems at scale.

At the same time, Personio aims to serve more and more customers, leading to new scalability and architectural challenges that will need to be addressed.

In the future, we aim to have a platform that is easy to scale, observable, and cost-optimized while being easily accessible to all our stakeholders.

Responsibilities

Design, build out and maintain the application Observability of the Personio system
Enable engineering teams to deliver production excellence with self-service Observability dashboards
Work hand-in-hand with our other infrastructure and product teams to create a coherent Observability platform
Define best practices around making our systems and services measurable and work with our product teams to get those best practices applied

What you need to succeed

Over 4 years of experience contributing to the architecture and design (design patterns, reliability, Observability, and scaling) of new and existing systems on AWS
Solid understanding of large scale applications, Cloud Observability, monitoring and fault management
You have experience with Observability technologies and telemetry tools such as Datadog, New Relic, Sentry, CloudWatch, or similar
You have a good knowledge of AWS APIs and are at ease with serverless cloud technologies
You have experience with DevOps (CI/CD) concepts & tooling in the cloud and master at least one infrastructure-as-code tool
You regularly use at least one programming language (Python or Golang ideally)
You have excellent written and spoken English
You have experienced, or are willing to experience, rapid growth and “building our plane while flying it”. So bring your agile mindset to the table!
You embrace feedback - no one is perfect, and neither are we. So let’s make this an opportunity to praise and learn from each other
Finally, you have a good sense of humour. Have fun with us, learn from our mistakes, and bring your good vibes!

Why Personio

Aside from our people, culture, and mission, there are a variety of additional benefits that help make Personio a great place to work! Work with us and receive:

A competitive compensation package that includes salary, benefits, and pre-IPO equity
28 days of paid vacation, plus an additional day after 2 and 4 years (because we love what we do, but we also love vacation!)
2 Impact Days you can use to have an impact on the environment and society – one for an individual project of your choice and one for a company-wide initiative! #SocialResponsibility
Find your best way to work with our office-led, remote-friendly PersonioFlex! We offer either hybrid or remote opportunities.
Annual personal development budget of €1,000 for conferences, courses, books, career coach, etc.
Regular company and team events.
High-impact working environment with flat hierarchies and short decision-making processes
Receive generous family leave, child support, mental health support, and sabbatical opportunities with PersonioCares
Save money with corporate discounts across brands like Adidas, LG, Bosch, Apple, and more
Comprehensive healthcare and dental coverage for each permanent employee (excluding taxes)
Invest in your retirement via the Personio Pension Scheme, with a Personio match of up to 5%
Access multiple fitness studios and sports facilities across Ireland for €30 per month with a subsidized Gympass membership
A vast choice of working locations: Munich, Berlin, Dublin, Madrid, Barcelona, Amsterdam… All with amenities like professional espresso machines, free drinks and snacks, and indoor and outdoor break spaces

Apply for this position

See more jobs at Personio