Site Reliability Engineer Remote Jobs

44 Results

3d

Site Reliability Engineer

iManage | Remote
Full Time, agile, terraform, sql, Design, azure, ruby, c++, c#, .net, docker, kubernetes, linux, python

iManage is hiring a Remote Site Reliability Engineer

Writing and designing automation, monitoring, diagnosing, and debugging tooling.

12d

Senior Site Reliability Engineer

Acquia | Remote - Costa Rica
DevOPS, 9 years of experience, 6 years of experience, 3 years of experience, terraform, drupal, Design, ansible, azure, ruby, java, kubernetes, jenkins, python, AWS, PHP

Acquia is hiring a Remote Senior Site Reliability Engineer

Acquia empowers the world’s most ambitious brands to create digital customer experiences that matter. With open source Drupal at its core, the Acquia Digital Experience Platform (DXP) enables marketers, developers, and IT operations teams at thousands of global organizations to rapidly compose and deploy digital products and services that engage customers, enhance conversions, and help businesses stand out.

Headquartered in the U.S., Acquia is positioned as a market leader by the analyst community and is listed as one of the world’s top software companies by The Software Report. We are Acquia. We are a global company with employees located in more than 30 countries, and we’re building for the future. We want you to be a part of it!

About the role:

As a Senior Site Reliability Engineer, you will be a key player in designing, implementing, and maintaining our CI/CD pipelines, cloud infrastructure, and monitoring solutions. Your expertise in tools like ArgoCD, Kubernetes, and cloud-native architecture will help us achieve operational excellence at scale. You will work closely with engineering teams to ensure they have the right infrastructure in place to deploy rapidly, safely, and reliably.

This is a hands-on role for someone who thrives in an environment where automation is the goal, reliability is the baseline, and scalability is second nature. You won’t just be maintaining systems—you’ll be innovating, designing new ways to make our infrastructure smarter and our development faster.

Job Responsibilities: 

  • CI/CD Pipeline Mastery: Design, build, and optimize continuous integration and continuous deployment (CI/CD) pipelines using ArgoCD, Jenkins, or similar tools. Ensure zero-downtime, fully automated deployment pipelines.
  • Infrastructure as Code (IaC): Build and manage scalable, reliable infrastructure using Terraform, Kubernetes, and other IaC tools. Ensure everything is automated—from deployments to monitoring—so that infrastructure becomes a self-service platform.
  • Cloud Expertise: Architect and manage cloud environments (AWS, GCP, or Azure), focusing on cost optimization, scalability, and performance. Implement disaster recovery, fault tolerance, and high availability strategies.
  • Monitoring and Alerting: Implement comprehensive monitoring solutions using Prometheus, Grafana, ELK, and Datadog to detect and resolve performance bottlenecks before they impact customers. Design and implement automated alerts for proactive system health monitoring.
  • DevOps Advocacy: Champion the culture of DevOps across teams—promote best practices, encourage adoption of new technologies, and drive a continuous learning mindset within the engineering teams. Be the go-to person for CI/CD, infrastructure scaling, and deployment automation.
  • SRE Mindset: Focus on building systems that are resilient by design, automating processes that improve reliability, and implementing Service Level Objectives (SLOs) to align engineering efforts with operational goals.
  • Security-First Approach: Collaborate with security teams to implement robust security practices, from container security to infrastructure hardening. Automate security checks within the pipeline for compliance and vulnerability management.
  • Collaboration with Engineering Teams: Work hand-in-hand with product development teams to understand their needs, integrate CI/CD practices into their workflows, and provide a fast, reliable, and secure path from code to production.

Skills:

  • BS in Computer Science or a comparable field of study, or equivalent practical experience.
  • Experience working with one or more of: Go, Python, Ruby, PHP, Java, or JavaScript.
  • Experience with Unix/Linux systems administration using the CLI.
  • Fundamental understanding of TCP/UDP networking concepts.
  • Solid oral and written communication skills.
  • CI/CD Expertise: Extensive hands-on experience with CI/CD tools such as ArgoCD, Jenkins, CircleCI, or GitLab CI. Ability to design and implement pipelines that ensure rapid, reliable deployments.
  • Kubernetes Guru: Strong understanding and experience with Kubernetes, Helm, and container orchestration. Ability to scale and manage microservices in production.
  • Cloud Mastery: Proficient in at least one major cloud provider—AWS, GCP, or Azure. Experience with multi-cloud or hybrid-cloud architecture is a plus.
  • IaC Champion: Proficiency in Terraform, Ansible, or CloudFormation to manage infrastructure as code. Familiarity with GitOps workflows and version-controlled infrastructure.
  • Monitoring & Observability: Strong experience with monitoring tools like Prometheus, Grafana, Datadog, ELK, or New Relic. Ability to build custom dashboards and alerting systems.
  • Security-Focused: Deep understanding of security best practices in DevOps, including container security, CI/CD pipeline security, and cloud infrastructure hardening.
  • Problem Solver: Excellent troubleshooting skills with the ability to diagnose issues across a variety of environments, from code to infrastructure.
  • Collaboration Skills: Ability to work effectively in cross-functional teams, influencing peers and driving adoption of best practices across the organization.

Preferred Qualifications: 

  • 5-9 years of hands-on experience as a DevOps Engineer, SRE, or related role in a cloud-native environment.
  • Deep knowledge of CI/CD pipelines, especially using ArgoCD or similar tools.
  • Proven expertise in cloud platforms (AWS, GCP, Azure), with experience building and managing scalable, reliable infrastructure.
  • Strong scripting skills in Python, Go, or Bash.
  • Experience with service mesh architectures like Istio or Linkerd is a plus.
  • SRE Certification (or equivalent experience) is a bonus.
  • Certified Kubernetes Administrator (CKA) is preferred.
  • A passion for automation, observability, and reliability.

All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

12d

Staff Site Reliability Engineer

Acquia | Remote - Costa Rica
DevOPS, 9 years of experience, 6 years of experience, 3 years of experience, terraform, drupal, Design, ansible, azure, ruby, java, kubernetes, jenkins, python, AWS, PHP

Acquia is hiring a Remote Staff Site Reliability Engineer

Acquia empowers the world’s most ambitious brands to create digital customer experiences that matter. With open source Drupal at its core, the Acquia Digital Experience Platform (DXP) enables marketers, developers, and IT operations teams at thousands of global organizations to rapidly compose and deploy digital products and services that engage customers, enhance conversions, and help businesses stand out.

Headquartered in the U.S., Acquia is positioned as a market leader by the analyst community and is listed as one of the world’s top software companies by The Software Report. We are Acquia. We are a global company with employees located in more than 30 countries, and we’re building for the future. We want you to be a part of it!

About the role:

As a Staff Site Reliability Engineer, you will be a key player in designing, implementing, and maintaining our CI/CD pipelines, cloud infrastructure, and monitoring solutions. Your expertise in tools like ArgoCD, Kubernetes, and cloud-native architecture will help us achieve operational excellence at scale. You will work closely with engineering teams to ensure they have the right infrastructure in place to deploy rapidly, safely, and reliably.

This is a hands-on role for someone who thrives in an environment where automation is the goal, reliability is the baseline, and scalability is second nature. You won’t just be maintaining systems—you’ll be innovating, designing new ways to make our infrastructure smarter and our development faster.

Job Responsibilities: 

  • CI/CD Pipeline Mastery: Design, build, and optimize continuous integration and continuous deployment (CI/CD) pipelines using ArgoCD, Jenkins, or similar tools. Ensure zero-downtime, fully automated deployment pipelines.
  • Infrastructure as Code (IaC): Build and manage scalable, reliable infrastructure using Terraform, Kubernetes, and other IaC tools. Ensure everything is automated—from deployments to monitoring—so that infrastructure becomes a self-service platform.
  • Cloud Expertise: Architect and manage cloud environments (AWS, GCP, or Azure), focusing on cost optimization, scalability, and performance. Implement disaster recovery, fault tolerance, and high availability strategies.
  • Monitoring and Alerting: Implement comprehensive monitoring solutions using Prometheus, Grafana, ELK, and Datadog to detect and resolve performance bottlenecks before they impact customers. Design and implement automated alerts for proactive system health monitoring.
  • DevOps Advocacy: Champion the culture of DevOps across teams—promote best practices, encourage adoption of new technologies, and drive a continuous learning mindset within the engineering teams. Be the go-to person for CI/CD, infrastructure scaling, and deployment automation.
  • SRE Mindset: Focus on building systems that are resilient by design, automating processes that improve reliability, and implementing Service Level Objectives (SLOs) to align engineering efforts with operational goals.
  • Security-First Approach: Collaborate with security teams to implement robust security practices, from container security to infrastructure hardening. Automate security checks within the pipeline for compliance and vulnerability management.
  • Collaboration with Engineering Teams: Work hand-in-hand with product development teams to understand their needs, integrate CI/CD practices into their workflows, and provide a fast, reliable, and secure path from code to production.

Skills:

  • BS in Computer Science or a comparable field of study, or equivalent practical experience.
  • Experience working with one or more of: Go, Python, Ruby, PHP, Java, or JavaScript.
  • Experience with Unix/Linux systems administration using the CLI.
  • Fundamental understanding of TCP/UDP networking concepts.
  • Solid oral and written communication skills.
  • CI/CD Expertise: Extensive hands-on experience with CI/CD tools such as ArgoCD, Jenkins, CircleCI, or GitLab CI. Ability to design and implement pipelines that ensure rapid, reliable deployments.
  • Kubernetes Guru: Strong understanding and experience with Kubernetes, Helm, and container orchestration. Ability to scale and manage microservices in production.
  • Cloud Mastery: Proficient in at least one major cloud provider—AWS, GCP, or Azure. Experience with multi-cloud or hybrid-cloud architecture is a plus.
  • IaC Champion: Proficiency in Terraform, Ansible, or CloudFormation to manage infrastructure as code. Familiarity with GitOps workflows and version-controlled infrastructure.
  • Monitoring & Observability: Strong experience with monitoring tools like Prometheus, Grafana, Datadog, ELK, or New Relic. Ability to build custom dashboards and alerting systems.
  • Security-Focused: Deep understanding of security best practices in DevOps, including container security, CI/CD pipeline security, and cloud infrastructure hardening.
  • Problem Solver: Excellent troubleshooting skills with the ability to diagnose issues across a variety of environments, from code to infrastructure.
  • Collaboration Skills: Ability to work effectively in cross-functional teams, influencing peers and driving adoption of best practices across the organization.

Preferred Qualifications: 

  • 8-13 years of hands-on experience as a DevOps Engineer, SRE, or related role in a cloud-native environment.
  • Proven experience mentoring junior team members.
  • Deep knowledge of CI/CD pipelines, especially using ArgoCD or similar tools.
  • Proven expertise in cloud platforms (AWS, GCP, Azure), with experience building and managing scalable, reliable infrastructure.
  • Strong scripting skills in Python, Go, or Bash.
  • Experience with service mesh architectures like Istio or Linkerd is a plus.
  • SRE Certification (or equivalent experience) is a bonus.
  • Certified Kubernetes Administrator (CKA) is preferred.
  • A passion for automation, observability, and reliability.

All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

18d

Senior Site Reliability Engineer (Bridge) HUN, Budapest, Remote

LTG | Budapest, HU - Remote
Lambda, jira, terraform, slack, ruby, typescript, kubernetes, AWS

LTG is hiring a Remote Senior Site Reliability Engineer (Bridge) HUN, Budapest, Remote

People Matter Most!

We are a global team of Engineers, Product Managers, Designers, and Program Managers across Hungary, the US, and many other countries. We help our customers create work cultures people love.

About the Product

GetBridge was founded to define, develop, and deploy world-class, easy-to-use software, and that’s what we do and will keep on doing. We make better, more usable tools for teaching, learning, and career management: stuff people will actually use. Are you interested?

So here are our questions to you:

Do you have a “Challenge Accepted” attitude?

You belong with us, if you are:

  • A problem solver who asks questions to get at the core issue the team is grappling with before deciding on a solution and a pragmatist who knows how to make trade-offs to solve challenges while building an architecture that scales for the future.
  • An owner who is capable of leading and delivering complex projects involving multiple teams while also caring about cloud operations for dozens of services across multiple regions, environments, and language stacks.
  • A builder who loves implementing automation to reduce toil and enable healthy systems by default and building tools and resources for upskilling other engineering teams to make service creation and maintenance self-service.
  • A watcher who likes configuring observability systems to identify incidents before they happen, respond to incidents, and contribute to a continuous improvement culture with occasional participation in 24/7 on-call rotations.
  • A learner who loves new things and for whom self-improvement is encoded in their DNA.
  • A mentor who supports the development and growth of their colleagues.

Knowledge is power; are you armored?

Here’s our tech stack - what you will learn:

  • At least one modern programming language (Java/Kotlin, Ruby, React & TypeScript)
  • Cloud-based providers (AWS, Kubernetes, Aurora, EKS, Lambda, Pulsar and Apigee)
  • Cloud networking configuration (VPCs, security groups, load balancers, DNS, etc.)
  • Configuration as code (Terraform)
  • System observability (Datadog, Sentry)
  • CI/CD: GitHub, Spinnaker
  • CMO: SAFe, JIRA, Confluence, Slack, GSuite

Do you like things to be in balance?

Our offer focuses on your:

  • Healthy work-life balance: We have a great office in Allee Corner where you are welcome, but there is no mandate to come to the office on a regular basis. Our employees enjoy the freedom to manage their working hours.
  • Personal growth: We want to bring out the best in you through learning days, quarterly hack weeks, LinkedIn Learning, mentorship, a career development plan, and training opportunities from the first day.
  • Financial stability: We offer you a competitive salary package (1.7-2.1m gross / month depending on your seniority), a bonus (based on the performance of the company), a comprehensive healthcare package provided by Medicover, a SZÉP card, and other fringe benefits.

We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, colour, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, disability, or any other federal, state, or local protected class.

20d

Site Reliability Engineer

Hack The | Alimos, Attica, Greece, Remote Hybrid
terraform, Design, mobile, docker, kubernetes, python, AWS

Hack The is hiring a Remote Site Reliability Engineer

Ready to embark on the quest of joining Hack The Box?

At the end of this thrilling journey, you'll become a proud member of Hack The Box, with the ultimate mission to help cybersecurity professionals and organizations enhance their cyber-attack readiness. Get ready for an exciting adventure into the world of cybersecurity!

The Core Mission of the Site Reliability Engineer (SRE):
As a Site Reliability Engineer at Hack The Box, your paramount mission is to assist with the seamless migration to AWS, strategically positioning our infrastructure to scale effectively with the company. Over the next 6 months, you will participate in enhancing our capabilities for expansion, setting the stage for the addition of new systems such as Kubernetes clusters, Services, and Databases. Additionally, your focus will shift towards establishing key performance indicators, service level objectives, and incident response metrics to drive a culture of reliability and continuous improvement.

The Fellowship You'll Be Joining:
You’ll join a team of 4 SREs, while collaborating closely with engineers, data scientists, and security experts. Finally, you will report directly to the SRE Lead and will have open communications with infrastructure department management and other high-caliber technical people across the organization.

Technology Tools & Weapons You'll Be Using:

  • Infrastructure as Code (Terraform): Automate the provisioning of AWS resources.
  • Containerization and Orchestration (Kubernetes, Flux CD): Ensure seamless deployment and scaling of applications.
  • Monitoring and Logging (Prometheus, Mimir, Grafana, Loki): Expand monitoring capabilities for new systems.
  • Automation and Scripting (Go, Python, etc): Scripting for efficient and automated processes.
  • Cloud Platforms (AWS): Execute the migration plan with a focus on AWS.

The Adventures That Await You After Becoming a Site Reliability Engineer at Hack The Box:

  • Heavily contribute to the AWS Migration for Scalability: Spearhead the migration from the current cloud provider towards AWS, strategically positioning our infrastructure for scalable growth across regions.
  • Expand Monitoring Stack: Integrate new systems into the Monitoring Stack, enhancing visibility and alerting capabilities for a globally distributed architecture.
  • Architectural Design for Reliability: Contribute to the design and implementation of reliable AWS infrastructure, focusing on fault tolerance and high availability.
  • Establish Metrics Framework: Implement and manage Service Level Agreements (SLAs), Service Level Objectives (SLOs), and Service Level Indicators (SLIs) to measure and improve system reliability.
  • Incident Response Enhancement: Develop and enhance incident response processes, leveraging metrics to continually improve response times and effectiveness.
  • Mentorship: Mentor and guide junior SREs in adapting to the AWS environment and implementing reliability best practices.
  • Collaborative Planning: Work closely with cross-functional teams to plan and implement new systems effectively, ensuring alignment with reliability goals.
  • Team Expansion: Play a key role in the team's expansion, contributing to the mentoring of junior members.
  • Best Practices Advocacy: Champion best practices in AWS architecture and SRE methodologies, fostering a culture of reliability and continuous improvement.

Skills, Knowledge, and Experience Points Required to Unlock the Role of SRE at Hack The Box:

  • Hands-on Experience: Minimum 2 years of hands-on experience in site reliability engineering or a related field.
  • Automation Skills: Proficient in scripting and automation using languages such as Go, Python or Bash.
  • Cloud Expertise: In-depth knowledge of cloud platforms, particularly AWS.
  • Containerization: Experience with containerization technologies (Docker) and orchestration (Kubernetes).
  • Monitoring Mastery: Strong expertise in implementing and managing monitoring and logging solutions.
  • Metrics Framework: Proven experience establishing and managing SLAs, SLOs, and SLIs.
  • Problem Solving: Proven ability to troubleshoot complex system issues and implement effective solutions.
  • Collaborative Mindset: Excellent collaboration and communication skills, with a strong ability to work cross-functionally and mentor junior team members.

What your Hack The Box adventure will have in store:

  • You'll have the exhilarating opportunity to contribute to a product that is highly appreciated by users and the cybersecurity community at large.
  • You'll experience a highly supportive and caring environment, fostering growth, flexibility, and autonomy.
  • You'll embark on an exciting journey of continuous learning and problem-solving, leveling up as our organization grows.
  • Most importantly, you'll have a blast at HTB because fun is an essential ingredient in our recipe for success! Just wait until you see our global meet-ups!

The gems you’ll be enjoying as a Site Reliability Engineer:

  • Private insurance
  • 25 annual leave days
  • Dedicated budget for training and professional development, participation in conferences
  • State-of-the-art equipment (Macbook, iPhone, and mobile plan)
  • Free lunch & snacks at the office
  • Full access to the Hack The Box lab offerings, so you can learn how to hack
  • Flexible/Hybrid working

The Quest of Becoming Hack The Box’s Site Reliability Engineer:

  • Level 1: To complete level one’s objective, submit your application. 
  • Level 2: Meet the Talent Acquisition team. Level’s objective: highlight your past achievements, ambitions, and values.
  • Level 3: Meet the hiring team. Level’s objective: connect with the hiring team and share with them your achievements. 
  • Level 4: Complete an assignment that aligns with day-to-day job-related tasks and responsibilities. Part of the assignment is discussing it with the hiring team in a debriefing session, in order to walk the team through your thinking process. 
  • Level 5: Congratulations! Not many reach this level. Level’s objective: have a constructive, final conversation with senior leadership to explore the role and your future at HTB.
  • Level 6: You've officially received an offer from HTB! To complete the last level and the Quest, all you need to do is accept the offer. 
  • Quest complete. Congratulations, you’re officially one of us. Your next quest: complete the onboarding.

Hack Your Career, Today. Join us in this epic adventure of cybersecurity at Hack The Box!

At Hack The Box, we are on a quest to find the most exceptional and enthusiastic talent to join our team. Whether or not you consider yourself a gamer, we value what makes you unique and want to know more about you. This job post provides just a glimpse of the incredible gamified experience our business and consumer customers enjoy through our platforms. So, if you're ready to embark on a journey of growth and adventure, we can't wait to meet you!

ABOUT HACK THE BOX

Hack The Box is the Cyber Performance Center with the mission to provide a human-first platform to create and maintain high-performing cybersecurity individuals and organizations. Hack The Box is the only platform that unites upskilling, workforce development, and the human focus in the cybersecurity industry, and it’s trusted by organizations worldwide for driving their teams to peak performance. Offering an all-in-one environment for continuous growth, assessment, and recruitment, Hack The Box provides solutions for all cybersecurity domains.

Launched in 2017, Hack The Box brings together the largest global cybersecurity community of more than 2.6 million platform members. Rapidly growing its international footprint and reach, Hack The Box is headquartered in the UK, with additional offices in the US, Australia, and Greece.

Exciting News:

  • We are super proud to share that all three HTB entities across the UK, US, and Greece have been Certified as a Great Place to Work (Oct 2023-Oct 2024).
  • Furthermore, HTB's Greek entity has been listed by the Great Place to Work Institute as the #4 Best Workplace in Greece and #7 in Europe for 2023, among more than 3,300 companies.
  • Get more insights about our HTB culture and employee experience by visiting our career site and Glassdoor.

At Hack The Box, we are committed to fostering a diverse, inclusive, and equitable workplace. We believe that diversity enriches our performance, services, and the communities we serve. As such, we ensure that all job applications are considered solely based on merit, skills, and qualifications. We do not discriminate on grounds of race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. We are dedicated to providing a fair and respectful work environment that reflects our values.

20d

Senior Site Reliability Engineer (Turkey)

Sezzle | Türkiye, Remote
Sales, DevOPS, Bachelor's degree, terraform, sql, Design, c++, docker, kubernetes, linux, python, AWS

Sezzle is hiring a Remote Senior Site Reliability Engineer (Turkey)

About the Role: 

We are looking for a Site Reliability Engineer to work on our core Infrastructure and Security team, to assist us with designing, building, running, improving and scaling the infrastructure that engineering and data teams use to power their services. Your duties will include the development, testing, and maintenance of our serving and data platforms, using a combination of cloud products, open source tools and internal applications. Your duties will blend software development and operations in order to continuously automate our environments. You should be able to build high-quality, scalable solutions for a variety of problems.

Our Company:

Sezzle is a cutting-edge fintech company whose long-standing mission is to financially empower the next generation. Sezzle has built a payment platform that increases purchasing power for consumers by offering interest-free installment plans. This increase in purchasing power for consumers leads to increased sales and basket sizes for the numerous eCommerce merchants that currently work with Sezzle. 

What Makes Working at Sezzle Awesome? 

At Sezzle, we are more than just brilliant engineers, passionate data enthusiasts, out-of-the-box thinkers, and determined innovators; we are skilled musicians, yogis, cyclists, chefs, golfers, dog-lovers, and rock-climbers. We believe in surrounding ourselves with not only the best and the brightest individuals, but those that are unique and purpose-driven in all that they do. Our culture is not defined by a certain set of perks designed to give the illusion of the traditional startup culture, but rather, it is the visible example living in every employee that we hire. 

Responsibilities:

  • Design, build and maintain scalable infrastructure for running our systems, based on Kubernetes, Redshift and additional AWS services and products.
  • Help the product teams quickly build out MVP products to test new solutions on the market.
  • Maintain and develop monitoring and alerting solutions to improve the on-call experience.
  • Assist product developers in debugging and triaging production issues.
  • Be the first line of defense for our operational environments, triaging and resolving problems as they occur. You will be on an on-call rotation.
  • Design and scale platform and data architectures to sustain rapid user growth.
  • Level up the teams through pairing, code review, and mentoring.
  • Bring and share with our team extensive experience with industry best practices in software development.

Minimum Requirements: 

  • Bachelor's in computer science (preferred) or equivalent related experience 
  • At least 5 years of overall software, data, deployments and platform infrastructure experience.

Ideal Skills & Experience: 

  • Experience with building and/or serving REST APIs using Go or a similar language.
  • Experience with Relational Databases, SQL and ORM technologies.
  • Strong overall Linux knowledge.
  • DevOps experience with CI/CD pipelines, Docker and Kubernetes, and cloud computing platforms like AWS.
  • Experience with deployment/provisioning tools like Terraform, Helm, Ansible.
  • Experience with implementing and maintaining observability and monitoring tools such as Prometheus, Datadog, New Relic, Grafana, Loki or similar.
  • Experience in ETL/ELT pipelines using Python and open-source tools such as dbt.
  • Proficiency in building and maintaining large-scale data warehousing technologies such as Redshift.

About You: 

  • A+ character. We are team-first here at Sezzle. 
  • A hard-working mentality. It’s early and there is still a lot to build. 
  • An excellent communicator. 
  • A fun attitude. Life’s too short. We can have fun while we work hard on cool things. 
  • Smarts. We need people that are smart enough to make decisions on their own and also smart enough to know when they need input from others. 

Sezzle’s Technology Stack:

  • Languages: Golang, TypeScript, Python
  • Frontend: TypeScript - React and React Native
  • Backend: Golang
  • Database: MySQL, Postgres, Elasticsearch
  • DevOps & Cloud: AWS, Kubernetes
  • Version Control: Git
  • CI/CD: GitLab
  • Testing: Developer-driven, focus on automated unit, integration, and end-to-end tests
  • Sezzle is focused on using open source, and we build what we can before buying!

Compensation

The compensation range for the role is as follows:

4,600 - 9,000 USD Monthly

Equal Employment Opportunity: Sezzle Inc. is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, color, religion, sex, national origin, age, disability, genetic information, pregnancy, or any other legally protected status. Sezzle recognizes and values the importance of diversity and inclusion in enriching the employment experience of its employees and in supporting our mission.

21d

Principal Site Reliability Engineer

ScienceLogic | Reston, VA or Remote
DevOPS, agile, remote-first, terraform, Design, mobile, linux, python, AWS

ScienceLogic is hiring a Remote Principal Site Reliability Engineer

*This position can be remote within the United States*

Who we are... 

In a world of constant change, we're leading the charge towards truly autonomous enterprises. Our cutting-edge platform harnesses the power of automation and generative AI to revolutionize how businesses manage and optimize their IT operations.

We're not just adapting to digital transformation—we're accelerating it. Our solutions bring business and operations leaders together, unlocking new levels of innovation, efficiency, and scalability. We empower organizations to deliver superior customer experiences and drive revenue growth in an always-on, always-mobile world.

At ScienceLogic, we're building the foundation for Autonomic IT—a future where IT operations are self-healing, self-optimizing, and aligned perfectly with business objectives. Our team of visionaries is reshaping the $18+ billion IT operations market, creating cost-optimized, efficient, and next-level capabilities for enterprises worldwide.

ScienceLogic is going through a product transformation, and the Site Reliability Engineering (SRE) team is at the forefront of it. We are responsible for the design, deployment, and maintenance of the cloud infrastructure that runs the company’s revenue-generating, go-forward SaaS product line. Overall, we’re passionate about automation and solving complex business and technology challenges. Our team combines SRE, DevOps, software development, and information security knowledge to help make cloud operations agile and elastic within the boundaries of the security and governance framework.

 

What we’re looking for…

We are looking for a Principal Site Reliability Engineer who is well versed in building cloud technologies in a secure manner, has an automation mindset and is an ardent follower of the SRE discipline. If this sounds like you, then our team will benefit from your skillset!

 

What you’ll be doing…

  • Enhance the company’s SaaS infrastructure security protocols.
  • Collaborate across the organization to design, build, and operationalize SaaS services conforming to security standards such as FedRAMP, SOC2, and ISO.
  • Participate in architecture, security, and operations reviews.
  • Lead design reviews and buildout of secure systems for delivering various SaaS services with 99.99% uptime.
  • Design, automate, test, and monitor the use of cloud native technologies as a foundation for a service platform.
  • Investigate and resolve customer and operational issues with the mentality of fixing and not just mitigating issues.
  • Identify and automate measurement of operations SLAs and SLOs. 
  • Triage incident response, document SOPs and runbooks, and train NOC team members.
  • Write automation that can be easily supported and extended by others.
  • Work on special projects as assigned.

 

Qualities you possess…

Here at Site Reliability, we believe that if you are hungry to learn, passionate about technology, and like building tools, then you are a good fit. Experience with the following is an added plus:

  • Must be a U.S. Citizen.
  • 7-10 years of site reliability engineering or cloud operations experience or equivalent experience.
  • Proven track record of operating production SaaS environments within security standards like FedRAMP, SOC2, ISO, PCI.
  • Bachelor's or Master's degree in Computer Science, Information Systems or similar field.
  • Skilled at problem solving, algorithms, and data structures conforming to the modern SaaS security requirements.
  • Building tools and scripting frameworks from scratch.
  • Working with Cloud Automation tools like CloudFormation, Terraform, CDK, aws-cli.
  • Scripting languages like Python, Groovy, PowerShell, Bash, Perl etc.
  • Exposure to Windows and Linux administration skills.
  • Familiarity with basic networking, security and cloud engineering concepts.
  • Highly collaborative with effective written and verbal communication skills.
  • Ability to work against tight deadlines.
  • Willingness to occasionally work off-hours and participate in a weekly on-call schedule.
  • Take full responsibility for the availability and performance of the platform.

Benefits & Perks

  • A remote-first culture - work from home or come into the office, it's totally up to you.
  • Comprehensive medical, dental and vision plans.
  • 401(k) plan with employer match.
  • Flexible Paid Time Off (FTO) so that you can take the time that you need to re-energize.
  • Volunteer Time Off (VTO) - take two days off per calendar year to volunteer with your preferred charitable organization.
  • 5-year Service Milestone Sabbatical.
  • Paid parental leave.
  • Generous employee referral bonus program.
  • Pet insurance.
  • HQ Office centrally located in Reston Town Center featuring a well-stocked kitchen with rotating snacks and beverages, and catered lunch on Thursdays.
  • Regular virtual company-wide events, including cooking classes, yoga, meditation and more.
  • The opportunity to learn and develop from some of the best and brightest minds in the industry!

Don’t meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they meet every single qualification. At ScienceLogic, we are dedicated to building a diverse, inclusive and authentic workplace, so if you’re excited about this role but your past experience doesn’t align perfectly with every qualification in the job description, we encourage you to apply anyway. You may be just the right candidate for this or other roles.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which you are applying.

About ScienceLogic

ScienceLogic empowers intelligent, automated IT operations, freeing up time and resources, and driving business outcomes with actionable insights. ScienceLogic’s AIOps platform sees broadly across clouds and on-premises, enabling business service visibility with relationship mapping, and workflow automation to eliminate manual tasks. Trusted by thousands of organizations across the globe, ScienceLogic’s technology has been proven for scale by the world’s largest service providers, enterprises and government agencies.

 

www.sciencelogic.com

 

All ScienceLogic employees have the responsibility to protect information assets, adhere to access controls, report suspicious activity, and comply with security and privacy policies.

See more jobs at ScienceLogic

Apply for this job

25d

Site Reliability Engineer (Bridge) HUN, Budapest, Remote

LTG | Budapest, HU - Remote
Lambda, jira, terraform, slack, ruby, typescript, kubernetes, AWS

LTG is hiring a Remote Site Reliability Engineer (Bridge) HUN, Budapest, Remote

People Matter Most!

We are a global team of Engineers, Product Managers, Designers, and Program Managers across Hungary, the US, and many other countries. We help our customers create work cultures people love.

About the Product

GetBridge was founded to define, develop, and deploy world-class, easy-to-use software; and that’s what we do and will keep on doing. We make better, more usable tools for teaching, learning and career management, stuff people will actually use. Are you interested?

So here are our questions to you:

Do you have a “Challenge Accepted” attitude?

You belong with us, if you are:

  • A problem solver who asks questions to get at the core issue the team is grappling with before deciding on a solution and a pragmatist who knows how to make trade-offs to solve challenges while building an architecture that scales for the future.
  • An owner who is capable of leading and delivering complex projects involving multiple teams while also caring about cloud operations for dozens of services across multiple regions, environments, and language stacks.
  • A builder who loves implementing automation to reduce toil and enable healthy systems by default and building tools and resources for upskilling other engineering teams to make service creation and maintenance self-service.
  • A watcher who likes configuring observability systems to identify incidents before they happen, respond to incidents, and contribute to a continuous improvement culture with occasional participation in 24/7 on-call rotations.
  • A learner who loves to learn new things and for whom self-improvement is encoded in your DNA.
  • A mentor who supports the development and growth of their colleagues.

Knowledge is power; are you armored?

Here’s our tech stack - what you will learn:

  • At least one modern programming language (Java/Kotlin, Ruby, React & Typescript)
  • Cloud-based providers (AWS, Kubernetes, Aurora, EKS, Lambda, Pulsar and Apigee)
  • Cloud networking configuration (VPCs, security groups, load balancers, DNS, etc).
  • Configuration as code (Terraform)
  • System observability (Datadog, Sentry)
  • CI/CD: GitHub, Spinnaker
  • CMO: SAFe, JIRA, Confluence, Slack, GSuite

Do you like things to be in balance?

Our offer focuses on your:

  • Healthy work-life balance: We have a great office at MOM Park where you are welcome, but there is no mandate to get to work on a regular basis. Our employees enjoy the freedom to manage their working hours.
  • Personal growth: We want to bring out the best in you through learning days, quarterly hack weeks, LinkedIn Learning, mentorship, a career development plan, and training opportunities from the first day.
  • Financial stability: We offer you a competitive salary package (1.4 - 1.9M HUF gross / month depending on your seniority), a bonus (based on the performance of the company), a comprehensive healthcare package provided by Medicover, a SZÉP card, and other fringe benefits.

We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, colour, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, disability, or any other federal, state, or local protected class.

See more jobs at LTG

Apply for this job

29d

Site Reliability Engineer II

Signify Health | Dallas, TX, Remote
Design, mobile, azure, c++, kubernetes, python, AWS

Signify Health is hiring a Remote Site Reliability Engineer II

How will this role have an Impact?

Join Signify Health's vibrant Site Reliability Engineering team as a Site Reliability Engineer. We’re seeking passionate individuals from diverse technical backgrounds. Reporting to the Manager of Site Reliability Engineering, we offer a collaborative environment that values each team member's unique contribution and fosters an inclusive culture.

Your Role:

  • Develop strategies to improve the stability, scalability, and availability of our products.
  • Maintain and deploy observability solutions to optimize system performance.
  • Collaborate with cross-functional teams to enhance operational processes and service management.
  • Design, build, and maintain application stacks for product teams.
  • Create sustainable systems and services through automation.

Skills We’re Seeking:

  • An eagerness to grow and collaborate in the field of Site Reliability Engineering.
  • Strong familiarity with cloud environments (Azure, AWS, or GCP) and a desire to develop further expertise.
  • Intermediate understanding of scripting languages, preferably with exposure to Bash or Python, and programming languages, preferably with exposure to Golang.
  • Novice understanding of infrastructure as code, preferably with exposure to Terraform.
  • Novice understanding of Kubernetes and containerization technologies.
  • Novice understanding of CI/CD principles and willingness to guide and enforce best practices.
  • Novice understanding of Site Reliability and observability principles, preferably with exposure to New Relic.
  • A proactive approach to identifying problems, performance bottlenecks, and areas for improvement.

The base salary hiring range for this position is $72,100 to $125,600. Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for incentive compensation, equity, and benefits.

In addition to your compensation, enjoy the rewards of an organization that puts our heart into caring for our colleagues and our communities. Eligible employees may enroll in a full range of medical, dental, and vision benefits, 401(k) retirement savings plan, and an Employee Stock Purchase Plan. We also offer education assistance, free development courses, paid time off programs, paid holidays, a CVS store discount, and discount programs with participating partners.

About Us:

Signify Health is helping build the healthcare system we all want to experience by transforming the home into the healthcare hub. We coordinate care holistically across individuals’ clinical, social, and behavioral needs so they can enjoy more healthy days at home. By building strong connections to primary care providers and community resources, we’re able to close critical care and social gaps, as well as manage risk for individuals who need help the most. This leads to better outcomes and a better experience for everyone involved.

Our high-performance networks are powered by more than 9,000 mobile doctors and nurses covering every county in the U.S., 3,500 healthcare providers and facilities in value-based arrangements, and hundreds of community-based organizations. Signify’s intelligent technology and decision-support services enable these resources to radically simplify care coordination for more than 1.5 million individuals each year while helping payers and providers more effectively implement value-based care programs.

To learn more about how we’re driving outcomes and making healthcare work better, please visit us at www.signifyhealth.com

Diversity and Inclusion are core values at Signify Health, and fostering a workplace culture reflective of that is critical to our continued success as an organization.

We are committed to equal employment opportunities for employees and job applicants in compliance with applicable law and to an environment where employees are valued for their differences.

See more jobs at Signify Health

Apply for this job

+30d

Senior Site Reliability Engineer

Tyk Technologies | Vancouver, British Columbia, Canada, Remote
B2B, Design, mobile, scrum, api

Tyk Technologies is hiring a Remote Senior Site Reliability Engineer

Who are Tyk, and what do we do?
The Tyk API Management platform is helping to drive the connected world and power new products and services. We’re changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare, or media industries (to name just a few!) 

If you’ve banked online, used an app to check the news, or perhaps even driven a connected car, APIs, and by extension Tyk, make that possible. Founded in 2015 with offices in London - UK, London - Ontario, Atlanta and Singapore, we have many thousands of users of our B2B platform across the globe. Brands using Tyk range from Lotte, Bell, T-Mobile, to RBS, Capital One and Vinci. We have a varied user base hailing from every continent – even Antarctica.

Our Mission

Tyk is on a mission to connect every system in the world. We’ve started by building an API Management platform.

Total flexibility, default remote, radical responsibility

We offer unlimited paid holidays and remote working from anywhere in the world, for everyone. Why? Tyk was founded on the principle of offering flexibility and autonomy to our employees; we believe this allows them to achieve their best results. It also means we can build the best possible team: location and working hours are no barrier.

If this sounds like an environment that you believe could work for you then read on to find out more.

At Tyk, we’re obsessed with building software that solves problems. We count on our Site Reliability Engineers (SREs) to empower users with a rich feature set, high availability, and stellar performance level to pursue their missions.

Our customer base is growing, so we’re seeking an experienced Senior SRE to optimise, automate, and improve our performance, using insights from massive-scale data in real time. We want an original thinker, a challenger, a technical legend, an opinionated collaborator who wants to make things better.

Here’s what you’ll be getting up to:

  • Collaborate with the Principal SRE to shape and implement the SRE strategic plan.
  • Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process.
  • Address wellbeing and performance concerns, fostering a positive and productive team environment.
  • Work with the Principal SRE and Scrum Master to analyse wellbeing survey outcomes and develop improvement plans.
  • Champion operational communication, ensuring high-quality and timely updates on team progress.
  • Ensure SLA compliance for our cloud environment through proactive monitoring.
  • Develop and oversee the roadmap for proactive alerting and monitoring.
  • Define and track key performance metrics for cloud services, driving continuous improvement.
  • Design and implement solutions to maintain and enhance KPIs.
  • Lead performance tuning and fault finding by analysing metrics from operating systems and applications.
  • Optimise system and infrastructure performance, focusing on innovation and customer needs anticipation.
  • Engage with commercial teams to understand growth plans and develop corresponding SRE strategies.
  • Direct the analysis of cloud infrastructure, focusing on automation, scalability, and management.
  • Align with the Principal SRE on automation strategies for cloud-operations tasks.
  • Model excellence in software design and automation to enhance Tyk Cloud services, creating runbooks and knowledge sharing.
  • Conduct blame-free root cause analysis postmortems, reporting findings and recommendations.
  • Document operational processes and policies, ensuring replicability and adherence.
  • Provide on-call support, ensuring effective response and resolution in line with SLAs.
  • Plan and execute software upgrades to optimise cloud services.
  • Assist commercial teams with data requests and account management.
  • Champion and adhere to SCRUM methodologies within the SRE team.

  • Here’s what we’re looking for:
    • Proven experience in a senior SRE role or similar.
    • Strong knowledge of cloud technologies and SLA, SLO, and SLI management.
    • Experience leading teams and implementing SCRUM processes.
    • Excellent communication and leadership skills.
    • Experience line managing, mentoring and coaching.
    • Ability to analyze and improve operational processes and performance metrics.
    • Experience in software design, automation, and root cause analysis.
    • On-call support experience and customer-focused mindset.
    • Collaborative attitude with commercial and technical teams.
    • Launching and operating production Kubernetes clusters.
    • Designing and operating infrastructure on AWS and other providers.
    • Operating MongoDB (or other document database) clusters.
    • Operating Redis (or other key-value storage) clusters.
    • Administering Linux servers.
    • Maintaining distributed software.
    • Operating Prometheus and Grafana.
    • Operating log collection and analysis systems.
    • Working hours within 16:00 – 04:00 UTC.
  • Skills:
    • Kubernetes (administrator)
    • Go and/or Python (advanced)
    • AWS (proficient)
    • Linux (proficient)
    • Terraform and IaC in general (proficient)
    • Helm (familiar)
    • MongoDB (or similar)
    • Redis (or similar)
    • Monitoring & logging
    • Grasp of networking concepts (subnets, routing, peering, load balancing, NAT, etc.)
    • Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP)

We all share the same vision: we value authenticity, respect, responsibility, independence, honesty, diversity and inclusion and, most importantly, treating others how you wish to be treated. We look for like-minded people who bring their personalities to work every day, strive to achieve their personal goals, and are willing to challenge the way we do things. Why? To make what we do even better!

Our values tell the story of Tyk - here’s how:

  • It’s ok to screw up! 

We’ve found that it’s often the ‘stupid’ or unexpected ideas that turn out to be the successful ones - so try it, at least we can say we have!

  • The only stupid idea is the untested one!

It’s in our DNA - starting a business with founders 12 hours apart, giving our gateway away for free - sure, we did that, and we’d do it again!

  • Trust starts with you - make it count! 

Trust is a two-way street - instil it from day one!

  • Assume best intent! 

We have each other’s back - we’re all on the same team. Think before you speak or act. 

  • Make things better! 

Always try to leave things better than when you found them - change is constant, inevitable and embraced! Be that change we want to see.

Here’s why you should join us:

  • Everyone has unlimited paid holidays. 
  • We have total flexibility in hours, as we believe creativity flows better when our people are given freedom to decide when they are most productive. Everyone is unique after all.
  • Employee share scheme
  • Generous maternity and paternity leave
  • Volunteering Days
  • Company retreats
  • Employee Wellbeing platform

What’s it like to work here? Check it out: https://tyk.io/worklife/

Tyk is an equal opportunities employer and we are determined to ensure that no applicant or employee receives less favourable treatment on the grounds of gender, age, disability, religion, belief, sexual orientation, marital status, or race, or is disadvantaged by conditions or requirements which cannot be shown to be justifiable.

You can see more about us here https://tyk.io

See more jobs at Tyk Technologies

Apply for this job

+30d

Senior Site Reliability Engineer

Webflow | U.S. Remote
Sales, Webflow, Bachelor's degree, remote-first, terraform, ansible, mongodb, c++, docker, typescript, kubernetes, python, AWS, javascript

Webflow is hiring a Remote Senior Site Reliability Engineer

At Webflow, our mission is to bring development superpowers to everyone. Webflow is the leading visual development platform for building powerful websites without writing code. By combining modern web development technologies into one platform, Webflow enables people to build websites visually, saving engineering time, while clean code seamlessly generates in the background. From independent designers and creative agencies to Fortune 500 companies, millions worldwide use Webflow to be more nimble, creative, and collaborative. It’s the web, made better. 

We’re looking for a Senior Site Reliability Engineer to improve the reliability and stability of Webflow’s customer-facing production infrastructure, which serves millions of page views per hour. Our product is used by over 2 million users worldwide across 190 countries, and you’ll help ensure our platform is secure and scalable for these users as tens of thousands of projects are launched on Webflow each month.

About the role 

  • Location: Remote-first (United States; BC & ON, Canada) 
  • Full-time 
  • Permanent
  • Exempt 
  • The cash compensation for this role is tailored to align with the cost of labor in different geographic markets. We've structured the base pay ranges for this role into zones for our geographic markets, and the specific base pay within the range will be determined by the candidate’s geographic location, job-related experience, knowledge, qualifications, and skills.
    • United States (all figures cited below in USD and pertain to workers in the United States)
      • Zone A: $158,000 - $218,000
      • Zone B: $149,000 - $205,000
      • Zone C: $139,000 - $192,000
    • Canada (all figures cited below in CAD and pertain to workers in ON & BC, Canada)
      • CAD 180,000 - CAD 248,000
  • Please visit our Careers page for more information on which locations are included in each of our geographic pay zones. However, please confirm the zone for your specific location with your recruiter.
  • Reporting to the Engineering Manager  

As a Senior Site Reliability Engineer, you’ll … 

  • Empower engineers on other teams to take control of their services by maintaining monitoring tooling and collaborating on internal best practices for observability.
  • Enhance reliability of applications running in Kubernetes by optimizing resource allocation, streamlining upgrade processes, and ensuring scalability and fault tolerance.
  • Occasionally dive into the main Webflow application in Node, Python, or Go to better discern (and sometimes fix) behavior in production.
  • Work with peers on Webflow’s Customer Support, Partnerships, and Sales teams to enable customers using Webflow’s services in production.
  • Participate in and continuously improve on-call and incident response processes.

In addition to the responsibilities outlined above, at Webflow we will support you in identifying where your interests and development opportunities lie and we'll help you incorporate them into your role.

About you 

You’ll thrive as a Senior Site Reliability Engineer if you have …

  • Either a background as an ops engineer with an enthusiasm for code, or a background as a software engineer with an enthusiasm for systems administration.
  • 5+ years of experience building, maintaining, and debugging distributed systems in a customer-facing environment that allows for little to no downtime.
  • Experience navigating and scaling multi-tier cloud environments on either AWS or GCP.
  • Experience with container-centric architectures, built with Docker and tools like Kubernetes (EKS, GKE, AKS, OpenShift, etc.), ECS, Docker Swarm, or Mesos.
  • Experience with infrastructure-as-code tools like Terraform, Pulumi, Ansible, Puppet, or Chef.
  • Experience in contributing to full-stack applications built using tools like React, Node, and MongoDB.
  • Enthusiasm for mentoring and sponsoring less-experienced engineers.

It would be a bonus if you had even one of the following …

  • Experience with Kubernetes, Nginx, Terraform, or Pulumi specifically.
  • Experience improving on-call and incident response processes for Engineering.
  • Experience working in high-compliance environments or a special interest in security engineering. We are not the security team, but we are always looking to improve our security posture!

Our Core Behaviors:

  • Obsess over customer experience. We deeply understand what we’re building and who we’re building for and serving. We define the leading edge of what’s possible in our industry and deliver the future for our customers
  • Move with heartfelt urgency. We have a healthy relationship with impatience, channeling it thoughtfully to show up better and faster for our customers and for each other. Time is the most limited thing we have, and we make the most of every moment
  • Say the hard thing with care. Our best work often comes from intelligent debate, critique, and even difficult conversations. We speak our minds and don’t sugarcoat things — and we do so with respect, maturity, and care
  • Make your mark. We seek out new and unique ways to create meaningful impact, and we champion the same from our colleagues. We work as a team to get the job done, and we go out of our way to celebrate and reward those going above and beyond for our customers and our teammates

Benefits & wellness

  • Equity ownership (RSUs) in a growing, privately-owned company
  • 100% employer-paid healthcare, vision, and dental insurance coverage for employees and dependents (full-time employees working 30+ hours per week), as well as Health Savings Account/Health Reimbursement Account, dependent care Flexible Spending Account (US only), dependent on insurance plan selection where applicable in the respective country of employment; Employees may also have voluntary insurance options, such as life, disability, hospital protection, accident, and critical illness where applicable in the respective country of employment
  • 12 weeks of paid parental leave for both birthing and non-birthing caregivers, as well as an additional 6-8 weeks of pregnancy disability for birthing parents to be used before child bonding leave (where local requirements are more generous employees receive the greater benefit); Employees also have access to family planning care and reimbursement
  • Flexible PTO with a mandatory annual minimum of 10 days paid time off for all locations (where local requirements are more generous employees receive the greater benefit), and sabbatical program
  • Access to mental wellness and professional coaching, therapy, and Employee Assistance Program
  • Monthly stipends to support health and wellness, smart work, and professional growth
  • Professional career coaching, internal learning & development programs
  • 401k plan and pension schemes (in countries where statutorily required), and financial wellness benefits, like CPA or financial advisor coverage
  • Discounted Pet Insurance offering (US only)
  • Commuter benefits for in-office employees

Temporary employees are not eligible for paid holiday time off, accrued paid time off, paid leaves of absence, or company-sponsored perks unless otherwise required by law.

Remote, together

At Webflow, equality is a core tenet of our culture. We are an Equal Opportunity (EEO)/Veterans/Disabled Employer and are committed to building an inclusive global team that represents a variety of backgrounds, perspectives, beliefs, and experiences. Employment decisions are made on the basis of job-related criteria without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other classification protected by applicable law. Pursuant to the San Francisco Fair Chance Ordinance, Webflow will consider for employment qualified applicants with arrest and conviction records.

Stay connected

Not ready to apply, but want to be part of the Webflow community? Consider following our story on our Webflow Blog, LinkedIn, X (Twitter), and/or Glassdoor

Please note:

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Upon interview scheduling, instructions for confidential accommodation requests will be administered.

To join Webflow, you'll need a valid right to work authorization depending on the country of employment.

If you are extended an offer, that offer may be contingent upon your successful completion of a background check, which will be conducted in accordance with applicable laws. We may obtain one or more background screening reports about you, solely for employment purposes.

For information about how Webflow processes your personal information, please review Webflow’s Applicant Privacy Notice.

 

See more jobs at Webflow

Apply for this job

+30d

Senior Site Reliability Engineer (Argentina)

Sezzle | Argentina, Remote
Sales, DevOPS, Bachelor's degree, terraform, sql, Design, c++, docker, kubernetes, linux, python, AWS

Sezzle is hiring a Remote Senior Site Reliability Engineer (Argentina)

About the Role: 

We are looking for a Site Reliability Engineer to work on our core Infrastructure and Security team, to assist us with designing, building, running, improving and scaling the infrastructure that engineering and data teams use to power their services. Your duties will include the development, testing, and maintenance of our serving and data platforms, using a combination of cloud products, open source tools and internal applications. Your duties will blend software development and operations in order to continuously automate our environments. You should be able to build high-quality, scalable solutions for a variety of problems.

Our Company:

Sezzle is a cutting-edge fintech company whose long-standing mission is to financially empower the next generation. Sezzle has built a payment platform that increases purchasing power for consumers by offering interest-free installment plans. This increase in purchasing power for consumers leads to increased sales and basket sizes for the numerous eCommerce merchants that currently work with Sezzle. 

What Makes Working at Sezzle Awesome? 

At Sezzle, we are more than just brilliant engineers, passionate data enthusiasts, out-of-the-box thinkers, and determined innovators; we are skilled musicians, yogis, cyclists, chefs, golfers, dog-lovers, and rock-climbers. We believe in surrounding ourselves with not only the best and the brightest individuals, but those that are unique and purpose-driven in all that they do. Our culture is not defined by a certain set of perks designed to give the illusion of the traditional startup culture, but rather, it is the visible example living in every employee that we hire. 

Responsibilities:

  • Design, build and maintain scalable infrastructure for running our systems, based on Kubernetes, Redshift and additional AWS services and products.
  • Help the product teams quickly build out MVP products to test new solutions on the market.
  • Maintain and develop monitoring and alerting solutions to improve the on-call experience.
  • Assist product developers in debugging and triaging production issues.
  • Be the first line of defense for our operational environments, triaging and resolving problems as they occur. You will be on an on-call rotation.
  • Design and scale platform and data architectures to sustain rapid user growth.
  • Level up the teams through pairing, code review, and mentoring.
  • Bring and share with our team extensive experience with industry best practices in software development.

Minimum Requirements: 

  • Bachelor's in computer science (preferred) or equivalent related experience 
  • At least 5 years of overall software, data, deployments and platform infrastructure experience.

Ideal Skills & Experience: 

  • Experience with building and/or serving REST APIs using Go or a similar language.
  • Experience with Relational Databases, SQL and ORM technologies.
  • Strong overall Linux knowledge.
  • DevOps experience with CI/CD pipelines, Docker and Kubernetes, and cloud computing platforms like AWS.
  • Experience with deployment/provisioning tools like Terraform, Helm, Ansible.
  • Experience with implementing and maintaining observability and monitoring tools - Prometheus, Datadog, NewRelic, Grafana, Loki or similar.
  • Experience in ETL/ELT pipelines using Python and Open-source tools such as DBT.
  • Proficiency in building and maintaining large-scale data warehousing technologies such as Redshift.

Sezzle’s Technology Stack:

  • Languages: Golang, TypeScript, Python
  • Frontend: TypeScript - React and React Native
  • Backend: Golang
  • Database: MySQL, Postgres, Elasticsearch
  • DevOps & Cloud: AWS, Kubernetes
  • Version Control: Git
  • CI/CD: GitLab
  • Testing: Developer-driven, focus on automated unit, integration, and end-to-end tests
  • Sezzle is focused on using open source, and we build what we can before buying!

About You: 

  • A+ character. We are team-first here at Sezzle. 
  • A hard-working mentality. It’s early and there is still a lot to build. 
  • An excellent communicator. 
  • A fun attitude. Life’s too short. We can have fun while we work hard on cool things. 
  • Smarts. We need people that are smart enough to make decisions on their own and also smart enough to know when they need input from others. 

Compensation

The compensation range for the role is as follows:

4,600 - 9,000 USD Monthly

Equal Employment Opportunity: Sezzle Inc. is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, color, religion, sex, national origin, age, disability, genetic information, pregnancy, or any other legally protected status. Sezzle recognizes and values the importance of diversity and inclusion in enriching the employment experience of its employees and in supporting our mission.

See more jobs at Sezzle

Apply for this job

+30d

Senior Site Reliability Engineer (Brazil)

Sezzle - Brazil, Remote
Sales, DevOPS, Bachelor's degree, terraform, sql, Design, c++, docker, kubernetes, linux, python, AWS

Sezzle is hiring a Remote Senior Site Reliability Engineer (Brazil)

About the Role: 

We are looking for a Site Reliability Engineer to work on our core Infrastructure and Security team, to assist us with designing, building, running, improving and scaling the infrastructure that engineering and data teams use to power their services. Your duties will include the development, testing, and maintenance of our serving and data platforms, using a combination of cloud products, open source tools and internal applications. Your duties will blend software development and operations in order to continuously automate our environments. You should be able to build high-quality, scalable solutions for a variety of problems.

Our Company:

Sezzle is a cutting-edge fintech company whose long-standing mission is to financially empower the next generation. Sezzle has built a payment platform that increases purchasing power for consumers by offering interest-free installment plans. This increase in purchasing power for consumers leads to increased sales and basket sizes for the numerous eCommerce merchants that currently work with Sezzle. 

What Makes Working at Sezzle Awesome? 

At Sezzle, we are more than just brilliant engineers, passionate data enthusiasts, out-of-the-box thinkers, and determined innovators; we are skilled musicians, yogis, cyclists, chefs, golfers, dog-lovers, and rock-climbers. We believe in surrounding ourselves with not only the best and the brightest individuals, but those that are unique and purpose-driven in all that they do. Our culture is not defined by a certain set of perks designed to give the illusion of the traditional startup culture, but rather, it is the visible example living in every employee that we hire. 

Responsibilities:

  • Design, build and maintain scalable infrastructure for running our systems, based on Kubernetes, Redshift and additional AWS services and products.
  • Help the product teams quickly build out MVP products to test new solutions on the market.
  • Maintain and develop monitoring and alerting solutions to improve the on-call experience.
  • Assist product developers in debugging and triaging production issues.
  • Be the first line of defense for our operational environments, triaging and resolving problems as they occur. You will be on an on-call rotation.
  • Design and scale platform and data architectures to sustain rapid user growth.
  • Level up the teams through pairing, code review, and mentoring.
  • Bring and share with our team extensive experience with industry best practices in software development.

Minimum Requirements: 

  • Bachelor's in computer science (preferred) or equivalent related experience 
  • At least 5 years of overall software, data, deployments and platform infrastructure experience.

Ideal Skills & Experience: 

  • Experience with building and/or serving REST APIs using Go or a similar language.
  • Experience with Relational Databases, SQL and ORM technologies.
  • Strong overall Linux knowledge.
  • DevOps experience with CI/CD pipelines, Docker and Kubernetes, and cloud computing platforms like AWS.
  • Experience with deployment/provisioning tools like Terraform, Helm, Ansible.
  • Experience with implementing and maintaining observability and monitoring tools - Prometheus, Datadog, NewRelic, Grafana, Loki or similar.
  • Experience in ETL/ELT pipelines using Python and Open-source tools such as DBT.
  • Proficiency in building and maintaining large-scale data warehousing technologies such as Redshift.

About You: 

  • A+ character. We are team-first here at Sezzle. 
  • A hard-working mentality. It’s early and there is still a lot to build. 
  • An excellent communicator. 
  • A fun attitude. Life’s too short. We can have fun while we work hard on cool things. 
  • Smarts. We need people that are smart enough to make decisions on their own and also smart enough to know when they need input from others. 

Compensation

The compensation range for the role is as follows:

4,600 - 9,000 USD Monthly

Equal Employment Opportunity: Sezzle Inc. is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, color, religion, sex, national origin, age, disability, genetic information, pregnancy, or any other legally protected status. Sezzle recognizes and values the importance of diversity and inclusion in enriching the employment experience of its employees and in supporting our mission.

See more jobs at Sezzle

Apply for this job

+30d

Senior Site Reliability Engineer (Chile)

Sezzle - Chile, Remote
Sales, DevOPS, Bachelor's degree, terraform, sql, Design, c++, docker, kubernetes, linux, python, AWS

Sezzle is hiring a Remote Senior Site Reliability Engineer (Chile)

About the Role: 

We are looking for a Site Reliability Engineer to work on our core Infrastructure and Security team, to assist us with designing, building, running, improving and scaling the infrastructure that engineering and data teams use to power their services. Your duties will include the development, testing, and maintenance of our serving and data platforms, using a combination of cloud products, open source tools and internal applications. Your duties will blend software development and operations in order to continuously automate our environments. You should be able to build high-quality, scalable solutions for a variety of problems.

Our Company:

Sezzle is a cutting-edge fintech company whose long-standing mission is to financially empower the next generation. Sezzle has built a payment platform that increases purchasing power for consumers by offering interest-free installment plans. This increase in purchasing power for consumers leads to increased sales and basket sizes for the numerous eCommerce merchants that currently work with Sezzle. 

What Makes Working at Sezzle Awesome? 

At Sezzle, we are more than just brilliant engineers, passionate data enthusiasts, out-of-the-box thinkers, and determined innovators; we are skilled musicians, yogis, cyclists, chefs, golfers, dog-lovers, and rock-climbers. We believe in surrounding ourselves with not only the best and the brightest individuals, but those that are unique and purpose-driven in all that they do. Our culture is not defined by a certain set of perks designed to give the illusion of the traditional startup culture, but rather, it is the visible example living in every employee that we hire. 

Responsibilities:

  • Design, build and maintain scalable infrastructure for running our systems, based on Kubernetes, Redshift and additional AWS services and products.
  • Help the product teams quickly build out MVP products to test new solutions on the market.
  • Maintain and develop monitoring and alerting solutions to improve the on-call experience.
  • Assist product developers in debugging and triaging production issues.
  • Be the first line of defense for our operational environments, triaging and resolving problems as they occur. You will be on an on-call rotation.
  • Design and scale platform and data architectures to sustain rapid user growth.
  • Level up the teams through pairing, code review, and mentoring.
  • Bring and share with our team extensive experience with industry best practices in software development.

Minimum Requirements: 

  • Bachelor's in computer science (preferred) or equivalent related experience 
  • At least 5 years of overall software, data, deployments and platform infrastructure experience.

Ideal Skills & Experience: 

  • Experience with building and/or serving REST APIs using Go or a similar language.
  • Experience with Relational Databases, SQL and ORM technologies.
  • Strong overall Linux knowledge.
  • DevOps experience with CI/CD pipelines, Docker and Kubernetes, and cloud computing platforms like AWS.
  • Experience with deployment/provisioning tools like Terraform, Helm, Ansible.
  • Experience with implementing and maintaining observability and monitoring tools - Prometheus, Datadog, NewRelic, Grafana, Loki or similar.
  • Experience in ETL/ELT pipelines using Python and Open-source tools such as DBT.
  • Proficiency in building and maintaining large-scale data warehousing technologies such as Redshift.

About You: 

  • A+ character. We are team-first here at Sezzle. 
  • A hard-working mentality. It’s early and there is still a lot to build. 
  • An excellent communicator. 
  • A fun attitude. Life’s too short. We can have fun while we work hard on cool things. 
  • Smarts. We need people that are smart enough to make decisions on their own and also smart enough to know when they need input from others. 

Sezzle’s Technology Stack:

  • Languages: Golang, TypeScript, Python
  • Frontend: TypeScript - React and React Native
  • Backend: Golang
  • Database: MySQL, Postgres, Elasticsearch
  • DevOps & Cloud: AWS, Kubernetes
  • Version Control: Git
  • CI/CD: GitLab
  • Testing: Developer-driven, focus on automated unit, integration, and end-to-end tests
  • Sezzle is focused on using open source, and we build what we can before buying!

Compensation

The compensation range for the role is as follows:

4,600 - 9,000 USD Monthly

Equal Employment Opportunity: Sezzle Inc. is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, color, religion, sex, national origin, age, disability, genetic information, pregnancy, or any other legally protected status. Sezzle recognizes and values the importance of diversity and inclusion in enriching the employment experience of its employees and in supporting our mission.

See more jobs at Sezzle

Apply for this job

+30d

Senior Site Reliability Engineer (Colombia)

Sezzle - Colombia, Remote
Sales, DevOPS, Bachelor's degree, terraform, sql, Design, c++, docker, kubernetes, linux, python, AWS

Sezzle is hiring a Remote Senior Site Reliability Engineer (Colombia)

About the Role: 

We are looking for a Site Reliability Engineer to work on our core Infrastructure and Security team, to assist us with designing, building, running, improving and scaling the infrastructure that engineering and data teams use to power their services. Your duties will include the development, testing, and maintenance of our serving and data platforms, using a combination of cloud products, open source tools and internal applications. Your duties will blend software development and operations in order to continuously automate our environments. You should be able to build high-quality, scalable solutions for a variety of problems.

Our Company:

Sezzle is a cutting-edge fintech company whose long-standing mission is to financially empower the next generation. Sezzle has built a payment platform that increases purchasing power for consumers by offering interest-free installment plans. This increase in purchasing power for consumers leads to increased sales and basket sizes for the numerous eCommerce merchants that currently work with Sezzle. 

What Makes Working at Sezzle Awesome? 

At Sezzle, we are more than just brilliant engineers, passionate data enthusiasts, out-of-the-box thinkers, and determined innovators; we are skilled musicians, yogis, cyclists, chefs, golfers, dog-lovers, and rock-climbers. We believe in surrounding ourselves with not only the best and the brightest individuals, but those that are unique and purpose-driven in all that they do. Our culture is not defined by a certain set of perks designed to give the illusion of the traditional startup culture, but rather, it is the visible example living in every employee that we hire. 

Responsibilities:

  • Design, build and maintain scalable infrastructure for running our systems, based on Kubernetes, Redshift and additional AWS services and products.
  • Help the product teams quickly build out MVP products to test new solutions on the market.
  • Maintain and develop monitoring and alerting solutions to improve the on-call experience.
  • Assist product developers in debugging and triaging production issues.
  • Be the first line of defense for our operational environments, triaging and resolving problems as they occur. You will be on an on-call rotation.
  • Design and scale platform and data architectures to sustain rapid user growth.
  • Level up the teams through pairing, code review, and mentoring.
  • Bring and share with our team extensive experience with industry best practices in software development.

Minimum Requirements: 

  • Bachelor's in computer science (preferred) or equivalent related experience 
  • At least 5 years of overall software, data, deployments and platform infrastructure experience.

Ideal Skills & Experience: 

  • Experience with building and/or serving REST APIs using Go or a similar language.
  • Experience with Relational Databases, SQL and ORM technologies.
  • Strong overall Linux knowledge.
  • DevOps experience with CI/CD pipelines, Docker and Kubernetes, and cloud computing platforms like AWS.
  • Experience with deployment/provisioning tools like Terraform, Helm, Ansible.
  • Experience with implementing and maintaining observability and monitoring tools - Prometheus, Datadog, NewRelic, Grafana, Loki or similar.
  • Experience in ETL/ELT pipelines using Python and Open-source tools such as DBT.
  • Proficiency in building and maintaining large-scale data warehousing technologies such as Redshift.

About You: 

  • A+ character. We are team-first here at Sezzle. 
  • A hard-working mentality. It’s early and there is still a lot to build. 
  • An excellent communicator. 
  • A fun attitude. Life’s too short. We can have fun while we work hard on cool things. 
  • Smarts. We need people that are smart enough to make decisions on their own and also smart enough to know when they need input from others. 

Sezzle’s Technology Stack:

  • Languages: Golang, TypeScript, Python
  • Frontend: TypeScript - React and React Native
  • Backend: Golang
  • Database: MySQL, Postgres, Elasticsearch
  • DevOps & Cloud: AWS, Kubernetes
  • Version Control: Git
  • CI/CD: GitLab
  • Testing: Developer-driven, focus on automated unit, integration, and end-to-end tests
  • Sezzle is focused on using open source, and we build what we can before buying!

Compensation

The compensation range for the role is as follows:

4,600 - 9,000 USD Monthly

Equal Employment Opportunity: Sezzle Inc. is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, color, religion, sex, national origin, age, disability, genetic information, pregnancy, or any other legally protected status. Sezzle recognizes and values the importance of diversity and inclusion in enriching the employment experience of its employees and in supporting our mission.

See more jobs at Sezzle

Apply for this job

+30d

Site Reliability Engineer - II (SRE II)

Live Person - Hyderabad, Telangana, India (Remote)
DevOPS, terraform, nosql, postgres, sql, ansible, mongodb, azure, elasticsearch, MySQL, kubernetes, linux, jenkins, AWS

Live Person is hiring a Remote Site Reliability Engineer - II (SRE II)

LivePerson (NASDAQ: LPSN) is the global leader in enterprise conversations. Hundreds of the world’s leading brands — including HSBC, Chipotle, and Virgin Media — use our award-winning Conversational Cloud platform to connect with millions of consumers. We power nearly a billion conversational interactions every month, providing a uniquely rich data set and safety tools to unlock the power of Conversational AI for better customer experiences.

At LivePerson, we foster an inclusive workplace culture that encourages meaningful connection, collaboration, and innovation. Everyone is invited to ask questions, actively seek new ways to achieve success, and reach their full potential. We are continually looking for ways to improve our products and make things better. This means spotting opportunities, solving ambiguities, and seeking effective solutions to the problems our customers care about.

Overview:

LivePerson is looking for a Site Reliability Engineer for the GPT (Global Product & Technology) Division. You will be part of the LivePerson SRE team building and managing highly available, distributed systems. You will have the opportunity to be part of a strong team and enjoy the work environment of a start-up, with a robust product and the benefits of a leading company in its field.

You will: 

  • Ensure high product uptime and reliability 24x7.
  • Manage Linux servers in a multi-cloud environment.
  • Manage high-availability Kubernetes resources using Helm charts.
  • Assist with deploying upgrades and patches using Chef/Ansible/Puppet/Helm.
  • Monitor and troubleshoot warnings and alerts related to the reporting platform’s performance.
  • Develop monitoring resources and alerting systems using tools such as Grafana, Prometheus, Kibana, DataDog and PagerDuty.
  • Coordinate with DBAs and developers to manage SQL and NoSQL database systems, including MongoDB, ElasticSearch, Postgres, MySQL and others.
  • Manage message bus systems such as Kafka and Pulsar.
  • Build and maintain CI/CD pipelines using Jenkins/Gitlab/Teamcity.

You have:

  • At least 4 years of experience managing cloud-based production environments (AWS, GCP, Azure, etc.).
  • Extensive experience working in Linux environments, with good scripting skills in Bash/Python.
  • Extensive experience working with configuration management systems such as OpsCode Chef, Ansible, Puppet, etc.
  • Strong experience with Terraform, CloudFormation or other IaC tools.
  • Experience with SQL, including DDL and complex queries.
  • Experience working with the Kubernetes platform.
  • Experience working in a microservices architecture using a message bus.
  • Good knowledge of CI/CD pipeline orchestrators such as TeamCity, Jenkins, Gitlab.
  • Ability to integrate security best practices into the SRE workflow.
  • Highly motivated and independent.
  • Team player with excellent interpersonal skills.
  • Excellent written and verbal communication skills.
  • BS in Computer Science or a related field, or equivalent work experience.
  • A strong background in cloud, network and application security and compliance.
  • Experience with GPT or other LLMs is a strong advantage.

Benefits

  • Health: Medical, Dental, and Vision
  • Time away: Vacation and holidays
  • Development: Generous tuition reimbursement and access to internal professional development resources.
  • Equal opportunity employer

Why You’ll Love Working Here

As leaders in enterprise customer conversations, we celebrate diversity, empowering our team to forge impactful conversations globally. LivePerson is a place where uniqueness is embraced, growth is constant, and everyone is empowered to create their own success. And, we're very proud to have earned recognition from Fast Company, Newsweek, and BuiltIn for being a top innovative, beloved, and remote-friendly workplace.

Belonging At LivePerson

We are proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants with criminal histories, consistent with applicable federal, state, and local law.

We are committed to the accessibility needs of applicants and employees. We provide reasonable accommodations to job applicants with physical or mental disabilities. Applicants with a disability who require reasonable accommodation for any part of the application or hiring process should inform their recruiting contact upon initial connection.

Apply for this job

+30d

Senior Site Reliability Engineer

Workable - Athens, Attica, Greece, Remote Hybrid
kubernetes

Workable is hiring a Remote Senior Site Reliability Engineer

Workable makes software to help companies find and hire great people. We get recruiting and its role in building healthy workplaces — which is why we’re proud more than 20,000 teams around the world use Workable to do exactly that.

At Workable, you’ll find smart people who have fun, learn and innovate, and help others do the same. We brainstorm, we laugh, and, occasionally, we party (there’s a lot to celebrate), but we also appreciate people’s need for quiet time and focused work. We respect everyone, we hire the best, and make sure every experience is special.

We’re growing fast and we want to make sure that we scale from thousands to hundreds of thousands, so we’re looking for a Senior Site Reliability Engineer to join our SRE team.

Our product is built with a microservices architecture deployed on the Kubernetes platform. Our SRE team is responsible for deploying, monitoring, optimizing, and securing our cloud infrastructure and company software; both rapidly expanding. Automation is at the core of what we do. If you love working with new technologies, open-source software, and solving complex problems on highly distributed systems then this is the job for you! You will be part of a talented team of engineers that demonstrate superb technical competency, delivering mission-critical infrastructure and ensuring the highest levels of availability, performance, and security. 


As a Senior Site Reliability Engineer in this team with an emphasis on Tools and Automations, you will be responsible for the following:

  • Develop tools and automations to make operations and deployments simpler and more robust.
  • Operate, deploy, and monitor cloud services from development to production.
  • Work in a highly cross-functional team with Developers on designing, releasing, and troubleshooting production systems.
  • Be responsible for the availability, scalability, and performance of our systems.
  • Troubleshoot issues, do capacity planning, and analyze system performance.
  • Lead projects within the team and be responsible for their timely delivery.

Requirements:

    • BS/MS degree in Computer Science, Engineering (or a proven strong background)
    • Excellent communication skills in English, particularly written communication.
    • Analytical and troubleshooting skills on large-scale distributed systems
    • Work autonomously and be able to deliver projects on time.
    • Passion for cutting-edge cloud technologies and automation
    • Strong curiosity for discovering new insights and eager to challenge the status quo
    • 5+ years of relevant work experience, including programming experience
    • Experience with the Kubernetes platform and technology stack
    • Experience with a major cloud provider (GCP and AWS preferred)
    • Experience with configuration management and orchestration tools (e.g., Ansible, Terraform)
    • Experience with centralized logging, monitoring systems, and tooling frameworks
    • Deep knowledge of Linux systems
    • Familiarity with at least one programming language (preferably Go, Python, Java, C++)
    • Familiarity with Relational and NoSQL (MongoDB, Redis, Elastic, etc.) databases 
    • Oh, and if you're into DevOps technologies and the CNCF ecosystem, but have experience with other frameworks, please do apply. We value quality engineers, not the tools they've used.

Preferred qualifications:

    • Bonus: Networking skills, especially TCP/IP, HTTP, DNS and load balancers

Our employees enjoy benefits that make them more productive and contribute directly to the development of their professional skills. We want to be able to attract the best of the best and make sure they keep getting better. On top of an exciting, vibrant, and intellectually challenging environment, we are offering:

  • An attractive salary and a bonus plan
  • Health insurance plan including dependents
  • Mobile data plan
  • Apple gear and access to the best productivity tools
  • Annual retreats in awesome locations

Workable is most decidedly an equal-opportunity employer. We want applicants of diverse backgrounds and hire without regard to color, gender, religion, national origin, citizenship, disability, age, sexual orientation, or any other characteristic protected by law.

See more jobs at Workable

Apply for this job

+30d

Lead Site Reliability Engineer

Bachelor's degree, kotlin, terraform, sql, Design, ansible, git, java, c++, docker, postgresql, MySQL, typescript, kubernetes, python

hims & hers is hiring a Remote Lead Site Reliability Engineer

Hims & Hers Health, Inc. (better known as Hims & Hers) is the leading health and wellness platform, on a mission to help the world feel great through the power of better health. We are revolutionizing telehealth for providers and their patients alike. Making personalized solutions accessible is of paramount importance to Hims & Hers and we are focused on continued innovation in this space. Hims & Hers offers nonprescription products and access to highly personalized prescription solutions for a variety of conditions related to mental health, sexual health, hair care, skincare, heart health, and more.

Hims & Hers is a public company, traded on the NYSE under the ticker symbol “HIMS”. To learn more about the brand and offerings, you can visit hims.com and forhers.com, or visit our investor site. For information on the company’s outstanding benefits, culture, and its talent-first flexible/remote work approach, see below and visit www.hims.com/careers-professionals.

About the Role:

We are seeking a Lead Site Reliability Engineer to help build a reliable web experience for our users. We believe that moving fast is our competitive advantage, and enables us to better serve our users. We also know that the faster we move, the more likely we are to break things.

You Will:

  • Design and implement SRE practices ensuring availability, scalability and observability of production systems with a strong focus on excellent customer experience
  • Actively seek and identify opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation
  • Use automation extensively to design, configure, manage, and monitor systems in support of our product development teams
  • Understand infrastructure and infrastructure automation (Infrastructure as Code)
  • Manage incidents and emergency response, track outages, ensure data integrity and engineer releases to promote safe, efficient and rapid deployments
  • Handle emergency response either by being on-call or by reacting to symptoms according to monitoring and escalation when needed
  • Improve the codebase by resolving logic issues, deprecating unused code, etc.
  • Implement monitoring, logging, alerting and SLO Reporting
  • Identify Service Level Indicators (SLIs) that will align the team to meet the availability and performance objectives
  • Perform and run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent incident reoccurrence
  • Provide reviews on design documents from internal and external teams
  • Perform more complex tasks using highly specialized knowledge and advanced business experience
  • Resolve complex tickets in creative ways
  • Develop and lead large and highly complex cross-functional projects or programs
  • Determine solutions to blockers, identify tasks, and develop solutions as appropriate
  • Be responsible for at least one major delivery domain and accountable for all aspects of SRE for that domain
  • Develop standards, tools, and knowledge requirements for skill and career development

You Have:

  • 10+ years as a software engineer, shipping production code
  • 5+ years of experience as a Site Reliability Engineer or Production Support Engineer
  • Bachelor's degree in Computer Science, Engineering, or related field, or relevant years of work experience
  • Experience with service-oriented architectures and microservices at scale
  • Strong proficiency with RDBMS databases (PostgreSQL, MySQL, SQL Server, etc.)
  • Strong proficiency in SQL scripting
  • Proficiency developing in one or more languages such as Java, Kotlin, Python, and/or others
  • Ability to use containers and orchestration frameworks (Kubernetes, Docker, container registries, etc.)
  • Knowledge of CDNs, TypeScript frameworks, and GQL
  • Good understanding of pub/sub or queue-based messaging systems
  • Proficiency in Git or other VCS
  • Experience with configuring, customizing, and extending monitoring tools (Datadog, Prometheus, New Relic etc.)
  • Excellent debugging and troubleshooting skills
  • Strong technical competency, with a data-driven analytical approach towards solving complex challenges
  • A systematic problem-solving approach, coupled with strong and effective communication skills and a sense of drive
  • Nice-to-have: Experience with Terraform or other IaC tools such as Chef, Puppet, or Ansible

Our Benefits (there are more but here are some highlights):

  • Competitive salary & equity compensation for full-time roles
  • Unlimited PTO, company holidays, and quarterly mental health days
  • Comprehensive health benefits including medical, dental & vision, and parental leave
  • Employee Stock Purchase Program (ESPP)
  • Employee discounts on hims & hers & Apostrophe online products
  • 401k benefits with employer matching contribution
  • Offsite team retreats

 

Outlined below is a reasonable estimate of H&H’s compensation range for this role for US-based candidates. If you're based outside of the US, your recruiter will be able to provide you with an estimated salary range for your location.

The actual amount will take into account a range of factors that are considered in making compensation decisions, including but not limited to skill sets, experience and training, licensure and certifications, and location. H&H also offers a comprehensive Total Rewards package that may include an equity grant.

Consult with your Recruiter during any potential screening to determine a more targeted range based on location and job-related factors.

An estimate of the current salary range is
$150,000 - $175,000 USD

We are focused on building a diverse and inclusive workforce. If you’re excited about this role, but do not meet 100% of the qualifications listed above, we encourage you to apply.

Hims considers all qualified applicants for employment, including applicants with arrest or conviction records, in accordance with the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance, the California Fair Chance Act, and any similar state or local fair chance laws.

Hims & Hers is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, please contact us at accommodations@forhims.com and describe the needed accommodation. Your privacy is important to us, and any information you share will only be used for the legitimate purpose of considering your request for accommodation. Hims & Hers gives consideration to all qualified applicants without regard to any protected status, including disability. Please do not send resumes to this email address.

For our California-based applicants – Please see our California Employment Candidate Privacy Policy to learn more about how we collect, use, retain, and disclose Personal Information. 

See more jobs at hims & hers

Apply for this job

+30d

Site Reliability Engineer

Bachelor's degree, kotlin, sql, Design, git, java, c++, postgresql, MySQL, python

hims & hers is hiring a Remote Site Reliability Engineer

Hims & Hers Health, Inc. (better known as Hims & Hers) is the leading health and wellness platform, on a mission to help the world feel great through the power of better health. We are revolutionizing telehealth for providers and their patients alike. Making personalized solutions accessible is of paramount importance to Hims & Hers and we are focused on continued innovation in this space. Hims & Hers offers nonprescription products and access to highly personalized prescription solutions for a variety of conditions related to mental health, sexual health, hair care, skincare, heart health, and more.

Hims & Hers is a public company, traded on the NYSE under the ticker symbol “HIMS”. To learn more about the brand and offerings, you can visit hims.com and forhers.com, or visit our investor site. For information on the company’s outstanding benefits, culture, and its talent-first flexible/remote work approach, see below and visit www.hims.com/careers-professionals.

About the Role:

We are seeking a Site Reliability Engineer to help build a reliable web experience for our users. We believe that moving fast is our competitive advantage and that it enables us to better serve our users. We also know that the faster we move, the more likely we are to break things.

You Will:

  • Actively seek and identify opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation.
  • Use automation extensively to design, configure, manage, and monitor systems in support of our product development teams
  • Implement SRE practices ensuring availability, scalability and observability of production systems with a strong focus on excellent customer experience
  • Apply an understanding of Infrastructure as Code
  • Manage incidents and emergency response, track outages, ensure data integrity, and engineer releases to promote rapid deployments
  • Handle emergency response either by being on-call or by reacting to symptoms according to monitoring and escalation when needed
  • Implement monitoring, logging, alerting and SLO Reporting
  • Identify Service Level Indicators (SLIs) that will align the team to meet the availability and performance objectives
  • Run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent incident recurrence
  • Demonstrate strong technical skills and expertise in at least one object-oriented programming language
  • Independently handle complex technical tasks in projects

You Have:

  • 3+ years as a software engineer, shipping production code.
  • 1+ years of experience as a Site Reliability Engineer or Production Support Engineer
  • Bachelor's degree in Computer Science, Engineering, or related field, or relevant years of work experience
  • Proficiency with RDBMS databases (PostgreSQL, MySQL, SQL Server, etc.)
  • Proficiency in SQL scripting
  • Proficiency developing in one or more languages such as Java, Kotlin, Python, and/or others
  • Proficiency in Git or other VCS
  • Good debugging and troubleshooting skills
  • Strong technical competency, with a data-driven analytical approach towards solving complex challenges

Our Benefits (there are more but here are some highlights):

  • Competitive salary & equity compensation for full-time roles
  • Unlimited PTO, company holidays, and quarterly mental health days
  • Comprehensive health benefits including medical, dental & vision, and parental leave
  • Employee Stock Purchase Program (ESPP)
  • Employee discounts on hims & hers & Apostrophe online products
  • 401k benefits with employer matching contribution
  • Offsite team retreats

 

Outlined below is a reasonable estimate of H&H’s compensation range for this role for US-based candidates. If you're based outside of the US, your recruiter will be able to provide you with an estimated salary range for your location.

The actual amount will take into account a range of factors that are considered in making compensation decisions, including but not limited to skill sets, experience and training, licensure and certifications, and location. H&H also offers a comprehensive Total Rewards package that may include an equity grant.

Consult with your Recruiter during any potential screening to determine a more targeted range based on location and job-related factors.

An estimate of the current salary range for US-based employees is
$103,000 - $117,000 USD

We are focused on building a diverse and inclusive workforce. If you’re excited about this role, but do not meet 100% of the qualifications listed above, we encourage you to apply.

Hims considers all qualified applicants for employment, including applicants with arrest or conviction records, in accordance with the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance, the California Fair Chance Act, and any similar state or local fair chance laws.

Hims & Hers is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, you may contact us at accommodations@forhims.com. Please do not send resumes to this email address.

For our California-based applicants – Please see our California Employment Candidate Privacy Policy to learn more about how we collect, use, retain, and disclose Personal Information. 

See more jobs at hims & hers

Apply for this job

+30d

Site Reliability Engineer

Float.com | New York, United States, Remote
terraform, slack, qa, kubernetes, python, PHP

Float.com is hiring a Remote Site Reliability Engineer

Who We Are

Float is the world’s leading software for teams to plan their time. Launched in 2012, we’ve grown every year since, and remain proudly independent, self-funded and profitable. As a certified B Corporation, we’re committed to making a positive contribution to our team, customers, the environment, and the remote community. We’re a team of 50 working 100% remotely who believe in living our Best Work Life. You’ll partner with team members globally, including in Australia, Mexico, Italy, Nigeria, Canada, and the USA. Hear what our team has to say by browsing our blog, or reading our Glassdoor reviews. Check out what our customers think of Float from our G2 reviews.

We’re on a scale-up journey, and we’re seeking people who thrive in this stage when given the autonomy and the opportunity to do the best work of their career.

Why We’re Hiring For This Role

The role of Site Reliability Engineers at Float is to increase the autonomy of the product and engineering teams by growing their capabilities to focus on solving problems. SRE makes sure our engineers get scalable infrastructure to build software on, and that pipelines from idea to customer run smoothly and are easy to build upon. We also deal with broad areas of security around our network, including defining internal security policy and practices.

Our goals for the Engineering team are to increase the pace with which they deliver improvements for our customers, provide an increasingly sophisticated and reliable service from our teams, and mitigate external threats as we grow.

You will help us tackle those problems by increasing the reliability of our services to support larger clients joining Float, and by strengthening the robust security systems we’ve implemented to continue protecting our growing customer base.

Chris Nash, our Team Lead (SRE & QA), explains the important role you will play within our SRE team. Watch this video.

You’ll be working asynchronously with a bright, dedicated team from across the globe, with a strong focus on taking complex problems and creating solutions that feel simple and intuitive for our customers.

What You’ll Be Responsible For

Early on, you’ll jump right into:

  • Continuing to support the regular maintenance of all the engineering systems supporting Float’s customers
  • Identifying areas requiring support to scale
  • Identifying areas for improving service resilience, ultimately delivering the ability to be resilient within the product and engineering teams themselves
  • Optimizing our monitoring and observability stack, building on the knowledge to create a standard set of tools and configurations for the product and engineering teams
  • Understanding Float’s SLOs in context, and building out SLO patterns and procedures for product and engineering teams

Once you are settled, we expect that you will jump into the following projects:

  • Building a repeatable and trustworthy disaster recovery program using chaos engineering techniques
  • Migrating all of our deployment configurations to a global single source of truth
  • Expanding Float’s infrastructure across multiple regions to create a global network

What You’ll Need To Be Successful

We want you to love your work and believe that these skills will allow you to succeed in the role.

Applying these skills requires:

  • An excellent understanding of how SRE operates as an enabling team
  • A very good understanding of Service Level Objectives
  • Working experience with Terraform, Bash, and a go-to language, ideally one of PHP, Node.js, or Python
  • Experience with Kubernetes and GCP would be highly valued

As a fully remote team, we’re looking for someone comfortable with asynchronous communication as the default, which means you have previous remote experience and are comfortable using tools like Slack, Loom, and Linear to communicate as needed. Don’t worry—you will have significant deep work time since we have very few meetings.

Why Join Us

Pay for this role is US $167,471 (Level 3). Here’s a blog post with more information on how we determine our salaries.

We’re a global async remote company with a diverse team of people from all over the world who share a common belief in living our best work life. We believe deeply in the idea of transparency and share our Float Handbook publicly so potential new team members can see first hand our perks & benefits as well as our ways of working. If you feel like you can thrive at Float to do your best work, we would love to hear from you.

Hiring Process For This Role

You’ll find a lot of useful information about our interview process and what it’s like to join our global team on the Float careers page. The hiring process for this role looks like this:

  • Initial First Meet (20 min): You'll meet with Julia Fulton, Talent Manager, to discuss your interest in the role and review your questions about working at Float.
  • Take-Home Assignment: Candidates who move forward will be invited to complete a take-home assignment for the engineering team to review. This is a 4-hour assignment. Candidates will receive high-level feedback from the hiring team, and those who move forward will proceed to the technical interview stage to discuss the results in more detail.
  • Technical Interview (45 min): You’ll meet with Chris Nash (Team Lead, SRE & QA) and Bogdan Frunza (Senior SRE) to discuss more about your technical experience. This will be a great opportunity for you to ask any questions and talk about goals for the role.
  • Leadership Interview (45 min): You’ll meet with Lars Gelfin (CTO) and Colin Ross (Director of Engineering) to discuss more about your experience. This will be a great opportunity for you to ask any questions and talk about goals for the role.
  • Founder Interview (30 min): Glenn, Float’s CEO, will meet with you to get to know you and see whether you have the potential to be a great addition to the team.

Note: Industry research shows that women and those in traditionally underrepresented groups generally don’t apply to jobs unless they check all the boxes for the role. If you feel strongly that you have what it takes for this role but don’t check 100% of the boxes—that’s okay—we encourage you to apply anyway and highlight what you can bring to the table.

See more jobs at Float.com

Apply for this job