Site Reliability Engineer (Fully Remote)
Salary: $71,000 - 91,000/year
We are looking for an experienced Site Reliability Engineer with an operational and/or site reliability engineering background and a passion for providing superior system availability and customer experience. We are looking for candidates who can lead a 24/7 support organization and drive reliability and performance at massive scale by mastering the full depth of the stack. As an SRE, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while using your expertise in the delivery and support of critical services.
Job Responsibilities:
Effectively manage troubleshooting and recovery of complex production incidents, ranging from low to critical impacts
Drive incident resolution through a systematic problem solving approach, coupled with a strong sense of ownership and drive
Actively participate in teams’ Agile stories (project work) to streamline and enhance day to day operations of the team
Create, manage and utilize appropriate technical procedural documentation (run books)
Proactively monitor all of the applications and infrastructure behind Capital One’s external and internal customer facing services including their availability, latency, performance, and capacity
Influence resiliency and scalability in production environments in Amazon Web Services (AWS)
Identify opportunities and develop proactive automated monitoring and alerting solutions by utilizing available tools (Splunk, DataDog, etc.)
Assist with conducting Root Cause Analysis (RCA) on critical production outages, and develop and implement mitigation strategies
Utilize production support expertise to influence and support new designs, architectures, standards and methods maintaining stability and availability for large-scale distributed systems
Proactively identify and implement opportunities for automation of routine maintenance tasks, data gathering and resolution of common issues
Continuously seek to develop new skills and technical expertise, as well as proactively share knowledge with others
Basic Qualifications:
At least 3 years of experience in technology production support
Azure DevOps experience
2+ years of experience with Linux, UNIX, Python, Ruby, Go, JavaScript, or NoSQL
2+ years of experience with AWS, Azure or GCP
2+ years of experience with web API services
2+ years of experience with Splunk, New Relic, or DataDog monitoring and alerts
See more jobs at Latitude, Inc.
Germany / Remote
Corsearch’s solutions are revolutionizing how companies commercialize and protect their growth. Trusted by thousands of customers worldwide, Corsearch delivers data, analytics, and services that support brands to market their assets and reduce commercial risks.
From IP clearance to brand protection, Corsearch provides a comprehensive program that enables businesses to secure brand value and thrive commercially. Behind the world’s best-known brands, there’s Corsearch.
Corsearch has more than 1,500 team members serving over 5,000 clients on five continents, and we’re growing and changing rapidly. We are a fantastic company to work for — with great benefits, growth opportunities, and a terrific internal culture — and we truly believe that it’s people who make us thrive.
Corsearch is growing fast and is always looking for new talented people to be part of the journey.
About the Position
Requirements:
See more jobs at Incopro
We’re looking for a Senior Infrastructure Engineer with a passion for developing and providing stable infrastructure for backend applications. In this role, you will work with modern infrastructure architecture, CI/CD flow build-up, SRE culture, IaC concepts, and more.
What you will do:
Who you are:
What we offer
GOGOX is the first on-demand logistics and transportation platform in Asia. As a pioneer among tech and logistics startups, we transform the logistics industry, by making use of the trending sharing economy concept and embracing the beauty of simplicity and efficiency.
Over the years, GOGOX has expanded its business from Hong Kong to Singapore, South Korea, Mainland China, Taiwan and India and will continue to expand globally. If you share our vision and enjoy working in a creative, innovative and fun environment, apply to join our team and start your GOGOVanture today.
See more jobs at GOGOX
Verimatrix is seeking a talented and experienced engineer to join our Site Reliability Engineering (SRE) team. We are looking for someone who is passionate about SRE and can help evangelize proper practices and mindsets. This position will also help build out and refine our observability stack to provide actionable data to other teams, provide fine-grained metrics for our alerting and on-call management system, and take a more proactive approach to issues. Lastly, as a member of the SRE Team, this position will provide Tier 2 support and help improve the quality of our Runbooks. Bring your experience to help us prepare for scale by adopting industry best practices in availability, security, observability, reliability, and automation.
If this sounds like a challenge for you and you are a problem solver who loves collaboration, this position may be for you! We are operational, but now we need you to help us reach operational excellence. Be prepared to partner with other teams and collaborate across all functions in our organization. Learn what it is like to work in a company where transparency and visibility are valued. We encourage shared goals and objectives across teams. We care about our culture and our people.
Bring your SRE skills to Verimatrix and help us become proactive and able to anticipate problems rather than just be reactive. Solve hard problems with software and automation. Be part of a team and company who support each other and strive to have a positive impact on our customers.
What we are looking for:
QUALIFICATION REQUIREMENTS: To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed below are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
Verimatrix (Euronext Paris: VMX) helps power the modern connected world with security made for people. We protect digital content, applications, and devices with intuitive, people-centered, and frictionless security. Leading brands turn to Verimatrix to secure everything from premium movies and live streaming sports to sensitive financial and healthcare data, to mission-critical mobile applications. We enable the trusted connections our customers depend on to deliver compelling content and experiences to millions of consumers around the world. Verimatrix helps partners get to market faster, scale easily, protect valuable revenue streams, and win new business. To learn more, visit www.verimatrix.com.
See more jobs at Verimatri
Site Reliability Engineer (SRE) (h/f)
Today, video accounts for over 80% of all internet traffic!
We are increasingly living in a video-first world where our online experiences are dominated by real-time, streaming, and on-demand video.
At api.video our mission is to connect people through their cameras and videos. We are a global API-first platform managing and delivering online video at scale and our goal is to become the standard for how modern teams bring video experiences into their products and services.
Just like Stripe for payments, Twilio for text/VOIP, and Sendgrid for email, we're making video accessible to every client and developer via our API, the world over.
As our company is scaling we’re looking to double the size of the team within the year.
We’re looking to add talented minds able to work during CET/CEST business hours.
In this role you’ll participate in the design, development and run of api.video's infrastructure, enabling developers and clients in more than 100 countries to quickly integrate all the features needed to deliver live or on-demand video into their applications and services.
A unique opportunity to be an early member of a success story. A welcoming and collaborative environment with people who love working on complex issues. With ambitious objectives, this role offers the opportunity to push your learning curve.
As a Site Reliability Engineer, you will be in charge of service continuity and will focus on creating and maintaining solutions to achieve that goal. You will be the owner of the reliability of the infrastructure stack.
Among the subjects you will be working on:
See more jobs at api.video
Site Reliability Engineer (SRE) (PeopleFluent) UK, Remote
PeopleFluent is hiring! We have an exciting opportunity for a Site Reliability Engineer to join our Hosting team.
The ideal candidate will genuinely enjoy solving operational and development problems using the latest and greatest technologies / methodologies. We also need someone who knows how to play well with others (especially the super fun and interesting people we have on our team).
We expect you to have at least 3 years of professional experience in Systems Administration, Applications Development, Software Engineering, and/or Configuration Management. At least 1 year of professional experience (or more!) as an SRE is highly desired!
PeopleFluent provides flexible cloud solutions that put learning at the heart of talent strategy. As a market leader in integrated talent management and learning solutions, PeopleFluent helps companies hire, develop, and advance a skilled and motivated workforce. Whether they're deployed separately or as a suite, our Recruiting, Onboarding, Performance, Succession, Compensation, and Learning solutions deliver a superior user experience that guides managers and employees with contextual learning – right in the flow of work!
PeopleFluent Learning is part of Learning Technologies Group plc (LTG).
For more information, please visit www.peoplefluent.com and/or www.ltgplc.com.
We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, disability, or any other federal, state, or local protected class.
See more jobs at LTG
If you are a talented Site Reliability Engineer (SRE), this is the position for you! This is a great opportunity to work with a company that has a primary focus of making our customers happy by delivering value without all the burdensome policies and rules that have become typical for outsourced software development companies. If you work at Tech9, we will ensure that you are happy because at Tech9, we #techhappily!
Note: This role is 100% remote. You will not be required to come to the office.
If that sounds attractive, please apply! We'd love to talk to you.
Minimum Qualifications:
See more jobs at Tech9
Site Reliability Engineer (Laravel/Vue/AWS/ECS) at
SportsRecruits is the leading sports recruiting network, connecting athletes, clubs, events, and college coaches in the recruiting process. The company’s network and tools are trusted by sports organizations such as the IWLCA, IMLCA, and Junior Volleyball Association. Every year, millions of connections are made on the network, resulting in commitments to the best academic and athletic institutions.
SportsRecruits is an equal opportunity employer and embraces diversity and equal opportunity on our team. Just like the student-athletes we support, we are trying to get better and stronger as a team every day. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. We strongly believe that the more inclusive our team is, the better we can serve all student-athletes, as well as their families and coaches, who are pursuing their dreams.
About the Position
We are a product development team full of fun, intelligent, happy, and hardworking engineers, designers and product managers distributed across the United States. We are profitable, funded and giving more high school athletes the ability to play college athletics than any other recruiting tool out there. Your input and coding/problem solving skills will make a direct impact in how we scale and grow the company.
We are looking for an SRE to join our team working remotely. We are looking for someone who is a programmer first and is capable of debugging code, refactoring code, and writing tests. The two main technologies in which we invest most of our resources are Laravel (a modern PHP framework), which we use for our API, and Vue.js (a JavaScript frontend framework), which currently powers the frontend application.
As a Site Reliability Engineer on the Infra/DevOps team, you will take over setting up incident response protocols, the Continuous Integration pipeline, and performance indicators, reducing technical debt, and diagnosing and identifying issues surfaced by New Relic and Sentry.
You will spearhead SRE practices like:
You will aid with DevOps / infrastructure responsibilities:
Requirements:
Nice to have:
What we offer:
It’s important to us that our team is happy, and we're always looking for ways to improve our overall work culture and support our employees’ well-being. Here are a few of the benefits we offer at this time:
This is a full-time position available as remote or in NYC; no freelancers, please. Principals only; no recruiters, please.
See more jobs at SportsRecruits
Site Reliability Engineer, evertz.io (Poland)
Skills and experience you will bring:
• 3 years of experience managing critical production infrastructure and maintaining reliability and uptime of serverless applications running on the cloud.
• 2 years of experience with monitoring, log-aggregation, and observability services like Datadog, CloudWatch, Honeycomb, Splunk, and New Relic.
• 2 years of experience implementing and managing production CI/CD pipelines using modern deployment mechanisms such as blue/green deployment
• 2 years of experience translating SLOs and SLIs into actionable improvements. Reliability, monitoring, and observability are not just words to you.
• Solid foundation in Linux systems administration, networking, and security.
Additional skills and experience that will be useful:
• Experience with security frameworks such as OWASP, ISO, CSA and PCI.
• Experience conducting threat assessments and creating remediation plans based on the results of threat assessments.
• Experience with penetration testing, threat modelling, open-source, and commercial security tools.
• Experience developing new deployment mechanisms for webapp infrastructure, such as: canary, A/B, blue/green, red-line and other deployment patterns
• Deep knowledge of performance tuning of core AWS services like Lambda, DynamoDB, APIGateway, SQS, EventBus, EC2
• Experience with chaos engineering that pushes systems and products to their limits to see how they will respond to unexpected events.
About the Role
The evertz.io Engineering Team builds next-generation systems for content management and distribution in the Media and Entertainment industry. Disney, NBCUniversal, Discovery, BBC, and many other content producers and publishers use our products and services to make the most of their file-based and live content for the least effort.
We work with high quality video in real-time and non-real-time scenarios across a wide range of cutting-edge tech. Specializations within the group span from low-level video manipulation and analysis, through back-end management and orchestration services, to web delivered UIs. Working in parallel with these teams is the Scientific Computing Group who work in computer vision, data science and machine learning, taking experiments in Jupyter notebooks through to deployment in production. This makes for a challenging and rewarding engineering experience of continual learning and plenty of opportunity to explore different parts of the stack.
Our technology stack includes a Serverless microservice architecture that capitalizes on the full breadth of AWS services, with code written in Python, Rust and Java. Our UI uses the latest versions of Angular, TypeScript and NgRx. Our CI/CD pipelines leverage AWS, Jenkins, Nexus, and Bazel, in addition to our in-house release-management application, to build and release hundreds of software components.
As a Site Reliability Engineer, you will join our talented and passionate team building evertz.io: a collection of services that will be used by the biggest names in the exciting broadcast and media industry. Our services are hosted in AWS, with a Serverless First mindset.
“Work is a thing you do, not a place you go”
We work in agile, low-bureaucracy, high-creativity, cross-functional teams spread across the world. It’s a highly creative work environment where we support your growth with opportunities for career progression, mentoring others and third-party education. The team is built on trust and is relaxed, open and welcoming to all, and there’s fun to be had with regular social events and sports teams.
Responsibilities
As part of this role, you will be expected to:
• Establish and measure reliability goals like Uptime, Downtime, Mean time between failures, Mean time to resolution, etc.
• Define operational maturity by defining and implementing SLIs and SLOs, enable faster detection and isolation of failures, and proactively work to mitigate them
• Participate in an on-call rotation.
• Participate in daily scrum standups, sprint planning, and other team rituals including retrospectives.
• Implement and maintain CI/CD pipelines on AWS using CodeCommit, CodePipeline and CodeDeploy
• Evaluate, implement, and use various monitoring, log-aggregation, and observability services, such as AWS CloudWatch and Honeycomb, to troubleshoot and resolve issues rapidly
• Conduct and document root cause analysis (RCA) and post-incident reviews.
• Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
Location
This role allows you to work with “Full Flexibility” - for any work where being physically close to fixed equipment is not a requirement, you have the option to work remotely.
Remote working is not the same as working from home, WFH is just one very common option. You can work from wherever gets the creative juices flowing: coffee shops, co-working places, the park, a different country even! Anywhere with Internet access.
Of course, working from an office is an option too especially if you’re craving some ad hoc in-person interaction! Evertz has offices in Canada, England, Scotland, India, Singapore, Hong Kong, Virginia, California, Arizona, Ohio, Hungary, Belgium, Poland and Australia. Many have great spaces for meet-ups as well as permanent or floating desk space.
Working Hours
This role allows you to work asynchronously meaning you can contribute at the times when you do your best work. Some people are early-birds, some are night-owls, maybe Saturday is better than Wednesday? Whilst some overlap for core meetings is needed, you don’t have to do your deep work between 9 and 5.
Salary & Benefits
We offer a competitive salary with annual performance-based bonus and stock option schemes. Benefits include a pension plan; an employer-funded health and medical plan; a life insurance plan; long-term disability coverage; paid time off; an employee assistance program; and a discount platform. The availability and specifics of these benefits vary by location, details of which will be provided during the hiring process.
See more jobs at Evertz Microsystems Limited
Our mission is to save engineers from building in-house data pipelines, by building one automated data pipeline that everyone can use. Every single company that uses SaaS tools to run their business will eventually need to analyze the data that sits in those tools. Fivetran unlocks this data with automated connectors that convert messy, chaotic APIs into normalized, standard schemas.
From Fivetran’s founding until now, our mission has remained the same: to make access to data as simple and reliable as electricity. With Fivetran, customer data arrives in their warehouses, canonical and ready to query, with no engineering or maintenance required. We’re proud that more organizations continue to leverage our technology every day to become truly data-driven.
Fivetran is looking for a high-performance, experienced engineer to be a part of a team of Site Reliability Engineers. You will be working closely with engineering teams, product managers, as well as support and sales engineers to build the future of the Fivetran Data Platform Reliability.
As a member of the Site Reliability Engineering team, you will take ownership over the overall performance and reliability of Fivetran’s infrastructure, the robustness of the deployment pipeline, as well as timely and effective incident response and resolution. You will take responsibility for the growth and stability of Fivetran’s infrastructure, and be a key player driving effective incident response and overall issue avoidance.
Preferred experience:
We’re honored to be valued at over $5.6 billion, but more importantly, we’re proud of our core values of Get Stuck In, Do the Right Thing, and One Team, One Dream. To learn more about Fivetran’s culture and what it’s like to be part of the team, click here and enjoy our video.
We've built a huge product with a small team by dividing our platform into simple, independent pieces and building our software in a disciplined, pragmatic way. We use Java, Google Cloud Platform, PostgreSQL, and React.
See more jobs at Fivetran
Site Reliability Engineer (SRE) (PeopleFluent) US, Remote
PeopleFluent is hiring! We have an exciting opportunity for a Site Reliability Engineer to join our Hosting team.
The ideal candidate will genuinely enjoy solving operational and development problems using the latest and greatest technologies / methodologies. We also need someone who knows how to play well with others (especially the super fun and interesting people we have on our team).
We expect you to have at least 3 years of professional experience in Systems Administration, Applications Development, Software Engineering, and/or Configuration Management. At least 1 year of professional experience (or more!) as an SRE is highly desired!
In addition to vacation benefits, you will be eligible upon your date of hire to participate in our comprehensive benefits program which includes medical, dental, and vision insurance; we also offer HSA and FSA plans as well as life insurance offerings. Additionally, you will be eligible to participate in our 401(k) plan.
PeopleFluent provides flexible cloud solutions that put learning at the heart of talent strategy. As a market leader in integrated talent management and learning solutions, PeopleFluent helps companies hire, develop, and advance a skilled and motivated workforce. Whether they're deployed separately or as a suite, our Recruiting, Onboarding, Performance, Succession, Compensation, and Learning solutions deliver a superior user experience that guides managers and employees with contextual learning – right in the flow of work!
PeopleFluent Learning is part of Learning Technologies Group plc (LTG).
For more information, please visit www.peoplefluent.com and/or www.ltgplc.com.
We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, disability, or any other federal, state, or local protected class.
See more jobs at LTG
Senior Site Reliability Engineer (APAC)
MariaDB is making a big impact on the world. Whether you’re checking your bank account, buying a coffee, shopping online, making a phone call, listening to music, taking out a loan or ordering takeout – MariaDB is the backbone of applications used every day. Companies small and large, including 75% of the Fortune 500, run MariaDB, touching the lives of billions of people. With massive reach through Linux distributions, enterprise deployments and public clouds, MariaDB is uniquely positioned as the leading database for modern application development.
The Opportunity
MariaDB is building a web-based management tool to help our customers easily configure and manage enterprise MariaDB configurations. This role will join an existing team to help build and accelerate the delivery of this product. This is a high impact role where you will have the opportunity to work on hundreds of clusters in a multi-cloud environment.
Responsibilities:
Requirements:
Nice To Have Experience:
Location: APAC (Remote)
What’s in It for You?
Impact the world of technology by pushing the boundaries of technology and business models, working at MariaDB. Be part of a game-changing organization that encourages outside-the-box thinking, values empowerment, and is truly shaping the future of the software industry. You’ll be collaborating with high-caliber colleagues around the world, offering unparalleled learning and growth opportunities. We provide a very competitive compensation package, 25 days paid annual leave (plus holidays), stock options, a massive degree of flexibility and freedom, and more.
How to Apply
If you are interested in this position, please submit your application along with your resume/CV.
MariaDB does not sponsor work visas or relocation.
MariaDB is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process. To request an accommodation due to a disability, please inform your recruiter.
MariaDB is an equal opportunities employer.
See more jobs at MariaDB Corporation Ab
Senior Site Reliability Engineer
At Convex (YC W19), we’re building the leading B2B full-stack software platform for the $400bn+ commercial services market. It's a 100-year-old industry impacting millions of people every day. We already work with some of the largest enterprise companies in the sector and were one of the fastest growing companies in the Winter 2019 YC batch. Our team is a unique mix of industry veterans from Carrier, Siemens, and Honeywell as well as founders from MIT, Harvard, and Georgia Tech. Based in San Francisco, our investors include Emergence Capital, 1984 Ventures, UP2398, Liquid2 (Joe Montana), Y Combinator, the founders of PlanGrid, and others.
At Convex, we build the leading B2B platform for the fast growing commercial and building services industry. Our software provides rich data on every commercial property in the US (~63M) and workflow software built on top of that. For our users who serve these properties, that data and workflow becomes their secret weapon; there's nothing else like it available in the market today. Our customers rely on Convex to identify, win, and manage new growth opportunities.
We are based in, and love, the seven square miles of San Francisco, but our customers (and employees) live and work in almost every state in America. They include some of the largest enterprises in the country, like Siemens and Carrier, and smaller businesses we care just as deeply about.
The Product
Our flagship product, Atlas, is a “consumer-grade enterprise product.” Think Apple experience with Oracle utility. Atlas supercharges our users’ work by providing them with information on virtually every commercial property in the country. There is literally no other data source like this available anywhere. All that data is interesting, but it isn’t powerful unless you have the ability to work with it, which is why we are building a full suite of specialized software tools on top of it.
Your Role
Requirements
Nice to Have
Benefits
It’s important to us that we provide our employees with meaningful benefits & resources that support them through every stage of their life & career with us, so we’ve built a robust wellness plan to do just that.
Generous employer contributions towards medical, dental, and vision insurance
Paid parental leave of up to 6 months with 100% pay
Flexible & generous time-off plans (including mental health days!)
Income protection through short-term and long-term disability plans
Tax-favored benefits such as retirement savings plans and flexible spending accounts
Commuter programs
Healthy lunch, drink, and snack options at our corporate office
Flexible hybrid & remote work options
About Convex
At Convex (YC W19), we’re building the leading B2B full-stack software platform for the $400bn+ commercial services market. It's a 100-year-old industry impacting millions of people every day. We already work with some of the largest enterprise companies in the sector and were one of the fastest growing companies in the Winter 2019 YC batch. Based in San Francisco, our investors include Fifth Wall, Emergence Capital, GGV, 1984 Ventures, UP2398, Liquid2 (Joe Montana), Y Combinator, the founders of PlanGrid, and others.
At Convex, we welcome diverse perspectives and people who think rigorously and aren't afraid to challenge assumptions. Join us!
Convex is an equal opportunity employer and values diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
If you need assistance or an accommodation due to a disability, please let your recruiter know.
Convex is looking for engineers to help our customers serve an under-the-radar but massive and ubiquitous industry: commercial building services. Our customers service the systems that provide the air we breathe and the water we drink, as well as the lighting, safety, and security systems that power daily life for billions of people.
We are based in, and love, the seven square miles of San Francisco, but our customers are in every corner of America. We found our foothold in small to medium sized businesses, and have quickly been pulled up-market to some of the largest enterprises in the country, like Siemens and Carrier. We have shipped an impressive amount of product with a lean team, but now we have to scale to meet the demands of our customer base which is growing in both size and sophistication.
That is why we need you.
See more jobs at Convex
Senior Site Reliability Engineer (Backend Platform team)
Help us build meaningful software in healthcare used by doctors, patients, and researchers worldwide.
Doxy.me is the simple, free, and secure telemedicine solution used by over 1 million healthcare providers worldwide. Our mission is to eliminate barriers to telemedicine like cost and accessibility, so we are constantly striving to make doxy.me more accessible to everyone, everywhere. With over 350,000 telemedicine calls made through our platform every day, there are millions of people relying on us to simplify their healthcare services.
We are seeking a Senior Site Reliability Engineer motivated by unique, interesting, meaningful challenges in the healthcare sector. We help doctors provide remote medical care, researchers collect structured data, the general public understand personal risks of disease, and much, much more.
See more jobs at doxy.me
About Us
UJET is the world’s first and only cloud contact center platform for smartphone-era CX. By modernizing digital and in-app experiences, UJET unifies the enterprise brand experience across sales, marketing, and support, eliminating the frustration of channel switching between voice, digital, and self-service for consumers. Offering unsurpassed resiliency and the flexibility to deploy across leading public cloud infrastructures, UJET powers the world’s largest elastic CCaaS tenant at up to 22,000 agents globally and is trusted by innovative, customer-centric enterprises like Instacart, Turo, Wag!, and Atom Tickets to intelligently orchestrate predictive, contextual, conversational customer experiences.
Opportunity
We are looking to add a Site Reliability Engineer to our growing engineering team! The SRE teams own UJET’s cloud based infrastructure and scaling. We work closely with the Security team and other Engineering teams. Our ideal candidate is an experienced SRE who has built and maintained cloud infrastructure at scale and has meticulous code style and quality.
Responsibilities
Requirements
UJET is an Equal Opportunity Employer
Research shows that while men apply to jobs when they meet an average of 60% of the criteria, women and other marginalized folks tend to only apply when they check every box. So if you think you have what it takes, but don't necessarily meet every single point on the job description, please still get in touch. We'd love to have a chat and see if you could be a great fit. (Thanks CultureAmp who came up with this statement - it’s too good and too important to not repeat)
Compliance Responsibilities
Security, data protection and compliance (SDPC) are paramount to the success of our partnerships. All roles at UJET require compliance with legal and regulatory requirements and acceptance of and adherence to all policies and standards within UJET. Personnel acknowledge that they are personally responsible for reporting any suspected violations or abuse and are required to complete SDPC training and fulfill role-specific SDPC responsibilities.
Why UJET?
In addition to our great team and disruptive technology, we offer our teammates a competitive compensation and benefits package, work/life balance, unlimited vacation, stock options, monthly game nights, and more!
See more jobs at UJET
Staff Site Reliability Engineer
RevenueCat is a simple API for developers to manage subscriptions. We provide all the infrastructure needed for app developers to build, analyze and grow their subscription business.
RevenueCat makes building, analyzing and growing mobile subscriptions easy. We launched as part of Y Combinator's summer 2018 batch and today are handling more than $1.2B of in-app purchases annually across thousands of apps.
We are a mission driven, remote-first company that is building the standard for mobile subscription infrastructure. Top apps like VSCO, Notion, and ClassDojo count on RevenueCat to power their subscriptions at scale.
Our 40 team members (and growing!) are located all over the world, from San Francisco to Madrid to Taipei. We're a close-knit, product-driven team, and we strive to live our core values: Customer Obsession, Always Be Shipping, Own It, and Balance.
We are looking for a Senior Site Reliability Engineer to help design, build and support reliable core systems and infrastructure. We drive cross-team collaboration to improve scalability and end-to-end reliability. Our SDK is shipped on over 10k apps, and our APIs receive more than 20 billion requests per month. Our stability affects the experience of millions of users.
We want to bring somebody on board who is passionate about reliability, scalability and understanding the limits of computers and people. This person should be excited about all the technical challenges we will face growing our API throughput to millions of requests per minute.
We have an API, a web dashboard, and a proliferation of mobile SDKs.
The API is Flask + PSQL, the web dashboard is a React app, and the mobile SDKs are written in whatever language the target platform is.
Our API has to deal with a massive number of requests, and there are going to be many interesting scaling problems for us in the future.
On the mobile SDK side, providing sane and native-feeling SDKs for many platforms is a great challenge. It is a great opportunity for a polyglot who cares about developer experience.
See more jobs at RevenueCat
Site Reliability Engineer - Remote
About Us:
VetCentric is focused on delivering outstanding services to the federal government. We have extensive experience in the fields of cyber security, supply chain & logistics management, strategy, business analytics, and IT services such as system design, continuous improvement, virtualization, and data center management. VetCentric is an SBA certified HUBZone company and VA CVE certified Service-Disabled Veteran Owned Small Business (SDVOSB). We operate in 15 states with offices in Washington DC and Northern Virginia.
Perks Working with Us:
Location(s): Anywhere, US. Candidates from HUBZones preferred.
Employment Eligibility: Eligible to work for any employer in the United States without requiring sponsorship. Sponsorship is not available currently.
As a Site Reliability Engineer (SRE) on our team, you will use your subject-matter monitoring expertise and skills to improve the reliability of the VA’s applications via enterprise monitoring capability tools. You will be responsible for figuring out why an application with enterprise monitoring efforts allowed a high-priority incident (HPI) or a critical-priority incident (CPI). You'll work with the Enterprise Command Center’s (ECC) Business Line Management (BLM) Teams, the ECC Event Management (EM) Team and the Enterprise Command Operations’ (ECO) Incident Management Team to detect, investigate, and diagnose monitoring problems and defects across enterprise-level applications and technology stacks. This position will be on a team dedicated to providing recommendations and instrumenting those approved recommendations in ECC’s monitoring tools to improve VA enterprise reliability and the quality of services provided to veterans. The ECC monitoring tools in focus are Splunk Enterprise/ITSL, AppDynamics, DynaTrace, SolarWinds, ScienceLogic and Aternity. You will be working with system and application owners to obtain existing design and functionality, leverage comprehension of workflow systems and application processes within multiple system environments, and work across technology and development teams to diagnose outages due to inadequate monitoring instrumentation designs and recommend changes to increase reliability.
You Have:
Nice If You Have:
See more jobs at VetCentric
Associate Site Reliability Engineer
At IFS you will work in a growing, global enterprise software company built upon committed and empowered colleagues who come to work knowing they are making a difference. We work every day with customers who continue to challenge their markets and competitors. As a challenger ourselves, we partner with our customers to guide them through their digital transformations and extract the most value out of our software solutions. We take pride in ensuring that our employees are able to achieve the company goals as well as develop their careers. We believe empowered autonomy, committed colleagues and being part of a winning team are the keys to our success and what makes us great! We are #ForTheChallengers and if that resonates with you, we would love to hear from you!
Associate Site Reliability Engineer (SRE – US ITAR)
United States: Remote job role
The IFS Associate Site Reliability Engineer exists within the global Cloud Operations organization. The role forms part of a team that reports to a Cloud Services manager, who is responsible for the operational and people management aspects of the team. The team provides 24x7x365 operations support to the IFS customer base who have subscribed to the IFS Cloud Services for ITAR (International Traffic in Arms Regulations). The role handles multiple aspects of incident, service request, problem, and change management, as well as working with multiple internal and external stakeholders related to Cloud Services for ITAR. At times, aiding other areas of the global Cloud Services team will also be necessary.
Although not a role with people management duties, the selected individual will typically have an area(s) of technical expertise that not all members on the team share. Mentoring, handling escalations, writing documentation, promoting best practices, and taking a primary role in shared team initiatives will be required. At times, working with other members of the larger Cloud Operations organization, Application Support, R&D, Consulting, other groups within IFS, as well as external vendors will also be required.
Work performed is subject to ITAR compliance. Strict adherence to established processes is critically important to executing job responsibilities for maintaining compliance.
Key Duties
Personal Abilities
Experience
Technical Skills
The successful candidate must have the following skills, and for each relevant skill the candidate should have either commercial experience or a suitable professional-grade qualification in one or more of the following areas:
In addition to having experience in one of the above areas, experience in the following areas of expertise is also desired:
The following are value-add skills, if available:
Qualifications
Mandatory
A formal qualification (Degree, HND, etc.) in Computer Science, Information Technology or similar.
Optional Value Add
Working Environment
The team provides support 24x7x365. Flexibility to work some holidays, nights, and weekends and to assist with escalations at short notice.
Note: This role profile serves to provide objective criteria for selecting a candidate who best fits the requirements. This document summarizes the main duties and responsibilities of the role and is not intended as an exhaustive list.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran. VEVRAA Federal Contractor, Equal Opportunity Employer
See more jobs at IFS
Site Reliability Engineer (full remote working)
Launched in 1998, this pioneering British-born brand has specialised in creating amazing experiences and unforgettable memories - from hotels, city breaks and holidays to theatre, entertainment and spa days. Experts in brightening up online travel, lastminute.com is among the worldwide leaders in the field, helping hundreds of thousands of customers every year find, and do, "whatever makes them pink".
lastminute.com is part of lm group, a publicly traded multinational group that is among the worldwide leaders in the online travel industry. Every month, across all its websites and mobile apps (in 17 languages and 40 countries), the Group reaches 60 million unique users who search for and book their travel and leisure experiences. More than 1,200 people enjoy working with us and contribute to providing our audience with a comprehensive and inspiring offering of travel-related products and services.
At the heart of our culture is a commitment to inclusion across race, gender, age, sexual orientation, religion, gender identity or expression and accessibility. We strongly believe in an equal opportunity space, which is welcoming and celebrates the uniqueness of everyone who works here. We value different lived experiences and respect viewpoints, as we know uniqueness drives innovation. We want to make sure our people reflect the communities across the world we help travel.
*Please note that this is a full remote working position/on-site*
*This vacancy is also eligible for the External Referral Programme: if you have a friend who you think may be interested in this position, you can suggest their profile.*
To support and participate in company-wide Continuous Deployment introductions and SRE projects, we are looking for a Site Reliability Engineer with certified experience as an SRE for our Technology department.
“Hope is not a strategy. Engineering solutions to design, build, and maintain efficient large-scale systems is a true strategy, and a good one.”
Key Responsibilities
Essential
Desirable
Abilities/qualities
By joining our company, you will have the chance to:
See more jobs at lastminute.com
Azure DevOps Site Reliability Engineer (Remote)
Vendavo is the leading provider of price management and optimization solutions for business-to-business companies worldwide. Vendavo solutions (On-premise, Mobile and SaaS) include comprehensive pricing analysis, optimization, price setting, and deal execution capabilities that help companies improve profits through the art of science and big data. Leading companies across chemicals, high-tech, industrial manufacturing, and distribution industries leverage Vendavo solutions to drive higher profits. We’re making a difference in business, and we’re looking for energetic, experienced, and talented professionals to grow our team. If you are someone who is driven to make a global impact and believes in a culture of mutual respect, then you need to join us here at Vendavo!
We collaborate with our customers like few others in our industry. That’s how we help global businesses achieve extraordinary outcomes in driving predictable, profitable outcomes and growth, by combining the best technology, processes, and – most importantly – people.
It doesn’t stop with unlocking opportunities for customers: We’re committed to creating growth, opportunity, diversity, and inclusion for our employees, too.
Our team is growing. You will too.
The Opportunity: We are seeking a DevOps Site Reliability Engineer to embed with our Cloud Services team. In this role, you will help maintain, develop, and scale the Vendavo Cloud platform to support our rapid growth and ambitious goals. Members of this team take a collaborative and customer-oriented approach. You will have the opportunity to offer new ideas and make valuable contributions to the team every day. If you love automating infrastructure as code, and enjoy the variety of systems administration, cloud services, and database administration, this role is for you!
Accommodations
Vendavo is an inclusive community, and we know that everyone has their own needs. If you have a disability or special need that requires accommodation during the interview process, please contact your recruiter with your request. Your message will be confidential, and we will be happy to assist you.
All your information will be kept confidential according to EEO guidelines.
See more jobs at Vendavo