Job Description
What will you do?
- To take us to the next level, we are looking to onboard a hands-on subject-matter expert (SME) in big data technologies to solve our most complex data problems. You will spend almost half of your time on hands-on coding.
- The role involves large-scale text data processing, event-driven data pipelines, in-memory computation, and performance optimization spanning CPU cores, network I/O, and disk I/O.
- You will use cloud-native services in AWS and GCP.
Who Are You?
- A solid grounding in computer engineering, Unix, data structures, and algorithms will enable you to meet this challenge.
- You have designed and built multiple big data modules and data pipelines to process large volumes of data.
- You are genuinely excited about technology and have worked on projects from scratch.
Must have:
- 7+ years of hands-on experience in software development, with a focus on big data and large data pipelines.
- At least 3 years of experience building services and pipelines in Python.
- Expertise with a variety of data processing systems, including streaming, event-driven, and batch (e.g., Spark, Hadoop/MapReduce).
- Understanding of at least one NoSQL store, such as MongoDB, Elasticsearch, or HBase.
- Understanding of data models, sharding, and data-placement strategies for distributed data stores in large-scale, high-throughput, high-availability environments, and of how they affect unstructured text processing.
- Experience running scalable, highly available systems on AWS or GCP.
Good to have:
- Experience with Docker/Kubernetes
- Exposure to CI/CD
- Knowledge of web crawling/scraping