SRE Hadoop Consultant
Location: EST / PST or CST | Remote | Work from Home
Do you thrive on solving tough problems under pressure? Are you motivated by fast-paced environments with continuous learning opportunities? Do you enjoy collaborating with a team of peers who push you to constantly up your game?
At Pythian, we are building a Site Reliability Engineering team that is focused on Hadoop service operations and open source, cloud-enabled infrastructure architecture. We need motivated and talented individuals on our teams, and we want you!
You’ll act as a technology leader and advisor for our clients, as well as a mentor for other team members. Projects include Hadoop deployment, upgrades, disaster planning, system and ecosystem tuning, infrastructure architecture, performance analysis, deployment automation, and intelligent monitoring.
You will work with amazing clients from small, high-velocity startups to large enterprises with complex, hybrid infrastructures and large data processing requirements.
What will you be doing?
- Deploy, operate, maintain, secure and administer solutions that contribute to the operational efficiency, availability, performance and visibility of our customers’ infrastructure and Hadoop platform services, across multiple vendors (e.g. Cloudera, Hortonworks, MapR).
- Gather information and provide performance and root cause analytics and remediation planning for faults, errors, configuration warnings and bottlenecks within our customers’ infrastructure, applications and Hadoop ecosystems.
- Deliver well-constructed, explanatory technical documentation for the architectures that we develop, and plan service integration, deployment automation and configuration management to meet business requirements within the infrastructure and Hadoop ecosystem.
- Understand distributed Java container applications and their tuning, monitoring and management, such as logging configuration, garbage collection and heap size tuning, JMX metric collection and general parameter-based Java tuning.
- Observe and provide feedback on the current state of the client’s infrastructure, and identify opportunities to improve resiliency, reduce the occurrence of incidents and automate repetitive administrative and operational tasks.
- Contribute heavily to the development of deployment automation artifacts, such as images, recipes, playbooks, templates, configuration scripts and other open source tooling.
- Be conversant in cloud architecture, service integrations, and operational visibility on common cloud platforms (AWS, Azure, Google). An understanding of ecosystem deployment options and how to automate them via API calls is a huge asset.
What do we need from you?
While we understand you might not have everything on the list, to be successful you are likely to have skills such as:
- Understand the end-to-end operations of complex Hadoop-based ecosystems and handle / configure core technologies such as HDFS, MapReduce, YARN, HBase, ZooKeeper and Kafka.
- Understand the dependencies and interactions between these core components, alternative configurations (e.g. MRv2 vs Spark, scheduling in YARN), availability characteristics and service recovery scenarios.
- Identify workflow and job pipeline characteristics and tune the ecosystem to support high performance and scalability, from the infrastructure platform through to the application layers in the ecosystem.
- Understand and enable metric collection at all layers of a complex infrastructure, ensuring good visibility for engineering and troubleshooting tasks, and ensure end-to-end monitoring of critical ecosystem components and workflows.
- Understand the Hadoop toolset, how to manage and copy data between and within a Hadoop cluster, integrate with other ecosystems (for instance, cloud storage), configure replication and plan backups and resiliency strategies for data on the cluster.
- Comprehensive systems hardware and network troubleshooting experience in physical, virtual and cloud platform environments, including the operation and administration of virtual and cloud infrastructure provider frameworks. Experience with at least one virtualization and one cloud provider (for instance, VMware, AWS) is required.
- Experience with the design, development and deployment of at least one major configuration management framework (e.g. Puppet, Ansible, Chef) and one major infrastructure automation framework (e.g. Terraform, Spinnaker, CloudFormation). Knowledge of DevOps tools, processes, and culture (e.g. Git, continuous integration, test-driven development, Scrum).
- Ability to pick up new technologies and ecosystem components quickly, and establish their relevance, architecture and integration with existing systems.
What do you get in return?
- Competitive total rewards package
- Flexible work environment: Why commute? Work remotely from your home; there’s no daily travel requirement to the office!
- Outstanding people: Collaborate with the industry’s top minds.
- Substantial training allowance: Hone your skills or learn new ones; participate in professional development days, attend conferences, become certified, whatever you like!
- Amazing time off: Start with a minimum of 3 weeks of vacation, 7 sick days, and 2 professional development days!
- Office Allowance: Purchase a device of your choosing and personalise your work environment!
- Fun, fun, fun: Blog during work hours; take a day off and volunteer for your favorite charity.
Established in 1997, Pythian is a global IT data company based in Ottawa, Canada that specializes in designing, implementing, and managing systems that directly contribute to revenue and business success. We help companies adopt disruptive technologies such as advanced analytics, big data, cloud, databases, DevOps and infrastructure management to advance innovation and increase agility. We are focusing on product development leveraging new and exciting technology that will empower our customers through digital transformation. This is more than just a job: we hire people who love what they do!
As an SRE Hadoop Consultant working remotely, you will be part of our Customer Service Delivery team, which is entrusted to manage our global clients’ mission-critical systems and to deploy cutting-edge technology, from blockchain to serverless and cloud databases, covering all modern data and infrastructure. The team delivers a first-class, personalized level of service to our clients across the financial, educational, media and retail sectors, among many others.
Intrigued to see what a job is like at Pythian? Check us out @Pythian and #pythianlife.
Follow @PythianJobs on Twitter and @loveyourdata on Instagram!
Not the right job for you? Check out what other great jobs Pythian has around the world! Pythian Careers
- For this job, an equivalent combination of education and experience that results in a demonstrated ability to apply skills will also be considered.
- Pythian is an equal opportunity employer and welcomes applications from people with disabilities. Accommodations are available on request for candidates taking part in all aspects of the selection process.
- The successful applicant will need to fulfill the requirements necessary to obtain a background check.
- US applicants must be legally authorized to work in the United States of America permanently. Pythian will not sponsor, or file petitions of any kind on behalf of, a foreign worker to become a U.S. permanent resident based on a permanent job offer, or to otherwise obtain authorization to work in the U.S.