Imagine your work helping millions of children unlock their learning potential.

HMH is a learning company. Over 53 million students and teachers use our learning platform every day, and that number is growing every year. You can learn more here.

With millions of users, our technology infrastructure must be robust, responsive, and highly scalable. That’s where your deep Site Reliability/DevOps expertise comes in. You’ll be supporting and scaling the infrastructure needed to help millions of little learners dream big.

At HMH, we take direct actions to attract, hire, and retain more diverse talent, nurture an inclusive workplace, and create opportunities for meaningful conversations about what it means to be antiracist. We believe that it is through learning that people find their voices, connect with others, and create a better world. We aim to increase the diversity of our employee base by growing our diverse talent pipeline, including partnerships with organizations like Resilient Coders, Girls Write Now, Hacker X, and Editors of Color. See here for our philosophy on diversity, equity, and inclusion.

Technical Infrastructure:
Here’s some of what we use: Microservices Architecture, Spring, Java and NodeJS, React, Koa, Express.js. You can read more on our Engineering Blog here.

More about your role:
This is a role with real impact. You’ll be constantly asking: what are the most important infrastructure problems we need to solve today to increase the reliability and performance of our applications and infrastructure? You will apply your deep technical knowledge, taking a broad look at our database technology infrastructure. You’ll help us identify common and systematic issues, validate them, and prioritize which to address first. We value collaboration, so you will partner with our SRE/DevOps team, discussing and refining your ideas and preparing proofs of concept.
You’ll work with engineering teams to design optimized schemas to ensure data consistency and reliability.
You’ll bring automation and stability to our database platforms and help us deliver robust, secure, consistent, and predictable database services while ensuring 100% availability of the database platform.
You will manage and execute complex data platform projects.
You will work closely with DB team members to automate database provisioning, including configuring database clusters.
You will implement highly available and scalable database instances across multiple data centers.

There are lots of interesting technology problems for you to solve, so you are constantly applying the latest thinking. These include implementing database self-service and database platform-as-a-service interfaces, integrating database access with Vault, automating database-as-code provisioning, implementing distributed database environments, etc.

You’ll help us plan for the future. You’ll get to evaluate existing technologies and design the future state, without being afraid to challenge the status quo. And you’ll regularly review existing infrastructure, looking for opportunities to improve (e.g. service improvement, cost reduction, security, performance).

You’ll also get to automate everything necessary, combining reliability with a pragmatic approach and doing it right the first time. We’re continuing our journey of making our code and configuration deployments self-serve for our development teams. You’ll help us build and maintain the right tooling, and you’ll have ownership to design and implement the infrastructure needed.

You’ll also be involved in the daily management of our AWS infrastructure. This means working with our Agile development teams to troubleshoot server, application, database, and performance issues.

Skills and Experience:
This role is for a data expert in cloud computing environments.
To thrive in this role, you have:
Significant hands-on SRE/DevOps experience in an Agile environment.
Substantial experience using AWS in a production environment and managing cloud infrastructure as code (we use Terraform).
A strong understanding of distributed computing environments.
Experience with multiple database technologies such as Oracle, PostgreSQL, MSSQL, MySQL, and NoSQL databases in AWS (any or all of EC2, S3, EBS, ELB, RDS, DynamoDB). You’ll know how to tune and scale them, and how performance and reliability are achieved.
Expertise in MySQL and PostgreSQL database administration.
The ability to write and understand SQL, including complex joins and aggregations.
Expertise in setting up replication, backups, and monitoring, and in database and SQL tuning.
Experience managing cloud infrastructure, including database platforms, as code.
Strong scripting experience in multiple languages: Bash, Python, PowerShell, SQL, etc.
Exceptional analytical skills in systems analysis and data manipulation, and the ability to create information from data.

You’ll also have significant experience with, and/or an interest in, the following:
You’re experienced working with Linux and Windows.
With security in mind, you have experience working with firewalls, network and application load balancing, and secret management.
You’re used to working with CI/CD tools.

Additional skills:
Experience or the ability to work as a member of a distributed team is important (as your team will be co-located).