Are you looking to join a team that is among the fastest-growing organizations at Amazon? Does wearing multiple hats and working in a fast-paced, challenging environment sound like a good fit? Then consider joining the Amazon Advertising Trust team. You will solve difficult non-deterministic workflows, work with highly scaled systems, and build technological frameworks and systems that keep raising the bar on quality. The Amazon Advertising Trust team is committed to enabling the fastest-growing advertising business in the world while upholding customers' and advertisers' trust.
We are seeking an experienced and visionary individual to lead our Machine Learning Operations (ML Ops) team. In this role, you will be responsible for overseeing the end-to-end operationalization of machine learning models, ensuring seamless integration into our production environment. You will lead a team of talented ML Ops engineers, collaborate closely with data scientists, and work in tandem with engineering and DevOps teams to create and maintain a robust ML Ops framework.
You will work alongside science teams as a first responder to operational issues (manual/auto cut) and investigate anomalies such as high flagging, automation deviation, and M2 benchmarking. Additionally, you will own or assist with other specialized tasks, including Model Lifecycle Management (deployment, monitoring, retraining, and deprecation), Infrastructure and Environment Management, and Data Management and Pipeline Maintenance.
Key job responsibilities
As an Engineering Manager - ML Ops, you will be responsible for:
Team Leadership and Development:
* Lead and mentor a team of ML Ops engineers, fostering a culture of collaboration, innovation, and continuous learning.
* Provide guidance and support in the execution of ML Ops projects, ensuring team members meet goals and deadlines.
Strategic Planning:
* Develop and execute the strategic vision for ML Ops, aligning it with the overall business objectives.
* Collaborate with stakeholders to understand business requirements and formulate ML Ops strategies to support them.
Operationalization of ML Models:
* Oversee the end-to-end operationalization of machine learning models, from development to deployment and monitoring.
* Drive the adoption of best practices for model deployment, scaling, and maintenance.
Process Optimization:
* Identify and implement process improvements to streamline the ML Ops workflow and enhance efficiency.
* Collaborate with cross-functional teams to integrate ML Ops processes seamlessly into existing software development and IT operations.
Infrastructure and Resource Management:
* Manage infrastructure resources for machine learning workloads, collaborating with engineering teams to optimize performance and cost.
* Ensure scalability and reliability of ML Ops systems to support business growth.
Quality Assurance and Compliance:
* Implement quality assurance measures for ML Ops processes, ensuring the reliability and accuracy of deployed models.
* Ensure compliance with security and regulatory requirements in ML Ops workflows.
Communication and Collaboration:
* Facilitate effective communication and collaboration between ML Ops, data science, product, and engineering teams.
* Provide regular updates to leadership on project status, challenges, and opportunities.
We are open to hiring candidates to work out of one of the following locations:
Bangalore, KA, IND
* Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
* 7+ years of ML operations, development or technical support experience.
* Experience in managing ML Ops teams and successfully deploying machine learning models in production.
* Strong leadership and people management skills, with the ability to inspire and motivate a team.
* Excellent interpersonal and communication skills for effective collaboration with cross-functional teams.
* Comprehensive understanding of machine learning concepts and hands-on experience with ML Ops tools and technologies.
* Familiarity with cloud platforms, containerization, and orchestration tools.
* Ability to think strategically and align ML Ops initiatives with overall business goals.
* Strong problem-solving skills and the ability to make data-driven decisions.
Hands-on experience with distributed and/or enterprise applications.
Knowledge of one high-level programming language (preferably Java or Python).
Proven track record of service improvement and optimization in production.
Demonstrates skill and passion for operational excellence.
Experience working with an international team and stakeholders.
Ability to handle multiple, competing priorities in a fast-paced environment.
Ability to navigate through ambiguity and deliver incrementally.