About the Role

This position will be part of the Sponsored Product Demand Utilization team within the Amazon Advertising organization. You will work hands-on with scientists from day one, with exposure to all aspects of the model lifecycle and the productionization of ML systems. This position is ideal for an engineer with prior MLOps or ML Infra experience who wants exposure to science, with impact and ownership through engineering.

You will design, code, troubleshoot, and support high-volume, low-latency distributed systems. The solutions you create will drive step increases in coverage of sponsored ads across the retail website and ensure relevant ads are served to Amazon's customers. You will directly impact our customers’ shopping experience while helping our sellers get the maximum ROI from advertising on Amazon. This role will provide exposure to cutting-edge innovations in product search, information retrieval, Large Language Models, and Generative AI.

About the Org

Amazon Advertising is one of Amazon's fastest growing and most profitable businesses, responsible for defining and delivering a collection of advertising products that drive discovery and sales. As a core product offering within our advertising portfolio, Sponsored Products (SP) helps merchants, retail vendors, and brand owners succeed via native advertising, which grows incremental sales of their products sold through Amazon. The SP team's primary goals are to help shoppers discover new products they love, be the most efficient way for advertisers to meet their business objectives, and build a sustainable business that continuously innovates on behalf of customers. Our products and solutions are strategically important to enable our Retail and Marketplace businesses to drive long-term growth.
We deliver billions of ad impressions and millions of clicks, and break fresh ground in product and technical innovations every day!

Our systems and algorithms operate on one of the world's largest product catalogs, matching shoppers with products under a high relevance bar and strict latency constraints. We are a team of machine learning scientists and software engineers working on complex solutions to understand customer intent and present shoppers with ads that are not only relevant to their actual shopping experience, but also non-obtrusive. This area is of strategic importance to the Amazon Retail and Marketplace businesses, driving long-term growth.

Real-time systems within our org operate within tight latency budgets of tens of milliseconds, and offline systems have high throughput requirements, so performance optimization has high impact.

Key job responsibilities

* Serve as a tech lead for defining innovative, cutting-edge ML infrastructure for both inference and training
* Build POCs and infrastructure for deploying and supporting models in production. Own A/B testing of experiments using this infrastructure
* Work closely with scientists across the org to understand requirements and impact opportunities
* Work closely with product managers to contribute to our mission, and proactively identify opportunities where cutting-edge ML Infra can help improve customer experience
* Stay on top of modern ML Infra and MLOps technologies to understand where they can provide the most value within the org
* Help attract and recruit technical talent, and mentor engineers and scientists in the team

An ideal candidate can navigate ambiguous requirements, works well with various partner teams, and has experience in generative AI, large language models (LLMs), information retrieval, and ads recommendation systems.
Using a combination of generative AI and online experimentation, our scientists develop insights and optimizations that enable the monetization of Amazon properties while enhancing the experience of hundreds of millions of Amazon shoppers worldwide. If you're fired up about being part of a dynamic, driven team, then this is your moment to join us on this exciting journey!

Impact and Career Growth:

This is a rare opportunity for an ML-focused SDE within Amazon to work directly and sprint with talented scientists while delivering impact across teams in the SP Demand Utilization Org. You will gain experience working with multiple stakeholders across the org. You will gain hands-on experience building and deploying production machine learning systems at scale, with opportunities for some of the largest revenue impact from an org within Amazon. This is a highly visible role that will have a direct impact on customers and revenue!

Why you will love this opportunity:

* Direct business impact: your work in Sponsored Products affects pixels on amazon.com and the Amazon app, with real, observable product impact.
* Vertical integration with science and engineering, with opportunities to touch all aspects of the model lifecycle, through training, optimization, and inference
* Exposure to bleeding-edge model optimization techniques for LLMs and other large models, with a clear path to revenue impact from optimization.
  This is a rare opportunity across the industry to gain experience and drive exploration in techniques like quantization, sparsity, knowledge distillation, and neural architecture search on large-scale models in an enterprise setting.
* Challenging technical space, where we care about fitting inference for the largest possible models within tight real-time constraints (tens of milliseconds) and where high performance of numerical systems is mission critical.
* Large-scale impact on revenue through delivery of bar-raising optimizations and solutions for teams across the org.
* Latitude to innovate on greenfield projects: this is effectively an R&D role, with the intention of innovating on existing legacy systems.

BASIC QUALIFICATIONS

* 3+ years of non-internship professional software development experience
* 3+ years of programming experience with at least one software programming language
* 3+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
* Experience as a mentor, tech lead, or leader of an engineering team
* Experience with common machine learning techniques such as pre-processing data, training, and evaluation
* Experience in building large-scale machine learning MLOps infrastructure for inference, eval, or other parts of the model lifecycle

PREFERRED QUALIFICATIONS

* 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
* Industry experience in software development.
* Experience with production machine learning systems
* Experience profiling and identifying performance bottlenecks on CPU, GPU, or other accelerators (e.g., using nsys, torchprof, VTune, gprof, etc.)
* Excellent distributed systems design experience (e.g., AWS or other cloud infra)
* Experience with ML libraries/frameworks such as PyTorch, JAX, TensorFlow, Keras, etc.
* Experience with MLOps tooling like MLflow, SageMaker, Kubeflow, DVC, etc.
* Experience with systems programming and low-level optimization in Rust, C++, C, or other similar languages.
* Coursework or thesis in machine learning, data mining, information retrieval, statistics, or natural language processing
* Advanced knowledge of performance, scalability, enterprise system architecture, and engineering best practices
...