Work on state-of-the-art runtime systems hosting cutting-edge Large Language Models (LLMs). We work in a fast-paced, dynamic environment to rapidly experiment and deliver scaled runtime solutions based on the latest experiments and research in the LLM space.

Key job responsibilities
- Design, develop, test and deploy inference solutions for high-end LLMs
- Explore emerging inference optimization techniques
- Collaborate with cross-functional teams of engineers and scientists to identify and solve complex problems
- Mentor and guide junior engineers, and contribute to the overall growth and development of the team

Basic qualifications
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience programming in at least one programming language

Preferred qualifications
- 3+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.