Design Verification Engineer

Amazon Web Services provides a highly reliable, scalable, low-cost infrastructure platform in the cloud that powers hundreds of thousands of businesses in 190 countries around the world. We have data center locations in the U.S., Europe, Singapore, and Japan, and customers across all industries. We are seeking experienced Hardware Design Engineers to build the next generation of our cloud server infrastructure. Our success depends on our world-class server infrastructure; we’re handling massive scale and rapid integration of emergent technologies.
As a member of the Cloud-Scale Machine Learning Acceleration team, you’ll be responsible for the design and optimization of hardware in our data centers. Some of your responsibilities will include verifying and validating that our hardware and software solutions achieve their desired functionality, developing and executing multi-faceted verification and validation plans, and measuring the team’s progress towards our ambitious customer metrics.
This is a fast-paced, intellectually challenging position, and you’ll work with thought leaders in multiple technology areas. You’ll have high standards for yourself and everyone you work with, and you’ll be constantly looking for ways to improve our products' performance, quality, and cost.
We’re changing an industry, and we want individuals who are ready for this challenge and want to reach beyond what is possible today.
About Us
Inclusive Team Culture
Here at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences.
Amazon’s culture of inclusion is reinforced within our 14 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust.
Work/Life Balance
Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.
Mentorship & Career Growth
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future.
BASIC QUALIFICATIONS
- BS degree or higher in EE, CS, or CE
- 3+ years of design verification experience using SystemVerilog and UVM
- 3+ years of experience in testbench development, including stimulus, checkers, assertions, and coverage ...

Senior System Mfg Engineer, Annapurna Labs

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio.
Annapurna Labs designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago, even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.
Annapurna Labs, part of AWS, is seeking highly experienced Hardware Test Engineers, System Test Engineers, Manufacturing Test Engineers, and System Validation Engineers to enable high-quality, efficient testing for the next generation of our cloud server platforms. Our success depends on our world-class infrastructure as we are handling massive scale and rapid integration of emergent technologies. As a member of the Machine Learning Acceleration team, you will be responsible for the enablement and improvement of our system-level manufacturing environment.
You will work on developing tests that ensure functionality and capability of our custom hardware used in the AWS server fleet. You will develop expertise in the top-to-bottom functionality of the entire system as well as the intended customer applications and stress the system from a customer perspective. You will work together with other engineering teams to develop, maintain, and improve manufacturing test code for new and existing products.
You’ll work with both high-level and low-level operating system constructs to create first-boot images for products in manufacturing. You will develop and maintain the deployment and distribution system to ensure that our manufacturing partners have access to appropriate versions of our software as soon as it’s available. You will respond to new issues raised by our manufacturing partners, analyze logs and failures, and then develop and deploy solutions to those issues. You will develop documentation as well as testing and debug procedures for our manufacturing partners to follow.
Key job responsibilities
- Enable and maintain mass volume production testing, working with our ODMs and JDMs to verify stable, high-quality execution
- Drive ODM and JDM deliveries to ensure production manufacturing quality
- Identify and develop tests needed to enhance coverage and increase failure granularity
- Debug test hardware and software used for system-level and server-level mass production
- Develop manufacturing tests to exercise hardware components and collect data for large-scale analysis
About the team
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop their engineering expertise so they feel empowered to take on more complex tasks in the future.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform.
We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- Bachelor's degree in Electrical Engineering or Computer Engineering
- 4+ years of experience developing embedded systems code and hardware interfaces (I2C, UART, SPI, JTAG, PCIe, etc.)
- Experience with Python, Bash, or another scripting language
- Experience analyzing yield and bin Pareto charts
- Experience working with system management components (BMC, BIOS, CPLD, etc.)
- Experience with debugging and root cause investigations using hardware schematics and tools such as logic analyzers
- Strong background working in UNIX environments ...

Manufacturing Engineer - SW Tools/Infrastructure, Annapurna Labs

AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.
Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago, even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.
You will work on developing at-scale software solutions to manage the manufacturing environments at board-level and server-level test. You will work together with other engineering teams to unify testing solutions between manufacturing and data center operations groups. You will develop and maintain the test deployment and distribution systems to ensure that our manufacturing partners have access to appropriate versions of our software as soon as it's available. You will respond to new issues raised by our manufacturing partners, analyze logs and failures, and then develop and deploy solutions to those issues.
You will develop documentation as well as testing and debug procedures for our manufacturing partners to follow.
Key job responsibilities
• Develop, validate, and deploy test infrastructure mechanisms into manufacturing environments
• Manage scaled fleets of custom test equipment and ensure their maintenance
• Lead and develop data gathering and parsing solutions to ensure manufacturing data is stored on AWS servers in a structured way
• Support internal lab infrastructure and manufacturing unification efforts
About the team
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop their engineering expertise so they feel empowered to take on more complex tasks in the future.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences.
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
• Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field
• Experience with Python, Bash, or another scripting language
• Experience developing network deployment infrastructure such as PXE
• Experience working with system management components (BMC, BIOS, CPLD, etc.)
• Strong background working in UNIX environments ...

Software Engineer II, Annapurna Labs

Annapurna Labs was a startup company acquired by AWS in 2015, and is now fully integrated. If AWS is an infrastructure company, then think of Annapurna Labs as the infrastructure provider of AWS. Our org covers multiple disciplines including silicon engineering, hardware design and verification, software, and operations. AWS Nitro, ENA, EFA, Graviton and F1 EC2 Instances, AWS Neuron, Inferentia and Trainium ML Accelerators, and scalable NVMe storage are some of the products we have delivered over the last few years.
AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. This position is for a Software Engineer who will lead the development of machine learning tools to run, optimize, and analyze machine learning workloads. The candidate must have experience leading machine learning tool projects, preferably starting from architecture through several generations of delivery to customers. Deep knowledge of profiling and optimization, resource management, scheduling, and code generation is needed. The ideal candidate will have worked on new instruction set architectures, which may include CPU, NPU, GPU, and other forms of compute.
Key job responsibilities
This engineer will lead the design and implementation of this new toolset, and will work with developers, system architects, hardware engineers, and users both within and external to Amazon to ensure compatibility of this new toolset with existing and next-generation AI accelerators.
A day in the life
As you design and code solutions to help our team drive efficiencies in software architecture, you’ll create metrics, implement automation and other improvements, and resolve the root cause of software defects.
You’ll also:
- Build high-impact solutions to deliver to our large customer base.
- Participate in design discussions and code reviews, and communicate with internal and external stakeholders.
- Work cross-functionally to help drive business decisions with your technical input.
- Work in a startup-like development environment, where you’re always working on the most important stuff.
About the team
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop their engineering expertise so they feel empowered to take on more complex tasks in the future.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Work/Life Balance
We value work-life harmony.
Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Hybrid Work
We value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our US Amazon offices. Our hybrid models allow you the freedom to work from home whenever in-office collaboration isn’t necessary.
BASIC QUALIFICATIONS
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language ...

SDM, ML Acceleration, Neuron Frameworks

Utility Computing (UC)
AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.
AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. As the Software Development Manager for the ML Applications - Framework team, you will be responsible for leading a strong team of engineers to help design and deploy ML applications and use cases on various frameworks such as PyTorch, JAX, and TensorFlow. You will be responsible for the full development life cycle of our integrations and extensions for inference and training support in PyTorch, XLA, TensorFlow, and JAX. You will develop reliability and scalability features and performance updates in the Neuron ML Frameworks, and contribute to other popular open frameworks to make Trainium and Inferentia devices first-class citizens for ML acceleration. You will lead the way to ensure support for key ML functionality in a combined chip/software platform, and ensure the right thing is being built and delivered to customers.
A successful candidate will have an established background in developing ML frameworks using PyTorch on XLA devices and corresponding framework technology components such as Torch-XLA and OpenXLA project integrations using PJRT or StableHLO, as well as familiarity with OpenXLA compilers.
The ideal candidate should have a strong technical ability to work and deliver on a vertically integrated system stack that consists of a combinatorial matrix of hardware, frameworks, and workflows. Deep expertise in framework integrations and development using C++ is a must, along with direct customer-facing experience and a strong motivation to achieve results.
A day in the life
You will work with the executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers, and chips to accelerate at the highest scale.
About the team
About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences.
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- 3+ years of engineering team management experience
- 7+ years of working directly within engineering teams experience
- 3+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience
- Experience partnering with product or program management teams ...

Software Dev Engineer - Compiler, Annapurna Labs

Are you excited about Machine Learning, chip acceleration, compilers, storage, systems, or EC2? Are you passionate about delivering high-quality services that affect hundreds of thousands of users? We are dubbed the "secret sauce" behind AWS's success. With development centers in the U.S. and Israel, Annapurna is at the forefront of innovation by combining cloud scale with the world’s most talented engineers.
The Annapurna team hires software and hardware engineers across multiple disciplines, including but not limited to compiler engineer, machine learning engineer, runtime engineer, performance engineer, ML chip accelerator, ASIC, physical design, and SDE in Test. Because of our teams’ breadth of talent, we’ve been able to improve AWS cloud infrastructure in networking and security with products such as AWS Nitro, Elastic Network Adapter (ENA), and Elastic Fabric Adapter (EFA), in compute with AWS Graviton and F1 EC2 Instances, in machine learning with AWS Neuron, Inferentia and Trainium ML Accelerators, and in storage with scalable NVMe.
Key job responsibilities
- Innovating and delivering creative software designs to develop new services, solve operational problems, drive improvements in developer velocity, or positively impact operational safety
- Writing requirements-capture documents, design documents, integration test plans, and deployment plans
- Communicating status and progress of deliverables to schedule, and sharing learnings and innovations with your team and stakeholders
BASIC QUALIFICATIONS
- Currently enrolled in, or completed, a Bachelor’s degree program or higher in Computer Science, Computer Engineering, Electrical Engineering, or a related field
- To qualify, applicants should have earned a Bachelor’s or Master’s degree between May 2023 and September 2025.
Possible start dates for this role are between January 2025 and October 2025.
- Programming experience in internship or coursework with a programming language such as Python, C, or C++
Candidates with strong interests and academic qualifications or research focus in two of the following:
- Knowledge of code generation, compute graph optimization, resource scheduling
- Data structures and algorithms
- Compiler: optimizing compilers (internals of LLVM, Clang, etc.)
- Machine learning: experience with XLA, TVM, MLIR, LLVM
- Deep learning models and algorithms
- TensorFlow, PyTorch, or MXNet frameworks ...

Software Development Manager, AWS Neuron Machine Learning Distributed Training, ML Accuracy

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1, Trn2, and Inf1 servers that use them. As the Software Development Manager for the Machine Learning Distributed Training team, you will be responsible for leading a strong team of engineers to help design and deploy the ML models. You will be responsible for setting up methodologies for accuracy measurement and baselining for the ML models we deliver. You will develop generic solutions for training with low precision and accuracy-related reliability and scalability features. You will be responsible for the full development life cycle of our integrations and extensions for inference and training support in PyTorch, XLA, and JAX, as well as distributed training libraries like FSDP, DDP, and others. You will lead the way to ensure support for key ML functionality in a combined chip/software platform, and ensure the right thing is being built and delivered to customers.
A successful candidate will have an established background in developing Machine Learning products with direct customer-facing experience, a strong technical ability, and a motivation to achieve results. Experience in Machine Learning and software development is also a must.
Key job responsibilities
Our engineers collaborate across diverse teams, projects, and environments to have a firsthand impact on our global customer base. You’ll bring a passion for innovation, data, search, analytics, and distributed systems. You’ll also:
- Solve challenging technical problems, often ones not solved before, at every layer of the stack.
- Design, implement, test, deploy, and maintain innovative software solutions to transform service performance, durability, cost, and security.
- Build high-quality, highly available, always-on products.
- Research implementations that deliver the best possible experiences for customers.
A day in the life
You will work with the executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers, and chips to accelerate at the highest scale.
About the team
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop their engineering expertise so they feel empowered to take on more complex tasks in the future.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences.
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- 3+ years of engineering team management experience
- 7+ years of working directly within engineering teams experience
- 3+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience
- 8+ years of leading the definition and development of multi-tier web services experience
- Experience partnering with product or program management teams
- Knowledge of engineering practices and patterns for the full software/hardware/networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
- 3+ years of deep learning/machine learning experience ...

Software Dev Engineer Intern - Machine Learning Chip Architect, Annapurna Labs

Amazon Web Services (AWS) internships are full-time (40 hours/week) for 12 consecutive weeks during summer. By applying to this position, your application will be considered for all locations we hire for in the United States.
We are on the lookout for the curious, those who think big and want to define the world of tomorrow. At Amazon, you will grow into the high-impact, visionary person you know you’re ready to be. Every day will be filled with exciting new challenges, developing new skills, and achieving personal growth. How often can you say that your work changes the world? At Amazon, you’ll say it often. Join us and define tomorrow.
Are you a student interested in computer architecture, machine learning, performance optimization, or application-specific silicon design? We are looking for engineers capable of using a variety of domain expertise to invent, design, evangelize, and implement state-of-the-art solutions for never-before-solved problems.
A successful candidate will be a self-starter who is comfortable with ambiguity, has strong attention to detail, and can work in a fast-paced, ever-changing environment.
Key job responsibilities
As a member of the ML chip architecture team, you will be responsible for accelerating large-scale machine learning workloads holistically across algorithms, software, and hardware, as part of our continuous effort to deliver a world-class customer experience. You will be the interface between SW and HW teams, bridging the gap between silicon capabilities and application requirements.
Finally, you will have a chance to drive performance improvements on existing AWS hardware platforms, as well as propose, evaluate, and develop hardware optimizations targeting future generations of our products. If this sounds exciting to you, come build the future with us! Internal job description: This requisition is for external candidates or campus employee referrals only, and is not eligible for internal transfers. Due to the volume of referrals and external applicants received, ECT team is unable to provide status updates on individual applicants. Please help us in setting expectations with our candidates and encourage them to reference their application portal for the most up to date information on their application. BASIC QUALIFICATIONS - Currently working towards a Bachelor’s degree, or higher, in Computer Science, Computer Engineering, Electrical Engineering, Machine Learning, or related fields, with an expected conferral date between December 2025 and September 2026. - Knowledge or past experience in computer architecture and silicon design. - Experience with C++, Rust, or other programming languages, as well as with Python, or similar scripting language. ...

System Manufacturing Engineer

Amazon Web Services provides a highly reliable, scalable, low-cost infrastructure platform in the cloud that powers hundreds of thousands of businesses in 190 countries around the world. AWS has the broadest and deepest set of machine learning and AI services for our customers’ businesses.Annapurna Labs part of AWS is seeking highly experienced Hardware Test Engineers, System Test Engineers, Manufacturing Test Engineers, and System Validation Engineers to enable high quality and efficient testing for the next generation of our cloud server platforms. Our success depends on our world-class infrastructure as we are handling massive scale and rapid integration of emergent technologies. As a member of the Machine Learning Acceleration team you will be responsible for the enablement and improvement of our system level manufacturing environment.You will work on developing tests that ensure functionality and capability of our custom hardware used in the AWS server fleet. You will develop expertise in the top-to-bottom functionality of the entire system as well as the intended customer applications and stress the system from a customer perspective. You will work together with other engineering teams to develop, maintain, and improve manufacturing test code for new and existing products. You’ll work with both high-level and low-level operating system constructs to create first-boot images for products in manufacturing. You will develop and maintain the deployment and distribution system to ensure that our manufacturing partners have access to appropriate versions of our software as soon as it’s available. You will respond to new issues raised by our manufacturing partners, analyze logs and failures, and then develop and deploy solutions to those issues. You will develop documentation as well as testing and debug procedures for our manufacturing partners to follow. 
Key job responsibilities - Enable and maintain mass volume production testing, working with our ODMs and JDMs to verify stable high-quality execution - Drive ODM and JDM deliveries to ensure production manufacturing quality - Identify and develop tests needed to enhance coverage and increase failure granularity - Debug test hardware and software used for system level and server level mass production - Develop manufacturing tests to exercise hardware components and collect data for large scale analysis. We are open to hiring candidates to work out of one of the following locations: Austin, TX, USA. BASIC QUALIFICATIONS - Bachelor's degree in Electrical Engineering or Computer Engineering - 4+ years of experience developing embedded systems code and hardware interfaces (I2C, UART, SPI, JTAG, PCIe, etc.) - Experience with Python, BASH or other scripting language - Experience analyzing yield and bin pareto - Experience working with system management components (BMC, BIOS, CPLD, etc.) - Experience with debugging and root cause investigations using hardware schematics and tools such as logic analyzers - Strong background working in UNIX environments ...

Software Dev Engineer - Embedded, Runtime, Storage, System & Performance , Annapurna Labs

Are you excited about Machine Learning, chip acceleration, compilers, storage, systems or EC2? Are you passionate about delivering high quality services that affect hundreds of thousands of users? We are dubbed the "secret sauce" behind AWS's success. With development centers in the U.S. and Israel, Annapurna is at the forefront of innovation by combining cloud scale with the world’s most talented engineers. We hire software and hardware engineers across multiple disciplines, including but not limited to compiler, machine learning, runtime, and performance engineering, as well as ML chip accelerator, ASIC, and physical design. Because of our teams’ breadth of talent, we’ve been able to improve AWS cloud infrastructure in networking and security with products such as AWS Nitro, Enhanced Network Adapter (ENA), and Elastic Fabric Adapter (EFA), in compute with AWS Graviton and F1 EC2 Instances, in machine learning with AWS Neuron, Inferentia and Trainium ML Accelerators, and in storage with scalable NVMe. Key job responsibilities - Innovating and delivering creative SW designs to develop new services, solve operational problems, drive improvements in developer velocity, or positively impact operational safety - Writing requirements-capturing documents, design documents, integration test plans, and deployment plans - Communicating status and progress of deliverables to schedule, and sharing learnings/innovations with your team and stakeholders. BASIC QUALIFICATIONS - Currently enrolled in, or completed a Bachelor’s degree program or higher in Computer Science, Computer Engineering, Electrical Engineering or related field - To qualify, applicants should have earned a Bachelor’s or Master’s degree between May 2023 and September 2025. 
Possible start dates for this role are between January 2025 and October 2025. - Programming experience in internship or coursework with a programming language such as Python and/or C or C++. Candidates with strong interests and academic qualifications/research focus in two of the following: • Distributed systems, algorithms (MPI, NCCL, or similar) • Operating System - Linux system programming/services • Computer architecture • System Development • Complexity analysis ...

2025 ASIC RTL Engineer Intern, Annapurna Labs

Amazon Web Services (AWS) internships are full-time (40 hours/week) for 12 consecutive weeks during summer. By applying to this position, your application will be considered for all locations we hire for in the United States.In Annapurna Labs we are at the forefront of hardware co-design not just in Amazon Web Services (AWS) but across the industry. The work we do is cutting-edge and internet-scale while also being deeply important to our customers. We design and build every component of our hardware and software to come together into products that our customers use for accelerated computing through Machine Learning acceleration and FPGA acceleration. If you are interested in "building a complete product" from inception to delighted customers, Annapurna is a fantastic choice.If this sounds exciting to you - come build the future with us!Responsibilities: • Participate in logic design activities as part of Amazon's machine learning custom silicon solutions• Work with physical design teams to achieve performance and area requirements.• Develop a deep understanding of the end customer requirements, including software applications, use models, system architecture and the SoC architecture/micro-architecture of our solutions• Develop and execute design automation mechanisms and flowsMentorship & Career GrowthOur team is dedicated to supporting new team members in an environment that celebrates knowledge sharing and mentorship. Projects and tasks are assigned in a way that leverages your strengths and helps you further develop your skillset.Inclusive Team CultureHere at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. 
We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences.Work/Life HarmonyOur team puts a high value on work-life harmony. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility and encourage you to find your own balance between your work and personal lives.BASIC QUALIFICATIONS- Enrolled in a Bachelors’ degree program or higher in Electrical Engineering, Computer Engineering, or a related field with a graduation conferral date between December 2025 and September 2026- Programming experience in C/C++- Programming experience in System Verilog or Verilog ...

System Development Engineer, Annapurna Labs, Machine Learning Accelerator Systems - Fleet Triage

Annapurna Labs was a startup company acquired by AWS in 2015, and is now fully integrated. If AWS is an infrastructure company, then think of Annapurna Labs as the infrastructure provider of AWS. Our org covers multiple disciplines including silicon engineering, hardware design and verification, software, and operations. AWS Nitro, ENA, EFA, Graviton and F1 EC2 Instances, AWS Neuron, Inferentia and Trainium ML Accelerators, and scalable NVMe storage are some of the products we have delivered over the last few years. In Annapurna Labs we are at the forefront of hardware/software co-design not just in Amazon Web Services (AWS) but across the industry. The Machine Learning System Operations and Automation Team is looking for candidates interested in writing automation software for our "fleet" of Machine Learning servers deployed around the world. Do you like solving mysteries? Great! Figuring out what that light switch with no obvious function actually does? Me too! Are you wearing a smartwatch to monitor your sleep and activity over time to optimize your routines? You'll fit right in. Does the word exabyte excite you? Let's get to work. Our team writes truly massive scale autonomous software to monitor, optimize, and remediate hardware in the most advanced servers in the world. 
Come join us!Key job responsibilities- Member of a team responsible for system remediation, operational excellence, and customer experience on bleeding edge ML products- Utilize data to root cause hardware failures and identify live trends on the most complex systems in AWS- Implement and improve system level testing across the product lifecycle- Develop software which can be maintained, improved upon, documented, tested, and reused- Dive deep on issues at the intersection of hardware and softwareA day in the lifeAs you design and code solutions to help our team drive efficiencies in software architecture, you’ll create metrics, implement automation and other improvements, and resolve the root cause of software defects. You’ll also:Build high-impact solutions to deliver to our large customer base.Participate in design discussions, code review, and communicate with internal and external stakeholders.Work cross-functionally to help drive business decisions with your technical input.Work in a startup-like development environment, where you’re always working on the most important stuff.About the teamThe MLA Systems Fleet Triage team is responsible for identifying and responding to the most challenging hardware and software failures from ML optimized servers at scale. We work in tandem with hardware design, firmware, and validation teams to improve test coverage and detection in production environments. Why AWS?Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. 
We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Hybrid WorkWe value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords engineers options to work in the office every day or in a flexible, hybrid work model near one of our US Amazon offices. 
Our hybrid models allow you the freedom to work from home whenever in-office collaboration isn’t necessary.BASIC QUALIFICATIONS- 2+ years of non-internship professional software development experience- 1+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience- 3+ years of administrative experience in networking, storage systems, operating systems and hands-on systems engineering experience- Knowledge of systems engineering fundamentals (networking, storage, operating systems)- Experience programming with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, Ruby ...

Sr. Software Development Manager, AWS Neuron Machine Learning Distributed Training, Core Technologies and Infra (CoreTex)

Utility Computing (UC): AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. AWS Neuron is the complete software stack for the AWS Inferentia and Trainium (Neuron) cloud-scale machine learning accelerators. As a Sr. SDM of Software Development for the Machine Learning Distributed Training, Core Technologies and Infra org, you will be responsible for leading strong teams of software engineers and managers to help design and deploy software that enables ML workloads to work seamlessly on these new products. A successful candidate will have an established background in developing Machine Learning products with direct customer-facing experience, a strong technical ability and a motivation to achieve results. This leader will manage the Core Technologies and Infra org, directly managing several teams/managers focused on developing training libraries for PyTorch and JAX, tooling for large-scale training debug, development productivity and benchmarking, kernel development, and large-scale training stability. The leader ensures support for key ML functionality in a combined chip / software platform and that the right thing is being built and delivered to customers. 
Experience in Machine Learning and software development is a must. Key job responsibilities: Responsible for the full development life cycle of our integrations and extensions for training support in PyTorch, XLA, JAX as well as distributed training libraries like FSDP and others. In charge of characterization, enablement and development of existing and future massive-scale ML models like Claude 3, GPT4 as well as ViT, Llava, Stable Diffusion 3 and more. Lead the way to ensure support for key ML functionality in a combined chip / software platform. Ensure the right thing is being built and delivered to customers. A day in the life: You will work with the executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers and chips to accelerate at the highest scale. About AWS: Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating, which is why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Diverse Experiences: AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Work/Life Balance: We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team Culture: Here at AWS, it’s in our nature to learn and be curious. 
Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. BASIC QUALIFICATIONS- 10+ years of engineering experience- 5+ years of engineering team management experience- 10+ years of planning, designing, developing and delivering consumer software experience- Experience partnering with product and program management teams- Experience managing multiple concurrent programs, projects and development teams in an Agile environment ...
