Senior Data Engineer, Items and Offers Platform
Job ID: 2816052 | Amazon.com Services LLC
Amazon is a fast-paced, innovative company that is developing software that no one has attempted before. If you are a data engineer who is passionate about writing code and loves building large-scale data pipelines that are scalable, high-throughput, fault-tolerant, and always available, then get in touch with us.
The Item and Offers team is responsible for a variety of services that form a core part of the Amazon eCommerce platform. We are primarily responsible for developing the services that process all of the item information from millions of merchants who want to sell through the Amazon family of websites. Our expertise lies in managing billions of products in the catalog and developing large-scale distributed systems that process hundreds of millions of changes to the catalog every day in real time. The team offers a unique blend of hard computer science problems and an opportunity to help businesses model their new ideas.
As a Sr. Data Engineer on the CDW team, you will own complex big data pipelines and data solutions to provide highly available datasets. You will work with large data sets (in petabytes) and transformations involving multiple data sources to enable downstream analytics for our stakeholders. You will build and manage large datasets to help teams make data-driven decisions through analytical and business metrics dashboards.
The Data Engineer will play a crucial role in designing, developing, and maintaining efficient and scalable data pipelines, data models, and data warehousing solutions. This position will be responsible for ensuring data integrity, quality, and availability across the organization, enabling data-driven decision-making and supporting business analytics and insight initiatives.
Key job responsibilities
- Define and optimize data models for rapid analytics on catalog product data, improving freshness and LLM consumption while reducing costs and undifferentiated work.
- Automate metrics generation to support S-team goals, including pack hierarchy scaling and standard KPIs, while leading strategy for scaling self-serve analysis and dashboards.
- Mentor engineers, establish best practices in data engineering and operational excellence, and stay current with the latest technologies to recommend innovations.
- Conduct comprehensive data discovery, profiling, and performance analysis for various sources, designing effective models for Page0, entitlement, propensity, and other relevant data.
- Collaborate with stakeholders to translate requirements into optimized data structures, while establishing and enforcing data governance policies to maintain quality, consistency, and security.
- Resolve root causes of endemic problems, unblocking innovation for related teams, and build consensus with stakeholders to influence and determine the best path forward.
BASIC QUALIFICATIONS
- Bachelor's degree in computer science, engineering, analytics, mathematics, statistics, IT or equivalent
- 5+ years of data engineering experience
- Experience with data modeling, warehousing and building ETL pipelines
- Experience mentoring team members on best practices
- Experience in at least one modern scripting or programming language, such as Python, Java or Scala
- Experience in dimensional data modeling and schema design
- Experience with diverse data formats, including Parquet, JSON, other big data formats, and table formats such as Apache Iceberg
PREFERRED QUALIFICATIONS
- Experience with big data technologies such as Hadoop, Hive, Spark, and EMR
- Experience with BDT toolsets such as Cradle, DataCraft, Andes, and other products
- Experience managing data at scale (datasets of hundreds of terabytes)
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.