Data Engineer, Prime Video Personalization and Discovery Data Platform

Amazon | New York, New York, United States | December 19, 2024

Posted:	19 Dec 2024
Company:	Amazon
Category:	Data Engineering
Country:	US - United States
City:	New York

Prime Video offers customers a vast collection of movies, series, and sports—all available to watch on hundreds of compatible devices. U.S. Prime members can also subscribe to 100+ channels including Max, discovery+, Paramount+ with SHOWTIME, BET+, MGM+, ViX+, PBS KIDS, NBA League Pass, MLB.TV, and STARZ with no extra apps to download, and no cable required.

The Prime Video Personalization and Discovery team is building a new Data Platform. The platform will streamline how ML and engineering teams access and leverage high-quality, consistent customer-level data, both for offline processing and in real time, while adhering to committed SLAs. We will act as guardians and gatekeepers of data, shielding our customers from data issues while collaborating with data sources to address problems at their root. We will own and maintain "golden" datasets: centralized, reliable data assets commonly used across teams to train ML models and deliver a top-notch personalized storefront experience that is unique for every PV end customer. We will also own the Data Registry, ensuring that datasets are enriched with comprehensive classification and metadata so that they are easily discoverable and accessible, and so that ML and engineering teams can focus on deriving value for end customers rather than resolving data inconsistencies. Our goal is to reduce the time science and engineering teams spend working with customer-level data, avoid duplicated boilerplate data transformations and quality inspections, save IMR costs, and increase experimentation velocity.

Key job responsibilities
- Design and build big data pipelines with Spark/Flink that can handle petabytes of data per month
- Build resilient data pipelines with extensive unit and integration tests and a CI/CD development lifecycle
- Participate in an on-call rotation supporting a large number of production data pipelines
- Manage and orchestrate version migrations across the metadata, transform, and storage layers.
- Oversee and continually improve production operations, including optimizing data delivery, re-designing infrastructure for greater scalability, code deployments, bug fixes and overall release management and coordination.
- Work closely with Product teams, Data Scientists, Software Developers, and Business Intelligence Engineers to explore new data sources and deliver the data.
- Read, write, and debug data processing and orchestration code written in Python, Scala, etc., following coding best practices (e.g. version control and code review)

About the team
Our vision is to build a resilient, centralized data platform that distributes and streamlines access to high-quality, easily discoverable customer-level data for offline processing and in real time, enabling PVPD science and engineering teams to focus on ML and engineering innovations that enhance the personalization of PV end-customer experiences.

Basic qualifications
- 3+ years of data engineering experience
- Experience with data modeling, warehousing and building ETL pipelines
- Experience with SQL

Preferred qualifications
- Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
- Experience building MPP data transforms that process multiple TB per day

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $118,900/year in our lowest geographic market up to $205,600/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit This position will remain posted until filled. Applicants should apply via our internal or external career site.