Data Engineer, Content Analysis & Strategy (CAST)

Amazon | Brooklyn, New York, United States | December 19, 2024

Posted:	19 Dec 2024
Company:	Amazon
Category:	Data Engineering
Country:	US - United States
State:	None - None
City:	Brooklyn
Zip code:	None

Amazon Music is an immersive audio entertainment service that deepens connections between fans, artists, and creators. From personalized music playlists to exclusive podcasts, concert livestreams to artist merch, Amazon Music is innovating at some of the most exciting intersections of music and culture. We offer experiences that serve all listeners with our different tiers of service: Prime members get access to all the music in shuffle mode, and top ad-free podcasts, included with their membership; customers can upgrade to Amazon Music Unlimited for unlimited, on-demand access to 100 million songs, including millions in HD, Ultra HD, and spatial audio; and anyone can listen for free by downloading the Amazon Music app or via Alexa-enabled devices. Join us for the opportunity to influence how Amazon Music engages fans, artists, and creators on a global scale. Learn more at Music is awash in data! To help make sense of it all, the CAST team within DISCO (Data, Insights, Science & Optimization) team accelerates and facilitates content analytics and provides independence to generate valuable insights in a fast, agile, and accurate way. This domain provides analytical support for the below topics within Amazon Music: Programming / Label Relations / PR / Stations / Livesports / Originals / Case & CAM.

The CAST team enables repeatable, easy, in depth analysis of music customer behaviors. Our goal is to empower all teams at Amazon Music to make data driven decisions and effectively measure their results by providing high quality, high availability data, and democratized data access through self-service tools.

If you love the challenges that come with big data and devops, then this role is for you. We collect billions of events a day, manage petabyte scale data on Redshift and S3, and develop data pipelines using Spark/Scala EMR, SQL based ETL, Airflow and Java services.

We are looking for a talented, curious, enthusiastic, and detail-oriented Data Engineer, who knows how to take on big data challenges in an agile way. Duties include big data design and analysis, data modeling, and development, deployment, and operations of big data pipelines.

The CAST team develops data specifically for a set of key business domains like personalization, playlist programming, merch, artists, and provides and protects a robust self-service core data experience for all internal customers. We deal in AWS technologies like Redshift, S3, EMR, EC2, DynamoDB, and Lambda.

Key job responsibilities
* Writing scripts, building microservices in AWS to increase team efficiency
* Building and managing ETL pipelines in AWS to ingest data from external vendors
* Assist the BIs on the team in managing our existing environment that consists of Redshift and SQL based pipelines. The activities around these systems will largely be well-defined via standard operation procedures (SOP) and typically involve approving data access requests, subscribing or adding new data to the environment, but there will be cases where creative problem-solving is required
* SQL data pipeline management (creating or updating existing pipelines) and maintenance tasks on the Redshift cluster
* Assist the team with the management of our next-generation AWS infrastructure. Tasks includes infrastructure monitoring via CloudWatch alarms, infrastructure maintenance through code changes or enhancements, and troubleshooting/root cause analysis infrastructure issues that arise, and in some cases this resource may also be asked to submit code changes based on infrastructure issues that arise.

- 1+ years of data engineering experience
- Experience as a data engineer or related specialty (e.g., software engineer, business intelligence engineer, data scientist) with a track record of manipulating, processing, and extracting value from large datasets
- Experience writing and optimizing SQL queries with large-scale, complex datasets
- Experience with one or more scripting language (e.g., Python, KornShell, Scala)

- Experience with big data technologies such as: Hadoop, Hive, Spark, EMR
- Experience with any ETL tool like, Informatica, ODI, SSIS, BODI, Datastage, etc.

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $91,200/year in our lowest geographic market up to $185,000/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit This position will remain posted until filled. Applicants should apply via our internal or external career site.