As a Sr. Data Engineer, you will work closely with cross-functional teams to ensure data availability, reliability, and accessibility, enabling the organization to extract meaningful insights from diverse data sources. Your responsibilities will include designing, developing, implementing, testing, and operating large-scale, high-volume, and high-performance data structures for analytics. This will entail a high degree of collaboration with our network engineering teams. You should possess expertise in architecting data warehouse solutions for the enterprise across multiple platforms and tools (RDBMS, Columnar, Cloud, Redshift, EMR, Athena, Aurora, DynamoDB, Kinesis, Glue, Lambda, S3, EC2, etc.). Additionally, you should have extensive experience in designing, creating, managing, and utilizing extremely large datasets. You will analyze source data systems and drive best practices within the source teams. You will be expected to participate in the entire development life cycle, from design and implementation to testing, documentation, delivery, support, and maintenance. You will produce comprehensive and usable dataset documentation and metadata and evaluate dataset implementations proposed by peer data engineers. Strong business and communication skills are essential for collaborating with business owners to develop key business questions and build data sets that provide answers, drive change, and promote a deep understanding of the data.
Key job responsibilities
• Lead the enterprise data lake strategy for OTIE including designing, building, and optimizing the data lake.
• Develop and maintain data pipelines to ingest data from various sources into the data lake. For example, implement ETL processes to cleanse, transform, and enrich data for analytical purposes.
• Lead development efforts focused on scalability, quality, and performance.
• Implement data quality controls, metadata management, and data lineage tracking to ensure data integrity and compliance. Enforce data access controls and security measures to protect sensitive information.
• Implement data governance and security measures to ensure compliance with regulatory requirements and protect sensitive information.
• Monitor data pipelines and systems to identify and resolve issues promptly, ensuring high availability and reliability of data infrastructure.
• Stay up-to-date with emerging technologies, tools, and industry trends in data engineering and contribute to continuous improvement of data engineering practices within the organization.
BASIC QUALIFICATIONS
- 5+ years of data engineering experience
- Experience with data modeling, warehousing and building ETL pipelines
- Experience with SQL
- Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
- Experience mentoring team members on best practices
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- Excellent problem-solving and analytical skills, with the ability to work on complex data challenges
- Strong communication and collaboration skills, with the ability to work effectively in a team environment
- Hands-on expertise in building and optimizing ETL pipelines and data workflows
- In-depth understanding of data modeling, database design, and performance tuning
- Familiarity with cloud-based data technologies (e.g., AWS, Azure, or Google Cloud Platform)
PREFERRED QUALIFICATIONS
- Experience with big data technologies such as: Hadoop, Hive, Spark, EMR
- Experience operating large data warehouses
- Familiarity with data visualization tools and techniques (e.g., Tableau, Power BI, or D3.js)
- Background in machine learning and statistical analysis
- AWS certification (e.g., AWS Certified Big Data - Specialty)
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.