Job Description:
• Role Overview
We are seeking a highly technical Data Scientist with 5+ years of experience who specializes in data engineering, robust pipeline implementation, and data lifecycle management. In this role, you will act as a high-level individual contributor responsible for designing scalable data architectures, optimizing complex ETL/ELT pipelines, and resolving deep-seated data issues within a highly regulated environment. While you possess a foundational understanding of data science and analytics, your primary focus will be building the stable, high-performance data infrastructure and automated management systems required to support downstream modeling, analytics, and business-critical operations. The ideal candidate is a master of Python, PySpark, SQL, and data orchestration tools.
• Key Responsibilities
1. Pipeline Implementation: Design, build, scale, and maintain automated ETL/ELT data pipelines to ingest and transform large, complex datasets.
2. Data Infrastructure Management: Manage and optimize core applications and data services supporting IBG, CBG, and enterprise data systems.
3. Data Issue Management: Diagnose, troubleshoot, and resolve deep-seated data issues, pipeline bottlenecks, and complex reconciliation gaps.
4. System Optimization: Continuously monitor, benchmark, and tune system performance to guarantee high availability, data integrity, and pipeline reliability.
5. Requirement Analysis: Analyze business logic requirements to develop data specifications, schemas, and optimal data models.
6. Downstream Enablement: Clean, structure, and interpret large data sets to create optimized feature stores and datasets that power downstream decision-making and reporting.
• Job Requirements
1. Education: Bachelor’s degree in Computer Science, Information Technology, or a related quantitative field.
2. Experience: Minimum 5 years of experience in a Data Science, Big Data Analyst, or Data Infrastructure role.
3. Core Technical Stack: Advanced expertise in Python, PySpark, SQL, and managing massive distributed datasets.
4. Data Management: Proven experience in data modeling, database schema design, and process mapping.
5. Analytical Skills: Strong problem-solving capabilities with a deep focus on data quality, reconciliation, and automated validation frameworks.
6. Tools (Preferred): Familiarity with orchestration tools (e.g., Airflow), version control (Git), and data visualization tools like Tableau.