- We are seeking a Senior Data Engineer (open to Principal level) to lead the modernization of a critical data pipeline, and ultimately take ownership of it, within a large-scale healthcare analytics environment. The role focuses on transitioning legacy SAS-based pipelines to modern Python/PySpark on Databricks while driving engineering best practices and scalable data solutions.
- This is a hands-on engineering role with a strong emphasis on development capabilities. Candidates with application development experience and exposure to AI/automation technologies will stand out.
Key Responsibilities
- Lead the modernization of data pipelines from SAS to Python/PySpark on Databricks
- Own and evolve a mission-critical HEDIS data pipeline used for performance measurement and reporting
- Design, build, and optimize scalable data pipelines in a distributed environment
- Collaborate with SMEs during an initial knowledge transfer period, with eventual full pipeline ownership
- Develop, schedule, and automate end-to-end data workflows
- Ensure data quality, reliability, and performance across large datasets
- Partner with cross-functional teams and analytics vendors to deliver high-quality data outputs
- Contribute to best practices in version control, CI/CD, and agile development workflows
Required Qualifications
- Strong development/engineering background (core requirement)
- Hands-on experience with Python (scripting and application development)
- Expertise in building and managing data pipelines and ETL workflows
- Experience processing large-scale datasets in distributed environments
- Proficiency with Databricks (notebooks, workflows, cluster management)
- Solid experience with AWS services including S3, Lambda, Glue, and EC2
- Strong SQL skills for complex transformations and data extraction
- Experience with pipeline orchestration and automation
- Familiarity with version control systems (Git) in a collaborative environment
- Experience managing work via issues, epics, and agile tooling
Preferred / Nice-to-Have
- Experience with AI, machine learning, or automation frameworks
- Exposure to healthcare data (e.g., HEDIS)
- Background in transitioning legacy systems to modern data platforms
What We Are Looking For (Priority Order)
- Strong development engineering capabilities (must-have)
- Application development experience, especially Python scripting
- Expertise in AI or automation (highly desirable bonus)
Lead Data Engineer
Perficient