Mid-Level Data Engineer
Location: Pittsburgh
Job Type: Full Time / Permanent
As a Senior Data Engineer, you will play a key role in developing and maintaining our next-generation data platform in a fast-paced, agile environment. Partnering closely with our Product, Operations and Engineering teams, you will be responsible for delivering data-driven solutions to complex business problems and will serve as a trusted advisor to the technical teams.
Responsibilities:
• Engineer solutions to aggregate and automate large-scale data flows from varying sources
• Work as part of a cross-functional team (Product, Data Science and Data Integration) to build and maintain scalable data solutions
• Collaborate with product owners in an agile environment to analyze requirements and translate them into functioning software
• Lead/own the tactical implementation of new product/analytics once scope/strategy is defined
• Serve as the primary escalation contact for bugs, onboarding issues and ad hoc functionality needs
• Address key tech debt and implement enhancements and process improvements that increase the overall efficiency of the data team
• Manage escalations and drive the resolution of critical issues that require immediate attention and remediation
• Monitor and maintain the health of servers and databases for all inbound data and ongoing warehousing
• Serve as an internal source of education and support for Data and Integration Engineers
• Mentor and collaborate with other team members, and provide technical leadership to more junior team members
Qualifications:
• Bachelor's degree in Computer Science, Engineering or a related quantitative field, and/or equivalent experience
• Minimum of 7 years of experience as a Data Engineer or in a related role such as Database Administrator, Data Warehouse Developer, ETL Developer or Big Data Engineer, with a strong record of working with large data sets
• Real-world expertise in data modeling and database structure, ETL development and data warehousing, plus demonstrated mastery of many data warehouse and processing technologies such as:
  • Postgres
  • AWS: Redshift, RDS, Glue, Lambda, S3, EC2
  • ETL: Pentaho Kettle, Talend
  • Scripting: Python, Shell
  • Airflow and DAG development
• Experience using git/GitHub/GitLab for project version control
• Hands-on experience with cloud computing and Linux-based systems, specifically AWS Linux
• Demonstrated mastery of SQL with large data sets and a deep understanding of query performance tuning. Must be able to mentor others in the use of SQL with large data sets.
• Project management experience using Jira and/or Confluence
• Proven track record of clearly communicating data infrastructure, data models and data engineering solutions in writing, including the ability to communicate effectively with both business and technical teams
• Shows curiosity and an ability to learn quickly, especially regarding new technology and processes
• Approaches all work with a team-based/collaborative orientation
Bonus Points:
• Healthcare data experience is a huge plus
• Senior-level mastery of Python
• Any software development experience, preferably Python or JavaScript
• Experience using Sqitch for data schema loading