Software Engineer V – Big Data Cloud
Location: North of Pittsburgh, PA
Job Type: Full Time / Permanent
We are looking for an enthusiastic, experienced, team-oriented Software Engineer with a passion for data and for creating insightful analytics, so that we can transform healthcare in the US.
As a Performance Center team member, you will work with a team of engineers on our cloud data platform, which streams data from a variety of healthcare software and hardware systems in real time to create transformational recommendations for our customers. Our solutions help drive improved financial performance, compliance, and better patient outcomes. Each day you will make an impact.
- Develop custom batch-oriented and real-time streaming data pipelines working within the MapReduce ecosystem, migrating flows from ELT to ETL
- Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.
- Act in a technical leadership capacity: Mentor junior engineers and new team members, and apply technical expertise to challenging programming and design problems
- Resolve defects/bugs during QA testing, pre-production, production, and post-release patching
- Possess a quality mindset, squash bugs with a passion, and work hard to prevent them in the first place through unit testing, test-driven development, version control, and continuous integration and deployment
- Conduct design and code reviews
- Analyze and improve efficiency, scalability, and stability of various system resources
- Contribute to the design and architecture of the project
- Operate within an Agile development environment and apply its methodologies
Education & Experience:
- Bachelor’s degree in Engineering/IT/Computer Science
- 8+ years’ experience in software engineering
- 3+ years’ experience developing ETL processing flows using MapReduce technologies such as Spark and Hadoop
- 2+ years’ experience in software design with various messaging systems, such as Kafka or RabbitMQ
- 1+ years’ experience developing with ingestion and clustering frameworks such as Kafka, ZooKeeper, and YARN, and building streams using Spark Streaming
- Experience with integration of data from multiple data sources
- Experience leading projects or teams
- Proficient understanding of distributed computing principles
- Good knowledge of Big Data querying tools, such as Pig or Hive
- Good understanding of Lambda Architecture, along with its advantages and drawbacks
- Proficiency with MapReduce, HDFS
- Ability to solve any ongoing issues with operating the cluster
- Ability to lead change, be bold, innovate, and challenge the status quo
- Passionate about solving customer problems and developing solutions that result in a passionate customer/community following
- Master’s degree in Engineering/IT/Computer Science (preferred)
- 10+ years’ experience in software engineering (preferred)
- 1+ years’ experience with Databricks and Spark (preferred)
- 2+ years’ experience with the following (all preferred):
- NoSQL databases, such as HBase, Cassandra, MongoDB
- Big Data ML toolkits, such as Mahout, SparkML, or H2O
- Scala or Java as it relates to product development
- Management of Spark or Hadoop clusters, with all included services
- Experience with Service-Oriented Architecture (SOA)/microservices
- Demonstrable advanced knowledge of data architectures, data pipelines, real-time processing, streaming, networking, and security