Staff Engineer – Big Data

Location: Pittsburgh, PA

Job Type: Full Time / Permanent

Responsibilities: 

  • Define the technology roadmap in support of the product development roadmap
  • Lead the design, architecture, and development of multiple real-time streaming data pipelines spanning multiple product lines and edge devices
  • Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.
  • Provide technical leadership to agile teams, onshore and offshore: mentor junior engineers and new team members, and apply technical expertise to challenging programming and design problems
  • Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
  • Have a quality mindset, squash bugs with a passion, and work to prevent them in the first place through unit testing, test-driven development, version control, and continuous integration and deployment
  • Lead change, be bold, and innovate to challenge the status quo
  • Conduct design and code reviews
  • Analyze and improve efficiency, scalability, and stability of various system resources
  • Operate within Agile Development environment and apply the methodologies
  • Track technical debt and ensure unintentional technical debt is not created
  • Recommend improvements to the software delivery cycle to help remove waste and impediments for the team
  • Drive, promote, and measure team performance against sprint and project goals
  • Work with the team to continuously improve development practices and processes
  • Troubleshoot complex problems with existing or newly developed software
  • Mentor and coach Software Engineers

Education & Experience:

  • Bachelor’s Degree
  • 12+ years’ experience in software engineering with 2+ years using public cloud
  • 6+ years’ experience developing ETL processing flows using distributed processing frameworks such as Spark and Hadoop MapReduce
  • 4+ years’ experience developing with ingestion and clustering frameworks such as Kafka, ZooKeeper, and YARN
  • 4+ years’ experience building stream-processing systems using solutions such as Storm or Spark Streaming
  • 2+ years’ experience with Spark Structured Streaming
  • 4+ years’ experience with various messaging systems
  • 2+ years’ experience with Kafka
  • 1+ years of DevOps experience
  • 1+ years’ benchmarking experience
  • Experience with integration of data from multiple data sources and multiple data types
  • Expert knowledge of data architectures, data pipelines, real time processing, streaming, networking, and security
  • Proficient understanding of distributed computing principles
  • Advanced knowledge of Big Data querying tools, such as Pig or Hive
  • Expert understanding of Lambda Architecture, along with its advantages and drawbacks
  • Proficiency with MapReduce and HDFS
  • Preferred experience:
    • Master’s Degree in Engineering/IT/Computer Science
    • 1+ years’ experience with Databricks
    • 3+ years’ experience with:
      • NoSQL databases, such as HBase, Cassandra, or MongoDB
      • Big Data ML toolkits, such as Mahout, SparkML, or H2O
      • Scala or Java as it relates to product development
    • 3+ years’ DevOps experience with cloud technologies such as AWS, CloudFront, Kubernetes, VPC, RDS, etc.
    • Management of Spark or Hadoop clusters with all included services
    • Experience with Service-Oriented Architecture (SOA) / microservices
APPLY NOW