Software Engineer IV – Big Data Cloud

Location: North of Pittsburgh, PA

Job Type: Full Time / Permanent


  • Develop custom batch-oriented and real-time streaming data pipelines working within the MapReduce ecosystem, migrating flows from ELT to ETL
  • Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.
  • Act in a technical leadership capacity: Mentor junior engineers and new team members, and apply technical expertise to challenging programming and design problems
  • Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
  • Possess a quality mindset, squash bugs with a passion, and work hard to prevent them in the first place through unit testing, test-driven development, version control, and continuous integration and deployment
  • Conduct design and code reviews
  • Analyze and improve efficiency, scalability, and stability of various system resources
  • Contribute to the design and architecture of the project
  • Operate within an Agile development environment and apply its methodologies

Education & Experience:

  • Bachelor’s degree in Engineering/IT/Computer Science
  • 6+ years’ experience in software engineering
  • 2+ years’ experience in software design and in developing ETL processing flows using MapReduce technologies such as Spark and Hadoop
  • 1+ years’ experience:
    • developing with ingestion and clustering frameworks such as Kafka, Zookeeper, YARN
    • building streaming applications using Spark Streaming
    • with various messaging systems, such as Kafka or RabbitMQ
  • Experience with integration of data from multiple data sources
  • Experience leading projects or teams
  • Proficient understanding of distributed computing principles
  • Good knowledge of Big Data querying tools, such as Pig or Hive
  • Good understanding of Lambda Architecture, along with its advantages and drawbacks
  • Proficiency with MapReduce, HDFS
  • Ability to solve any ongoing issues with operating the cluster
  • Ability to lead change, be bold, innovate, and challenge the status quo
  • Passion for solving customer problems and developing solutions that result in a passionate customer/community following
  • Preferred Qualifications:
    • Master’s degree in Engineering/IT/Computer Science
    • 8+ years’ experience in software engineering
    • 1+ years’ experience with:
      • Databricks and Spark
      • NoSQL databases, such as HBase, Cassandra, MongoDB
      • Big Data ML toolkits, such as Mahout, SparkML, or H2O
    • 2+ years’ experience with Scala or Java as it relates to product development
    • Management of Spark or Hadoop clusters, with all included services
    • Experience with Service-Oriented Architecture (SOA)/microservices
    • Demonstrable advanced knowledge of data architectures, data pipelines, real-time processing, streaming, networking, and security