Programming for Apache Spark 2.1

This 3-day training course teaches you how to harness Apache Spark 2.1 for large-scale data analysis, big data applications, and data processing pipelines. You will learn to program Spark efficiently and effectively by targeting the latest version of the platform (Spark 2.1) and learning the modern approach needed to take full advantage of it.

The entire course is taught hands-on, using real code and interactive examples. In addition, longer labs let attendees work together, applying their growing Spark knowledge to common challenges faced by organizations running complex Big Data applications in production.

Focus on Open Source Technology, Not Commercial Products

While we’re enthusiastic about many of the products in the Big Data ecosystem, the focus of this training course is to make you as proficient and effective as possible with open source Apache Spark, enabling you to apply the fundamental skills gained to whichever products and tools work best for you.

Learn to Program for Apache Spark 2.1 – and the Future of the Spark Platform

Because it targets the latest version of the platform, Apache Spark 2.1, this course teaches you how to optimize your Spark code to take full advantage of the internal changes that make Spark 2.1 faster and more effective. It also prepares you for future releases by teaching the modern approach to Spark programming that they will require.

Duration: 3 Days


  • Program Apache Spark in the most performant, easy, modern, and effective ways possible to perform ETL, analytics, machine learning, and streaming operations.
  • Understand how Spark should – and shouldn't! – be used within your Big Data application architectures.
  • Learn how Apache Spark processes your jobs so that you can troubleshoot, analyze, and improve performance if they don't run well.
  • See important patterns, tricks, tips, and gotchas so that you don't have to learn them the hard way.


Data analysts, engineers, and scientists who want to conduct analytics with Big Data or build end-to-end applications and data processing pipelines. Attendees should have some knowledge of SQL and some programming background in Python, Java, Scala, or R.

Upcoming Classes

No classes have been scheduled, but you can always Request a Quote.
