Macros in Data Pipelines

Neville Li works at Spotify on the Music Recommendation Team. They’ve been using Scala since early 2013, specifically with data science tools like Scalding and Spark.

He describes a particular “powerful data combo” trio:

In this talk for NE Scala, Neville presents how Scala macros can be used to improve data pipeline code by leveraging the items listed above. Quoting from his abstract, “We use macros to generate parquet schema projection and filter predicates in compile time. Compared to the standard approach, the macros are type-safe, more concise, and user friendly.”
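To make the type-safety point in the abstract concrete, here is a minimal sketch in plain Scala. It does not use macros and is not the API of Neville's library. The `Track` record, the `Field` wrapper, and the names `projectByName`, `projectTyped`, and `longTrack` are all hypothetical and exist only for illustration. The sketch contrasts selecting fields by string name, where a typo only surfaces when the job runs, with selecting them through typed accessors, where the compiler rejects the typo.

```scala
// Hypothetical record type standing in for an Avro-generated class.
final case class Track(id: Long, title: String, artist: String, durationMs: Int)

object ProjectionSketch {
  // String-based approach: field names are plain strings, so a misspelled
  // name can only be detected when the code runs.
  private val knownFields: Set[String] = Set("id", "title", "artist", "durationMs")

  def projectByName(names: String*): Either[String, Seq[String]] = {
    val unknown = names.filterNot(knownFields.contains)
    if (unknown.isEmpty) Right(names.toList)
    else Left("unknown fields: " + unknown.mkString(", "))
  }

  // Typed approach (hypothetical helper): each field carries an accessor,
  // so referring to a field that does not exist fails to compile.
  final case class Field[A](name: String, get: Track => A)

  val title: Field[String] = Field[String]("title", _.title)
  val artist: Field[String] = Field[String]("artist", _.artist)

  def projectTyped(fields: Field[_]*): Seq[String] = fields.map(_.name).toList

  // A filter predicate written as an ordinary typed function.
  val longTrack: Track => Boolean = _.durationMs > 300000
}
```

With the string-based version, `ProjectionSketch.projectByName("titel")` compiles and reports the problem only at run time, while an accessor such as `_.titel` in the typed version would not compile at all. A macro-based approach, as described in the abstract, aims for the same compile-time checking without hand-written wrappers like `Field`.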

The code that Neville and his team use in production at Spotify can be found here: https://github.com/nevillelyh/parquet-avro-extra

This post is part of Northeast Scala Symposium 2015
