Apache Spark — Tips and Tricks for better performance

Processing data at scale usually means struggling with performance, strict SLAs, limited hardware, and more. I struggled to cut the run time of a Spark SQL query and eventually found the culprit. I would like to share both the culprit and the solution with you. In today's world of Big Data and Spark, we process high volumes of transactions. Catalyst is the Spark SQL query optimizer, and in this talk you will learn how to fully utilize its optimization power to make your queries as fast as possible by pushing down operations and avoiding UDFs wherever possible.

Do you like this session? Join 500+ attendees by registering now and live the Voxxed Athens experience