Presented by O'Reilly and Cloudera
Make Data Work
July 12-13, 2017: Training
July 13-15, 2017: Conference
Beijing, China

Tricks of the trade to be an Apache Spark rock star

This will be presented in English.

Ted Malaska (Capital One)
14:00–14:40 Friday, 2017-07-14
Spark & beyond, presented in English
Location: Function Room 2
Level: Intermediate

Prerequisite Knowledge

A working knowledge of Spark

What you'll learn

Learn how to build and run unit tests for Spark

Description

It’s one thing to write an Apache Spark application that gets you to an answer. It’s another thing to know you used all the tricks in the book to make it run as fast as possible. Ted Malaska shares some of those tricks.

Join Ted to discover patterns and approaches that may not be apparent at first glance but that can be game-changing when applied to your use cases. You’ll learn about nested types, multithreading, skew, reducing, Cartesian joins, and other fun stuff.

Ted Malaska

Capital One

Ted Malaska is a director of enterprise architecture at Capital One. Previously, he was the director of engineering in the Global Insight Department at Blizzard; a principal solutions architect at Cloudera, helping clients find success with the Hadoop ecosystem; and a lead architect at the Financial Industry Regulatory Authority (FINRA). He has contributed code to Apache Flume, Apache Avro, Apache YARN, Apache HDFS, Apache Spark, Apache Sqoop, and many more. Ted is a coauthor of Hadoop Application Architectures, a frequent speaker at conferences, and a frequent blogger on data architectures.
