Presented by O'Reilly and Cloudera
Make Data Work
July 12-13, 2017: Training
July 13-15, 2017: Conference
Beijing, China

Using Alluxio (formerly Tachyon) to speed up big data analytics

This will be presented in Chinese

Yupeng Fu (Alluxio), Rong Gu (Nanjing University)
09:00–12:30 Thursday, 2017-07-13
Data engineering and architecture
Location: Function Room 5C
Level: Intermediate

Prerequisite knowledge

A basic understanding of Hadoop and Spark

Materials or downloads needed in advance

A laptop (Linux or macOS) with terminal access and Java installed

What you'll learn

Learn what Alluxio is, how to configure and run it, and how to build simple applications that benefit from it

Description

In this three-hour tutorial, Yupeng Fu and Rong Gu offer an overview of Alluxio basics, demonstrating how Alluxio works and how to use it to enable distributed computation engines (like Spark or MapReduce) to share data at memory speed. In the hands-on portion, Yupeng and Rong walk you through deploying and running Alluxio, mounting external storage systems (like S3) into Alluxio's namespace, interacting with Alluxio through its command-line tools and web UI, and building a simple big data application with a common computation framework (e.g., Apache Spark or Hadoop MapReduce) that reads from and writes to Alluxio.

Yupeng Fu

Alluxio

Yupeng Fu is a software engineer at Alluxio and a PMC member of the Alluxio open source project. Previously, Yupeng worked at Palantir, where he led the efforts to build the company’s storage solution. Yupeng holds a BS and an MS from Tsinghua University and has completed coursework toward a PhD at UCSD.

Rong Gu

Nanjing University

Rong Gu is a research assistant professor at Nanjing University, an Alluxio PMC member and maintainer, and an Apache Spark contributor. On Alluxio, he has worked on the performance evaluation framework Alluxio-Perf, ecosystem exploration, and documentation development. Rong has also worked to bridge Spark and Alluxio, contributing the OFF_HEAP storage level feature to Spark 1.0, which allows Spark users to persist RDDs directly into Alluxio. He has been invited to share his work at many technical conferences, such as Spark Summit China, InfoQ Club, and Spark Meetup, and he is the organizer of the Nanjing Big Data Technology meetup. He is the first author of 10 papers published in TPDS, JPDC, IPDPS, and IEEE Big Data and the author of several chapters in Understanding Big Data: Big Data Processing and Programming and Hadoop in Practice: Open the Shortcut to the Cloud Computing. Rong holds a PhD in computer science from Nanjing University. Over the last three years, he has held internships at several technology companies, including Microsoft, Intel, Baidu, and Transwarp.
