Presented by O'Reilly and Cloudera
Make Data Work
July 12-13, 2017: Training
July 13-15, 2017: Conference
Beijing, China

Modern streaming architectures

This will be presented in Chinese

Sijie Guo (Streamlio), Maosong Fu (Twitter)
13:30–17:00 Thursday, 2017-07-13
IoT & real-time
Location: Function Room 5A
Level: Beginner

Prerequisite knowledge

A basic knowledge of stream computing systems, messaging and stream storage systems, and distributed systems

Materials or downloads needed in advance

A laptop (Linux or macOS) with DistributedLog and Heron installed

What you'll learn

Explore modern streaming architectures, Twitter's real-time stack, and lessons learned running a real-time stack at Twitter's scale

Description

The move from batch processing to streaming architectures is a revolution in how companies use data. Twitter, for instance, is all about real time and has spent the past several years investing in and developing a complete real-time data stack. Recently, there has been a lot of interest in real-time architectures from enterprises as well. But what is the state of the art for a real-time data stack, including stream computing engines, data storage engines, programming languages, and tools? And what technical challenges does it pose? Sijie Guo and Maosong Fu explore the typical challenges in a modern real-time data stack and explain how these technologies will impact streaming architectures and applications in the future.

Topics include:

  • An introduction to streaming and some sample applications
  • Streaming architecture overview
  • Different types of streaming architectures and their pros and cons
  • An in-depth discussion of Apache DistributedLog, which is designed for real-time data storage, and its use cases in a modern real-time data stack
  • An in-depth discussion of the Heron stream computing engine and its use cases in a modern real-time data stack
  • Lessons learned from building a real-time stack using Apache DistributedLog and Heron at Twitter

Sijie Guo

Streamlio

Sijie Guo is the tech lead of Twitter’s Messaging group. Sijie is the cocreator of Apache DistributedLog and the PMC chair of Apache BookKeeper.


Maosong Fu

Twitter

Maosong Fu is the technical lead for Heron and real-time analytics at Twitter. He is the author of a few publications in the distributed systems area. Maosong holds a master's degree from Carnegie Mellon University and a bachelor's from Huazhong University of Science and Technology.
