Presented by O'Reilly and Cloudera
Make Data Work

Building deep learning-powered big data analytics on Apache Spark using BigDL (sponsored by Intel)

This will be presented in Chinese

Yiheng Wang (Intel), Zhichao Li (Intel)
11:15–11:55 Friday, 2017-07-14
Sponsored
Location: Function Room 8A+8B
Average rating: *****
(5.00, 1 rating)

With the continued success of deep learning techniques, applications for perception across many modalities, such as image classification, object detection, and speech recognition, have grown rapidly. In response, Intel developed BigDL, an open source distributed deep learning framework for Apache Spark. It includes rich deep learning support and Intel Math Kernel Library (MKL) acceleration, allowing users to quickly develop high-performance deep learning applications on their existing Hadoop ecosystems.

Yiheng Wang and Zhichao Li explore several key deep learning applications that Intel successfully built on top of Apache Spark with BigDL. They discuss the technologies they developed and what they learned from building these applications, including the system's tool stack and design considerations, image recognition and object detection with Faster R-CNN and SSD, and speech recognition with Deep Speech and acoustic feature transformers. Along the way, they share insights and experiences gained while building a unified data analytics platform with Apache Spark MLlib and BigDL.

Yiheng Wang


Yiheng Wang is a software development engineer on the Big Data Technology team at Intel, working in the area of big data analytics. Yiheng and his colleagues develop and optimize distributed machine learning algorithms (e.g., neural networks and logistic regression) on Apache Spark. He also helps Intel customers build and optimize their big data analytics applications.

Zhichao Li


Zhichao Li is a senior software engineer at Intel focused on distributed machine learning, especially large-scale analytical applications and infrastructure on Spark. He is also an active contributor to Spark. Previously, Zhichao worked in Morgan Stanley's FX Department.


