Presented by O'Reilly and Cloudera
Make Data Work

Robot predictive maintenance in action: Real-time, scalable pipelines explained

This will be presented in English.

Mathieu Dumoulin (MapR Technologies), Mateusz Dymczyk (H2O.ai)
14:50–15:30 Friday, 2017-07-14
IoT & real-time, Presented in English
Location: Function Room 6A+B
Level: Intermediate

Prerequisite knowledge

A general understanding of big data technologies and machine learning

What you'll learn

Explore a fully working pipeline from sensor to visualization explained step by step, learn how to apply anomaly detection on real-time streaming sensor data, and see a real application of modern big data streaming architecture in action

Description



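This working system is an implementation of a predictive maintenance use case. It is made possible only by a clever use of modern, microservices-based streaming architecture. The system draws on the unique features of the MapR Converged Data Platform for operational analytics, messaging, and storage. Machine learning modeling and deployment are implemented with H2O.ai.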


Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality, and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. In reality, however, some of the most advanced industrial manufacturers in the world have barely started to make use of this data, relying on relatively rudimentary, bespoke monitoring systems built at tremendous cost.

It is now possible to successfully deploy Industry 4.0 pilot use cases in a matter of months, at a small fraction of the cost of equivalent projects at leading high-tech manufacturers, using a well-chosen selection of big data enterprise products and open source projects. Mathieu Dumoulin and Mateusz Dymczyk walk you step by step through building a working real-time ML-based anomaly detection system on an industrial robot analog fitted with a wireless movement sensor. The system is made possible only by a clever use of modern, microservices-based streaming architecture. You’ll learn how to gather data from a wireless movement sensor, process it with H2O on a MapR cluster, and visualize the output for an operator through an AR headset.
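To make the idea of anomaly detection on streaming sensor data concrete, here is a minimal sketch in plain Python. It is not the speakers' method: the talk uses H2O models on a MapR cluster, whereas this example swaps in a simple rolling z-score as a stand-in technique. The class name `RollingZScoreDetector`, the helper `detect`, and all parameter values are hypothetical and chosen only for illustration.

```python
from collections import deque
from math import sqrt


class RollingZScoreDetector:
    """Hypothetical stand-in detector: flags a reading that lies far
    from the mean of the most recent readings (rolling z-score)."""

    def __init__(self, window=50, threshold=3.0, min_samples=10):
        self.readings = deque(maxlen=window)  # most recent sensor values
        self.threshold = threshold            # z-score cutoff (assumed value)
        self.min_samples = min_samples        # warm-up before scoring starts

    def update(self, value):
        """Score `value` against the current window, then store it."""
        is_anomaly = False
        n = len(self.readings)
        if n >= self.min_samples:
            mean = sum(self.readings) / n
            variance = sum((x - mean) ** 2 for x in self.readings) / n
            std = sqrt(variance)
            # Skip scoring when the window is constant (std == 0).
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        self.readings.append(value)
        return is_anomaly


def detect(stream, **kwargs):
    """Return (index, value) pairs for every reading flagged as anomalous."""
    detector = RollingZScoreDetector(**kwargs)
    return [(i, v) for i, v in enumerate(stream) if detector.update(v)]


if __name__ == "__main__":
    # Simulated movement-sensor readings with one injected spike.
    normal = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95] * 5
    stream = normal + [5.0] + [1.0] * 5
    print(detect(stream, window=20, threshold=3.0, min_samples=10))
```

In a real streaming pipeline, the `update` call would sit inside a consumer loop reading from a message stream, and the scoring step would be replaced by a trained model.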


Mathieu Dumoulin

MapR Technologies

Mathieu Dumoulin is a data scientist in MapR Technologies’s Tokyo office, where he combines his passion for machine learning and big data with the Hadoop ecosystem. Mathieu started using Hadoop from the deep end, building a full unstructured data classification prototype for Fujitsu Canada’s Innovation Labs, a project that eventually earned him the 2013 Young Innovator award from the Natural Sciences and Engineering Research Council of Canada. Afterward, he moved to Tokyo with his family, where he worked as a search engineer at a startup and a managing data scientist for a large Japanese HR company, before coming to MapR.

Mateusz Dymczyk


Mateusz Dymczyk is a Tokyo-based software engineer at H2O.ai, the company behind H2O, the leading open source machine learning platform for smarter applications and data products. He works on distributed machine learning projects including the core H2O platform and Sparkling Water, which integrates H2O and Apache Spark. Previously, he worked at Fujitsu Laboratories on natural language processing and utilization of machine learning techniques for investments and at Infoscience on a highly distributed log data collection and analysis platform. Mateusz loves all things distributed and machine learning and hates buzzwords. In his spare time, he participates in the IT community by organizing, attending, and speaking at conferences and meetups. Mateusz holds an MSc in computer science from AGH UST in Krakow.


