Presented by O'Reilly and Cloudera
Make Data Work
July 12-13, 2017: Training
July 13-15, 2017: Conference
Beijing, China

Hyperledger's integration with CDH's big data ecosystem and its real-world applications (Hyperledger与CDH大数据生态系统的融合以及应用实践)

This will be presented in Chinese.

Shouzhuang Jiang (蒋守壮) (Wanda Internet Technology Group), 丛宏雷 (Wanda Internet Technology Group)
13:10–13:50 Saturday, 2017-07-15
Enterprise adoption
Location: Function Room 6A+B; Level: Intermediate
Average rating: 2.00 (1 rating)

Prerequisite Knowledge

Cloudera Manager and CDH; Hyperledger

Description

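1. Background on Hyperledger
Hyperledger is an open source project launched by the Linux Foundation in 2015 to advance blockchain technology and transaction verification. Its goal is to let project members collaborate on an open platform that meets the needs of multiple industries and simplifies business processes. All nodes in a Hyperledger network jointly build a distributed ledger over a peer-to-peer network. The ledger is fully shared and decentralized, and updates to it are agreed through a consensus mechanism, which makes it well suited to finance and to many other industries such as manufacturing, banking, insurance, and the Internet of Things. By creating open standards for distributed ledgers, value in virtual and digital form (for example, asset contracts and energy trades) can be tracked and exchanged securely, efficiently, and at low cost.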

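2. Background on Cloudera Manager
Cloudera Manager is the market-leading big data management platform for CDH. It provides fine-grained visibility into and control over every CDH component (including HDFS, YARN, HBase, Hive, and Impala), setting the standard for enterprise deployment. With Cloudera Manager, operations staff can improve cluster performance, raise service quality, and lower management costs.

Cloudera Manager is designed to make managing an enterprise data center simple and intuitive. With it you can easily deploy and centrally operate the complete big data software stack. It automates installation, which shortens the time needed to deploy a cluster. It provides a cluster-wide, real-time view of node status, along with a central console for configuring the cluster. In addition, it includes a range of reporting and diagnostic tools that help you optimize cluster performance and improve utilization. Cloudera Manager provides the following capabilities:
● Automates the Hadoop installation process, greatly shortening deployment time;
● Provides a real-time overview of the cluster, such as the health of nodes and services;
● Provides a central console for changing cluster configuration;
● Includes comprehensive reporting and diagnostic tools that help optimize performance and utilization.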

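3. The Hyperledger on CDH project
Because Cloudera Manager makes it easy to add a new component to the CDH big data platform, we decided to use CM to deploy Hyperledger into an enterprise big data environment and to connect its data flows for integration and efficient analysis, that is, to use the existing big data platform to store and analyze the data Hyperledger produces. We set up the Hyperledger on CDH project to focus on rapid deployment, monitoring, and management of Hyperledger.

The Hyperledger on CDH project covers the following work:
• The Hyperledger Parcel, which contains all component dependencies and the required Docker images
• The CSD (Custom Service Descriptor): how the service is deployed and how it is started, stopped, and monitored
• The Cloudera Manager Agent, which distributes the Parcel and starts the service
• The user chooses the nodes on which each role of the service will be deployed
• Cloudera Manager tells the Agent to create a sandbox environment for the run, containing the generated configuration files, environment variables, and control scripts
• When the user clicks to start the service, the Agent invokes the control scripts to start the Hyperledger service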
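
The start sequence in the list above (sandbox creation, then a control script invoked by the agent) can be pictured with a small Python sketch. This is a hypothetical model, not Cloudera Manager's agent code: the function names `build_sandbox` and `start_command`, the file names `hyperledger.properties` and `control.sh`, the environment variable names, and the sample config key are all assumptions made for illustration.

```python
import os
import stat
import tempfile


def build_sandbox(role, config, env):
    """Create a per-role directory holding a generated config file and a
    control script, and return the environment for launching it.
    All file and variable names here are hypothetical."""
    sandbox = tempfile.mkdtemp(prefix=f"hyperledger-{role}-")

    # Generated configuration file: one key=value pair per line.
    with open(os.path.join(sandbox, "hyperledger.properties"), "w") as f:
        for key, value in sorted(config.items()):
            f.write(f"{key}={value}\n")

    # Control script: receives the requested action as its first argument.
    script_path = os.path.join(sandbox, "control.sh")
    with open(script_path, "w") as f:
        f.write("#!/bin/sh\n")
        f.write('echo "action=$1 role=$HL_ROLE conf=$CONF_DIR"\n')
    os.chmod(script_path, os.stat(script_path).st_mode | stat.S_IXUSR)

    # Environment variables handed to the control script at launch.
    process_env = dict(env, HL_ROLE=role, CONF_DIR=sandbox)
    return sandbox, script_path, process_env


def start_command(script_path, action="start"):
    """Return the argv an agent-like supervisor would execute."""
    return [script_path, action]


# Illustrative values only; the port number is a placeholder.
sandbox, script, env = build_sandbox("peer", {"listen.port": 7051}, {})
print(start_command(script))
```

Keeping the generated files in a fresh directory per role means a restart never reuses stale configuration, which is the main point of the sandbox step.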

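4. Applications of Hyperledger

Membership services manage identity, privacy, and confidentiality on the network. Participants register to obtain an identity, after which the attribute authority can issue the keys needed to transact. Because the blockchain records the ledger's update history, auditors can review a participant's transactions, provided the participant has granted them the appropriate access rights.

Blockchain services manage the distributed ledger through a peer-to-peer protocol built on gRPC. The data structures are optimized to efficiently maintain the overall state replicated across participants. Different consensus algorithms can be plugged into each deployment to guarantee strong consistency (Byzantine fault tolerance to handle misbehaving nodes, crash fault tolerance to handle delays and outages).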
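
The two failure models named above place different demands on network size. As a rough illustration, assuming the classic bounds (at least 3f + 1 nodes to tolerate f Byzantine faults, as in PBFT-style protocols, and at least 2f + 1 nodes to tolerate f crash faults in majority-quorum protocols), the number of failures a network of n nodes can absorb works out as follows. The talk does not say which consensus algorithms were deployed, and the function names are hypothetical.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (classic Byzantine bound)."""
    if n < 1:
        raise ValueError("a network needs at least one node")
    return (n - 1) // 3


def max_crash_faults(n: int) -> int:
    """Largest f such that n >= 2f + 1 (majority-quorum crash bound)."""
    if n < 1:
        raise ValueError("a network needs at least one node")
    return (n - 1) // 2


for n in (4, 7, 10):
    print(n, max_byzantine_faults(n), max_crash_faults(n))
# prints:
# 4 1 1
# 7 2 3
# 10 3 4
```

Under these assumptions, tolerating arbitrary misbehavior costs more replicas than tolerating simple outages, which is one reason to choose the consensus algorithm per deployment.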

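Chaincode services provide a secure, lightweight way to sandbox chaincode execution on validating nodes. The environment is a locked-down, secured container with a set of signed base images that contain a secure operating system together with the chaincode language, runtime, and SDK images for Golang, Java, and so on.

Wanda Internet Technology Group uses Hyperledger in its digital rights trading platform, storing published listings and completed trades on the blockchain so that all transaction records are secure and tamper-proof. In its supply chain management system, it uses Hyperledger's traceability to link the production and sale of goods, creating a new sales model.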
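
Why records stored on a blockchain resist tampering can be shown with a toy hash chain. This is a simplified sketch, not Hyperledger's block format: each block stores the hash of its predecessor, so changing an old record breaks every later link. The helper names and the sample trade data are made up, and a real network also relies on consensus and replication across peers, which this sketch leaves out.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def block_hash(index, prev_hash, transactions):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a block that commits to the hash of the previous block."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, [{"listing": "voucher-001", "price": 100}])
append_block(ledger, [{"trade": "voucher-001", "buyer": "alice"}])
print(verify_chain(ledger))  # True

ledger[0]["transactions"][0]["price"] = 1  # tamper with an old record
print(verify_chain(ledger))  # False
```

Strictly speaking the chain makes tampering detectable rather than impossible; it is the combination with replicated copies and consensus that makes an altered history hard to pass off as valid.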

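5. Outlook
So far we have deployed Hyperledger on an enterprise big data platform. Next we will focus on connecting Hyperledger's data with the platform's components so that data can flow between them and be mined, extracting as much business value as possible.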
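
One way to picture the planned data bridge is a flattening step that turns blocks into one row per transaction, written as newline-delimited JSON, a shape that analysis tools on a Hadoop-style platform can typically load. The talk does not describe the actual pipeline, so the block layout, the field names, and the `flatten_blocks` and `to_json_lines` helpers below are all hypothetical.

```python
import json


def flatten_blocks(blocks):
    """Yield one flat dict per transaction, tagged with its block number
    and its position inside the block (hypothetical block layout)."""
    for block in blocks:
        for position, tx in enumerate(block["transactions"]):
            yield {"block_number": block["number"], "tx_position": position, **tx}


def to_json_lines(rows):
    """Serialize rows as newline-delimited JSON, one record per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


# Made-up sample data for illustration.
sample_blocks = [
    {"number": 0, "transactions": [{"type": "listing", "asset": "voucher-001"}]},
    {"number": 1, "transactions": [{"type": "trade", "asset": "voucher-001"},
                                   {"type": "listing", "asset": "voucher-002"}]},
]
print(to_json_lines(flatten_blocks(sample_blocks)))
```

Keeping the block number and position on every row preserves the ledger's ordering after the data leaves the blockchain, so downstream queries can still reconstruct the sequence of events.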


Hyperledger is an open source project, initiated by the Linux Foundation in 2015, to advance blockchain technology and transaction verification. Hyperledger aims to promote cross-industry collaboration to build an open platform that can meet the requirements of multiple domains and simplify business processes. All nodes in a Hyperledger network jointly build a distributed ledger over a peer-to-peer network; the ledger is fully shared and decentralized, and its contents are updated through a consensus mechanism. Hence, it is well suited for financial applications and for many other industries, such as manufacturing, banking, insurance, and the Internet of Things. By creating an open standard for distributed ledgers and enabling the exchange of value in virtual and digital form (e.g., asset contracts and energy transactions), it becomes possible to track and conduct transactions securely, efficiently, and cost-effectively.

CDH's Cloudera Manager provides fine-grained visibility into and control over every component in CDH, including HDFS, YARN, HBase, Hive, and Impala, which helps set an enterprise-level deployment standard. Using Cloudera Manager, operations personnel can improve their cluster's performance and service quality while lowering management costs. The design goal of Cloudera Manager is to make management of an enterprise's data center easy and intuitive. The software automates the installation procedure, reducing the time needed to deploy a cluster. Cloudera Manager offers a real-time view of the running status of all nodes within a cluster and a central console from which users can configure the cluster. In addition, Cloudera Manager includes a set of reporting and diagnostic tools that help users optimize cluster performance and increase utilization.

Shouzhuang Jiang and 丛宏雷 discuss Hyperledger's integration with CDH's big data ecosystem and its real-world applications. Cloudera Manager makes it easy to add new components to CDH's big data platform, so it can be used to deploy Hyperledger into an enterprise-level big data environment and to connect its data flows for efficient analysis, that is, to use your existing big data platform to store and analyze data generated by Hyperledger.

You’ll also explore real-world applications of Hyperledger, including:

  • Membership services that manage identity, privacy, and confidentiality on the network. Participants register to obtain identities, after which the attribute authority can issue the keys needed to transact. Because the blockchain records the update history of the ledger, auditors can view a participant's transactions, provided the participant has granted them the appropriate access rights.
  • Blockchain services that manage the distributed ledger via a peer-to-peer protocol based on gRPC. Optimized data structures efficiently maintain the overall state replicated across participants. Different consensus algorithms may be plugged into each deployment to guarantee strong consistency (tolerating misbehavior with Byzantine fault tolerance, tolerating delays and outages with crash fault tolerance).
  • Chaincode services, a secure and lightweight way to sandbox chaincode execution on validating nodes. The environment is a "locked down" and secured container with a set of signed base images that contain a secure OS and the chaincode language, runtime, and SDK images for Golang, Java, etc.
  • How Wanda Internet Technology Group applies Hyperledger in its Digital Right Exchange Platform to ensure that all transaction records remain secure and cannot be tampered with, and how applying Hyperledger in a supply chain management system and using its traceability can link the production and sale of goods, creating a brand-new sales model.

The talk concludes with a look ahead at connecting Hyperledger's data to the big data platform's components, so that data can flow between them and be mined to extract maximum business value.

Topics include:

  • The Hyperledger Parcel, which includes all component dependencies and all required Docker images
  • How the CSD (Custom Service Descriptor) is deployed and how the service is started, stopped, and monitored
  • The Cloudera Manager Agent, which is in charge of distributing Parcels and starting the service
  • How users choose the nodes on which each service role will be deployed
  • How Cloudera Manager requests an agent to create a runtime sandbox, which holds the generated configuration files, environment variables, and control scripts
  • How the agent invokes the control scripts to start Hyperledger services

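Shouzhuang Jiang (蒋守壮)

Wanda Internet Technology Group

Shouzhuang Jiang focuses on source-code study and enterprise practice of big data technologies such as Hadoop, Spark, Flink, Kafka, Elastic, HBase, Hive, and Kylin. He is the author of the book 《基于Apache Kylin构建大数据分析平台》 (Building a Big Data Analytics Platform with Apache Kylin).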

丛宏雷

Wanda Internet Technology Group

Blockchain R&D at Wanda Internet Technology
