Grab sits at the junction of the digital and physical worlds. Its vision is to drive Southeast Asia (SEA) forward and transform the way people travel and pay across the region. With more than 700,000 drivers and 36 million app downloads, the Grab app has become a platform with one of the highest usage and transaction rates among the 620 million people in SEA, and it is growing every day. This gives the company an incredible opportunity to perfect the way it uses data to make lives easier across the region.
Grab aims to create and sustain a data-driven culture, using data to solve its toughest problems. The Data Engineering team is responsible for building a reliable data analytics platform and plays a big role in helping different teams gain product and consumer insights from a multipetabyte-scale data warehouse. Its work ranges from supporting ad hoc queries (on booking data, logs, etc.) to analyzing user experience and training machine-learning models.
Feng Cheng and Edwin Law explain Grab’s data architecture and offer a history of its data platform migration and stream-processing apps. Feng and Edwin describe some of the challenges the company has faced in getting its back-office applications to scale and what it has done to meet demand. They also trace the evolution of its architecture, from Redshift to EMR + S3. In the early stages, Redshift was a simple and cost-effective way to analyze all of Grab’s data. But when data volumes grew exponentially over the last year and data processing became more complicated, the company decided to make a big change to the architecture, moving its data warehouse to AWS EMR + S3. This architecture offers many advantages, including separating the computing and storage layers and letting multiple clusters share the same data on S3 for analytics.
Cheng Feng is a data engineer at Grab, where he works on the big data platform, distributed computing, stream processing, and data science. Previously, he was a data scientist at the Lazada Group, working on Lazada’s tracker, customer segmentation and recommendation systems, and fraud detection.
Edwin Law was the third person and first engineer to join the Data team at Grab (formerly MyTeksi and Grab Taxi), which encompasses data engineering, data science, and data analytics. As engineering manager, Edwin leads the Data Engineering and Database Operations teams, which together have nearly 15 members.
©2017, O'Reilly Media, Inc.