Spark SQL Adaptive Execution Unleashes The Power of Cluster in Large Scale
Carson Wang (Intel) and Yuanjian Li (Baidu)
Duration: 15 min 46 sec. Published: 11-Jun-2018
Description: Spark SQL is a very effective distributed SQL engine for OLAP and is widely adopted in Baidu production for many internal BI projects. However, Baidu has also been facing many challenges at large scale, including tuning the shuffle parallelism for thousands of jobs, inefficient execution plans, and handling data skew.

In this talk, we will explore Intel and Baidu’s joint efforts to address challenges at large scale and offer an overview of an adaptive execution mode we implemented for Baidu’s B