Databricks Driver Sizing Optimization for Cost and Performance

This post explores how Databricks driver sizing affects cost and performance, using the TPC-DS 1TB benchmark. It examines the relationship between the driver instance size and the cost and runtime of a job, and explains why driver sizing matters and how it relates to the number of workers when tuning a system for either a cost goal or a runtime goal. The technical parameters of the experiment are provided, including the hardware specifications of the different driver instances used on AWS. The post concludes with plots of the experimental results, which confirm that driver sizing affects both cost and performance.
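To make the link between driver size, worker count, and cost concrete, here is a minimal sketch of a cost model. It assumes that every node is billed at a flat hourly rate for the full job runtime. The names `ClusterConfig`, `job_cost`, and `driver_cost_share` are hypothetical, and the rates are placeholder inputs rather than real AWS or Databricks prices or results from the experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    """Hypothetical cluster description; all rates are placeholder inputs."""
    driver_rate_per_hour: float  # assumed all-in hourly cost of the driver node
    worker_rate_per_hour: float  # assumed all-in hourly cost of one worker node
    num_workers: int


def hourly_rate(config: ClusterConfig) -> float:
    """Total hourly cost of the cluster: one driver plus all workers."""
    return config.driver_rate_per_hour + config.num_workers * config.worker_rate_per_hour


def job_cost(config: ClusterConfig, runtime_hours: float) -> float:
    """Cost of one run, assuming every node is billed for the full runtime."""
    return hourly_rate(config) * runtime_hours


def driver_cost_share(config: ClusterConfig) -> float:
    """Fraction of the cluster's hourly cost that comes from the driver."""
    return config.driver_rate_per_hour / hourly_rate(config)


if __name__ == "__main__":
    # Placeholder rates only: same driver, small cluster vs. large cluster.
    small = ClusterConfig(driver_rate_per_hour=1.0, worker_rate_per_hour=1.0, num_workers=2)
    large = ClusterConfig(driver_rate_per_hour=1.0, worker_rate_per_hour=1.0, num_workers=20)
    for name, cfg in (("small", small), ("large", large)):
        print(f"{name}: driver share of hourly cost = {driver_cost_share(cfg):.2%}")
```

Under this simplified model, the driver's share of the bill shrinks as workers are added, so a larger driver is more likely to pay for itself on bigger clusters if it shortens the runtime. Whether it actually does is an empirical question that a benchmark has to answer.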