
What is spark.driver.memoryOverhead in Spark 3?

Shahpar Khan

Spark configuration

Spark lets you configure the system according to your needs. One place to do so is Spark properties.

Spark properties control most application parameters and can be set using a SparkConf object. One of these properties is spark.driver.memoryOverhead.

The spark.driver.memoryOverhead property sets the amount of non-heap memory allocated to each driver process in cluster mode. This memory accounts for things like VM overheads, interned strings, and other native overheads, and it tends to grow with the container size (typically 6-10% of it). The property is supported on YARN and Kubernetes and has been available since Spark 2.3.0.

The memory assigned through spark.driver.memoryOverhead is non-heap memory, allocated on top of the heap memory assigned through spark.driver.memory. If not set explicitly, it defaults to driverMemory * 0.10, with a minimum of 384 MiB.
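For example, with spark.driver.memory set to 10g, the default overhead works out to 1 GiB. The snippet below is a minimal sketch of that formula; the 10g figure is just an illustration:

// Default driver memory overhead: max(driverMemory * 0.10, 384 MiB)
val driverMemoryMiB = 10 * 1024  // spark.driver.memory = 10g, expressed in MiB
val overheadMiB = math.max((driverMemoryMiB * 0.10).toLong, 384L)
println(overheadMiB)  // 1024, i.e., 1 GiB of overhead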

Memory input format

Use the following format to set memory:

Format       Description
1b           bytes
1k or 1kb    kibibytes (1024 bytes)
1m or 1mb    mebibytes (1024 kibibytes)
1g or 1gb    gibibytes (1024 mebibytes)
1t or 1tb    tebibytes (1024 gibibytes)
1p or 1pb    pebibytes (1024 tebibytes)
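For instance, the following settings all request the same amount of overhead memory, assuming conf is a SparkConf object (created as shown in the next section). Note that for spark.driver.memoryOverhead, a bare number with no suffix is interpreted as MiB:

conf.set("spark.driver.memoryOverhead", "1g")     // 1 gibibyte
conf.set("spark.driver.memoryOverhead", "1024m")  // the same amount in mebibytes
conf.set("spark.driver.memoryOverhead", "1024")   // no suffix: read as MiB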

How to set driver memory overhead

You can set the driver memory overhead through the SparkConf object. Here is some sample Scala code:

import org.apache.spark.SparkConf

val conf = new SparkConf()
conf.set("spark.driver.memoryOverhead", "15g")
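For the setting to take effect, pass the conf when creating the application's entry point. The snippet below is a minimal sketch, and the app name is just a placeholder. Keep in mind that, as with spark.driver.memory, driver memory settings generally have to be supplied at submit time in cluster mode, since the driver container is already allocated by the time user code runs:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("MyApp")
  .config(conf)
  .getOrCreate()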

Or you can load the property dynamically when submitting the application (here, myApp.jar stands in for your application jar):

./bin/spark-submit \
  --name "My app" \
  --master local[4] \
  --conf spark.driver.memoryOverhead=15g \
  myApp.jar

With spark-submit, you can specify any Spark property using the --conf/-c flag, passing the value after the = sign. Running ./bin/spark-submit --help shows the full list of these options.
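spark-submit also reads defaults from conf/spark-defaults.conf, so the same setting can live there instead. A sketch of that file's format, where each line is a key and a value separated by whitespace:

spark.driver.memoryOverhead  15g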

RELATED TAGS

apache spark
