What is PySpark MapType?
MapType is a PySpark data type similar to a dictionary in Python or a HashMap in Java.
It is used to store key-value pairs. The keys and the values each have an associated data type. The keys in a MapType are not allowed to be None or NULL.
Syntax
MapType(keyType, valueType, valueContainsNull=True)
Parameters
- keyType: This is the data type of the keys.
- valueType: This is the data type of the values.
- valueContainsNull: This is a boolean value indicating whether the values can be NULL or None. The default value is True, which indicates that the values can be NULL.
Code example
from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType, StringType, MapType

spark = SparkSession.builder.appName('answers').getOrCreate()

dfSchema = StructType([
    StructField('Emp Name', StringType(), True),
    StructField('Details', MapType(StringType(), StringType()), True)
])

data = [
    ('John Wick', {'country': 'usa', 'profession': 'Don'}),
    ('Yash', {'country': 'india', 'profession': 'Artist'}),
    ('Novak Djokovic', {'country': 'serbia', 'profession': 'tennis player'}),
    ('Sundar Picchai', {'country': 'usa', 'profession': 'CEO'}),
    ('Kobe Bryant', {'country': 'usa', 'profession': 'Basket ball player'})
]

df = spark.createDataFrame(data=data, schema=dfSchema)
df.show(truncate=False)
Code explanation
- Lines 1–2: SparkSession and the relevant data types are imported.
- Line 4: A SparkSession with the application name answers is created.
- Lines 6–9: The schema for the DataFrame is defined. The Details column is a MapType.
- Lines 11–17: The sample data for the DataFrame is defined. Each row supplies a Python dictionary for the Details column.
- Line 19: A DataFrame is created from the schema and the sample data.
- Line 20: The created DataFrame is displayed.
Copyright ©2026 Educative, Inc. All rights reserved