What is PySpark MapType?

MapType is a PySpark data type that is similar to a dictionary in Python or a HashMap in Java.

It’s used to store key-value pairs. All keys share one data type and all values share another, and both are declared when the MapType is defined. The keys in a MapType are not allowed to be None (NULL).

Syntax

MapType(keyType, valueType, valueContainsNull=True)

Parameters

keyType: This is the data type of the keys.
valueType: This is the data type of the values.
valueContainsNull: This is a boolean value indicating whether the values can be NULL or None. The default value is True, which indicates that the values can be NULL.

Code example

from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType, StringType, MapType

# Start (or reuse) a Spark session.
spark = SparkSession.builder.appName('answers').getOrCreate()

# Schema: a string column plus a map column with string keys and string values.
dfSchema = StructType([
    StructField('Emp Name', StringType(), True),
    StructField('Details', MapType(StringType(), StringType()), True)
])

# Each row pairs a name with a Python dictionary, which Spark stores as a map.
data = [
    ('John Wick', {'country': 'usa', 'profession': 'Don'}),
    ('Yash', {'country': 'india', 'profession': 'Artist'}),
    ('Novak Djokovic', {'country': 'serbia', 'profession': 'tennis player'}),
    ('Sundar Pichai', {'country': 'usa', 'profession': 'CEO'}),
    ('Kobe Bryant', {'country': 'usa', 'profession': 'Basketball player'})
]

# Build the DataFrame and print it without truncating the map column.
df = spark.createDataFrame(data=data, schema=dfSchema)
df.show(truncate=False)
