Trusted answers to developer questions
Related Tags

pyspark

How to import a CSV file in pyspark

Abhilash

The spark.read.csv() method reads a single CSV file or a directory of CSV files into a Spark DataFrame. Read options, such as whether the file has a header row or which delimiter it uses, are set by chaining option() calls on spark.read before calling csv().

Syntax

spark.read.option("option_name", "option_value").csv(file_path)

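Parameters

  • option_name: This is the name of a read option, such as header or delimiter.
  • option_value: This is the value assigned to that option.
  • file_path: This is the path to the CSV file, or to a directory of CSV files, to be read.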

Return value

This method returns a Spark DataFrame.

Code example

Let’s look at the code below:

main.py
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("answers").getOrCreate()

path = "data.csv"

df = spark.read.option("header", "True").option("delimiter", ",").csv(path)
df.printSchema()

Code explanation

  • Lines 1–2: We import pyspark and SparkSession.
  • Line 4: We create a SparkSession with the application name answers.
  • Line 6: We define the path to the CSV file.
  • Line 8: We read the CSV file into a DataFrame using the csv() method. The header and delimiter options are set by chaining option() calls.
  • Line 9: We print the DataFrame schema.

Copyright ©2022 Educative, Inc. All rights reserved
