Series vs. DataFrame in Pandas

Pandas

pandas is an open-source Python library that is popular with data scientists and analysts. It enables users to manipulate and analyze data through two core data structures, the Series and the DataFrame.

Initializing a series

import pandas
# Integer indices
fruits = pandas.Series(["apples", "oranges", "bananas"], index=[4, 3, 2])
print("Fruit series:")
print(fruits)
# String indices
temperature = pandas.Series([32.6, 34.1, 28.0, 35.9], index=["one", "two", "three", "four"])
print("\nTemperature series:")
print(temperature)
# Non-unique index values
factors_of_12 = pandas.Series([1,2,4,6,12], index=[1, 1, 2, 2, 3])
print("\nFactors of 12 series:")
print(factors_of_12)
print("Type of this data structure is:", type(factors_of_12))
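
Because a Series may carry repeated index values, a lookup by label can return either one value or several. The following is a minimal sketch of that behavior, reusing the factors_of_12 data from the example above; the variable names single and repeated are introduced here only for illustration.

```python
import pandas as pd

# Same data as the factors_of_12 Series above: the labels 1 and 2 each appear twice.
factors_of_12 = pd.Series([1, 2, 4, 6, 12], index=[1, 1, 2, 2, 3])

# A label that appears exactly once yields a single value.
single = factors_of_12.loc[3]

# A label that appears more than once yields a smaller Series
# containing every element stored under that label.
repeated = factors_of_12.loc[1]

print(single)                   # 12
print(type(repeated).__name__)  # Series
print(repeated.tolist())        # [1, 2]
```

This is one reason to prefer unique index values when a lookup is expected to return exactly one element.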

Initializing a DataFrame

In the DataFrame example further below, a DataFrame is initialized from a dictionary with two key-value pairs. Each key becomes a column label in the resulting DataFrame, and the corresponding value supplies all the elements of that column.

The two lists of fruits serve as the dictionary's values, and the dictionary is passed to the pandas.DataFrame() constructor to build the DataFrame.

For the DataFrame named fruits, a list of labels is also passed through the index argument of pandas.DataFrame() so that the rows use custom labels. The DataFrame named new_fruits omits that argument and therefore gets the default integer index 0, 1, 2.
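
The effect of the dictionary keys and of the index argument can be checked by inspecting the resulting objects. The sketch below assumes the same two fruit lists used in this article; the names default_index and custom_index are illustrative only.

```python
import pandas as pd

fruits_jack = ["apples", "oranges", "bananas"]
fruits_john = ["guavas", "kiwis", "strawberries"]

# Each dictionary key becomes a column label; each list becomes that column's values.
default_index = pd.DataFrame({"Jack's": fruits_jack, "John's": fruits_john})

# Passing index= replaces the default 0, 1, 2 row labels with custom ones.
custom_index = pd.DataFrame(
    {"Jack's": fruits_jack, "John's": fruits_john},
    index=["a", "b", "c"],
)

print(list(default_index.columns))  # ["Jack's", "John's"]
print(list(default_index.index))    # [0, 1, 2]
print(list(custom_index.index))     # ['a', 'b', 'c']
print(custom_index.shape)           # (3, 2): three rows, two columns
```

Note that the lists must have the same length, and the index list must have one label per row.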

Querying a DataFrame

The DataFrame can be queried in multiple ways.

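.loc[] selects rows (and optionally columns) by label, that is, by the values in the index, including user-defined ones.
.iloc[] selects rows (and optionally columns) by integer position, counting from 0, regardless of the index labels.
The bracket operator [] on a DataFrame selects a column by name; applied to the resulting Series, it selects an element by index label.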
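
The difference between label-based and position-based selection is easiest to see when the row labels are themselves integers in a non-default order. The following is a small sketch under that assumption; the DataFrame df and its single column "fruit" are made up for illustration.

```python
import pandas as pd

# Illustrative DataFrame whose row labels are the integers 2, 1, 0 (in that order).
df = pd.DataFrame(
    {"fruit": ["apples", "oranges", "bananas"]},
    index=[2, 1, 0],
)

# .loc finds the row whose LABEL is 0, which is the last row here.
by_label = df.loc[0, "fruit"]

# .iloc finds the row at POSITION 0, which is the first row here.
by_position = df.iloc[0, 0]

# [] with a column name returns that whole column as a Series.
column = df["fruit"]

print(by_label)         # bananas
print(by_position)      # apples
print(column.tolist())  # ['apples', 'oranges', 'bananas']
```

With the default index 0, 1, 2 the two accessors happen to agree, which is why the distinction is easy to overlook.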

We can also use chained queries to query a specific cell in the DataFrame.

These queries return a Series or a single value depending on the type of query. Selecting a row or a column returns a Series, while selecting a single cell returns a scalar (in the example below, a string).
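
The return types can be confirmed directly. The sketch below rebuilds the fruit DataFrame used in this article and compares a chained query with a single .loc[row, column] call; the variable names row, column, cell_chained, and cell_direct are illustrative.

```python
import pandas as pd

fruits = pd.DataFrame(
    {
        "Jack's": ["apples", "oranges", "bananas"],
        "John's": ["guavas", "kiwis", "strawberries"],
    },
    index=["a", "b", "c"],
)

# A row query returns a Series indexed by the column names.
row = fruits.loc["c"]

# A column query returns a Series indexed by the row labels.
column = fruits["John's"]

# A cell can be reached by chaining (column first, then row label) ...
cell_chained = fruits["John's"]["c"]

# ... or with one .loc call that takes the row label and the column name together.
cell_direct = fruits.loc["c", "John's"]

print(type(row).__name__, type(column).__name__)  # Series Series
print(cell_chained, cell_direct)                  # strawberries strawberries
```

For reading a value both forms give the same result; the single .loc call is generally the safer habit, since chained indexing is discouraged when assigning values.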

import pandas as pd

##### INITIALIZATION #####

fruits_jack = ["apples", "oranges", "bananas"]
fruits_john = ["guavas", "kiwis", "strawberries"]
index = ["a", "b", "c"]
all_fruits = {"Jack's": fruits_jack, "John's": fruits_john}

fruits = pd.DataFrame(all_fruits, index = index)
print(fruits, "\n")

new_fruits = pd.DataFrame(all_fruits)
print(new_fruits, "\n")


##### QUERY #####

# USING INTEGER POSITION (.iloc): first row, whatever its label
print("1st fruit:")
print(fruits.iloc[0], "\n")

# USING INDEX LABEL (.loc): row whose label is "c"
print("Fruits at key \"c\":")
print(fruits.loc["c"], "\n")

#USING COLUMN NAME
print("Jack's fruits: ")
print(fruits["Jack's"], "\n")

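# CHAINED QUERY, querying a cell: select the column, then the third element by position.
# .iloc is used because the index labels are strings; a bare [2] would rely on
# positional fallback, which recent pandas versions deprecate.
print("John's third fruit: ")
print(fruits["John's"].iloc[2], "\n")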
