What is sort_values() in pandas?

Pandas is a Python library for manipulating and analyzing data. It represents tabular data as two-dimensional structures called data frames, which make the data easy to organize and read. To make the data easier to interpret, the sort_values() function sorts a data frame in ascending or descending order of the column passed as an argument. For example, suppose you want to list your students in order of increasing age. You can do this by calling sort_values() and passing the column with the students’ ages as the argument.

Syntax

dataframe.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)

Arguments

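  1. by: Specifies the column or index label to sort by. Must be a string or a list of strings; pass a list to sort by several columns. (Required)
  2. axis: Specifies the axis to sort along. 0 (or 'index') sorts the rows by the values in one or more columns; 1 (or 'columns') sorts the columns by the values in one or more rows. 0 is the default. (Optional)
  3. ascending: If set to True, the data frame is sorted in ascending order; if False, in descending order. A list of booleans can be passed to set the order for each label in by. True is the default. (Optional)
  4. inplace: If set to True, the original data frame is sorted in place and nothing is returned. False is the default. (Optional)
  5. kind: Specifies the sorting algorithm to use. The options are quicksort, mergesort, heapsort, or stable. quicksort is the default. This argument only applies when sorting by a single column or label. (Optional)
  6. na_position: Specifies where to put missing (NaN) values. Options are last or first. last is the default. (Optional)
  7. ignore_index: If set to True, the original index labels are discarded and the result is labeled 0, 1, …, n - 1. False is the default. (Optional)
  8. key: Specifies a function that is applied to the values before sorting. The function receives a Series and must return a Series of the same shape. (Optional)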
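
The optional arguments above can be combined in several ways. The following is a minimal sketch, assuming a small made-up data frame (the names and values are hypothetical and chosen only for illustration). It shows a multi-column sort with a per-column order, na_position, and a key function.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
students = pd.DataFrame({
    "Name": ["john", "Kelly", "kris", "Betty", "Bob"],
    "Age": [16, 19, 16, 21, 17],
    "Marks": [89, np.nan, 85, 67, 53],
})

# Sort by Age ascending, then by Marks descending within each age
by_age_then_marks = students.sort_values(by=["Age", "Marks"], ascending=[True, False])

# Put rows with missing Marks first instead of last
missing_first = students.sort_values(by="Marks", na_position="first")

# Case-insensitive sort on Name: key receives a Series and returns a Series
by_name = students.sort_values(by="Name", key=lambda s: s.str.lower())

print(by_age_then_marks)
print(missing_first)
print(by_name)
```

Without the key function, the last call would place the capitalized names before the lowercase ones, because uppercase letters sort before lowercase letters in the default string comparison.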

Return Value

A new, sorted data frame object is returned. If inplace is set to True, the original data frame is sorted instead and None is returned.
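
The effect of inplace on the return value can be checked with a short sketch like the one below. The data frame here is a hypothetical three-row example, not the one used later in this article.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"Name": ["John", "Kelly", "Kris"], "Age": [16, 19, 15]})

# Default (inplace=False): a new data frame is returned and df is untouched
sorted_copy = df.sort_values(by="Age")
ages_before = df["Age"].tolist()

# inplace=True: df itself is reordered and the call returns None
result = df.sort_values(by="Age", inplace=True)
ages_after = df["Age"].tolist()

print(ages_before)  # [16, 19, 15]
print(ages_after)   # [15, 16, 19]
print(result)       # None
```

Because the in-place call returns None, writing df = df.sort_values(by="Age", inplace=True) would replace the data frame with None, which is a common mistake.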

Code Example

# Import the library
import pandas as pd

# Initialize the data
data = {
    "Name": ['John', 'Kelly', 'Kris', 'Betty', 'Bob'],
    "Age": [16, 19, 15, 21, 17],
    "Marks": [89, 76, 85, 67, 53]
}

# Create a data frame object
df = pd.DataFrame(data)

# Print the data frame
print(df)

To sort the data in order of increasing marks, we use the sort_values() function with the column Marks passed as the by argument.

#sort the data frame
sorted_df = df.sort_values(by='Marks')
#print the sorted data frame
print(sorted_df)
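
As a follow-up sketch using the same data, the snippet below shows that the sorted rows keep their original index labels, and how ignore_index and ascending change the result. The variable names renumbered and highest_first are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ['John', 'Kelly', 'Kris', 'Betty', 'Bob'],
    "Age": [16, 19, 15, 21, 17],
    "Marks": [89, 76, 85, 67, 53]
})

# Rows are reordered, but each row keeps its original index label
sorted_df = df.sort_values(by='Marks')
print(sorted_df.index.tolist())  # [4, 3, 1, 2, 0]

# ignore_index=True relabels the sorted rows 0, 1, ..., n - 1
renumbered = df.sort_values(by='Marks', ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4]

# ascending=False sorts from the highest marks to the lowest
highest_first = df.sort_values(by='Marks', ascending=False)
print(highest_first['Name'].tolist())  # ['John', 'Kris', 'Kelly', 'Betty', 'Bob']
```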