How to view data in Koalas

Koalas is a Python package for data science on big data. Its core idea is simple: keep the pandas API and change the engine that runs underneath it.

Koalas implements the pandas DataFrame API on top of Apache Spark, which makes life easier for data scientists who regularly work with big data. pandas itself is widely used in data science. The key difference between the two is that pandas provides a single-node DataFrame implementation, whereas Spark distributes the work across a cluster and is the de facto standard for big data processing.

With Koalas, a user who has experience with pandas can start working with Spark immediately. Moreover, it provides a single codebase that works with both Spark and pandas.
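
The "single codebase" idea can be sketched with a small helper that touches only methods the two libraries share. This is a minimal sketch, not part of Koalas itself: `top_rows` and the sample data are hypothetical names introduced for illustration, and the example uses pandas so that it runs without a Spark installation.

```python
import pandas as pd


def top_rows(df, by, n=3):
    """Return the n rows with the largest values in column `by`.

    Hypothetical helper: it relies only on sort_values() and head(),
    so it is assumed to accept either a pandas or a Koalas DataFrame.
    """
    return df.sort_values(by=by, ascending=False).head(n)


# Illustrative data, built with pandas so the sketch runs without Spark.
pandas_df = pd.DataFrame(
    {"unit": [1, 2, 3, 4],
     "hundred": [100, 200, 300, 400]},
    index=[1, 2, 3, 4])

result = top_rows(pandas_df, by="unit", n=2)
print(result)
```

Under the assumption that Koalas and Spark are installed, passing a Koalas DataFrame to `top_rows` instead should require no change to the function body.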

How to view data in Koalas

The basic operations on a Koalas DataFrame are similar to those in pandas. Let’s look at a few of them:

# Koalas is conventionally imported as ks
import databricks.koalas as ks

koalas_df = ks.DataFrame(
    {'unit': [1, 2, 3, 4, 5, 6],
     'hundred': [100, 200, 300, 400, 500, 600],
     'english': ["one", "two", "three", "four", "five", "six"]},
    index=[1, 2, 3, 4, 5, 6])

# Viewing the dataframe
>>> koalas_df

   unit  hundred english
1     1      100     one
2     2      200     two
3     3      300   three
4     4      400    four
5     5      500    five
6     6      600     six

# Viewing the first 5 rows (the default for head())
>>> koalas_df.head()

   unit  hundred english
1     1      100     one
2     2      200     two
3     3      300   three
4     4      400    four
5     5      500    five

# Viewing all index values
>>> koalas_df.index

Int64Index([1, 2, 3, 4, 5, 6], dtype='int64')

# Viewing all column names
>>> koalas_df.columns

Index(['unit', 'hundred', 'english'], dtype='object')

# Transposing the dataframe (rows become columns)
>>> koalas_df.T

           1    2      3     4     5    6
unit       1    2      3     4     5    6
hundred  100  200    300   400   500  600
english  one  two  three  four  five  six

# Sorting rows by unit in descending order
>>> koalas_df.sort_values(by='unit', ascending=False)

   unit  hundred english
6     6      600     six
5     5      500    five
4     4      400    four
3     3      300   three
2     2      200     two
1     1      100     one
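
For comparison, the same viewing calls can be tried on a plain pandas DataFrame. The following is a minimal sketch that assumes only pandas is installed; the smaller sample data and the variable names are illustrative and do not come from the listing above.

```python
import pandas as pd

# Illustrative data: a shortened version of the example dataframe.
pandas_df = pd.DataFrame(
    {"unit": [1, 2, 3],
     "english": ["one", "two", "three"]},
    index=[1, 2, 3])

index_values = list(pandas_df.index)      # row labels
column_names = list(pandas_df.columns)    # column labels
transposed = pandas_df.T                  # rows become columns
first_two = pandas_df.head(2)             # first two rows only

print(index_values)
print(column_names)
print(transposed.shape)
print(first_two)
```

The attribute and method names (`index`, `columns`, `T`, `head`) are the same ones used on `koalas_df`, which is what lets pandas experience carry over.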
