Three ways to combine DataFrames in Pandas

Pandas `join()` function

This function allows the lowest level of control. It joins the rows of two tables based on a common column or index. Have a look at the illustration below to understand the various types of joins.

A left join of two DataFrames that both have a `key` column, using the suffixes `_caller` and `_other`, produces:

  key_caller   A key_other    B
0         K0  A0        K0   B0
1         K1  A1        K1   B1
2         K2  A2        K2   B2
3         K3  A3       NaN  NaN
4         K4  A4       NaN  NaN
5         K5  A5       NaN  NaN

Explanation:

By default, `join()` performs a left join, but you can change the type of join by passing `how='type_of_join'`.
The parameter `lsuffix` is the suffix added to the names of overlapping columns from the left frame.
The parameter `rsuffix` is the suffix added to the names of overlapping columns from the right frame.
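The code behind the table above is not shown, so here is a minimal sketch that should reproduce it. The frame names `df` and `other` and their contents are assumptions chosen to match the printed output, not necessarily the original inputs.

```python
import pandas as pd

# Hypothetical input frames, chosen to match the output table above.
df = pd.DataFrame({
    "key": ["K0", "K1", "K2", "K3", "K4", "K5"],
    "A": ["A0", "A1", "A2", "A3", "A4", "A5"],
})
other = pd.DataFrame({
    "key": ["K0", "K1", "K2"],
    "B": ["B0", "B1", "B2"],
})

# join() aligns on the index and defaults to a left join. Both frames
# have a "key" column, so suffixes are needed to keep the names distinct.
left_joined = df.join(other, lsuffix="_caller", rsuffix="_other")
print(left_joined)

# how="inner" keeps only the index labels present in both frames.
inner_joined = df.join(other, lsuffix="_caller", rsuffix="_other", how="inner")
print(inner_joined)
```

Rows 3 to 5 have no matching index label in `other`, which is why the left join fills them with NaN and the inner join drops them.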

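Pandas `merge()` function

Merging two DataFrames that each have a `value` column, matching `lkey` in the left frame against `rkey` in the right frame, gives: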

  lkey  value_x rkey  value_y
0  foo        1  foo        5
1  foo        1  foo        8
2  foo        5  foo        5
3  foo        5  foo        8
4  bar        2  bar        6
5  baz        3  baz        7

Explanation:

The parameter `left_on` names the column(s) or index level(s) to join on in the left DataFrame.
The parameter `right_on` names the column(s) or index level(s) to join on in the right DataFrame.
By default, `merge()` performs an inner join, but you can change that by passing `how='type_of_join'`.
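As a rough sketch of a call that yields the rows shown above: the frames `df1` and `df2` below are assumed inputs reconstructed from the printed output. The order of rows in the result can differ between pandas versions, so compare the contents rather than the exact ordering.

```python
import pandas as pd

# Hypothetical inputs reconstructed from the output table above.
df1 = pd.DataFrame({"lkey": ["foo", "bar", "baz", "foo"], "value": [1, 2, 3, 5]})
df2 = pd.DataFrame({"rkey": ["foo", "bar", "baz", "foo"], "value": [5, 6, 7, 8]})

# Inner join (the default) matching df1.lkey against df2.rkey.
# The overlapping "value" column gets the default suffixes _x and _y.
merged = df1.merge(df2, left_on="lkey", right_on="rkey")
print(merged)
```

Because `foo` appears twice in each frame, every left `foo` row pairs with every right `foo` row, which accounts for four of the six result rows.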

Pandas `concat()` function

Given two DataFrames, `df1` with the columns `Key` and `data1` and `df2` with the columns `Key` and `data2`, let’s concatenate them.

The output of the concatenation is:

  Key  data1  data2
0   b      0    NaN
1   b      1    NaN
2   a      2    NaN
3   c      3    NaN
4   a      4    NaN
5   a      5    NaN
6   b      6    NaN
0   a    NaN      0
1   b    NaN      1
2   d    NaN      2

Explanation:

The rows of `df2` are appended after the rows of `df1`, and each row keeps its original index label, so the labels 0 to 2 appear twice.
NaN marks a column that does not exist in the DataFrame the row came from.
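A minimal sketch of the concatenation described above, with `df1` and `df2` reconstructed from the printed table (their exact original definitions are an assumption). When you run it, pandas will likely print `data1` and `data2` as floats such as `0.0`, because a column that contains NaN is stored with a float dtype.

```python
import pandas as pd

# Hypothetical inputs reconstructed from the output table above.
df1 = pd.DataFrame({"Key": ["b", "b", "a", "c", "a", "a", "b"], "data1": range(7)})
df2 = pd.DataFrame({"Key": ["a", "b", "d"], "data2": range(3)})

# Stack df2 below df1; columns missing from one frame are filled with NaN.
combined = pd.concat([df1, df2])
print(combined)

# ignore_index=True discards the original labels and renumbers rows 0..9.
renumbered = pd.concat([df1, df2], ignore_index=True)
print(renumbered)
```

Use `ignore_index=True` when the repeated index labels carry no meaning and you want a clean, unique index on the result.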
