Merging in a DataFrame

Let's see how to merge DataFrame objects.

pandas provides a SQL-style join function, merge(). It is a high-performance, in-memory join operation. If you are familiar with SQL, you know that joining two tables requires an ON clause; merge() takes the equivalent through its on parameter.

Two DataFrames are merged on one or more key columns that exist in both of them; rows are matched where the values in those columns are equal.
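As a minimal sketch of merging on a shared column, the example below joins two small DataFrames. The data and the names employees, salaries, and emp_id are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Hypothetical sample data; the frame and column names are illustrative only.
employees = pd.DataFrame({
    "emp_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Cho"],
})
salaries = pd.DataFrame({
    "emp_id": [2, 3, 4],
    "salary": [5000, 6000, 7000],
})

# Join on the "emp_id" column, which exists in both frames.
# When how is not given, merge() performs an inner join.
merged = pd.merge(employees, salaries, on="emp_id")
print(merged)
```

Only emp_id values 2 and 3 appear in both frames, so only those two rows survive the default inner join.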

The how parameter of merge() determines which keys are included in the final table. If a key combination is missing from the left or the right table, the columns that come from that table are filled with NaN in the joined table. Below is a summary of the how options and their SQL equivalents.

how      SQL equivalent      Description
left     LEFT OUTER JOIN     Use keys from left frame only
right    RIGHT OUTER JOIN    Use keys from right frame only
outer    FULL OUTER JOIN     Use union of keys from both frames
inner    INNER JOIN          Use intersection of keys from both frames
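To compare the four options side by side, here is a small sketch that runs the same merge once per how value. The frames left and right and their column names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical frames: "a" exists only on the left, "d" only on the right.
left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Run the same merge with each how option and keep the results by name.
results = {
    how: pd.merge(left, right, on="key", how=how)
    for how in ("left", "right", "outer", "inner")
}

for how, frame in results.items():
    print(f"how={how!r}")
    print(frame, end="\n\n")
```

In the left join, the row for key "a" has NaN in right_val because that key is absent from the right frame; the right join shows the mirror case for key "d".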
