Plotting Salary vs. Names
Explore the process of reading linked data from text files and using NumPy and Matplotlib to plot salaries against names. Understand handling numerical data types, creating arrays, managing x-axis labels with string names, and refining plots by excluding extreme values for clearer comparisons.
Reading data from a file
We have the following two files:
- names.txt, which contains names.
- salaries.txt, which contains salaries.
Let’s look at the files. We’re using the Linux cat command, which prints the file contents on the screen.
Let’s run the following commands in the terminal below and see what happens.
cat names.txt
cat salaries.txt
The data in the two files are linked by position: Fluffy has a salary of 0, John has a salary of 100, and so on. We’re using two separate files to show different ways of reading data.
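To make the linkage concrete, here is a minimal sketch of pairing entries by position. It uses hypothetical in-memory lists rather than the actual files, and only the two name/salary pairs mentioned above. The variable names are assumptions for illustration.

```python
# Hypothetical stand-ins for the contents of names.txt and salaries.txt.
# Only the first two pairs are taken from the text above.
names = ["Fluffy", "John"]
salaries = [0, 100]

# Entries are linked by position: names[i] belongs with salaries[i].
pairs = list(zip(names, salaries))

for name, salary in pairs:
    print(f"{name}: {salary}")
```

Because the link is purely positional, both files need the same number of entries in the same order for the pairing to stay correct.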
import numpy as np
salary = np.fromfile("salaries.txt", dtype=int, sep=",")
NumPy arrays are better suited to numerical work than standard Python lists because every element in an array shares a single numeric data type, and we get to choose that type. Python lists are general-purpose containers that can hold any kind of object. While they can store numbers, they aren’t optimized for numerical processing. NumPy arrays are. We can store values using 8, 16, 32, or 64 bits, and we can choose integers or floats. NumPy handles all the conversion and processing internally. In the line above, we’re setting dtype=int. This tells NumPy that ...