Why Python?

Learn what Python is and how to install it.

What is Python?

Python is a general-purpose computer programming language often used to build websites and software, automate tasks, and conduct data analysis. Because it is general-purpose, it can be used to create many kinds of programs and isn’t specialized for any specific problem.

Python distributions

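Python can be deceptively difficult to install. The problem is that many libraries, like NumPy and SciPy, are written in Fortran or C because they were ported from old libraries. This means they contain code that must be compiled for each platform, which is where installations tend to fail.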

Most users of online platforms like Stack Overflow and Reddit are web developers. They work in a highly sterilized environment, where every library they use is written in pure Python and is therefore easy to install. This means that if we get stuck installing Python and seek help, we might not find reliable answers.

When writing this course, we looked at many solutions to this problem. The best one we found was the Anaconda Python distribution. If you use Windows, we especially recommend using this distribution. It comes pre-installed with most engineering libraries (the notable exception is OpenCV).
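Whichever distribution we choose, it helps to confirm which libraries are actually available. The following is a minimal sketch using only the standard library. The helper name is_installed is our own, and the list holds the usual import names of the libraries covered in this lesson (OpenCV is imported as cv2).

```python
import importlib.util

# Import names of the libraries discussed in this lesson.
# Note that OpenCV is imported under the name "cv2".
libraries = ["numpy", "scipy", "matplotlib", "pandas", "cv2"]


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Build a simple report of what this Python installation provides.
status = {name: is_installed(name) for name in libraries}
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

On a fresh Anaconda installation, we would expect everything except cv2 to be reported as installed.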

SciPy / NumPy

SciPy/NumPy is a megaproject that contains everything under the sun for scientific computing.

Standard Python has only one integer type and one floating-point type. However, we know that integers can be 8, 16, 32, or 64-bit, and floating-point numbers are usually stored as 64-bit. Fixed-width types like these were added with the help of SciPy. However, SciPy was a massive project, so a decision was made to extract some parts into a library called NumPy.
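To make the difference concrete, here is a small sketch comparing a plain Python integer with NumPy arrays that carry an explicit element type. It assumes NumPy is installed, and the variable names are only illustrative.

```python
import numpy as np

# A plain Python integer has no fixed width; it grows as needed.
big = 2 ** 100
print(big.bit_length())  # number of bits needed to store the value

# NumPy arrays carry an explicit, fixed-width element type (dtype).
small = np.array([1, 2, 3], dtype=np.int8)   # 8-bit integers
wide = np.array([1, 2, 3], dtype=np.int64)   # 64-bit integers
reals = np.array([1.0, 2.0, 3.0])            # floats default to 64-bit

# itemsize is the number of bytes used by each element.
print(small.dtype, small.itemsize)
print(wide.dtype, wide.itemsize)
print(reals.dtype, reals.itemsize)
```

Choosing a smaller dtype reduces memory use, at the cost of a smaller range of values that each element can hold.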

NumPy is what we’ll use 90% of the time. When NumPy does not have the required functionality, we’ll look into SciPy.

Many people get confused between SciPy and NumPy. What we need to keep in mind is that they are two parts of the same project: NumPy provides the core array functionality, and SciPy builds on top of it.

Another important library from the SciPy family that we’ll often use is Matplotlib (or Pylab). This is a graphing library that was based initially on MATLAB (as is most of SciPy). For those who haven’t heard of it, MATLAB is one of the most widely used packages for scientific computing. It is also very expensive and has a complicated licensing scheme, which means most companies buy only a limited number of copies. The only place we’ll commonly find MATLAB installed is at universities!

Pandas

Pandas is a recent addition to Python. It was originally written to replace the functionality of R and is used mainly in the financial world for statistical computing.

NumPy is okay if we are working with simple text or CSV files. However, because there is so much low-level work in NumPy, handling files from applications like Microsoft Excel, or other large and complicated files, quickly becomes unmanageable. Pandas takes care of that for us.
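As a rough illustration of what Pandas handles for us, the sketch below loads a small table that mixes text and numbers and adds a computed column. It assumes Pandas is installed. The CSV content and column names are made up for this example, and io.StringIO stands in for a real file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content, standing in for a file on disk.
csv_text = """name,voltage,current
sensor_a,3.3,0.5
sensor_b,5.0,1.2
sensor_c,12.0,0.8
"""

# read_csv parses the header row and infers a type for each column.
df = pd.read_csv(io.StringIO(csv_text))

# Columns are addressed by name, and arithmetic applies to every row.
df["power"] = df["voltage"] * df["current"]

print(df)
print(df["power"].max())
```

Pandas also provides read_excel for spreadsheet files, although that function needs an additional Excel reader package to be installed.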

If we use the Anaconda distribution, SciPy and Pandas will be preinstalled. Otherwise, we’ll have to install Pandas ourselves.

OpenCV

OpenCV is a huge open-source library for computer vision, machine learning, and image processing. We will find most examples for OpenCV in C/C++, though the usage of Python is also increasing.

Other libraries

There are some other libraries we should know about, like Virtualenv.

Virtualenv, short for virtual environment, is a great tool for installing different versions of libraries on the same machine. If we ever have code that only works with v0.2 of some library, Virtualenv will create an isolated Python environment that contains only that specific version. We can keep dozens of versions of the same library this way, and none of them will clash with one another.
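To show the idea of an isolated environment, here is a minimal sketch that creates one from Python itself. Note that it uses venv, the standard library module that offers similar functionality, rather than the Virtualenv package. The directory name legacy-env is hypothetical, and the environment is created in a temporary directory so that nothing is left behind.

```python
import tempfile
import venv
from pathlib import Path

# Create the environment in a throwaway directory that is deleted afterwards.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "legacy-env"  # hypothetical environment name

    # with_pip=False keeps creation fast and avoids bootstrapping pip.
    venv.create(env_dir, with_pip=False)

    # Every environment created by venv contains a pyvenv.cfg marker file.
    created = (env_dir / "pyvenv.cfg").exists()
    print(created)
```

In everyday use, we would create the environment in a permanent location with pip enabled, activate it, and then install the specific library version we need inside it.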