Write Production Code

Learn to set up production-level code in PySpark and pandas.

Overview

In production environments, code must be easy for the entire team to understand and maintain. A few general guidelines help with this:

  • Keep variables out of the global scope as much as possible. Any function can modify a global variable, so limiting them reduces bugs caused by unexpected mutation.

  • Add a docstring to each function definition. This lets readers understand what the function does without reading its body.

  • Use type annotations (type hints) for function arguments and return values. This helps co-developers use your code as an API.
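
As a minimal sketch of the three guidelines together, the example below uses plain Python so that it runs without pandas or PySpark installed. The function name `mean_price` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import List


def mean_price(prices: List[float]) -> float:
    """Return the arithmetic mean of a list of prices.

    Args:
        prices: A non-empty list of prices.

    Returns:
        The mean of the given prices.

    Raises:
        ValueError: If ``prices`` is empty.
    """
    if not prices:
        raise ValueError("prices must not be empty")
    return sum(prices) / len(prices)


def main() -> None:
    # The data lives in a local variable rather than at module level,
    # so no other function can change it by accident.
    prices = [10.0, 20.0, 30.0]
    print(mean_price(prices))


if __name__ == "__main__":
    main()
```

The type hints and the docstring describe the contract of `mean_price`, so a co-developer can call it without reading the implementation.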

Create production-level code

Let’s look at an example of creating production-level code, using the code we’ve written so far. First, we move the general-purpose variables and functions into a separate file so that we can reuse the code in other projects that use the same data source.
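
The sketch below shows one possible shape for such a shared file. It is an assumption for illustration, not the lesson's actual code: the file name `common.py`, the constant `DATA_DIR`, and the helper `data_path` are all hypothetical, and only the standard library is used so that the block runs on its own.

```python
# common.py (hypothetical file name for the shared module)
from pathlib import Path

# Upper-case names mark module-level constants that should be read, not mutated.
# The directory name "data" is an assumed location, not taken from the lesson.
DATA_DIR: Path = Path("data")


def data_path(file_name: str, base_dir: Path = DATA_DIR) -> Path:
    """Return the full path of a file inside the shared data directory.

    Args:
        file_name: Name of the data file, for example ``"sales.csv"``.
        base_dir: Directory that holds the data files.

    Returns:
        The path formed by joining ``base_dir`` and ``file_name``.
    """
    return base_dir / file_name
```

Another project that reads the same data source could then write `from common import data_path` instead of redefining the path logic.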
