Why good organization is needed

Good organization is a prerequisite for a readable and maintainable piece of software. How we organize code, configuration, and other files will affect how easily another person can understand and modify our code.

We can write all our code in a single file and hardcode it with configuration values and magic numbers, but this makes it extremely difficult for someone who is new to the project to parse what we’ve written. Most data scientists work in a team environment, so our objective is writing code that is not only functional and free of defects but also readable and maintainable by others. A logically sound directory structure is the perfect starting point for good code organization.

A well-designed directory structure achieves the following:

It makes it easy for others, especially those new to the project, to understand the architecture of the system. ...

Introduction

Getting Started

Structuring the ML Pipeline

Directed Acyclic Graphs (DAGs)

The ML Library

Create Your First Data Pipeline with a Dashboard

The Pipeline Core

Extending the Pipeline

Build a News ETL Data Pipeline Using Python and SQLite

Testing

Deployment

Other Considerations

Wrapping Up

Appendix

Final Assessment

Directory Structure

Why good organization is needed