Trusted answers to developer questions

Related Tags

definition
data
data science
databases
big data

What is a data pipeline?

Educative Answers Team

A data pipeline, or simply a pipeline, is a series of data processing steps. First, data is ingested at the beginning of the pipeline. It then passes through a series of steps in which the output of one step becomes the input of the next. This continues until the last step has run. The steps of a pipeline are often executed in parallel or in a time-sliced fashion.
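
The chaining described above can be sketched in a few lines of Python. This is only an illustration, not a production design: the helper `run_pipeline` and the three step functions are hypothetical names invented for this example, and the steps run one after another rather than in parallel.

```python
from functools import reduce
from typing import Callable, List

# A step takes a list of records and returns a new list of records.
Step = Callable[[List[str]], List[str]]


def run_pipeline(data: List[str], steps: List[Step]) -> List[str]:
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda current, step: step(current), steps, data)


# Hypothetical steps, chosen only for illustration.
def strip_whitespace(records: List[str]) -> List[str]:
    return [r.strip() for r in records]


def drop_empty(records: List[str]) -> List[str]:
    return [r for r in records if r]


def to_upper(records: List[str]) -> List[str]:
    return [r.upper() for r in records]


raw = ["  alice ", "", "bob  ", "   "]
result = run_pipeline(raw, [strip_whitespace, drop_empty, to_upper])
print(result)  # ['ALICE', 'BOB']
```

Because every step has the same shape (records in, records out), steps can be added, removed, or reordered without touching `run_pipeline`.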

Data pipelines consist of three key elements: a source, one or more processing steps, and a destination. The source may be a database, an application, or a cloud service. The destination may be a data consumer, such as a machine learning or data visualization algorithm, or even another database.
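
As a rough sketch of these three elements, the example below uses Python generators so that records flow from source to destination one at a time. Everything here is assumed for illustration: the `source` function stands in for a real database or application, and the `warehouse` list stands in for a real destination.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def source() -> Iterator[Row]:
    """Hypothetical source: stands in for a database or an application."""
    yield {"user": "ana", "amount": 12.5}
    yield {"user": "ben", "amount": -3.0}
    yield {"user": "cy", "amount": 40.0}


def process(rows: Iterable[Row]) -> Iterator[Row]:
    """Processing step: drop non-positive amounts and add a derived field."""
    for row in rows:
        if row["amount"] > 0:
            yield {**row, "amount_cents": round(row["amount"] * 100)}


def destination(rows: Iterable[Row], sink: List[Row]) -> int:
    """Hypothetical destination: append rows to an in-memory list."""
    count = 0
    for row in rows:
        sink.append(row)
        count += 1
    return count


warehouse: List[Row] = []
written = destination(process(source()), warehouse)
print(written)  # 2
```

Nothing is read from the source until the destination starts consuming rows, which keeps memory use low when the source is large.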

Data pipelines enable the flow of data between systems: for example, from an application to a data warehouse, from a data lake to an analytics database, or into a payment processing system.

Common processing steps in data pipelines include transformation, augmentation, enrichment, filtering, grouping, aggregation, and running algorithms against the data.
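
A few of these steps (filtering, enrichment, grouping, and aggregation) are shown in the minimal sketch below. The `orders` records and the `REGION_BY_COUNTRY` lookup are made-up sample data, not taken from any real system.

```python
from collections import defaultdict

# Made-up sample records for illustration only.
orders = [
    {"country": "DE", "amount": 20.0, "status": "paid"},
    {"country": "FR", "amount": 15.0, "status": "cancelled"},
    {"country": "DE", "amount": 5.0, "status": "paid"},
    {"country": "FR", "amount": 30.0, "status": "paid"},
]

# Hypothetical lookup table used to enrich each record.
REGION_BY_COUNTRY = {"DE": "EU", "FR": "EU"}

# Filtering: keep only paid orders.
paid = [o for o in orders if o["status"] == "paid"]

# Enrichment: attach a region looked up from another data set.
enriched = [
    {**o, "region": REGION_BY_COUNTRY.get(o["country"], "unknown")}
    for o in paid
]

# Grouping and aggregation: total paid amount per country.
totals = defaultdict(float)
for o in enriched:
    totals[o["country"]] += o["amount"]
totals_by_country = dict(totals)

print(totals_by_country)  # {'DE': 25.0, 'FR': 30.0}
```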

Copyright ©2022 Educative, Inc. All rights reserved