Natality Streaming

Creating a streaming pipeline with Dataflow using the Natality dataset.

PubSub can serve as both a data source and a data sink within a Dataflow pipeline: a consumer acts as a data source, and a publisher acts as a data sink.

Example

We’ll reuse the Natality dataset to create a pipeline with Dataflow, but for the streaming version, we’ll use a PubSub consumer as the input data source rather than a BigQuery result set.
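As a rough illustration of the input side, the sketch below builds a streaming pipeline that reads from a PubSub topic with the Apache Beam Python SDK. It is not the course's own code: the helper names, the topic path, and the assumption that each message is a UTF-8 JSON object are placeholders chosen for the example.

```python
import json


def parse_message(payload: bytes) -> dict:
    """Decode one PubSub payload, assumed here to be UTF-8 encoded JSON."""
    return json.loads(payload.decode("utf-8"))


def build_pipeline(topic: str, pipeline_args=None):
    """Build (but do not run) a streaming pipeline that reads from PubSub.

    `topic` has the form "projects/<project>/topics/<topic>"; substitute
    your own project and topic names.
    """
    # Imported here so parse_message can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import (
        PipelineOptions,
        StandardOptions,
    )

    options = PipelineOptions(pipeline_args)
    # PubSub is an unbounded source, so the pipeline runs in streaming mode.
    options.view_as(StandardOptions).streaming = True

    pipeline = beam.Pipeline(options=options)
    records = (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic=topic)
        | "ParseJson" >> beam.Map(parse_message)
    )
    return pipeline, records
```

The main difference from a batch pipeline is the source transform and the streaming option; the downstream transforms receive one parsed record per message instead of one row per BigQuery result.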

Defining functions

For the output, we’ll publish predictions to Datastore, reusing the DoFn for publishing from the previous chapter.
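The DoFn from the previous chapter is not reproduced in this lesson, so the following is only a sketch of what a Datastore-publishing DoFn could look like with the `google-cloud-datastore` client. The kind name `natality-prediction` and the `guid`, `weight`, and `time` fields are hypothetical and should be replaced with whatever the earlier chapter defined.

```python
def to_entity_fields(prediction: dict) -> dict:
    """Select the values to store; the field names are illustrative only."""
    return {
        "weight": float(prediction["weight"]),
        "time": str(prediction["time"]),
    }


def make_publish_dofn(kind: str = "natality-prediction"):
    """Return a DoFn that writes each prediction to Datastore.

    Requires apache-beam and google-cloud-datastore to be installed.
    `kind` is a hypothetical Datastore kind name.
    """
    import apache_beam as beam

    class PublishDoFn(beam.DoFn):
        def setup(self):
            # Create the client once per DoFn instance, not once per element.
            from google.cloud import datastore

            self._datastore = datastore
            self._client = datastore.Client()

        def process(self, element):
            # Use the record's guid as the entity name within the kind.
            key = self._client.key(kind, element["guid"])
            entity = self._datastore.Entity(key=key)
            entity.update(to_entity_fields(element))
            self._client.put(entity)

    return PublishDoFn()
```

In a pipeline, this would be applied as the final step with `beam.ParDo(make_publish_dofn())`, so that every prediction produced upstream results in one Datastore write.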
