Neptune ML

Explore how Neptune ML integrates Amazon Neptune with SageMaker and graph neural networks to provide learned predictions from graph data. Learn about its pipeline stages, supported prediction tasks like node and edge classification, regression, and link prediction, plus operational and security considerations. This lesson helps you understand how to operationalize machine learning predictions within graph queries for real-time or batch inference scenarios.

We'll cover the following...

The Neptune ML workflow
- Stage-by-stage breakdown
Graph neural networks and task families
- Supported inference task families
Operationalizing predictions in queries
- Practical query pattern
  - Online vs. batch inference
Security and production considerations
Conclusion

While Neptune Analytics provides deterministic, in-memory graph algorithms for investigatory workloads, many production scenarios demand something different: learned predictions that generalize from graph topology and node features to answer questions the data does not explicitly contain. Neptune ML fills this gap by integrating Amazon Neptune with Amazon SageMaker AI and the Deep Graph Library (DGL), an open-source framework optimized for building and training graph neural networks on large-scale graph-structured data, to deliver machine learning predictions directly through graph queries.

Neptune ML is not a standalone training system embedded inside the database engine. Instead, it orchestrates an external pipeline that exports graph data, trains graph neural network models on SageMaker infrastructure, and surfaces predictions through Neptune's Gremlin or SPARQL query interface. The key distinction from Neptune Analytics is the kind of answer each produces. Analytics runs deterministic algorithms such as PageRank or shortest path in memory, producing exact answers. Neptune ML, by contrast, produces probabilistic predictions learned from graph structure and feature signals, estimating outcomes that are not yet recorded in the graph.
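The pipeline described above can be sketched as an ordered list of stages. This is a minimal illustration, not an official client: the `Stage` class, the `PIPELINE` list, and the `management_url` helper are hypothetical names introduced here. The sketch also assumes a data processing step and an endpoint creation step between export and querying, and assumes that Neptune exposes its ML management paths (`/ml/dataprocessing`, `/ml/modeltraining`, `/ml/endpoints`) on the cluster's default port 8182. The host name is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One step of the Neptune ML pipeline (illustrative model only)."""
    name: str
    runs_on: str
    api_path: Optional[str]  # Neptune ML management path, if the stage has one


# Ordered stages; names and grouping are assumptions made for illustration.
PIPELINE = [
    Stage("export graph data", "Neptune export tooling", None),
    Stage("process data", "SageMaker processing job", "/ml/dataprocessing"),
    Stage("train GNN model", "SageMaker training job", "/ml/modeltraining"),
    Stage("create inference endpoint", "SageMaker hosted endpoint", "/ml/endpoints"),
    Stage("query predictions", "Neptune (Gremlin or SPARQL)", None),
]


def management_url(neptune_host: str, stage: Stage, port: int = 8182) -> str:
    """Build the management URL for a stage that has an API path."""
    if stage.api_path is None:
        raise ValueError(f"stage {stage.name!r} has no management API path")
    return f"https://{neptune_host}:{port}{stage.api_path}"


if __name__ == "__main__":
    for stage in PIPELINE:
        print(f"{stage.name:28s} -> {stage.runs_on}")
```

Note that only the last stage runs inside a graph query. Everything before it happens outside the database engine, which is why Neptune ML is better understood as an orchestrated pipeline than as an embedded training system.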

Note: Neptune ML does not replace Neptune Analytics, or vice versa. Production architectures often combine both, using Analytics for algorithmic scoring and ML for predictive inference, depending on whether the question requires a computed answer or a learned one.

Several key terms anchor this lesson. Graph neural networks (GNNs), a class of deep learning models that learn node and edge representations by iteratively aggregating information from graph neighborhoods, form the model architecture. SageMaker training jobs handle compute-intensive model fitting. SageMaker hosted inference endpoints serve predictions at query time. The four core inference task families (node classification, edge classification, regression, and link prediction) define what Neptune ML can predict.
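To make the task families concrete, the sketch below builds Gremlin query strings that request predictions from a hosted endpoint. It assumes the Neptune ML Gremlin predicates `Neptune#ml.endpoint`, `Neptune#ml.classification`, `Neptune#ml.regression`, and `Neptune#ml.prediction`. The helper functions, endpoint names, vertex IDs, and property names are all hypothetical, and the code only assembles text. It does not connect to a cluster, and it assumes its inputs are trusted values rather than user input.

```python
def property_prediction_query(endpoint: str, element: str, element_id: str,
                              prop: str, task: str) -> str:
    """Build a classification or regression query for a node ("V") or edge ("E").

    Purely illustrative string assembly; inputs are assumed to be trusted.
    """
    if element not in ("V", "E"):
        raise ValueError("element must be 'V' (node) or 'E' (edge)")
    if task not in ("classification", "regression"):
        raise ValueError("task must be 'classification' or 'regression'")
    return (
        f'g.with("Neptune#ml.endpoint", "{endpoint}")'
        f'.{element}("{element_id}")'
        f'.properties("{prop}").with("Neptune#ml.{task}")'
    )


def link_prediction_query(endpoint: str, source_id: str,
                          edge_label: str, target_label: str) -> str:
    """Build a link prediction query: likely targets of a not-yet-recorded edge."""
    return (
        f'g.with("Neptune#ml.endpoint", "{endpoint}")'
        f'.V("{source_id}")'
        f'.out("{edge_label}").with("Neptune#ml.prediction")'
        f'.hasLabel("{target_label}")'
    )


if __name__ == "__main__":
    # Hypothetical endpoint and graph identifiers, for illustration only.
    print(property_prediction_query("node-cls-endpoint", "V", "movie_1",
                                    "genre", "classification"))
    print(link_prediction_query("lp-endpoint", "user_1", "rated", "movie"))
```

In each case the traversal looks like an ordinary Gremlin query. The `with` predicates are what redirect the property lookup or edge traversal to the model endpoint instead of the stored graph data.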

The following diagram illustrates how data flows through the Neptune ML pipeline from the graph store to application-consumable predictions.

The Neptune ML workflow