Query Languages and Modeling

Explore how Amazon Neptune supports three query languages—Gremlin, openCypher, and SPARQL—each tied to a specific graph model. Understand schema-light modeling principles, learn to design efficient traversals, and apply tuning techniques to optimize query performance for real-world graph workloads.

We'll cover the following...

Gremlin, openCypher, and SPARQL
Schema-light modeling in Neptune
- Modeling principles for property graphs
- The supernode problem
Traversal design and performance
- Selectivity and starting position
- Depth vs. breadth
Query tuning thinking for Neptune
Conclusion

With Neptune's cluster architecture, secure connectivity, and S3-based bulk loading already in place, the operational question shifts from infrastructure to expression. How do you describe the graph logic you want Neptune to execute? Neptune supports three query languages, but each is bound to a specific graph model. Selecting the wrong model is often a costlier architectural mistake than writing a slow query. This lesson walks through Gremlin, openCypher, and SPARQL as distinct paradigms, then connects language choice to schema-light modeling, traversal design, and the performance thinking that underpins every well-built graph workload.

The arc is deliberate. First, you will see how each language works and what graph model it assumes. Then you will learn why Neptune's lack of enforced schema does not excuse careless modeling. Finally, you will understand how traversal shape drives latency and how to tune queries before workload-specific patterns like fraud detection or recommendation engines enter the picture in the next lesson.

Gremlin, openCypher, and SPARQL

Each of Neptune's three query languages reflects a fundamentally different way of thinking about graph data. Understanding the paradigm behind each language matters more than memorizing syntax because the paradigm determines how you model data, how the engine executes your request, and where performance bottlenecks appear.

Gremlin as imperative traversal

Gremlin is a traversal-oriented language for property graphs. In this paradigm, the developer explicitly programs each step the engine takes through the graph, controlling the order of navigation, filtering, and projection. The developer writes a chain of steps that tell Neptune exactly how to walk through vertices and edges. A query like g.V().has('Person', 'name', 'Alice').out('KNOWS').values('name') starts at vertices labeled Person, filters to those whose name is Alice, follows outbound KNOWS edges, and returns the name property of each neighbor. Every step is explicit, giving the developer fine-grained control over traversal order and intermediate filtering. This imperative style is powerful when you need to shape complex, multi-hop paths where the order of operations matters.
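To make the step-by-step nature of the traversal concrete, here is a minimal Python sketch that mimics each Gremlin step over a tiny in-memory graph. It is not the Neptune or TinkerPop API: the helper functions V, has, out, and values, and the sample vertices and edges, are hypothetical stand-ins chosen only to mirror the steps in the query described above.

```python
# Toy in-memory property graph. This is an illustration only, NOT the
# Neptune or TinkerPop API; all data and helper names are assumptions.
vertices = {
    1: {"label": "Person", "name": "Alice"},
    2: {"label": "Person", "name": "Bob"},
    3: {"label": "Person", "name": "Carol"},
    4: {"label": "City", "name": "Alice"},  # same name, different label
}
edges = [
    (1, "KNOWS", 2),
    (1, "KNOWS", 3),
    (1, "LIVES_IN", 4),
    (2, "KNOWS", 3),
]


def V():
    """Mimics g.V(): start with every vertex id."""
    return list(vertices)


def has(ids, label, key, value):
    """Mimics has(label, key, value): keep vertices matching label and property."""
    return [
        v for v in ids
        if vertices[v]["label"] == label and vertices[v].get(key) == value
    ]


def out(ids, edge_label):
    """Mimics out(edge_label): follow outbound edges with the given label."""
    return [
        dst
        for v in ids
        for (src, lbl, dst) in edges
        if src == v and lbl == edge_label
    ]


def values(ids, key):
    """Mimics values(key): project one property from each vertex."""
    return [vertices[v][key] for v in ids if key in vertices[v]]


# Each step consumes the output of the previous one, in the order written.
step1 = V()
step2 = has(step1, "Person", "name", "Alice")
step3 = out(step2, "KNOWS")
names = values(step3, "name")
print(names)
```

The point of the sketch is the data flow: the result of each step is the only input to the next, so reordering the steps changes both the work done and, potentially, the answer.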

openCypher as declarative pattern matching

openCypher takes the opposite approach. Instead of prescribing steps, you describe the shape of the subgraph you want to find using an ASCII-art syntax. A query like MATCH (a:Person {name: 'Alice'})-[:KNOWS]->(b) RETURN b.name declares a pattern, and Neptune's query planner decides the most efficient execution order. Teams familiar with SQL often find openCypher's declarative feel more natural. The trade-off is that you cede control over execution order to the planner, which is usually beneficial but can produce suboptimal plans when patterns are loosely constrained.
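The idea that one declared pattern can be executed in more than one order can be sketched in Python. The example below is a toy model, not Neptune's planner: the sample data, the adjacency index, and the two functions plan_anchor_first and plan_edge_scan are hypothetical, and the "cost" is simply the number of edges each strategy touches.

```python
from collections import defaultdict

# Toy data: person id -> name, and directed KNOWS edges (source, target).
# Everything here is an assumption made for illustration.
people = {1: "Alice", 2: "Bob", 3: "Carol", 4: "Dave"}
knows = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 2)]

# Adjacency index: source id -> list of target ids.
out_index = defaultdict(list)
for src, dst in knows:
    out_index[src].append(dst)


def plan_anchor_first(name):
    """Find the named person first, then expand only that person's edges."""
    edges_touched = 0
    rows = []
    for pid, pname in people.items():
        if pname == name:
            for dst in out_index[pid]:
                edges_touched += 1
                rows.append(people[dst])
    return sorted(rows), edges_touched


def plan_edge_scan(name):
    """Scan every KNOWS edge, then filter on the source person's name."""
    edges_touched = 0
    rows = []
    for src, dst in knows:
        edges_touched += 1
        if people[src] == name:
            rows.append(people[dst])
    return sorted(rows), edges_touched


rows_a, cost_a = plan_anchor_first("Alice")
rows_b, cost_b = plan_edge_scan("Alice")
print(rows_a, cost_a)
print(rows_b, cost_b)
```

Both strategies return the same rows for the same pattern, but the anchored strategy touches fewer edges. With a declarative language, choosing between such strategies is the planner's job, which is why a loosely constrained pattern leaves it less information to choose well.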

SPARQL

...
