Schemaless Design in PostgreSQL
Explore the concept of schemaless design in PostgreSQL by using JSON data types and GIN indexes. Understand how to manage flexible document structures, query large JSON datasets efficiently, and the trade-offs regarding data quality when using schemaless approaches. Gain practical experience using PostgreSQL to handle semistructured data with example queries and scripts.
We'll cover the following...
An area where the NoSQL systems have been prominent is in breaking with the normalization rules and the hard step of modeling a database schema. Instead, most NoSQL systems will happily manage any data the application sends through. This is called the schemaless approach.
In truth, there’s no such thing as a schemaless design, actually. What it means is that the name and type of the document properties, or fields, are hard-coded into the application code.
JSON data set
A readily available JSON dataset is provided at MTGJSON that provides Magic: The GatheringTM card data in JSON format, using the CC0 license. We can load it easily given this table definition:
Then we use a small Python script:
Giant documents and PostgreSQL
The giant JSON document in a single table doesn’t represent the kind of schemaless design. It goes ...