Schemaless Design in PostgreSQL
Learn about schemaless design and how PostgreSQL manages giant documents.
We'll cover the following...
An area where the NoSQL systems have been prominent is in breaking with the normalization rules and the hard step of modeling a database schema. Instead, most NoSQL systems will happily manage any data the application sends through. This is called the schemaless approach.
In truth, there’s no such thing as a schemaless design, actually. What it means is that the name and type of the document properties, or fields, are hard-coded into the application code.
JSON data set
A readily available JSON dataset is provided at MTGJSON that provides Magic: The GatheringTM card data in JSON format, using the CC0 license. We can load it easily given this table definition:
begin;drop schema if exists magic cascade;create schema magic;create table magic.allsets(data jsonb);commit;
Then we use a small Python script:
#! /usr/bin/env python3import psycopg2PGCONNSTRING = "user=postgres dbname=magic"if __name__ == '__main__':pgconn = psycopg2.connect(PGCONNSTRING)curs = pgconn.cursor()allset = open('MagicAllSets.json').read()allset = allset.replace("'", "''")sql = "insert into magic.allsets(data) values('%s')" % allsetcurs.execute(sql)pgconn.commit()pgconn.close()
Giant documents and PostgreSQL
The giant JSON document in a single table doesn’t represent the kind of schemaless design. It goes too far to push a 27 MB document containing collections of cards ...