...

/

Schemaless Design in PostgreSQL

Schemaless Design in PostgreSQL

Learn about schemaless design and how PostgreSQL manages giant documents.

An area where the NoSQL systems have been prominent is in breaking with the normalization rules and the hard step of modeling a database schema. Instead, most NoSQL systems will happily manage any data the application sends through. This is called the schemaless approach.

In truth, there’s no such thing as a schemaless design, actually. What it means is that the name and type of the document properties, or fields, are hard-coded into the application code.

JSON data set

A readily available JSON dataset is provided at MTGJSON that provides Magic: The GatheringTM card data in JSON format, using the CC0 license. We can load it easily given this table definition:

Press + to interact
begin;
drop schema if exists magic cascade;
create schema magic;
create table magic.allsets(data jsonb);
commit;

Then we use a small Python script:

Press + to interact
#! /usr/bin/env python3
import psycopg2
PGCONNSTRING = "user=postgres dbname=magic"
if __name__ == '__main__':
pgconn = psycopg2.connect(PGCONNSTRING)
curs = pgconn.cursor()
allset = open('MagicAllSets.json').read()
allset = allset.replace("'", "''")
sql = "insert into magic.allsets(data) values('%s')" % allset
curs.execute(sql)
pgconn.commit()
pgconn.close()

Giant documents and PostgreSQL

The giant JSON document in a single table doesn’t represent the kind of schemaless design. It goes too far to push a 27 MB document containing collections of cards ...