Introduction to Data Manipulation and Concurrency Control
Explore data manipulation techniques and concurrency control using PostgreSQL. Understand how to structure normalized database models to prevent duplication and conflicts. Learn to apply INSERT, UPDATE, and DELETE commands with RETURNING clauses for efficient data handling.
Tweets dataset
We used a dataset of 200,000 tweets geolocated in the USA, stored in a very simple data model. The model is a direct port of the Excel sheet format, which made loading straightforward: we used the \copy command from the psql tool.
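As a rough sketch of what such a flat model and its loading step might look like, consider the following psql script. The table name, the abbreviated column list, the file name tweets.csv, and the CSV options are all assumptions made for illustration, not the actual contents of the dataset or of tweets.sql.

```sql
-- Hypothetical, abbreviated flat model: one column per spreadsheet column.
-- Account data (uname, nickname, bio) sits next to message data.
create table tweet
 (
   id         bigint,
   tweeted_on date,
   tweeted_at time,
   uname      text,
   nickname   text,
   bio        text,
   message    text,
   latitude   double precision,
   longitude  double precision
 );

-- psql meta-command: it reads the file on the client side and must fit on one line.
-- The file name and the CSV options are assumptions.
\copy tweet from 'tweets.csv' with (format csv, header true)
```

Because \copy is a psql meta-command rather than a SQL statement, this script only runs inside psql, and the file path is resolved on the client machine rather than on the database server.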
Database model and normalization
The database model in tweets.sql violates the normal forms introduced earlier:
- There’s neither a unique constraint nor a primary key, so there is nothing preventing the insertion of duplicate entries, violating 1NF.
- Some non-key attributes are not dependent on the key because we mix data from the Twitter account posting the message and the message itself, violating ...
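The two problems listed above can be illustrated with a minimal sketch. Every table name, column name, and sample value here is hypothetical, and the split into two tables is only one possible way to separate account data from message data. The identity column syntax assumes PostgreSQL 10 or later.

```sql
-- Hypothetical flat table with no primary key and no unique constraint.
create table tweet_flat
 (
   id      bigint,
   uname   text,
   bio     text,
   message text
 );

-- Both inserts succeed, so the same entry is now stored twice.
insert into tweet_flat values (1, 'alice', 'Coffee fan', 'Hello');
insert into tweet_flat values (1, 'alice', 'Coffee fan', 'Hello');

-- One possible split: account attributes live in their own table...
create table account
 (
   account_id bigint generated always as identity primary key,
   uname      text not null unique,
   bio        text
 );

-- ...and each message references the account that posted it.
create table message
 (
   message_id bigint primary key,
   account_id bigint not null references account(account_id),
   body       text not null
 );
```

With the primary key on message_id, a second insert reusing an existing message_id fails with a unique violation instead of silently creating a duplicate, and an account's bio is stored once rather than repeated on every message.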