With a table created with this command:
CREATE TABLE people (
age INTEGER NOT NULL,
name TEXT NOT NULL
);
We can have columns that repeat the same value. The DBMS does not care.
But, we can force a column to have only unique values using the UNIQUE
key constraint:
DROP TABLE people;
CREATE TABLE people (
age INTEGER NOT NULL,
name TEXT NOT NULL UNIQUE
);
Now if you try to add the ‘Flavio’ twice:
INSERT INTO people (age, name) VALUES (39, 'Flavio');
INSERT INTO people (age, name) VALUES (38, 'Flavio');
You’d get an error:
A primary key is a unique key that has another property: it’s the primary way we identify a row in the table.
What we usually do (but not always) is we use the SERIAL
type for the primary key, which means it’s a unique INTEGER
value that we assign to each record and we increment it automatically any time we add as new record.
DROP TABLE people;
CREATE TABLE people (
id SERIAL PRIMARY KEY,
age INTEGER NOT NULL,
name TEXT NOT NULL
);
So when we add an item we can omit its value, because the DBMS knows its value:
INSERT INTO people (age, name) VALUES (39, 'Flavio');
INSERT INTO people (age, name) VALUES (20, 'Roger');
id | age | name
---+-----+------
1 | 39 | 'Flavio'
2 | 20 | 'Roger'
It’s important to define a primary key so:
- we always know we can uniquely identify a row in the table
- we can create relationships between tables, so a table can reference a row into another table
- we can have data integrity and referential integrity (something we’ll not talk about, but it’s important to note)
- and last but not least, it’s important for speed, so the database can figure out how to optimize your table