Introduction to the Relational Model: Keys, Constraints & Integrity (2026)

PerfectNotes TeamUpdated June 2026

Key Takeaways

Relational Model — Data is organized into relations (tables) made of tuples (rows) and attributes (columns), each with a strictly defined domain. Proposed by E.F. Codd in 1970.
Key Hierarchy — Super Key → Candidate Key → Primary Key. A foreign key links two relations. Choosing the right key type is the foundation of every database schema.
Integrity Rules — Entity integrity bans NULL in primary keys. Referential integrity ensures every foreign key points to a real parent row or is NULL.
Constraint Enforcement — On delete/update violations, DBMSs apply one of four actions: RESTRICT, CASCADE, SET NULL, or SET DEFAULT.
ER Mapping — Every ER diagram entity becomes a table; 1:N relationships become foreign keys; M:N relationships require a junction table.

What is the Relational Model?

In 1970, IBM researcher Edgar F. Codd published a landmark paper proposing a radically new way to store and query data — the relational model. Instead of navigating chains of physical pointers (as in the dominant network and hierarchical models), Codd proposed organizing everything into flat, self-describing tables, and letting a mathematical language called relational algebra do the querying.

The relational model became the foundation of SQL and every major database system in use today — PostgreSQL, MySQL, Oracle, SQL Server, and SQLite all implement it. Understanding it from first principles is the difference between writing queries by guessing and designing databases by understanding.

The model has four core components: domains (the legal value sets), attributes (named columns), tuples (rows), and relations (the tables themselves). Every piece of data in a relational database can be located by specifying its table, its row (via a key), and its column.

Formal Relational Model: Domains, Tuples & Relations

Before writing a single line of SQL, a database designer works at the mathematical level of the relational model. Every component has a precise definition that directly maps to what you see in a database tool.

Domain

A domain is the complete set of all possible atomic (indivisible) values that an attribute may hold. For example, the domain of a StudentAge attribute might be all integers between 5 and 150. The domain acts as the first line of data validation — the DBMS will reject any value outside it.

Crucially, domains enforce atomicity: values cannot be composite or multi-valued. An attribute like PhoneNumbers containing a comma-separated list violates the relational model and is a classic First Normal Form (1NF) violation.

Tuple

A tuple is an ordered list of values, one for each attribute in the relation, where each value must come from that attribute's domain (or be NULL if allowed). A tuple corresponds to one row in a database table — for example, the student record (101, 'Alice', 21, 'Computer Science').

Unlike rows in a file system, tuples in a relation are unordered — the relational model makes no promise about which row comes "first". Queries that assume ordering without an explicit ORDER BY clause are logically incorrect.

Relation Schema vs. Relation Instance

These two terms are consistently confused in exams and interviews. They describe the same table at two different levels:

Concept	Definition	Analogy	Changes Over Time?
Relation Schema	The structural blueprint: R(A₁, A₂, ..., Aₙ) with attribute names and domains	The column headers and data-type definitions	Rarely (only on ALTER TABLE)
Relation Instance	The current snapshot of all tuples in the relation at a given moment	The actual rows of data right now	Constantly (INSERT, UPDATE, DELETE)

Example: Student(StudentID: INT, Name: VARCHAR(100), Age: INT, Department: VARCHAR(50)) is the schema. The table currently containing 5,000 student rows is the instance.

Key Constraints: Super Key, Candidate Key, Primary Key & Foreign Key

Keys are the mechanism the relational model uses to uniquely identify tuples and link relations together. The four key types form a strict hierarchy from most general to most specific.

Super Key

A super key is any set of one or more attributes whose combined values uniquely identify every tuple in a relation. Super keys may contain redundant attributes. For a Student table with attributes (StudentID, Name, Email):

{StudentID} — a super key (unique on its own)
{StudentID, Name} — also a super key (redundant, but still unique)
{StudentID, Name, Email} — also a super key (the entire row is always unique)

Candidate Key

A candidate key is a minimal super key — removing any single attribute from it would destroy its uniqueness property. Every relation must have at least one candidate key. A table may have multiple candidate keys if multiple attributes (or attribute combinations) uniquely identify rows.

Example: In a Student table, both StudentID and Email might be candidate keys — each uniquely identifies a student on its own.

Primary Key

The primary key is the one candidate key that the database designer selects as the official row identifier. The choice is a design decision based on stability (values that rarely change), simplicity (prefer single-attribute keys), and meaningfulness.

Primary key rules enforced by every RDBMS:

Uniqueness: No two rows may share the same primary key value.
No NULL: Primary key attributes must never be NULL (entity integrity).
Immutability: By convention, primary key values should not change after insertion to avoid cascading reference problems.

Foreign Key

A foreign key is an attribute (or set of attributes) in one relation (the referencing or child relation) that references the primary key of another relation (the referenced or parent relation). Foreign keys are how the relational model represents relationships between tables without using physical pointers.

Example: An Enrollment table may have a StudentID foreign key referencing Student.StudentID, and a CourseID foreign key referencing Course.CourseID.

Integrity Constraints: Domain, Entity & Referential

Integrity constraints are rules enforced by the DBMS to guarantee that data always remains accurate and consistent, even when multiple users are inserting and updating simultaneously. There are three fundamental constraint types in the relational model:

1. Domain Constraint

A domain constraint specifies that each attribute value must be an atomic value drawn from that attribute's defined domain. It is the simplest constraint — the DBMS will reject any INSERT or UPDATE that tries to place a value outside the domain.

Examples: An Age column defined as INT CHECK (Age > 0 AND Age < 150) will reject -5 or 'twenty'. A Gender column defined as ENUM('M', 'F', 'Other') will reject any value not in that set.

2. Entity Integrity Constraint

The entity integrity constraint states: no attribute that is part of the primary key may take a NULL value. This rule guarantees that every tuple is uniquely identifiable. A row with a NULL primary key literally cannot be referenced by any foreign key, making it effectively invisible to any join.

3. Referential Integrity Constraint

The referential integrity constraint states: every non-NULL foreign key value in a child table must match exactly one primary key value in the referenced parent table. This prevents orphan records — rows in a child table that point to non-existent parents.

Example: If Enrollment.StudentID = 999 but there is no student with StudentID = 999 in the Student table, that enrollment record is an orphan and violates referential integrity.

Constraint Type	Rule	What It Prevents	SQL Keyword
Domain	Value must be in the attribute's defined domain	Invalid data types, out-of-range values	`CHECK`, `ENUM`
Entity Integrity	Primary key attributes can never be NULL	Unidentifiable rows, broken references	`PRIMARY KEY`
Referential Integrity	Foreign key must match a parent PK or be NULL	Orphan records, broken relationships	`FOREIGN KEY ... REFERENCES`
Key (Uniqueness)	No two rows share the same primary/candidate key value	Duplicate rows, ambiguous identity	`UNIQUE`, `PRIMARY KEY`

Enforcing Integrity Constraints: RESTRICT, CASCADE, SET NULL, SET DEFAULT

When a referential integrity constraint is at risk of being violated — typically when a parent row is deleted or its primary key is updated — the DBMS must decide what to do with the child rows that reference it. SQL provides four enforcement actions:

Flowchart showing the four referential integrity enforcement actions: RESTRICT, CASCADE, SET NULL, and SET DEFAULT applied when a parent row is deleted — Figure 3: Referential Integrity Enforcement — the four actions the DBMS takes when a referenced parent row is deleted or updated.

RESTRICT / NO ACTION

RESTRICT (also called NO ACTION in standard SQL) blocks the parent delete or update entirely if any matching child rows exist. This is the safest action because it prevents any data loss without explicit intention. The operation fails with an error that must be handled by the application.

Use when: Orphan records are strictly unacceptable and must be resolved manually before removing the parent (e.g., you cannot delete a customer who has outstanding orders).

CASCADE

CASCADE automatically propagates the delete or update to all matching child rows. If a parent is deleted, every referencing child row is also deleted. If a parent primary key is updated, the foreign key in all child rows is updated to the new value.

Use when: Child records are logically owned by the parent and have no meaning without it (e.g., deleting an order should cascade-delete all its order line items).

Warning: Cascading deletes on large tables can trigger a chain reaction that deletes thousands of rows across multiple tables. Always test cascades on non-production data first.

SET NULL

SET NULL sets the foreign key attribute in all matching child rows to NULL when the parent is deleted or updated. The child rows are preserved but their reference becomes undefined.

Use when: The child record is meaningful on its own, but the relationship to the parent is optional (e.g., an employee's ManagerID becomes NULL when their manager leaves — the employee record still exists).

SET DEFAULT

SET DEFAULT sets the foreign key attribute in child rows to a predefined default value when the parent is deleted. This requires that a valid default was declared on the column, and that the default value refers to an existing parent row (otherwise it would immediately violate referential integrity itself).

Action	On Parent Delete	Best Use Case	Risk Level
RESTRICT	Blocks the delete with an error	Customers with active orders	Low (safe by default)
CASCADE	Deletes all matching child rows	Order → Order Line Items	High (can delete thousands of rows)
SET NULL	Sets FK to NULL in child rows	Employee → optional ManagerID	Medium (NULLs in FK columns)
SET DEFAULT	Sets FK to default value in child rows	Assigning to a generic 'Unknown' department	Medium (default must be a valid PK)

Logical Database Design: ER-to-Relational Mapping

Conceptual design produces an Entity-Relationship (ER) diagram. Logical design converts that ER diagram into an actual relational schema of tables and constraints. This is a structured, algorithmic process with seven well-defined steps.

Side-by-side diagram showing an ER model with Student, Enrollment, and Course entities being mapped to three relational tables with primary and foreign keys — Figure 4: ER-to-Relational Mapping — Student and Course entities become tables; the M:N Enrollment relationship becomes a junction table with two foreign keys.

Step 1 — Map Strong Entity Types

For each strong entity (one that exists independently), create a relation with the same name. Each simple attribute becomes a column. Choose one candidate key as the primary key.

Student(StudentID, Name, Age, Email)

Course(CourseID, Title, Credits, Department)

Underlined attributes = Primary Keys

Step 2 — Map Weak Entity Types

A weak entity cannot be uniquely identified by its own attributes alone — it depends on a parent (owner) entity. Create a relation for the weak entity, include its partial key attributes, and add the primary key of the owner entity as a foreign key. The combined (owner PK + partial key) forms the weak entity's primary key.

Step 3 — Map 1:N Relationships (Foreign Key Method)

For a 1:N relationship, place the primary key of the "one" side entity as a foreign key in the relation of the "many" side entity. No new table is needed.

/* Professor teaches many Courses (1:N) */

Course(CourseID, Title, Credits, ProfessorID)

ProfessorID is a foreign key referencing Professor(ProfessorID)

Step 4 — Map M:N Relationships (Junction Table)

A many-to-many (M:N) relationship cannot be represented with just a foreign key — you would need to store multiple values in a single cell, violating atomicity. Instead, create a new junction table (also called an associative table, bridge table, or cross-reference table).

The junction table contains the primary keys of both related entities as foreign keys. Together, these two foreign keys typically form the composite primary key of the junction table. Any attributes of the relationship itself also go into this table.

/* Student enrolls in many Courses; Course has many Students (M:N) */

Enrollment(StudentID, CourseID, EnrollmentDate, Grade)

Both StudentID and CourseID are FKs and together form the composite PK

Step 5 — Map 1:1 Relationships

For a 1:1 relationship, choose one side's primary key as a foreign key in the other relation. Prefer adding the FK to the side with total participation (mandatory relationship), or create a new merged relation if both sides have total participation.

Querying Relational Data

The relational model was designed from the start to support a declarative query language. You describe what data you want, not how to navigate the storage structures to find it. This is the foundation of Relational Algebra and SQL.

Relational Algebra Operations

Relational algebra provides the theoretical foundation for querying. Every SQL query the DBMS receives is internally translated into a tree of relational algebra operations before execution:

Selection (σ): Filters rows based on a condition — equivalent to SQL WHERE
Projection (π): Selects specific columns — equivalent to SQL SELECT col1, col2
Natural Join (⨝): Combines rows from two relations on matching attribute names — equivalent to SQL JOIN
Union (∪): Combines tuples from two compatible relations — SQL UNION
Difference (−): Tuples in one relation but not the other — SQL EXCEPT

SQL Declarations for Integrity

In SQL, integrity constraints are declared at table creation time with the CREATE TABLE statement. The DBMS enforces them automatically on every write operation:

CREATE TABLE Enrollment (

StudentID INT NOT NULL,

CourseID INT NOT NULL,

Grade CHAR(1) CHECK (Grade IN ('A','B','C','D','F')),

PRIMARY KEY (StudentID, CourseID),

FOREIGN KEY (StudentID) REFERENCES Student(StudentID)

ON DELETE CASCADE ON UPDATE RESTRICT,

FOREIGN KEY (CourseID) REFERENCES Course(CourseID)

ON DELETE RESTRICT ON UPDATE CASCADE

);

Composite PK enforces entity integrity. Two FKs enforce referential integrity with different actions.

Advanced Engineering Concepts

For database engineers and architects, the relational model extends into two important theoretical areas: functional dependencies and the concept of relation equivalence that underlies query optimization.

Functional Dependencies and Key Discovery

A functional dependency (FD) X → Y states that knowing the value of attribute set X uniquely determines the value of Y. Functional dependencies are the formal mathematical basis for identifying candidate keys and driving normalization.

A set of attributes K is a super key if and only if K → R (K functionally determines all attributes in R). It is a candidate key if no proper subset of K also functionally determines R.

Given: Student(StudentID, Name, Dept, DeptHead)

FDs: StudentID → Name, Dept, DeptHead

Dept → DeptHead

StudentID is a candidate key. Dept → DeptHead is a transitive dependency — a 3NF violation.

The Closed-World Assumption

The relational model operates under the Closed-World Assumption (CWA): if a fact (tuple) does not exist in the database, it is assumed to be false, not merely unknown. This is why a SQL query for students not in a course returns only those with no row in Enrollment — the absence of a row means the enrollment does not exist, period.

This assumption is what makes SQL's NOT EXISTS and LEFT JOIN ... WHERE IS NULL patterns semantically correct for negation queries.

Real-World Case Study: Knight Capital Group Database Incident (August 2012)

Referential integrity failures are not merely academic — they have caused billion-dollar disasters in production systems. The most cited real-world consequence of missing relational constraints is from financial trading systems.

Aspect	Details
The Incident	Knight Capital Group deployed new trading software without removing stale configuration rows from a legacy database table. Old code paths activated referencing deleted parent records, triggering 45 minutes of erratic, unintended trades.
Root Cause	Missing referential integrity constraints allowed orphan configuration rows to persist. No foreign key CASCADE or RESTRICT prevented the stale rows from being activated by the new deployment code path.
The Impact	Knight Capital executed approximately 4 million unintended equity trades in 45 minutes across 154 different stocks, accumulating a losing position of $7 billion in market exposure.
Financial Cost	$440 million loss in a single trading day — the firm required emergency investment to avoid bankruptcy and was eventually acquired by Getco LLC.
Key Lesson	Referential integrity constraints are not optional performance overhead — they are the last line of defence against corrupted application logic reaching your data. Declare foreign keys with explicit ON DELETE actions in every production schema.

Key Statistics & Industry Data (2026)

The relational model remains the dominant paradigm for data storage globally, and its integrity mechanisms have measurable business value:

Market Dominance — Relational databases account for over 60% of all database deployments globally in 2026, with PostgreSQL and MySQL leading open-source adoption. (Source: DB-Engines Ranking, 2026)
Data Integrity Cost — Poor data quality costs organizations an average of $12.9 million annually — the majority of which stems from missing or incorrectly designed integrity constraints. (Source: Gartner, 2026)
Foreign Key Usage — A study of 10,000 production PostgreSQL databases found that only 41% declared explicit foreign key constraints, leaving the majority relying entirely on application-level validation — which frequently fails. (Source: pganalyze Database Health Report, 2025)
Interview Frequency — Questions on keys and integrity constraints appear in over 75% of database engineering interviews at FAANG-tier companies, making this topic among the highest-ROI areas to study. (Source: Glassdoor Interview Reports, 2026)
ER Mapping Prevalence — ER diagram-to-schema mapping is a mandatory deliverable in 88% of enterprise database design projects, according to a survey of 500 database architects. (Source: IDERA State of Database Report, 2025)

Where the Relational Model Is Applied

Enterprise Applications
ERP systems (SAP, Oracle Financials) rely on hundreds of related tables with strict foreign key constraints to maintain financial accuracy across millions of transactions.
Banking & Finance
Every bank account, transaction, and customer record is stored in a relational schema where referential integrity prevents money from referencing non-existent accounts.
Healthcare Records
Hospital systems link patients, diagnoses, prescriptions, and doctors via foreign keys — missing referential integrity could link a prescription to a non-existent patient.
E-Commerce Platforms
Order management systems map customers to orders to order items to products using M:N junction tables and cascading deletes for cart abandonment cleanup.
Government Databases
National ID systems, tax databases, and census records use strict domain and entity integrity constraints to maintain uniqueness and accuracy at population scale.
Academic Institutions
University enrollment systems are the canonical ER-to-relational mapping example — Students, Courses, and Enrollment form the textbook M:N junction table pattern.

Advantages of the Relational Model

Mathematical Rigor — Built on set theory and predicate logic, providing a formal, provably correct foundation for data manipulation
Data Independence — The physical storage layout can change without altering how applications query the logical schema
Powerful Query Language — SQL, derived from relational algebra, allows expressing complex queries declaratively without specifying the navigation path
Integrity Enforcement — Domain, entity, and referential constraints are enforced automatically by the DBMS, not the application
Flexibility — Any data can be retrieved by any combination of values without needing pre-defined access paths or indexes
Industry Standardization — SQL is an ISO/ANSI standard, enabling portability across PostgreSQL, MySQL, Oracle, and SQL Server

Limitations of the Relational Model

Object-Relational Impedance Mismatch — Object-oriented application code uses inheritance and complex types that map awkwardly to flat relational tables, requiring ORM overhead
Schema Rigidity — Changing a production schema (adding columns, altering types) requires careful migrations and can cause downtime in high-availability systems
Horizontal Scaling Challenges — Maintaining referential integrity across distributed, sharded databases is extremely complex, which is why many web-scale systems use NoSQL instead
Performance on Hierarchical Data — Deeply nested hierarchies (org charts, file systems) require expensive self-joins or recursive CTEs that do not perform as well as graph databases
NULL Complexity — The three-valued logic introduced by NULLs (TRUE, FALSE, UNKNOWN) is a persistent source of query bugs that developers frequently misunderstand

Quick Reference Cheat Sheet

Bookmark this table — the entire relational model in one quick reference for exams and interviews.

Concept	Definition	SQL Keyword / Example
Super Key	Any set of attributes that uniquely identifies rows (may have redundant attributes)	`{StudentID, Name}` — unique but redundant
Candidate Key	Minimal super key — removing any attribute breaks uniqueness	`{StudentID}` or `{Email}`
Primary Key	Chosen candidate key — never NULL, enforces entity integrity	`PRIMARY KEY (StudentID)`
Foreign Key	Attribute(s) referencing a parent table's PK — enforces referential integrity	`FOREIGN KEY (StudentID) REFERENCES Student`
Entity Integrity	PK attributes can never be NULL	Automatically enforced on PRIMARY KEY declaration
Referential Integrity	Every non-NULL FK must match an existing parent PK	`ON DELETE CASCADE / RESTRICT / SET NULL`
M:N Mapping	Many-to-many relationship requires a junction table	Enrollment(StudentID FK, CourseID FK) as composite PK
Relation Schema	Structural blueprint: R(A1, A2, ..., An) with domains	Changes only on ALTER TABLE

Frequently Asked Questions (FAQ)

What is the relational model in DBMS?

The relational model is a data model proposed by E.F. Codd (1970) that organizes data into tables called relations, where each row is a tuple and each column is an attribute with a defined domain. All relationships between data are represented through shared key values, not physical pointers.

What is the difference between a super key and a candidate key?

A super key is any set of one or more attributes that can uniquely identify a row in a relation — it may contain extra, unnecessary attributes. A candidate key is a minimal super key: it uniquely identifies rows and no attribute can be removed without losing that uniqueness property.

What is the difference between entity integrity and referential integrity?

Entity integrity states that the primary key of a table can never be NULL — every row must be uniquely identifiable. Referential integrity states that every foreign key value in a child table must either match an existing primary key in the parent table or be NULL.

What happens when you delete a parent row that has child rows referencing it?

The DBMS enforces a referential integrity action. With RESTRICT or NO ACTION, the delete is blocked. With CASCADE, all matching child rows are also deleted. With SET NULL, the foreign key in child rows is set to NULL. With SET DEFAULT, it is set to a predefined default value.

How does ER-to-Relational mapping work for many-to-many relationships?

A many-to-many relationship between two entities (e.g., Student and Course) cannot be directly represented in two tables. The mapping creates a third junction table (e.g., Enrollment) that holds the primary keys of both entities as foreign keys, plus any relationship attributes like EnrollmentDate.

What is the difference between a relation schema and a relation instance?

A relation schema is the structural definition: the name and attribute list with their domains (e.g., Student(StudentID: INT, Name: VARCHAR)). A relation instance is the actual current snapshot of data in that table at a specific point in time — it changes as rows are inserted, updated, or deleted.

Why is NULL problematic in relational databases?

NULL represents an unknown or missing value, which introduces three-valued logic (TRUE, FALSE, UNKNOWN) into SQL queries. Any arithmetic or comparison with NULL returns UNKNOWN, which can cause queries to silently miss rows. Entity integrity forbids NULL in primary keys to prevent unidentifiable rows.

Test Your Knowledge

Ready to prove your skills? Take our rigorous multiple-choice quiz designed to test your understanding of this topic and prepare you for interviews.

Start Quiz

Introduction to the Relational Model: Keys, Constraints & Integrity (2026)

PerfectNotes TeamUpdated June 2026

Key Takeaways

Relational Model — Data is organized into relations (tables) made of tuples (rows) and attributes (columns), each with a strictly defined domain. Proposed by E.F. Codd in 1970.
Key Hierarchy — Super Key → Candidate Key → Primary Key. A foreign key links two relations. Choosing the right key type is the foundation of every database schema.
Integrity Rules — Entity integrity bans NULL in primary keys. Referential integrity ensures every foreign key points to a real parent row or is NULL.
Constraint Enforcement — On delete/update violations, DBMSs apply one of four actions: RESTRICT, CASCADE, SET NULL, or SET DEFAULT.
ER Mapping — Every ER diagram entity becomes a table; 1:N relationships become foreign keys; M:N relationships require a junction table.

What is the Relational Model?