Relational Algebra

Relational algebra is a formal, mathematical language for describing operations on relations (tables). It defines which operations are allowed on data and how results are derived, independent of any database software.

Why Relational Algebra Matters

Relational algebra is important because it:

  • Forms the theoretical foundation of SQL
  • Lets a DBMS optimize queries by rewriting them into equivalent, cheaper forms
  • Provides a precise way to reason about correctness

Think of SQL as the user language and relational algebra as the engine language.


Operands and Results

A key property of relational algebra is closure.

  • Input → Relations (tables)
  • Output → Relations (tables)

Because every operation produces another relation, the results can be fed back as inputs to further operations. This chaining makes it possible to build complex queries step by step, while always staying within the relational model.
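The closure property can be illustrated with a minimal Python sketch. The representation is an assumption made for illustration only: a relation is a set of rows, and each row is a tuple of (attribute, value) pairs. The helper names relation, select and project, and the two sample rows, are hypothetical and not part of any DBMS.

```python
def relation(*rows):
    """Build a relation: a set of rows, each a sorted tuple of (attribute, value) pairs."""
    return {tuple(sorted(row.items())) for row in rows}

def select(rel, predicate):
    """Selection: keep the rows for which predicate(row_as_dict) is true."""
    return {row for row in rel if predicate(dict(row))}

def project(rel, attrs):
    """Projection: keep only the named attributes; the set drops duplicate rows."""
    return {tuple((a, v) for a, v in row if a in attrs) for row in rel}

students = relation(
    {"sid": 101, "name": "Alice", "age": 21},
    {"sid": 102, "name": "Bob", "age": 19},
)

# Closure: select returns a relation, so project can consume it directly.
older = select(students, lambda r: r["age"] > 20)
names = project(older, {"name"})
print(names)  # {(('name', 'Alice'),)}
```

Because both functions take a set of rows and return a set of rows, they can be nested in any order, which is the chaining described above.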


Fundamental Operations

Operation | Symbol | Description | Works on | Key Notes
Selection | σ | Filters rows based on a condition | Rows (tuples) | Like SQL WHERE
Projection | π | Selects specific columns; removes duplicate rows | Columns (attributes) | Eliminates duplicates automatically
Union | ∪ | Combines rows from two compatible relations | Rows | Requires same schema
Set Difference | − | Rows in first relation but not in second | Rows | Requires same schema
Cartesian Product | × | Combines every row of one relation with every row of another | Rows | Basis for joins; can be very large
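As a rough sketch, the five operations in the table can each be written in a line or two of Python. This assumes a relation is modelled as a set of rows, with each row a sorted tuple of (attribute, value) pairs; the function names and the sample relations r, s and t are hypothetical. The product helper additionally assumes the two inputs share no attribute names.

```python
def relation(*rows):
    """Build a relation: a set of rows, each a sorted tuple of (attribute, value) pairs."""
    return {tuple(sorted(row.items())) for row in rows}

def select(rel, pred):      # σ: filter rows
    return {row for row in rel if pred(dict(row))}

def project(rel, attrs):    # π: keep columns; the set removes duplicate rows
    return {tuple((a, v) for a, v in row if a in attrs) for row in rel}

def union(r, s):            # R ∪ S: rows in either relation
    return r | s

def difference(r, s):       # R − S: rows in r that are not in s
    return r - s

def product(r, s):          # R × S: every pairing; assumes disjoint attribute names
    return {tuple(sorted(x + y)) for x in r for y in s}

r = relation({"a": 1}, {"a": 2})
s = relation({"a": 2}, {"a": 3})
t = relation({"b": "x"}, {"b": "y"})

print(sorted(union(r, s)))       # [(('a', 1),), (('a', 2),), (('a', 3),)]
print(sorted(difference(r, s)))  # [(('a', 1),)]
print(len(product(r, t)))        # 4
```

Using Python sets gives the duplicate elimination of projection and the set semantics of union and difference for free. The product of two relations with m and n rows has m × n rows, which is why it can become very large.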

Derived Operations

Operation | Symbol | Description | Key Notes
Theta Join | ⨝_θ | Cartesian product followed by selection on a condition θ | General join with any condition
Equi-Join | ⨝_θ (θ uses =) | Theta join using equality conditions | Keeps both joining columns
Natural Join | ⨝ | Equi-join on all attributes with the same name, then removes duplicate columns | Convenient but can be dangerous if names clash unintentionally
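The derived operations can be sketched on top of product and selection. The code below is illustrative only and assumes the set-of-tuples representation of a relation; the helper names are hypothetical. For the equi-join, the department key is renamed to d_id (an assumed name) so that the product has no attribute clash, which also makes it visible that both joining columns are kept.

```python
def relation(*rows):
    """Build a relation: a set of rows, each a sorted tuple of (attribute, value) pairs."""
    return {tuple(sorted(row.items())) for row in rows}

def product(r, s):
    """Cartesian product; assumes r and s share no attribute names."""
    return {tuple(sorted(x + y)) for x in r for y in s}

def select(rel, pred):
    return {row for row in rel if pred(dict(row))}

def theta_join(r, s, theta):
    """Theta join written literally as selection over the Cartesian product."""
    return select(product(r, s), theta)

def natural_join(r, s):
    """Pair rows that agree on every shared attribute, keeping one copy of each."""
    out = set()
    for x in r:
        for y in s:
            dx, dy = dict(x), dict(y)
            shared = dx.keys() & dy.keys()
            if all(dx[a] == dy[a] for a in shared):
                out.add(tuple(sorted({**dx, **dy}.items())))
    return out

students = relation(
    {"sid": 101, "name": "Alice", "dept_id": 1},
    {"sid": 102, "name": "Bob", "dept_id": 2},
)
depts_renamed = relation(
    {"d_id": 1, "dept_name": "Computer Science"},
    {"d_id": 2, "dept_name": "Mathematics"},
)
department = relation(
    {"dept_id": 1, "dept_name": "Computer Science"},
    {"dept_id": 2, "dept_name": "Mathematics"},
)

# Equi-join: rows carry both dept_id and d_id.
equi = theta_join(students, depts_renamed, lambda t: t["dept_id"] == t["d_id"])

# Natural join: rows carry a single dept_id.
nat = natural_join(students, department)
```

Note that natural_join matches on every shared attribute name. If two relations happened to share an unrelated column such as name, it would silently become part of the join condition, which is the danger flagged in the table.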

Example

Students

sid | name | age | dept_id
101 | Alice | 21 | 1
102 | Bob | 19 | 2
103 | Carol | 22 | 1
104 | David | 20 | 3

Department

dept_id | dept_name
1 | Computer Science
2 | Mathematics
3 | Physics

Ques: Find students older than 20

σ_{age > 20}(Students)

Ques: Get names of all students

π_{name}(Students)

Ques: Get names of students older than 20

π_{name}(σ_{age > 20}(Students))
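To check the three expressions against the sample data, they can be evaluated with a small Python sketch. The Student namedtuple and the variable names are assumptions made for illustration; sets are used so that projection removes duplicates, as the algebra requires.

```python
from collections import namedtuple

Student = namedtuple("Student", "sid name age dept_id")

students = {
    Student(101, "Alice", 21, 1),
    Student(102, "Bob", 19, 2),
    Student(103, "Carol", 22, 1),
    Student(104, "David", 20, 3),
}

# σ_{age > 20}(Students): keep whole rows that satisfy the condition.
older = {s for s in students if s.age > 20}

# π_{name}(Students): keep only the name column.
all_names = {(s.name,) for s in students}

# π_{name}(σ_{age > 20}(Students)): select first, then project the result.
older_names = {(s.name,) for s in older}

print(sorted(older_names))  # [('Alice',), ('Carol',)]
```

David is excluded because the condition is strictly greater than 20. The third query reuses the result of the first, which is the composition written in the nested expression.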

How This Fits in the DBMS Stack

  • ER Model → Conceptual design
  • Relational Model → Tables & constraints
  • Normalization → Clean structure
  • Relational Algebra → Query logic
  • SQL → User interface

Relational algebra is the bridge between theory and execution.
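One way to see this bridge is an equivalence rule that an optimizer can exploit: a selection that only refers to one relation can be applied before a join instead of after it. The sketch below is illustrative and not how any particular DBMS is implemented. It uses the Students and Department sample data as plain tuples, a hypothetical join helper, and the size of the intermediate Cartesian product as a crude stand-in for cost.

```python
students = [(101, "Alice", 21, 1), (102, "Bob", 19, 2),
            (103, "Carol", 22, 1), (104, "David", 20, 3)]
departments = [(1, "Computer Science"), (2, "Mathematics"), (3, "Physics")]

def join(studs, depts):
    """Equi-join on dept_id as product + selection; also returns the product size."""
    pairs = [(s, d) for s in studs for d in depts]        # Cartesian product
    matched = [s + d for s, d in pairs if s[3] == d[0]]   # selection on dept_id
    return matched, len(pairs)

# Plan A: join first, then filter on age > 20.
joined, cost_a = join(students, departments)
plan_a = [row for row in joined if row[2] > 20]

# Plan B: filter on age > 20 first (selection pushed down), then join.
plan_b, cost_b = join([s for s in students if s[2] > 20], departments)

assert sorted(plan_a) == sorted(plan_b)  # both plans return the same rows
print(cost_a, cost_b)  # 12 6
```

Both plans produce the same answer, but the second builds half as many intermediate rows on this data. The algebra is what justifies treating the two plans as interchangeable.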


Conclusion

Relational algebra defines how databases reason about data. It gives a DBMS the freedom to rewrite a query into an equivalent, cheaper form while guaranteeing the same result. You may never write it directly, but every efficient query depends on it.