Relation Model

Relational Data Structure

We define a relation to be a subset of a Cartesian product of a list of domains.

Formally, given sets , a relation is a subset of Thus a relation is a set of n-tuples where .

And the difference here are :

  • Assign attribute name to every domain
  • Finite set

Relation Schema

are attributes, each attribute of a relation has a name, and is a relation schema. is a relation on the relation schema .

Relation Instance

The current values(a.k.a. relation instance) of a relation are specified by a table. An element of is a tuple, represented by a row in a table.

image-20230919140820132

  • Order of tuples is irrelevant, tuples may be stored in an arbitrary order
  • Order of attributes is irrelevant
  • For all relations, the domains of all attributes be atomic
  • Attributes names must be different
  • Several attributes can have the same domain
  • Tuples are not duplicate

Super Key

is a relation schema, is a super key for ,for any , no two distinct tuples have the same values on . That is, if and are in and , then . Values for are sufficient to identify a unique tuple of each possible relation by "possible ".

A candidate key of en entity set is a minimal super key. Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.

Candidate Key has two properties:

  • Uniqueness. No legal value of ever contains two distinct tuples with the same value for
  • Irreducibility. No proper subset of has the uniqueness property.

Foreign Key

For , , includes the Primary Key of , is called a foreign key referencing . is called the referencing relation of the foreign key dependency. is called the referenced relation of the foreign key.

Integrity Constraint

Integrity constraint of primary key: each specified column of primary key is assumed to be not null.

Referential integrity constraint: the value of foreign key in any tuple of relation are either null or must appear as the value of primary key of a tuple of relation .

Fundamental Relational-Algebra-Operations

We have six basic operators:

  • Select
  • Project
  • Union
  • Set difference
  • Cartesian Product
  • Rename

Select

Notation , is called the selection predicate, defined as where is a formula in propositional calculus consisting of terms connected by and, or , not.

Project

Notation: where , are attribute names and is a relation name.

The result is defined as the relation of columns obtained by erasing the columns that are not listed. And the duplicate rows will be removed since relations are sets.

Union

Notation: For to be valid:

  • and must have the same number of attribute

  • The attribute domains must be compatible

  • And the attribute name can be not the same

Set Difference

Notation: For to be valid, the and has the same requirements with union operation.

Cartesian Product

Notation: Combine information from and . Assume that there are relations in , relations in , there will be in ; Assume that has attributes and has attributes, there will be in

Rename

Allow us to rename and therefore to refer to the result of relational-algebra expressions. And allow use to refer to a relation by more than one name.

returns the expression under the name of .

Additional Relational-Algebra-Operations

We define additional operations that do not add any power to the relation algebra but simplify common queries.

Set Intersection

Notation: Having the same requirements with union operation.

And we have:

Join

And the is

Natural Join

Natural join is the default of join: is the same attribute of and .

Outer Join

An extension of the join operation to avoid loss of information. Computes the join and then adds tuples from one relation that does not match tuples in other relation to the result of the join; use null value to fulfill the nonexistent attribute.

And there are three outer join:

  • Left outer join
  • Right outer join
  • Full outer join

Extended Relational-Algebra-Operations

Null Values

It is possible for tuples to have a null value denoted by null for some of their attributes.

Null signifies an unknown value or that a value does not exist. The result of any arithmetic expression involving null is null. Aggregate functions simply ignore null values. Duplicate elimination and grouping, null is treated like any other value and two null are assumed to be the same.

Comparisons with null values return the special truth value: unknown. And we can extend logic to three values with:

  • Unknown or true is true

    Unknown or false is unknown

    Unknown or Unknown is unknown

  • True and unknown is unknown

    False and unknown is false

    Unknown and Unknown is unknown

  • Not unknown is unknown

Modification Of the Database

Deleting

where is a relation and is a relation algebra query.

Insertion

where is a relation and is a relation algebra query.

Updating

Each is either the ith attribute of , if this attribute is not updated or if the attribute is to be updated, as an expression.