Monday 30 January 2012

Null Values


nIt is possible for tuples to have a null value, denoted by null, for some of their attributes
nnull signifies an unknown value or that a value does not exist.
nThe result of any arithmetic expression involving null is null.
nAggregate functions simply ignore null values (as in SQL)
nFor duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be  the same (as in SQL)

nComparisons with null values return the special truth value: unknown
lIf false was used instead of unknown, then    not (A < 5)
               would not be equivalent to              
A >= 5
nThree-valued logic using the truth value unknown:
lOR: (unknown or true)         = true,
       (
unknown or false)        = unknown
       (
unknown or unknown) = unknown
lAND:   (true and unknown)         = unknown,  
          
(false and unknown)        = false,
          
(unknown and unknown) = unknown
lNOT(not unknown) = unknown
lIn SQL “P is unknown evaluates to true if predicate P evaluates to unknown
nResult of select  predicate is treated as false if it evaluates to unknown

Rename Operation


nAllows us to name, and therefore to refer to, the results of relational-algebra expressions.
nAllows us to refer to a relation by more than one name.
nExample:
   r x (E)
  returns the expression E under the name X
nIf a relational-algebra expression E has arity n, then
                                         
  returns the result of expression E under the name X, and with the
  attributes renamed to A1 , A2 , …., An .

Banking Example

branch (branch_name, branch_city, assets)

customer (customer_name, customer_street, customer_city)
account (account_number, branch_name, balance)
loan (loan_number, branch_name, amount)
depositor (customer_name, account_number)
borrower (customer_name, loan_number)


Query Languages


nLanguage in which user requests information from the database.
nCategories of languages
lProcedural
lNon-procedural, or declarative
n“Pure” languages:
lRelational algebra
lTuple relational calculus
lDomain relational calculus
nPure languages form underlying basis of query languages that people use.

Relational Algebra
nProcedural language
nSix basic operators
lselect: s
lproject: Õ
lunion: È
lset difference:
lCartesian product: x
lrename: r
nThe operators take one or  two relations as inputs and produce a new relation as a result.





Keys ( Database )


Let K Í R
K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R)
by “possible r ” we mean a relation r that could exist in the enterprise we are modeling.
Example:  {customer_name, customer_street} and
                 {
customer_name}
are both
superkeys of Customer, if no two customers can possibly have the same name
4In real life, an attribute such as customer_id would be used instead of customer_name to uniquely identify customers, but we omit it to keep our examples small, and instead assume customer names are unique.


Keys (Cont.)

nK is a candidate key if K is minimal
Example:  {
customer_name} is a candidate key for Customer, since it is a superkey and no subset of it is a superkey.
nPrimary key: a candidate key chosen as the principal means of identifying tuples within a relation
lShould choose an attribute whose value never, or very rarely, changes.
lE.g. email address is unique, but may change

Relational Model ( Database system concepts )


Example of a Relation




Attribute Types

Each attribute of a relation has a name
The set of allowed values for each attribute is called the domain of the attribute
Attribute values are (normally) required to be atomic; that is, indivisible
lE.g. the value of an attribute can be an account number,
but cannot be a set of account numbers
Domain is said to be atomic if all its members are atomic
The special value null  is a member of every domain
The null value causes complications in the definition of many operations
** We shall ignore the effect of null values in our main presentation and consider their effect later


Relation Schema

Formally, given domains D1, D2, …. Dn a relation r is a subset of
       
D1D2  x … x Dn
Thus, a relation is a set of
n-tuples (a1, a2, …, an) where each ai  Î Di
Schema of a relation consists of
attribute definitions
4name
4type/domain
integrity constraints


Relation Instance

The current values (relation instance) of a relation are specified by a table
* An element t of r is a tuple, represented by a row in a table
Order of tuples is irrelevant (tuples may be stored in an arbitrary order)




Instances and Schemas

nSimilar to types and variables in programming languages
nSchema – the logical structure of the database
He.g., the database consists of information about a set of customers and accounts and the relationship between them)
HAnalogous to type information of a variable in a program
HPhysical schema: database design at the physical level
HLogical schema: database design at the logical level
nInstance – the actual content of the database at a particular point in time
HAnalogous to the value of a variable
nPhysical Data Independence – the ability to modify the physical schema without changing the logical schema
HApplications depend on the logical schema
HIn general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.

Data Models

nA collection of tools for describing
Hdata
Hdata relationships
Hdata semantics
Hdata constraints
nEntity-Relationship model
nRelational model
nOther models:
Hobject-oriented model
Hsemi-structured data models
HOlder models: network model and hierarchical model




Database Management System (DBMS)


nCollection of interrelated data

nSet of programs to access the data

nDBMS contains information about a particular 
  enterprise

nDBMS provides an environment that is both 
convenient and efficient to use.

nDatabase Applications:
HBanking: all transactions
HAirlines: reservations, schedules
HUniversities:  registration, grades
HSales: customers, products, purchases
HManufacturing: production, inventory, orders, supply chain
HHuman resources:  employee records, salaries, tax deductions

nDatabases touch all aspects of our lives


Purpose of Database System

In the early days, database applications were 
   built on top of file systems

Drawbacks of using file systems to store data:

HData redundancy and inconsistency
4Multiple file formats, duplication of information in different files
HDifficulty in accessing data
4Need to write a new program to carry out each new task
HData isolation — multiple files and formats
HIntegrity problems
4Integrity constraints  (e.g. account balance > 0) become part of program code
4Hard to add new constraints or change existing ones