database system (Normalization)

Description

The question was uploaded to the file is about database system (Normalization) also, there is a PDF file was uploaded to understand the hole question 1. Consider the following relation that involves student data:
GRADE(ClassName, Section, Term, Grade, StudentID, StudentName,
ProfessorName, Department, ProfessorEmail)
Assume that only one professor teaches a class section, and professors teach in only one
department. Suppose the following dependencies exist:
StudentID -> StudentName
(ClassName, Section, Term) -> ProfessorName
ProfessorName -> Department, ProfessorEmail
What normal form is the relation in? Normalize it up to 3NF.

Don't use plagiarized sources. Get Your Custom Assignment on
database system (Normalization)
From as Little as $13/Page

Unformatted Attachment Preview

Major Steps in Database Design

Requirements Analysis:
• What data is to be stored, and what operations and requirements are desired

Conceptual Database Design:
• Develop a high-level description of the data and constraints

Logical Database Design:
• Convert the conceptual model to relational schema

Schema Refinement:
• Look for potential problems and try to redesign

Physical Database Design:
• Direct the DBMS into choice of underlying data layout (e.g., indexes and clustering) in hopes
of optimizing the performance

Applications and Security Design:
• How will the underlying database interact with surrounding applications
2
Informal Measures for Relational Database Design
GUIDELINE 1:
Informally, each tuple in a relation should represent one
entity or relationship.
GUIDELINE 2:
Design a schema that does not suffer from the insertion,
deletion, and update anomalies.
GUIDELINE 3:
Relations should be designed such that their tuples will
have as few NULL values as possible.
3
Problem with Redundancy
Redundant data stores the same ‘information’ more than once
i.e., the redundant data could be removed without the loss of information
Example: Relation contains staff and department details:
staffNo job
dept dname
city
SL10 Salesman 10
Sales
Stratford
SA51 Manager 20 Accounts
Barking
DS40 Clerk
20 Accounts Barking
OS45 Clerk
30
Operations Barking
Such ‘redundancy’
could lead to the
following
‘anomalies’
Insert Anomaly: Can’t insert a dept without inserting a member of staff
that works in that department.
Update Anomaly: Can’t change the name of the dept that SA51 works in
without having to change the dept that DS40 works in.
Deletion Anomaly: By removing employee SL10 we have removed all
information pertaining to the Sales dept.
4
Problem with Multi-valued Attributes (Repeating Groups)
A multi-valued attribute is an attribute that can have more than one
value for a primary key value
Example: A relation that contains staff and department details and
a list of telephone contact numbers
staffNo job
dept dname
city
SL10 Salesman 10
Sales
Stratford
SA51 Manager 20 Accounts
Barking
DS40 Clerk
20 Accounts Barking
OS45 Clerk
30
Operations Barking
contact number
018111777, 018111888, 079311122
017111777
079311555
Multi-valued attributes are not allowed in a relational
design, since all attributes have to be ‘atomic’
i.e., there can only be one value per cell in a table!
5
Definitions
6
Review of Definitions
• CANDIDATE KEY is a set of attributes, K=(Ai..Ak) from the schema, A1..An, which, at all
times, satisfies:
• UNIQUENESS: No two distinct tuples have same K-value
• MINIMALITY: None of Ai…Ak can be discarded and still maintain uniqueness
• PRIMARY KEY = one candidate key is designated primary
• ALTERNATE KEY = non-primary candidate key
• For example, the STUDENT relation may have username as Primary key and SSN as alternate key
• SUPERKEY = attributes that is the superset of a candidate key
STUDENT(SID, SSN, name, bdate)
7
Functional Dependency
Formal Definition: Given a relation R with attribute sets X and Y, we say that R.Y is
FUNCTIONALLY DEPENDENT (FD) on R.X or R.X FUNCTIONALLY DETERMINES R.Y,
written R.X -> R.Y if the following holds
iff (x,y1) and (x,y2) in R[X,Y] implies y1=y2.
X is called the determinant and Y is called the object of the determinant.
8
Functional Dependency/2
Compound Determinants: If more than one attribute is necessary to determine another
attribute in an entity, then such a determinant is a composite determinant.
Full Functional Dependency: For composite attribute X, R.Y is FULLY FUNCTIONALLY
DEPENDENT (FFD) on R.X or R.X FULLY FUNCTIONALLY DETERMINES R.Y, written R.X -> R.Y
iff R.X → R.Y but R.Z not→ R.Y for any proper subset Z  X
Full Functional Dependencies
(Sid, Course-id) → grade
9
Functional Dependency/3
Partial Functional Dependency: This is the situation that exists if it is necessary to
only use a subset of the attributes of the composite determinant to identify its
object uniquely.
10
Transitive Dependency
Definition: A transitive dependency exists when there is an intermediate functional
dependency.
Formal Notation: If A → B and B → C, then it can be stated that the following
transitive dependency exists: A → B → C
Example:
staffNo job
dept dname
SL10 Salesman 10
Sales
SA51 Manager 20 Accounts
DS40 Clerk
20 Accounts
OS45 Clerk
30
Operations
Transitive Dependencies
staffNo → dept
dept → dname
staffNo → dept → dname
+Repetition of data!
Transitive Dependencies
Sid → Activity → Fee
11
Normalization
12
Normalization: What Is It?
• The process of decomposing unsatisfactory “bad” relations by breaking
up their attributes into smaller relations
• A set of formal conditions that assure that a database is maintainable
! Results of a well executed normalization process are the same as
those of a well planned E-R model
13
Normalization: Goal
Relational Database Design: All attributes in a table must be
atomic and solely dependent upon the full (candidate) key of
that table.
THE KEY, THE WHOLE KEY, AND NOTHING BUT THE KEY!
14
Normalization: When
• Two relational design approaches
• Perform a conceptual schema design using ER and map the conceptual design
into a set of relations
Use normalization for refinement
• Design the relations based on external knowledge derived from an existing
implementation of files or forms or reports
Integrate normalization into the design process
15
Normalization: Stages
Unnormalised
(UNF)
Remove multi-valued attributes
First normal form
(1NF)
Remove partial dependencies
Second normal form
(2NF)
Remove transitive dependencies
Third normal form
(3NF)
Boyce-Codd normal
form (BCNF)
Fourth normal form
(4NF)
Fifth normal form
(5NF)
Remove remaining functional
dependency anomalies
Remove multivalued dependencies
Remove remaining anomalies
16
Normalization: Normal Forms
• The defining conditions get stronger from 1NF to 4NF and therefore
the sets of qualifying relations gets smaller from 1NF to 4NF
• By defining conditions 4NF => BCNF => 3NF => 2NF => 1NF
• By classes of relations 4NF  BCNF  3NF  2NF  1NF
17
Unnormalised Normal Form (UNF)
Definition: A relation is unnormalized when it has not had any
normalization rules applied to it, and it has non-atomic values and
suffers from various anomalies.
This only tends to occur where the relation has been designed using a
‘bottom-up approach’. i.e., the capturing of attributes to a ‘Universal
Relation’ from a screen layout, manual report, manual document, etc…
18
Unnormalised Normal Form (UNF)
ORDER
Customer No:
Name:
Address:
001964
Mark Campbell
1 The House
Leytonstone
E11 9ZZ
Product
Number
Product
Description
T5060
Order Number:
Order Date:
00012345
14-Feb-2002
Unit
Price
Order
Quantity
Line
Total
Hook
5.00
5
25.00
PT42
Bolt
2.50
10
20.50
QZE48
Spanner
20.00
1
20.00
Order Total:
65.50
Order_no -> all non-key attributes
Cust_no -> cust_name, cust_add
Prod_no -> prod_desc, unit_price
Order_no, prod_no -> ord_qty, line_total
ORDER (order-no, order-date, cust-no, cust-name, cust-add,
(prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)
19
First Normal Form (1NF)
Definition: A relation is in 1NF if, and only if, all of its underlying
attributes contain atomic values only.
Disallows:
• composite attributes
• multivalued attributes
• nested relations
• attributes whose values for an individual tuple are non-atomic
A repeating group is shown by a pair of brackets within the relational schema.
ORDER (order-no, order-date, cust-no, cust-name, cust-add,
(prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)
20
Example – UNF to 1NF
ORDER (order-no, order-date, cust-no, cust-name, cust-add,
(prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)
1. Remove the outermost repeating group (and any nested repeated
groups it may contain) and create a new relation to contain it.
ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
(prod-no, prod-desc, unit-price, ord-qty, line-total)
2. Add to this relation a copy of the PK of the relation immediately
enclosing it.
ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
(order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
3. Name the new entity (appending the number 1 to indicate 1NF)
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
4. Determine the PK of the new entity
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
21
Second Normal Form (2NF)
Definition: A relation is in 2NF if, and only if, it is in 1NF and
every non-key attribute is fully dependent on the key.
Remove partial functional dependencies
into a new relation
ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
Order_no -> all non-key attributes
Cust_no -> cust_name, cust_add
Prod_no -> prod_desc, unit_price
Order_no, prod_no -> ord_qty, line_total
22
Example – 1NF to 2NF
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
1. Remove the offending attributes that are only partially functionally
dependent on the composite key, and place them in a new relation.
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-desc, unit-price)
2. Add to this relation a copy of the attribute(s) which determine(s) these
offending attributes. These will become the PK of this new relation…
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-no, prod-desc, unit-price)
3. Name the new entity (appending the number 2 to indicate 2NF)
PRODUCT-2 (prod-no, prod-desc, unit-price)
4. Rename the original entity (ending with a 2 to indicate 2NF)
ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)
23
Third Normal Form (3NF)
Definition: A relation is in 3NF if, and only if, it is in 2NF and every
non-key attribute is non-transitively dependent on the key.
Remove transitive dependencies
into a new relation
ORDER-2 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
PRODUCT-2 (prod-no, prod-desc, unit-price)
ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)
Order_no -> all non-key attributes
Cust_no -> cust_name, cust_add
Prod_no -> prod_desc, unit_price
Order_no, prod_no -> ord_qty, line_total
24
Example – 2NF to 3NF
ORDER-2 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
1. Remove the offending attributes that are transitively dependent
on non-key attributes, and place them in a new relation.
ORDER-2 (order-no, order-date, cust-no, order-total)
(cust-name, cust-add)
2. Add to this relation a copy of the attribute(s) which determines
these offending attributes. These will automatically become the PK of
ORDER-2
(order-no, order-date, cust-no, order-total)
this
new relation…
(cust-no, cust-name, cust-add)
3. Name the new entity (appending the number 3 to indicate 3NF)
CUSTOMER-3 (cust-no, cust-name, cust-add)
4. Rename the original entity (ending with a 3 to indicate 3NF)
ORDER-3 (order-no, order-date, cust-no, order-total)
25
Example – Relations in 3NF
ORDER-3 (order-no, order-date, cust-no, order-total)
CUSTOMER-3 (cust-no, cust-name, cust-add)
PRODUCT-3 (prod-no, prod-desc, unit-price)
ORDER-LINE-3 (order-no, prod-no, ord-qty, line-total)
prod-no
PRODUCT
order-no
ORDER
places placed by
cust-no
CUSTOMER
contains
part of
shows
belongs to
order-no, prod-no
ORDER-LINE
26
BCNF (Boyce/Codd Normal Form)
Definition: A relation R is in Boyce-Codd normal form
(BCNF) if and only if it is in 3NF and every determinant is a
candidate key.
27
StudMajor(StudNo, Major, Advisor)
Functional Dependencies:
StudNo, Major -> Advisor
Advisor -> Major
Problems:
— If the record for student 456 is deleted, we lose not only
information on student 456 but also the fact that DARWIN advises
in BIOLOGY
— We cannot record the fact that WATSON can advise on
COMPUTING until we have a student majoring in COMPUTING to
whom we can assign WATSON as an advisor
BCNF:
StudMajor(StudNo, Advisor)
Advisors(Advisor, Major)
28
Fourth Normal Form
Definition: A relation R is in fourth
normal form if it is in BCNF and has no
multi-valued dependency.
Multi-valued dependency occurs when,
for each value in field A, there is a set of
values for field B and a set of values for
field C but fields B and C are not related
Primary key = (SID, Major, Activity)
Multi-valued Dependencies:
SID
Major
Activity
100
Music
swimming
100
Accounting
swimming
100
Music
tennis
100
Accounting
tennis
150
Math
jogging
4NF:
Student_major(SID, Major)
Student_activity(SID, Activity)
(SID -> Major) and (SID -> Activity)
It is not in fourth normal form.
29
Summary: Normal Forms
Relational theory defines a number of structure conditions called Normal Forms
that assure that certain data anomalies do not occur in a database.
1NF
2NF
3NF
BCNF
4NF
Keys; No repeating groups/(multi-valued attributes)
No partial dependencies
No transitive dependencies
Determinants are candidate keys
No multivalued dependencies
30
Practical Use of Normal Forms
• There are several ways in which we could have come up with the
schema r(R):
• r(R): generated in converting an E-R diagram to a set of relation schemas
• r(R): a single relation schema containing all attributes that are of interest. The
normalization process then breaks up r(R) into smaller schemas.
• r(R): the result of an ad-hoc design of relations that we then test to verify that
it satisfies a desired normal form
31
Practical Use of Normal Forms/2
• Normalization is carried out in practice so that the resulting designs
are of high quality and meet the desirable properties
• The practical utility of these normal forms becomes questionable
when the constraints on which they are based are hard to understand
or to detect
• The database designers need not normalize to the highest possible
normal form
• (usually up to 3NF, BCNF, or 4NF)
• Denormalization:
• The process of storing the join of higher normal form relations as a base
relation—which is in a lower normal form
32
Summary: Normal Forms
Every nonkey attribute is functionally dependent upon the key
1NF
the whole key and
2NF
nothing but the key
3NF
so help me Codd (E.F. Codd invented relational model and normal forms)
This is an analogy based on the way in which witnesses are sworn into legal
proceedings in the US:
Do you swear to tell the truth, the whole truth and nothing but the truth, so
help you God?”
33
Reading
• Chapters 7.1, 7.2, 7.3, 7.5, 7.6, 7.7, 7.8, and 7.9
34
In-Class Exercise 1
Consider the following relation for published books:
BOOK(book_title, author_name, book_type,
list_price, author_affil, publisher)
author_affil refers to the affiliation of author. Suppose the
following dependencies exist:
book_title -> publisher, book_type
book_type -> list_price
author_name -> author_affil
What normal form is the relation in? Normalize it up to 3NF.
35
In-Class Exercise 2
A company obtains parts from a number
of suppliers. Each supplier is located in
one city. A city can have more than one
supplier located there and each city has
a status code associated with it.
Each supplier may provide many parts.
The company creates a simple relational
table (FIRST) to store this information
that can be expressed in relational
notation as:
FIRST (s#, status, city, p#, qty)
In order to uniquely associate quantity
supplied (qty) with part (p#) and
supplier (s#), a composite primary key
composed of s# and p# is used.
s#
status
city
p#
qty
supplier identification number
status code assigned to city
name of city where supplier is located
part number of part supplied
quantity of parts supplied to date
36

Purchase answer to see full
attachment