Database Management/Normalization

From Wikiversity
Jump to navigation Jump to search

Objectives and Skills[edit | edit source]

This lesson allows users to learn how to understand, learn and apply database normalization concepts.

Readings[edit | edit source]

  1. Watt: Database Design - Chapter 12 (Normalization)
  2. McFadyen: Relational Databases and Microsoft Access - Chapter 10 (Normalization)
  3. Microsoft: Description of the database normalization basics
  4. Tutorials Point: DBMS Normalization
  5. Microsoft: Implementing a Relational Database
  6. Wikibooks: Database Design/Database Design Development Process
  7. Wikibooks: Database Design/Normalization
  8. W3Schools: Database Normalization

Multimedia[edit | edit source]

Learn How to understand, learn and apply examples using Database Normalization and Normal Forms:

  1. Microsoft Virtual Academy: Introduction to Databases
  2. Youtube: Normalization
  3. Youtube: Understanding and applying First Normal Form
  4. Youtube: First Normal Form Example
  5. Youtube: Understanding and applying Second Normal Form
  6. Youtube: Second Normal Form Example
  7. Youtube: Understanding and applying Third Normal Form
  8. Youtube: Third Normal Form Example
  9. Student Recommendation Youtube: Database Normalization
  10. Youtube: Normalization - 1NF, 2NF, 3NF and 4NF
  11. Youtube: Basic Concept of Database Normalization
  12. Youtube: 4NF Normalization and Multi-Value dependency explained
  13. Normalization, ER modeling, 1NF, 2NF, 3NF
  14. YouTube: Database Systems - Normalization, 1NF, 2NF, 3NF

Activities[edit | edit source]

  1. Watt: Database Design - Chapter 12 (Normalization) Complete the session exercises.
  2. McFadyen: Relational Databases and Microsoft Access - Chapter 10 (Normalization). Complete the session exercises.
  3. Normalization - Normalization Exercises and Answers.
  4. [2] - Module 44 Normalization

Lesson Summary[edit | edit source]

  • Database normalization is a logical database design technique to structure a relational database in accordance with a series of normal forms in order to reduce data redundancy and improve data integrity.[1]
  • Normalization is a process of using normal forms to avoid logical design error occurrences within a database. Unnormalized database design will display database inconsistencies which will cause errors when inserting, deleting or updating record information.[2]
  • De-normalization is a process that changes relations from higher to lower normal forms and is done to improve the performance of retrieving relations from the database. It also reduces the cost to maintain a database because it requires less joins. [3]
  • Database normalization is a progressive process using different normal forms. A higher level of database normalization cannot be achieved unless the previous levels have been satisfied.[1] Any relation that is in BCNF satisfies 3NF. Any relation in 3NF satisfies 2NF. Any relation in 2NF satisfies 1NF. [3]
  • A fully normalized database allows its structure to be extended to accommodate new types of data with slight changes to the existing structure.[1]
  • The basic normal forms (from least normalized to most normalized) discussed will include: UNF (unnormalized form); 1NF (first normal form), 2NF (second normal form), 3NF (third normal form).[1]

Key Terms[edit | edit source]

alternate key
Alternate keys are all candidate keys not chosen as the primary key.[4]
anomalies
A deviation from a rule or from what is regarded as normal.[5]
Boyce-Codd normal form (BCNF)
Boyce–Codd normal form (or BCNF or 3.5NF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). BCNF was developed in 1975 by Raymond F. Boyce and Edgar F. Codd to address certain types of anomalies not dealt with by 3NF as originally defined. It is typically the goal of any database designer.[6]
candidate keys
A candidate key is a column(s)in a table that can uniquely identify any database record without referring to any other data. A candidate key that is unique is referred to as a primary key.[7]
DBDL
Database Design Language (DBDL) is a written formatting design language used to display tables by identifying the primary key (underlined) and columns. For example: Customer(CustomerID (PK), CustomerFN, CustomerLN, CustomerAddress, CustomerCity, CustomerState, CustomerZip). CustomerID would be underlined in the DBDL design format.[8]
Decomposition
The process of breaking down a relation into smaller relations in an attempt to normalize the database.[9]
Edgar F. Codd
Edgar Frank "Ted" Codd was an English computer scientist who, while working for IBM, invented the relational model for database management, the theoretical basis for relational databases and relational database management systems.[1]
first normal form
First normal form (1NF) sets the very basic rules for an organized database: Eliminate duplicate columns from the same table. Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).[10]
fourth normal form
Fourth normal form (4NF) has one additional requirement: Meet all the requirements of the third normal form. A relation is in 4NF if it is in BCNF and it has a single column key, or the table only has attributes of one multivalued dependency, or the determinant of the multivalued dependency is a superset of or equal to a the key.[10]
functional dependency
A functional dependency (FD) is a relationship between two attributes, typically between the PK and other non-key attributes within a table. For any relation R, attribute Y is functionally dependent on attribute X (usually the PK), if for every valid instance of X, that value of X uniquely determines the value of Y. This relationship is indicated by the representation below: X ———–> Y [11]
Normal Form
Normal forms are used to eliminate or reduce redundancy in database tables.[[3]] Additionally, normal forms are apart of the few rules for database normalization. Each rule is called a "normal form." If the first rule is observed, the database is said to be in "first normal form." If the first three rules are observed, the database is considered to be in "third normal form."[[4]]
Normalization theory
gives us a theoretical basis to judge the quality of a database and helps one understand the impact of some design decisions.[12]
Partial Functional Dependency
The normalization theory where the determinant is a subset of the candidate key.[13]
primary key
A Primary Key is a Column that uniquely identifies a particular Row in a Table.[14]
repeating group
Information that repetitive within a database table.
second normal form
Second normal form (2NF) further addresses the concept of removing duplicate data: Meet all the requirements of the first normal form. Remove subsets of data that apply to multiple rows of a table and place them in separate tables. Create relationships between these new tables and their predecessors through the use of foreign keys.[10]
third normal form
Third normal form (3NF) goes one large step further: Meet all the requirements of the second normal form. Remove columns that are not dependent upon the

primary key.[10]

transitive dependency
Transitive dependency occurred because a non-key attribute was determining another non-key attribute.[15]
unnormalized relation
An unnormalized data model will contain data redundancy, where multiple values and/or complex data structures may be stored within a single field or attribute, or where fields may be replicated within a single table.[16]

Review Questions[edit | edit source]

Enable JavaScript to hide answers.
Click on a question to see the answer.
  1. _______________ can be applied in any database to avoid logical inconsistencies and contain several distinct levels.
    Normalization[10]
  2. ______________________ is a simple database data model lacking the efficiency of database normalization.
    Unnormalized form (UNF) is a simple database data model lacking the efficiency of database normalization.[1]
  3. One way to achieve the _____ normal form would be to separate the duplicities into multiple columns.
    One way to achieve the first normal form would be to separate the duplicities into multiple columns.[1]
  4. In order to conform to 2NF, the table should not contain _______________________.
    In order to conform to 2NF, the table should not contain partial dependencies.[1]
  5. In order to conform to 3NF, the table should not contain _______________________.
    In order to conform to 3NF, the table should not contain transitive dependencies.[1]
  6. There are several distinct levels of normalization which is referred to as _______________.
    There are several distinct levels of normalization which is referred to as normal forms.[10]
  7. ______ normal form ____ further addresses the concept of removing duplicate data and meets all the requirements of the _______ normal form.
    Second normal form further addresses the concept of removing duplicate data and meets all the requirements of the first normal form.

    [10]

  8. A _________________ is another name for a nonkey column.
    A non key attribute is another name for a nonkey column.

    [10]

  9. When you normalize data you are removing the risk of __________ effecting your databases.
    Anomalies
  10. A database designer's objective most of the time is to achieve ____ normalization.
    BCNF
  11. A_________key is the collection of keys for a relation. From all these keys; one is chosen to be the primary key.
    candidate
  12. A ______ _____ is an association between two attributes.
    functional dependency

See Also[edit | edit source]

References[edit | edit source]