The rules around making sure that a database design is efficient are referred to as normalization. In this post, we study Database design and normalization techniques in DBMS.
Database design and normalization techniques in DBMS
- Normalization ensures that there is less redundancy and less inconsistency in the data.
- A poorly designed database might result in data being stored twice because the same attributes exist in different tables, or because different records include the same data (for example, storing the address of a customer every time they make a purchase).
- A poorly designed database might result in erroneous data because duplicate data has been incorrectly updated or deleted (for example, if the address for a customer is stored every time they make a purchase but needs to be changed then it is possible to miss one or more records with the old address.
- A fully normalized database is atomic. Each table, group, or attribute, cannot be broken down any further and so there are no repeating groups of attributes.
a ‘name’ attribute could be broken down into ‘first name’ and ‘last name’
an address attribute could be broken down into ‘house number’, ‘street’, ‘town’, ‘county’, and ‘postcode’.
- A fully normalized database has no partial dependencies. A partial dependency is when an attribute depends on part of a composite key but not the whole composite key (note, this point is only relevant to tables with composite keys).
For example, a table for an appointment with a doctor uses a composite key:
appointment(date, time, doctorName, patientID, notes, doctorSpecialism)
The doctorSpecialism attribute depends on the doctorName, but not on the rest of the attributes in the composite primary key.
The solution is to create a separate doctor entity:
doctor(doctorID, doctorName, doctorSpecialism)
appointment(date, time, doctorID, patientID, notes)
- A fully normalized database has no non-key dependencies. The value of each attribute must only depend on the key. If the value of an attribute can be determined using any other attribute, then it is not independent. If this is the case, then the table normally needs splitting into two further linked tables.
For example, an address entity is designed as follows:
address(addressID, houseNumber, street, town, county, postcode)
Two neighbors have different house numbers, but the same street, town, county, and postcode. The attributes for street, town, and county can be determined from the postcode as well as the primary key addressID, as everyone on that street has the same postcode.
To resolve this, it can then be split into two tables:
address(addressID, houseNumber, postcode)
postcode(postcode, street, town, county)
definitions (on normalization topic)
Normalization – Structuring a database in a way designed to reduce data redundancy and improve integrity.
Redundancy – Not needed, for example, data that is unnecessarily stored twice.
Inconsistency – The result is not always the same.
Third normal form – A database in which all data is atomic, there are no partial dependencies and no nonkey dependencies.