assignment 1 building a relational database for efficiency

Pardon Our Interruption

As you were browsing something about your browser made us think you were a bot. There are a few reasons this might happen:

You've disabled JavaScript in your web browser.
You're a power user moving through this website with super-human speed.
You've disabled cookies in your web browser.
A third-party browser plugin, such as Ghostery or NoScript, is preventing JavaScript from running. Additional information is available in this support article .

To regain access, please make sure that cookies and JavaScript are enabled before reloading the page.

DEV Community

Posted on May 6 • Updated on May 8

Mastering Relational Database Design: A Comprehensive Guide

Introduction.

In today's data-driven world, storing and managing information efficiently is crucial for businesses and organizations of all sizes. Relational databases have emerged as a powerful solution for organizing and manipulating data in a structured and scalable manner. In this blog post, we'll explore the fundamentals of relational databases, their management systems, and the principles that underpin effective database design.

What is a Database?

A database is a structured collection of data that is organized and stored in a way that facilitates efficient retrieval, manipulation, and management. Think of it as a digital filing cabinet, where instead of physical folders and documents, you have tables and records neatly organized for easy access.

What is a Relational Database?

A relational database is a type of database that organizes data into tables (relations) with rows (records) and columns (fields). These tables are interconnected through relationships, allowing data to be accessed and combined in various ways. Imagine a collection of spreadsheets, each representing a different aspect of your data, but with the ability to link and combine information across them seamlessly.

A Relational Database Management System (RDBMS) is a software application designed to create, manage, and interact with relational databases. It provides a structured framework for storing, retrieving, and manipulating data within the database. Some popular examples of RDBMS include MySQL, PostgreSQL, Oracle, and Microsoft SQL Server.

Introduction to SQL

SQL (Structured Query Language) is the standard programming language used to interact with relational databases. It allows you to create, read, update, and delete data within the database, as well as define and modify the database structure itself. SQL is like a universal language that enables you to communicate with different RDBMS platforms.

Naming Conventions

In SQL, following consistent naming conventions is crucial for clarity and maintainability. Here's an example:

What is Database Design?

Database design is the process of creating an efficient and organized structure for storing and managing data in a database. It involves defining tables, columns, relationships, and constraints to ensure data integrity, minimize redundancy, and optimize performance. Proper database design is the foundation for building robust and scalable applications.

Data Integrity

Data integrity refers to the accuracy, consistency, and reliability of data stored in a database. It ensures that the data follows specific rules and constraints, preventing errors and inconsistencies. There are three types of data integrity:

Entity Integrity : Ensures each row in a table is uniquely identifiable by a primary key, and the primary key cannot have null values.
Referential Integrity : Maintains relationships between tables by ensuring foreign key values in one table match the primary key values in another table.
Domain Integrity : Enforces valid entries for a given column by restricting the data type, format, and range of values that can be stored.

Database Terms

Table : A collection of related data organized in rows and columns.
Row : A single instance or entry in a table (also known as a record or tuple).
Column : A specific characteristic or property of the data in a table (also known as a field or attribute).
Primary Key : A column or combination of columns that uniquely identifies each row in a table.
Foreign Key : A column or combination of columns that references the primary key of another table, establishing a relationship between the two tables.
Join : An operation that combines rows from two or more tables based on a related column.
Index : A data structure that improves the performance of data retrieval operations by creating a sorted representation of the data in a table.
View : A virtual table that is dynamically generated from one or more underlying tables.
Stored Procedure : A pre-compiled collection of SQL statements that can be executed as a single unit.
Trigger : A special type of stored procedure that is automatically executed when a specific event occurs in a table, such as an INSERT, UPDATE, or DELETE statement.

Atomic Values

In database design, it's important to store atomic values, which means storing the smallest pieces of information that cannot be further divided. This principle helps maintain data integrity and avoid redundancy.

For example, instead of storing a customer's full name in a single column, it's better to separate it into first name and last name columns. This way, you can easily search, sort, or manipulate each part of the name independently.

Introduction to Keys

Keys are essential components in database design that help ensure data integrity and establish relationships between tables. They serve as unique identifiers for records and enable efficient data retrieval and manipulation.

Primary Key Index

A primary key is a column or a combination of columns that uniquely identifies each record in a table. It ensures that each record is unique and can be easily located. Primary keys are typically indexed to improve query performance.

Look up Table

A lookup table, also known as a reference table or a code table, is a table that contains predefined sets of values that can be used to populate columns in other tables. Lookup tables help maintain data integrity by ensuring consistency and reducing redundancy.

Superkey and Candidate Key

A superkey is a set of one or more columns in a table that uniquely identifies each record. A candidate key is a minimal superkey, meaning it contains no unnecessary columns. In other words, a candidate key is a superkey with the minimal number of columns required to uniquely identify each record.

Primary Key and Alternate Key

A primary key is a candidate key chosen as the main unique identifier for a table. An alternate key, also known as a unique key, is any other candidate key that could have been chosen as the primary key but was not.

Surrogate Key and Natural Key

A surrogate key is an artificial key (often a sequential number or a GUID) that is used as the primary key in a table. It has no inherent meaning or relationship to the data itself. A natural key, on the other hand, is a key derived from the data itself, such as an employee ID or a product code.

Should I use Surrogate Keys or Natural Keys?

The choice between surrogate keys and natural keys depends on several factors, including the nature of the data, the likelihood of data changes, and the potential for duplication or conflicts.

Surrogate keys are often preferred because they are:

Immutable: They do not change over time, even if the data changes.
Guaranteed unique: They are generated by the database system, ensuring uniqueness.
Opaque: They do not reveal any information about the data itself, which can be beneficial for security and privacy.

Natural keys, on the other hand, can be advantageous when:

The data has inherent uniqueness, such as product codes or employee IDs.
The data is unlikely to change over time, reducing the risk of conflicts or duplication.
There is a need for human-readable and meaningful identifiers.

Foreign Key

A foreign key is a column or a combination of columns in one table that references the primary key of another table. It establishes a link between the two tables and enforces referential integrity, ensuring that data in the child table is valid and consistent with the data in the parent table.

NOT NULL Foreign Key

In some cases, it might be desirable to have a NOT NULL constraint on a foreign key column, meaning that the column cannot have a null value. This constraint ensures that every record in the child table is associated with a valid record in the parent table.

Foreign Key Constraints

Foreign key constraints define the rules for referential integrity between tables. These constraints can include actions to be taken when a referenced record in the parent table is updated or deleted, such as:

CASCADE : When a record in the parent table is updated or deleted, the corresponding records in the child table are also updated or deleted.
SET NULL : When a record in the parent table is updated or deleted, the corresponding foreign key values in the child table are set to NULL.
NO ACTION : When a record in the parent table is updated or deleted, the corresponding foreign key values in the child table remain unchanged, and the operation is rolled back if it violates referential integrity.

Simple Key, Composite Key, Compound Key

A simple key is a single column used as a primary key or a foreign key.
A composite key is a combination of two or more columns used as a primary key or a foreign key.
A compound key is a combination of two or more simple keys used as a foreign key.

Relationships

Relationships are the cornerstone of relational databases, allowing you to connect and combine data from different tables. There are three main types of relationships:

One-to-One Relationships

In a one-to-one relationship, each record in one table is associated with exactly one record in another table, and vice versa. For example, consider a database where each employee has one and only one manager, and each manager manages one and only one employee. This type of relationship is relatively rare in practice.

One-to-Many Relationships

In a one-to-many relationship, each record in one table (the "one" side) can be associated with multiple records in another table (the "many" side). For example, in a database for a school, one teacher can teach multiple classes, but each class is taught by only one teacher.

Many-to-Many Relationships

In a many-to-many relationship, each record in one table can be associated with multiple records in another table, and vice versa. For example, in a database for a university, a student can enroll in multiple courses, and each course can have multiple students enrolled.

Summary of Relationships

One-to-One: One record in Table A is related to one and only one record in Table B, and vice versa.
One-to-Many: One record in Table A can be related to multiple records in Table B, but one record in Table B can be related to only one record in Table A.
Many-to-Many: Multiple records in Table A can be related to multiple records in Table B, and vice versa.

Designing Relationships (SQL Implementation)

1 - Designing One-to-One Relationships

To design a one-to-one relationship, you can either include all the columns from both tables in a single table or create two separate tables and use a foreign key constraint to link them.

2 - Designing One-to-Many Relationships

To design a one-to-many relationship, you typically create two tables: a parent table (the "one" side) and a child table (the "many" side). The child table includes a foreign key column that references the primary key of the parent table.

3 - Designing Many-to-Many Relationships

To design a many-to-many relationship, you typically create a third table (called a junction table or associative table) that links the two main tables together. This junction table includes foreign key columns that reference the primary

Parent Tables and Child Tables

In a one-to-many or many-to-many relationship, the table on the "one" side is often referred to as the parent table, while the table on the "many" side is called the child table. The child table contains a foreign key that references the primary key of the parent table.

For example, in the teacher-class relationship, the teachers table is the parent table, and the classes table is the child table. Similarly, in the student-course relationship, the students and courses tables are parent tables, while the enrollm

Introduction to Entity Relationship Modeling

Entity Relationship Modeling (ER Modeling) is a technique used in database design to represent the logical structure of a database visually. It helps identify the entities (tables), attributes (columns), and relationships between them, making it easier to understand and communicate the database design.

ER diagrams consist of the following components:

Entities : Represented by rectangles, entities are the tables or objects in the database.
Attributes : Listed inside the entity rectangle, attributes are the columns or fields that describe the entity.
Relationships : Represented by lines connecting entities, relationships depict the associations between entities.

Cardinality

Cardinality defines the numerical relationship between two entities. It specifies the maximum number of instances of one entity that can be associated with a single instance of another entity. The most common cardinalities are:

One-to-One (1:1): One instance of Entity A can be associated with at most one instance of Entity B, and vice versa.
One-to-Many (1:N): One instance of Entity A can be associated with multiple instances of Entity B, but one instance of Entity B can be associated with only one instance of Entity A.
Many-to-Many (M:N): Multiple instances of Entity A can be associated with multiple instances of Entity B, and vice versa.

In ER diagrams, cardinality is represented using specific notation, such as a single line for one-to-one, a line with an arrowhead for one-to-many, and a line with arrowheads at both ends for many-to-many relationships.

Modality refers to whether the existence of an entity instance depends on its relationship with another entity. There are two types of modality:

Partial modality: The existence of an instance does not depend on its relationship with another entity. For example, a customer can exist without having any orders.
Total modality: The existence of an instance depends on its relationship with another entity. For example, an order item cannot exist without an order.

In ER diagrams, modality is represented using specific notation, such as a single bar for partial modality and a double bar for total modality.

Introduction to Database Normalization

Database normalization is the process of organizing data in a database to reduce redundancy, minimize data anomalies (insertion, update, and deletion anomalies), and improve data integrity. It involves breaking down a database into smaller tables and defining relationships between them based on specific rules or normal forms.

The primary goals of database normalization are:

Eliminating redundant data
Ensuring data integrity
Facilitating data manipulation and maintenance

There are several normal forms in database normalization, each building upon the previous one. The most commonly used normal forms are:

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)

1NF (First Normal Form of Database Normalization)

The First Normal Form (1NF) is the most basic form of normalization. It states that an attribute (column) in a table must have atomic values, meaning each cell in the table should contain a single value, not a set of values.

For example, consider a table with a column named "PhoneNumbers" that stores multiple phone numbers for a customer. This violates 1NF because the column contains a set of values instead of a single value. To conform to 1NF, you would need to separate the phone numbers into individual columns or create a separate table for phone numbers.

2NF (Second Normal Form of Database Normalization)

The Second Normal Form (2NF) builds upon 1NF by addressing the issue of partial dependencies. A table is in 2NF if it is in 1NF and every non-prime attribute (column) is fully dependent on the entire primary key.

In other words, if a table has a composite primary key (consisting of multiple columns), then all non-key columns must depend on the entire primary key, not just a part of it.

For example, consider a table with a composite primary key of (student_id, course_id) and a column grade . If the grade column depends only on the course_id and not on the combination of student_id and course_id , then the table violates 2NF.

3NF (Third Normal Form of Database Normalization)

The Third Normal Form (3NF) builds upon 2NF by addressing the issue of transitive dependencies. A table is in 3NF if it is in 2NF and every non-prime attribute is non-transitively dependent on the primary key.

In other words, if a non-key column is dependent on another non-key column, then the table violates 3NF, and the non-key columns should be separated into their own table.

For example, consider a table with columns student_id , student_name , class_id , and class_name . The student_name column depends on the student_id , and the class_name column depends on the class_id . However, the class_name column also transitively depends on the student_id through the class_id column. This violates 3NF.

By following the principles of database normalization, you can create well-structured and efficient databases that minimize redundancy, maintain data integrity, and facilitate data manipulation and maintenance.

Indexes (Clustered, Nonclustered, Composite Index)

Indexes are data structures that improve the performance of data retrieval operations in a database. They create a sorted representation of the data in a table, allowing for faster searches and queries. There are several types of indexes:

Clustered Index : A clustered index physically reorders the rows in a table based on the index key values. Each table can have only one clustered index.
Nonclustered Index : A nonclustered index is a separate object that contains the index key values and pointers to the corresponding rows in the table. A table can have multiple nonclustered indexes.
Composite Index : A composite index is an index that includes multiple columns in the index key. It can be either clustered or nonclustered.

In the world of databases, data types are like different shapes of containers that hold specific types of information. Just as you wouldn't store liquids in a basket or solid objects in a jar, databases need to enforce specific data types to ensure data integrity and consistency.

Some common data types in SQL include:

INT or INTEGER : Stores whole numbers, like 42 or 17 .
FLOAT or DOUBLE : Stores decimal numbers, like 3.14159 or 0.00005 .
VARCHAR or TEXT : Stores text data, like names or descriptions.
DATE or DATETIME : Stores date and time values, like '2023-05-06' or '2024-01-01 12:34:56' .
BOOLEAN : Stores true/false values, like 1 (true) or 0 (false).

Choosing the right data type is crucial because it affects how the data is stored, queried, and manipulated. For example, trying to store a large string in an INT column would result in an error or data truncation.

In this example, we create a users table with columns for id (integer), name (string of up to 50 characters), age (integer), and is_active (boolean, with a default value of 1 or true).

Introduction to Joins

Joins are like bridges that connect different tables in a database, allowing you to combine and retrieve related data from multiple sources. They are a fundamental concept in relational databases and are essential for querying and manipulating data efficiently.

An inner join is like a friendly handshake between two tables, where only the rows that have matching values in both tables are included in the result set. It's a way to combine data from multiple tables based on a common column or set of columns.

In this example, we retrieve the name column from the users table and the order_date column from the orders table, but only for rows where the id in the users table matches the user_id in the orders table.

Inner Join on 3 Tables (Example)

Let's say we have three tables: users , orders , and products . We want to retrieve the user's name, the order date, and the product name for each order. We can achieve this by performing an inner join across all three tables:

Here, we first join the users and orders tables on the id and user_id columns. Then, we join the result of that join with the products table on the product_id and id columns. This way, we can retrieve data from all three tables in a single query, but only for rows where the join conditions are met.

Introduction to Outer Joins

While inner joins are like friendly handshakes, outer joins are more like welcoming hugs. They include not only the matching rows from both tables but also the non-matching rows from one or both tables, depending on the type of outer join.

Right Outer Join

A right outer join is like a warm hug from the right table to the left table. It includes all rows from the right table, along with the matching rows from the left table. If there are no matching rows in the left table, the result will contain NULL values for the left table's columns.

In this example, we retrieve all rows from the orders table (the right table), along with the matching name values from the users table (the left table). If an order doesn't have a matching user, the name column will contain NULL .

JOIN with NOT NULL Columns

Sometimes, you may want to perform a join only on columns that are not null. This can be useful when you want to exclude rows with missing data from the result set.

In this example, we perform an inner join between the users and orders tables, but we add an additional condition users.name IS NOT NULL to ensure that only rows with non-null name values are included in the result set.

Outer Join Across 3 Tables

Similar to the inner join example, we can perform outer joins across multiple tables. Let's say we want to retrieve all orders, along with the user's name and the product name, even if there are missing values in the users or products tables.

Here, we start with the orders table and perform a left outer join with both the users and products tables. This ensures that all orders are included in the result set, along with the matching user names and product names if available. If there are no matching rows in the users or products tables, the respective columns will contain NULL values.

Aliases are like nicknames for tables or columns in SQL queries. They can make queries more readable and easier to understand, especially when dealing with long table or column names, or when referencing the same table multiple times in a query.

In this example, we use the aliases u for the users table, o for the orders table, and p for the products table. This makes the query more concise and easier to read, without having to repeat the full table names multiple times.

A self join is like a table having a conversation with itself. It's a way to join a table with itself, based on a specific condition or relationship within the same table. This can be useful when dealing with hierarchical or recursive data structures, such as employee-manager relationships or nested categories.

In this example, we perform a self join on the employees table to retrieve the name of each employee and their corresponding manager's name. We use a left outer join to ensure that all employees are included in the result set, even if they don't have a manager assigned.

Database Markup Language (DBML)

Database Modeling Language (DBML) is a simple and intuitive markup language for describing the structure of relational databases. It provides a human-readable way to define tables, columns, relationships, and constraints, making it easy to communicate and collaborate on database designs.

Getting Started with DBML

To get started with DBML, you'll need a text editor and a basic understanding of database concepts. Let's create our first DBML file:

In this example, we've defined a users table with columns for id , username , email , and created_at . The [pk] tag specifies that id is the primary key, [increment] indicates auto-incrementing, [unique] ensures uniqueness for the email , and [default: now() ] sets the default value of created_at to the current timestamp.

Creating Tables

DBML allows you to define multiple tables and their columns in a single file. Let's add more tables to our database:

In this example, we've added a posts table with columns for id , title , content , user_id , and created_at . The [ref: > users.id] tag establishes a foreign key relationship between the user_id column in the posts table and the id column in the users table.

Defining Relationships

DBML supports various types of relationships between tables, including one-to-one, one-to-many, and many-to-many. Let's define some relationships in our database:

In this example, we've defined a one-to-many relationship between the users and posts tables. The Ref: users.id < posts.user_id line specifies that the id column in the users table is referenced by the user_id column in the posts table.

Adding Constraints

Constraints ensure data integrity and enforce rules on the database. DBML supports various constraints such as primary keys, foreign keys, unique constraints, and default values. Let's add some constraints to our tables:

In this updated example, we've added a [unique] constraint to the username column in the users table to ensure that each username is unique.

Documenting Your Database

DBML allows you to add comments and annotations to your database schema, making it easier to understand and maintain. Let's document our tables with comments:

Benefits of DBML

Simple and human-readable syntax
Database-agnostic approach
Free visualization tool at dbdiagram.io
Consistent conventions for readability and maintainability
Extensive documentation and examples

In wrapping up, relational database design forms the backbone of efficient data organization in modern applications. Understanding its principles empowers you to create robust and scalable databases. Next up, we'll delve into SQL syntax, unlocking the power to interact with databases effectively. Stay tuned for our exploration into the world of SQL!

Top comments (39)

Templates let you quickly answer FAQs or store snippets for re-use.

Location Minneapolis, MN
Education B.A. English (Creative Writing), Minnesota State University
Pronouns he/him/dude
Work Solutions Architect at Wings Financial Credit Union
Joined Sep 30, 2021

Wow. This might be the best post I've seen on here re: RDBMS. Seriously. Very comprehensive, well-written, and succinct at the same time. Kudos!

Education Web Development & Infographics
Work Back-end Developer
Joined Apr 5, 2024

Thank you Dan, Really appreciate that

Location Germany
Work Numerical simulation specialist, IoT developer
Joined May 24, 2021

Great post, thank you!

It's worth to mention that - depending on your data - other types of db could be used:

No-SQL (schema-free) databases could be beneficial, if structures are dynamically changing or if structures are still under development. As structures can be introduced with the data, they can handle even unknown structures as long as data are provided in an appropriate format
TSDB (time series databases) like influxDB store time series data very efficiently. And they provide methods to aggregate or interpolate large time series very fast. You can do the job using an RDBMS, but operations will probably be much less efficient.

RDBMS can be very efficient, but setting up a data model can be challenging - especially if you are new to this topic. So, it's always good to know the different options and to select right one.

Email [email protected]
Location Somerset, UK
Education BSc (Hons) Computer Science
Work Software Engineer specialising in web technologies, frontend and full stack (Node & xAMPP)
Joined Jul 16, 2017

I don't think that a no-sql or document database is absolutely schema-free. Having a schema helps maintain consistency in the data structure and quality (validation).

Surely not, and finally you need to know the structure to work with your data. But because the Schema is driven by the data, it can be easily expanded. This makes validation much harder and in the worst case you fill your database with lot´s of useless data. So, you will need to do some schema-exploration and garbage collection working with the database.

As a main difference, the schema in an RDB is maintained by the database, while in a no-SQL db it is maintained by the supplier and consumer of data. The database is relative agnostic about the data it stores, which is a real paradigm shift. But it depends much on your task, it can be an advantage or a disadvantage.

Joined Mar 26, 2024

This is well done - compact and comprehensive! Nice things to possibly add:

Table Data Population - initial
Table Data Population - incremental (maintaining/refreshing)
Views Thank you!

Very nice suggestions, for sure I'll add them ASAP, Thank you very much Dan

Joined May 8, 2024

Very clear and well explained 👏 In the 2NF, it was not the course_name column violating the rule ?

Absolutely true, the coure_name is the subject matter not the grade , Corrected

Thank you very much flosej

Location Boumerdes , Algeria
Joined Apr 7, 2024

that's more than what I can understand in one lecture (or 10🙂) , that was pretty direct and clear thank u for this explanation (You could have explained this all at class ; We wouldn't have died of boredom from BDD's teacher explanaition )

Thank you my biggest supporter ✨

Joined May 12, 2024

Man i like your explaining method the information is stored directly in my database

Joined May 9, 2024

Nicely done. A great refresher or solid beginning knowledge.

As always your articles are so pretty -nice organization -variety of informations Congratulations 🎉🎉

Joined May 3, 2024

Just Wow!!! That truly very well structured and in depth guide, thanks!

Thank you I really appreciate that

Joined Jan 10, 2021

Great article for a revision you nailed it. You can look forward to entity sub type and super type in future.

This is very precise, simple. Good for an expert and for a total beginner.

Joined Dec 20, 2018

Thank you for providing such valuable content.

Joined Mar 4, 2024

Thank you for great post. Can you please tell me when should I need to normalize in my DB.

Well, applying normalization on ur database schema design is tropically done, among the latest steps you do in the process, where after u define the entities, primary keys, foreign keys, indexes and refine the structure, you start applying the normalization rules up from 1NF to 2NF to 3NF.

Hope that clarified things a little.

Email [email protected]
Location Bangladesh
Education B.sc in CSE
Work Software Engineer
Joined Oct 19, 2020

Great post!

Some comments may only be visible to logged-in visitors. Sign in to view all comments.

Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment's permalink .

Hide child comments as well

For further actions, you may consider blocking this person and/or reporting abuse

JSW ONE Platform SDE-1 Interview Experiance (2022)

shashi - Jul 24

Recursive Hard Problems Using Javascript

Ashutosh Sarangi - Aug 16

What's new in Frontend AI?

OpenSource - Aug 6

How to Redirect HTTP to HTTPS Using Free WAF

MotorBuy6 - Aug 16

We're a place where coders share, stay up-to-date and grow their careers.

Database schema design 101 for relational databases

This database schema design guide walks you through walk through the basics of creating and designing schemas for relational databases.

Getting started with a relational database can seem like a daunting task. Whether you’re coming from a NoSQL database or you’ve never used a database before, I’m going to talk you through designing a relational database and am hoping to answer the following questions:

What is a relational database?

How are relationships made in the database?
What are the steps to take to ensure an efficient database?

A relational database is one way to store data related to each other in a pre-defined way. By pre-defined, we mean that at the time of the creation of the database, you can identify the relationships that exist between different entities or groups of data. Relational databases are great for storing structured data that should model the relationship between real-life entities.

The anatomy of a relational database:

Tables: Data representing an entity organized into columns in rows.
Properties: Attributes that you want to store about an entity.
Relationships: The relationships between tables.
Indexes: Useful for connecting tables and making quick look-ups.

A relational database is made up of two or more tables with a variable number of rows and columns. Tables are unique to the entity they represent. Each column represents one specific property associated with each row in the table, and the rows are the actual records stored in that table. To illustrate the magic of a relational database, we’ll be designing a database for a retailer that wants to manage their products, customers, orders, and employees.

Design a database for a new retailer in town. This retailer really cares about customer relationships and wants to reward customers who meet a spending goal and gift these top customers on the 1 year anniversary of their first purchase. This retailer needs a way to organize products by price and category to make smart recommendations to their customers based on their age. This retailer also wants to track the best-performing employees to reward those with the highest sales with a raise at the end of the year.

Designing the database schema

The schema is the structure that we define for our data. The schema defines the tables, relationships between tables, fields, and indexes.

The schema will also have a significant impact on the performance of our database. By dedicating time to the schema design, we will save ourselves a headache in the future. One tool that will help us design our schema is an ERD, entity-relationship diagram. We’ll use Lucidchart to build out our ERD, and you can sign up for free. This diagram will allow us to visualize our entities and their relationships.

Here are the major to-dos when designing our schema we will cover in this post:

Understand business needs
Identify entities
Identify properties/fields on those entities
Define relationships between tables

Step 1: Understand business needs

The first step in designing a relational database schema is to understand the needs of the business. This will help us determine what type of information we should be storing. For example, if we are working with a retailer that wants to offer an anniversary gift for clients on their first anniversary, we would have to store the date a customer joins.

A recap of the requirements for our customers:

Store customer spending to-date
Store customer anniversary date of first purchase
Store customer’s age
Store employee sales total in dollar amount
Store products and include a category and price property

Step 2: Define entities, aka tables

4 rectangles titled with the names of the entities we will be working with.

Once that is clear, the next step is defining the entities we want to store data about. These entities will also be our tables. Following the retailer example, our entities should be:

This could extend to add more entities like stores if there are multiple storefront locations, manufacturers, etc., depending on the needs of the business. For this blog post, we’ll just be working with the four entities we defined above to meet the needs of our fictitious client. We can represent an entity in our ERD with a rectangle and the table/entity name at the top.

Step 3: Define properties, aka fields

An diagram showing the properties associated with each table and their types

Once we’ve identified our entities, we should define what fields we want to store about these entities. One important thing to keep in mind is that each table, or entity, should have one unique, identifying property. This unique value is known as the primary key, and this helps us differentiate records from each other. For example, if we have two customers with the same name or same birthdate, we would have to spend some time figuring out which customer is the one we intend to work with.

Two common ways to come up with a primary key:

Programmatically generate a unique value
Assign an integer that automatically increases with each new entry

All of these are straightforward and were taken directly from the specs that the business gave us. For example, the business wants to know which customer made the purchase, which employee made the sale, and which products were in the order. In the Orders table, you will noticed that we reference a customerID , employeeID , and productID to meet those needs.

Step 4: Define relationships

A diagram showing the relationships between tables, illustrated by color-coded lines connecting primary and foreign keys.

Once we’ve defined our entities and their properties, we can think about how these tables relate to each other. The cornerstone of relational databases is that tables are often related. A parent table will have a unique primary key column, and a child table will have its own primary key and then a parent_id column that references the parent table. We have already inadvertently done this when we defined the properties in the preceding step. For example, the customers table has a customerID , which is the primary key. In the Orders table, we set an orderID as the primary key and reference the customerID to denote which customer made the order. Similarly, we also have a column referencing the Employees table, employeeID , to denote which employee made the sale.

When a primary key appears in another table, that field is called a foreign key in that table. The relationship between primary keys and foreign keys creates the relationship between tables.

You’ve done it

We’ve covered the main steps to take when designing your database schema: understand the business needs, define entities, define properties, and define relationships. Designing your database schema can be scary because with traditional relational databases, some schema changes can bring your whole application down and cause you to lose data. With PlanetScale’s branching feature , you can branch your schema like your code. Test your schema changes in an isolated environment, and once you are happy with your new schema, you can merge your changed branch into your main production branch without experiencing any downtime or data loss. Sign up for a PlanetScale account to get started.

Increasing PlanetScale plan limits to include billions of reads

IMAGES

Assignment 1.docx
Assignment 1.docx
Database Schema Design 101 For Relational Databases, 45% OFF
Solved 3. Relational Database Project In this assignment,
Assignment 4.docx
Design and Implementation of a Relational Database: A Practical

COMMENTS

CMIS351 Assignment No. 1 - CMIS 351: Management ... - Studocu
The purpose of this report is to demonstrate how Solomon Enterprises can use Microsoft Access as a relational database that can help with decision making moving forward. Within this report, we will explain the difficulties of data integrity issues within a spreadsheet.
Building a Relational Database for Efficiency - Studocu
Assignment 1: Building a Relational Database for Efficiency. Elizabeth Davidson. Student # 3492267. CMIS 351: Management Information System. Athabasca University.
CMIS 351 Assignment 1 instructions - Assignment 1: Building a ...
Assignment 1: Building a Relational Database for Efficiency. Assignment 1 is worth 10 percent of your final grade for this course. You should plan to complete and submit this assignment after you have completed Lesson 4. This assignment is to be completed individually, not as a member of a group or team.
CMIS351assign1 (docx) - CliffsNotes
By simply learning how to use Access and converting that data into a relational database, we can organize, sort, and link the data together in meaningful and useful ways.
CMIS 351 - Assignment 1.docx - CMIS351 Assignment 1...
Assignment 1: Building a Relational Database for Efficiency CMIS 351 Management Information System Assignment 1 By: Arozo Azma Introduction This analytical report will concentrate on developing an effective database to solve some of the challenges Soloman Enterprises is
Building a Relational Database: Efficiency & Design ...
Assignment 1: Building a Relational Database for Efficiency Assignment 1 is worth 10 percent of your final grade for this course. You should plan to complete and submit this assignment after you have completed Lesson 4.
Designing a Relational Database and Creating an Entity ...
In this article we have introduced the basic ideas of what a relational database is and how it works, discussed some of the different RDBMS packages available, and gone through the whole process of creating an Entity Relationship Diagram to describe the database we want to build based on the requirements document.
Assignment 1.docx - Assignment 1: Building a Relational ...
As an analyst/designer, you will review the current Excel file, and then design and build an Access database that will be a more efficient and effective method of handling the data.
Mastering Relational Database Design: A Comprehensive Guide
Introduction. In today's data-driven world, storing and managing information efficiently is crucial for businesses and organizations of all sizes. Relational databases have emerged as a powerful solution for organizing and manipulating data in a structured and scalable manner.
Database schema design 101 for relational databases - PlanetScale
What are the steps to take to ensure an efficient database? What is a relational database? A relational database is one way to store data related to each other in a pre-defined way.