Sunday, January 20, 2019

Database Normalization

What is Normalization?
Normalzation is a database design technique that reduces the data redundancy and dependency.
Let's start to see how we do normalization in sample data of football player information.
We will see how we procced from Un-normalized Form to First, Second, Third Normal Form step-by step.

Initially, our data is in un-normalized form. It contains composite or multiple values in an attribute (Clubs).
Player Name Clubs Agent Position
Ronaldo Juventus, Real Madrid, Manchester United, Sporting Lisbon Mandez Forward
Rooney DC United, Manchester United, Everton David Forward
Nani Manchester United, Sporting Lisbon Mandez Winger



1st Normal Form
Each column should contain a single value and each record have to be unique.
Player Name Club Agent Position
Ronaldo Juventus Mandez Forward
Ronaldo Real Madrid Mandez Forward
Ronaldo Manchester United Mandez Forward
Ronaldo Sporting Lisbon Mandez Forward
Rooney DC United David Forward
Rooney Manchester United David Forward
Rooney Everton David Forward
Nani Manchester United Mandez Winger
Nani Sporting Lisbon Mandez Winger



2nd Normal Form
Now, we are in 1NF. To make it 2NF, we need to define primary key.
Player Id Player Name Agent Position
1 Ronaldo Mandez Forward
2 Rooney David Forward
3 Nani Mandez Winger

Player Id Club
1 Juventus
1 Real Madrid
1 Manchester United
1 Sporting Lisbon
2 DC United
2 Manchester United
2 Everton
3 Manchester United
3 Sporting Lisbon



3rd Normal Form
There are data anomalies and inconsistencies in current design, for example:
+ If we deleted the "Ronaldo" and "Nani", we would inevitably delete the agent "Mandez" completely from the database.
+ We cannot add a new agent to the database unless we also add a player.
+ If "Mandez" retired from agent job, we would have to change it in all records in which he appears.
So, we will remove this transitive functional dependency in 3NF. Agents will be moved to another table and its id will be used as foreign key in Player table.
Player Id Player Name Agent Id Position
1 Ronaldo 1 Forward
2 Rooney 2 Forward
3 Nani 1 Winger

Agent Id Name
1 Mandez
2 David
Now, we have "Players", "Agents", "Players_Clubs" tables.
We can optimized by moving Clubs into "Clubs" table and linking its id as foreign key in "Players_Clubs" table.

Saturday, January 12, 2019

How can we dynamically generate pivot table in MySQL?

Pivot table is an important concept we are using in many applications. Technically, it generate the data values into dynamic columns. One of the examples is attendance information. We store the person name, date and present (boolean that determined if that person is present or absent on this date).

This is how data is stored is database. This is what we want to see in report.
id student date present
1 John 2019-01-07 1
2 David 2019-01-07 1
3 Larry 2019-01-07 1
4 Wesley 2019-01-07 0
5 Amy 2019-01-07 1
6 John 2019-01-08 1
7 David 2019-01-08 0
8 Larry 2019-01-08 1
9 Wesley 2019-01-08 1
10 Amy 2019-01-08 1
11 John 2019-01-09 0
12 David 2019-01-09 1
13 Larry 2019-01-09 1
14 Wesley 2019-01-09 1
15 Amy 2019-01-09 1
16 John 2019-01-10 1
17 David 2019-01-10 1
18 Larry 2019-01-10 0
19 Wesley 2019-01-10 1
20 Amy 2019-01-10 1
21 John 2019-01-11 1
22 David 2019-01-11 1
23 Larry 2019-01-11 0
24 Wesley 2019-01-11 1
25 Amy 2019-01-11 0
date John David Larry Wesley Amy
2019-01-07 1 1 1 0 1
2019-01-08 1 0 1 1 1
2019-01-09 0 1 1 1 1
2019-01-10 1 1 0 1 1
2019-01-11 1 1 0 1 0

SQL Code
SET @dynamic_columns = NULL;
select group_concat(student_col) from (
select DISTINCT CONCAT('MIN(IF(student = ''', student, ''', present, NULL)) AS ', student) as student_col from 
attendance
) column_list INTO @dynamic_columns;
SET @sql = CONCAT('SELECT date, ', @dynamic_columns, ' FROM attendance GROUP BY date');
PREPARE qry FROM @sql;
EXECUTE qry;
DEALLOCATE PREPARE qry;

Retrieve all the employees under a manager from a self-referencing table in MySQL

In previous post, we discussed how we can manage to store the hierarchical data in MySQL using self-referencing relationship.
Now, we will see how we can retrieve the data back.
Here, we have manager-employees relationship. One manager has other employees (maybe manager or non-manager) working under him. We will retrieve the whole list of employees under a manager.

SQL Code
SET @id := '2';
SELECT  id, name, managerid, @id
FROM (SELECT * FROM employee ORDER BY managerid, name) E 
WHERE FIND_IN_SET(managerid, @id) > 0
AND @id := CONCAT(@id, ',', id);

Let's assume the managerid to search is "2".
We first need to sort employee table (derived table E) by managerid to make sure that records are in correct order. (This is required as we iterate the record from beginning and add to manager list if conditions matched).
FIND_IN_SET returns the position of a string within a list of strings. (Greater than zero means this employee is under him).
If current record work under this manager(s), CONCAT the id of current record into manager list. (He may have other employees under him).

Thursday, January 10, 2019

How can we store self-referencing entities in MySQL?

Self-referencing entities are one of the database designs we are facing in many applications.
An employee will have direct reporting manager who is also an employee.
Each person has mother who also has her own mother and so on.
We can efficiently store those type of hierarchical data in RDBMS.
This type of relationship is called as recursive association which connect to the same class type.
Well, let's have some hands-on work now.

Following SQL code will create the employee table with self-referencing key.
CREATE TABLE `employee` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `managerid` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index2` (`managerid`),
  CONSTRAINT `fk` FOREIGN KEY (`managerid`) REFERENCES `employee` (`id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

After creating the table, we will enter the dummy data rows to test.
INSERT INTO employee(name) values('CEO');
INSERT INTO employee(name, managerid) values('Mgr1', 1);
INSERT INTO employee(name, managerid) values('Mgr2', 1);
INSERT INTO employee(name, managerid) values('TL1', 2);
INSERT INTO employee(name, managerid) values('TL2', 3);

Now, we can easily retrieve the values from our self-referencing table using following SQL join query.
SELECT EMP.id, EMP.name AS "Employee", MGR.name AS "Manager"
FROM employee EMP LEFT OUTER JOIN employee MGR
ON EMP.managerID = MGR.id
ORDER BY EMP.managerid, EMP.name

Output
Id Employee Manager
1 CEO (null)
2 Mgr1 CEO
3 Mgr2 CEO
4 TL1 Mgr1
5 TL2 Mgr2

Image compression using C#

Sometimes, we need to compress the image files while maintaining the image quality. This can be achieved by the following C# implementation....