Summary
This database is designed to store information about employees, departments, roles, and salaries. The employee database schema is designed to store comprehensive information about employees, departments, roles, and salaries within an organization.
This document describes the Employees sample database.
The Employees sample database was developed by Patrick Crews and Giuseppe Maxia and provides a combination of a large base of data (approximately 160MB) spread over six separate tables and consisting of 4 million records in total. The structure is compatible with a wide range of storage engine types. Through an included data file, support for partitioned tables is also provided.
In addition to the base data, the Employees database also includes a suite of tests that can be executed across the test data to ensure the integrity of the data that you have loaded. This should help ensure the quality of the data during initial load, and can be used after usage to ensure that no changes have been made to the database during testing.
Data dictionnary
Table name | Records |
---|---|
employees | 300024 |
departments | 9 |
dept_manager | 24 |
dept_emp | 331603 |
titles | 443308 |
salaries | 2844047 |
- These tables and their relationships provide a structured representation of employee-related data, allowing for efficient management and retrieval of information within the organization. The schema captures essential details such as personal information, department assignments, job titles, and salary history for each employee.
(1) ’employees’ Table:
- emp_no: Unique identifier for each employee.
- birth_date: Date of birth of the employee.
- first_name: First name of the employee.
- last_name: Last name of the employee.
- gender: Gender of the employee (‘M’ for Male, ‘F’ for Female).
- hire_date: Date when the employee was hired.
(2) ‘departments’ Table:
- dept_no: Unique identifier for each department.
- dept_name: Name of the department.
(3) ‘dept_manager’ Table:
- emp_no: Employee number (foreign key referencing ’employees’ table).
- dept_no: Department number (foreign key referencing ‘departments’ table).
- from_date: Start date of the managerial role.
- to_date: End date of the managerial role.
(4) ‘dept_emp’ Table:
- emp_no: Employee number (foreign key referencing ’employees’ table).
- dept_no: Department number (foreign key referencing ‘departments’ table).
- from_date: Start date of the department assignment.
- to_date: End date of the department assignment.
(5) ’titles’ Table:
- emp_no: Employee number (foreign key referencing ’employees’ table).
- title: Job title of the employee.
- from_date: Start date of the job title.
- to_date: End date of the job title (can be NULL for current titles).
(6) ‘salaries’ Table:
- emp_no: Employee number (foreign key referencing ’employees’ table).
- salary: Salary amount.
- from_date: Start date of the salary.
- to_date: End date of the salary.
Approach
To improve data quality and simplify analysis, I added an explicit
is_active
column to thedept_emp
table. Previously, active employees were inferred fromto_date = '9999-01-01'
or future dates, which is error-prone and less transparent. Theis_active
flag is directly populated based on current date comparisons, making queries for active or inactive employees simple, clear, and resilient to changes over time. This approach enhances readability, reduces reliance on arbitrary placeholder dates, and aligns with best practices for robust HR data management. Future updates to employment status can now update this flag explicitly, ensuring consistent and accurate reporting.item
Purpose
Insights
Recommendations
DISCLAIMER
- To the best of my knowledge, this data is fabricated and it does not correspond to real people. Any similarity to existing people is purely coincidental. This is coming from the sample SQL directory.
LICENSE
- This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.