Home Employees Database
Post
Cancel

Employees Database

Summary

This database is designed to store information about employees, departments, roles, and salaries. The employee database schema is designed to store comprehensive information about employees, departments, roles, and salaries within an organization.

This document describes the Employees sample database.

The Employees sample database was developed by Patrick Crews and Giuseppe Maxia and provides a combination of a large base of data (approximately 160MB) spread over six separate tables and consisting of 4 million records in total. The structure is compatible with a wide range of storage engine types. Through an included data file, support for partitioned tables is also provided.

In addition to the base data, the Employees database also includes a suite of tests that can be executed across the test data to ensure the integrity of the data that you have loaded. This should help ensure the quality of the data during initial load, and can be used after usage to ensure that no changes have been made to the database during testing.

Data dictionnary

Table nameRecords
employees300024
departments9
dept_manager24
dept_emp331603
titles443308
salaries2844047
  • These tables and their relationships provide a structured representation of employee-related data, allowing for efficient management and retrieval of information within the organization. The schema captures essential details such as personal information, department assignments, job titles, and salary history for each employee.

(1) ’employees’ Table:

  • emp_no: Unique identifier for each employee.
  • birth_date: Date of birth of the employee.
  • first_name: First name of the employee.
  • last_name: Last name of the employee.
  • gender: Gender of the employee (‘M’ for Male, ‘F’ for Female).
  • hire_date: Date when the employee was hired.

(2) ‘departments’ Table:

  • dept_no: Unique identifier for each department.
  • dept_name: Name of the department.

(3) ‘dept_manager’ Table:

  • emp_no: Employee number (foreign key referencing ’employees’ table).
  • dept_no: Department number (foreign key referencing ‘departments’ table).
  • from_date: Start date of the managerial role.
  • to_date: End date of the managerial role.

(4) ‘dept_emp’ Table:

  • emp_no: Employee number (foreign key referencing ’employees’ table).
  • dept_no: Department number (foreign key referencing ‘departments’ table).
  • from_date: Start date of the department assignment.
  • to_date: End date of the department assignment.

(5) ’titles’ Table:

  • emp_no: Employee number (foreign key referencing ’employees’ table).
  • title: Job title of the employee.
  • from_date: Start date of the job title.
  • to_date: End date of the job title (can be NULL for current titles).

(6) ‘salaries’ Table:

  • emp_no: Employee number (foreign key referencing ’employees’ table).
  • salary: Salary amount.
  • from_date: Start date of the salary.
  • to_date: End date of the salary.

Approach

  1. To improve data quality and simplify analysis, I added an explicit is_active column to the dept_emp table. Previously, active employees were inferred from to_date = '9999-01-01' or future dates, which is error-prone and less transparent. The is_active flag is directly populated based on current date comparisons, making queries for active or inactive employees simple, clear, and resilient to changes over time. This approach enhances readability, reduces reliance on arbitrary placeholder dates, and aligns with best practices for robust HR data management. Future updates to employment status can now update this flag explicitly, ensuring consistent and accurate reporting.

  2. item

Purpose

Insights

Recommendations

DISCLAIMER

  • To the best of my knowledge, this data is fabricated and it does not correspond to real people. Any similarity to existing people is purely coincidental. This is coming from the sample SQL directory.

LICENSE

  • This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
This post is licensed under CC BY 4.0 by the author.