Employees HR Analysis

2024-06-19 07:54:08

DESCRIPTION

This dataset contains 100,000 rows of data capturing key aspects of employee performance, productivity, and demographics in a corporate environment. It includes details related to each employee’s job, work habits, education, performance, and satisfaction. The dataset is designed for purposes such as HR analytics, employee churn prediction, productivity analysis, and performance evaluation. I started with an analysis in MySQL and then moved on to predicting employee churn with Python.

TABLES

| Column | Description |
| --- | --- |
| Employee_ID | Unique identifier for each employee. |
| Department | The department in which the employee works (e.g., Sales, HR, IT). |
| Gender | Gender of the employee (Male, Female, Other). |
| Age | Employee’s age (between 22 and 60). |
| Job_Title | The role held by the employee (e.g., Manager, Analyst, Developer). |
| Hire_Date | The date the employee was hired. |
| Years_At_Company | The number of years the employee has been working for the company. |
| Education_Level | Highest educational qualification (High School, Bachelor, Master, PhD). |
| Performance_Score | Employee’s performance rating (1 to 5 scale). |
| Monthly_Salary | The employee’s monthly salary in USD, correlated with job title and performance score. |
| Work_Hours_Per_Week | Number of hours worked per week. |
| Projects_Handled | Total number of projects handled by the employee. |
| Overtime_Hours | Total overtime hours worked in the last year. |
| Sick_Days | Number of sick days taken by the employee. |
| Remote_Work_Frequency | Percentage of time worked remotely (0%, 25%, 50%, 75%, 100%). |
| Team_Size | Number of people in the employee’s team. |
| Training_Hours | Number of hours spent in training. |
| Promotions | Number of promotions received during their tenure. |
| Employee_Satisfaction_Score | Employee satisfaction rating (1.0 to 5.0 scale). |
| Resigned | Boolean value indicating if the employee has resigned. |
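
The ranges and allowed values in the column descriptions above can double as sanity checks before any analysis. The sketch below is one way to do that in plain Python. It assumes the data is available as CSV text, that `Remote_Work_Frequency` is stored as a bare integer without the `%` sign, and that the two sample rows (which are made up, the second deliberately invalid) stand in for the real file. The function names are hypothetical.

```python
import csv
import io

# Allowed values taken from the column descriptions in the table above.
REMOTE_LEVELS = {0, 25, 50, 75, 100}
EDUCATION_LEVELS = {"High School", "Bachelor", "Master", "PhD"}

# Made-up sample rows; the second one violates several documented ranges.
SAMPLE_CSV = """Employee_ID,Age,Education_Level,Performance_Score,Remote_Work_Frequency,Employee_Satisfaction_Score
1,34,Master,4,50,3.7
2,19,Bachelor,6,40,4.2
"""


def validate_row(row):
    """Return a list of problems found in one CSV row (empty if the row is fine)."""
    problems = []
    if not 22 <= int(row["Age"]) <= 60:
        problems.append("Age outside 22-60")
    if not 1 <= int(row["Performance_Score"]) <= 5:
        problems.append("Performance_Score outside 1-5")
    if not 1.0 <= float(row["Employee_Satisfaction_Score"]) <= 5.0:
        problems.append("Employee_Satisfaction_Score outside 1.0-5.0")
    # Assumption: stored as an integer percentage such as 25, not "25%".
    if int(row["Remote_Work_Frequency"]) not in REMOTE_LEVELS:
        problems.append("Remote_Work_Frequency not one of 0/25/50/75/100")
    if row["Education_Level"] not in EDUCATION_LEVELS:
        problems.append("unknown Education_Level")
    return problems


def validate_csv(text):
    """Map Employee_ID to its list of problems, for rows that have any."""
    report = {}
    for row in csv.DictReader(io.StringIO(text)):
        problems = validate_row(row)
        if problems:
            report[row["Employee_ID"]] = problems
    return report


if __name__ == "__main__":
    print(validate_csv(SAMPLE_CSV))
```

Running a check like this first makes it easier to trust later aggregates, since out-of-range values would otherwise silently skew averages and rates.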

Approach to Dataset Analysis

Understanding the Dataset

I began by thoroughly reviewing the dataset’s schema, identifying 20 well-defined columns. I documented what each column represents: demographics (Age, Gender), job information (Department, Job_Title, Hire_Date), performance and productivity metrics (Performance_Score, Projects_Handled, Training_Hours), and outcome indicators (Resigned, Employee_Satisfaction_Score). This gave me a clear understanding of the available variables and their relevance to the business questions.

Defining the Business Use Cases

I aligned the analysis with four key HR and business use cases:

  • Churn Prediction: to identify risk factors leading to employee resignation.
  • Productivity Analysis: to uncover drivers of employee output.
  • Performance Evaluation: to assess how employees’ performance relates to various factors.
  • HR Analytics: to gain a demographic and behavioral view of the workforce for strategic planning.

These use cases served as the guiding questions for what to measure and analyze, keeping the focus on the business problem definition.

Designing Insightful Queries

I crafted SQL queries, grouped by topic, designed to extract actionable insights from the dataset. These queries focused on:

  • Key metrics and distributions (e.g., resignation rate by department or age group, average tenure of resigned vs. retained employees).
  • Correlations and patterns (e.g., performance vs. salary, training hours vs. productivity, remote work frequency vs. projects handled).
  • Identifying extremes and top performers (e.g., employees with highest projects handled, highest satisfaction & performance).

This approach ensures that the insights are not only descriptive but also diagnostic and, where possible, predictive.
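
To illustrate the first kind of query (key metrics and distributions), here is a minimal sketch of two of the examples mentioned: resignation rate by department, and average tenure of resigned vs. retained employees. It is not the original query set. The table name `employees` is an assumption, `Resigned` is assumed to be stored as 0/1, the rows are made up, and the SQL is run against an in-memory SQLite database through Python only so that the example is self-contained. The statements use plain aggregate SQL that should need little or no change for MySQL.

```python
import sqlite3

# Made-up rows for illustration only:
# (Employee_ID, Department, Years_At_Company, Resigned)
ROWS = [
    (1, "Sales", 2, 1),
    (2, "Sales", 6, 0),
    (3, "IT", 1, 1),
    (4, "IT", 4, 0),
    (5, "IT", 7, 0),
    (6, "HR", 3, 0),
]

# Share of employees per department who resigned, highest first.
RESIGNATION_RATE_BY_DEPARTMENT = """
SELECT Department,
       COUNT(*) AS headcount,
       ROUND(100.0 * SUM(Resigned) / COUNT(*), 1) AS resignation_rate_pct
FROM employees
GROUP BY Department
ORDER BY resignation_rate_pct DESC, Department
"""

# Average tenure for retained (0) and resigned (1) employees.
AVG_TENURE_BY_STATUS = """
SELECT Resigned,
       ROUND(AVG(Years_At_Company), 2) AS avg_years_at_company
FROM employees
GROUP BY Resigned
ORDER BY Resigned
"""


def build_demo_db():
    """Create an in-memory table holding only the columns these queries need."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "Employee_ID INTEGER PRIMARY KEY, Department TEXT, "
        "Years_At_Company INTEGER, Resigned INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", ROWS)
    return conn


def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in run(conn, RESIGNATION_RATE_BY_DEPARTMENT):
        print(row)
    for row in run(conn, AVG_TENURE_BY_STATUS):
        print(row)
```

If `Resigned` was imported as text such as `'True'`/`'False'` rather than 0/1, the `SUM(Resigned)` expression would need a conversion first, for example a `CASE WHEN` expression.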

Documenting and Presenting Results

To make the work reproducible and readable:

  • I documented the dataset columns and their definitions in a Markdown table, creating clear metadata for stakeholders.
  • I grouped the SQL queries under meaningful categories, making it easier for others to navigate and use them.
  • I prepared the queries to be directly executable in MySQL, avoiding unnecessary complexity and ensuring efficiency.

Innovative Touches

I also considered:

  • Advanced queries that check potential correlations (e.g., salary vs. satisfaction).
  • Segmenting results (e.g., by age group, education level, or department) to allow more granular insights.
  • Preparing the foundation for dashboards or reports by aligning queries to metrics that could feed into visual tools like Tableau or Power BI.
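
The segmentation and correlation ideas above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the analysis itself: the age band edges are an arbitrary choice within the documented 22 to 60 range, the helper names are hypothetical, and the correlation is an ordinary Pearson coefficient, which only captures linear relationships.

```python
from collections import defaultdict
from math import sqrt


def age_group(age):
    """Bucket an age into a band (band edges are an arbitrary choice)."""
    if age < 30:
        return "22-29"
    if age < 40:
        return "30-39"
    if age < 50:
        return "40-49"
    return "50-60"


def resignation_rate_by_group(rows):
    """rows: iterable of (age, resigned) pairs -> {age band: share resigned}."""
    counts = defaultdict(lambda: [0, 0])  # band -> [resigned, total]
    for age, resigned in rows:
        band = counts[age_group(age)]
        band[0] += int(resigned)
        band[1] += 1
    return {name: r / n for name, (r, n) in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

In use, `pearson` would be called with two columns of the dataset, for example `Monthly_Salary` and `Employee_Satisfaction_Score`, and `resignation_rate_by_group` with `(Age, Resigned)` pairs. The same grouping logic can be expressed in SQL with a `CASE` expression inside `GROUP BY`.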

Outcome

With this structured, business-driven approach, HR managers and leadership can:

  • Detect early warning signs of churn.
  • Optimize workforce productivity.
  • Reward and recognize high performers.
  • Build strategic workforce plans grounded in data.

Structure of the approach:

| Use Case | Description |
| --- | --- |
| Churn Prediction | Identifying patterns that lead to employee resignation. |
| Productivity Analysis | Understanding the factors that drive productivity, such as remote work, overtime, and training. |
| Performance Evaluation | Analyzing how performance scores correlate with salary, team size, and education level. |
| HR Analytics | Providing insights into workforce demographics and behavior for strategic decision-making. |
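
This write-up does not say which model was used for the churn prediction step in Python. As one possible baseline, and only as a sketch, the code below fits a small logistic regression with batch gradient descent. The choice of two features (`Employee_Satisfaction_Score` and `Overtime_Hours`), the training rows, the learning rate, and all function names are assumptions made for illustration. In practice a library such as scikit-learn would be the more usual choice.

```python
from math import exp

# Made-up training rows: (Employee_Satisfaction_Score, Overtime_Hours)
FEATURES = [(1.5, 28), (2.0, 25), (2.2, 20), (1.8, 15),
            (4.5, 3), (4.0, 8), (3.8, 5), (4.8, 10)]
# Matching made-up labels: 1 = resigned, 0 = retained.
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = []
    for c, m in zip(cols, means):
        s = (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
        stds.append(s if s > 0 else 1.0)  # guard against constant columns
    scaled = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


X, MEANS, STDS = standardize(FEATURES)
WEIGHTS, BIAS = fit_logistic(X, LABELS)


def churn_risk(satisfaction, overtime_hours):
    """Estimated probability of resignation for one employee."""
    raw = (satisfaction, overtime_hours)
    x = [(v - m) / s for v, m, s in zip(raw, MEANS, STDS)]
    return sigmoid(sum(wj * xj for wj, xj in zip(WEIGHTS, x)) + BIAS)
```

A call such as `churn_risk(1.5, 30)` returns a probability between 0 and 1. With toy rows like these the numbers mean nothing on their own. On the full dataset the model would need a train/test split and a proper evaluation before its output could inform any HR decision.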

Insights

    1. first

Recommendations

    1. first

DISCLAIMER

  • To the best of my knowledge, this data is fabricated and it does not correspond to real people. Any similarity to existing people is purely coincidental.

LICENSE

  • This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
This post is licensed under CC BY 4.0 by the author.