Employee retention using Machine Learning

Using machine learning to understand employee behaviour and retention can seem like a cool idea.

These days, when employee engagement has moved completely online and work-from-home models have disrupted traditional workplaces, it is all the more important for the HR department to collect and understand data on employee behaviour.

I tried out a solution that takes a sample dataset and uses it to come up with an analysis.

First and most important, we need to pick a sample HR dataset.

HR can collect this data through a carefully orchestrated online survey of their employees.

R is the programming language chosen to perform the analysis.

First step: let's get started by creating a sample application using R Shiny, with R for data analysis and presentation.

The best place to start is to get HR survey data. The survey should capture the data points below, which are some of the interesting fields to look for:

· Satisfaction Level

· Last evaluation

· Number of projects

· Average monthly hours

· Time spent at the company

· Whether they have had a work accident

· Whether they have had a promotion in the last 5 years

· Department (the column named sales)

· Salary

· Whether the employee has left

If you intend to follow a similar approach, focus on collecting data based on the above fields and ensure its accuracy. The data can be based on an internal employee survey and need not cover all the fields.
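Before any modelling, it can help to check which of the fields above a survey export actually contains and how many values are missing. The sketch below is only an illustration: the column names (other than sales) and the helper check_hr_data are assumptions made for this example, and the small data frame is made up rather than real survey data.

```r
# Column names assumed for illustration; rename them to match your own export.
expected_cols <- c(
  "satisfaction_level", "last_evaluation", "number_project",
  "average_monthly_hours", "time_spend_company", "work_accident",
  "promotion_last_5years", "sales", "salary", "left"
)

# Hypothetical helper: report which expected fields are present and how many
# values are missing in each (NA when the field is absent altogether).
check_hr_data <- function(df, expected = expected_cols) {
  n_missing <- vapply(
    expected,
    function(col) if (col %in% names(df)) sum(is.na(df[[col]])) else NA_integer_,
    integer(1),
    USE.NAMES = FALSE
  )
  data.frame(
    field = expected,
    present = expected %in% names(df),
    n_missing = n_missing,
    stringsAsFactors = FALSE
  )
}

# Tiny made-up survey extract (not real data) covering only some of the fields.
survey <- data.frame(
  satisfaction_level = c(0.38, 0.80, NA, 0.72),
  number_project     = c(2L, 5L, 7L, 5L),
  salary             = c("low", "medium", "medium", "low"),
  left               = c(1L, 0L, 1L, 0L)
)

report <- check_hr_data(survey)
print(report)
```

A report like this makes it easy to see which fields a partial survey covers before deciding what the analysis can use.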

A sample dataset for reference is available on Kaggle.


Technologies and Tools

Technologies: R, R Shiny

Tools: RStudio

A machine learning workflow should start by asking the questions below:

  • Are we asking the right questions of the dataset?
  • Have we prepared our data and made it free of noise?
  • Have we selected the right algorithm for the problem statement?
  • Have we trained the model enough?

An important point to note:

Once you have prepared your data, chosen an algorithm, and trained the model on the dataset, you should test the model.
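One common way to test a model is to hold back part of the data and check predictions on rows the model never saw. The sketch below assumes a 70/30 split and a 0.5 cut-off, both arbitrary choices for illustration, and runs on simulated stand-in data rather than real survey results. The column names are also assumptions.

```r
set.seed(42)  # make the simulated data and the split reproducible

# Simulated stand-in for survey data (not real results): lower satisfaction
# raises the chance of leaving.
n <- 500
hr <- data.frame(
  satisfaction_level = runif(n),
  number_project     = sample(2:7, n, replace = TRUE)
)
hr$left <- rbinom(n, 1, plogis(2 - 5 * hr$satisfaction_level))

# Hold out 30% of the rows for testing; train on the rest.
test_idx <- sample(seq_len(n), size = round(0.3 * n))
train <- hr[-test_idx, ]
test  <- hr[test_idx, ]

# Fit on the training rows only.
model <- glm(left ~ satisfaction_level + number_project,
             data = train, family = binomial)

# Score the held-out rows and compare predictions with the actual outcome.
test_prob <- predict(model, newdata = test, type = "response")
test_pred <- as.integer(test_prob > 0.5)

confusion <- table(actual = test$left, predicted = test_pred)
accuracy  <- mean(test_pred == test$left)

print(confusion)
print(accuracy)
```

The confusion table is often more informative than accuracy alone, since it shows separately how many leavers were missed and how many stayers were flagged.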

For my sample project, refer here.

Instructions to run the project:

Run the Algorithm.rmd file to understand the data.

For this problem, I selected the logistic regression algorithm for the analysis.
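As a rough illustration of what a logistic regression fit looks like in R, here is a minimal sketch using base R's glm. It is not the code from the sample project: the data is simulated, and the column names and the example employee are assumptions made for this example.

```r
set.seed(1)

# Simulated stand-in data (not real survey results).
n <- 400
hr <- data.frame(
  satisfaction_level = runif(n),
  salary = factor(sample(c("low", "medium", "high"), n, replace = TRUE),
                  levels = c("low", "medium", "high"))
)
hr$left <- rbinom(n, 1, plogis(1.5 - 4 * hr$satisfaction_level))

# Logistic regression: model the log-odds of leaving as a linear function
# of the predictors. family = binomial uses the logit link by default.
fit <- glm(left ~ satisfaction_level + salary, data = hr, family = binomial)

# Coefficients are on the log-odds scale; exponentiate to get odds ratios.
odds_ratios <- exp(coef(fit))
print(round(odds_ratios, 3))

# Predicted probability of leaving for a hypothetical employee.
new_employee <- data.frame(
  satisfaction_level = 0.2,
  salary = factor("low", levels = levels(hr$salary))
)
p_leave <- predict(fit, newdata = new_employee, type = "response")
print(p_leave)
```

An odds ratio below 1 means the field is associated with lower odds of leaving, and above 1 with higher odds, which makes the fitted model fairly easy to explain to an HR audience.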

Don't forget to install RStudio and the necessary libraries listed in the Readme.

Run the Server.R file in RStudio to get the application deployed.

Upload the testing.csv survey results obtained from Kaggle, or your custom dataset.

The application will display the predicted results for the first 300 employees most likely to leave, and these results can be downloaded.
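The upload, rank, and download flow described above could be wired up roughly as follows. This is a sketch under assumptions, not the project's actual Server.R: top_leavers and build_app are hypothetical helpers, the input and output ids are invented, and the uploaded CSV is assumed to contain the same columns the model was trained on. The ranking step is plain R; only build_app needs the shiny package.

```r
# Return the n rows with the highest predicted probability of leaving.
top_leavers <- function(model, employees, n = 300) {
  employees$leave_probability <- predict(model, newdata = employees,
                                         type = "response")
  ranked <- employees[order(employees$leave_probability, decreasing = TRUE), ]
  head(ranked, n)
}

# Hypothetical Shiny wiring (requires the shiny package): upload a CSV,
# show the ranked table, and offer it as a download.
build_app <- function(model, n = 300) {
  ui <- shiny::fluidPage(
    shiny::fileInput("survey", "Upload survey CSV", accept = ".csv"),
    shiny::downloadButton("download", "Download predictions"),
    shiny::tableOutput("ranked")
  )
  server <- function(input, output, session) {
    ranked <- shiny::reactive({
      shiny::req(input$survey)  # wait until a file has been uploaded
      top_leavers(model, read.csv(input$survey$datapath), n)
    })
    output$ranked <- shiny::renderTable(ranked())
    output$download <- shiny::downloadHandler(
      filename = "predicted_leavers.csv",
      content = function(file) write.csv(ranked(), file, row.names = FALSE)
    )
  }
  shiny::shinyApp(ui, server)
}

# Small simulated example (not real data) exercising the ranking step only.
set.seed(7)
employees <- data.frame(satisfaction_level = runif(50))
employees$left <- rbinom(50, 1, plogis(2 - 5 * employees$satisfaction_level))
model <- glm(left ~ satisfaction_level, data = employees, family = binomial)
top10 <- top_leavers(model, employees, n = 10)
print(top10)
```

With shiny installed, shiny::runApp(build_app(model)) would start the app locally. Keeping the ranking in its own function means it can be checked without launching the interface.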

