Data Cleaning in SQL

Welcome to this version of SQL cleaning!

As a Data Analyst (and aspiring Data Scientist), I realized my SQL skills needed some fixing, so what better way than to write an article detailing the step-by-step cleaning I’ve done for this particular dataset? I welcome feedback of any kind, and please let me know if you have any questions!

The fictional dataset retrieved uncovers the external/internal factors that lead to employee attrition.

My hypothesis is: both genders receive the same number of performance ratings, workers from various educational backgrounds have been at the same company for many years, and the older a worker is, the higher the daily rate they receive. The results or findings will be listed at the end of the article.

I uncover the factors that lead to employee attrition and explore important questions such as ‘show me a breakdown of distance from home by job role and attrition’ or ‘compare average monthly income by education and attrition’. This is a fictional data set created by IBM data scientists.

Dataset via Kaggle:

Since this was the first time I worked with a .csv dataset that needed to be imported from my drive to PostgreSQL, I started by creating a table with the dataset’s columns and data types.

COPY QUERY

Now this, I had some trouble with. After turning to YouTube, Stack Overflow, and Google, I realized the answers I was hoping to find were either outdated or solely for Windows users. As I have an Apple computer, I tinkered with the query for a day or two, but really three. I transferred the file to the PostgreSQL folder and ensured the query columns matched the dataset columns. It worked!

OUTPUT
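The original CREATE TABLE and COPY statements are not reproduced here, so the following is only a minimal sketch of the same idea: define the table first, then load the CSV rows into it. It uses Python's built-in sqlite3 and csv modules as a stand-in for PostgreSQL (where the load would be a COPY ... FROM statement). The table name employee_attrition, the three column names, and the sample rows are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical three-column excerpt; the real file has many more columns.
csv_text = "age,attrition,daily_rate\n41,Yes,1102\n49,No,279\n37,Yes,1373\n"

conn = sqlite3.connect(":memory:")

# Create the table first so column names and data types are explicit.
conn.execute(
    "CREATE TABLE employee_attrition ("
    "age INTEGER, attrition TEXT, daily_rate INTEGER)"
)

# Read the CSV by header name so the query columns match the dataset columns.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [(int(r["age"]), r["attrition"], int(r["daily_rate"])) for r in reader]
conn.executemany("INSERT INTO employee_attrition VALUES (?, ?, ?)", rows)

loaded = conn.execute("SELECT COUNT(*) FROM employee_attrition").fetchone()[0]
print(loaded)
```

Checking the row count right after the load is a quick way to confirm nothing was dropped during the import.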

My goal in this portion was to profile the data, find missing values/duplicate records, and calculate descriptive statistics.

View all records from dataset:

QUERY OUTPUT

From what I learned, if the missing data is less than 5% of the entire dataset, we can go ahead and delete the affected rows. If not, we can do either of the following:

1. Delete rows

2. Impute with mean/median/mode
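To apply the 5% rule of thumb, the share of missing values has to be measured first. The sketch below shows one way to do that for a single column, again using Python's sqlite3 as a stand-in for PostgreSQL. The table, the column names, and the four sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_attrition (employee_id INTEGER, daily_rate INTEGER)"
)
# Made-up rows; None is stored as NULL.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?)",
    [(1, 1102), (2, None), (3, 1373), (4, 591)],
)

# COUNT(*) counts every row, COUNT(column) skips NULLs,
# so the difference is the number of missing values.
total, missing = conn.execute(
    "SELECT COUNT(*), COUNT(*) - COUNT(daily_rate) FROM employee_attrition"
).fetchone()

missing_pct = 100.0 * missing / total
action = "delete rows" if missing_pct < 5 else "delete rows or impute"
print(total, missing, missing_pct, action)
```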

FINDING DUPLICATES QUERY

As there are no results in the output, there are no duplicates. We can move on.
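A common way to look for duplicates is to group by the columns that should identify a record and keep only groups that occur more than once. The sketch below assumes hypothetical columns and deliberately includes one duplicated row so the query has something to return; an empty result would signal no duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_attrition "
    "(employee_id INTEGER, age INTEGER, job_role TEXT)"
)
# Made-up rows; employee 2 appears twice on purpose.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?, ?)",
    [
        (1, 41, "Sales Executive"),
        (2, 49, "Research Scientist"),
        (2, 49, "Research Scientist"),
        (3, 37, "Laboratory Technician"),
    ],
)

# Any group with more than one row is a duplicate record.
duplicates = conn.execute(
    """
    SELECT employee_id, age, job_role, COUNT(*) AS copies
    FROM employee_attrition
    GROUP BY employee_id, age, job_role
    HAVING COUNT(*) > 1
    """
).fetchall()
print(duplicates)
```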

MIN/MAX/AVG QUERY

We can note that the minimum daily rate is $102, while the maximum is $1,499. The maximum may be genuine or may be an outlier. Taking a closer look at the dataset, it is important to avoid bias, as external and internal factors can affect values within a dataset. Since the average daily rate is $802, roughly midway between the two extremes, we can conclude that the maximum daily rate is on trend.
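Descriptive statistics like these usually come from a single aggregate query. Here is a small sketch with made-up daily rates (not the article's data), using a hypothetical daily_rate column and Python's sqlite3 in place of PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_attrition (daily_rate INTEGER)")
# Made-up values chosen so the average is easy to check by hand.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?)",
    [(100,), (800,), (1500,)],
)

# One pass over the table returns all three statistics.
stats = conn.execute(
    """
    SELECT MIN(daily_rate), MAX(daily_rate), ROUND(AVG(daily_rate), 2)
    FROM employee_attrition
    """
).fetchone()
print(stats)
```

Comparing the average with the midpoint of the minimum and maximum is a quick, rough check for whether an extreme value is pulling the distribution to one side.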

DISTINCT QUERY

From the table above, we can see there are 9 distinct job roles in the dataset. Our next step is to view the total count for each role, find which job role has the most employees, and determine whether work life balance would be considered good, better or best.

Can we determine how many workers are in each distinct job role?

Sales Executives have the highest employee count, followed by Laboratory Technicians and Healthcare Representatives.
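Counting distinct roles and then counting employees per role can be sketched as follows. The job_role column name and the six sample rows are assumptions; the SQL is run through Python's sqlite3 for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_attrition (job_role TEXT)")
# Made-up rows, not the article's data.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?)",
    [
        ("Sales Executive",),
        ("Sales Executive",),
        ("Sales Executive",),
        ("Laboratory Technician",),
        ("Laboratory Technician",),
        ("Healthcare Representative",),
    ],
)

# How many different roles exist?
distinct_roles = conn.execute(
    "SELECT COUNT(DISTINCT job_role) FROM employee_attrition"
).fetchone()[0]

# How many employees hold each role, largest group first?
role_counts = conn.execute(
    """
    SELECT job_role, COUNT(*) AS employees
    FROM employee_attrition
    GROUP BY job_role
    ORDER BY employees DESC, job_role
    """
).fetchall()
print(distinct_roles, role_counts)
```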

Which department pays more?

We can conclude that all three departments (Human Resources, Research & Development, and Sales) show similarly high monthly amounts in USD: each value runs to five figures.
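One way to compare pay across departments is to aggregate monthly income per department. This sketch reports both the maximum and the average, since the two can tell different stories. The department and monthly_income column names and the values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_attrition (department TEXT, monthly_income INTEGER)"
)
# Made-up incomes, not the article's data.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?)",
    [
        ("Sales", 6000),
        ("Sales", 8000),
        ("Human Resources", 5000),
        ("Research & Development", 6500),
        ("Research & Development", 5500),
    ],
)

# Maximum and average monthly income per department, highest average first.
pay_by_department = conn.execute(
    """
    SELECT department,
           MAX(monthly_income) AS max_income,
           ROUND(AVG(monthly_income), 2) AS avg_income
    FROM employee_attrition
    GROUP BY department
    ORDER BY avg_income DESC
    """
).fetchall()
print(pay_by_department)
```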

To determine the number of workers who hold each educational degree, we can use the GROUP BY clause.

Most employees have obtained a Bachelor’s Degree, followed by an Associate’s Degree. Since there are 1,469 records in this database, we can conclude more than one third of the employees have a Bachelor’s Degree.

Let’s see what the total job performance ratings are for men and women.

Men have a higher count of performance ratings than women.

Which departments have the highest numbers of male and female employees?

All three departments (Human Resources, Research & Development, and Sales) have more male employees than female employees.
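A gender breakdown per department can be produced with conditional aggregation, which puts the male and female counts side by side in one row per department. The sketch below assumes hypothetical department and gender columns and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_attrition (department TEXT, gender TEXT)")
# Made-up rows, not the article's data.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?)",
    [
        ("Sales", "Male"),
        ("Sales", "Female"),
        ("Sales", "Male"),
        ("Human Resources", "Male"),
        ("Human Resources", "Female"),
    ],
)

# Each CASE expression contributes 1 only for the matching gender,
# so the two sums are per-gender head counts.
gender_by_department = conn.execute(
    """
    SELECT department,
           SUM(CASE WHEN gender = 'Male' THEN 1 ELSE 0 END) AS male_count,
           SUM(CASE WHEN gender = 'Female' THEN 1 ELSE 0 END) AS female_count
    FROM employee_attrition
    GROUP BY department
    ORDER BY department
    """
).fetchall()
print(gender_by_department)
```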

Is there such a thing as work life balance, though? It would mean you are not spending 100% of your time at work or thinking about work. You might say you have good, better or best work life balance.

Among the three departments (Human Resources, Research & Development, and Sales), Research & Development had the best work life balance figures overall. Human Resources, however, did not come close to those numbers.

How about the number of years an employee has put in? Does it influence a worker’s hourly rate?

As I was working on this query, the returned records were massive, so I decided to filter for records where the hourly rate was greater than or equal to $100. We can note that an employee who has worked at the company for 22 years can have the same hourly rate as an employee who has been at the company for 3 years or so. This can lead to workers quitting when they feel underappreciated at work.
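Narrowing a large result set with a WHERE filter might look like the sketch below. The years_at_company and hourly_rate column names and the four rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_attrition "
    "(employee_id INTEGER, years_at_company INTEGER, hourly_rate INTEGER)"
)
# Made-up rows, not the article's data.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?, ?)",
    [(1, 22, 100), (2, 3, 100), (3, 10, 65), (4, 1, 100)],
)

# Keep only the top hourly rates, longest tenure first.
top_rates = conn.execute(
    """
    SELECT years_at_company, hourly_rate
    FROM employee_attrition
    WHERE hourly_rate >= 100
    ORDER BY years_at_company DESC
    """
).fetchall()
print(top_rates)
```

Sorting by tenure makes it easy to see very different lengths of service sharing the same rate.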

Do employees have a high, medium or low job satisfaction with their employer?

Department and maximum/average years at a company might have an impact on job satisfaction. I wanted to home in on ‘high’ rated job satisfaction values, and from the table above, employees who have stayed at their company for over 5 years have a great relationship with their job. The maximum years an employee has stayed at a company reinforces this idea.

My goal for this section is to gather the total number of employees within various age brackets.

18–23: Employee numbers show a steady upward trend.

24–36: There is a massive hike in total employees.

37 and older: Employee numbers decline quickly.
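Age brackets like these can be built with a CASE expression that labels each row, followed by a GROUP BY on the label. The sketch below uses the same three bracket boundaries, with a hypothetical age column and made-up ages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_attrition (age INTEGER)")
# Made-up ages, not the article's data.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?)",
    [(19,), (22,), (25,), (30,), (35,), (36,), (40,), (58,)],
)

# Label every row with its bracket, then count rows per label.
age_brackets = conn.execute(
    """
    SELECT CASE
               WHEN age BETWEEN 18 AND 23 THEN '18-23'
               WHEN age BETWEEN 24 AND 36 THEN '24-36'
               ELSE '37+'
           END AS age_bracket,
           COUNT(*) AS employees
    FROM employee_attrition
    GROUP BY age_bracket
    ORDER BY age_bracket
    """
).fetchall()
print(age_brackets)
```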

Let’s dig into the marital status of an employee and business travel. I wanted to see which group (single, divorced, or married employees) travels more often.

Across married, divorced, and single employees alike, most rarely travel: that group totals more than half the records in the dataset (1,042 out of 1,469).
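A cross-tabulation of marital status against travel frequency can be sketched by grouping on both columns. The marital_status and business_travel column names, the category labels, and the rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_attrition (marital_status TEXT, business_travel TEXT)"
)
# Made-up rows, not the article's data.
conn.executemany(
    "INSERT INTO employee_attrition VALUES (?, ?)",
    [
        ("Married", "Travel_Rarely"),
        ("Married", "Travel_Rarely"),
        ("Single", "Travel_Rarely"),
        ("Single", "Travel_Frequently"),
        ("Divorced", "Non-Travel"),
        ("Divorced", "Travel_Rarely"),
    ],
)

# One row per (marital status, travel frequency) combination.
travel_by_status = conn.execute(
    """
    SELECT marital_status, business_travel, COUNT(*) AS employees
    FROM employee_attrition
    GROUP BY marital_status, business_travel
    ORDER BY marital_status, business_travel
    """
).fetchall()

# Total of the "rarely travel" rows across all marital statuses.
rarely_total = sum(n for _, travel, n in travel_by_status if travel == "Travel_Rarely")
print(travel_by_status, rarely_total)
```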

Incoming workers are choosing to enter the sales field as that department pays relatively more than any other department.

We can infer potential candidates may be required to have a Bachelor’s Degree (depending on the department and job responsibilities).

Go into Research & Development for better work life balance practices.

An employee who has stayed at one company for 10 years can make the exact same hourly rate as one who has stayed for 1 year.

Marital Status has no influence on business travel options.

Thank you for reading!

Contact Information:
