Solution: Help Patients
Use data filtering, aggregation, and modification techniques to complete the challenge.
Compare the answers
In this lesson, we'll go through the challenge and explain each required task one by one. Check out the solution provided below. Note that there is more than one correct answer: different approaches can reach the same result.
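As a small illustration of that point, the sketch below fills missing BMI values with the gender-specific mean in two different ways and checks that both give the same result. The `toy` data frame and its values are hypothetical stand-ins, not the lesson's `patients` dataset, and base R's `ave()` is used here only as one possible alternative.

```r
# Toy data (hypothetical values, not the lesson's 'patients' dataset)
toy <- data.frame(
  gender = c("Female", "Female", "Female", "Male", "Male", "Male"),
  BMI    = c(22, NA, 26, 30, 28, NA)
)

# Path 1: compute each group's mean first, then fill the gaps with ifelse()
avg_f <- mean(toy$BMI[toy$gender == "Female"], na.rm = TRUE)
avg_m <- mean(toy$BMI[toy$gender == "Male"], na.rm = TRUE)
path1 <- ifelse(is.na(toy$BMI) & toy$gender == "Female", avg_f,
         ifelse(is.na(toy$BMI) & toy$gender == "Male", avg_m, toy$BMI))

# Path 2: ave() applies the function separately within each gender group
path2 <- ave(toy$BMI, toy$gender,
             FUN = function(x) ifelse(is.na(x), mean(x, na.rm = TRUE), x))

print(path1)
print(path2)
print(all.equal(path1, path2))  # TRUE when both paths agree
```

The first path mirrors the step-by-step style of a two-group problem, while the second scales to any number of groups without naming each one.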
# We use the 'patients' dataset in this exercise
library(dplyr)
print(head(patients))
print('----------The details of the dataset: -----')
str(patients) # str() prints the structure itself, so it needs no print() wrapper
print('--------- The result: ----------')
# First, we compute the average BMI for females and males, ignoring missing values
avg_f <- mean(patients[patients$gender == 'Female', ]$BMI, na.rm = TRUE) # Average BMI for female patients
avg_m <- mean(patients[patients$gender == 'Male', ]$BMI, na.rm = TRUE) # Average BMI for male patients
result <- patients %>%
  mutate(BMI = ifelse(is.na(BMI) & gender == 'Female', avg_f, BMI)) %>% # Impute missing values with the female average
  mutate(BMI = ifelse(is.na(BMI) & gender == 'Male', avg_m, BMI)) %>% # Impute missing values with the male average
  mutate(fasting_glucose = as.double(fasting_glucose)) %>% # Convert the character column to double
  mutate(glucose_diff = abs((fasting_glucose - mean(fasting_glucose)) / sd(fasting_glucose))) %>% # Absolute Z-score of each fasting glucose value, stored in a new column
  mutate(avg_bp = (high_bp + low_bp) / 2) %>% # Create a new column for the average blood pressure
  mutate(in_danger = avg_bp >= 100) %>% # Logical column: TRUE if the average blood pressure is at least 100
  select(id, gender, BMI, fasting_glucose, glucose_diff, avg_bp, in_danger) %>% # Select the columns requested by the managers
  filter(glucose_diff >= 2 & BMI >= 25 & in_danger) # Keep the patients who are at risk on all three measurements
print(result) # Check the patients who are in danger
print('-------- The number of rows: -------')
result <- result %>% nrow() # Count the patients who need care
print(result)
Line 5: We check the data types of the columns. Notice that the fasting_glucose column has the character data type even though its values are numeric. ...
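To see why the data type matters, here is a minimal sketch with hypothetical readings stored as text. It assumes nothing about the real `patients` data; the vector and its values are made up for illustration. Arithmetic such as a Z-score only works once the values are converted with `as.double()`.

```r
# Hypothetical glucose readings stored as text (not taken from 'patients')
fasting_glucose <- c("95", "110", "150", "88")

print(class(fasting_glucose))  # the values look numeric but are character strings
# Calling mean(fasting_glucose) here would return NA with a warning,
# because the input is not numeric.

# Convert to double so that arithmetic works
glucose_num <- as.double(fasting_glucose)
print(class(glucose_num))

# With numeric values, the absolute Z-score can be computed
z <- abs((glucose_num - mean(glucose_num)) / sd(glucose_num))
print(round(z, 2))
```

If a string cannot be parsed as a number, `as.double()` returns `NA` for it with a warning, so it is worth checking for new missing values after the conversion.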