Exam 1 Review
Suggested answers
b, c, f, g -
- The blizzard_salary dataset has 409 rows.
- The percent_incr variable is numerical and continuous.
- The salary_type variable is categorical.
Figure 1 - A shared x-axis makes it easier to compare summary statistics for the variable on the x-axis.
c - It’s a value higher than the median for hourly but lower than the mean for salaried.
b - There is more variability around the mean compared to the hourly distribution.
a, b, e - Pie charts and waffle charts are for visualizing distributions of categorical data only. Scatterplots are for visualizing the relationship between two numerical variables.
c - mutate() is used to create or modify a variable.
a - "Poor", "Successful", "High", "Top"
b - Option 2. The plot in Option 1 shows the number of employees with a given performance rating for each salary type, while the plot in Option 2 gives the proportion of employees with a given performance rating for each salary type. In order to assess the relationship between these variables (e.g., how much more likely is a Top rating among Salaried vs. Hourly workers), we need the proportions, not the counts.
There may be some NAs in these two variables that are not visible in the plot.
The proportions under Hourly would go in the Hourly bar, and those under Salaried would go in the Salaried bar.
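The difference between counts and within-group proportions in the answers above can be sketched in a few lines. This is only an illustration: the data frame toy and its values are made up, not taken from blizzard_salary, and the object name rating_props is hypothetical.

```r
library(dplyr)

# Hypothetical toy data; these are NOT the real blizzard_salary values.
toy <- data.frame(
  salary_type = c(rep("Hourly", 4), rep("Salaried", 8)),
  performance_rating = c(
    "Top", "High", "High", "Poor",
    "Top", "Top", "Top", "Top", "High", "High", "Successful", "Poor"
  )
)

rating_props <- toy |>
  count(salary_type, performance_rating) |>  # n: counts, the quantity Option 1 shows
  group_by(salary_type) |>
  mutate(prop = n / sum(n)) |>               # prop: share within each salary type, as in Option 2
  ungroup()

rating_props
```

In this toy data, Salaried has four times as many Top ratings as Hourly by count, but only twice the proportion, which is why the proportions are the fairer comparison when group sizes differ.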
c - filter(salary_type != "Hourly" & performance_rating == "Poor") - There are 5 observations for "not Hourly" and "Poor".
a - arrange() - The result is arranged in increasing order of annual_salary, which is the default for arrange().
c, d, e, f.
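As a small sketch of the filter() and arrange() answers above, the following uses a made-up data frame (toy, with invented salaries) rather than the real data, so the row count differs from the 5 reported in the answer. Chaining the two verbs in one pipeline is only for illustration.

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration.
toy <- data.frame(
  salary_type = c("Hourly", "Salaried", "Salaried", "Hourly", "Salaried"),
  performance_rating = c("Poor", "Poor", "Top", "High", "Poor"),
  annual_salary = c(40000, 90000, 120000, 50000, 70000)
)

not_hourly_poor <- toy |>
  # keep rows that are both "not Hourly" and rated "Poor"
  filter(salary_type != "Hourly" & performance_rating == "Poor") |>
  # arrange() sorts in increasing order by default
  arrange(annual_salary)

not_hourly_poor
```

Wrapping the column in desc(), as in arrange(desc(annual_salary)), would sort in decreasing order instead.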
Part 1: The following should be fixed:
- There should be a | after # before label
- There should be a : after label, not =
- There shouldn’t be a space in the chunk label, it should be plot-blizzard
- There should be spaces after commas in the code
- There should be spaces on both sides of = in the code
- There should be a space before +
- geom_boxplot() should be on the next line and indented
- There should be a + at the end of the geom_boxplot() line
- labs() should be indented
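A chunk that follows all of the fixes listed above might look like the sketch below. The exam's original chunk is not reproduced here, so the toy data, the aesthetic mappings, and the axis labels are assumptions; only the label plot-blizzard and the geom_boxplot() and labs() layers come from the answer.

```r
#| label: plot-blizzard

library(ggplot2)

# Hypothetical toy data and column choices, for illustration only.
toy <- data.frame(
  salary_type = c("Hourly", "Hourly", "Hourly", "Salaried", "Salaried", "Salaried"),
  annual_salary = c(40000, 45000, 50000, 80000, 95000, 120000)
)

# Spaces after commas and around =, a space before each +,
# each + ends its line, and every layer after the first is indented.
p <- ggplot(toy, aes(x = annual_salary, y = salary_type)) +
  geom_boxplot() +
  labs(
    x = "Annual salary",
    y = "Salary type"
  )

p
```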
Part 2: The warning is caused by NAs in the data. It means that 39 observations were NAs and are not plotted/represented on the plot.
Part 1:
- Render: Run all of the code and render all of the text in the document and produce an output.
- Commit: Take a snapshot of your changes in Git with an appropriate message.
- Push: Send your changes off to GitHub.
Part 2: c - Rendering or committing isn’t sufficient to send your changes to your GitHub repository; a push is needed. A pull is also not needed to view the changes in the browser.