Resources for dplyr and ggplot2
Today we introduced the R packages 'dplyr' and 'ggplot2'. We only had time for a few brief demos, but these packages are very powerful and you may be using them quite a bit!
More about dplyr
dplyr is useful for manipulating data. Most of the time you'll be using one of 5 main functions:
filter(): subsets rows based on a condition
select(): selects specific columns
mutate(): creates new columns or modifies existing ones
group_by(): groups data by some variable
summarize(): returns a new data frame with the specified summary statistics (one row per group if the data are grouped)
These aren't the only functions in the dplyr package, but you can get pretty far with your data manipulation with only these 5 functions!
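As a rough illustration of how the five functions fit together, here is a minimal sketch. It uses mtcars, a data set built into R, as a stand-in for your own data, and it chains the steps with the %>% pipe that dplyr provides. The object names heavy_cars and mpg_by_cyl and the new column wt_kg are made up for this example.

```r
library(dplyr)

# mtcars ships with R; its wt column is car weight in units of 1000 lbs.
heavy_cars <- mtcars %>%
  filter(wt > 3) %>%             # keep only rows where weight exceeds 3000 lbs
  select(mpg, cyl, wt) %>%       # keep only these three columns
  mutate(wt_kg = wt * 453.592)   # new column: 1000 lbs is 453.592 kg

# group_by() and summarize() usually go together:
# one output row per distinct value of cyl.
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg), n_cars = n())

print(head(heavy_cars))
print(mpg_by_cyl)
```

Each step takes a data frame and returns a data frame, which is why the functions can be chained in whatever order your analysis needs.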
Some dplyr resources:
Main dplyr webpage
dplyr cheatsheet (includes some functions from package tidyr)
Tutorial from RStudio
Tutorial with biological data
More about ggplot2
You can make plots using base R (i.e. with no additional packages loaded), but some plots are a challenge to build that way. ggplot2 is one of the most widely used graphics packages in R: it's very powerful and you can make great visualizations with it.
To best understand ggplot2, you'll need to start thinking in terms of "the grammar of graphics": a plot is built up from data, aesthetic mappings, and layers. It can take a while to get the hang of the ggplot2 syntax, particularly the aesthetics, or aes(), portion, but with it you can really fine-tune your graphics.
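To make the aes() idea a little more concrete, here is a small sketch, again using the built-in mtcars data as a stand-in for your own. The choice of variables and labels is only for illustration. Anything inside aes() is mapped from a column in the data; anything outside it (like size = 2 below) is a fixed setting applied to every point.

```r
library(ggplot2)

# Inside aes(): map columns to the x position, y position, and point color.
# factor(cyl) treats the number of cylinders as categories, not a continuum.
p <- ggplot(data = mtcars,
            mapping = aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 2) +   # outside aes(): every point gets the same size
  labs(x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       color = "Cylinders")

print(p)  # draws the plot
```

Layers are added with +, so swapping geom_point() for another geom, or adding a second one, changes the kind of plot without touching the data or the mappings.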
Some ggplot2 resources:
Main ggplot2 website
ggplot2 cheatsheet
Tutorial using social science data
Tutorial on making different kinds of plots
Graphics gallery (with code)
Another graphics gallery (with code)
Welcome to the tidyverse!
This is also our first introduction to the "tidyverse": a collection of R packages for data science that are designed to work well together. dplyr and ggplot2 are both part of it. You can learn more about the idea and implementation of the R tidyverse here.