The tidyverse is a suite of packages that simplifies a huge number of the most common tasks I do in R. It’s become indispensable for me, and I’ll make heavy use of it.

I draw your attention to dplyr, one of the tidyverse packages. It provides a set of functions that make manipulating data frames much tidier: you can filter rows, select columns, sort, and create new columns far more readably than with R’s… esoteric… native syntax.

I strongly recommend visiting — and bookmarking — their website.

Here’s one example of how the clarity of a piece of code can improve. Suppose you want to subset the (inbuilt) iris data frame according to the width of the sepals and the length of the petals. In the traditional R way, you might write

iris[iris$Sepal.Width < 3.25 & iris$Petal.Length < 5, ]

But using dplyr, it’s

iris %>%
  filter(Sepal.Width < 3.25) %>%
  filter(Petal.Length < 5)
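
As a quick check on the comparison above, here is a minimal sketch that runs the base R subset and the dplyr pipeline side by side. It also tries a third form, passing both conditions to a single filter call, which dplyr combines with a logical AND. The variable names (base_way, piped, one_call) are my own, chosen just for illustration.

```r
library(dplyr)

# Base R: logical indexing on rows
base_way <- iris[iris$Sepal.Width < 3.25 & iris$Petal.Length < 5, ]

# dplyr: one filter() per condition, chained with the pipe
piped <- iris %>%
  filter(Sepal.Width < 3.25) %>%
  filter(Petal.Length < 5)

# dplyr: several conditions in one filter() call are ANDed together
one_call <- iris %>%
  filter(Sepal.Width < 3.25, Petal.Length < 5)

# All three approaches keep the same number of rows
nrow(base_way) == nrow(piped)
nrow(piped) == nrow(one_call)
```

One difference worth knowing: the base R result keeps the original row names, whereas the dplyr results are renumbered from 1, so compare the contents rather than the row names.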

I will frequently use filter, select, arrange, and mutate from the dplyr package, crossing from the tidyr package, and many functions from the stringr package.
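
To give a flavour of those functions, here is a small sketch, again on the iris data. The new column name Sepal.Ratio, the toy inputs to crossing, and the particular stringr functions (str_to_upper and str_detect) are just my picks for illustration, not a fixed recipe.

```r
library(dplyr)
library(tidyr)
library(stringr)

# select() picks columns, mutate() adds a new one, arrange() sorts rows
iris_sorted <- iris %>%
  select(Species, Sepal.Length, Sepal.Width) %>%
  mutate(Sepal.Ratio = Sepal.Length / Sepal.Width) %>%
  arrange(desc(Sepal.Ratio))

# crossing() builds every combination of its inputs: 2 letters x 3 numbers = 6 rows
grid <- crossing(letter = c("a", "b"), number = 1:3)

# stringr functions all start with str_ and take the string first
shouty <- str_to_upper("iris")                              # "IRIS"
starts_v <- str_detect(c("setosa", "virginica"), "^v")     # FALSE TRUE
```

Every dplyr verb takes a data frame as its first argument and returns a data frame, which is what lets them chain together with the pipe.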

Published by densurekalkun

https://twitter.com/GeocacherB
