Hidden R gems: package janitor
janitor has simple functions for examining and cleaning dirty data. It was built with beginning and intermediate R users in mind and is optimized for user-friendliness. Advanced R users can already perform many of these tasks, but with janitor they can do them faster and save their thinking for the fun stuff.
The main janitor functions:
- perfectly format data.frame column names;
- create and format frequency tables of one, two, or three variables - think an improved table();
- provide other tools for cleaning and examining data.frames.
The tabulate-and-report functions approximate popular features of SPSS and Microsoft Excel.
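As a rough illustration of the three kinds of functions listed above, here is a minimal sketch. The data frame and its column names are made up for this example; clean_names(), remove_empty(), and tabyl() are janitor functions.

```r
library(janitor)

# Hypothetical messy data: awkward column names and an all-NA column
raw <- data.frame(
  `First Name` = c("Ann", "Bob", "Cy"),
  `Dept.`      = c("ops", "ops", "hr"),
  `Empty`      = NA,
  check.names  = FALSE
)

# 1. Format the column names as snake_case
cleaned <- clean_names(raw)

# 2. One of the other cleaning tools: drop columns that contain only NA
cleaned <- remove_empty(cleaned, which = "cols")
names(cleaned)

# 3. Frequency table of one variable, with counts (n) and proportions (percent)
dept_counts <- tabyl(cleaned, dept)
dept_counts
```

Passing a second or third column to tabyl() should produce a two-way or three-way table instead.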
janitor is a tidyverse-oriented package. Specifically, it plays nicely with the %>% pipe and is optimized for cleaning data brought in with the readr and readxl packages.
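A piped workflow with readr might look something like the sketch below. The CSV contents and the names csv_path and scores are invented for illustration; the file is written to a temporary location so the example is self-contained.

```r
library(readr)
library(janitor)
library(magrittr)  # provides the %>% pipe

# Write a small hypothetical CSV whose last row is entirely empty
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Student ID,Test Score", "1,90", "2,85", ","), csv_path)

scores <- read_csv(csv_path, col_types = cols()) %>%  # cols() lets readr guess types quietly
  clean_names() %>%       # "Student ID" becomes student_id, and so on
  remove_empty("rows")    # drop rows where every value is NA

scores
```

The same chain should work on a data frame returned by readxl::read_excel(), since each step takes a data.frame and returns one.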