Hidden R gems: package janitor

janitor has simple functions for examining and cleaning dirty data. It was built with beginning and intermediate R users in mind and is optimized for user-friendliness. Advanced R users can already perform many of these tasks, but with janitor they can do them faster and save their thinking for the fun stuff.

The main janitor functions:

  • perfectly format data.frame column names;
  • create and format frequency tables of one, two, or three variables - think an improved table();
  • provide other tools for cleaning and examining data.frames. The tabulate-and-report functions approximate popular features of SPSS and Microsoft Excel.
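The three groups of functions above can be sketched in a few lines. This is a minimal illustration, not a complete tour: the data frame `raw` and its column names are invented for the example, and it assumes janitor is installed.

```r
library(janitor)

# Hypothetical messy data frame, invented for illustration only
raw <- data.frame(
  "First Name" = c("Ann", "Bob", "Ann", NA),
  "Dept."      = c("IT", "HR", "IT", NA),
  "Empty Col"  = c(NA, NA, NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# 1. Format column names: spaces and punctuation become snake_case
cleaned <- clean_names(raw)
names(cleaned)

# 2. Another cleaning tool: drop rows and columns that are entirely NA
trimmed <- remove_empty(cleaned, which = c("rows", "cols"))

# 3. Frequency table of one variable, similar in spirit to table()
counts <- tabyl(trimmed, dept)
counts

# Optional formatting: add a totals row and show percentages as text
adorn_pct_formatting(adorn_totals(counts, "row"))
```

`tabyl()` returns a data.frame, so the counts can be filtered, joined, or exported like any other table.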

janitor is a #tidyverse-oriented package. Specifically, it plays nicely with the %>% pipe and is optimized for cleaning data brought in with the readr and readxl packages.
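As a rough sketch of that workflow, the example below reads a small CSV with readr and pipes it straight into janitor. The file contents and column names are made up for the illustration, and a temporary file stands in for a real data source; it assumes readr, magrittr, and janitor are installed.

```r
library(readr)
library(magrittr)  # provides the %>% pipe
library(janitor)

# Write a small, hypothetical CSV to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Sample ID,Measured Value,Status",
  "A1,1.5,ok",
  "A2,2.0,bad",
  ",,",
  "A3,3.1,ok"
), path)

# Read, clean the column names, and drop rows that are entirely NA
samples <- read_csv(path) %>%
  clean_names() %>%
  remove_empty("rows")

names(samples)

# Tabulate one of the cleaned columns in the same pipeline style
status_counts <- samples %>% tabyl(status)
status_counts
```

Each janitor function takes a data.frame as its first argument and returns one, which is what lets the steps chain with `%>%`.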

Krzysztof Banas
Principal Research Fellow

I work as a beamline scientist at the Singapore Synchrotron Light Source. My research interests include the application of advanced statistical methods to hyperspectral data processing (dimension reduction, clustering, and identification).
