Module 10.
GitHub link: https://github.com/christyj777/tidycleanr/tree/main
Scope and Purpose:
The goal of tidycleanr is to make everyday data cleaning faster and more consistent for analysts and students who work with messy CSVs and survey data. Instead of repeating the same wrangling steps by hand, users can call short helper functions to standardize column names, guess variable types, and handle missing values. The package focuses on simple utilities that make it easy to prepare data for visualization or modeling.
Key Functions:
clean_names() – wraps janitor to quickly convert messy column names into snake_case and fix duplicates.
guess_types() – automatically converts character columns to numeric, date, or factor types where appropriate.
impute_fast() – fills missing values with the median for numeric variables and the mode for categorical variables.
drop_dupes() – removes duplicate rows based on one or more key columns, using dplyr.
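To make the type-guessing and imputation ideas concrete, here is a minimal base R sketch. It is not the package's actual code: the names guess_types_sketch() and impute_fast_sketch(), the max_levels argument, and the fixed "%Y-%m-%d" date format are all assumptions made for illustration, and the real tidycleanr functions may behave differently.

```r
# Hypothetical sketches only; the real tidycleanr implementations may differ.

# Convert a character column to numeric or Date when every non-missing value
# parses, or to factor when it has a small number of repeated values.
guess_types_sketch <- function(df, max_levels = 10) {
  for (col in names(df)) {
    x <- df[[col]]
    if (!is.character(x)) next
    x[!is.na(x) & x == ""] <- NA          # treat empty strings as missing
    vals <- x[!is.na(x)]
    if (length(vals) == 0) next

    as_num  <- suppressWarnings(as.numeric(vals))
    as_date <- as.Date(vals, format = "%Y-%m-%d")  # assumed ISO date format

    if (!anyNA(as_num)) {
      df[[col]] <- as.numeric(x)
    } else if (!anyNA(as_date)) {
      df[[col]] <- as.Date(x, format = "%Y-%m-%d")
    } else if (length(unique(vals)) <= max_levels &&
               length(unique(vals)) < length(vals)) {
      df[[col]] <- factor(x)
    }
  }
  df
}

# Fill NAs with the median (numeric columns) or the most frequent value
# (character and factor columns). Other column types are left unchanged.
impute_fast_sketch <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (!anyNA(x) || all(is.na(x))) next
    if (is.numeric(x)) {
      fill <- median(x, na.rm = TRUE)
    } else if (is.character(x) || is.factor(x)) {
      counts <- table(x)                  # table() ignores NA by default
      fill <- names(counts)[which.max(counts)]
    } else {
      next
    }
    x[is.na(x)] <- fill
    df[[col]] <- x
  }
  df
}

# Small made-up example data
raw <- data.frame(
  age    = c("34", "29", NA, "41"),
  joined = c("2024-01-15", "2024-02-01", "2024-02-01", NA),
  group  = c("a", "b", "a", NA),
  stringsAsFactors = FALSE
)

clean <- impute_fast_sketch(guess_types_sketch(raw))
str(clean)
```

In this sketch the date column keeps its missing value, because a median or mode is only computed for numeric, character, and factor columns. Whether the package treats dates the same way is not stated here.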
Description:
Some of the fields above, such as the dependencies and license, were provided, so I used them as given. The Author field contains my personal details. The title and description are based on the package itself and its capabilities. Imports include janitor, dplyr, tidyr, and stringr, since these packages provide the backbone of data manipulation and I have used them many times. LazyData is set to true to make it easy to include small example datasets in the future.
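As a rough illustration, the fields mentioned in this post would look something like the fragment below in the DESCRIPTION file. This is only a partial sketch: Title, Description, Author, and License are left out because their exact values are not given here.

```
Package: tidycleanr
Imports:
    janitor,
    dplyr,
    tidyr,
    stringr
LazyData: true
```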