Data Analysis in R - Book details.csv cleanup and exploration
Richard Rowe
Book Data Analysis
2025-02-27
Explore the Book Details csv file from the Kaggle Amazon Book Reviews dataset.
Purpose: Clean the data and conduct a basic analysis while exploring different packages.
Citations are at the end of the file.
Summary.
The data set contains 212,404 observations in 10 columns, listing various information on published books. A large amount of the data (approx. 23%) is missing, and the dates appear in a number of different formats. An attempt was made to normalise the dates, but it did not fully succeed because of the multiple different formats.
This is to be re-examined at a later date.
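The two data-quality checks described above can be sketched in base R as follows. This is a minimal illustration, not the code used in the analysis: the toy data frame, the column names (`publishedDate`, `publisher`) and the helper `parse_mixed_date()` are hypothetical, and the three date shapes handled (year only, year-month, full ISO date) are assumptions about what "different formats" might include. Anything that matches none of them is left as `NA` so it can be counted and revisited.

```r
# Hypothetical toy data standing in for the book details file.
books <- data.frame(
  Title         = c("Book A", "Book B", "Book C", "Book D", "Book E"),
  publishedDate = c("1996", "2005-01-01", NA, "2000-02", "circa 1900"),
  publisher     = c("Acme", "", "Acme", NA, "Orbit"),
  stringsAsFactors = FALSE
)

# Share of missing cells, treating both NA and empty strings as missing.
is_missing <- sapply(books, function(col) is.na(col) | col == "")
missing_share <- mean(is_missing)        # overall proportion of missing cells
missing_by_column <- colMeans(is_missing)  # proportion per column

# Normalise dates that arrive in a few assumed shapes to class Date.
# Partial dates are padded to the first day of the month or year.
parse_mixed_date <- function(x) {
  x <- trimws(x)
  out <- rep(as.Date(NA), length(x))
  is_full  <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
  is_month <- grepl("^[0-9]{4}-[0-9]{2}$", x)
  is_year  <- grepl("^[0-9]{4}$", x)
  out[is_full]  <- as.Date(x[is_full], format = "%Y-%m-%d")
  out[is_month] <- as.Date(paste0(x[is_month], "-01"), format = "%Y-%m-%d")
  out[is_year]  <- as.Date(paste0(x[is_year], "-01-01"), format = "%Y-%m-%d")
  out
}

parsed <- parse_mixed_date(books$publishedDate)

# Values that were present but matched none of the assumed formats.
n_unparsed <- sum(is.na(parsed) & !is.na(books$publishedDate))
```

Counting `n_unparsed` separately from values that were missing to begin with makes it easier to see how much of the date column still needs a dedicated rule.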
There are multiple entries for some book titles, both because of spelling variations and because some books have been published multiple times.
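One way to surface such repeated titles is to compare them on a normalised key. The sketch below is an assumption-laden illustration in base R: `normalise_title()` and the sample titles are hypothetical, and the key only removes differences in case, punctuation and spacing. Genuine misspellings would need fuzzy matching, which is not shown here.

```r
# Build a comparison key: lower case, punctuation removed, whitespace collapsed.
normalise_title <- function(title) {
  key <- tolower(title)
  key <- gsub("[[:punct:]]+", " ", key)
  key <- gsub("[[:space:]]+", " ", key)
  trimws(key)
}

# Hypothetical sample titles.
titles <- c("The Hobbit", "the hobbit", "The  Hobbit!", "Dune", "Dune", "Emma")

keys <- normalise_title(titles)
title_counts <- table(keys)

# Keys that occur more than once are candidate duplicate entries.
repeated <- names(title_counts[title_counts > 1])
```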
The author and categories columns contain lists that were unnested as part of the analysis.
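Unnesting of this kind can be sketched in base R as shown below. The example assumes the lists are stored as text such as `['Name One', 'Name Two']`, which is a guess about the file's encoding rather than something stated above; the helper `parse_list_string()` and the toy data are hypothetical. A package function such as `tidyr::separate_rows()` could serve the same purpose. Note that the simple comma split would mishandle names that themselves contain commas.

```r
# Turn "['A', 'B']" style strings into a list of character vectors.
parse_list_string <- function(x) {
  inner <- gsub("^\\[|\\]$", "", x)           # drop the outer brackets
  parts <- strsplit(inner, ",", fixed = TRUE)  # split on commas
  lapply(parts, function(p) {
    p <- trimws(p)
    gsub("^['\"]|['\"]$", "", p)               # drop surrounding quotes
  })
}

# Hypothetical toy data.
books <- data.frame(
  Title   = c("Book A", "Book B", "Book C"),
  authors = c("['Ann Author']", "['Bob Writer', 'Cy Coauthor']", NA),
  stringsAsFactors = FALSE
)

author_list <- parse_list_string(books$authors)

# One row per (title, author) pair; rows with a missing author are kept as NA.
books_long <- data.frame(
  Title  = rep(books$Title, times = lengths(author_list)),
  author = unlist(author_list),
  stringsAsFactors = FALSE
)
```

After unnesting, a title with two authors appears on two rows, so counts per author can be taken directly, but counts per title must be de-duplicated first.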