Note: Other read functions exist, such as read_excel() from the readxl package, which reads
xls files directly, but functionality and options vary. Help documentation can be found
within R using the appropriate help command (e.g. ?read_excel for readxl).
4. Basic Tidying of your Data
Clean Up Your Data
As with all statistical software, your data must be in the correct format for
visualisation and analysis to function correctly.
To check the type of each variable in a dataframe, the
class() function can be applied to each variable separately. A quicker way
to check the variables in a dataframe is the structure, or
str(), function, which displays all variables at once.
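The two approaches above can be compared with a minimal sketch. The dataframe `survey` and its column names and values are hypothetical, made up purely for illustration; only base R functions are used.

```r
# Hypothetical example dataframe (names and values are illustrative only)
survey <- data.frame(
  site      = factor(c("A", "B", "A", "C")),
  treatment = factor(c("control", "drug", "drug", "control")),
  weight    = c(12.1, 15.3, 11.8, 14.0),
  survived  = c(TRUE, TRUE, FALSE, FALSE)
)

# class() checks one variable at a time
class(survey$weight)

# sapply() applies class() to every column and collects the results
var_classes <- sapply(survey, class)
print(var_classes)

# str() displays the type and first few values of all variables at once
str(survey)
```

For a quick look at a whole dataframe, str() is usually the more convenient of the two, since it also shows the first few values of each variable.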
While numeric variables generally cause few problems, logical and factor
variables often have issues because R has to interpret the variable type when
importing from an Excel file.
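As a rough sketch of how such an issue might be spotted and repaired after import, the example below assumes that a logical column and a factor column have both arrived as text. The vectors `survived_raw` and `treatment_raw` are hypothetical stand-ins for imported columns, and the conversions shown are one possible fix rather than the only one.

```r
# Hypothetical columns that were read in as text instead of the intended type
survived_raw  <- c("TRUE", "TRUE", "false", "FALSE")
treatment_raw <- c("control", "drug", "drug", "control")

# Both are stored as character, not logical or factor
class(survived_raw)
class(treatment_raw)

# Standardise the case, then convert the text to a logical variable
survived <- as.logical(toupper(survived_raw))
class(survived)

# Convert the text categories to a factor
treatment <- factor(treatment_raw)
levels(treatment)
```

Running class() or str() again after the conversion is a simple way to confirm that each variable now has the intended type.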
• Problem: One of the logical values is in lower case (e.g. TRUE, TRUE,
false, FALSE), so all values are interpreted as a character type variable
(“TRUE”, “TRUE”, “false”, “FALSE”). R interprets logical variables only if
[Figure: annotated example of variable types in a dataframe]
• The first and second variables are categories, or “factors”
• The third variable is a number, or “numeric”
• The fourth is True/False, or “logical”