I have an R dataframe. Several columns have binary values (e.g. ‘Y’ or ‘N’). Some fields in these binary columns had NULL (NA) values.
I wanted to change the NULLs to ‘N’.
I thought the task was obvious: just use
No. I tried
mutate...coalesce.... Nothing took. I didn’t get errors, but the NULLs remained. I tried banging my head on the desk, but that didn’t help, either.
While testing with a small dummy data set, I finally got a message from R and, therefore, a clue to the problem.
Long story short, I needed to convert the columns from factor to
character datatypes before trying to replace the NULLs. After converting the datatypes, NULL replacement worked just fine.
Try it yourself …
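# load tidyr for replace_na() and the %>% pipe
library(tidyr)

# create test data.frame
# (stringsAsFactors = TRUE is needed on R 4.0+, where strings stay character by default)
tdf <- data.frame(col1 = letters[1:3], col2 = c(NA, "Y", NA), stringsAsFactors = TRUE)

# view data.frame
tdf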
col1    col2
<fctr>  <fctr>
a       NA
b       Y
c       NA
3 rows
# replace NA's with 'N'
tdf$col2 <- tdf$col2 %>% replace_na('N')
invalid factor level, NA generated
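The message makes sense once you look at the factor's levels: 'Y' is the only level, so 'N' is not a value the column is allowed to hold. As a rough base-R sketch (not the approach used in this post, and `col2` here is just a stand-alone copy of the test column), you can also add the missing level first and then fill the NAs without leaving the factor type:

```r
# Stand-alone copy of the test column (base R only, no packages needed)
col2 <- factor(c(NA, "Y", NA))

# 'Y' is the only level, which is why assigning 'N' is rejected
levels(col2)

# Alternative to the character round trip: register 'N' as a level,
# then overwrite the NAs in place
levels(col2) <- c(levels(col2), "N")
col2[is.na(col2)] <- "N"

col2
```

This keeps the original level order ('Y' first, then 'N'), whereas rebuilding the factor from character values sorts the levels alphabetically.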
# convert column to character before replace_na, then back to factor
tdf$col2 <- as.factor(as.character(tdf$col2) %>% replace_na('N'))

# display results
tdf
col1    col2
<fctr>  <fctr>
a       N
b       Y
c       N
3 rows
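Since the original problem involved several binary columns, here is a minimal sketch of the same character round trip applied to more than one column at once. It assumes dplyr 1.0 or later (for `across()`); the data frame `df` and the column names `flag_a` and `flag_b` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: two binary factor columns with missing values
df <- data.frame(
  id     = 1:3,
  flag_a = c(NA, "Y", NA),
  flag_b = c("Y", NA, "N"),
  stringsAsFactors = TRUE
)

# Names of the binary columns to clean (an assumption for this example)
flag_cols <- c("flag_a", "flag_b")

# For each listed column: factor -> character, replace NA with 'N', back to factor
df_fixed <- df %>%
  mutate(across(all_of(flag_cols),
                ~ factor(replace_na(as.character(.x), "N"))))

df_fixed
```

Listing the columns by name keeps the replacement away from columns where 'N' would not be a sensible default.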