
[R]`data` and `reference` should be factors...

멋쟁이천재사자 2022. 10. 30. 17:11

Error

# Error: `data` and `reference` should be factors with the same levels.

 

When it occurs

The error appeared when I tried to evaluate a classification model with caret::confusionMatrix.

 

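Reproduction

library(caret)
# Fit a logistic regression on all predictors, then run stepwise selection
mtcars.glm <- glm(formula = am ~ ., data = mtcars, family = binomial)
mtcars.glm.step <- step(mtcars.glm, direction = "both")
# Predicted probabilities, converted to 0/1 classes at a 0.5 cutoff
pred <- predict(mtcars.glm.step, mtcars, type = "response")
pred.f <- factor(ifelse(pred > 0.5, 1, 0))
# mtcars$am is numeric, so this call raises the error
confusionMatrix(pred.f, mtcars$am)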

 

Cause

mtcars$am is not a factor; it is stored as a numeric 0/1 column.
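
A quick way to confirm the cause is to inspect both inputs before calling confusionMatrix. The sketch below uses only base R; `pred.demo` is a hypothetical stand-in for the predictions, not the model output from the reproduction.

```r
# Hypothetical predictions, already converted to a factor
pred.demo <- factor(c(1, 0, 1, 0))

# The reference column in mtcars is stored as plain numbers
am.class <- class(mtcars$am)          # "numeric"
am.is.factor <- is.factor(mtcars$am)  # FALSE
pred.is.factor <- is.factor(pred.demo) # TRUE

print(c(am.class = am.class,
        am.is.factor = am.is.factor,
        pred.is.factor = pred.is.factor))
```

If either check returns FALSE, that argument needs to be converted before it is passed in.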

Solution

Pass as.factor(mtcars$am) or factor(mtcars$am) to confusionMatrix instead of mtcars$am.

library(caret)
mtcars.glm <- glm(formula = am ~ ., data = mtcars, family = binomial)
mtcars.glm.step <- step(mtcars.glm, direction = "both")
pred <- predict(mtcars.glm.step, mtcars, type = "response")
# Convert the probabilities to a 0/1 factor; passing the raw numeric `pred` raises the same error
pred.f <- factor(ifelse(pred > 0.5, 1, 0))
# Convert the reference to a factor as well
confusionMatrix(pred.f, as.factor(mtcars$am))
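
The error message also asks for the same levels on both sides. One case worth guarding against is when a class never appears among the predictions, because factor() then creates fewer levels than the reference has. A minimal base R sketch, assuming made-up probabilities (`prob.demo`) and labels (`truth.demo`) that are not from the model above:

```r
# Hypothetical probabilities, all below the 0.5 cutoff
prob.demo <- c(0.10, 0.20, 0.30, 0.40)
# Hypothetical true labels containing both classes
truth.demo <- factor(c(0, 1, 0, 1))

# Without `levels`, only the observed class "0" becomes a level
pred.loose <- factor(ifelse(prob.demo > 0.5, 1, 0))
print(levels(pred.loose))

# Taking the levels from the reference keeps both "0" and "1"
pred.fixed <- factor(ifelse(prob.demo > 0.5, 1, 0),
                     levels = levels(truth.demo))
print(identical(levels(pred.fixed), levels(truth.demo)))
```

Setting `levels` explicitly costs nothing when both classes are already predicted, so it may be a reasonable default.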

 


library(caret)
library(dplyr)
library(car)

Duncan.rf <- train(type~., Duncan,
                 method = "rf",
                 tuneLength = 10,
                 trControl = trainControl(method = "cv"))
Duncan.pred <- predict(Duncan.rf,Duncan)

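# Error: `data` and `reference` should be factors with the same levels.
# (the second argument here is the whole data frame, not a factor)
confusionMatrix(Duncan.pred,Duncan)
# Resolved by passing the matching factor column instead
confusionMatrix(Duncan.pred,Duncan$type)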
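
The same check applies here: a data frame is not a factor, while its label column is. The sketch below uses the built-in iris data as a stand-in for Duncan so that it runs without extra packages. `describe_inputs` is a hypothetical helper written for this note, not a caret function.

```r
# Hypothetical helper: report whether each argument is a factor
describe_inputs <- function(data, reference) {
  c(data_is_factor = is.factor(data),
    reference_is_factor = is.factor(reference))
}

# Whole data frame as the reference: not a factor
whole.df <- describe_inputs(iris$Species, iris)
# Label column as the reference: a factor
label.col <- describe_inputs(iris$Species, iris$Species)

print(whole.df)
print(label.col)
```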