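1. Problem
ADP exam no. 23 (Source 1: https://cafe.naver.com/sqlpd/28193)
Source 2: https://www.kaggle.com/code/kukuroo3/problem-r-base?scriptVersionId=87642636
COVID-19 time series data, roughly 50,000 observations, 3 variables including date and cumulative confirmed COVID-19 cases
1. Compute distances using the ACF (10 points)
2. Draw a dendrogram for hierarchical cluster analysis (10 points)
2. Answer
1. Compute distances using the ACF (10 points)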
rm(list = ls())
pacman::p_load(tidyverse, magrittr, lubridate, reshape2, recipes, forecast, factoextra, dtw)
library(TSdist)
temp <- read.csv("adp23/problem3_covid.csv")
temp %>%
  na.omit() %>%                      # drop rows with missing values
  group_by(location) %>%
  mutate(row = row_number()) %>%     # observation index within each location
  pivot_wider(names_from = location,
              values_from = new_cases,
              values_fill = 0) %>%
  ungroup() %>%
  select(-date, -row) %>%
  t() %>%                            # one row per location
  TSdist::TSDatabaseDistances(distance = "acf") -> dist
dist
2. Draw a dendrogram for hierarchical cluster analysis (10 points)
dist %>%
  hclust(method = "average") %>%
  plot(hang = -1)
3. Study log
Compute distances using the ACF??
https://cafe.naver.com/sqlpd/28807
For time series, there seems to be a package, TSdist, that computes an ACF-based distance: TSdist::TSDatabaseDistances(distance = "acf").
https://blog.naver.com/meta_com/221571968119
I can't make sense of this. It's too hard.
install.packages("TSdist") # not on the ADP packages list
library(TSdist)
ACFDistance {TSdist} R Documentation
Autocorrelation-based Dissimilarity
Details
This is simply a wrapper for the diss.ACF function of package TSclust.
As such, all the functionalities of the diss.ACF function are also available when using this function.
TSclust -- not on the ADP packages list either
-- Skimmed the packages on the ADP list that looked possibly related. Found nothing usable.
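Since neither TSdist nor TSclust is on the ADP packages list, the same idea can be approximated in base R. As I understand it, diss.ACF with its default (uniform) weights is the Euclidean distance between the estimated autocorrelation coefficients of the series. The sketch below is a re-creation under that assumption, not the package's code: `acf_dist`, `lag_max = 10`, and the simulated toy series are all made up for illustration, and the package's default maximum lag may differ, so the numbers need not match TSdist.

```r
# Hypothetical helper: ACF-based distance between the rows of a numeric matrix.
# Each row is one time series; lag_max is an assumed setting, not a package default.
acf_dist <- function(series_matrix, lag_max = 10) {
  acf_rows <- lapply(seq_len(nrow(series_matrix)), function(i) {
    rho <- stats::acf(series_matrix[i, ], lag.max = lag_max, plot = FALSE)$acf
    as.numeric(rho)[-1]  # drop lag 0, which is always 1
  })
  acf_features <- do.call(rbind, acf_rows)
  rownames(acf_features) <- rownames(series_matrix)
  # Euclidean distance between the autocorrelation vectors
  stats::dist(acf_features, method = "euclidean")
}

# Toy data: two positively autocorrelated series, one negatively
# autocorrelated series, and white noise.
set.seed(23)
toy <- rbind(
  ar_pos_1 = as.numeric(arima.sim(list(ar = 0.8), n = 200)),
  ar_pos_2 = as.numeric(arima.sim(list(ar = 0.8), n = 200)),
  ar_neg   = as.numeric(arima.sim(list(ar = -0.8), n = 200)),
  noise    = rnorm(200)
)
d <- acf_dist(toy)
print(round(as.matrix(d), 2))
```

The result is an ordinary dist object, so it can be piped into hclust() the same way as the TSdist output.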
bigdist
bridgedist
disttools
ecodist
emdist
epandist
FAdist
freqdist
Newdistns
NFWdist
p2distance
parallelDist
DACF
https://journal.r-project.org/archive/2016-2/mori-mori-mendiburu-etal.pdf
<=== 2022-07-14
Studied the 슬통 YouTube video "Interview with a candidate who passed the 23rd ADP practical exam" (https://youtu.be/eSRZrkDpAow)
https://it-freelancer.tistory.com/28
rm(list = ls())
pacman::p_load(tidyverse, magrittr, lubridate, reshape2, recipes, forecast, factoextra, dtw)
temp <- read.csv("adp23/problem3_covid.csv")
# This attempt has no na.omit(); both steps below failed.
temp %>%
  group_by(location) %>%
  mutate(row = row_number()) %>%
  pivot_wider(names_from = location,
              values_from = new_cases,
              values_fill = 0) %>%
  ungroup() %>%
  select(-date, -row) %>%
  t() %>%
  TSdist::TSDatabaseDistances(distance = "acf") %>% # <simpleError in .common.ts.sanity.check(x): NA in the series>
  hclust(method = "average") %>% # NA/NaN/Inf in foreign function call (arg 10)
  plot()
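The "NA in the series" error suggests that missing values in the raw data have to be dealt with before any ACF is computed. A minimal base-R sketch of that check follows; `covid_toy` is a made-up data frame that only borrows the column names used above, and dropping rows is just one option (imputing would be another), so treat it as an illustration rather than the required preprocessing.

```r
# Made-up toy data; only the column names mirror the code above.
covid_toy <- data.frame(
  date      = rep(c("2021-01-01", "2021-01-02", "2021-01-03"), times = 2),
  location  = rep(c("A", "B"), each = 3),
  new_cases = c(1, NA, 3, 2, 4, NA)
)

# Count missing values per column before doing anything else.
print(colSums(is.na(covid_toy)))

# stats::acf() refuses series with NA by default (na.action = na.fail).
acf_result <- tryCatch(
  stats::acf(covid_toy$new_cases, plot = FALSE),
  error = function(e) conditionMessage(e)
)
print(acf_result)  # an error message, not an acf object

# Dropping incomplete rows removes the NAs; note that this shortens
# the affected series.
covid_clean <- na.omit(covid_toy)
print(covid_clean)
```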
<=== 2022-07-17
1. Compute distances using the ACF (10 points)
rm(list = ls())
pacman::p_load(tidyverse, magrittr, lubridate, reshape2, recipes, forecast, factoextra, dtw)
library(TSdist)
temp <- read.csv("adp23/problem3_covid.csv")
temp %>%
  na.omit() %>%
  group_by(location) %>%
  mutate(row = row_number()) %>%
  pivot_wider(names_from = location,
              values_from = new_cases,
              values_fill = 0) %>%
  ungroup() %>%
  select(-date, -row) %>%
  t() %>%
  TSdist::TSDatabaseDistances(distance = "acf") -> dist
head(dist)
Error in .Primitive("[")(x, 1:6, , drop = FALSE) :
incorrect number of dimensions
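The head() error seems to come from the result being a dist object rather than a matrix (hclust() accepts it, which points the same way). A dist object stores only the lower triangle as a vector, so row-and-column indexing does not apply. A small sketch with made-up data, assuming the TSdist result behaves like a regular dist object: converting with as.matrix() first makes head() work.

```r
# Made-up toy matrix: 4 series in rows, 10 time points each.
set.seed(1)
toy <- matrix(rnorm(40), nrow = 4,
              dimnames = list(paste0("s", 1:4), NULL))

d <- stats::dist(toy)   # a "dist" object (lower triangle only)
print(class(d))

# Convert to a full symmetric matrix before inspecting rows.
dm <- as.matrix(d)
print(head(round(dm, 2)))
```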
2. Draw a dendrogram for hierarchical cluster analysis (10 points)
dist %>%
hclust(method ="average") %>%
plot()
<=== 2022-08-08
Changed plot() to plot(hang = -1).
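In plot() for hclust objects, hang is the fraction of the plot height by which labels hang below the rest of the plot, and a negative value makes all labels hang down from 0 so they line up along the bottom. A small sketch on made-up data (the `country_*` names and k = 2 are arbitrary choices for illustration):

```r
# Made-up toy matrix: 6 series in rows, 10 time points each.
set.seed(1)
toy <- matrix(rnorm(60), nrow = 6,
              dimnames = list(paste0("country_", 1:6), NULL))

# Average-linkage hierarchical clustering on Euclidean distances.
hc <- hclust(dist(toy), method = "average")

# hang = -1 aligns every leaf label at the bottom of the dendrogram.
plot(hc, hang = -1, main = "Average linkage (toy data)", xlab = "", sub = "")

# Optional: cut the tree into k groups to get cluster labels.
groups <- cutree(hc, k = 2)
print(groups)
```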
<=== 2022-09-03