Line

Summary

散布図にGLMなどの予測値を加える方法の備忘録。plot関数で図示したのち、abline関数やlines関数で上書きする。

Sponsored link

Gaussian

function “abline”

  • First argumentに切片、Second argumentに傾きを入れる
  • 正規分布を仮定したモデルの場合のみ有効
fit <- lm(Y ~ X, dat1)
plot(Y ~ X, dat1)

# estimated intercept and slope
fit$coefficients
## (Intercept)           X 
##  0.07065496  0.20422560
# first argument: intercept, second argument: slope
abline(fit$coefficients[1], fit$coefficients[2])

plot of chunk fig1

function “lines”

  • 説明変数Xの最小値から最大値までの範囲の値を100等分(もしくはより多く)する変数xを作る
  • 新しい変数xを基にしたdata.frameを作る
  • predict関数で予測値をえる
  • lines関数で書き加える
fit <- lm(Y ~ X, dat1)

# set new variable x
x <- seq(min(dat1$X), max(dat1$X), length = 100) # x ranging from min(X) to max(X) with 100 length
x
##   [1] 0.0294348 0.1295672 0.2296996 0.3298320 0.4299644 0.5300968 0.6302292
##   [8] 0.7303616 0.8304940 0.9306264 1.0307588 1.1308912 1.2310236 1.3311560
##  [15] 1.4312884 1.5314208 1.6315532 1.7316856 1.8318180 1.9319504 2.0320828
##  [22] 2.1322152 2.2323476 2.3324800 2.4326124 2.5327448 2.6328772 2.7330096
##  [29] 2.8331420 2.9332744 3.0334068 3.1335392 3.2336716 3.3338040 3.4339364
##  [36] 3.5340688 3.6342012 3.7343336 3.8344660 3.9345984 4.0347308 4.1348632
##  [43] 4.2349956 4.3351279 4.4352603 4.5353927 4.6355251 4.7356575 4.8357899
##  [50] 4.9359223 5.0360547 5.1361871 5.2363195 5.3364519 5.4365843 5.5367167
##  [57] 5.6368491 5.7369815 5.8371139 5.9372463 6.0373787 6.1375111 6.2376435
##  [64] 6.3377759 6.4379083 6.5380407 6.6381731 6.7383055 6.8384379 6.9385703
##  [71] 7.0387027 7.1388351 7.2389675 7.3390999 7.4392323 7.5393647 7.6394971
##  [78] 7.7396295 7.8397619 7.9398943 8.0400267 8.1401591 8.2402915 8.3404239
##  [85] 8.4405563 8.5406887 8.6408211 8.7409535 8.8410859 8.9412183 9.0413507
##  [92] 9.1414831 9.2416155 9.3417479 9.4418803 9.5420127 9.6421451 9.7422775
##  [99] 9.8424099 9.9425423
# new data frame with x for prediction
# variable name should be as in the original model (in this case "X", not x)
dat_new <- data.frame(X = x)

# prediction
# intervel specifies "confidence" or "prediction" interval (only available for "lm" function)
# level specifies confindence level (default 95% CI)
pred <- predict(fit, dat_new, interval = "confidence", level = 0.95)
head(pred)
##          fit        lwr       upr
## 1 0.07666630 0.04304971 0.1102829
## 2 0.09711589 0.06400617 0.1302256
## 3 0.11756549 0.08495956 0.1501714
## 4 0.13801509 0.10590976 0.1701204
## 5 0.15846469 0.12685659 0.1900728
## 6 0.17891429 0.14779991 0.2100287
# plot
plot(Y ~ X, dat1)
lines(pred[,1] ~ x, lty = 1) # solid line, median prediction
lines(pred[,2] ~ x, lty = 2) # broken line, lower prediction
lines(pred[,3] ~ x, lty = 2) # broken line, upper prediction

plot of chunk fig2

Poisson

function “lines”

  • 説明変数Xの最小値から最大値までの範囲の値を100等分(もしくはより多く)する変数xを作る
  • 新しい変数xを基にしたdata.frameを作る
  • predict関数で予測値をえる(予測値はリンクスケールlogであることに注意)
  • lines関数で書き加える
fit <- glm(Y ~ X, dat2, family = poisson)

# set new variable x
x <- seq(min(dat1$X), max(dat1$X), length = 100) # x ranging from min(X) to max(X) with 100 

# new data frame with x for prediction
# variable name should be as in the original model (in this case "X", not x)
dat_new <- data.frame(X = x)

# prediction
# "type" specifies the scale of predicted values; in this case, link scale
pred <- predict.glm(fit, dat_new, type = "link", se.fit = T)
head(pred$fit); head(pred$se.fit)
##         1         2         3         4         5         6 
## 0.2719658 0.2899878 0.3080098 0.3260318 0.3440539 0.3620759
##         1         2         3         4         5         6 
## 0.1398443 0.1380268 0.1362131 0.1344033 0.1325975 0.1307960
#median prediction (Note: "exp" was applied to predicted values because link function is "log")
pred_mat <- matrix(NA, nrow = 100, ncol = 3)
pred_mat[,1] <- exp(pred$fit) # median
pred_mat[,2] <- exp(pred$fit - pred$se.fit*1.96) #lower 95%CI
pred_mat[,3] <- exp(pred$fit + pred$se.fit*1.96) #upper 95%CI
head(pred_mat)
##          [,1]      [,2]     [,3]
## [1,] 1.312542 0.9978733 1.726438
## [2,] 1.336411 1.0196457 1.751584
## [3,] 1.360714 1.0418856 1.777108
## [4,] 1.385459 1.0646024 1.803019
## [5,] 1.410655 1.0878059 1.829321
## [6,] 1.436308 1.1115058 1.856023
# plot
plot(Y ~ X, dat2)
lines(pred_mat[,1] ~ x, lty = 1) # solid line, median prediction
lines(pred_mat[,2] ~ x, lty = 2) # broken line, lower prediction
lines(pred_mat[,3] ~ x, lty = 2) # broken line, upper prediction

plot of chunk fig3

Binomial

function “lines”

  • 説明変数Xの最小値から最大値までの範囲の値を100等分(もしくはより多く)する変数xを作る
  • 新しい変数xを基にしたdata.frameを作る
  • predict関数で予測値をえる(予測値はリンクスケールlogitであることに注意)
  • lines関数で書き加える
fit <- glm(cbind(Y, N-Y) ~ X, dat3, family = binomial)

# set new variable x
x <- seq(min(dat1$X), max(dat1$X), length = 100) # x ranging from min(X) to max(X) with 100 

# new data frame with x for prediction
# variable name should be as in the original model (in this case "X", not x)
dat_new <- data.frame(X = x)

# prediction
# "type" specifies the scale of predicted values; in this case, link scale
pred <- predict.glm(fit, dat_new, type = "link", se.fit = T)
head(pred$fit); head(pred$se.fit)
##         1         2         3         4         5         6 
## -1.700849 -1.657750 -1.614650 -1.571551 -1.528451 -1.485352
##         1         2         3         4         5         6 
## 0.2383530 0.2343346 0.2303395 0.2263692 0.2224248 0.2185077
#median prediction (Note: "ilogit" was applied to predicted values because link function is "logit")
ilogit <- function(x) exp(x)/(1 + exp(x))

pred_mat <- matrix(NA, nrow = 100, ncol = 3)
pred_mat[,1] <- ilogit(pred$fit) # median
pred_mat[,2] <- ilogit(pred$fit - pred$se.fit*1.96) #lower 95%CI
pred_mat[,3] <- ilogit(pred$fit + pred$se.fit*1.96) #upper 95%CI
head(pred_mat)
##           [,1]      [,2]      [,3]
## [1,] 0.1543544 0.1026592 0.2255384
## [2,] 0.1600643 0.1074511 0.2317503
## [3,] 0.1659440 0.1124341 0.2380891
## [4,] 0.1719954 0.1176127 0.2445549
## [5,] 0.1782204 0.1229913 0.2511481
## [6,] 0.1846205 0.1285740 0.2578685
# plot
plot(I(Y/N) ~ X, dat3)
lines(pred_mat[,1] ~ x, lty = 1) # solid line, median prediction
lines(pred_mat[,2] ~ x, lty = 2) # broken line, lower prediction
lines(pred_mat[,3] ~ x, lty = 2) # broken line, upper prediction

plot of chunk fig4

Posted in: R

Leave a Reply

Your email address will not be published. Required fields are marked *