Generalized Linear Model

Summary

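A framework for statistical modeling. By introducing a link function, it treats linear models that assume a variety of probability distributions in a unified way.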
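
To make the role of the link function a little more concrete, the sketch below inspects the family objects that ship with base R and prints the link each one uses by default. The variable names (`fam_gaussian`, `links`, and so on) are chosen here for illustration and do not appear elsewhere in this post.

```r
# Default link functions of some family objects in base R (stats package)
fam_gaussian <- gaussian()
fam_poisson  <- poisson()
fam_binomial <- binomial()

links <- c(gaussian = fam_gaussian$link,
           poisson  = fam_poisson$link,
           binomial = fam_binomial$link)
print(links)

# Each family also carries the link and its inverse as functions:
# linkfun maps the mean to the linear predictor, linkinv maps it back
eta <- fam_poisson$linkfun(5)    # log(5)
mu  <- fam_poisson$linkinv(eta)  # back to the mean scale
```

The linear predictor is always a linear function of X; only the link and the assumed distribution change between the models shown in the sections that follow.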

function “lm”

Probability distribution: normal distribution
Package: default (no additional package required)
Notes: sometimes called the general linear model

# sample data
plot(Y ~ X, dat1)

[Figure: scatter plot of Y against X for dat1]

# run lm
fit <- lm(Y ~ X, data = dat1)
summary(fit)
## 
## Call:
## lm(formula = Y ~ X, data = dat1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.213478 -0.050924  0.002303  0.061646  0.241424 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.004005   0.018956   52.97   <2e-16 ***
## X           0.097899   0.003176   30.83   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0885 on 98 degrees of freedom
## Multiple R-squared:  0.9065, Adjusted R-squared:  0.9056 
## F-statistic: 950.2 on 1 and 98 DF,  p-value: < 2.2e-16
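
Because the normal distribution with the identity link is one of the families `glm` supports, the same model can also be fitted with `glm`. The sketch below uses simulated data (an assumption for illustration, not the `dat1` used above) to check that the two functions return matching coefficient estimates.

```r
# Simulated data for illustration only (not the dat1 used above)
set.seed(1)
sim <- data.frame(X = runif(100, 0, 10))
sim$Y <- 1 + 0.1 * sim$X + rnorm(100, sd = 0.1)

# Ordinary linear model
fit_lm <- lm(Y ~ X, data = sim)

# Same model expressed as a GLM: normal distribution, identity link
fit_glm <- glm(Y ~ X, family = gaussian, data = sim)

# The two sets of coefficient estimates should agree
coef(fit_lm)
coef(fit_glm)
```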

function “glm”

Probability distribution: Poisson, binomial, gamma, and others
Package: default (no additional package required)
Notes: none

Poisson

# sample data
plot(Y ~ X, dat2)

[Figure: scatter plot of Y against X for dat2]

# run poisson model
fit <- glm(Y ~ X, family = poisson, data = dat2)
summary(fit)
## 
## Call:
## glm(formula = Y ~ X, family = poisson, data = dat2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3712  -0.7268  -0.1901   0.6003   2.2485  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.20326    0.13592   1.495    0.135    
## X            0.19971    0.01998   9.993   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 210.49  on 99  degrees of freedom
## Residual deviance: 100.99  on 98  degrees of freedom
## AIC: 390.84
## 
## Number of Fisher Scoring iterations: 5
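
The Poisson family uses the log link by default, so the coefficients above are on the log scale. The sketch below, again on simulated data rather than `dat2`, shows one way to move back to the count scale. The names `sim`, `fit_pois`, and `newdat` are illustrative.

```r
# Simulated count data for illustration only (not the dat2 used above)
set.seed(1)
sim <- data.frame(X = runif(100, 0, 10))
sim$Y <- rpois(100, lambda = exp(0.2 + 0.2 * sim$X))

fit_pois <- glm(Y ~ X, family = poisson, data = sim)

# exp(coefficient of X) is the multiplicative change in the
# expected count for a one-unit increase in X
exp(coef(fit_pois))

# Predictions on the link (log) scale and on the response (count) scale
newdat <- data.frame(X = c(0, 5, 10))
pred_link     <- predict(fit_pois, newdata = newdat, type = "link")
pred_response <- predict(fit_pois, newdata = newdat, type = "response")
```

`predict` returns values on the link scale unless `type = "response"` is given, which is easy to overlook when plotting fitted curves over the raw data.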

Binomial

# sample data
plot(Y ~ X, dat3)

[Figure: scatter plot of Y against X for dat3]

# run binomial model (N is the number of trials)
fit <- glm(cbind(Y, N-Y) ~ X, family = binomial, data = dat3)
summary(fit)
## 
## Call:
## glm(formula = cbind(Y, N - Y) ~ X, family = binomial, data = dat3)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.6676  -0.8225  -0.4033   0.5509   2.7551  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -4.57227    0.28315  -16.15   <2e-16 ***
## X            0.72404    0.04436   16.32   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 610.18  on 99  degrees of freedom
## Residual deviance: 108.85  on 98  degrees of freedom
## AIC: 279.33
## 
## Number of Fisher Scoring iterations: 4
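
With the default logit link, the linear predictor is the log-odds of success, and the inverse logit (`plogis` in R) converts it to a probability. The sketch below uses simulated data (not `dat3`) with an assumed 10 trials per row, and compares a hand-computed probability with the value returned by `predict`.

```r
# Simulated binomial data for illustration only (not the dat3 used above)
set.seed(1)
sim <- data.frame(X = runif(100, 0, 10), N = 10)  # N: trials per row (assumed)
sim$Y <- rbinom(100, size = sim$N, prob = plogis(-4.5 + 0.7 * sim$X))

# Response is a two-column matrix: successes, failures
fit_bin <- glm(cbind(Y, N - Y) ~ X, family = binomial, data = sim)

# Success probability at X = 5, computed by hand from the coefficients
b <- coef(fit_bin)
p_manual <- plogis(b[["(Intercept)"]] + b[["X"]] * 5)

# The same quantity from predict()
p_predict <- predict(fit_bin, newdata = data.frame(X = 5), type = "response")
```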

function “glm.nb”

Probability distribution: negative binomial distribution
Package: MASS
Notes: none

# sample data
plot(Y ~ X, dat4)

[Figure: scatter plot of Y against X for dat4]

# run negative-binomial model (glm.nb is provided by the MASS package)
library(MASS)
fit <- glm.nb(Y ~ X, data = dat4)
summary(fit)
## 
## Call:
## glm.nb(formula = Y ~ X, data = dat4, init.theta = 1.305803229, 
##     link = log)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.8728  -0.8809  -0.2989   0.2843   3.0466  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.45982    0.21550   2.134   0.0329 *  
## X            0.44411    0.03436  12.925   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(1.3058) family taken to be 1)
## 
##     Null deviance: 263.18  on 99  degrees of freedom
## Residual deviance: 113.60  on 98  degrees of freedom
## AIC: 777.98
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  1.306 
##           Std. Err.:  0.202 
## 
##  2 x log-likelihood:  -771.982
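
The `Theta` reported by `glm.nb` controls overdispersion: in the parameterization MASS uses, the variance is mu + mu^2 / theta, so a small theta means counts far more variable than a Poisson model allows. The sketch below simulates overdispersed counts (an assumption for illustration, not `dat4`) and compares a Poisson fit with a negative binomial fit by AIC.

```r
library(MASS)  # provides glm.nb

# Simulated overdispersed counts for illustration only (not the dat4 used above)
set.seed(1)
sim <- data.frame(X = runif(200, 0, 10))
sim$Y <- rnbinom(200, size = 1.3, mu = exp(0.5 + 0.4 * sim$X))

fit_pois <- glm(Y ~ X, family = poisson, data = sim)
fit_nb   <- glm.nb(Y ~ X, data = sim)

# Estimated theta (variance = mu + mu^2 / theta)
fit_nb$theta

# A lower AIC suggests the better trade-off between fit and complexity
AIC(fit_pois, fit_nb)
```

On data like these, the Poisson model would be expected to have a much higher AIC, since it cannot absorb the extra variance.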