Generalized Linear Model

Summary

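The generalized linear model (GLM) is a statistical modeling framework. By introducing a link function that relates the mean of the response to a linear predictor, it handles linear models under a variety of probability distributions in a unified way.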
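As a rough sketch of what the link function does, the model can be written as below. The notation ($g$ for the link function, $\mu_i$ for the mean of the response, $\beta_0$ and $\beta_1$ for the coefficients) is introduced here for illustration and is not taken from the original post.

```latex
\begin{align}
  \mathrm{E}[Y_i] &= \mu_i, \\
  g(\mu_i) &= \beta_0 + \beta_1 X_i
\end{align}
```

With a normal distribution and the identity link $g(\mu) = \mu$, this reduces to the ordinary linear model. In R's glm, the default link is the log for the Poisson family and the logit for the binomial family.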

function “lm”

Distribution: normal (Gaussian)
Package: base R (stats, loaded by default)
Notes: sometimes called the General Linear Model

# plot the sample data (dat1: data frame with columns X and Y)
plot(Y ~ X, data = dat1)

Figure: scatter plot of Y against X for dat1

# fit a linear model with lm()
fit <- lm(Y ~ X, data = dat1)
summary(fit)
## 
## Call:
## lm(formula = Y ~ X, data = dat1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.213395 -0.061946  0.007265  0.067644  0.230442 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.01801    0.01992   51.09   <2e-16 ***
## X            0.09862    0.00350   28.18   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.09808 on 98 degrees of freedom
## Multiple R-squared:  0.8901, Adjusted R-squared:  0.889 
## F-statistic:   794 on 1 and 98 DF,  p-value: < 2.2e-16
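
The link between lm and the GLM framework can be checked with a minimal sketch: fitting the same formula with glm, the Gaussian family, and an identity link should reproduce the lm coefficients. The data below are simulated for illustration. The names d, fit_lm, fit_glm and the true values 1 and 0.1 are assumptions, not the dat1 used in this post.

```r
# Hypothetical data, not the dat1 used in this post
set.seed(1)
d <- data.frame(X = runif(100, 0, 10))
d$Y <- 1 + 0.1 * d$X + rnorm(100, sd = 0.1)

# Same model fitted two ways
fit_lm  <- lm(Y ~ X, data = d)
fit_glm <- glm(Y ~ X, family = gaussian(link = "identity"), data = d)

# Compare the coefficient estimates (equal up to numerical tolerance)
all.equal(coef(fit_lm), coef(fit_glm))
```

In practice lm is the more convenient choice for this case because its summary also reports R-squared and the F-statistic.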

function “glm”

Distribution: Poisson, binomial, gamma, etc.
Package: base R (stats, loaded by default)
Notes: none

Poisson

# plot the sample data (dat2: Y is a count)
plot(Y ~ X, data = dat2)

Figure: scatter plot of Y against X for dat2

# fit a Poisson model (log link by default)
fit <- glm(Y ~ X, family = poisson, data = dat2)
summary(fit)
## 
## Call:
## glm(formula = Y ~ X, family = poisson, data = dat2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.2530  -0.7131  -0.1580   0.5653   1.9877  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.19837    0.14420   1.376    0.169    
## X            0.19819    0.01877  10.562   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 213.147  on 99  degrees of freedom
## Residual deviance:  83.572  on 98  degrees of freedom
## AIC: 395.73
## 
## Number of Fisher Scoring iterations: 4
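
Because the Poisson family uses a log link by default, the coefficients in the summary are on the log scale. The sketch below shows one way to bring them back to the count scale. It uses simulated data, and the names d, rate_ratio, new_x, mu_hat and eta_hat are hypothetical, not part of the original post.

```r
# Hypothetical count data, not the dat2 used in this post
set.seed(1)
d <- data.frame(X = runif(100, 0, 10))
d$Y <- rpois(100, lambda = exp(0.2 + 0.2 * d$X))

fit <- glm(Y ~ X, family = poisson(link = "log"), data = d)

# exp(slope): multiplicative change in the expected count per unit increase in X
rate_ratio <- exp(coef(fit)["X"])

# Predictions on the response (count) scale and on the link (log) scale
new_x   <- data.frame(X = c(0, 5, 10))
mu_hat  <- predict(fit, newdata = new_x, type = "response")
eta_hat <- predict(fit, newdata = new_x, type = "link")
```

Here mu_hat equals exp(eta_hat), which is the inverse of the log link applied to the linear predictor.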

Binomial

# plot the sample data (dat3: Y successes out of N trials)
plot(Y ~ X, data = dat3)

Figure: scatter plot of Y against X for dat3

# fit a binomial model; the response is cbind(successes, failures),
# where N is the number of trials
fit <- glm(cbind(Y, N - Y) ~ X, family = binomial, data = dat3)
summary(fit)
## 
## Call:
## glm(formula = cbind(Y, N - Y) ~ X, family = binomial, data = dat3)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2565  -0.6995  -0.3797   0.5522   2.1399  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -5.28607    0.34307  -15.41   <2e-16 ***
## X            0.84592    0.05257   16.09   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 725.359  on 99  degrees of freedom
## Residual deviance:  90.825  on 98  degrees of freedom
## AIC: 238.26
## 
## Number of Fisher Scoring iterations: 5
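
The binomial coefficients are on the logit scale, so a fitted probability is obtained by applying the inverse logit to the linear predictor. The following is a minimal sketch on simulated data. The data frame d, the value X = 5, and the names p_manual and p_predict are assumptions made for illustration.

```r
# Hypothetical data, not the dat3 used in this post
set.seed(1)
n <- 100
d <- data.frame(X = runif(n, 0, 10), N = 10)
d$Y <- rbinom(n, size = d$N, prob = plogis(-5 + 0.8 * d$X))

fit <- glm(cbind(Y, N - Y) ~ X, family = binomial(link = "logit"), data = d)

# Linear predictor at X = 5, converted to a probability with the inverse logit
b <- coef(fit)
p_manual <- plogis(b["(Intercept)"] + b["X"] * 5)

# The same quantity from predict() on the response scale
p_predict <- predict(fit, newdata = data.frame(X = 5), type = "response")
```

Both approaches should give the same probability. predict() is usually preferable because it generalizes to models with more terms.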

function “glm.nb”

Distribution: negative binomial
Package: MASS
Notes: none

# plot the sample data (dat4: Y is an overdispersed count)
plot(Y ~ X, data = dat4)

Figure: scatter plot of Y against X for dat4

# fit a negative binomial model; glm.nb() comes from the MASS package
library(MASS)
fit <- glm.nb(Y ~ X, data = dat4)
summary(fit)
## 
## Call:
## glm.nb(formula = Y ~ X, data = dat4, init.theta = 1.051486619, 
##     link = log)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.6239  -1.2212  -0.3026   0.3720   1.4538  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.06102    0.22421  -0.272    0.786    
## X            0.54672    0.03626  15.079   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(1.0515) family taken to be 1)
## 
##     Null deviance: 334.97  on 99  degrees of freedom
## Residual deviance: 112.51  on 98  degrees of freedom
## AIC: 748.74
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  1.051 
##           Std. Err.:  0.167 
## 
##  2 x log-likelihood:  -742.739
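
A negative binomial model is typically considered when counts are more variable than a Poisson model allows. The sketch below compares the two on simulated overdispersed data. The data frame d, the simulation settings, and the names fit_pois, fit_nb and disp are hypothetical, and the Pearson-based dispersion ratio is only a rough diagnostic rather than a formal test.

```r
library(MASS)  # provides glm.nb()

# Hypothetical overdispersed counts, not the dat4 used in this post
set.seed(1)
d <- data.frame(X = runif(200, 0, 5))
d$Y <- rnbinom(200, mu = exp(0.5 + 0.5 * d$X), size = 1)

fit_pois <- glm(Y ~ X, family = poisson, data = d)
fit_nb   <- glm.nb(Y ~ X, data = d)

# Rough overdispersion check for the Poisson fit:
# values well above 1 suggest overdispersion
disp <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# theta is the estimated size parameter: Var(Y) = mu + mu^2 / theta
fit_nb$theta

# Compare the two fits by AIC (lower is better)
AIC(fit_pois, fit_nb)
```

A smaller theta means stronger overdispersion. As theta grows large, the negative binomial approaches the Poisson distribution.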