Foreword
Using the boot package and function, we can bootstrap a single statistic (for example, a median) or a vector of several statistics (for example, regression weights).
mpg) on car weight (wt) and displacement (disp).# The regression
summary(lm(mpg ~ wt + disp, data = mtcars))[1]## $call
## lm(formula = mpg ~ wt + disp, data = mtcars)
summary(lm(mpg ~ wt + disp, data = mtcars))[8]## $r.squared
## [1] 0.7809306
library(boot)
# Function to obtain R-Squared from the data
rsq <- function(formula, data, indices) {
d <- data[indices,] # allows boot to select sample
fit <- lm(formula, data = d)
return(summary(fit)$r.square)
}
# Bootstrapping with 1000 replications
results <- boot(statistic = rsq,
formula = mpg ~ wt + disp, data = mtcars, R = 1000)
# Results
results##
## ORDINARY NONPARAMETRIC BOOTSTRAP
##
##
## Call:
## boot(data = mtcars, statistic = rsq, R = 1000, formula = mpg ~
## wt + disp)
##
##
## Bootstrap Statistics :
## original bias std. error
## t1* 0.7809306 0.01278098 0.04760536
# 95% confidence interval
boot.ci(results, type = 'bca')## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = results, type = "bca")
##
## Intervals :
## Level BCa
## 95% ( 0.6252, 0.8486 )
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable
Plot the results.
plot(results)wt), displacement (disp), in the same regression as abovesummary(lm(mpg ~ wt + disp, data = mtcars))##
## Call:
## lm(formula = mpg ~ wt + disp, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4087 -2.3243 -0.7683 1.7721 6.3484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 34.96055 2.16454 16.151 4.91e-16 ***
## wt -3.35082 1.16413 -2.878 0.00743 **
## disp -0.01773 0.00919 -1.929 0.06362 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.917 on 29 degrees of freedom
## Multiple R-squared: 0.7809, Adjusted R-squared: 0.7658
## F-statistic: 51.69 on 2 and 29 DF, p-value: 2.744e-10
library(boot)
# Function to obtain regression weights
bs <- function(formula, data, indices) {
d <- data[indices,] # allows boot to select sample
fit <- lm(formula, data = d)
return(coef(fit))
}
# Bootstrapping with 1000 replications
results <- boot(statistic = bs,
formula = mpg ~ wt + disp, data = mtcars, R = 1000)
# Results
results##
## ORDINARY NONPARAMETRIC BOOTSTRAP
##
##
## Call:
## boot(data = mtcars, statistic = bs, R = 1000, formula = mpg ~
## wt + disp)
##
##
## Bootstrap Statistics :
## original bias std. error
## t1* 34.96055404 0.2264555291 2.57365563
## t2* -3.35082533 -0.1110873061 1.22643344
## t3* -0.01772474 0.0003609026 0.00899893
# 95% confidence intervals
boot.ci(results, type = "bca", index = 1) # intercept## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = results, type = "bca", index = 1)
##
## Intervals :
## Level BCa
## 95% (29.56, 39.57 )
## Calculations and Intervals on Original Scale
boot.ci(results, type = "bca", index = 2) # wt## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = results, type = "bca", index = 2)
##
## Intervals :
## Level BCa
## 95% (-5.648, -0.680 )
## Calculations and Intervals on Original Scale
boot.ci(results, type = "bca", index = 3) # disp ## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 1000 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = results, type = "bca", index = 3)
##
## Intervals :
## Level BCa
## 95% (-0.0358, -0.0004 )
## Calculations and Intervals on Original Scale
plot(results, index = 1) # interceptplot(results, index = 2) # wtplot(results, index = 3) # dispWe can generate both nonparametric and parametric resampling.
For the nonparametric bootstrap, resampling methods include ordinary, balanced, antithetic and permutation.
For the nonparametric bootstrap, stratified resampling is supported. Importance resampling weights can also be specified.
The boot.ci function can generate 5 different types of two-sided nonparametric confidence intervals: the first order normal approximation, the basic bootstrap interval, the studentized bootstrap interval, the bootstrap percentile interval, and the adjusted bootstrap percentile (BCa) interval.