Title: | Bias Reduction in Generalized Linear Models |
---|---|
Description: | Estimation and inference from generalized linear models based on various methods for bias reduction and maximum penalized likelihood with powers of the Jeffreys prior as penalty. The 'brglmFit' fitting method can achieve reduction of estimation bias by solving either the mean bias-reducing adjusted score equations in Firth (1993) <doi:10.1093/biomet/80.1.27> and Kosmidis and Firth (2009) <doi:10.1093/biomet/asp055>, or the median bias-reduction adjusted score equations in Kenne et al. (2017) <doi:10.1093/biomet/asx046>, or through the direct subtraction of an estimate of the bias of the maximum likelihood estimator from the maximum likelihood estimates as in Cordeiro and McCullagh (1991) <https://www.jstor.org/stable/2345592>. See Kosmidis et al (2020) <doi:10.1007/s11222-019-09860-6> for more details. Estimation in all cases takes place via a quasi Fisher scoring algorithm, and S3 methods for the construction of of confidence intervals for the reduced-bias estimates are provided. In the special case of generalized linear models for binomial and multinomial responses (both ordinal and nominal), the adjusted score approaches to mean and media bias reduction have been found to return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation; see Kosmidis and Firth, 2020 <doi:10.1093/biomet/asaa052>, for a proof for mean bias reduction in logistic regression). |
Authors: | Ioannis Kosmidis [aut, cre] , Euloge Clovis Kenne Pagui [aut], Kjell Konis [ctb], Nicola Sartori [ctb] |
Maintainer: | Ioannis Kosmidis <[email protected]> |
License: | GPL-3 |
Version: | 0.9.2 |
Built: | 2024-11-11 06:16:40 UTC |
Source: | https://github.com/ikosmidis/brglm2 |
The data is from a 3-year study on the effects of AZT in slowing the development of AIDS symptoms. 338 veterans whose immune systems were beginning to falter after infection with the AIDS virus were randomly assigned either to receive AZT immediately or to wait until their T cells showed severe immune weakness.
aids
aids
A data frame with 4 rows and 4 variables:
symptomatic
: counts of veterans showing AIDS symptoms during the 3-year study
asymptomatic
: counts of veterans not showing AIDS symptoms during the 3-year study
race
: the race of the veterans with levels "White"
and "Black"
AZT_use
: whether the veterans received AZT immediately ("Yes"
)
or waited until their T cells showed severe immune weakness ("No"
)
The data set is analyzed in Agresti (2002, Subsection 5.4.2). Its original source is New York Times, Feb. 15, 1991.
Agresti A (2002). Categorical Data Analysis. Wiley Series in Probability and Statistics. Wiley.
Alligator food choice data
alligators
alligators
A data frame with 80 rows and 5 variables:
foodchoice
: primary food type, in volume, found in an alligator’s stomach, with levels fish
, invertebrate
,reptile
, bird
, other
lake
: lake of capture with levels Hancock
, Oklawaha
, Trafford
, George
.
gender
: gender of the alligator with levels Male
and Female
size
: size of the alligator with levels <=2.3
meters long and >2.3
meters long
freq
: number of alligators for each foodchoice, lake, gender and size combination
The alligators data set is analyzed in Agresti (2002, Subsection 7.1.2).
Agresti A (2002). Categorical Data Analysis. Wiley Series in Probability and Statistics. Wiley.
bracl()
is a wrapper of brglmFit()
that fits adjacent category
logit models with or without proportional odds using implicit and
explicit bias reduction methods. See Kosmidis & Firth (2011) for
details.
bracl( formula, data, weights, subset, na.action, parallel = FALSE, contrasts = NULL, model = TRUE, x = TRUE, control = list(...), ... )
bracl( formula, data, weights, subset, na.action, parallel = FALSE, contrasts = NULL, model = TRUE, x = TRUE, control = list(...), ... )
formula |
a formula expression as for regression models, of the form
|
data |
an optional data frame, list or environment in which to interpret
the variables occurring in |
weights |
optional case weights in fitting. Default to 1. |
subset |
expression saying which subset of the rows of the data should be used in the fit. All observations are included by default. |
na.action |
a function to filter missing data. |
parallel |
if |
contrasts |
a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. |
model |
logical for whether the model matrix should be returned. |
x |
should the model matrix be included with in the result
(default is |
control |
a list of parameters for controlling the fitting
process. See |
... |
arguments to be used to form the default |
The bracl()
function fits adjacent category models, which assume
multinomial observations with probabilities with proportional odds
of the form
or with non-proportional odds of the form
where is a vector of covariates and
is the
probability that category
is observed at the covariate setting
.
Ioannis Kosmidis [aut, cre]
[email protected]
Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi:10.1007/s11222-019-09860-6.
Agresti, A (2010). Analysis of Ordinal Categorical Data (2nd edition). Wiley Series in Probability and Statistics. Wiley.
Albert A, Anderson J A (1984). On the Existence of Maximum Likelihood Estimates in Logistic Regression Models. Biometrika, 71, 1-10. doi:10.2307/2336390.
Kosmidis I, Firth D (2011). Multinomial logit bias reduction via the Poisson log-linear model. Biometrika, 98, 755-759. doi:10.1093/biomet/asr026.
Palmgren J (1981). The Fisher Information Matrix for Log Linear Models Arguing Conditionally on Observed Explanatory Variables. Biometrika, 68, 563-566. doi:10.1093/biomet/68.2.563.
nnet::multinom()
, brmultinom()
data("stemcell", package = "brglm2") # Adjacent category logit (non-proportional odds) fit_bracl <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML") # Adjacent category logit (proportional odds) fit_bracl_p <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML", parallel = TRUE)
data("stemcell", package = "brglm2") # Adjacent category logit (non-proportional odds) fit_bracl <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML") # Adjacent category logit (proportional odds) fit_bracl_p <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML", parallel = TRUE)
Estimation and inference from generalized linear models using implicit and explicit bias reduction methods (Kosmidis, 2014), and other penalized maximum likelihood methods. Currently supported methods include the mean bias-reducing adjusted scores approach in Firth (1993) and Kosmidis & Firth (2009), the median bias-reduction adjusted scores approach in Kenne Pagui et al. (2017), the correction of the asymptotic bias in Cordeiro & McCullagh (1991), the mixed bias-reduction adjusted scores approach in Kosmidis et al (2020), maximum penalized likelihood with powers of the Jeffreys prior as penalty, and maximum likelihood.
In the special case of generalized linear models for binomial, Poisson and multinomial responses (both nominal and ordinal), mean and median bias reduction and maximum penalized likelihood return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation in multinomial regression). Estimation in all cases takes place via a modified Fisher scoring algorithm, and S3 methods for the construction of confidence intervals for the reduced-bias estimates are provided.
The core model fitters are implemented by the functions
brglm_fit()
(univariate generalized linear models),
brmultinom()
(baseline category logit models for nominal
multinomial responses), bracl()
(adjacent category logit models
for ordinal multinomial responses), and brnb()
for negative
binomial regression.
The similarly named brglm R package can only handle generalized
linear models with binomial responses. Special care has been taken
when developing brglm2 in order not to have conflicts when the
user loads brglm2 and brglm simultaneously. The development
and maintenance of the two packages will continue in parallel,
until brglm2 incorporates all brglm functionality and
provides an appropriate wrapper to the brglm::brglm()
function.
Ioannis Kosmidis [aut, cre]
[email protected]
Kosmidis I, Firth D (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71-82. doi:10.1093/biomet/asaa052.
Cordeiro G M, McCullagh P (1991). Bias correction in generalized linear models. Journal of the Royal Statistical Society. Series B (Methodological), 53, 629-643. doi:10.1111/j.2517-6161.1991.tb01852.x.
Firth D (1993). Bias reduction of maximum likelihood estimates, Biometrika, 80, 27-38. doi:10.2307/2336755.
Kenne Pagui E C, Salvan A, Sartori N (2017). Median bias reduction of maximum likelihood estimates. Biometrika, 104, 923–938. doi:10.1093/biomet/asx046.
Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi:10.1007/s11222-019-09860-6.
Kosmidis I, Firth D (2009). Bias reduction in exponential family nonlinear models. Biometrika, 96, 793-804. doi:10.1093/biomet/asp055.
Kosmidis I, Firth D (2010). A generic algorithm for reducing bias in parametric estimation. Electronic Journal of Statistics, 4, 1097-1112. doi:10.1214/10-EJS579.
Kosmidis I (2014). Bias in parametric estimation: reduction and useful side-effects. WIRE Computational Statistics, 6, 185-196. doi:10.1002/wics.1296.
brglm_fit()
, brmultinom()
, bracl()
The functions or variables listed here are no longer part of brglm2.
check_infinite_estimates(...) detect_separation(...)
check_infinite_estimates(...) detect_separation(...)
... |
arguments to be passed to functions and methods. |
detect_separation()
: This function is defunct from brglm2
since version 0.8.0. A new version of detect_separation()
is now
maintained in the
detectseparation
R package.
check_infinite_estimates()
is defunct from brglm2 since
version 0.8.0. An new version of check_infinite_estimates()
is
now maintained in the
detectseparation
R package.
glm()
fitting using the brglmFit()
method.Typically only used internally by brglmFit()
, but may be used to
construct a control
argument.
brglmControl( epsilon = 1e-06, maxit = 100, check_aliasing = TRUE, trace = FALSE, type = c("AS_mixed", "AS_mean", "AS_median", "correction", "MPL_Jeffreys", "ML"), transformation = "identity", slowit = 1, response_adjustment = NULL, max_step_factor = 12, a = 1/2, ... ) brglm_control( epsilon = 1e-06, maxit = 100, check_aliasing = TRUE, trace = FALSE, type = c("AS_mixed", "AS_mean", "AS_median", "correction", "MPL_Jeffreys", "ML"), transformation = "identity", slowit = 1, response_adjustment = NULL, max_step_factor = 12, a = 1/2, ... )
brglmControl( epsilon = 1e-06, maxit = 100, check_aliasing = TRUE, trace = FALSE, type = c("AS_mixed", "AS_mean", "AS_median", "correction", "MPL_Jeffreys", "ML"), transformation = "identity", slowit = 1, response_adjustment = NULL, max_step_factor = 12, a = 1/2, ... ) brglm_control( epsilon = 1e-06, maxit = 100, check_aliasing = TRUE, trace = FALSE, type = c("AS_mixed", "AS_mean", "AS_median", "correction", "MPL_Jeffreys", "ML"), transformation = "identity", slowit = 1, response_adjustment = NULL, max_step_factor = 12, a = 1/2, ... )
epsilon |
positive convergence tolerance epsilon. Default is
|
maxit |
integer giving the maximal number of iterations
allowed. Default is |
check_aliasing |
logical indicating where a QR decomposition
of the model matrix should be used to check for
aliasing. Default is |
trace |
logical indicating if output should be produced for
each iteration. Default is |
type |
the type of fitting method to be used. The options are
|
transformation |
the transformation of the dispersion to be
estimated. Default is |
slowit |
a positive real used as a multiplier for the
stepsize. The smaller it is the smaller the steps are. Default
is |
response_adjustment |
a (small) positive constant or a vector
of such. Default is |
max_step_factor |
the maximum number of step halving steps to
consider. Default is |
a |
power of the Jeffreys prior penalty. See Details. |
... |
further arguments passed to |
brglmControl()
provides default values and sanity checking for
the various constants that control the iteration and generally the
behaviour of brglmFit()
.
When trace = TRUE
, calls to cat()
produce the output for each
iteration. Hence, options(digits = *)
can be used to increase
the precision.
When check_aliasing = TRUE
(default), a QR decomposition of the
model matrix is computed to check for aliasing. If the model matrix
is known to be of full rank, then check_aliasing = FALSE
avoids
the extra computational overhead of an additional QR decomposition,
which can be substantial for large model matrices. However, setting
check_aliasing = FALSE
tells brglmFit()
that the model matrix
is full rank, and hard to trace back errors will result if it is
rank deficient.
transformation
sets the transformation of the dispersion
parameter for which the bias reduced estimates are computed. Can be
one of "identity"
, "sqrt"
, "inverse"
, "log"
and
"inverseSqrt"
. Custom transformations are accommodated by
supplying a list of two expressions (transformation and inverse
transformation). See the examples for more details.
The value of response_adjustment
is only relevant if brglmFit()
is called with start = NULL
, and family
is binomial()
or
poisson()
. For those models, an initial maximum likelihood fit is
obtained on adjusted data to provide starting values for the
iteration in brglmFit()
. The value of response_adjustment
governs how the data is adjusted. Specifically, if family
is
binomial()
, then the responses and totals are adjusted by
response_adjustment
and 2 * response_adjustment
, respectively;
if family
is poisson()
, then the responses are adjusted by and
response_adjustment
. response_adjustment = NULL
(default)
is equivalent to setting it to
"number of parameters" / "number of observations"
.
When type = "AS_mixed"
(default), mean bias reduction is used for
the regression parameters, and median bias reduction for the
dispersion parameter, if that is not fixed. This adjustment has
been developed based on equivariance arguments (see, Kosmidis et
al, 2020, Section 4) in order to produce regression parameter
estimates that are invariant to arbitrary contrasts, and estimates
for the dispersion parameter that are invariant to arbitrary
non-linear transformations. type = "AS_mixed"
and type = "AS_mean"
return the same results if brglmFit()
is called with
family
binomial()
or poisson()
(i.e. families with fixed
dispersion).
When type = "MPL_Jeffreys"
, brglmFit()
will maximize the
penalized log-likelihood
where is the expected information matrix about the
regression parameters
and the dispersion parameter
. See,
vignette("iteration", "brglm2")
for more
information. The argument a
controls the amount of penalization
and its default value is a = 1/2
, corresponding to maximum
penalized likelihood using a Jeffreys-prior penalty. See, Kosmidis
& Firth (2021) for proofs and discussion about the finiteness and
shrinkage properties of the maximum penalized likelihood estimators
for binomial-response generalized linear models.
The estimates from type = "AS_mean"
and type = "MPL_Jeffreys"
with a = 1/2
(default) are identical for
Poisson log-linear models and logistic regression models, i.e. for
binomial and Poisson regression models with canonical links. See,
Firth (1993) for details.
brglm_control()
is an alias to brglmControl()
.
a list with components named as the arguments, including
symbolic expressions for the dispersion transformation
(Trans
) and its inverse (inverseTrans
)
Ioannis Kosmidis [aut, cre]
[email protected]
Kosmidis I, Firth D (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71-82. doi:10.1093/biomet/asaa052.
Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi:10.1007/s11222-019-09860-6.
Firth D (1993). Bias reduction of maximum likelihood estimates. Biometrika, 80, 27-38. doi:10.2307/2336755.
brglm_fit()
and glm.fit()
data("coalition", package = "brglm2") ## The maximum likelihood fit with log link coalitionML <- glm(duration ~ fract + numst2, family = Gamma, data = coalition) ## Bias reduced estimation of the dispersion parameter coalitionBRi <- glm(duration ~ fract + numst2, family = Gamma, data = coalition, method = "brglmFit") coef(coalitionBRi, model = "dispersion") ## Bias reduced estimation of log(dispersion) coalitionBRl <- glm(duration ~ fract + numst2, family = Gamma, data = coalition, method = "brglmFit", transformation = "log") coef(coalitionBRl, model = "dispersion") ## Just for illustration: Bias reduced estimation of dispersion^0.25 my_transformation <- list(expression(dispersion^0.25), expression(transformed_dispersion^4)) coalitionBRc <- update(coalitionBRi, transformation = my_transformation) coef(coalitionBRc, model = "dispersion")
data("coalition", package = "brglm2") ## The maximum likelihood fit with log link coalitionML <- glm(duration ~ fract + numst2, family = Gamma, data = coalition) ## Bias reduced estimation of the dispersion parameter coalitionBRi <- glm(duration ~ fract + numst2, family = Gamma, data = coalition, method = "brglmFit") coef(coalitionBRi, model = "dispersion") ## Bias reduced estimation of log(dispersion) coalitionBRl <- glm(duration ~ fract + numst2, family = Gamma, data = coalition, method = "brglmFit", transformation = "log") coef(coalitionBRl, model = "dispersion") ## Just for illustration: Bias reduced estimation of dispersion^0.25 my_transformation <- list(expression(dispersion^0.25), expression(transformed_dispersion^4)) coalitionBRc <- update(coalitionBRi, transformation = my_transformation) coef(coalitionBRc, model = "dispersion")
glm()
for reduced-bias estimation and
inferencebrglmFit()
is a fitting method for glm()
that fits generalized
linear models using implicit and explicit bias reduction methods
(Kosmidis, 2014), and other penalized maximum likelihood
methods. Currently supported methods include the mean bias-reducing
adjusted scores approach in Firth (1993) and Kosmidis & Firth
(2009), the median bias-reduction adjusted scores approach in Kenne
Pagui et al. (2017), the correction of the asymptotic bias in
Cordeiro & McCullagh (1991), the mixed bias-reduction adjusted
scores approach in Kosmidis et al (2020), maximum penalized
likelihood with powers of the Jeffreys prior as penalty, and
maximum likelihood. Estimation is performed using a quasi Fisher
scoring iteration (see vignette("iteration", "brglm2")
, which, in
the case of mean-bias reduction, resembles an iterative correction
of the asymptotic bias of the Fisher scoring iterates.
brglmFit( x, y, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE, fixed_totals = NULL, singular.ok = TRUE ) brglm_fit( x, y, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE, fixed_totals = NULL, singular.ok = TRUE )
brglmFit( x, y, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE, fixed_totals = NULL, singular.ok = TRUE ) brglm_fit( x, y, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE, fixed_totals = NULL, singular.ok = TRUE )
x |
a design matrix of dimension |
y |
a vector of observations of length |
weights |
an optional vector of ‘prior weights’ to be used
in the fitting process. Should be |
start |
starting values for the parameters in the linear
predictor. If |
etastart |
applied only when start is not |
mustart |
applied only when start is not |
offset |
this can be used to specify an a priori known
component to be included in the linear predictor during fitting.
This should be |
family |
a description of the error distribution and link
function to be used in the model. For |
control |
a list of parameters controlling the fitting
process. See |
intercept |
logical. Should an intercept be included in the null model? |
fixed_totals |
effective only when |
singular.ok |
logical. If |
A detailed description of the supported adjustments and the quasi
Fisher scoring iteration is given in the iteration vignette (see,
vignette("iteration", "brglm2")
or Kosmidis et al, 2020). A
shorter description of the quasi Fisher scoring iteration is also
given in one of the vignettes of the enrichwith R package (see,
https://cran.r-project.org/package=enrichwith/vignettes/bias.html).
Kosmidis and Firth (2010) describe a parallel quasi Newton-Raphson
iteration with the same stationary point.
In the special case of generalized linear models for binomial,
Poisson and multinomial responses, the adjusted score equation
approaches for type = "AS_mixed"
, type = "AS_mean"
, and type = "AS_median"
(see below for what methods each type
corresponds)
return estimates with improved frequentist properties, that are
also always finite, even in cases where the maximum likelihood
estimates are infinite (e.g. complete and quasi-complete separation
in multinomial regression). See, Kosmidis and Firth (2021) for a
proof for binomial-response GLMs with Jeffreys-prior penalties to
the log-likelihood, which is equivalent to mean bias reduction for
logistic regression. See, also,
detectseparation::detect_separation()
and
detectseparation::check_infinite_estimates()
for pre-fit and
post-fit methods for the detection of infinite estimates in
binomial response generalized linear models.
The type of score adjustment to be used is specified through the
type
argument (see brglmControl()
for details). The available
options are
type = "AS_mixed"
: the mixed bias-reducing score adjustments in
Kosmidis et al (2020) that result in mean bias reduction for the
regression parameters and median bias reduction for the dispersion
parameter, if any; default.
type = "AS_mean"
: the mean bias-reducing score adjustments in
Firth, 1993 and Kosmidis & Firth, 2009. type = "AS_mixed"
and
type = "AS_mean"
will return the same results when family
is
binomial()
or poisson()
, i.e. when the dispersion is fixed
type = "AS_median"
: the median bias-reducing score
adjustments in Kenne Pagui et al. (2017)
type = "MPL_Jeffreys"
: maximum penalized likelihood
with powers of the Jeffreys prior as penalty.
type = "ML"
: maximum likelihood.
type = "correction"
: asymptotic bias correction, as in
Cordeiro & McCullagh (1991).
The null deviance is evaluated based on the fitted values using the
method specified by the type
argument (see brglmControl()
).
The family
argument of the current version of brglmFit()
can
accept any combination of "family"
objects and link functions,
including families with user-specified link functions, mis()
links, and power()
links, but excluding quasi()
,
quasipoisson()
and quasibinomial()
families.
The description of method
argument and the Fitting functions
section in glm()
gives information on supplying fitting
methods to glm()
.
fixed_totals
specifies groups of observations for which the sum
of the means of a Poisson model will be held fixed to the observed
count for each group. This argument is used internally in
brmultinom()
and bracl()
for baseline-category logit models and
adjacent category logit models, respectively.
brglm_fit()
is an alias to brglmFit()
.
Ioannis Kosmidis [aut, cre]
[email protected], Euloge Clovis Kenne Pagui [ctb]
[email protected]
Kosmidis I, Firth D (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71-82. doi:10.1093/biomet/asaa052.
Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi:10.1007/s11222-019-09860-6.
Cordeiro G M, McCullagh P (1991). Bias correction in generalized linear models. Journal of the Royal Statistical Society. Series B (Methodological), 53, 629-643. doi:10.1111/j.2517-6161.1991.tb01852.x.
Firth D (1993). Bias reduction of maximum likelihood estimates. Biometrika. 80, 27-38. doi:10.2307/2336755.
Kenne Pagui E C, Salvan A, Sartori N (2017). Median bias reduction of maximum likelihood estimates. Biometrika, 104, 923–938. doi:10.1093/biomet/asx046.
Kosmidis I, Firth D (2009). Bias reduction in exponential family nonlinear models. Biometrika, 96, 793-804. doi:10.1093/biomet/asp055.
Kosmidis I, Firth D (2010). A generic algorithm for reducing bias in parametric estimation. Electronic Journal of Statistics, 4, 1097-1112. doi:10.1214/10-EJS579.
Kosmidis I (2014). Bias in parametric estimation: reduction and useful side-effects. WIRE Computational Statistics, 6, 185-196. doi:10.1002/wics.1296.
brglmControl()
, glm.fit()
, glm()
## The lizards example from ?brglm::brglm data("lizards") # Fit the model using maximum likelihood lizardsML <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "glm.fit") # Mean bias-reduced fit: lizardsBR_mean <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit") # Median bias-reduced fit: lizardsBR_median <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit", type = "AS_median") summary(lizardsML) summary(lizardsBR_median) summary(lizardsBR_mean) # Maximum penalized likelihood with Jeffreys prior penatly lizards_Jeffreys <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit", type = "MPL_Jeffreys") # lizards_Jeffreys is the same fit as lizardsBR_mean (see Firth, 1993) all.equal(coef(lizardsBR_mean), coef(lizards_Jeffreys)) # Maximum penalized likelihood with powers of the Jeffreys prior as # penalty. See Kosmidis & Firth (2021) for the finiteness and # shrinkage properties of the maximum penalized likelihood # estimators in binomial response models a <- seq(0, 20, 0.5) coefs <- sapply(a, function(a) { out <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit", type = "MPL_Jeffreys", a = a) coef(out) }) # Illustration of shrinkage as a grows matplot(a, t(coefs), type = "l", col = 1, lty = 1) abline(0, 0, col = "grey") ## Another example from ## King, Gary, James E. Alt, Nancy Elizabeth Burns and Michael Laver ## (1990). "A Unified Model of Cabinet Dissolution in Parliamentary ## Democracies", _American Journal of Political Science_, **34**, 846-870 data("coalition", package = "brglm2") # The maximum likelihood fit with log link coalitionML <- glm(duration ~ fract + numst2, family = Gamma, data = coalition) # The mean bias-reduced fit coalitionBR_mean <- update(coalitionML, method = "brglmFit") # The bias-corrected fit coalitionBC <- update(coalitionML, method = "brglmFit", type = "correction") # The median bias-corrected fit coalitionBR_median <- update(coalitionML, method = "brglmFit", type = "AS_median") ## An example with offsets from Venables & Ripley (2002, p.189) data("anorexia", package = "MASS") anorexML <- glm(Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian, data = anorexia) anorexBC <- update(anorexML, method = "brglmFit", type = "correction") anorexBR_mean <- update(anorexML, method = "brglmFit") anorexBR_median <- update(anorexML, method = "brglmFit", type = "AS_median") # All methods return the same estimates for the regression # parameters because the maximum likelihood estimator is normally # distributed around the `true` value under the model (hence, both # mean and component-wise median unbiased). The Wald tests for # anorexBC and anorexBR_mean differ from anorexML because the # bias-reduced estimator of the dispersion is the unbiased, by # degree of freedom adjustment (divide by n - p), estimator of the # residual variance. The Wald tests from anorexBR_median are based # on the median bias-reduced estimator of the dispersion that # results from a different adjustment of the degrees of freedom # (divide by n - p - 2/3) summary(anorexML) summary(anorexBC) summary(anorexBR_mean) summary(anorexBR_median) ## endometrial data from Heinze & Schemper (2002) (see ?endometrial) data("endometrial", package = "brglm2") endometrialML <- glm(HG ~ NV + PI + EH, data = endometrial, family = binomial("probit")) endometrialBR_mean <- update(endometrialML, method = "brglmFit", type = "AS_mean") endometrialBC <- update(endometrialML, method = "brglmFit", type = "correction") endometrialBR_median <- update(endometrialML, method = "brglmFit", type = "AS_median") summary(endometrialML) summary(endometrialBC) summary(endometrialBR_mean) summary(endometrialBR_median)
## The lizards example from ?brglm::brglm data("lizards") # Fit the model using maximum likelihood lizardsML <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "glm.fit") # Mean bias-reduced fit: lizardsBR_mean <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit") # Median bias-reduced fit: lizardsBR_median <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit", type = "AS_median") summary(lizardsML) summary(lizardsBR_median) summary(lizardsBR_mean) # Maximum penalized likelihood with Jeffreys prior penatly lizards_Jeffreys <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit", type = "MPL_Jeffreys") # lizards_Jeffreys is the same fit as lizardsBR_mean (see Firth, 1993) all.equal(coef(lizardsBR_mean), coef(lizards_Jeffreys)) # Maximum penalized likelihood with powers of the Jeffreys prior as # penalty. See Kosmidis & Firth (2021) for the finiteness and # shrinkage properties of the maximum penalized likelihood # estimators in binomial response models a <- seq(0, 20, 0.5) coefs <- sapply(a, function(a) { out <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit", type = "MPL_Jeffreys", a = a) coef(out) }) # Illustration of shrinkage as a grows matplot(a, t(coefs), type = "l", col = 1, lty = 1) abline(0, 0, col = "grey") ## Another example from ## King, Gary, James E. Alt, Nancy Elizabeth Burns and Michael Laver ## (1990). "A Unified Model of Cabinet Dissolution in Parliamentary ## Democracies", _American Journal of Political Science_, **34**, 846-870 data("coalition", package = "brglm2") # The maximum likelihood fit with log link coalitionML <- glm(duration ~ fract + numst2, family = Gamma, data = coalition) # The mean bias-reduced fit coalitionBR_mean <- update(coalitionML, method = "brglmFit") # The bias-corrected fit coalitionBC <- update(coalitionML, method = "brglmFit", type = "correction") # The median bias-corrected fit coalitionBR_median <- update(coalitionML, method = "brglmFit", type = "AS_median") ## An example with offsets from Venables & Ripley (2002, p.189) data("anorexia", package = "MASS") anorexML <- glm(Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian, data = anorexia) anorexBC <- update(anorexML, method = "brglmFit", type = "correction") anorexBR_mean <- update(anorexML, method = "brglmFit") anorexBR_median <- update(anorexML, method = "brglmFit", type = "AS_median") # All methods return the same estimates for the regression # parameters because the maximum likelihood estimator is normally # distributed around the `true` value under the model (hence, both # mean and component-wise median unbiased). The Wald tests for # anorexBC and anorexBR_mean differ from anorexML because the # bias-reduced estimator of the dispersion is the unbiased, by # degree of freedom adjustment (divide by n - p), estimator of the # residual variance. The Wald tests from anorexBR_median are based # on the median bias-reduced estimator of the dispersion that # results from a different adjustment of the degrees of freedom # (divide by n - p - 2/3) summary(anorexML) summary(anorexBC) summary(anorexBR_mean) summary(anorexBR_median) ## endometrial data from Heinze & Schemper (2002) (see ?endometrial) data("endometrial", package = "brglm2") endometrialML <- glm(HG ~ NV + PI + EH, data = endometrial, family = binomial("probit")) endometrialBR_mean <- update(endometrialML, method = "brglmFit", type = "AS_mean") endometrialBC <- update(endometrialML, method = "brglmFit", type = "correction") endometrialBR_median <- update(endometrialML, method = "brglmFit", type = "AS_median") summary(endometrialML) summary(endometrialBC) summary(endometrialBR_mean) summary(endometrialBR_median)
brmultinom()
is a wrapper of brglmFit()
that fits multinomial
regression models using implicit and explicit bias reduction
methods. See Kosmidis & Firth (2011) for details.
brmultinom( formula, data, weights, subset, na.action, contrasts = NULL, ref = 1, model = TRUE, x = TRUE, control = list(...), ... )
brmultinom( formula, data, weights, subset, na.action, contrasts = NULL, ref = 1, model = TRUE, x = TRUE, control = list(...), ... )
formula |
a formula expression as for regression models, of the form
|
data |
an optional data frame in which to interpret the variables occurring
in |
weights |
optional case weights in fitting. |
subset |
expression saying which subset of the rows of the data should be used in the fit. All observations are included by default. |
na.action |
a function to filter missing data. |
contrasts |
a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. |
ref |
the reference category to use for multinomial
regression. Either an integer, in which case
|
model |
logical. If true, the model frame is saved as component |
x |
should the model matrix be included with in the result
(default is |
control |
a list of parameters for controlling the fitting
process. See |
... |
arguments to be used to form the default |
The models brmultinom()
handles are also known as
baseline-category logit models (see, Agresti, 2002, Section 7.1),
because they model the log-odds of every category against a
baseline category. The user can control which baseline (or
reference) category is used via the ref
. By default
brmultinom()
uses the first category as reference.
The maximum likelihood estimates for the parameters of baseline-category logit models have infinite components with positive probability, which can result in problems in their estimation and the use of inferential procedures (e.g. Wad tests). Albert and Andreson (1984) have categorized the possible data patterns for such models into the exclusive and exhaustive categories of complete separation, quasi-complete separation and overlap, and showed that infinite maximum likelihood estimates result when complete or quasi-complete separation occurs.
The adjusted score approaches to bias reduction that brmultinom()
implements for type = "AS_mean"
and type = "AS_median"
are
alternatives to maximum likelihood that result in estimates with
smaller asymptotic mean and median bias, respectively, that are
also always finite, even in cases of complete or quasi-complete
separation.
brmultinom()
is a wrapper of brglmFit()
that fits multinomial
logit regression models through the 'Poisson trick' (see, for
example, Palmgren, 1981; Kosmidis & Firth, 2011).
The implementation relies on the construction of an extended model
matrix for the log-linear model and constraints on the sums of the
Poisson means. Specifically, a log-linear model is fitted on a
Kronecker product of the
original model matrix X
implied by the formula, augmented by
nrow(X)
dummy variables.
The extended model matrix is sparse, and the Matrix package is used for its effective storage.
While brmultinom()
can be used for analyses using multinomial
regression models, the current implementation is more of a proof of
concept and is not expected to scale well with either of nrow(X)
,
ncol(X)
or the number of levels in the categorical response.
Ioannis Kosmidis [aut, cre]
[email protected]
Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi:10.1007/s11222-019-09860-6.
Agresti A (2002). Categorical data analysis (2nd edition). Wiley Series in Probability and Statistics. Wiley.
Albert A, Anderson J A (1984). On the Existence of Maximum Likelihood Estimates in Logistic Regression Models. Biometrika, 71 1–10. doi:10.2307/2336390.
Kosmidis I, Firth D (2011). Multinomial logit bias reduction via the Poisson log-linear model. Biometrika, 98, 755-759. doi:10.1093/biomet/asr026.
Palmgren, J (1981). The Fisher Information Matrix for Log Linear Models Arguing Conditionally on Observed Explanatory Variables. Biometrika, 68, 563-566. doi:10.1093/biomet/68.2.563.
nnet::multinom()
, bracl()
for adjacent category logit models for ordinal responses
## The housing data analysis from ?MASS::housing data("housing", package = "MASS") # Maximum likelihood using nnet::multinom houseML1nnet <- nnet::multinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing) # Maximum likelihood using brmultinom with baseline category 'Low' houseML1 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 1) # The estimates are numerically the same as houseML0 all.equal(coef(houseML1nnet), coef(houseML1), tolerance = 1e-04) # Maximum likelihood using brmultinom with 'High' as baseline houseML3 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 3) # The fitted values are the same as houseML1 all.equal(fitted(houseML3), fitted(houseML1), tolerance = 1e-10) # Bias reduction houseBR3 <- update(houseML3, type = "AS_mean") # Bias correction houseBC3 <- update(houseML3, type = "correction") ## Reproducing Bull et al. (2002, Table 3) data("hepatitis", package = "brglm2") # Construct a variable with the multinomial categories according to # the HCV and nonABC columns hepat <- hepatitis hepat$type <- with(hepat, factor(1 - HCV * nonABC + HCV + 2 * nonABC)) hepat$type <- factor(hepat$type, labels = c("noDisease", "C", "nonABC")) contrasts(hepat$type) <- contr.treatment(3, base = 1) # Maximum likelihood estimation fails to converge because some estimates are infinite hepML <- brmultinom(type ~ group * time, data = hepat, weights = counts, type = "ML", slowit = 0.1) # Mean bias reduction returns finite estimates hep_meanBR <- brmultinom(type ~ group * time, data = hepat, weights = counts, type = "AS_mean") # The estimates in Bull et al. (2002, Table 3, DOI: 10.1016/S0167-9473(01)00048-2) coef(hep_meanBR) # Median bias reduction also returns finite estimates, which are a bit larger in absolute value hep_medianBR <- brmultinom(type ~ group * time, data = hepat, weights = counts, type = "AS_median") coef(hep_medianBR)
## The housing data analysis from ?MASS::housing data("housing", package = "MASS") # Maximum likelihood using nnet::multinom houseML1nnet <- nnet::multinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing) # Maximum likelihood using brmultinom with baseline category 'Low' houseML1 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 1) # The estimates are numerically the same as houseML0 all.equal(coef(houseML1nnet), coef(houseML1), tolerance = 1e-04) # Maximum likelihood using brmultinom with 'High' as baseline houseML3 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 3) # The fitted values are the same as houseML1 all.equal(fitted(houseML3), fitted(houseML1), tolerance = 1e-10) # Bias reduction houseBR3 <- update(houseML3, type = "AS_mean") # Bias correction houseBC3 <- update(houseML3, type = "correction") ## Reproducing Bull et al. (2002, Table 3) data("hepatitis", package = "brglm2") # Construct a variable with the multinomial categories according to # the HCV and nonABC columns hepat <- hepatitis hepat$type <- with(hepat, factor(1 - HCV * nonABC + HCV + 2 * nonABC)) hepat$type <- factor(hepat$type, labels = c("noDisease", "C", "nonABC")) contrasts(hepat$type) <- contr.treatment(3, base = 1) # Maximum likelihood estimation fails to converge because some estimates are infinite hepML <- brmultinom(type ~ group * time, data = hepat, weights = counts, type = "ML", slowit = 0.1) # Mean bias reduction returns finite estimates hep_meanBR <- brmultinom(type ~ group * time, data = hepat, weights = counts, type = "AS_mean") # The estimates in Bull et al. (2002, Table 3, DOI: 10.1016/S0167-9473(01)00048-2) coef(hep_meanBR) # Median bias reduction also returns finite estimates, which are a bit larger in absolute value hep_medianBR <- brmultinom(type ~ group * time, data = hepat, weights = counts, type = "AS_median") coef(hep_medianBR)
brnb()
is a function that fits negative binomial regression
models using implicit and explicit bias reduction methods.
brnb( formula, data, subset, weights = NULL, offset = NULL, link = "log", start = NULL, etastart = NULL, mustart = NULL, control = list(...), na.action, model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, intercept = TRUE, singular.ok = TRUE, ... )
brnb( formula, data, subset, weights = NULL, offset = NULL, link = "log", start = NULL, etastart = NULL, mustart = NULL, control = list(...), na.action, model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, intercept = TRUE, singular.ok = TRUE, ... )
formula |
an object of class |
data |
an optional data frame, list or environment (or object
coercible by |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
weights |
an optional vector of ‘prior weights’ to be used
in the fitting process. Should be |
offset |
this can be used to specify an a priori known
component to be included in the linear predictor during fitting.
This should be |
link |
The link function. Currently must be one of |
start |
starting values for the parameters in the linear predictor. |
etastart |
starting values for the linear predictor. |
mustart |
starting values for the vector of means. |
control |
a list of parameters for controlling the fitting
process. See |
na.action |
a function which indicates what should happen
when the data contain |
model |
a logical value indicating whether model frame should be included as a component of the returned value. |
x , y
|
For For |
contrasts |
an optional list. See the |
intercept |
logical. Should an intercept be included in the null model? |
singular.ok |
logical; if |
... |
For For |
A detailed description of the fitting procedure is given in the
iteration vignette (see, vignette("iteration", "brglm2")
and
Kosmidis et al, 2020). The number of iterations when estimating
parameters are controlled by the maxit
argument of
brglmControl()
.
The type of score adjustment to be used is specified through the
type
argument (see brglmControl()
for details).
The available options are:
type = "AS_mixed"
: the mixed bias-reducing score
adjustments in Kosmidis et al (2020) that result in mean bias
reduction for the regression parameters and median bias reduction
for the dispersion parameter, if any; default.
type = "AS_mean"
: the mean bias-reducing score
adjustments in Firth (1993) and Kosmidis & Firth (2009).
type = "AS_median"
: the median bias-reducing score
adjustments in Kenne Pagui et al. (2017)
type = "MPL_Jeffreys"
: maximum penalized likelihood
with powers of the Jeffreys prior as penalty.
type = "ML"
: maximum likelihood.
type = "correction"
: asymptotic bias correction, as in
Cordeiro & McCullagh (1991).
The choice of the parameterization for the dispersion is controlled
by the transformation
argument (see brglmControl()
for
details). The default is "identity"
. Using transformation = "inverse"
uses the dispersion parameterization that
MASS::glm.nb()
uses.
A fitted model object of class "brnb"
inheriting
from "negbin"
and "brglmFit"
. The
object is similar to the output of brglmFit()
but contains
four additional components: theta
for the maximum likelihood
estimate of the dispersion parameter as in MASS::glm.nb()
,
vcov.mean
for the estimated variance-covariance matrix of the
regression coefficients, vcov.dispersion
for the estimated
variance of the dispersion parameter in the chosen
parameterization (using the expected information), and
twologlik
for twice the log-likelihood function.
Euloge Clovis Kenne Pagui [aut]
[email protected], Ioannis Kosmidis [aut, cre]
[email protected]
Cordeiro G M, McCullagh P (1991). Bias correction in generalized linear models. Journal of the Royal Statistical Society. Series B (Methodological), 53, 629-643. doi:10.1111/j.2517-6161.1991.tb01852.x.
Firth D (1993). Bias reduction of maximum likelihood estimates. Biometrika. 80, 27-38. doi:10.2307/2336755.
Kenne Pagui E C, Salvan A, Sartori N (2017). Median bias reduction of maximum likelihood estimates. Biometrika, 104, 923–938. doi:10.1093/biomet/asx046.
Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi:10.1007/s11222-019-09860-6.
Kosmidis I, Firth D (2009). Bias reduction in exponential family nonlinear models. Biometrika, 96, 793-804. doi:10.1093/biomet/asp055.
## Example in Saha, K., & Paul, S. (2005). Bias-corrected maximum ## likelihood estimator of the negative binomial dispersion ## parameter. Biometrics, 61, 179--185. # # Number of revertant colonies of salmonella data salmonella <- data.frame(freq = c(15, 16, 16, 27, 33, 20, 21, 18, 26, 41, 38, 27, 29, 21, 33, 60, 41, 42), dose = rep(c(0, 10, 33, 100, 333, 1000), 3), observation = rep(1:3, each = 6)) # Maximum likelihood fit with glm.nb of MASS salmonella_fm <- freq ~ dose + log(dose + 10) fitML_glmnb <- MASS::glm.nb(salmonella_fm, data = salmonella) # Maximum likelihood fit with brnb fitML <- brnb(salmonella_fm, data = salmonella, link = "log", transformation = "inverse", type = "ML") # Mean bias-reduced fit fitBR_mean <- update(fitML, type = "AS_mean") # Median bias-reduced fit fitBR_median <- update(fitML, type = "AS_median") # Mixed bias-reduced fit fitBR_mixed <- update(fitML, type = "AS_mixed") # Mean bias-corrected fit fitBC_mean <- update(fitML, type = "correction") # Penalized likelihood with Jeffreys-prior penalty fit_Jeffreys <- update(fitML, type = "MPL_Jeffreys") # The parameter estimates from glm.nb and brnb with type = "ML" are # numerically the same all.equal(c(coef(fitML_glmnb), fitML_glmnb$theta), coef(fitML, model = "full"), check.attributes = FALSE) # Because of the invariance properties of the maximum likelihood, # median reduced-bias, and mixed reduced-bias estimators the # estimate of a monotone function of the dispersion should be # (numerically) the same as the function of the estimate of the # dispersion: # ML coef(fitML, model = "dispersion") 1 / coef(update(fitML, transformation = "identity"), model = "dispersion") # Median BR coef(fitBR_median, model = "dispersion") 1 / coef(update(fitBR_median, transformation = "identity"), model = "dispersion") # Mixed BR coef(fitBR_mixed, model = "dispersion") 1 / coef(update(fitBR_mixed, transformation = "identity"), model = "dispersion") ## The same is not true for mean BR coef(fitBR_mean, model = "dispersion") 1 / coef(update(fitBR_mean, transformation = "identity"), model = "dispersion") ## An example from Venables & Ripley (2002, p.169). data("quine", package = "MASS") quineML <- brnb(Days ~ Sex/(Age + Eth*Lrn), link = "sqrt", transformation="inverse", data = quine, type="ML") quineBR_mean <- update(quineML, type = "AS_mean") quineBR_median <- update(quineML, type = "AS_median") quineBR_mixed <- update(quineML, type = "AS_mixed") quine_Jeffreys <- update(quineML, type = "MPL_Jeffreys") fits <- list(ML = quineML, AS_mean = quineBR_mean, AS_median = quineBR_median, AS_mixed = quineBR_mixed, MPL_Jeffreys = quine_Jeffreys) sapply(fits, coef, model = "full")
## Example in Saha, K., & Paul, S. (2005). Bias-corrected maximum ## likelihood estimator of the negative binomial dispersion ## parameter. Biometrics, 61, 179--185. # # Number of revertant colonies of salmonella data salmonella <- data.frame(freq = c(15, 16, 16, 27, 33, 20, 21, 18, 26, 41, 38, 27, 29, 21, 33, 60, 41, 42), dose = rep(c(0, 10, 33, 100, 333, 1000), 3), observation = rep(1:3, each = 6)) # Maximum likelihood fit with glm.nb of MASS salmonella_fm <- freq ~ dose + log(dose + 10) fitML_glmnb <- MASS::glm.nb(salmonella_fm, data = salmonella) # Maximum likelihood fit with brnb fitML <- brnb(salmonella_fm, data = salmonella, link = "log", transformation = "inverse", type = "ML") # Mean bias-reduced fit fitBR_mean <- update(fitML, type = "AS_mean") # Median bias-reduced fit fitBR_median <- update(fitML, type = "AS_median") # Mixed bias-reduced fit fitBR_mixed <- update(fitML, type = "AS_mixed") # Mean bias-corrected fit fitBC_mean <- update(fitML, type = "correction") # Penalized likelihood with Jeffreys-prior penalty fit_Jeffreys <- update(fitML, type = "MPL_Jeffreys") # The parameter estimates from glm.nb and brnb with type = "ML" are # numerically the same all.equal(c(coef(fitML_glmnb), fitML_glmnb$theta), coef(fitML, model = "full"), check.attributes = FALSE) # Because of the invariance properties of the maximum likelihood, # median reduced-bias, and mixed reduced-bias estimators the # estimate of a monotone function of the dispersion should be # (numerically) the same as the function of the estimate of the # dispersion: # ML coef(fitML, model = "dispersion") 1 / coef(update(fitML, transformation = "identity"), model = "dispersion") # Median BR coef(fitBR_median, model = "dispersion") 1 / coef(update(fitBR_median, transformation = "identity"), model = "dispersion") # Mixed BR coef(fitBR_mixed, model = "dispersion") 1 / coef(update(fitBR_mixed, transformation = "identity"), model = "dispersion") ## The same is not true for mean BR coef(fitBR_mean, model = "dispersion") 1 / coef(update(fitBR_mean, transformation = "identity"), model = "dispersion") ## An example from Venables & Ripley (2002, p.169). data("quine", package = "MASS") quineML <- brnb(Days ~ Sex/(Age + Eth*Lrn), link = "sqrt", transformation="inverse", data = quine, type="ML") quineBR_mean <- update(quineML, type = "AS_mean") quineBR_median <- update(quineML, type = "AS_median") quineBR_mixed <- update(quineML, type = "AS_mixed") quine_Jeffreys <- update(quineML, type = "MPL_Jeffreys") fits <- list(ML = quineML, AS_mean = quineBR_mean, AS_median = quineBR_median, AS_mixed = quineBR_mixed, MPL_Jeffreys = quine_Jeffreys) sapply(fits, coef, model = "full")
This data set contains survival data on government coalitions in parliamentary democracies (Belgium, Canada, Denmark, Finland, France, Iceland, Ireland, Israel, Italy, Netherlands, Norway, Portugal, Spain, Sweden, and the United Kingdom) for the period 1945-1987. For parsimony, country indicator variables are omitted in the sample data.
coalition
coalition
A data frame with 314 rows and the 7 variables "duration", "ciep12", "invest", "fract", "polar", "numst2", and "crisis". For variable descriptions, please refer to King et al (1990).
Data is as it is provided by the Zeilig R package.
King G, Alt J E, Burns N E, Laver M. (1990). A Unified Model of Cabinet Dissolution in Parliamentary Democracies. American Journal of Political Science, 34, 846-870. doi:10.2307/2111401.
King G, Alt J E, Burns N E, Laver M. ICPSR Publication Related Archive, 1115.
"brglmFit"
objectsExtract model coefficients from "brglmFit"
objects
## S3 method for class 'brglmFit' coef(object, model = c("mean", "full", "dispersion"), ...)
## S3 method for class 'brglmFit' coef(object, model = c("mean", "full", "dispersion"), ...)
object |
an object for which the extraction of model coefficients is meaningful. |
model |
one of |
... |
other arguments. |
See coef()
for more details.
"brglmFit_expo"
objectsExtract estimates from "brglmFit_expo"
objects
## S3 method for class 'brglmFit_expo' coef(object, ...)
## S3 method for class 'brglmFit_expo' coef(object, ...)
object |
an object for which the extraction of model coefficients is meaningful. |
... |
other arguments. |
"brnb"
objectsExtract model coefficients from "brnb"
objects
## S3 method for class 'brnb' coef(object, model = c("mean", "full", "dispersion"), ...)
## S3 method for class 'brnb' coef(object, model = c("mean", "full", "dispersion"), ...)
object |
an object for which the extraction of model coefficients is meaningful. |
model |
one of |
... |
other arguments. |
See coef()
for more details.
"brglmFit"
objectMethod for computing confidence intervals for one or more
regression parameters in a "brglmFit"
object
## S3 method for class 'brglmFit' confint(object, parm, level = 0.95, ...)
## S3 method for class 'brglmFit' confint(object, parm, level = 0.95, ...)
object |
a fitted model object. |
parm |
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
... |
additional argument(s) for methods. |
"brmultinom"
objectMethod for computing confidence intervals for one or more
regression parameters in a "brmultinom"
object
## S3 method for class 'brmultinom' confint(object, parm, level = 0.95, ...)
## S3 method for class 'brmultinom' confint(object, parm, level = 0.95, ...)
object |
a fitted model object. |
parm |
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
... |
additional argument(s) for methods. |
"brnb"
objectMethod for computing Wald confidence intervals for one or more
regression parameters in a "brnb"
object
## S3 method for class 'brnb' confint(object, parm, level = 0.95, ...)
## S3 method for class 'brnb' confint(object, parm, level = 0.95, ...)
object |
a fitted model object. |
parm |
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
... |
additional argument(s) for methods. |
Histology grade and risk factors for 79 cases of endometrial cancer
endometrial
endometrial
A data frame with 79 rows and 4 variables:
NV
: neovasculization with coding 0 for absent and 1 for present
PI
: pulsality index of arteria uterina
EH
: endometrium height
HG
histology grade with coding 0 for low grade and 1 for high grade
The packaged data set was downloaded in .dat
format from
https://users.stat.ufl.edu/~aa/glm/data/. The latter link
provides the data sets used in Agresti (2015).
The endometrial data set was first analyzed in Heinze and Schemper (2002), and was originally provided by Dr E. Asseryanis from the Medical University of Vienna.
Agresti A (2015). Foundations of Linear and Generalized Linear Models. Wiley Series in Probability and Statistics. Wiley.
Heinze G, Schemper M (2002). A Solution to the Problem of Separation in Logistic Regression. Statistics in Medicine, 21, 2409–2419. doi:10.1002/sim.1047.
Kosmidis I, Firth D (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71-82. doi:10.1093/biomet/asaa052.
Liver enzyme data collected from 218 patients with liver disease
(Plomteux, 1980). The laboratory profile consists of enzymatic
activity measured for four liver enzymes: aspartate
aminotransferase (AST
), alanine aminotransferase (ALT
),
glutamate dehydrogenase (GLDH
) and ornithine carbamyltransferase
(OCT
).
enzymes
enzymes
A data frame with 218 rows and the following 6 columns:
Patient
: Patient ID
Group
: Four diagnostic groups were considered: acute viral
hepatitis (1), persistent chronic hepatitis (2), aggressive chronic
hepatitis (3) and post-necrotic cirrhosis (4).
AST
: Aspartate aminotransferase (in U/L)
ALT
: Alanine aminotransferase (in U/L)
GLDH
: Glutamate dehydrogenase (in U/L)
OCT
: Ornithine carbamyltransferase (in U/L)
Data from Albert and Harris (1984, Chapter 5, Appendix I), and is also provided by the pmlr R package.
Albert A, Harris E K (1984). Multivariate Interpretation of Clinical Laboratory Data. Dekker: New York.
Plomteux G (1980). Multivariate analysis of an enzyme profile for the differential diagnosis of viral hepatitis. Clinical Chemistry, 26, 1897-1899.
The expo()
method uses the supplied "brglmFit"
or
"glm"
object to estimate the exponential of parameters of
generalized linear models with maximum likelihood or various mean
and median bias reduction methods. expo()
is useful for computing
(corrected) estimates of the multiplicative impact of a unit
increase on a covariate on the mean of a Poisson log-linear model
(family = poisson("log")
in glm()
) while adjusting for other
covariates, the odds ratio associated with a unit increase on a
covariate in a logistic regression model (family = binomial("logit")
in glm()
) while adjusting for other
covariates, the relative risk associated with a unit increase on a
covariate in a relative risk regression model (family = binomial("log")
in glm()
) while adjusting for other covariates,
among others.
## S3 method for class 'brglmFit' expo( object, type = c("correction*", "correction+", "Lylesetal2012", "AS_median", "ML"), level = 0.95 ) ## S3 method for class 'glm' expo( object, type = c("correction*", "correction+", "Lylesetal2012", "AS_median", "ML"), level = 0.95 )
## S3 method for class 'brglmFit' expo( object, type = c("correction*", "correction+", "Lylesetal2012", "AS_median", "ML"), level = 0.95 ) ## S3 method for class 'glm' expo( object, type = c("correction*", "correction+", "Lylesetal2012", "AS_median", "ML"), level = 0.95 )
object |
an object of class |
type |
the type of correction to be used. The available
options are |
level |
the confidence level required. Default is |
The supported methods through the type
argument are:
"ML"
: the estimates of the exponentiated parameters are
, where
is the maximum
likelihood estimates for the
th regression parameter.
"correction*"
: the estimates of the exponentiated parameters
are , where
is the estimate of the
th regression
parameter using
type = "AS_mixed"
in brglmFit()
.
"correction+"
: the estimates of the exponentiated parameters
are , where
is the estimate of the
th regression
parameter using
type = "AS_mixed"
in brglmFit()
.
"Lylesetal2012"
: the estimates of the exponentiated parameters
are , where
is the estimate of the
th regression
parameter using
type = "AS_mixed"
in brglmFit()
. This estimator
has been proposed in Lyles et al. (2012).
"AS_median"
: the estimates of the exponentiated parameters are
, where
is the estimate
of the
th regression parameter using
type = "AS_median"
in
brglmFit()
.
"correction*"
and "correction+"
are based on multiplicative and
additive adjustments, respectively, of the exponential of a
reduced-bias estimator (like the ones coming from brglmFit()
with
type = "AS_mixed"
, type = "AS_mean"
, and type = "correction"
). The form of those adjustments results from the
expression of the first-term in the mean bias expansion of the
exponential of a reduced-bias estimator. See, for example, Di
Caterina & Kosmidis (2019, expression 12) for the general form of
the first-term of the mean bias of a smooth transformation of a
reduced-bias estimator.
The estimators from "correction+"
, "correction*"
,
"Lylesetal2012"
have asymptotic mean bias of order smaller than
than of the maximum likelihood estimator. The estimators from
"AS_median"
are asymptotically closed to being median unbiased
than the maximum likelihood estimator is.
Estimated standard errors are computed using the delta method, where both the Jacobin and the information matrix are evaluated at the logarithm of the estimates of the exponentiated parameters.
Confidence intervals results by taking the exponential of the limits of standard Wald-type intervals computed at the logarithm of the estimates of the exponentiated parameters.
a list inheriting from class "brglmFit_expo"
with
components coef
(the estimates of the exponentiated
regression parameters), se
(the corresponding estimated
standard errors for the exponentiated parameters), ci
(confidence intervals of level level
for the exponentiated
parameters), and type
for the type
of correction that has
been requested.
Ioannis Kosmidis [aut, cre]
[email protected]
Di Caterina C, Kosmidis I (2019). Location-Adjusted Wald Statistics for Scalar Parameters. Computational Statistics & Data Analysis, 138, 126-142. doi:10.1016/j.csda.2019.04.004.
Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi:10.1007/s11222-019-09860-6.
Cordeiro G M, McCullagh P (1991). Bias correction in generalized linear models. Journal of the Royal Statistical Society. Series B (Methodological), 53, 629-643. doi:10.1111/j.2517-6161.1991.tb01852.x.
Lyles R H, Guo Y, Greenland S (2012). Reducing bias and mean squared error associated with regression-based odds ratio estimators. Journal of Statistical Planning and Inference, 142 3235–3241. doi:10.1016/j.jspi.2012.05.005.
brglm_fit()
and and brglm_control()
## The lizards example from ?brglm::brglm lizardsML <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "glm.fit") # Get estimates, standard errors, and confidence intervals of odds # ratios with various methods expo(lizardsML, type = "ML") expo(lizardsML, type = "correction*") expo(lizardsML, type = "Lylesetal2012") expo(lizardsML, type = "correction+") expo(lizardsML, type = "AS_median") ## Example from ?glm ## Dobson (1990) Page 93: Randomized Controlled Trial : counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) glm.D93 <- glm(counts ~ outcome + treatment, family = poisson()) expo(glm.D93, type = "correction*")
## The lizards example from ?brglm::brglm lizardsML <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "glm.fit") # Get estimates, standard errors, and confidence intervals of odds # ratios with various methods expo(lizardsML, type = "ML") expo(lizardsML, type = "correction*") expo(lizardsML, type = "Lylesetal2012") expo(lizardsML, type = "correction+") expo(lizardsML, type = "AS_median") ## Example from ?glm ## Dobson (1990) Page 93: Randomized Controlled Trial : counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) glm.D93 <- glm(counts ~ outcome + treatment, family = poisson()) expo(glm.D93, type = "correction*")
Data from a randomized double-blind trial to assess whether
withholding donor blood positive for the non-A, non-B ("NANB"
)
surrogate markers would reduce the frequency of post-transfusion
hepatitis. The dataset contains 4588
subjects enrolled from 1988
to 1992 into two study groups that received allogenic blood from
which units positive for NANB surrogate markers were withheld (n =
2311
) or not withheld (n = 2277
). Subjects were followed up
for 6 months and assessed for the presence of post-transfusion
hepatitis.
hepatitis
hepatitis
A data frame with 28 rows and the following 6 columns:
city
: Subjects were recruited from 3 Canadian Red Cross Society
Blood Centres and 13 university-affiliated hospitals in 3 cities:
Toronto, Hamilton and Winnipeg.
group
: Eligible subjects were assigned to one of two allogenic
blood recipient groups. One group received products that had only
routine Canadian transfusion-transmissible disease marker screening
(no-withhold). The other group received only products that were
not positive for NANB surrogate markers (withhold).
time
: Hepatitis C (HCV) screening was introduced in Canada in
May, 1990. Subjects were recruited into the study before (pre) and
after (post) the introduction of anti-HCV testing.
HCV
: Post-transfusion HCV hepatitis present (1) or absent (0).
nonABC
: Post-transfusion non-A, non-B, non-C hepatitis present (1) or absent (0)
counts
: Number of subjects
Data is from Blajchman et al. (1995), also analyzed in Bull et al. (2002), and is also provided by the pmlr R package.
Bull S B, Mak C, Greenwood C M T (2002). A modified score function estimator for multinomial logistic regression in small samples. Computational Statistics & Data Analysis, 39, 57-74. doi:10.1016/S0167-9473(01)00048-2
Blajchman M A, Bull S B and Feinman S V (1995). Post-transfusion hepatitis: impact of non-A, non-B hepatitis surrogate tests. The Lancet, 345, 21–25. doi:10.1016/S0140-6736(95)91153-7
Habitat preferences of lizards
lizards
lizards
A data frame with 23 rows and 6 columns:
grahami
. count of grahami lizards
opalinus
. count of opalinus lizards
height
. a factor with levels <5ft
, >=5ft
diameter
. a factor with levels <=2in
, >2in
light
. a factor with levels sunny
, shady
time
. a factor with levels early
, midday
, late
The variables grahami
and opalinus
are counts of two lizard
species at two different perch heights, two different perch
diameters, in sun and in shade, at three times of day.
McCullagh P, Nelder J A (1989) Generalized Linear Models (2nd Edition). London: Chapman and Hall.
Originally from
Schoener T W (1970) Nonsynchronous spatial overlap of lizards in patchy habitats. _Ecology_ *51*, 408-418.
"link-glm"
object for misclassified responses in binomial regression modelsmis()
is a "link-glm"
object that specifies the link function in Neuhaus (1999, expression (8)) for handling misclassified responses in binomial regression models using maximum likelihood. A prior specification of the sensitivity and specificity is required.
mis(link = "logit", sensitivity = 1, specificity = 1)
mis(link = "logit", sensitivity = 1, specificity = 1)
link |
the baseline link to be used. |
sensitivity |
the probability of observing a success given that a success actually took place given any covariate values. |
specificity |
the probability of observing a failure given that a failure actually took place given any covariate values. |
sensitivity + specificity
should be greater or equal to 1,
otherwise it is implied that the procedure producing the responses
performs worse than chance in terms of misclassification.
Neuhaus J M (1999). Bias and efficiency loss due to misclassified responses in binary regression. Biometrika, 86, 843-855. https://www.jstor.org/stable/2673589.
## Define a few links with some misclassification logit_mis <- mis(link = "logit", sensitivity = 0.9, specificity = 0.9) lizards_f <- cbind(grahami, opalinus) ~ height + diameter + light + time lizardsML <- glm(lizards_f, family = binomial(logit), data = lizards) lizardsML_mis <- update(lizardsML, family = binomial(logit_mis), start = coef(lizardsML)) ## A notable change is coefficients is noted here compared to when ## specificity and sensitity are 1 coef(lizardsML) coef(lizardsML_mis) ## Bias reduction is also possible update(lizardsML_mis, method = "brglmFit", type = "AS_mean", start = coef(lizardsML)) update(lizardsML_mis, method = "brglmFit", type = "AS_median", start = coef(lizardsML))
## Define a few links with some misclassification logit_mis <- mis(link = "logit", sensitivity = 0.9, specificity = 0.9) lizards_f <- cbind(grahami, opalinus) ~ height + diameter + light + time lizardsML <- glm(lizards_f, family = binomial(logit), data = lizards) lizardsML_mis <- update(lizardsML, family = binomial(logit_mis), start = coef(lizardsML)) ## A notable change is coefficients is noted here compared to when ## specificity and sensitity are 1 coef(lizardsML) coef(lizardsML_mis) ## Bias reduction is also possible update(lizardsML_mis, method = "brglmFit", type = "AS_mean", start = coef(lizardsML)) update(lizardsML_mis, method = "brglmFit", type = "AS_median", start = coef(lizardsML))
ordinal_superiority()
is a method for the estimation and
inference about model-based ordinal superiority scores introduced
in Agresti and Kateri (2017, Section 5) from fitted objects. The
mean bias of the estimates of the ordinal superiority scores can be
corrected.
## S3 method for class 'bracl' ordinal_superiority( object, formula, data, measure = c("gamma", "Delta"), level = 0.95, bc = FALSE )
## S3 method for class 'bracl' ordinal_superiority( object, formula, data, measure = c("gamma", "Delta"), level = 0.95, bc = FALSE )
object |
a fitted object from an ordinal regression
model. Currently only models from class |
formula |
a RHS formula indicating the group variable to use. |
data |
an optional data frame in which to look for variables
with which to compute ordinal superiority measures. If
omitted, an attempt is made to use the data that produced
|
measure |
either |
level |
the confidence level required when computing confidence intervals for the ordinal superiority measures. |
bc |
logical. If |
Agresti, A., Kateri, M. (2017). Ordinal probability effect measures for group comparisons in multinomial cumulative link models. Biometrics, 73 214-219. doi:10.1111/biom.12565.
data("stemcell", package = "brglm2") # Adjacent category logit (proportional odds) stem <- within(stemcell, {nreligion = as.numeric(religion)}) fit_bracl_p <- bracl(research ~ nreligion + gender, weights = frequency, data = stem, type = "ML", parallel = TRUE) # Estimates and 95% confidence intervals for the probabilities that the response # category for gender "female" is higher than the response category for gender "male", # while adjusting for religion. ordinal_superiority(fit_bracl_p, ~ gender) ## Not run: # And their (very-similar in value here) bias corrected versions # with 99% CIs ordinal_superiority(fit_bracl_p, ~ gender, bc = TRUE, level = 0.99) # Note that the object is refitted with type = "AS_mean" ## End(Not run)
data("stemcell", package = "brglm2") # Adjacent category logit (proportional odds) stem <- within(stemcell, {nreligion = as.numeric(religion)}) fit_bracl_p <- bracl(research ~ nreligion + gender, weights = frequency, data = stem, type = "ML", parallel = TRUE) # Estimates and 95% confidence intervals for the probabilities that the response # category for gender "female" is higher than the response category for gender "male", # while adjusting for religion. ordinal_superiority(fit_bracl_p, ~ gender) ## Not run: # And their (very-similar in value here) bias corrected versions # with 99% CIs ordinal_superiority(fit_bracl_p, ~ gender, bc = TRUE, level = 0.99) # Note that the object is refitted with type = "AS_mean" ## End(Not run)
Obtain class and probability predictions from a fitted adjacent category logits model.
## S3 method for class 'bracl' predict(object, newdata, type = c("class", "probs"), ...)
## S3 method for class 'bracl' predict(object, newdata, type = c("class", "probs"), ...)
object |
a fitted object of class inheriting from |
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used. |
type |
the type of prediction required. The default is
|
... |
further arguments passed to or from other methods. |
If newdata
is omitted the predictions are based on the data
used for the fit.
If type = "class"
a vector with the predicted response
categories; if type = "probs"
a matrix of probabilities for all
response categories at newdata
.
data("stemcell", package = "brglm2") # Adjacent category logit (non-proportional odds) fit_bracl <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML") # Adjacent category logit (proportional odds) fit_bracl_p <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML", parallel = TRUE) # New data newdata <- expand.grid(gender = c("male", "female"), religion = c("liberal", "moderate", "fundamentalist")) # Predictions sapply(c("class", "probs"), function(what) predict(fit_bracl, newdata, what)) sapply(c("class", "probs"), function(what) predict(fit_bracl_p, newdata, what))
data("stemcell", package = "brglm2") # Adjacent category logit (non-proportional odds) fit_bracl <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML") # Adjacent category logit (proportional odds) fit_bracl_p <- bracl(research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML", parallel = TRUE) # New data newdata <- expand.grid(gender = c("male", "female"), religion = c("liberal", "moderate", "fundamentalist")) # Predictions sapply(c("class", "probs"), function(what) predict(fit_bracl, newdata, what)) sapply(c("class", "probs"), function(what) predict(fit_bracl_p, newdata, what))
Obtain class and probability predictions from a fitted baseline category logits model.
## S3 method for class 'brmultinom' predict(object, newdata, type = c("class", "probs"), ...)
## S3 method for class 'brmultinom' predict(object, newdata, type = c("class", "probs"), ...)
object |
a fitted object of class inheriting from
|
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used. |
type |
the type of prediction required. The default is
|
... |
further arguments passed to or from other methods. |
If newdata
is omitted the predictions are based on the data used
for the fit.
If type = "class"
a vector with the predicted response
categories; if type = "probs"
a matrix of probabilities for all
response categories at newdata
.
data("housing", package = "MASS") # Maximum likelihood using brmultinom with baseline category 'Low' houseML1 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 1) # New data newdata <- expand.grid(Infl = c("Low", "Medium"), Type = c("Tower", "Atrium", "Terrace"), Cont = c("Low", NA, "High")) ## Predictions sapply(c("class", "probs"), function(what) predict(houseML1, newdata, what))
data("housing", package = "MASS") # Maximum likelihood using brmultinom with baseline category 'Low' houseML1 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 1) # New data newdata <- expand.grid(Infl = c("Low", "Medium"), Type = c("Tower", "Atrium", "Terrace"), Cont = c("Low", NA, "High")) ## Predictions sapply(c("class", "probs"), function(what) predict(houseML1, newdata, what))
Residuals for multinomial logistic regression and adjacent category logit models
## S3 method for class 'brmultinom' residuals(object, type = c("pearson", "response", "deviance", "working"), ...) ## S3 method for class 'bracl' residuals(object, type = c("pearson", "response", "deviance", "working"), ...)
## S3 method for class 'brmultinom' residuals(object, type = c("pearson", "response", "deviance", "working"), ...) ## S3 method for class 'bracl' residuals(object, type = c("pearson", "response", "deviance", "working"), ...)
object |
the object coming out of |
type |
the type of residuals which should be returned. The
options are: |
... |
Currently not used. |
The residuals computed are the residuals from the equivalent
Poisson log-linear model fit, organized in a form that matches the
output of fitted(object, type = "probs")
. As a result, the output
is residuals defined in terms of the object and expected
multinomial counts.
brmultinom bracl
"brmultinom"
and "bracl"
objectsMethod for simulating a data set from "brmultinom"
and "bracl"
objects
## S3 method for class 'brmultinom' simulate(object, ...)
## S3 method for class 'brmultinom' simulate(object, ...)
object |
an object of class |
... |
currently not used. |
A "data.frame"
with object$ncat
times the rows that
model.frame(object)
have and the same variables. If weights
has
been specified in the call that generated object
, then the
simulate frequencies will populate the weights variable. Otherwise,
the resulting data.frame will have a ".weights"
variable with
the simulated multinomial counts.
## Multinomial logistic regression data("housing", package = "MASS") houseML1 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 1) simulate(houseML1) ## Adjacent-category logits data("stemcell", package = "brglm2") stemML1 <- bracl(research ~ religion + gender, weights = frequency, data = stemcell, type = "ML") simulate(stemML1)
## Multinomial logistic regression data("housing", package = "MASS") houseML1 <- brmultinom(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 1) simulate(houseML1) ## Adjacent-category logits data("stemcell", package = "brglm2") stemML1 <- bracl(research ~ religion + gender, weights = frequency, data = stemcell, type = "ML") simulate(stemML1)
Simulate one or more responses from the distribution corresponding
to a fitted model "brnb"
object.
## S3 method for class 'brnb' simulate(object, nsim = 1, seed = NULL, ...)
## S3 method for class 'brnb' simulate(object, nsim = 1, seed = NULL, ...)
object |
an object representing a fitted model. |
nsim |
number of response vectors to simulate. Defaults to 1. |
seed |
an object specifying if and how the random number
generator should be initialized; see |
... |
extra arguments to be passed to methods. Not currently used. |
# Example in Saha, K., & Paul, S. (2005). Bias-corrected maximum # likelihood estimator of the negative binomial dispersion # parameter. Biometrics, 61, 179--185. # # Frequency distribution of red mites on apple leaves. nomites <- 0:8 noleaves <- c(70, 38, 17, 10, 9, 3, 2, 1, 0) fit_glmnb <- MASS::glm.nb(nomites~1,link="identity",weights = noleaves) fit_brnb <- brnb(nomites ~ 1, link = "identity", transformation = "inverse", type = "ML",weights = noleaves) ## Let us simulate 10 response vectors sim_glmnb <- simulate(fit_glmnb, nsim = 10, seed = 123) sim_brnb <- simulate(fit_brnb, nsim = 10, seed = 123) # The results from glm.nb and brnb with type = "ML" are # exactly the same all.equal(sim_glmnb, sim_brnb, check.attributes = FALSE)
# Example in Saha, K., & Paul, S. (2005). Bias-corrected maximum # likelihood estimator of the negative binomial dispersion # parameter. Biometrics, 61, 179--185. # # Frequency distribution of red mites on apple leaves. nomites <- 0:8 noleaves <- c(70, 38, 17, 10, 9, 3, 2, 1, 0) fit_glmnb <- MASS::glm.nb(nomites~1,link="identity",weights = noleaves) fit_brnb <- brnb(nomites ~ 1, link = "identity", transformation = "inverse", type = "ML",weights = noleaves) ## Let us simulate 10 response vectors sim_glmnb <- simulate(fit_glmnb, nsim = 10, seed = 123) sim_brnb <- simulate(fit_brnb, nsim = 10, seed = 123) # The results from glm.nb and brnb with type = "ML" are # exactly the same all.equal(sim_glmnb, sim_brnb, check.attributes = FALSE)
A data set from the 2006 General Social Survey that shows the relationship in the United States between opinion about funding stem cell research and the fundamentalism/liberalism of one’s religious beliefs, stratified by gender.
stemcell
stemcell
A data frame with 24 rows and 4 variables:
research
: opinion about funding stem cell research with levels definitely
, probably
, probably not
, definitely not
gender
: the gender of the respondent with levels female
and male
religion
: the fundamentalism/liberalism of one’s religious beliefs with levels fundamentalist
, moderate
,
liberal
frequency
: the number of times a respondent fell in each of the combinations of levels for research
, religion
and gender
The stemcell
data set is analyzed in Agresti (2010, Subsection 4.1.5).
Agresti A (2010). Analysis of Ordinal Categorical Data (2nd edition). Wiley Series in Probability and Statistics. Wiley.
summary()
method for brglmFit objectssummary()
method for brglmFit objects
## S3 method for class 'brglmFit' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, ... ) ## S3 method for class 'summary.brglmFit' print( x, digits = max(3L, getOption("digits") - 3L), symbolic.cor = x$symbolic.cor, signif.stars = getOption("show.signif.stars"), ... )
## S3 method for class 'brglmFit' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, ... ) ## S3 method for class 'summary.brglmFit' print( x, digits = max(3L, getOption("digits") - 3L), symbolic.cor = x$symbolic.cor, signif.stars = getOption("show.signif.stars"), ... )
object |
an object of class |
dispersion |
the dispersion parameter for the family used.
Either a single numerical value or |
correlation |
logical; if |
symbolic.cor |
logical. If |
... |
further arguments passed to or from other methods. |
x |
an object of class |
digits |
the number of significant digits to use when printing. |
signif.stars |
logical. If |
The interface of the summary method for "brglmFit"
objects is identical to that of "glm"
objects. The summary
method for "brglmFit"
objects computes the p-values of the
individual Wald statistics based on the standard normal
distribution, unless the family is Gaussian, in which case a t
distribution with appropriate degrees of freedom is used.
summary.glm()
and glm()
## For examples see `examples(brglmFit)`
## For examples see `examples(brglmFit)`
summary()
method for "brnb"
objectssummary()
method for "brnb"
objects
## S3 method for class 'brnb' summary(object, ...) ## S3 method for class 'summary.brnb' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'brnb' summary(object, ...) ## S3 method for class 'summary.brnb' print(x, digits = max(3, getOption("digits") - 3), ...)
object |
an object of class |
... |
further arguments passed to or from other methods. |
x |
an object of class |
digits |
the number of significant digits to use when printing. |
The interface of the summary method for "brnb"
objects is similar to that of "brglmFit"
objects
with additional information.
p-values of the individual Wald statistics are based on the standard normal distribution.
# For examples see examples(brnb)
# For examples see examples(brnb)
brglmFit()
objectReturn the variance-covariance matrix for the regression parameters
in a brglmFit()
object
## S3 method for class 'brglmFit' vcov(object, model = c("mean", "full", "dispersion"), complete = TRUE, ...)
## S3 method for class 'brglmFit' vcov(object, model = c("mean", "full", "dispersion"), complete = TRUE, ...)
object |
a fitted model object, typically. Sometimes also a
|
model |
character specifying for which component of the model coefficients should be extracted. |
complete |
for the |
... |
additional arguments for method functions. For the
|
The options for model
are "mean"
for mean regression parameters
only (default), "dispersion"
for the dispersion parameter (or the
transformed dispersion; see brglm_control()
), and "full"
for
both the mean regression and the (transformed) dispersion
parameters.
"brnb"
objectsExtract model variance-covariance matrix from "brnb"
objects
## S3 method for class 'brnb' vcov(object, model = c("mean", "full", "dispersion"), complete = TRUE, ...)
## S3 method for class 'brnb' vcov(object, model = c("mean", "full", "dispersion"), complete = TRUE, ...)
object |
an object of class |
model |
character specifying for which component of the model variance-covariance matrix should be extracted. |
complete |
for the |
... |
additional arguments for method functions. For the
|
The options for model
are "mean"
for mean regression only
(default), "dispersion"
for the dispersion parameter (in a
chosen transformation; see brglmControl()
, and "full"
for both
the mean regression and the (transformed) dispersion parameters.
See vcov()
for more details.