Title: Bias Reduction in Binomial-Response Generalized Linear Models
Description: Fit generalized linear models with binomial responses using either an adjusted-score approach to bias reduction or maximum penalized likelihood where penalization is by Jeffreys invariant prior. These procedures return estimates with improved frequentist properties (bias, mean squared error) that are always finite even in cases where the maximum likelihood estimates are infinite (data separation). Fitting takes place by fitting generalized linear models on iteratively updated pseudo-data. The interface is essentially the same as 'glm'. More flexibility is provided by the fact that custom pseudo-data representations can be specified and used for model fitting. Functions are provided for the construction of confidence intervals for the reduced-bias estimates.
Authors: Ioannis Kosmidis [aut, cre]
Maintainer: Ioannis Kosmidis <[email protected]>
License: GPL (>= 2)
Version: 0.7.2
Built: 2024-11-10 05:14:52 UTC
Source: https://github.com/ikosmidis/brglm
Fits binomial-response GLMs using the bias-reduction method developed in Firth (1993) for the removal of the leading, O(n^{-1}), term from the asymptotic expansion of the bias of the maximum likelihood estimator. Fitting is performed using pseudo-data representations, as described in Kosmidis (2007, Chapter 5). For estimation in binomial-response GLMs, the bias-reduction method is an improvement over traditional maximum likelihood because:

the bias-reduced estimator is second-order unbiased and has smaller variance than the maximum likelihood estimator, and

the resulting estimates and their corresponding standard errors are always finite, while the maximum likelihood estimates can be infinite (in situations where complete or quasi-complete separation occurs); see Kosmidis & Firth (2021) for the proof of finiteness in logistic regression models. A brief illustration of this point follows.
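As a minimal sketch of the finiteness property, consider the small separated data set that is also used in the examples further below (the printed estimates are not reproduced here):

## Quasi-complete separation: x2 perfectly predicts the response
y <- c(1, 1, 0, 0)
totals <- c(2, 2, 2, 2)
x1 <- c(1, 0, 1, 0)
x2 <- c(1, 1, 0, 0)
# Maximum likelihood: the estimates for (Intercept) and x2 diverge
glm(y/totals ~ x1 + x2, weights = totals, family = binomial())
# Bias-reduced fit: estimates and standard errors are finite
brglm(y/totals ~ x1 + x2, weights = totals, family = binomial())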
brglm(formula, family = binomial, data, weights, subset, na.action,
      start = NULL, etastart, mustart, offset,
      control.glm = glm.control1(...), model = TRUE, method = "brglm.fit",
      pl = FALSE, x = FALSE, y = TRUE, contrasts = NULL,
      control.brglm = brglm.control(...), ...)

brglm.fit(x, y, weights = rep(1, nobs), start = NULL, etastart = NULL,
          mustart = NULL, offset = rep(0, nobs), family = binomial(),
          control = glm.control(), control.brglm = brglm.control(),
          intercept = TRUE, pl = FALSE)
formula: as in glm.
family: as in glm; only binomial-response models are currently supported.
data: as in glm.
weights: as in glm.
subset: as in glm.
na.action: as in glm.
start: as in glm.
etastart: as in glm.
mustart: as in glm.
offset: as in glm.
control.glm: a list of parameters for controlling the glm.fit steps, as returned by glm.control1; it plays the role of the control argument of glm.
control: same as in glm.fit (used by brglm.fit).
intercept: as in glm.fit.
model: as in glm.
method: the method to be used for fitting the model. The default method is "brglm.fit", which fits the model using either the bias-reducing modified scores or maximum penalized likelihood (see the pl argument). The ordinary maximum likelihood fit can be obtained with method = "glm.fit".
pl: a logical value indicating whether the model should be fitted using maximum penalized likelihood, where the penalization is done using Jeffreys invariant prior, or using the bias-reducing modified scores. It is only used when method = "brglm.fit". The default is FALSE.
x: as in glm.
y: as in glm.
contrasts: as in glm.
control.brglm: a list of parameters for controlling the fitting process when method = "brglm.fit", as returned by brglm.control.
...: further arguments passed to or from other methods.
brglm.fit is the workhorse function for fitting the model using either the bias-reduction method or maximum penalized likelihood. If method = "glm.fit", usual maximum likelihood is used via glm.fit.

The main iteration of brglm.fit consists of the following steps:

1. Calculate the diagonal components of the hat matrix (see gethats and hatvalues).
2. Obtain the pseudo-data representation at the current value of the parameters (see modifications for more information).
3. Fit a local GLM, using glm.fit on the pseudo-data.
4. Adjust the quadratic weights to agree with the original binomial totals.

Iteration is repeated until either the iteration limit has been reached or the sum of the absolute values of the modified scores is less than some specified positive constant (see the br.maxit and br.epsilon arguments in brglm.control). A rough sketch of a single such update is given below.
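The following is only an illustrative sketch of one such update, not the actual brglm.fit code (in particular the weight adjustment of step 4 and the convergence handling are omitted, and the helper name one_brglm_step is made up for illustration):

one_brglm_step <- function(X, y, m, family, beta, pl = FALSE) {
  eta <- drop(X %*% beta)
  p <- family$linkinv(eta)
  w <- family$mu.eta(eta)^2 / (p * (1 - p))          # per-trial 'working' weights
  W <- m * w
  XtWXinv <- chol2inv(chol(crossprod(X * sqrt(W))))  # inverse Fisher information
  h <- rowSums((X %*% XtWXinv) * X) * W              # hat values (step 1)
  a <- modifications(family, pl = pl)(p)             # link-dependent parts (step 2)
  ystar <- y + h * a$ar                              # pseudo-responses
  mstar <- m + h * a$at                              # pseudo-totals
  # local GLM fit on the pseudo-data (step 3); weight adjustment (step 4) omitted
  glm.fit(X, ystar / mstar, weights = mstar, family = family,
          start = beta)$coefficients
}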
The default value (FALSE) of pl, when method = "brglm.fit", results in estimates that are free of any O(n^{-1}) terms in the asymptotic expansion of their bias. When pl = TRUE, bias reduction is again achieved, but generally not of that order of magnitude. In the case of logistic regression the value of pl is irrelevant, since maximum penalized likelihood and the modified-scores approach coincide for natural exponential families (see Firth, 1993).
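For example, for the logit link the two choices of pl should give the same fit; a small check using the lizards data shipped with the package (see also the Examples below):

data(lizards)
f <- cbind(grahami, opalinus) ~ height + diameter + light + time
fit.ms <- brglm(f, family = binomial(logit), data = lizards, pl = FALSE)
fit.pml <- brglm(f, family = binomial(logit), data = lizards, pl = TRUE)
all.equal(coef(fit.ms), coef(fit.pml))   # expected TRUE (up to numerical tolerance)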
For other modelling-language related details, see the Details section of glm.
brglm
returns an object of class "brglm"
. A
"brglm"
object inherits first from "glm"
and then from
"lm"
and is a list containing the following components:
coefficients, residuals, fitted.values, effects, R, rank, qr, family, linear.predictors, deviance, aic, null.deviance, iter, weights, prior.weights, df.residual, df.null, y, converged, boundary: as in glm.
ModifiedScores: the vector of the modified scores for the parameters at the final iteration. If pl = TRUE, they are the derivatives of the penalized log-likelihood at the final iteration.
FisherInfo: the Fisher information matrix evaluated at the resulting estimates. Only available when method = "brglm.fit".
hats: the diagonal elements of the hat matrix. Only available when method = "brglm.fit".
nIter: the number of iterations that were required until convergence. Only available when method = "brglm.fit".
cur.model: a list with components ar and at, containing the values of the additive modifications to the responses and totals at the resulting estimates. Only available when method = "brglm.fit".
model, call, formula, terms, data, offset: as in glm.
control.glm: the control.glm argument that was passed to brglm.
control.brglm: the control.brglm argument that was passed to brglm.
method: the method used for fitting the model.
contrasts, xlevels: as in glm.
pl: logical with the same value as the pl argument that was passed to brglm.
1. It is not advised to use methods associated with model comparison (add1, drop1, anova, etc.) on objects of class "brglm". Model comparison when estimation is performed using the modified scores or the penalized likelihood is an on-going research topic and will be implemented as soon as it is concluded.
2. The use of Akaike's information criterion (AIC) for model selection
when method = "brglm.fit"
is asymptotically valid, because
the log-likelihood derivatives dominate the modification (in terms
of asymptotic order).
1. Supported methods for objects of class "brglm" are:
print, through print.brglm;
summary, through summary.brglm;
coefficients, vcov, predict and residuals, inherited from the "glm" class;
and other methods that apply to objects of class "glm".
2. A similar implementation of the bias-reduction method could be done for every GLM, following Kosmidis (2007) (see also Kosmidis and Firth, 2009). The full set of families and links will be available in a future version. However, bias reduction is not generally as beneficial as it is in the binomial family, and it could cause inflation of the variance (see Firth, 1993).
3. Basically, the differences between maximum likelihood, maximum penalized likelihood and the modified-scores approach are more apparent in small sample sizes, in sparse data sets and in cases where complete or quasi-complete separation occurs. Asymptotically (as the sample size goes to infinity), the three different approaches are equivalent to first order.
4. When an offset is not present in the model, the modified-scores based estimates are usually smaller in magnitude than the corresponding maximum likelihood estimates, shrinking towards the origin of the scale imposed by the link function. Thus, the corresponding estimated asymptotic standard errors are also smaller.
The same is true for the maximum penalized likelihood estimates when, for example, the logit (where the maximum penalized likelihood and modified-scores approaches coincide) or the probit link is used. However, in general the maximum penalized likelihood estimates do not shrink towards the origin. In terms of the mean-value parameterization, in the case of maximum penalized likelihood the fitted probabilities shrink towards the point where the Jeffreys prior is maximized, or equivalently where the quadratic weights are simultaneously maximized (see Kosmidis, 2007).
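A quick way to see this shrinkage, using the lizards data (the exact numbers are not shown here):

data(lizards)
f <- cbind(grahami, opalinus) ~ height + diameter + light + time
fit.ml <- brglm(f, family = binomial(logit), data = lizards, method = "glm.fit")
fit.br <- brglm(f, family = binomial(logit), data = lizards, method = "brglm.fit")
# Reduced-bias estimates are typically closer to zero ...
cbind(ML = coef(fit.ml), BR = coef(fit.br))
# ... and their estimated standard errors are typically smaller
cbind(ML = sqrt(diag(vcov(fit.ml))), BR = sqrt(diag(vcov(fit.br))))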
5. Implementations of the bias-reduction method for logistic regressions can also be found in the logistf package. In addition to the obvious advantage of brglm in the range of link functions that can be used ("logit", "probit", "cloglog" and "cauchit"), brglm is also more efficient computationally. Furthermore, for any user-specified link function (see the Example section of family), the user can specify the corresponding pseudo-data representation to be used within brglm (see modifications for details).
Ioannis Kosmidis, [email protected]
Kosmidis I. and Firth D. (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71–82.
Bull, S. B., Lewinger, J. B. and Lee, S. S. F. (2007). Confidence intervals for multinomial logistic regression in sparse data. Statistics in Medicine 26, 903–918.
Firth, D. (1992). Bias reduction, the Jeffreys prior and GLIM. In Advances in GLIM and Statistical Modelling: Proceedings of the GLIM 92 Conference, Munich, Eds. L. Fahrmeir, B. Francis, R. Gilchrist and G. Tutz, pp. 91–100. New York: Springer.
Firth, D. (1992). Generalized linear models and Jeffreys priors: An iterative generalized least-squares approach. In Computational Statistics I, Eds. Y. Dodge and J. Whittaker. Heidelberg: Physica-Verlag.
Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika 80, 27–38.
Heinze, G. and Schemper, M. (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21, 2409–2419.
Kosmidis, I. (2007). Bias reduction in exponential family nonlinear models. PhD Thesis, Department of Statistics, University of Warwick.
Kosmidis, I. and Firth, D. (2009). Bias reduction in exponential family nonlinear models. Biometrika 96, 793–804.
## Begin Example
data(lizards)
# Fit the GLM using maximum likelihood
lizards.glm <- brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
                     family = binomial(logit), data = lizards, method = "glm.fit")
# Now the bias-reduced fit:
lizards.brglm <- brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
                       family = binomial(logit), data = lizards, method = "brglm.fit")
lizards.glm
lizards.brglm
# Other links
update(lizards.brglm, family = binomial(probit))
update(lizards.brglm, family = binomial(cloglog))
update(lizards.brglm, family = binomial(cauchit))
# Using penalized maximum likelihood
update(lizards.brglm, family = binomial(probit), pl = TRUE)
update(lizards.brglm, family = binomial(cloglog), pl = TRUE)
update(lizards.brglm, family = binomial(cauchit), pl = TRUE)
Auxiliary function as user interface for brglm
fitting. Typically only used when calling brglm
or brglm.fit
.
brglm.control(br.epsilon = 1e-08, br.maxit = 100, br.trace=FALSE, br.consts = NULL, ...)
br.epsilon: positive convergence tolerance for the bias-reduction iteration described in the Details of brglm.fit.
br.maxit: integer giving the maximum number of iterations for the bias-reduction iteration in brglm.fit.
br.trace: logical indicating if output should be produced for each iteration.
br.consts: a (small) positive constant, or a vector of such constants.
...: further arguments passed to or from other methods.
If br.trace = TRUE, then for each iteration the iteration number and the current value of the modified scores are printed via cat. If br.consts is specified, then br.consts is added to the original binomial counts and 2*br.consts to the original binomial totals; the model is then fitted to the adjusted data to provide starting values for the iteration in brglm.fit. If br.consts = NULL (default), then brglm.fit adjusts the responses and totals by "number of parameters"/"number of observations" and twice that quantity, respectively.
A list with the arguments as components.
Ioannis Kosmidis, [email protected]
Kosmidis I. and Firth D. (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71–82.
Kosmidis, I. (2007). Bias reduction in exponential family nonlinear models. PhD Thesis, Department of Statistics, University of Warwick.
brglm.fit
, the fitting procedure used by
brglm
.
Computes confidence intervals for one or more parameters when
estimation is performed using brglm
. The resulting confidence
intervals are based on manipulation of the profiles of the deviance,
the penalized deviance and the modified score statistic (see
profileObjectives
).
## S3 method for class 'brglm'
confint(object, parm = 1:length(coef(object)), level = 0.95, verbose = TRUE,
        endpoint.tolerance = 0.001, max.zoom = 100, zero.bound = 1e-08,
        stepsize = 0.5, stdn = 5, gridsize = 10, scale = FALSE,
        method = "smooth", ci.method = "union", n.interpolations = 100, ...)

## S3 method for class 'profile.brglm'
confint(object, parm, level = 0.95, method = "smooth", ci.method = "union",
        endpoint.tolerance = 0.001, max.zoom = 100, n.interpolations = 100,
        verbose = TRUE, ...)
object: an object of class "brglm" or "profile.brglm".
parm: either a numeric vector of indices or a character vector of names, specifying the parameters for which confidence intervals are to be estimated. The default is all parameters in the fitted model. When object is of class "profile.brglm", parm is not used.
level: the confidence level required. The default is 0.95. When object is of class "profile.brglm", level is not used.
verbose: logical. If TRUE (default), progress information is printed while the confidence intervals are being computed.
endpoint.tolerance, max.zoom, zero.bound, stepsize, stdn, gridsize, scale, method, n.interpolations: as in confintModel (profileModel package).
ci.method: the method to be used for the construction of confidence intervals. It can take values "union" (default) or "mean"; see Details.
...: further arguments to or from other methods.
In the case of logistic regression, Heinze & Schemper (2002) and Bull et al. (2007) suggest the use of confidence intervals based on the profiles of the penalized likelihood, when estimation is performed using maximum penalized likelihood.

Kosmidis (2007) illustrated that, because of the shape of the penalized likelihood, confidence intervals based on the penalized likelihood can exhibit low or even zero coverage for hypothesis testing on large parameter values and can also misbehave by exhibiting severe oscillation (see Brown et al., 2001); see also Kosmidis & Firth (2021) for discussion of the shrinkage implied by bias reduction and what it entails for inference. Kosmidis (2007) suggested an alternative confidence interval based on the union of the confidence intervals that result from profiling the ordinary deviance for the maximum likelihood fit and from profiling the penalized deviance for the maximum penalized likelihood fit. Such confidence intervals, despite being slightly conservative, exhibit less oscillation and avoid the loss of coverage. Another possibility is to use the mean of the corresponding endpoints instead of the union. As yet unpublished simulation studies suggest that such confidence intervals are not as conservative as the union-based intervals but exhibit more oscillation, which however is not as severe as in the case of the penalized-likelihood-based ones.
The properties of the "union" and "mean" confidence intervals extend to all the links that are supported by brglm, when estimation is performed using maximum penalized likelihood.

In the case of estimation using modified scores, and for models other than logistic where there is no objective that is maximized, the profiles of the penalized likelihood for the construction of the "union" and "mean" confidence intervals can be replaced by the profiles of the modified score statistic (see profileObjectives).

The confint method for "brglm" and "profile.brglm" objects implements the "union" and "mean" confidence intervals. The method is chosen through the ci.method argument.
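In its simplest use (see the extended examples below for profiling first and for other links):

data(lizards)
m.br <- brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
              family = binomial(logit), data = lizards)
confint(m.br, ci.method = "union")   # union of the ML and reduced-bias intervals
confint(m.br, ci.method = "mean")    # means of the corresponding endpoints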
A matrix with columns the endpoints of the confidence intervals for the specified (or profiled) parameters.
Ioannis Kosmidis, [email protected]
Kosmidis I. and Firth D. (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71–82.
Brown, L. D., Cai, T. T. and DasGupta, A. (2001). Interval estimation for a binomial proportion (with discussion). Statistical Science 16, 101–117.
Bull, S. B., Lewinger, J. B. and Lee, S. S. F. (2007). Confidence intervals for multinomial logistic regression in sparse data. Statistics in Medicine 26, 903–918.
Heinze, G. and Schemper, M. (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21, 2409–2419.
Kosmidis, I. (2007). Bias reduction in exponential family nonlinear models. PhD Thesis, Department of Statistics, University of Warwick.
confintModel
, profileModel
,
profile.brglm
.
## Begin Example 1
## Not run:
library(MASS)
data(bacteria)
contrasts(bacteria$trt) <- structure(contr.sdif(3),
                                     dimnames = list(NULL, c("drug", "encourage")))
# fixed effects analyses
m.glm.logit <- brglm(y ~ trt * week, family = binomial,
                     data = bacteria, method = "glm.fit")
m.brglm.logit <- brglm(y ~ trt * week, family = binomial,
                       data = bacteria, method = "brglm.fit")
p.glm.logit <- profile(m.glm.logit)
p.brglm.logit <- profile(m.brglm.logit)
#
plot(p.glm.logit)
plot(p.brglm.logit)
# confidence intervals for the glm fit based on the profiles of the
# ordinary deviance
confint(p.glm.logit)
# confidence intervals for the brglm fit
confint(p.brglm.logit, ci.method = "union")
confint(p.brglm.logit, ci.method = "mean")
# A cloglog link
m.brglm.cloglog <- update(m.brglm.logit, family = binomial(cloglog))
p.brglm.cloglog <- profile(m.brglm.cloglog)
plot(p.brglm.cloglog)
confint(m.brglm.cloglog, ci.method = "union")
confint(m.brglm.cloglog, ci.method = "mean")
## End example
## End(Not run)

## Not run:
## Begin Example 2
y <- c(1, 1, 0, 0)
totals <- c(2, 2, 2, 2)
x1 <- c(1, 0, 1, 0)
x2 <- c(1, 1, 0, 0)
m1 <- brglm(y/totals ~ x1 + x2, weights = totals, family = binomial(cloglog))
p.m1 <- profile(m1)
confint(p.m1, method = "zoom")
## End(Not run)
Calculates the leverages of a GLM through a C routine. It is intended to
be used only within brglm.fit
.
gethats(nobs, nvars, x.t, XWXinv, ww)
nobs: the number of observations, i.e. the number of rows of the model matrix.
nvars: the number of parameters, i.e. the number of columns of the model matrix.
x.t: the transpose of the model matrix.
XWXinv: the inverse of the Fisher information.
ww: the 'working' weights.
A vector containing the diagonal elements of the hat matrix.
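A hedged illustration of the quantity being computed: for a converged GLM fit the same leverages can be obtained directly from the working weights and the inverse Fisher information, and they agree with hatvalues (this reconstruction is for illustration only and is not part of the documented interface):

data(lizards)
fit <- glm(cbind(grahami, opalinus) ~ height + diameter + light + time,
           family = binomial, data = lizards)
X <- model.matrix(fit)
ww <- fit$weights                                   # 'working' weights
XWXinv <- chol2inv(chol(crossprod(X * sqrt(ww))))   # inverse Fisher information
h <- rowSums((X %*% XWXinv) * X) * ww               # diagonal of the hat matrix
all.equal(h, hatvalues(fit), check.attributes = FALSE)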
Ioannis Kosmidis, [email protected]
Auxiliary function as user interface for brglm
fitting. Typically only used when calling brglm
or
brglm.fit
.
glm.control1(epsilon = 1e-08, maxit = 25, trace = FALSE, ...)
epsilon: as in glm.control.
maxit: as in glm.control.
trace: as in glm.control.
...: further arguments passed to or from other methods.
The only difference with glm.control is that glm.control1 supports further arguments to be passed from other methods. However, these additional arguments have no effect on the resulting list.
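A small check of this behaviour (the extra argument name below is made up purely for illustration):

str(glm.control1(epsilon = 1e-10, maxit = 50, some.unused.argument = TRUE))
str(glm.control(epsilon = 1e-10, maxit = 50))   # same components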
Ioannis Kosmidis, [email protected]
The lizards
data frame has 23 rows and 6 columns.
Variables grahami
and opalinus
are counts
of two lizard species at two different perch heights, two
different perch diameters, in sun and in shade, at three
times of day.
data(lizards)
This data frame contains the following columns:
grahami: count of grahami lizards.
opalinus: count of opalinus lizards.
height: a factor with levels "<5ft" and ">=5ft".
diameter: a factor with levels "<=2in" and ">2in".
light: a factor with levels "sunny" and "shady".
time: a factor with levels "early", "midday" and "late".
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models (2nd Edition). London: Chapman and Hall.
Originally from
Schoener, T. W. (1970) Nonsynchronous spatial overlap of lizards in patchy habitats. Ecology 51, 408–418.
data(lizards)
glm(cbind(grahami, opalinus) ~ height + diameter + light + time,
    family = binomial, data = lizards)
brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
      family = binomial, data = lizards)
Get, test and set the functions that calculate the additive modifications to the responses and totals in binomial-response GLMs, for the application of bias-reduction either via modified scores or via maximum penalized likelihood (where penalization is by Jeffreys invariant prior).
modifications(family, pl = FALSE)
family: a binomial family object, i.e. of the form binomial(link = "link"), specifying the link function for which the modification functions are required.
pl: logical determining whether the function returned corresponds to the modifications for the maximum penalized likelihood approach or for the modified-scores approach to bias reduction. The default value is FALSE (modified scores).
The function returned from modifications accepts the argument p, the vector of binomial probabilities, and returns a list with components ar and at, which are the link-dependent parts of the additive modifications to the actual responses and totals, respectively.

Since the resulting function is used in brglm.fit, for efficiency reasons no check is made that the elements of p lie in [0, 1], nor that length(ar) == length(p) and length(at) == length(p).
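For instance, for the logit link the returned function gives the constant modifications a_r = 0.5 and a_t = 1 noted in the Examples below (the expected output is indicated in the comment and is not reproduced verbatim here):

mods <- modifications(binomial("logit"))
mods(p = c(0.2, 0.5, 0.8))
# expected: ar equal to 0.5 and at equal to 1 for every value of p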
If y* are the pseudo-responses (pseudo-counts) and m* are the pseudo-totals, then we call the pair (y*, m*) a pseudo-data representation. Both the modified-scores approach and maximum penalized likelihood have a common property: there exists a pair (y*, m*) such that if we replace the actual data (y, m) with (y*, m*) in the expression for the ordinary scores (first derivatives of the likelihood) of a binomial-response GLM, then we end up either with the modified scores or with the derivatives of the penalized likelihood (see Kosmidis, 2007, Chapter 5).
Let p be the binomial probability corresponding to the count y, so that the mean of the binomial response y is m times p. Also, let d and d' denote the first and the second derivatives, respectively, of p with respect to the linear predictor eta of the model, and let w denote the corresponding working weight. All the above are viewed as functions of p (these are the quantities d(p), d'(p) and w(p) appearing in the Examples below). The pseudo-data representations have the generic form

pseudo-response: y* = y + h * a_r(p),
pseudo-totals:   m* = m + h * a_t(p),

where h is the leverage corresponding to y. The general expressions for a_r(p) ("r" for "response") and a_t(p) ("t" for "totals") under the modified-scores approach and under the maximum penalized likelihood approach are link-dependent functions of p, d, d' and w; see Kosmidis (2007, Chapter 5) for their exact form.
For supplying (y*, m*) to glm.fit (as is done by brglm.fit), an essential requirement for the pseudo-data representation is that it should mimic the behaviour of the original responses and totals, i.e. 0 <= y* <= m*. Since the leverage h takes values in [0, 1], the requirement translates to 0 <= a_r(p) <= a_t(p) for every p in (0, 1). However, the above general expressions for a_r and a_t do not necessarily respect this requirement.
On the other hand, the pair (a_r(p), a_t(p)) is not unique, in the sense that for a given link function, and once the link-specific structure of the pair has been extrapolated, there is a whole class of equivalent pairs that can be obtained using only the following two rules:

add and subtract the same quantity from either a_r(p) or a_t(p);

if a quantity is to be moved from a_r(p) to a_t(p), it first has to be divided by p.
For example, in the case of maximum penalized likelihood, two pairs obtained from one another in this way are equivalent, in the sense that if the corresponding pseudo-data representations are substituted in the ordinary scores, both return the same expression.
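A small numeric check of these rules, assuming (as in the Examples below) that for the logit link one valid pair is a_r = 1/2, a_t = 1, and that the ordinary scores depend on the pseudo-data only through y* - p m*, so that equivalence amounts to a_r(p) - p * a_t(p) being unchanged:

p <- seq(0.05, 0.95, by = 0.05)
ar1 <- rep(0.5, length(p)); at1 <- rep(1, length(p))  # the usual logit pair
cc <- 0.25                                            # an arbitrary quantity
ar2 <- ar1 + cc; at2 <- at1 + cc / p                  # shift a_t by the a_r shift divided by p
all.equal(ar1 - p * at1, ar2 - p * at2)               # equivalent pairs: same score contribution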
So, in order to construct a pseudo-data representation that corresponds to a user-specified link function and has the property 0 <= a_r(p) <= a_t(p) for every p in (0, 1), one merely has to pursue a simple algebraic calculation on the initial pair (a_r(p), a_t(p)) using only the two aforementioned rules, until an appropriate pair is obtained. There is always such a pair!
Once the pair has been found, the following steps should be followed.

1. For a user-specified link function, the user has to write a modification function with name "br.custom.family" or "pml.custom.family" for pl = FALSE or pl = TRUE, respectively. The function should take as argument the probabilities p and return a list of two vectors with the same length as p and with names c("ar", "at"). The result corresponds to the pair (a_r(p), a_t(p)).

2. Check if the custom-made modifications function is appropriate. This can be done via the function checkModifications, which has arguments fun (the function to be tested) and Length, with default value Length = 100. Length is to be used when the user-specified link function takes as argument a vector of values (e.g. the logexp link in ?family); then the value of Length should be the length of that vector.

3. Put the function in the search path so that modifications can find it.

4. brglm can now be used with the custom family just as glm would be used.
The user could also deviate from the modified scores and maximum penalized likelihood and experiment with implemented (or not) links, e.g. probit, constructing their own pseudo-data representations of the aforementioned general form. This could be done by changing the link name, e.g. probitt <- make.link("probit"); probitt$name <- "probitt", and then setting a custom br.custom.family that does not necessarily depend on the probit link. Then brglm could be used with pl = FALSE.
A further generalization would be to completely remove the hat value h from the generic expression of the pseudo-data representation and have general additive modifications that depend on p. To do this, divide both ar and at by pmax(get("hats", parent.frame()), .Machine$double.eps) within the custom modification function (see also the Examples).
Ioannis Kosmidis, [email protected]
Kosmidis I. and Firth D. (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71–82.
Kosmidis, I. (2007). Bias reduction in exponential family nonlinear models. PhD Thesis, Department of Statistics, University of Warwick.
## Begin Example 1
## logistic exposure model, following the Example in ?family. See,
## Shaffer, T. 2004. Auk 121(2): 526-540.
# Definition of the link function
logexp <- function(days = 1) {
  linkfun <- function(mu) qlogis(mu^(1/days))
  linkinv <- function(eta) plogis(eta)^days
  mu.eta <- function(eta) days * plogis(eta)^(days - 1) * binomial()$mu.eta(eta)
  valideta <- function(eta) TRUE
  link <- paste("logexp(", days, ")", sep = "")
  structure(list(linkfun = linkfun, linkinv = linkinv, mu.eta = mu.eta,
                 valideta = valideta, name = link),
            class = "link-glm")
}
# Here d(p)  = days * p * (1 - p^(1/days))
#      d'(p) = (days - (days + 1) * p^(1/days)) * d(p)
#      w(p)  = days^2 * p * (1 - p^(1/days))^2 / (1 - p)
# Initial modifications, as given from the general expressions above:
br.custom.family <- function(p) {
  etas <- binomial(logexp(.days))$linkfun(p)
  # the link function argument `.days' will be detected by lexical
  # scoping. So, make sure that the link-function inputted arguments
  # have unusual names, like `.days', and that the link function
  # enters `brglm' as `family = binomial(logexp(.days))'.
  list(ar = 0.5 * (1 - p) - 0.5 * (1 - p) * exp(etas) / .days,
       at = 0 * p / p)  # so as to fix the length of at
}
.days <- 3
# `.days' could be a vector as well, but then it should have the same
# length as the number of observations (`length(.days)' should be
# equal to `length(p)'). In this case, `checkModifications' should
# have argument `Length = length(.days)'.
#
# Check:
## Not run: checkModifications(br.custom.family)
# OOOPS error message... the condition is not satisfied
#
# After some trivial algebra using the two allowed operations, we
# get new modifications:
br.custom.family <- function(p) {
  etas <- binomial(logexp(.days))$linkfun(p)
  list(ar = 0.5 * p / p,  # so as to fix the length of ar
       at = 0.5 + exp(etas) * (1 - p) / (2 * p * .days))
}
# Check:
checkModifications(br.custom.family)
# It is OK.
# Now,
modifications(binomial(logexp(.days)))
# works.
# Notice that for `.days <- 1', `logexp(.days)' is the `logit' link
# model and `a_r = 0.5', `a_t = 1'.
# In action:
library(MASS)
example(birthwt)
m.glm <- glm(formula = low ~ ., family = binomial, data = bwt)
.days <- bwt$age
m.glm.logexp <- update(m.glm, family = binomial(logexp(.days)))
m.brglm.logexp <- brglm(formula = low ~ ., family = binomial(logexp(.days)),
                        data = bwt)
# The fit for the `logexp' link via maximum likelihood
m.glm.logexp
# and the fit for the `logexp' link via modified scores
m.brglm.logexp
## End Example

## Begin Example 2
## Another possible use of brglm.fit:
## Deviating from bias reducing modified-scores:
## Add 1/2 to the response of a probit model.
y <- c(1, 2, 3, 4)
totals <- c(5, 5, 5, 5)
x1 <- c(1, 0, 1, 0)
x2 <- c(1, 1, 0, 0)
my.probit <- make.link("probit")
my.probit$name <- "my.probit"
br.custom.family <- function(p) {
  h <- pmax(get("hats", parent.frame()), .Machine$double.eps)
  list(ar = 0.5 / h, at = 1 / h)
}
m1 <- brglm(y/totals ~ x1 + x2, weights = totals, family = binomial(my.probit))
m2 <- glm((y + 0.5)/(totals + 1) ~ x1 + x2, weights = totals + 1,
          family = binomial(probit))
# m1 and m2 should be the same.
# End example

# Begin example 3: Maximum penalized likelihood for logistic regression,
# with the penalty being a power of the Jeffreys prior (`.const` below)
# Set up a custom logit link
mylogit <- make.link("logit")
mylogit$name <- "mylogit"
## Set up the custom family
br.custom.family <- function(p) {
  list(ar = .const * p / p, at = 2 * .const * p / p)
}
data("lizards")
## The reduced-bias fit is
.const <- 1/2
brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
      family = binomial(mylogit), data = lizards)
## which is the same as what brglm does by default for logistic regression
brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
      family = binomial(logit), data = lizards)
## Stronger penalization (e.g. 5/2) can be achieved by
.const <- 5/2
brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
      family = binomial(mylogit), data = lizards)
# End example
plot.profile.brglm
plots the objects of class
"profileModel"
that are contained in an object of class
"profile.brglm"
. pairs.profile.brglm
is a diagnostic tool
that plots pairwise profile traces.
## S3 method for class 'profile.brglm'
plot(x, signed = FALSE, interpolate = TRUE, n.interpolations = 100,
     print.grid.points = FALSE, ...)

## S3 method for class 'profile.brglm'
pairs(x, colours = 2:3, ...)
x: a "profile.brglm" object.
signed, interpolate, n.interpolations, print.grid.points: as in plot.profileModel (profileModel package).
colours: as in pairs.profileModel (profileModel package).
...: further arguments passed to or from other methods.
See Details in plot.profileModel
.
Ioannis Kosmidis, [email protected]
plot.profileModel
, profile.brglm
.
# see example in 'confint.brglm'.
Creates "profile.brglm"
objects to be used for the calculation of
confidence intervals and for plotting.
## S3 method for class 'brglm'
profile(fitted, gridsize = 10, stdn = 5, stepsize = 0.5, level = 0.95,
        which = 1:length(coef(fitted)), verbose = TRUE, zero.bound = 1e-08,
        scale = FALSE, ...)
fitted: an object of class "brglm".
gridsize, stdn, stepsize: as in profileModel (profileModel package).
level: the confidence level that will be used for the subsequent construction of confidence intervals; it determines how far out the profiling is carried. The default is 0.95.
which: as in profileModel; the parameters to be profiled. The default is all parameters in the fitted model.
verbose, zero.bound, scale: as in profileModel (profileModel package).
...: further arguments passed to or from other methods.
profile.brglm
calculates the profiles of the appropriate
objectives to be used for the construction of confidence intervals for
the bias-reduced estimates (see confint.brglm
for the
objectives that are profiled).
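Typical use (see also the example in 'confint.brglm'; output is not shown here):

data(lizards)
m.br <- brglm(cbind(grahami, opalinus) ~ height + diameter + light + time,
              family = binomial(logit), data = lizards)
p.br <- profile(m.br, level = 0.99)   # profile far enough out for 99% intervals
confint(p.br, ci.method = "union")
plot(p.br)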
An object of class "profile.brglm" with attribute "level" corresponding to the argument level. The object supports the methods print, plot, pairs and confint, and it is a list with the components:

profilesML: a "profileModel" object containing the profiles of the ordinary deviance for the maximum likelihood fit.

profilesBR: a "profileModel" object containing the profiles of either the penalized deviance (when pl = TRUE) or the modified score statistic (when pl = FALSE) for the reduced-bias fit; NULL if method = "glm.fit" was used in brglm.
Objects of class "profile.brglm" support the following methods:

print, which prints the "level" attribute of the object, as well as the supported methods;
confint (see confint.brglm);
plot (see plot.profile.brglm);
pairs (see plot.profile.brglm).
Ioannis Kosmidis, [email protected]
# see example in 'confint.brglm'.
Objectives that are used in profile.brglm
penalizedDeviance(fm, X, dispersion = 1)

modifiedScoreStatistic(fm, X, dispersion = 1)
fm: the restricted fit.
X: the model matrix of the fit on all parameters.
dispersion: the dispersion parameter.
These objectives follow the specifications for objectives in the profileModel package and are used by profile.brglm.

penalizedDeviance returns a deviance-like value corresponding to a likelihood function penalized by Jeffreys invariant prior. It has been used by Heinze & Schemper (2002) and by Bull et al. (2007) for the construction of confidence intervals for the bias-reduced estimates in logistic regression. The X argument is the model matrix of the full (not the restricted) fit.

modifiedScoreStatistic mimics RaoScoreStatistic in profileModel, but with the ordinary scores replaced by the modified scores used for bias reduction. The argument X has the same interpretation as for penalizedDeviance.
A scalar.
Ioannis Kosmidis, [email protected]
Kosmidis I. and Firth D. (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71–82.
Bull, S. B., Lewinger, J. B. and Lee, S. S. F. (2007). Confidence intervals for multinomial logistic regression in sparse data. Statistics in Medicine 26, 903–918.
Heinze, G. and Schemper, M. (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21, 2409–2419.
Provides a tool for identifying whether or not separation has occurred.
separation.detection(fit, nsteps = 30)
fit: the result of a glm call.
nsteps: starting from maxit = 1, the GLM is refitted for maxit = 2, maxit = 3, ..., maxit = nsteps (see Details). The default value is 30.
Identifies separated cases for binomial-response GLMs by refitting the model. At each refit the maximum number of allowed IWLS iterations is fixed, starting from 1 up to nsteps (by setting control = glm.control(maxit = j), where j takes values 1, ..., nsteps in glm). For each value of maxit, the estimated asymptotic standard errors are divided by the corresponding ones obtained with control = glm.control(maxit = 1). Based on the results in Lesaffre & Albert (1989), if the sequence of ratios in any column of the resulting matrix diverges, then separation has occurred and the maximum likelihood estimate for the corresponding parameter has value minus or plus infinity.
A matrix of dimension nsteps by length(coef(fit)) that contains the ratios of the estimated asymptotic standard errors.
Ioannis Kosmidis, [email protected]
Kosmidis I. and Firth D. (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71–82.
Lesaffre, E. and Albert, A. (1989). Partial separation in logistic discrimination. J. R. Statist. Soc. B, 51, 109–116.
## Begin Example
y <- c(1, 1, 0, 0)
totals <- c(2, 2, 2, 2)
x1 <- c(1, 0, 1, 0)
x2 <- c(1, 1, 0, 0)
m1 <- glm(y/totals ~ x1 + x2, weights = totals, family = binomial())
# No warning from glm...
m1
# However estimates for (Intercept) and x2 are unusually large in
# absolute value... Investigate further:
#
separation.detection(m1, nsteps = 30)
# Note that the values in the column for (Intercept) and x2 diverge,
# while those for x1 converge. Hence, separation has occurred and the
# maximum likelihood estimate for (Intercept) is minus infinity and
# for x2 is plus infinity. The signs for infinity are taken from the
# signs of (Intercept) and x2 in coef(m1).
## End Example