Canonical Logistic Regression for Ordinal Categorical Response Using %CANLOG

Jochem König

Institut für Medizinische Biometrie, Epidemiologie und Medizinische Informatik,
Universität des Saarlandes, 66421 Homburg.

jk@med-imbei.uni-saarland.de

1- Notation and models

All models are submodels of the multinomial response logistic model:

(1)

The response variable Y is assumed to have m ordered response categories coded 1,…,m, and (1) is stated in terms of the conditional probability for category j given a realisation x of the vector X of explanatory variables. The conditional distribution of the response Y given the explanatory variables is sometimes referred to as the posterior distribution.

2- An excursion on sampling schemes, priors, marginal and class conditional distributions

If there is an underlying common distribution of Y and X, it can be decomposed in two ways:

(2)

where f is termed marginal density, f_y are class conditional densities and p_y are priors. Three types of sampling may be used for fitting: mixture sampling denotes a sample from the common distribution of (Y,X); controlled sampling denotes a set of independent drawings from the collection of posterior distributions, i.e. information on X may have been used to select subjects for the study; and finally, separate sampling addresses independent samples from each of the class conditional distributions.

Under mixture sampling and controlled sampling, posteriors may be computed from an ML-fit of (1) via

(3)

Prior estimates may be obtained as sample proportions under mixture sampling; under controlled sampling they are unknown.
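As an illustration of how posteriors follow from fitted logits, here is a minimal Python sketch. It assumes that (1) models the log ratio of each category's probability to that of the last category m, so that the last category has logit zero; the sign convention of the original equation is not recoverable from this text. The function name `posteriors_from_logits` and the input values are hypothetical.

```python
import math

def posteriors_from_logits(logits):
    """Turn last-category logits into posterior probabilities.

    logits[j-1] is assumed to be log(P(Y=j | x) / P(Y=m | x)) for j = 1..m-1.
    Returns the list of posteriors for categories 1..m.
    """
    # The reference category m has logit 0, i.e. exp(0) = 1.
    expo = [math.exp(v) for v in logits] + [1.0]
    total = sum(expo)
    return [e / total for e in expo]

# Three categories: category 1 is twice as likely as category 3,
# category 2 is as likely as category 3.
post = posteriors_from_logits([math.log(2.0), 0.0])
```

The posteriors sum to one by construction, whatever the values of the linear predictors.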

Under separate sampling, (1) a formal ML-fit of

(4)

to the data yields ML estimates of density ratios via

(5)

where n_j denote sample sizes. Assuming equal priors, (5) may be interpreted as an equation for posterior ratios like (1) (but note that an offset was introduced during fitting and then removed). Adding any set of logits of priors known from some other source to both sides of (5) turns (5) into an equation of type (1) that may be used to compute posteriors via (3). Similarly, results from mixture sampling may be transferred to a situation with the same class conditional distributions but different priors (and consequently different marginal and posterior distributions) by adding constants to equation (1).
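The prior adjustment described above amounts to subtracting one set of log prior ratios and adding another. The following Python sketch shows this under the assumption that the offset for category j relative to category m is the log ratio of the corresponding priors (or sample sizes, under separate sampling); the function name `shift_priors` and the numbers are hypothetical.

```python
import math

def shift_priors(logits, old_priors, new_priors):
    """Move last-category logits from one set of priors to another.

    logits[j-1] is the logit of category j versus category m.
    old_priors and new_priors have length m; only their ratios matter,
    so sample sizes n_1..n_m may be passed as old_priors.
    """
    m = len(old_priors)
    shifted = []
    for j, value in enumerate(logits):
        old_offset = math.log(old_priors[j] / old_priors[m - 1])
        new_offset = math.log(new_priors[j] / new_priors[m - 1])
        # Remove the offset implied by the old priors, add the new one.
        shifted.append(value - old_offset + new_offset)
    return shifted

# Two classes sampled separately with n = (100, 300); at this x the class
# conditional densities are equal, so the fitted logit is log(100/300).
fitted = [math.log(100 / 300)]
equal = shift_priors(fitted, [100, 300], [0.5, 0.5])
external = shift_priors(fitted, [100, 300], [0.8, 0.2])
```

With equal priors the adjusted logit is zero, which reflects the equal densities; with external priors (0.8, 0.2) it becomes log 4.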

3- Models continued

Parametrization (1) is referred to as last category coding. After reversing signs of regression coefficients it is identical to that used in proc catmod.

In parallel, adjacent coding, an equivalent parametrization, is used in the macro:

(6)

All specific models may be presented as a meta model for the set of vectors of regression coefficients. Where convenient, is defined.
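The equivalence of the two codings can be made concrete with a small Python sketch. It assumes that adjacent coding models the log ratio of category j to category j+1, so that a last-category logit is the sum of all adjacent logits from j up to m-1. The same arithmetic applies to intercepts and to each entry of the regression vectors. Function names are hypothetical.

```python
def adjacent_to_last(adjacent):
    """adjacent[k] compares category k+1 with category k+2 (0-based k).

    Returns last-category values: entry j-1 compares category j with m.
    """
    last = []
    running = 0.0
    for value in reversed(adjacent):
        running += value          # accumulate from category m-1 downwards
        last.append(running)
    return list(reversed(last))

def last_to_adjacent(last):
    """Inverse map: differences of successive last-category values."""
    extended = list(last) + [0.0]  # category m versus itself is 0
    return [extended[k] - extended[k + 1] for k in range(len(last))]

last = adjacent_to_last([1.0, 2.0, 3.0])
back = last_to_adjacent(last)
```

Because the map is linear and invertible, a fit in one coding determines the fit in the other.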

For instance, Anderson's stereotype model uses last category coding:

(7)

An ordinal relationship is stated once the φ parameters are monotonically ordered. More precisely, the order constraints establish a stochastically monotone relationship between a single linear predictor and the response variable (Anderson 1984). Ties between successive φ parameters indicate indistinguishable categories.

Equivalently, in adjacent coding

(8)

Here the order constraints become sign constraints on the tau parameters, and indistinguishability between categories j and j+1 corresponds to a vanishing tau parameter.

Now a class of models called partially ordinal logistic models is introduced that contains models more restrictive than (1)/(6) and more general than (7)/(8) (see Feldmann, König). These models put collinearity constraints on some but not all of the regression vectors (8):

(9)

for some given subset of indices {j₁,…,j_{r+1}}. Any such set of starting categories of collinear sequences fully specifies a partially ordinal logistic model. Note that (6) and (8) are included as special cases r=m-1 and r=1, respectively.

Example: m=7, r=3, {j₁,…,j_{r+1}}={1, 4, 5, 7}; then 1-2-3-4, 4-5 and 5-6-7 are joined by collinear regression vectors that are multiples of three common vectors, respectively. The entries of the first have to be interpreted as average unit log odds ratios between adjacent categories among the set {1,2,3,4} of categories, whereas the entries of the second are still specific unit log odds ratios between categories j and j+1 as in model (6).
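The way a set of starting categories determines the collinear sequences can be sketched in Python as follows. The sketch assumes that the first category 1 and the last category m are implied when they are missing from the list, and that adjacent-coding vector j compares categories j and j+1. The function name `collinear_groups` is hypothetical.

```python
def collinear_groups(m, starts):
    """Split categories 1..m into collinear sequences.

    starts: starting categories of the collinear sequences.
    Returns (category_groups, vector_groups), where vector j is the
    adjacent-coding regression vector for category j versus j+1.
    """
    bounds = sorted(set(starts) | {1, m})
    category_groups = []
    vector_groups = []
    for lo, hi in zip(bounds, bounds[1:]):
        category_groups.append(list(range(lo, hi + 1)))
        vector_groups.append(list(range(lo, hi)))
    return category_groups, vector_groups

cats7, vecs7 = collinear_groups(7, [1, 4, 5, 7])
cats12, vecs12 = collinear_groups(12, [1, 2, 6, 7, 8, 11])
```

For m=7 this reproduces the sequences 1-2-3-4, 4-5 and 5-6-7, with r=3 groups of vectors. For m=12 and the index list 1 2 6 7 8 11 it yields the collinear vector groups 2-3-4-5 and 8-9-10, with all other vectors unconstrained.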

Each of the models (8) and (9) may be further restricted by fixing the tau parameters as known constants. For the fixed scales models of %CANLOG, all tau are set to one:

(10)
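To see why setting all tau parameters to one gives equidistant scales, consider the following Python sketch. It assumes that in adjacent coding each regression vector is a tau multiple of one common vector, and that the last-category scale of category j is the sum of the tau parameters from j to m-1. Both are assumptions about the form of the equations, which are not reproduced in this text, and the function names are hypothetical.

```python
def adjacent_vectors(tau, beta):
    """Regression vector for category j versus j+1 is tau[j-1] * beta."""
    return [[t * b for b in beta] for t in tau]

def last_category_scales(tau):
    """Scale of category j versus m: tau_j + ... + tau_{m-1}."""
    scales = []
    running = 0.0
    for t in reversed(tau):
        running += t
        scales.append(running)
    return scales[::-1]

# Fixed scales model for m = 4 categories and two explanatory variables.
tau = [1.0, 1.0, 1.0]
vectors = adjacent_vectors(tau, [0.5, -2.0])
scales = last_category_scales(tau)
```

With all tau equal to one, every adjacent pair of categories shares the same regression vector and the last-category scales decrease in unit steps.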

Anderson also considers models where all regression vectors of (1) (or equivalently of (6)) are assumed to lie in an unknown d-dimensional linear subspace, e.g. in adjacent coding:

(11)

Note that any model (9) spans a subspace of dimension at most r and is nested in the model (11) with d=r.

The data-generated choice between models of type (11) was termed canonical logistic regression by McCullagh (in the discussion of Anderson 1984). Here any data-generated selection between models (9), (10) and (11) is termed canonical logistic regression.

4- Specifications

The macro parameters of %CANLOG are as follows:

%macro canlog(y,xliste,
       numvar=1,
       data=_last_,
       crit=0.0001,
       id=,
       partial=,
       reverse=1,
       priors=equal,
       out=outpred,outb=outpredb,
       outest=outest,
       outestb=outestb,
       );

Macro parameters of %CANLOG

Parameter	Explanation
y	response variable coded 1,2,...,m
xliste	list of explanatory variables
id	list of numerical identification variables (optional)
data	input data set (training sample)
partial	sequence of indices specifying a partially collinear model: e.g. let m=12; then 1 2 6 7 8 11 specifies a model where coefficient vectors 2-3-4-5 and 8-9-10 are collinear (adjacent coding)
priors	set of q priors if different from sample sizes (q = number of categories minus 1); the keyword equal produces constant priors
reverse	reverse=1 reverses the sign for all variables, which leads to positive coefficients if higher values of an explanatory variable are associated with a trend towards higher response values. The option does not affect the contents of outestb, which is organized to conform with proc catmod.
outb	name of output dataset containing one row for each row of data
out	name of output dataset containing q rows for each row of data
outest	name of output dataset containing parameter estimates
outestb	name of output dataset containing parameter estimates in a format compatible with that of proc catmod

5- Models Fitted

By default, %CANLOG fits the full multinomial model and the one-dimensional stereotype model, the latter both with equidistant fixed scales and with estimated scale parameters. Parameters are estimated in the adjacent coding scheme and in the last category coding. Partially ordinal models are fitted if specified via the macro parameter partial: one with estimated scale parameters and one with fixed constant scale parameters.

Models fitted by %CANLOG

Model identifying string	Explanation
glogitm	multinomial model
glogit1e	stereotype model 1-dimensional, estimated scale parameters
glogit1f	stereotype model 1-dimensional, fixed constant scale parameters
pa(&partial)	partially ordinal model, estimated scale parameters
paf(&partial)	partially ordinal model, fixed constant scale parameters

6- Output Datasets

Output dataset &outb

Output dataset &outb contains one row for each row of data.

Output dataset &outb: Description of variables

Variable name	Explanation
y	response variable
ybayes	predicted category from the Bayes rule
y_ml	predicted category from the maximum likelihood rule, i.e. assuming equal priors
y_lc	predicted category from the rule minimizing the Bayes risk assuming linearly increasing costs of misclassification
y_lce	same as y_lc, but assuming equal priors
y_ex	expected value of y as computed from posteriors, rounded to the nearest integer
post1,post2,..	variables for posteriors (priors as proportions in data set) and
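The classification rules in this table can be illustrated with a short Python sketch. It is not the macro's implementation. It assumes that "linearly increasing costs" means a cost equal to the absolute difference between true and predicted category, that the equal-prior variants are obtained by dividing the posteriors by the priors and renormalising, and that halves are rounded upwards. The function name `decision_rules` and the example numbers are hypothetical.

```python
import math

def decision_rules(post, priors):
    """post: posteriors for categories 1..m; priors: priors used to get them."""
    m = len(post)
    cats = range(1, m + 1)

    def renormalise(weights):
        total = sum(weights)
        return [w / total for w in weights]

    def most_probable(probs):
        return max(cats, key=lambda j: probs[j - 1])

    def min_linear_cost(probs):
        # Expected cost of predicting k when the cost of an error is |k - j|.
        def risk(k):
            return sum(abs(k - j) * probs[j - 1] for j in cats)
        return min(cats, key=risk)

    # Dividing out the priors leaves posteriors under equal priors.
    post_eq = renormalise([p / q for p, q in zip(post, priors)])
    expected = sum(j * post[j - 1] for j in cats)
    return {
        "ybayes": most_probable(post),
        "y_ml": most_probable(post_eq),
        "y_lc": min_linear_cost(post),
        "y_lce": min_linear_cost(post_eq),
        "y_ex": int(math.floor(expected + 0.5)),
    }

rules = decision_rules([0.40, 0.25, 0.35], [0.5, 0.3, 0.2])
```

In this example the Bayes rule picks category 1, the equal-prior rules pick category 3, and the linear cost rule and the rounded expectation both pick the middle category 2, which shows how the rules can disagree on the same posteriors.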

Output dataset &out

Output dataset &out contains q:=m-1 rows for each row of data.

Output dataset &out: Description of variables

Variable name	Explanation
&id	identification variables (optional)
y	response variable as row indicator
xbeta	linear predictor
post	posterior for category given by y

7- Printed Output

For each model fitted, parameter estimates and standard errors are printed.