Title:  Multidimensional Item Response Theory 

Description:  Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bifactor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported.
Authors:  Phil Chalmers [aut, cre] , Joshua Pritikin [ctb], Alexander Robitzsch [ctb], Mateusz Zoltak [ctb], KwonHyun Kim [ctb], Carl F. Falk [ctb], Adam Meade [ctb], Lennart Schneider [ctb], David King [ctb], ChenWei Liu [ctb], Ogreden Oguzhan [ctb] 
Maintainer:  Phil Chalmers <[email protected]> 
License:  GPL (>= 3) 
Version:  1.41.8 
Built:  2024-05-24 19:24:13 UTC 
Source:  https://github.com/philchalmers/mirt 
Full information maximum likelihood estimation of multidimensional IRT models
Analysis of dichotomous and polytomous response data using unidimensional and multidimensional latent trait models under the Item Response Theory paradigm. Exploratory and confirmatory models can be estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bifactor and two-tier analyses are available for modeling item testlets. Multiple group analysis and mixed effects designs also are available for detecting differential item and test functioning as well as modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, and several other discrete variable models are supported.
Users interested in the most recent version of this package can visit https://github.com/philchalmers/mirt and follow the instructions for installing the package from source. Questions regarding the package can be sent to the mirt-package Google Group, located at https://groups.google.com/forum/#!forum/mirt-package. User contributed files, workshop files, and evaluated help files are also available on the package wiki (https://github.com/philchalmers/mirt/wiki).
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Compare nested models using the likelihood ratio test (X2), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Sample-Size Adjusted BIC (SABIC), and Hannan-Quinn (HQ) Criterion. When given a sequence of objects, anova tests the models against one another in the order specified. Note that the object inputs should be ordered from the most constrained model to the least constrained.
## S4 method for signature 'SingleGroupClass'
anova(
object,
object2,
...,
bounded = FALSE,
mix = 0.5,
frame = 1,
verbose = FALSE
)
object 
an object of class 
object2 
a second model estimated from any of the mirt package estimation methods 
... 
additional less constrained model objects to be compared sequentially to the previous model 
bounded 
logical; are the two models comparing a bounded parameter (e.g., comparing a single
2PL and 3PL model with 1 df)? If 
mix 
proportion of chi-squared mixtures. Default is 0.5 
frame 
(internal parameter not for standard use) 
verbose 
(deprecated argument) 
a data.frame/mirt_df object
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
x <- mirt(Science, 1)
x2 <- mirt(Science, 2)
anova(x, x2)
# compare three models sequentially (X2 not always meaningful)
x3 <- mirt(Science, 1, 'gpcm')
x4 <- mirt(Science, 1, 'nominal')
anova(x, x2, x3, x4)
# in isolation
anova(x)
# with priors on first model
model <- "Theta = 1-4
          PRIOR = (1-4, a1, lnorm, 0, 10)"
xp <- mirt(Science, model)
anova(xp, x2)
anova(xp)
# bounded parameter
dat <- expand.table(LSAT7)
mod <- mirt(dat, 1)
mod2 <- mirt(dat, 1, itemtype = c(rep('2PL', 4), '3PL'))
anova(mod, mod2) #unbounded test
anova(mod, mod2, bounded = TRUE) #bounded
# priors
model <- 'F = 1-5
          PRIOR = (5, g, norm, -1, 1)'
mod1b <- mirt(dat, model, itemtype = c(rep('2PL', 4), '3PL'))
anova(mod1b)
model2 <- 'F = 1-5
           PRIOR = (1-5, g, norm, -1, 1)'
mod2b <- mirt(dat, model2, itemtype = '3PL')
anova(mod1b, mod2b)
## End(Not run)
Compute the area of a test or item information function over a definite integral range.
areainfo(
x,
theta_lim,
which.items = 1:extract.mirt(x, "nitems"),
group = NULL,
...
)
x 
an object of class 'SingleGroupClass', or an object of class 'MultipleGroupClass' if a suitable

theta_lim 
range of integration to be computed 
which.items 
an integer vector indicating which items to include in the expected information function. Default uses all possible items 
group 
group argument to pass to 
... 
additional arguments passed to 
a data.frame with the lower and upper integration range, the information area within the range (Info), the information area over the range -10 to 10 (Total.Info), the proportion of total information given the integration range (Info.Proportion), and the number of items included (nitems)
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
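Conceptually, for a unidimensional 2PL item with slope a and intercept d the item information is a^2 P(1 - P), and areainfo integrates this curve over theta_lim. The underlying arithmetic can be sketched with base R's integrate() (an illustration only, not the package internals; the item parameters below are arbitrary):

```r
# Sketch of the integral areainfo() evaluates, for a single 2PL item
# with arbitrary parameters a = 1, d = 0 (illustration, not package internals)
iteminfo_2PL <- function(theta, a = 1, d = 0) {
  P <- 1 / (1 + exp(-(a * theta + d)))  # 2PL trace line
  a^2 * P * (1 - P)                     # Fisher information
}
Info       <- integrate(iteminfo_2PL, lower = -2,  upper = 0)$value
Total.Info <- integrate(iteminfo_2PL, lower = -10, upper = 10)$value
c(Info = Info, Total.Info = Total.Info, Info.Proportion = Info / Total.Info)
```

For a = 1 and d = 0 the area over (-10, 10) is approximately 1, since for the 2PL the information integrates to a times the change in P across the range.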
dat <- expand.table(LSAT7)
mod <- mirt(dat, 1)
areainfo(mod, c(-2,0), which.items = 1) #item 1
## Not run:
areainfo(mod, c(-2,0), which.items = 1:3) #items 1 to 3
areainfo(mod, c(-2,0)) # all items (total test information)
# plot the area
area <- areainfo(mod, c(-2,0))
Theta <- matrix(seq(-3,3, length.out=1000))
info <- testinfo(mod, Theta)
plot(info ~ Theta, type = 'l')
pick <- Theta >= -2 & Theta <= 0
polygon(c(-2, Theta[pick], 0), c(0, info[pick], 0), col='lightblue')
text(x = 2, y = 0.5, labels = paste("Total Information:", round(area$TotalInfo, 3),
   "\n\nInformation in (-2, 0):", round(area$Info, 3),
   paste("(", round(100 * area$Proportion, 2), "%)", sep = "")), cex = 1.2)
## End(Not run)
This function computes updated parameter and standard error estimates using multiple imputation methodology. Given a set of parameter estimates and their associated standard errors, the function returns the weighted average of the overall between- and within-imputation variability due to the multiple imputations according to Rubin's (1987) methodology.
averageMI(par, SEpar, as.data.frame = TRUE)
par 
a list containing parameter estimates which were computed from the imputed datasets 
SEpar 
a list containing standard errors associated with 
as.data.frame 
logical; return a data.frame instead of a list? Default is TRUE 
returns a list or data.frame containing the updated averaged parameter estimates, standard errors, and t-values with the associated degrees of freedom and two-tailed p-values
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley & Sons.
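The pooling arithmetic follows Rubin's rules: the pooled estimate is the mean of the per-imputation estimates, and the total variance combines the within-imputation variance W with the between-imputation variance B as T = W + (1 + 1/m)B. A minimal sketch of this arithmetic (an illustration only, not averageMI's internals):

```r
# Minimal sketch of Rubin's (1987) pooling rules for m imputations
# (illustrates the arithmetic only; not averageMI()'s internals)
rubin_pool <- function(par, SEpar) {
  m    <- length(par)
  est  <- Reduce(`+`, par) / m                     # pooled point estimates
  W    <- Reduce(`+`, lapply(SEpar, `^`, 2)) / m   # within-imputation variance
  B    <- Reduce(`+`, lapply(par, function(p) (p - est)^2)) / (m - 1)
  Tvar <- W + (1 + 1/m) * B                        # total variance
  df   <- (m - 1) * (1 + W / ((1 + 1/m) * B))^2    # Rubin's degrees of freedom
  data.frame(par = est, SE = sqrt(Tvar), t = est / sqrt(Tvar), df = df)
}
```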
## Not run:
# simulate data
set.seed(1234)
N <- 1000
# covariates
X1 <- rnorm(N); X2 <- rnorm(N)
covdata <- data.frame(X1, X2)
Theta <- matrix(0.5 * X1 + 1 * X2 + rnorm(N, sd = 0.5))
# items and response data
a <- matrix(1, 20); d <- matrix(rnorm(20))
dat <- simdata(a, d, 1000, itemtype = '2PL', Theta=Theta)
mod1 <- mirt(dat, 1, 'Rasch', covdata=covdata, formula = ~ X1 + X2)
coef(mod1, simplify=TRUE)
# draw plausible values for secondary analyses
pv <- fscores(mod1, plausible.draws = 10)
pvmods <- lapply(pv, function(x, covdata) lm(x ~ covdata$X1 + covdata$X2),
   covdata=covdata)
# compute Rubin's multiple imputation average
so <- lapply(pvmods, summary)
par <- lapply(so, function(x) x$coefficients[, 'Estimate'])
SEpar <- lapply(so, function(x) x$coefficients[, 'Std. Error'])
averageMI(par, SEpar)
## End(Not run)
bfactor fits a confirmatory maximum likelihood two-tier/bifactor/testlet model to dichotomous and polytomous data under the item response theory paradigm. The IRT models are fit using a dimensional reduction EM algorithm so that, regardless of the number of specific factors estimated, the model only uses the number of factors in the second-tier structure plus 1. For the bifactor model the maximum number of dimensions is only 2 since the second-tier only consists of a ubiquitous unidimensional factor. See mirt for appropriate methods to be used on the objects returned from the estimation.
bfactor(
data,
model,
model2 = paste0("G = 1-", ncol(data)),
group = NULL,
quadpts = NULL,
invariance = "",
...
)
data 
a 
model 
a numeric vector specifying which factor loads on which
item. For example, if for a 4 item test with two specific factors, the first
specific factor loads on the first two items and the second specific factor
on the last two, then the vector is 
model2 
a two-tier model specification object defined by 
group 
a factor variable indicating group membership used for multiple group analyses 
quadpts 
number of quadrature nodes to use after accounting for the reduced number of dimensions.
Scheme is the same as the one used in 
invariance 
see 
... 
additional arguments to be passed to the estimation engine. See 
bfactor
follows the item factor analysis strategy explicated by
Gibbons and Hedeker (1992), Gibbons et al. (2007), and Cai (2010).
Nested models may be compared via an approximate
chi-squared difference test or by a reduction in AIC or BIC (accessible via
anova
). See mirt
for more details regarding the
IRT estimation approach used in this package.
The two-tier model has a specific block diagonal covariance structure between the primary and secondary latent traits. Namely, the secondary latent traits are assumed to be orthogonal to all traits and have a fixed variance of 1, while the primary traits can be organized to vary and covary with other primary traits in the model.
$\Sigma_{\mathrm{two\text{-}tier}} = \left(\begin{array}{cc} G & 0 \\ 0 & \mathrm{diag}(S) \end{array}\right)$
The bifactor model is a special case of the two-tier model when $G$ above is a 1x1 matrix, and therefore only 1 primary factor is being modeled. Evaluation of the numerical integrals for the two-tier model requires only $ncol(G) + 1$ dimensions for integration since the $S$ second-order (or 'specific') factors require only 1 integration grid due to the dimension reduction technique.
Note: for multiple group two-tier analyses only the second-tier means and variances should be freed since the specific factors are not treated independently due to the dimension reduction technique.
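The block diagonal structure above can be constructed directly; for example, with two covarying primary traits and three specific factors (the values below are arbitrary, for illustration only):

```r
# Two-tier covariance: primary block G (free to covary), specific block diag(S)
# (arbitrary illustrative values)
G <- matrix(c(1, .4,
              .4, 1), 2, 2)     # primary traits may vary and covary
S <- rep(1, 3)                  # specific factors orthogonal, variance fixed at 1
Sigma <- rbind(cbind(G,               matrix(0, 2, 3)),
               cbind(matrix(0, 3, 2), diag(S)))
Sigma
```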
function returns an object of class SingleGroupClass (SingleGroupClass-class) or MultipleGroupClass (MultipleGroupClass-class).
Phil Chalmers [email protected]
Cai, L. (2010). A two-tier full-information item factor analysis model with applications. Psychometrika, 75, 581-612.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64, 153-168.
Gibbons, R. D., & Hedeker, D. R. (1992). Full-information Item Bi-Factor Analysis. Psychometrika, 57, 423-436.
Gibbons, R. D., Darrell, R. B., Hedeker, D., Weiss, D. J., Segawa, E., Bhaumik, D. K., Kupfer, D. J., Frank, E., Grochocinski, V. J., & Stover, A. (2007). Full-Information item bifactor analysis of graded response data. Applied Psychological Measurement, 31, 4-19.
Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. New York, NY: Cambridge University Press.
## Not run:
### load SAT12 and compute bifactor model with 3 specific factors
data(SAT12)
data <- key2binary(SAT12,
  key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
specific <- c(2,3,2,3,3,2,1,2,1,1,1,3,1,3,1,2,1,1,3,3,1,1,3,1,3,3,1,3,2,3,1,2)
mod1 <- bfactor(data, specific)
summary(mod1)
itemplot(mod1, 18, drop.zeros = TRUE) #drop the zero slopes to allow plotting
### Try with fixed guessing parameters added
guess <- rep(.1,32)
mod2 <- bfactor(data, specific, guess = guess)
coef(mod2)
anova(mod1, mod2)
## don't estimate specific factor for item 32
specific[32] <- NA
mod3 <- bfactor(data, specific)
anova(mod3, mod1)
# same, but declared manually (not run)
#sv <- mod2values(mod1)
#sv$value[220] <- 0 #parameter 220 is the 32nd item's specific slope
#sv$est[220] <- FALSE
#mod3 <- bfactor(data, specific, pars = sv) #with excellent starting values
#########
# mixed itemtype example
# simulate data
a <- matrix(c(
  1,0.5,NA,
  1,0.5,NA,
  1,0.5,NA,
  1,0.5,NA,
  1,0.5,NA,
  1,0.5,NA,
  1,0.5,NA,
  1,NA,0.5,
  1,NA,0.5,
  1,NA,0.5,
  1,NA,0.5,
  1,NA,0.5,
  1,NA,0.5,
  1,NA,0.5),ncol=3,byrow=TRUE)
d <- matrix(c(
  1.0,NA,NA,
  1.5,NA,NA,
  1.5,NA,NA,
  0.0,NA,NA,
  2.5,1.0,-1,
  3.0,2.0,0.5,
  3.0,2.0,0.5,
  3.0,2.0,0.5,
  2.5,1.0,-1,
  2.0,0.0,NA,
  1.0,NA,NA,
  1.5,NA,NA,
  1.5,NA,NA,
  0.0,NA,NA),ncol=3,byrow=TRUE)
items <- rep('2PL', 14)
items[5:10] <- 'graded'
sigma <- diag(3)
dataset <- simdata(a,d,2000,itemtype=items,sigma=sigma)
itemstats(dataset)
specific <- c(rep(1,7),rep(2,7))
simmod <- bfactor(dataset, specific)
coef(simmod)
#########
# General testlet response model (Wainer, 2007)
# simulate data
set.seed(1234)
a <- matrix(0, 12, 4)
a[,1] <- rlnorm(12, .2, .3)
ind <- 1
for(i in 1:3){
   a[ind:(ind+3),i+1] <- a[ind:(ind+3),1]
   ind <- ind+4
}
print(a)
d <- rnorm(12, 0, .5)
sigma <- diag(c(1, .5, 1, .5))
dataset <- simdata(a,d,2000,itemtype=rep('2PL', 12),sigma=sigma)
itemstats(dataset)
# estimate by applying constraints and freeing the latent variances
specific <- c(rep(1,4),rep(2,4), rep(3,4))
model <- "G = 1-12
   CONSTRAIN = (1, a1, a2), (2, a1, a2), (3, a1, a2), (4, a1, a2),
     (5, a1, a3), (6, a1, a3), (7, a1, a3), (8, a1, a3),
     (9, a1, a4), (10, a1, a4), (11, a1, a4), (12, a1, a4)
   COV = S1*S1, S2*S2, S3*S3"
simmod <- bfactor(dataset, specific, model)
coef(simmod, simplify=TRUE)
# Constrained testlet model (Bradlow, 1999)
model2 <- "G = 1-12
   CONSTRAIN = (1, a1, a2), (2, a1, a2), (3, a1, a2), (4, a1, a2),
     (5, a1, a3), (6, a1, a3), (7, a1, a3), (8, a1, a3),
     (9, a1, a4), (10, a1, a4), (11, a1, a4), (12, a1, a4),
     (GROUP, COV_22, COV_33, COV_44)
   COV = S1*S1, S2*S2, S3*S3"
simmod2 <- bfactor(dataset, specific, model2)
coef(simmod2, simplify=TRUE)
anova(simmod2, simmod)
#########
# Two-tier model
# simulate data
set.seed(1234)
a <- matrix(c(
  0,1,0.5,NA,NA,
  0,1,0.5,NA,NA,
  0,1,0.5,NA,NA,
  0,1,0.5,NA,NA,
  0,1,0.5,NA,NA,
  0,1,NA,0.5,NA,
  0,1,NA,0.5,NA,
  0,1,NA,0.5,NA,
  1,0,NA,0.5,NA,
  1,0,NA,0.5,NA,
  1,0,NA,0.5,NA,
  1,0,NA,NA,0.5,
  1,0,NA,NA,0.5,
  1,0,NA,NA,0.5,
  1,0,NA,NA,0.5,
  1,0,NA,NA,0.5),ncol=5,byrow=TRUE)
d <- matrix(rnorm(16))
items <- rep('2PL', 16)
sigma <- diag(5)
sigma[1,2] <- sigma[2,1] <- .4
dataset <- simdata(a,d,2000,itemtype=items,sigma=sigma)
itemstats(dataset)
specific <- c(rep(1,5),rep(2,6),rep(3,5))
model <- '
  G1 = 1-8
  G2 = 9-16
  COV = G1*G2'
# quadpts dropped for faster estimation, but not as precise
simmod <- bfactor(dataset, specific, model, quadpts = 9, TOL = 1e-3)
coef(simmod, simplify=TRUE)
summary(simmod)
itemfit(simmod, QMC=TRUE)
M2(simmod, QMC=TRUE)
residuals(simmod, QMC=TRUE)
## End(Not run)
A 3-item tabulated data set extracted from Table 3 in Chapter Two.
Phil Chalmers [email protected]
Bock, R. D. (1997). The Nominal Categories Model. In van der Linden, W. J. & Hambleton, R. K. Handbook of modern item response theory. New York: Springer.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
dat <- expand.table(Bock1997)
head(dat)
itemstats(dat, use_ts=FALSE)
mod <- mirt(dat, 1, 'nominal')
# reproduce table 3 in Bock (1997)
fs <- round(fscores(mod, verbose = FALSE, full.scores = FALSE)[,c('F1','SE_F1')],2)
fttd <- residuals(mod, type = 'exp')
table <- data.frame(fttd[,ncol(fttd)], fs)
table
mod <- mirt(dat, 1, 'nominal')
coef(mod)
## End(Not run)
Given two fitted models, compute a parametric bootstrap test to determine whether the less restrictive model fits significantly better than the more restricted model. Note that this hypothesis test also works when prior parameter distributions are included for either model. The function can be run in parallel after defining a suitable mirtCluster object.
boot.LR(mod, mod2, R = 1000, verbose = TRUE)
mod 
an estimated model object, more constrained than 
mod2 
an estimated model object 
R 
number of parametric bootstraps to use. 
verbose 
logical; include additional information in the console? 
a p-value evaluating whether the more restrictive model fits significantly worse than the less restrictive model
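The parametric bootstrap logic is generic: simulate R datasets from the more constrained model, refit both models to each simulated dataset, and locate the observed likelihood-ratio statistic within the resulting reference distribution. A toy sketch with nested normal-mean models (an illustration of the scheme only, not boot.LR's internals):

```r
# Toy parametric bootstrap LR test: H0 fixes the mean at 0, H1 frees it.
# boot.LR() applies the same scheme to a pair of fitted mirt models.
set.seed(1)
x <- rnorm(50, mean = .2)
LR <- function(y) {  # 2 * (logLik of free-mean model - logLik of null model)
  2 * (sum(dnorm(y, mean(y), sd(y), log = TRUE)) -
       sum(dnorm(y, 0,       sd(y), log = TRUE)))
}
obs <- LR(x)
# simulate from the null model and recompute the statistic each time
boots <- replicate(500, LR(rnorm(50, mean = 0, sd = sd(x))))
pval <- mean(boots >= obs)  # bootstrap p-value
```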
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
# standard
dat <- expand.table(LSAT7)
mod1 <- mirt(dat, 1)
mod2 <- mirt(dat, 1, '3PL')
# standard LR test
anova(mod1, mod2)
# bootstrap LR test (run in parallel to save time)
if(interactive()) mirtCluster()
boot.LR(mod1, mod2, R=200)
## End(Not run)
Given an internal mirt object estimate the bootstrapped standard errors. It may
be beneficial to run the computations using multicore architecture (e.g., the parallel
package). Parameters are organized from the freely estimated values in mod2values(x)
(equality constraints will also be returned in the bootstrapped estimates).
boot.mirt(x, R = 100, boot.fun = NULL, technical = NULL, ...)
x 
an estimated model object 
R 
number of draws to use (passed to the 
boot.fun 
a userdefined function used to extract the information from the bootstrap
fitted models. Must be of the form 
technical 
technical arguments passed to estimation engine. See 
... 
additional arguments to be passed on to 
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
# standard
mod <- mirt(Science, 1)
booted <- boot.mirt(mod, R=20)
plot(booted)
booted
# run in parallel using the snow backend with all available cores
mod <- mirt(Science, 1)
booted <- boot.mirt(mod, parallel = 'snow', ncpus = parallel::detectCores())
booted
####
# bootstrapped CIs for standardized factor loadings
boot.fun <- function(mod){
  so <- summary(mod, verbose=FALSE)
  as.vector(so$rotF)
}
# test to see if it works before running
boot.fun(mod)
# run
booted.loads <- boot.mirt(mod, boot.fun=boot.fun)
booted.loads
## End(Not run)
Return a list (or data.frame) of raw item and group-level coefficients. Note that while the output to the console is rounded to three digits, the returned list of objects is not. Hence, elements from cfs <- coef(mod); cfs[[1]] will contain the non-rounded results (useful for simulations).
## S4 method for signature 'SingleGroupClass'
coef(
object,
CI = 0.95,
printSE = FALSE,
rotate = "none",
Target = NULL,
IRTpars = FALSE,
rawug = FALSE,
as.data.frame = FALSE,
simplify = FALSE,
unique = FALSE,
verbose = TRUE,
...
)
object 
an object of class 
CI 
the amount of coverage used to compute confidence intervals; default is 95 percent confidence intervals 
printSE 
logical; print the standard errors instead of the confidence intervals? When

rotate 
see 
Target 
a dummy variable matrix indicating a target rotation pattern 
IRTpars 
logical; convert slope-intercept parameters into traditional IRT parameters?
Only applicable to unidimensional models or models with simple structure (i.e., only one non-zero slope).
If a suitable ACOV estimate was computed in the fitted
model, and 
rawug 
logical; return the untransformed internal g and u parameters?
If 
as.data.frame 
logical; convert list output to a data.frame instead? 
simplify 
logical; if all items have the same parameter names (indicating they are of the same class) then they are collapsed to a matrix, and a list of length 2 is returned containing a matrix of item parameters and group-level estimates 
unique 
return the vector of uniquely estimated parameters 
verbose 
logical; allow information to be printed to the console? 
... 
additional arguments to be passed 
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
dat <- expand.table(LSAT7)
x <- mirt(dat, 1)
coef(x)
coef(x, IRTpars = TRUE)
coef(x, simplify = TRUE)
#with computed information matrix
x <- mirt(dat, 1, SE = TRUE)
coef(x)
coef(x, printSE = TRUE)
coef(x, as.data.frame = TRUE)
#two factors
x2 <- mirt(Science, 2)
coef(x2)
coef(x2, rotate = 'varimax')
## End(Not run)
Initializes the proper S4 class and methods necessary for mirt functions to use in estimation for defining customized group-level functions. To use the defined objects pass them to the mirt(..., customGroup = OBJECT) command, and ensure that the class parameters are properly labelled.
createGroup(
par,
est,
den,
nfact,
standardize = FALSE,
gr = NULL,
hss = NULL,
gen = NULL,
lbound = NULL,
ubound = NULL,
derivType = "Richardson"
)
par 
a named vector of the starting values for the parameters 
est 
a logical vector indicating which parameters should be freely estimated by default 
den 
the probability density function given the Theta/ability values.
First input contains a vector of all the defined parameters and the second input
must be a matrix called 
nfact 
number of factors required for the model. E.g., for unidimensional models with only one
dimension of integration 
standardize 
logical; use standardization of the quadrature table method proposed by
Woods and Thissen (2006)? If TRUE, the logical elements named 
gr 
gradient function (vector of first derivatives) of the loglikelihood used in
estimation. The function must be of the form 
hss 
Hessian function (matrix of second derivatives) of the loglikelihood used in
estimation. If not specified a numeric approximation will be used.
The input is identical to the 
gen 
a function used when 
lbound 
optional vector indicating the lower bounds of the parameters. If not specified then the bounds will be set to -Inf 
ubound 
optional vector indicating the upper bounds of the parameters. If not specified then the bounds will be set to Inf 
derivType 
if the 
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
# normal density example, N(mu, sigma^2)
den <- function(obj, Theta) dnorm(Theta, obj@par[1], sqrt(obj@par[2]))
par <- c(mu = 0, sigma2 = .5)
est <- c(FALSE, TRUE)
lbound <- c(-Inf, 0)
grp <- createGroup(par, est, den, nfact = 1, lbound=lbound)
dat <- expand.table(LSAT6)
mod <- mirt(dat, 1, 'Rasch')
modcustom <- mirt(dat, 1, 'Rasch', customGroup=grp)
coef(mod)
coef(modcustom)
Initializes the proper S4 class and methods necessary for mirt functions to use in estimation. To use the defined objects pass to the mirt(..., customItems = list()) command, and ensure that the classes are properly labelled and unique in the list. Additionally, the input mirt(..., customItemsData = list()) can also be included to specify additional item-level information to better recycle custom-item definitions (e.g., for supplying varying Q-matrices), where the list input must have the same length as the number of items. For further examples regarding how this function can be used for fitting unfolding-type models see Liu and Chalmers (2018).
createItem(
name,
par,
est,
P,
gr = NULL,
hss = NULL,
gen = NULL,
lbound = NULL,
ubound = NULL,
derivType = "Richardson",
derivType.hss = "Richardson",
bytecompile = TRUE
)
name 
a character indicating the item class name to be defined 
par 
a named vector of the starting values for the parameters 
est 
a logical vector indicating which parameters should be freely estimated by default 
P 
the probability trace function for all categories (first column is category 1, second
column category 2, etc.). First input contains a vector of all the item parameters, the second input
must be a matrix called
or
to be valid; however, the names of the arguments are not relevant. Finally, this function must return a 
gr 
gradient function (vector of first derivatives) of the loglikelihood used in
estimation. The function must be of the form 
hss 
Hessian function (matrix of second derivatives) of the loglikelihood used in
estimation. If not specified a numeric approximation will be used (required for the MHRM
algorithm only). The input is identical to the 
gen 
a function used when 
lbound 
optional vector indicating the lower bounds of the parameters. If not specified then the bounds will be set to -Inf 
ubound 
optional vector indicating the upper bounds of the parameters. If not specified then the bounds will be set to Inf 
derivType 
if the 
derivType.hss 
if the 
bytecompile 
logical; where applicable, byte compile the functions provided? Default is

The summary()
function will not return proper standardized loadings
since the function is not sure how to handle them (no slopes could be
defined at all!). Instead loadings of .001 are filled in as placeholders.
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Liu, C.-W. and Chalmers, R. P. (2018). Fitting item response unfolding models to Likert-scale data using mirt in R. PLoS ONE, 13, 5. doi:10.1371/journal.pone.0196292
## Not run:
name <- 'old2PL'
par <- c(a = .5, b = -2)
est <- c(TRUE, TRUE)
P.old2PL <- function(par,Theta, ncat){
   a <- par[1]
   b <- par[2]
   P1 <- 1 / (1 + exp(-1*a*(Theta - b)))
   cbind(1-P1, P1)
}
x <- createItem(name, par=par, est=est, P=P.old2PL)
# So, let's estimate it!
dat <- expand.table(LSAT7)
sv <- mirt(dat, 1, c(rep('2PL',4), 'old2PL'), customItems=list(old2PL=x), pars = 'values')
tail(sv) #looks good
mod <- mirt(dat, 1, c(rep('2PL',4), 'old2PL'), customItems=list(old2PL=x))
coef(mod)
mod2 <- mirt(dat, 1, c(rep('2PL',4), 'old2PL'), customItems=list(old2PL=x), method = 'MHRM')
coef(mod2)
# same definition as above, but using symbolic derivative computations
# (can be more accurate/stable)
xs <- createItem(name, par=par, est=est, P=P.old2PL, derivType = 'symbolic')
mod <- mirt(dat, 1, c(rep('2PL',4), 'old2PL'), customItems=list(old2PL=xs))
coef(mod, simplify=TRUE)
# several secondary functions supported
M2(mod, calcNull=FALSE)
itemfit(mod)
fscores(mod, full.scores=FALSE)
plot(mod)
# fit the same model, but specify gradient function explicitly (use of a browser() may be helpful)
gr <- function(x, Theta){
   # browser()
   a <- x@par[1]
   b <- x@par[2]
   P <- probtrace(x, Theta)
   PQ <- apply(P, 1, prod)
   r_P <- x@dat / P
   grad <- numeric(2)
   grad[2] <- sum(a * PQ * (r_P[,2] - r_P[,1]))
   grad[1] <- sum((Theta - b) * PQ * (r_P[,2] - r_P[,1]))
   ## check with internal numerical form to be safe
   # numerical_deriv(x@par[x@est], mirt:::EML, obj=x, Theta=Theta)
   grad
}
x <- createItem(name, par=par, est=est, P=P.old2PL, gr=gr)
mod <- mirt(dat, 1, c(rep('2PL',4), 'old2PL'), customItems=list(old2PL=x))
coef(mod, simplify=TRUE)
### nonlinear
name <- 'nonlin'
par <- c(a1 = .5, a2 = .1, d = 0)
est <- c(TRUE, TRUE, TRUE)
P.nonlin <- function(par,Theta, ncat=2){
   a1 <- par[1]
   a2 <- par[2]
   d <- par[3]
   P1 <- 1 / (1 + exp(-1*(a1*Theta + a2*Theta^2 + d)))
   cbind(1-P1, P1)
}
x2 <- createItem(name, par=par, est=est, P=P.nonlin)
mod <- mirt(dat, 1, c(rep('2PL',4), 'nonlin'), customItems=list(nonlin=x2))
coef(mod)
### nominal response model (Bock 1972 version)
Tnom.dev <- function(ncat) {
   T <- matrix(1/ncat, ncat, ncat - 1)
   diag(T[-1, ]) <- diag(T[-1, ]) - 1
   return(T)
}
name <- 'nom'
par <- c(alp=c(3,0,-3),gam=rep(.4,3))
est <- rep(TRUE, length(par))
P.nom <- function(par, Theta, ncat){
   alp <- par[1:(ncat-1)]
   gam <- par[ncat:length(par)]
   a <- Tnom.dev(ncat) %*% alp
   c <- Tnom.dev(ncat) %*% gam
   z <- matrix(0, nrow(Theta), ncat)
   for(i in 1:ncat)
      z[,i] <- a[i] * Theta + c[i]
   P <- exp(z) / rowSums(exp(z))
   P
}
nom1 <- createItem(name, par=par, est=est, P=P.nom)
nommod <- mirt(Science, 1, 'nom1', customItems=list(nom1=nom1))
coef(nommod)
Tnom.dev(4) %*% coef(nommod)[[1]][1:3] #a
Tnom.dev(4) %*% coef(nommod)[[1]][4:6] #d
## End(Not run)
Mathematics data from de Ayala (2009; pg. 14); a 5-item dataset in table format.
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
de Ayala, R. J. (2009). The theory and practice of item response theory. Guilford Press.
## Not run:
dat <- expand.table(deAyala)
head(dat)
itemstats(dat)
## End(Not run)
This function runs the Wald and likelihood-ratio approaches for testing differential item functioning (DIF) with two or more groups. This is primarily a convenience wrapper to the multipleGroup function for performing standard DIF procedures. Independent models can be estimated in parallel by defining a parallel object with mirtCluster, which will help to decrease the run time. For best results, the baseline model should contain a set of 'anchor' items and have freely estimated hyperparameters in the focal groups.
DIF(
MGmodel,
which.par,
scheme = "add",
items2test = 1:extract.mirt(MGmodel, "nitems"),
groups2test = "all",
seq_stat = "SABIC",
Wald = FALSE,
p.adjust = "none",
pairwise = FALSE,
return_models = FALSE,
return_seq_model = FALSE,
max_run = Inf,
plotdif = FALSE,
type = "trace",
simplify = TRUE,
verbose = TRUE,
...
)
MGmodel 
an object returned from 
which.par 
a character vector containing the parameter names which will be inspected for DIF 
scheme 
type of DIF analysis to perform, either by adding or dropping constraints across groups. These can be:

items2test 
a numeric vector, or character vector containing the item names, indicating
which items will be tested for DIF. In models where anchor items are known, omit them from
this vector. For example, if items 1 and 2 are anchors in a 10 item test, then

groups2test 
a character vector indicating which groups to use in the DIF testing
investigations. Default is 
seq_stat 
select a statistic to test for in the sequential schemes. Potential values are
(in descending order of power) 
Wald 
logical; perform Wald tests for DIF instead of likelihood ratio test? 
p.adjust 
string to be passed to the 
pairwise 
logical; perform pairwise tests between groups when the number of groups is greater than 2? Useful as quickly specified posthoc tests 
return_models 
logical; return estimated model objects for further analysis? Default is FALSE 
return_seq_model 
logical; on the last iteration of the sequential schemes, return
the fitted multiplegroup model containing the freely estimated parameters indicative of
DIF? This is generally only useful when 
max_run 
a number indicating the maximum number of cycles to perform in sequential searches. The default is to perform search until no further DIF is found 
plotdif 
logical; create item plots for items that are displaying DIF according to the

type 
the 
simplify 
logical; simplify the output by returning a data.frame object with the differences between AIC, BIC, etc., as well as the chi-squared test (X2) and associated df and p-values 
verbose 
logical; print extra information to the console? 
... 
additional arguments to be passed to 
Generally, the pre-computed baseline model should have been configured with two estimation properties: 1) a set of 'anchor' items, where the anchor items have various parameters that have been constrained to be equal across the groups, and 2) contain freely estimated latent mean and variance terms in all but one group (the so-called 'reference' group). These two properties help to fix the metric of the groups so that item parameter estimates do not contain latent distribution characteristics.
a mirt_df object with the information-based criteria for DIF, though this may be changed
to a list output when return_models or simplify are modified. As well, a silent
'DIF_coefficients' attribute is included to view the item parameter differences
between the groups
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Chalmers, R. P., Counsell, A., and Flora, D. B. (2016). It might not make a big DIF: Improved Differential Test Functioning statistics that account for sampling variability. Educational and Psychological Measurement, 76, 114-140. doi:10.1177/0013164415584576
## Not run:
# simulate data where group 2 has smaller slopes and more extreme intercepts
set.seed(12345)
a1 <- a2 <- matrix(abs(rnorm(15,1,.3)), ncol=1)
d1 <- d2 <- matrix(rnorm(15,0,.7),ncol=1)
a2[1:2, ] <- a1[1:2, ]/3
d1[c(1,3), ] <- d2[c(1,3), ]/4
head(data.frame(a.group1 = a1, a.group2 = a2, d.group1 = d1, d.group2 = d2))
itemtype <- rep('2PL', nrow(a1))
N <- 1000
dataset1 <- simdata(a1, d1, N, itemtype)
dataset2 <- simdata(a2, d2, N, itemtype, mu = .1, sigma = matrix(1.5))
dat <- rbind(dataset1, dataset2)
group <- c(rep('D1', N), rep('D2', N))
#### no anchors, all items tested for DIF by adding item constraints one item at a time.
# define a parallel cluster (optional) to help speed up internal functions
if(interactive()) mirtCluster()
# Information matrix with Oakes' identity (not controlling for latent group differences)
# NOTE: Without properly equating the groups the following example code is not testing for DIF,
# but instead reflects a combination of DIF + latenttrait distribution effects
model <- multipleGroup(dat, 1, group, SE = TRUE)
# Likelihoodratio test for DIF (as well as model information)
dif <- DIF(model, c('a1', 'd'))
dif
# function silently includes "DIF_coefficients" attribute to view
# the IRT parameters postcompletion
extract.mirt(dif, "DIF_coefficients")
# same as above, but using Wald tests with Benjamini & Hochberg adjustment
DIF(model, c('a1', 'd'), Wald = TRUE, p.adjust = 'fdr')
# equate the groups by assuming the last 5 items have no DIF
itemnames <- colnames(dat)
model <- multipleGroup(dat, 1, group, SE = TRUE,
invariance = c(itemnames[11:ncol(dat)], 'free_means', 'free_var'))
# test whether adding slopes and intercepts constraints results in DIF. Plot items showing DIF
resulta1d <- DIF(model, c('a1', 'd'), plotdif = TRUE, items2test=1:10)
resulta1d
# test whether adding only slope constraints results in DIF for all items
DIF(model, 'a1', items2test=1:10)
# Determine whether it's a1 or d parameter causing DIF (could be joint, however)
(a1s <- DIF(model, 'a1', items2test = 1:3))
(ds <- DIF(model, 'd', items2test = 1:3))
### drop down approach (freely estimating parameters across groups) when
### specifying a highly constrained model with estimated latent parameters
model_constrained <- multipleGroup(dat, 1, group,
invariance = c(colnames(dat), 'free_means', 'free_var'))
dropdown <- DIF(model_constrained, c('a1', 'd'), scheme = 'drop')
dropdown
# View silent "DIF_coefficients" attribute
extract.mirt(dropdown, "DIF_coefficients")
### sequential schemes (add constraints)
### sequential searches using SABIC as the selection criteria
# starting from completely different models
stepup <- DIF(model, c('a1', 'd'), scheme = 'add_sequential',
items2test=1:10)
stepup
# step down procedure for highly constrained model
stepdown <- DIF(model_constrained, c('a1', 'd'), scheme = 'drop_sequential')
stepdown
# view final MG model (only useful when scheme is 'add_sequential')
updated_mod <- DIF(model, c('a1', 'd'), scheme = 'add_sequential',
return_seq_model=TRUE)
plot(updated_mod, type='trace')
###################################
# Multigroup example
a1 <- a2 <- a3 <- matrix(abs(rnorm(15,1,.3)), ncol=1)
d1 <- d2 <- d3 <- matrix(rnorm(15,0,.7),ncol=1)
a2[1:2, ] <- a1[1:2, ]/3
d3[c(1,3), ] <- d2[c(1,3), ]/4
head(data.frame(a.group1 = a1, a.group2 = a2, a.group3 = a3,
d.group1 = d1, d.group2 = d2, d.group3 = d3))
itemtype <- rep('2PL', nrow(a1))
N <- 1000
dataset1 <- simdata(a1, d1, N, itemtype)
dataset2 <- simdata(a2, d2, N, itemtype, mu = .1, sigma = matrix(1.5))
dataset3 <- simdata(a3, d3, N, itemtype, mu = .2)
dat <- rbind(dataset1, dataset2, dataset3)
group <- gl(3, N, labels = c('g1', 'g2', 'g3'))
# equate the groups by assuming the last 5 items have no DIF
itemnames <- colnames(dat)
model <- multipleGroup(dat, group=group, SE=TRUE,
invariance = c(itemnames[11:ncol(dat)], 'free_means', 'free_var'))
coef(model, simplify=TRUE)
# omnibus tests
dif <- DIF(model, which.par = c('a1', 'd'), items2test=1:9)
dif
# pairwise post-hoc tests for items flagged via omnibus tests
dif.posthoc <- DIF(model, which.par = c('a1', 'd'), items2test=1:2,
pairwise = TRUE)
dif.posthoc
# further probing for df = 1 tests, this time with Wald tests
DIF(model, which.par = c('a1'), items2test=1:2, pairwise = TRUE,
Wald=TRUE)
DIF(model, which.par = c('d'), items2test=1:2, pairwise = TRUE,
Wald=TRUE)
## End(Not run)
Defines the object returned from mdirt.
Call
:function call
Data
:list of data, sometimes in different forms
Options
:list of estimation options
Fit
:a list of fit information
Model
:a list of modelbased information
ParObjects
:a list of the S4 objects used during estimation
OptimInfo
:a list of arguments from the optimization process
Internals
:a list of internal arguments for secondary computations (inspecting this object is generally not required)
vcov
:a matrix representing the asymptotic covariance matrix of the parameter estimates
time
:a data.frame indicating the breakdown of computation times in seconds
signature(x = "DiscreteClass")
signature(object = "DiscreteClass")
signature(object = "DiscreteClass")
signature(x = "DiscreteClass")
signature(object = "DiscreteClass")
signature(object = "DiscreteClass")
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Draws plausible parameters from a model using parametric sampling (if the information matrix
was computed) or via bootstrap sampling. Primarily for use with the DRF
function.
draw_parameters(
mod,
draws,
method = c("parametric", "boostrap"),
redraws = 20,
verbose = FALSE,
...
)
mod 
estimated single or multiplegroup model 
draws 
number of draws to obtain 
method 
type of plausible values to obtain. Can be 'parametric', for the parametric sampling
scheme which uses the estimated information matrix, or 'boostrap' to obtain values from the 
redraws 
number of redraws to perform when the given parametric sample does not satisfy the upper and lower parameter bounds. If a valid set cannot be found within this number of draws then an error will be thrown 
verbose 
logical; include additional information in the console? 
... 
additional arguments to be passed 
returns a draws x p matrix of plausible parameters, where each row corresponds to a single set
## Not run:
set.seed(1234)
n <- 40
N <- 500
# only first 5 items as anchors
model <- 'F = 1-40
CONSTRAINB = (1-5, a1), (1-5, d)'
a <- matrix(1, n)
d <- matrix(rnorm(n), n)
group <- c(rep('Group_1', N), rep('Group_2', N))
## 
# groups completely equal
dat1 <- simdata(a, d, N, itemtype = 'dich')
dat2 <- simdata(a, d, N, itemtype = 'dich')
dat <- rbind(dat1, dat2)
mod <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
param_set <- draw_parameters(mod, 100)
head(param_set)
## End(Not run)
Function performs various omnibus differential item (DIF), bundle (DBF), and test (DTF)
functioning procedures on an object
estimated with multipleGroup()
. The compensatory and non-compensatory statistics provided
are described in Chalmers (2018), which generally can be interpreted as IRT generalizations
of the SIBTEST and CSIBTEST statistics. For hypothesis tests, these measures
require the ACOV matrix to be computed in the
fitted multiplegroup model (otherwise, sets of plausible draws from the posterior are explicitly
required).
DRF(
mod,
draws = NULL,
focal_items = 1L:extract.mirt(mod, "nitems"),
param_set = NULL,
den.type = "marginal",
best_fitting = FALSE,
CI = 0.95,
npts = 1000,
quadpts = NULL,
theta_lim = c(-6, 6),
Theta_nodes = NULL,
plot = FALSE,
DIF = FALSE,
DIF.cats = FALSE,
groups2test = "all",
pairwise = FALSE,
simplify = TRUE,
p.adjust = "none",
par.strip.text = list(cex = 0.7),
par.settings = list(strip.background = list(col = "#9ECAE1"), strip.border = list(col =
"black")),
auto.key = list(space = "right", points = FALSE, lines = TRUE),
verbose = TRUE,
...
)
mod 
a multipleGroup object which estimated only 2 groups 
draws 
a number indicating how many draws to take to form a suitable multiple imputation
or bootstrap estimate of the expected test scores (100 or more). If 
focal_items 
a character/numeric vector indicating which items to include in the DRF tests. The default uses all of the items (note that including anchors in the focal items has no effect because they are exactly equal across groups). Selecting fewer items will result in tests of 'differential bundle functioning' 
param_set 
an N x p matrix of parameter values drawn from the posterior (e.g., using the
parametric sampling approach, bootstrap, or MCMC). If supplied, then these will be used to compute
the DRF measures. Can be much more efficient to precompute these values if DIF, DBF, or DTF are
being evaluated within the same model (especially when using the bootstrap method).
See 
den.type 
character specifying how the density of the latent traits is computed.
Default is 'marginal' 
best_fitting 
logical; use the best fitting parametric distribution (Gaussian by default) that was used at the time of model estimation? This will result in much faster computations, however the results are more dependent upon the underlying modelling assumptions. Default is FALSE, which uses the empirical histogram approach 
CI 
range of confidence interval when using draws input 
npts 
number of points to use for plotting. Default is 1000 
quadpts 
number of quadrature nodes to use when constructing DRF statistics. Default is extracted from the input model object 
theta_lim 
lower and upper limits of the latent trait (theta) to be evaluated, and is
used in conjunction with 
Theta_nodes 
an optional matrix of Theta values to be evaluated in the draws for the sDRF statistics. However, these values are not averaged across, and instead give the bootstrap confidence intervals at the respective Theta nodes. Useful when following up a large sDRF or uDRF statistic, for example, to determine where the difference between the test curves are large (while still accounting for sampling variability). Returns a matrix with observed variability 
plot 
logical; plot the 'sDRF' functions for the evaluated sDBF or sDTF values across the
integration grid or, if 
DIF 
logical; return a list of itemlevel imputation properties using the DRF statistics? These can generally be used as a DIF detection method and as a graphical display for understanding DIF within each item 
DIF.cats 
logical; same as 
groups2test 
when more than 2 groups are being investigated, which two groups should be used in the effect size comparisons? 
pairwise 
logical; perform pairwise computations when applying to multi-group settings 
simplify 
logical; attempt to simplify the output rather than returning larger lists? 
p.adjust 
string to be passed to the 
par.strip.text 
plotting argument passed to 
par.settings 
plotting argument passed to 
auto.key 
plotting argument passed to 
verbose 
logical; include additional information in the console? 
... 
additional arguments to be passed to 
The effect sizes estimates by the DRF function are
$sDRF = \int [S(C\bm{\Psi}^{(R)},\theta) - S(C\bm{\Psi}^{(F)},\theta)] f(\theta)d\theta,$
$uDRF = \int |S(C\bm{\Psi}^{(R)},\theta) - S(C\bm{\Psi}^{(F)},\theta)| f(\theta)d\theta,$
and
$dDRF = \sqrt{\int [S(C\bm{\Psi}^{(R)},\theta) - S(C\bm{\Psi}^{(F)},\theta)]^2 f(\theta)d\theta}$
where $S(.)$
are the scoring equations used to evaluate the modelimplied
difference between the focal and reference group. The $f(\theta)$
terms can either be estimated from the posterior via an empirical
histogram approach (default), or can use the best
fitting prior distribution that is obtained post-convergence (default is a Gaussian
distribution). Note that, in comparison to Chalmers (2018), the focal group is
the leftmost scoring function while the reference group is the rightmost
scoring function. This is largely to keep consistent with similar effect
size statistics, such as SIBTEST, DFIT, Wainer's measures of impact, etc,
which in general can be seen as specialcase estimators of this family.
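As a conceptual illustration of these integrals, the following Python/NumPy sketch approximates sDRF, uDRF, and dDRF by quadrature for a single dichotomous 2PL item, weighting by a standard normal density for the f(theta) term. The helper names are illustrative assumptions; the package computes these quantities internally from the fitted scoring functions.

```python
import numpy as np

def p2pl(theta, a, d):
    """2PL item response function P(y = 1 | theta)."""
    return 1.0 / (1.0 + np.exp(-(a * theta + d)))

def drf_stats(a_ref, d_ref, a_foc, d_foc, lim=(-6, 6), npts=1000):
    """Quadrature sketch of the sDRF/uDRF/dDRF integrals for one item,
    using a standard normal density in place of the estimated f(theta)."""
    theta = np.linspace(lim[0], lim[1], npts)
    f = np.exp(-theta**2 / 2) / np.sqrt(2 * np.pi)
    f /= np.trapz(f, theta)                       # renormalize on the grid
    diff = p2pl(theta, a_ref, d_ref) - p2pl(theta, a_foc, d_foc)
    sDRF = np.trapz(diff * f, theta)              # signed: areas can cancel
    uDRF = np.trapz(np.abs(diff) * f, theta)      # unsigned: total area
    dDRF = np.sqrt(np.trapz(diff**2 * f, theta))  # RMS difference
    return sDRF, uDRF, dDRF
```

When the two groups share identical parameters all three statistics are zero; by construction uDRF is always at least as large as |sDRF|, which is why item-level cancellation can hide signed DTF.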
Phil Chalmers [email protected]
Chalmers, R. P. (2018). Model-Based Measures for Detecting and Quantifying Response Bias. Psychometrika, 83(3), 696-732. doi:10.1007/s11336-018-9626-9
## Not run:
set.seed(1234)
n <- 30
N <- 500
# only first 5 items as anchors
model <- 'F = 1-30
CONSTRAINB = (1-5, a1), (1-5, d)'
a <- matrix(1, n)
d <- matrix(rnorm(n), n)
group <- c(rep('Group_1', N), rep('Group_2', N))
## 
# groups completely equal
dat1 <- simdata(a, d, N, itemtype = 'dich')
dat2 <- simdata(a, d, N, itemtype = 'dich')
dat <- rbind(dat1, dat2)
mod <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
plot(mod)
plot(mod, which.items = 6:10) #DBF
plot(mod, type = 'itemscore')
plot(mod, type = 'itemscore', which.items = 10:15)
# empirical histogram approach
DRF(mod)
DRF(mod, focal_items = 6:10) #DBF
DRF(mod, DIF=TRUE)
DRF(mod, DIF=TRUE, focal_items = 10:15)
# Bestfitting Gaussian distributions
DRF(mod, best_fitting=TRUE)
DRF(mod, focal_items = 6:10, best_fitting=TRUE) #DBF
DRF(mod, DIF=TRUE, best_fitting=TRUE)
DRF(mod, DIF=TRUE, focal_items = 10:15, best_fitting=TRUE)
DRF(mod, plot = TRUE)
DRF(mod, focal_items = 6:10, plot = TRUE) #DBF
DRF(mod, DIF=TRUE, plot = TRUE)
DRF(mod, DIF=TRUE, focal_items = 10:15, plot = TRUE)
if(interactive()) mirtCluster()
DRF(mod, draws = 500)
DRF(mod, draws = 500, best_fitting=TRUE)
DRF(mod, draws = 500, plot=TRUE)
# predraw parameter set to save computations
# (more useful when using nonparametric bootstrap)
param_set <- draw_parameters(mod, draws = 500)
DRF(mod, focal_items = 6, param_set=param_set) #DIF test
DRF(mod, DIF=TRUE, param_set=param_set) #DIF test
DRF(mod, focal_items = 6:10, param_set=param_set) #DBF test
DRF(mod, param_set=param_set) #DTF test
DRF(mod, focal_items = 6:10, draws=500) #DBF test
DRF(mod, focal_items = 10:15, draws=500) #DBF test
DIFs <- DRF(mod, draws = 500, DIF=TRUE)
print(DIFs)
DRF(mod, draws = 500, DIF=TRUE, plot=TRUE)
DIFs <- DRF(mod, draws = 500, DIF=TRUE, focal_items = 6:10)
print(DIFs)
DRF(mod, draws = 500, DIF=TRUE, focal_items = 6:10, plot = TRUE)
DRF(mod, DIF=TRUE, focal_items = 6)
DRF(mod, draws=500, DIF=TRUE, focal_items = 6)
# evaluate specific values for sDRF
Theta_nodes <- matrix(seq(-6,6,length.out = 100))
sDTF <- DRF(mod, Theta_nodes=Theta_nodes)
head(sDTF)
sDTF <- DRF(mod, Theta_nodes=Theta_nodes, draws=200)
head(sDTF)
# sDIF (isolate single item)
sDIF <- DRF(mod, Theta_nodes=Theta_nodes, focal_items=6)
head(sDIF)
sDIF <- DRF(mod, Theta_nodes=Theta_nodes, focal_items = 6, draws=200)
head(sDIF)
## 
## random slopes and intercepts for 15 items, and latent mean difference
## (no systematic DTF should exist, but DIF will be present)
set.seed(1234)
dat1 <- simdata(a, d, N, itemtype = 'dich', mu=.50, sigma=matrix(1.5))
dat2 <- simdata(a + c(numeric(15), rnorm(n-15, 0, .25)),
d + c(numeric(15), rnorm(n-15, 0, .5)), N, itemtype = 'dich')
dat <- rbind(dat1, dat2)
mod1 <- multipleGroup(dat, 1, group=group)
plot(mod1)
DRF(mod1) #does not account for group differences! Need anchors
mod2 <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
plot(mod2)
# significant DIF in multiple items....
# DIF(mod2, which.par=c('a1', 'd'), items2test=16:30)
DRF(mod2)
DRF(mod2, draws=500) #nonsig DTF due to item cancellation
## 
## systematic differing slopes and intercepts (clear DTF)
set.seed(1234)
dat1 <- simdata(a, d, N, itemtype = 'dich', mu=.50, sigma=matrix(1.5))
dat2 <- simdata(a + c(numeric(15), rnorm(n-15, 1, .25)),
d + c(numeric(15), rnorm(n-15, 1, .5)),
N, itemtype = 'dich')
dat <- rbind(dat1, dat2)
mod3 <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
plot(mod3) #visible DTF happening
# DIF(mod3, c('a1', 'd'), items2test=16:30)
DRF(mod3) #unsigned bias. Signed bias (group 2 scores higher on average)
DRF(mod3, draws=500)
DRF(mod3, draws=500, plot=TRUE) #multiple DRF areas along Theta
# plot the DIF
DRF(mod3, draws=500, DIF=TRUE, plot=TRUE)
# evaluate specific values for sDRF
Theta_nodes <- matrix(seq(-6,6,length.out = 100))
sDTF <- DRF(mod3, Theta_nodes=Theta_nodes, draws=200)
head(sDTF)
# DIF
sDIF <- DRF(mod3, Theta_nodes=Theta_nodes, focal_items = 30, draws=200)
car::some(sDIF)
## 
# polytomous example
# simulate data where group 2 has a different slopes/intercepts
set.seed(4321)
a1 <- a2 <- matrix(rlnorm(20,.2,.3))
a2[c(16:17, 19:20),] <- a1[c(16:17, 19:20),] + c(.5, .25, .25, .5)
# for the graded model, ensure that there is enough space between the intercepts,
# otherwise closer categories will not be selected often
diffs <- t(apply(matrix(runif(20*4, .3, 1), 20), 1, cumsum))
diffs <- -(diffs - rowMeans(diffs))
d1 <- d2 <- diffs + rnorm(20)
rownames(d1) <- rownames(d2) <- paste0('Item.', 1:20)
d2[16:20,] <- d1[16:20,] + matrix(c(.5, .5, .5, .5,
1, 0, 0, 1,
.5, .5, .5, .5,
1, .5, 0, 1,
.5, .5, .5, .5), byrow=TRUE, nrow=5)
tail(data.frame(a.group1 = a1, a.group2 = a2), 6)
list(d.group1 = d1[15:20,], d.group2 = d2[15:20,])
itemtype <- rep('graded', nrow(a1))
N <- 600
dataset1 <- simdata(a1, d1, N, itemtype)
dataset2 <- simdata(a2, d2, N, itemtype, mu = .25, sigma = matrix(1.25))
dat <- rbind(dataset1, dataset2)
group <- c(rep('D1', N), rep('D2', N))
# item 110 as anchors
mod <- multipleGroup(dat, group=group, SE=TRUE,
invariance=c(colnames(dat)[1:10], 'free_means', 'free_var'))
coef(mod, simplify=TRUE)
plot(mod)
plot(mod, type='itemscore')
# DIF tests via Wald method
DIF(mod, items2test=11:20,
which.par=c('a1', paste0('d', 1:4)),
Wald=TRUE, p.adjust='holm')
DRF(mod)
DRF(mod, DIF=TRUE, focal_items=11:20)
DRF(mod, DIF.cats=TRUE, focal_items=11:20)
## 
### multidimensional DTF
set.seed(1234)
n <- 50
N <- 1000
# only first 5 items as anchors within each dimension
model <- 'F1 = 1-25
F2 = 26-50
COV = F1*F2
CONSTRAINB = (1-5, a1), (1-5, 26-30, d), (26-30, a2)'
a <- matrix(c(rep(1, 25), numeric(50), rep(1, 25)), n)
d <- matrix(rnorm(n), n)
group <- c(rep('Group_1', N), rep('Group_2', N))
Cov <- matrix(c(1, .5, .5, 1.5), 2)
Mean <- c(0, 0.5)
# groups completely equal
dat1 <- simdata(a, d, N, itemtype = 'dich', sigma = cov2cor(Cov))
dat2 <- simdata(a, d, N, itemtype = 'dich', sigma = Cov, mu = Mean)
dat <- rbind(dat1, dat2)
mod <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
coef(mod, simplify=TRUE)
plot(mod, degrees = c(45,45))
DRF(mod)
# some intercepts slightly higher in Group 2
d2 <- d
d2[c(10:15, 31:35)] <- d2[c(10:15, 31:35)] + 1
dat1 <- simdata(a, d, N, itemtype = 'dich', sigma = cov2cor(Cov))
dat2 <- simdata(a, d2, N, itemtype = 'dich', sigma = Cov, mu = Mean)
dat <- rbind(dat1, dat2)
mod <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
coef(mod, simplify=TRUE)
plot(mod, degrees = c(45,45))
DRF(mod)
DRF(mod, draws = 500)
## End(Not run)
Function performs various omnibus differential test functioning procedures on an object
estimated with multipleGroup()
. If the latent means/covariances are suspected to differ
then the input object should contain a set of 'anchor' items to ensure that only differential
test features are being detected rather than group differences. Returns signed (average area
above and below) and unsigned (total area) statistics, with descriptives such as the percent
average bias between group total scores for each statistic. If a grid of Theta values is passed,
these can be evaluated as well to determine specific DTF location effects. For best results,
the baseline model should contain a set of 'anchor' items and have freely estimated
hyperparameters in the focal groups. See DIF
for details.
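As a conceptual illustration, the signed (average area, cancellation possible) and unsigned (total area) statistics described above can be approximated by quadrature over the difference between two groups' expected test score functions. The Python/NumPy sketch below assumes 2PL items and a standard normal weighting density; the function names are illustrative, not the package's internal implementation.

```python
import numpy as np

def expected_score(theta, a, d):
    """Expected total score over a set of 2PL items at each theta."""
    return (1.0 / (1.0 + np.exp(-(np.outer(theta, a) + d)))).sum(axis=1)

def dtf_stats(a_ref, d_ref, a_foc, d_foc, lim=(-6, 6), npts=1000):
    """Sketch of signed/unsigned DTF areas between two groups' test
    score functions, weighted by a standard normal density."""
    theta = np.linspace(lim[0], lim[1], npts)
    f = np.exp(-theta**2 / 2) / np.sqrt(2 * np.pi)
    f /= np.trapz(f, theta)
    diff = (expected_score(theta, a_ref, d_ref)
            - expected_score(theta, a_foc, d_foc))
    sDTF = np.trapz(diff * f, theta)          # signed: areas can cancel
    uDTF = np.trapz(np.abs(diff) * f, theta)  # unsigned: total area
    return sDTF, uDTF
```

A small sDTF alongside a large uDTF is the test-level signature of item cancellation, which is why the examples below inspect both.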
DTF(
mod,
draws = NULL,
CI = 0.95,
npts = 1000,
theta_lim = c(-6, 6),
Theta_nodes = NULL,
plot = "none",
auto.key = list(space = "right", points = FALSE, lines = TRUE),
...
)
mod 
a multipleGroup object which estimated only 2 groups 
draws 
a number indicating how many draws to take to form a suitable multiple imputation estimate of the expected test scores (usually 100 or more). Returns a list containing the imputation distribution and null hypothesis test for the sDTF statistic 
CI 
range of confidence interval when using draws input 
npts 
number of points to use in the integration. Default is 1000 
theta_lim 
lower and upper limits of the latent trait (theta) to be evaluated, and is
used in conjunction with 
Theta_nodes 
an optional matrix of Theta values to be evaluated in the draws for the sDTF statistic. However, these values are not averaged across, and instead give the bootstrap confidence intervals at the respective Theta nodes. Useful when following up a large uDTF/sDTF statistic to determine where the difference between the test curves are large (while still accounting for sampling variability). Returns a matrix with observed variability 
plot 
a character vector indicating which plot to draw. Possible values are 'none', 'func' for the test score functions, and 'sDTF' for the evaluated sDTF values across the integration grid. Each plot is drawn with imputed confidence envelopes 
auto.key 
logical; automatically generate key in lattice plot? 
... 
additional arguments to be passed to 
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Chalmers, R. P., Counsell, A., and Flora, D. B. (2016). It might not make a big DIF: Improved Differential Test Functioning statistics that account for sampling variability. Educational and Psychological Measurement, 76, 114-140. doi:10.1177/0013164415584576
## Not run:
set.seed(1234)
n <- 30
N <- 500
# only first 5 items as anchors
model <- 'F = 1-30
CONSTRAINB = (1-5, a1), (1-5, d)'
a <- matrix(1, n)
d <- matrix(rnorm(n), n)
group <- c(rep('Group_1', N), rep('Group_2', N))
## 
# groups completely equal
dat1 <- simdata(a, d, N, itemtype = '2PL')
dat2 <- simdata(a, d, N, itemtype = '2PL')
dat <- rbind(dat1, dat2)
mod <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
plot(mod)
DTF(mod)
if(interactive()) mirtCluster()
DTF(mod, draws = 1000) #95% C.I. for sDTF containing 0. uDTF is very small
DTF(mod, draws = 1000, plot='sDTF') #sDTF 95% C.I.'s across Theta always include 0
## 
## random slopes and intercepts for 15 items, and latent mean difference
## (no systematic DTF should exist, but DIF will be present)
set.seed(1234)
dat1 <- simdata(a, d, N, itemtype = '2PL', mu=.50, sigma=matrix(1.5))
dat2 <- simdata(a + c(numeric(15), runif(n-15, -.2, .2)),
d + c(numeric(15), runif(n-15, -.5, .5)), N, itemtype = '2PL')
dat <- rbind(dat1, dat2)
mod1 <- multipleGroup(dat, 1, group=group)
plot(mod1) #does not account for group differences! Need anchors
mod2 <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
plot(mod2)
# significant DIF in multiple items....
# DIF(mod2, which.par=c('a1', 'd'), items2test=16:30)
DTF(mod2)
DTF(mod2, draws=1000) #nonsig DTF due to item cancellation
## 
## systematic differing slopes and intercepts (clear DTF)
dat1 <- simdata(a, d, N, itemtype = '2PL', mu=.50, sigma=matrix(1.5))
dat2 <- simdata(a + c(numeric(15), rnorm(n-15, 1, .25)), d + c(numeric(15), rnorm(n-15, 1, .5)),
N, itemtype = '2PL')
dat <- rbind(dat1, dat2)
mod3 <- multipleGroup(dat, model, group=group, SE=TRUE,
invariance=c('free_means', 'free_var'))
plot(mod3) #visible DTF happening
# DIF(mod3, c('a1', 'd'), items2test=16:30)
DTF(mod3) #unsigned bias. Signed bias indicates group 2 scores generally higher on average
DTF(mod3, draws=1000)
DTF(mod3, draws=1000, plot='func')
DTF(mod3, draws=1000, plot='sDTF') #multiple DTF areas along Theta
# evaluate specific values for sDTF
Theta_nodes <- matrix(seq(-6,6,length.out = 100))
sDTF <- DTF(mod3, Theta_nodes=Theta_nodes)
head(sDTF)
sDTF <- DTF(mod3, Theta_nodes=Theta_nodes, draws=100)
head(sDTF)
## End(Not run)
Computes effect size measures of differential item functioning and differential test/bundle functioning based on expected scores from Meade (2010). Item parameters from both reference and focal group are used in conjunction with focal group empirical theta estimates (and an assumed normally distributed theta) to compute expected scores.
empirical_ES(
mod,
Theta.focal = NULL,
focal_items = 1L:extract.mirt(mod, "nitems"),
DIF = TRUE,
npts = 61,
theta_lim = c(-6, 6),
plot = FALSE,
type = "b",
par.strip.text = list(cex = 0.7),
par.settings = list(strip.background = list(col = "#9ECAE1"), strip.border = list(col =
"black")),
...
)
mod 
a multipleGroup object which estimated only 2 groups. The first group in this object
is assumed to be the reference group by default (i.e., 
Theta.focal 
an optional matrix of Theta values from the focal group to be evaluated. If not supplied
the default values to 
focal_items 
a numeric vector indicating which items to include the tests. The
default uses all of the items. Selecting fewer items will result in tests of
'differential bundle functioning' when 
DIF 
logical; return a data.frame of itemlevel imputation properties? If 
npts 
number of points to use in the integration. Default is 61 
theta_lim 
lower and upper limits of the latent trait (theta) to be evaluated, and is
used in conjunction with 
plot 
logical; plot expected scores of items/test where expected scores are computed using focal group thetas and both focal and reference group item parameters 
type 
type of objects to draw in 
par.strip.text 
plotting argument passed to 
par.settings 
plotting argument passed to 
... 
The default DIF = TRUE
produces several effect sizes indices at the item level.
Signed indices allow DIF favoring the focal group at one point on the theta
distribution to cancel DIF favoring the reference group at another point on the theta
distribution. Unsigned indices take the absolute value before summing or averaging,
thus not allowing cancellation of DIF across theta.
Signed Item Difference in the Sample. The average difference in expected scores across the focal sample using both focal and reference group item parameters.
Unsigned Item Difference in the Sample. Same as SIDS except absolute value of expected scores is taken prior to averaging across the sample.
The maximum difference in expected scores in the sample.
Expected Score Standardized Difference. Cohen's D for difference in expected scores.
Signed Item Difference in a Normal distribution. Identical to SIDS but averaged across a normal distribution rather than the sample.
Unsigned Item Difference in a Normal distribution. Identical to UIDS but averaged across a normal distribution rather than the sample.
DIF = FALSE
produces a series of test/bundlelevel indices that are based on itemlevel
indices.
Signed Test Differences in the Sample. The sum of the SIDS across items.
Unsigned Test Differences in the Sample. The sum of the UIDS across items.
Stark's version of STDS using a normal distribution rather than sample estimated thetas.
Unsigned Expected Test Scores Differences in the Sample. The difference in observed summed scale scores expected, on average, across a hypothetical focal group with a normally distributed theta, had DF been uniform in nature for all items
Unsigned Expected Test Score Differences in the Sample. The hypothetical difference in expected scale scores that would have been present if scale-level DF had been uniform across respondents (i.e., always favoring the focal group).
Identical to UETSDS but computed using a normal distribution.
Maximum expected test score differences in the sample.
Expected Test Score Standardized Difference. Cohen's D for expected test scores.
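To make the sample-based item indices concrete, here is a small Python/NumPy sketch of SIDS, UIDS, and the maximum sample difference for a single 2PL item, averaging over supplied focal-group theta estimates. The helper name and the 2PL form are illustrative assumptions, not the package's implementation.

```python
import numpy as np

def item_effect_sizes(theta_focal, a_ref, d_ref, a_foc, d_foc):
    """Sketch of sample-based item effect sizes: expected item scores
    under reference vs. focal parameters, compared across the focal
    group's theta estimates."""
    def p(th, a, d):  # 2PL item response function
        return 1.0 / (1.0 + np.exp(-(a * th + d)))
    diff = p(theta_focal, a_ref, d_ref) - p(theta_focal, a_foc, d_foc)
    SIDS = diff.mean()                     # signed: cancellation possible
    UIDS = np.abs(diff).mean()             # unsigned: absolute value first
    D_max = diff[np.argmax(np.abs(diff))]  # largest difference observed
    return SIDS, UIDS, D_max
```

When DIF is uniform (one group favored at every theta), SIDS and UIDS coincide; when it is non-uniform they diverge, which is the cancellation the signed/unsigned distinction above is designed to expose.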
Adam Meade, with contributions by Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Meade, A. W. (2010). A taxonomy of effect size measures for the differential functioning of items and scales. Journal of Applied Psychology, 95, 728-743.
## Not run:
# no DIF
set.seed(12345)
a <- matrix(abs(rnorm(15,1,.3)), ncol=1)
d <- matrix(rnorm(15,0,.7),ncol=1)
itemtype <- rep('2PL', nrow(a))
N <- 1000
dataset1 <- simdata(a, d, N, itemtype)
dataset2 <- simdata(a, d, N, itemtype, mu = .1, sigma = matrix(1.5))
dat <- rbind(dataset1, dataset2)
# ensure 'Ref' is the first group (and therefore reference group during estimation)
group <- factor(c(rep('Ref', N), rep('Focal', N)), levels = c('Ref', 'Focal'))
mod <- multipleGroup(dat, 1, group = group,
invariance = c(colnames(dat)[1:5], 'free_means', 'free_var'))
coef(mod, simplify=TRUE)
empirical_ES(mod)
empirical_ES(mod, DIF=FALSE)
empirical_ES(mod, DIF=FALSE, focal_items = 10:15)
empirical_ES(mod, plot=TRUE)
empirical_ES(mod, plot=TRUE, DIF=FALSE)
###
# DIF
set.seed(12345)
a1 <- a2 <- matrix(abs(rnorm(15,1,.3)), ncol=1)
d1 <- d2 <- matrix(rnorm(15,0,.7),ncol=1)
a2[10:15,] <- a2[10:15,] + rnorm(6, 0, .3)
d2[10:15,] <- d2[10:15,] + rnorm(6, 0, .3)
itemtype <- rep('dich', nrow(a1))
N <- 1000
dataset1 <- simdata(a1, d1, N, itemtype)
dataset2 <- simdata(a2, d2, N, itemtype, mu = .1, sigma = matrix(1.5))
dat <- rbind(dataset1, dataset2)
group <- factor(c(rep('Ref', N), rep('Focal', N)), levels = c('Ref', 'Focal'))
mod <- multipleGroup(dat, 1, group = group,
invariance = c(colnames(dat)[1:5], 'free_means', 'free_var'))
coef(mod, simplify=TRUE)
empirical_ES(mod)
empirical_ES(mod, DIF = FALSE)
empirical_ES(mod, plot=TRUE)
empirical_ES(mod, plot=TRUE, DIF=FALSE)
## End(Not run)
Given a dataset containing item responses this function will construct empirical graphics using the observed responses to each item conditioned on the total score. When individual item plots are requested then the total score will be formed without the item of interest (i.e., the total score without that item).
empirical_plot(
data,
which.items = NULL,
type = "prop",
smooth = FALSE,
formula = resp ~ s(TS, k = 5),
main = NULL,
par.strip.text = list(cex = 0.7),
par.settings = list(strip.background = list(col = "#9ECAE1"), strip.border = list(col =
"black")),
auto.key = list(space = "right", points = FALSE, lines = TRUE),
...
)
data 
a 
which.items 
a numeric vector indicating which items to plot in a faceted image plot. If NULL then empirical test plots will be constructed instead 
type 
character vector specifying type of plot to draw. When 
smooth 
logical; include a GAM smoother instead of the raw proportions? Default is FALSE 
formula 
formula used for the GAM smoother 
main 
the main title for the plot. If NULL an internal default will be used 
par.strip.text 
plotting argument passed to 
par.settings 
plotting argument passed to 
auto.key 
plotting argument passed to 
... 
additional arguments to be passed to 
Note that these types of plots should only be used for unidimensional tests with monotonically increasing item response functions. If monotonicity is not true for all items, however, then these plots may serve as a visual diagnostic tool so long as the majority of items are indeed monotonic.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
SAT12[SAT12 == 8] <- NA
data <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
# test plot
empirical_plot(data)
empirical_plot(data, type = 'hist')
empirical_plot(data, type = 'hist', breaks=20)
# items 1, 2 and 5
empirical_plot(data, c(1, 2, 5))
empirical_plot(data, c(1, 2, 5), smooth = TRUE)
empirical_plot(data, c(1, 2, 5), type = 'boxplot')
# replace weird looking items with unscored versions for diagnostics
empirical_plot(data, 32)
data[,32] <- SAT12[,32]
empirical_plot(data, 32)
empirical_plot(data, 32, smooth = TRUE)
## End(Not run)
Given secondary latent trait estimates and their associated standard errors
returned from fscores
, compute the empirical reliability.
empirical_rxx(Theta_SE, T_as_X = FALSE)
Theta_SE 
a matrix of latent trait estimates returned from 
T_as_X 
logical; should the observed variance be equal to

Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
dat <- expand.table(deAyala)
itemstats(dat)
mod <- mirt(dat)
theta_se <- fscores(mod, full.scores.SE = TRUE)
empirical_rxx(theta_se)
theta_se <- fscores(mod, full.scores.SE = TRUE, method = 'ML')
empirical_rxx(theta_se)
empirical_rxx(theta_se, T_as_X = TRUE)
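As a rough illustration of the quantity being computed (a sketch with hypothetical values, not the package's exact internals): with the default T_as_X = FALSE, the variance of the point estimates serves as the true-score variance and the average squared standard error as the error variance.

```r
# Hypothetical estimates/SEs in the two-column form returned by fscores()
set.seed(1)
Theta_SE <- cbind(F1 = rnorm(500), SE_F1 = runif(500, .3, .5))
T_var <- var(Theta_SE[, "F1"])        # variance of the trait estimates
E_var <- mean(Theta_SE[, "SE_F1"]^2)  # average error variance
rxx <- T_var / (T_var + E_var)        # empirical reliability, bounded in (0, 1)
```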
## End(Not run)
A function for extracting the empirical estimating functions of a fitted
mirt
, multipleGroup
or bfactor
model. This is the derivative of the log-likelihood with respect to the
parameter vector, evaluated at the observed (casewise) data. In other
words, this function returns the casewise scores, evaluated at the fitted
model parameters. Currently, models fitted via the EM
or BL
method are supported. For the computations, the internal Theta grid of
the model is used, along with its matching normalized density, i.e., the
same grid that was already employed during estimation of the model itself.
estfun.AllModelClass(
x,
weights = extract.mirt(x, "survey.weights"),
centering = FALSE
)
x 
a fitted model object of class 
weights 
by default, the 
centering 
a boolean variable that allows the centering of the casewise scores (i.e., setting their expected values to 0). If the casewise scores were obtained from maximum likelihood estimates, this setting does not affect the result. 
An n x k matrix corresponding to n observations and k parameters
Lennart Schneider [email protected]; centering argument contributed by Rudolf Debelak ([email protected])
## Not run:
# fit a 2PL on the LSAT7 data and get the scores
mod1 <- mirt(expand.table(LSAT7), 1, SE = TRUE, SE.type = "crossprod")
sc1 <- estfun.AllModelClass(mod1)
# get the gradient
colSums(sc1)
# calculate the OPG estimate of the variance-covariance matrix "by hand"
vc1 <- vcov(mod1)
all.equal(crossprod(sc1), chol2inv(chol(vc1)), check.attributes = FALSE)
# fit a multiple group 2PL and do the same as above
group <- rep(c("G1", "G2"), 500)
mod2 <- multipleGroup(expand.table(LSAT7), 1, group, SE = TRUE,
SE.type = "crossprod")
sc2 <- estfun.AllModelClass(mod2)
colSums(sc2)
vc2 <- vcov(mod2)
all.equal(crossprod(sc2), chol2inv(chol(vc2)), check.attributes = FALSE)
# fit a bifactor model with 2 specific factors and do the same as above
mod3 <- bfactor(expand.table(LSAT7), c(2, 2, 1, 1, 2), SE = TRUE,
SE.type = "crossprod")
sc3 <- estfun.AllModelClass(mod3)
colSums(sc3)
vc3 <- vcov(mod3)
all.equal(crossprod(sc3), chol2inv(chol(vc3)), check.attributes = FALSE)
# fit a 2PL not weighting all cases equally
survey.weights <- c(rep(2, sum(LSAT7$freq) / 2), rep(1, sum(LSAT7$freq) / 2))
survey.weights <- survey.weights / sum(survey.weights) * sum(LSAT7$freq)
mod4 <- mirt(expand.table(LSAT7), 1, SE = TRUE, SE.type = "crossprod",
survey.weights = survey.weights)
sc4 <- estfun.AllModelClass(mod4,
weights = extract.mirt(mod4, "survey.weights"))
# get the gradient
colSums(sc4)
# to calculate the OPG estimate of the variance-covariance matrix "by hand",
# the weights must be adjusted by taking their square root
sc4_crp <- estfun.AllModelClass(mod4,
weights = sqrt(extract.mirt(mod4, "survey.weights")))
vc4 <- vcov(mod4)
all.equal(crossprod(sc4_crp), chol2inv(chol(vc4)), check.attributes = FALSE)
## End(Not run)
The expand.table
function expands a summary table of unique response
patterns to a full-sized dataset. By default the response frequencies are
assumed to be in the rightmost column of the input data, though this can be modified.
expand.table(tabdata, freq = colnames(tabdata)[ncol(tabdata)], sample = FALSE)
tabdata 
An object of class 
freq 
either a character vector specifying the column in 
sample 
logical; randomly switch the rows in the expanded table? This does not change the expanded data, only the row locations 
Returns a numeric matrix with all the response patterns.
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
data(LSAT7)
head(LSAT7) # frequency in rightmost column
LSAT7full <- expand.table(LSAT7)
head(LSAT7full)
dim(LSAT7full)
# randomly switch rows in the expanded response table
LSAT7samp <- expand.table(LSAT7, sample = TRUE)
head(LSAT7samp)
colMeans(LSAT7full)
colMeans(LSAT7samp) #equal
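The expansion performed above is equivalent to repeating each unique response pattern according to its frequency; a minimal base-R sketch with a hypothetical two-item table:

```r
# Toy table: two unique patterns with frequencies 2 and 3
tab <- cbind(Item.1 = c(0, 1), Item.2 = c(1, 1), freq = c(2, 3))
# Repeat each row by its frequency, then drop the frequency column
full <- tab[rep(seq_len(nrow(tab)), times = tab[, "freq"]),
            colnames(tab) != "freq", drop = FALSE]
nrow(full)  # 5
```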
#
## Not run:
# Generate data from separate response pattern matrix and freq vector
# The following uses Table 2.1 from de Ayala (2009)
f <- c(691,2280,242,235,158,184,1685,1053,134,462,92,65,571,79,87,41,1682,702,
370,63,626,412,166,52,28,15,2095,1219,500,187,40,3385)
pat < matrix(c(
0, 0, 0, 0, 0,
1, 0, 0, 0, 0,
0, 1, 0, 0, 0,
0, 0, 1, 0, 0,
0, 0, 0, 1, 0,
0, 0, 0, 0, 1,
1, 1, 0, 0, 0,
1, 0, 1, 0, 0,
0, 1, 1, 0, 0,
1, 0, 0, 1, 0,
0, 1, 0, 1, 0,
0, 0, 1, 1, 0,
1, 0, 0, 0, 1,
0, 1, 0, 0, 1,
0, 0, 1, 0, 1,
0, 0, 0, 1, 1,
1, 1, 1, 0, 0,
1, 1, 0, 1, 0,
1, 0, 1, 1, 0,
0, 1, 1, 1, 0,
1, 1, 0, 0, 1,
1, 0, 1, 0, 1,
1, 0, 0, 1, 1,
0, 1, 1, 0, 1,
0, 1, 0, 1, 1,
0, 0, 1, 1, 1,
1, 1, 1, 1, 0,
1, 1, 1, 0, 1,
1, 1, 0, 1, 1,
1, 0, 1, 1, 1,
0, 1, 1, 1, 1,
1, 1, 1, 1, 1), ncol=5, byrow=TRUE)
colnames(pat) <- paste0('Item.', 1:5)
head(pat)
table2.1 <- expand.table(pat, freq = f)
dim(table2.1)
## End(Not run)
Given an internal mirt object extracted from an estimated model, compute the expected value for an item given the ability parameter(s).
expected.item(x, Theta, min = 0)
x 
an extracted internal mirt object containing item information (see 
Theta 
a vector (unidimensional) or matrix (multidimensional) of latent trait values 
min 
a constant value added to the expected values indicating the lowest theoretical category. Default is 0 
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
mod <- mirt(Science, 1)
extr.2 <- extract.item(mod, 2)
Theta <- matrix(seq(-6, 6, length.out=200))
expected <- expected.item(extr.2, Theta, min(Science[,1])) #min() of first item
head(data.frame(expected, Theta=Theta))
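Conceptually, the expected item score at a given Theta is the probability-weighted sum of the category scores, shifted by the `min` constant; a small hand computation with hypothetical category probabilities:

```r
# Hypothetical category probabilities for a 4-category item at one theta value
P <- c(.1, .2, .4, .3)
min_cat <- 1  # lowest observed category (the `min` argument)
# categories scored 0:(ncat-1), then shifted by min_cat
E_item <- sum((seq_along(P) - 1) * P) + min_cat
E_item  # 2.9
```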
Given an estimated model compute the expected test score. Returns the expected values in the same form as the data used to estimate the model.
expected.test(
x,
Theta,
group = NULL,
mins = TRUE,
individual = FALSE,
which.items = NULL,
probs.only = FALSE
)
x 
an estimated mirt object 
Theta 
a matrix of latent trait values (if a vector is supplied, will be coerced to a matrix with one column) 
group 
a number or character signifying which group the item should be extracted from (applies to 'MultipleGroupClass' objects only) 
mins 
logical; include the minimum value constants in the dataset. If FALSE, the expected values for each item are determined from the scoring 0:(ncat-1) 
individual 
logical; return tracelines for individual items? 
which.items 
an integer vector indicating which items to include in the expected test score. Default uses all possible items 
probs.only 
logical; return the probability for each category instead of
traceline score functions? Only useful when 
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
dat <- expand.table(deAyala)
model <- 'F = 1-5
CONSTRAIN = (1-5, a1)'
mod <- mirt(dat, model)
Theta <- matrix(seq(-6, 6, .01))
tscore <- expected.test(mod, Theta)
tail(cbind(Theta, tscore))
# use only first two items (i.e., a bundle)
bscore <- expected.test(mod, Theta, which.items = 1:2)
tail(cbind(Theta, bscore))
# more low-level output (score and probability elements)
expected.test(mod, Theta, individual=TRUE)
expected.test(mod, Theta, individual=TRUE, probs.only=TRUE)
## End(Not run)
Extract a single group from an object defined by multipleGroup
.
extract.group(x, group)
x 
mirt model of class 'MultipleGroupClass' 
group 
the name of the group to extract 
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
set.seed(12345)
a <- matrix(abs(rnorm(15,1,.3)), ncol=1)
d <- matrix(rnorm(15,0,.7),ncol=1)
itemtype <- rep('2PL', nrow(a))
N <- 1000
dataset1 <- simdata(a, d, N, itemtype)
dataset2 <- simdata(a, d, N, itemtype, mu = .1, sigma = matrix(1.5))
dat <- rbind(dataset1, dataset2)
group <- c(rep('D1', N), rep('D2', N))
models <- 'F1 = 1-15'
mod_configural <- multipleGroup(dat, models, group = group)
group.1 <- extract.group(mod_configural, 'D1') #extract first group
summary(group.1)
plot(group.1)
## End(Not run)
Extract the internal mirt objects from any estimated model.
extract.item(x, item, group = NULL, drop.zeros = FALSE)
x 
mirt model of class 'SingleGroupClass' or 'MultipleGroupClass' 
item 
a number or character signifying which item to extract 
group 
a number signifying which group the item should be extracted from (applies to 'MultipleGroupClass' only) 
drop.zeros 
logical; drop slope values that are numerically close to zero to reduce
dimensionality? Useful in objects returned from 
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
mod <- mirt(Science, 1)
extr.1 <- extract.item(mod, 1)
## End(Not run)
A generic function to extract the internal objects from estimated models.
extract.mirt(x, what)
x 
mirt model of class 'SingleGroupClass', 'MultipleGroupClass', 'MixedClass' or 'DiscreteGroupClass' 
what 
a string indicating what to extract 
Objects which can be extracted from mirt objects include:
observed log-likelihood
log term contributed by prior parameter distributions
goodness of fit statistic
degrees of freedom
pvalue for G2 statistic
root mean-square error of approximation based on G2
CFI fit statistic
TLI fit statistic
AIC
BIC
sample size adjusted BIC
HQ
unrotated standardized loadings matrix
factor communality estimates
EM log-likelihood history
a tabular version of the raw response data input. Frequencies are stored
in freq
frequencies associated with tabdata
an integer vector indicating the number of unique elements for each item
an integer vector indicating the lowest category found in the input data
input model syntax
estimation method used
a vector of item types for each respective item (e.g., 'graded', '2PL', etc)
a vector of item names from the input data
a vector of factor names from the model definition
an integer vector indicating all valid row numbers used in the model estimation
(when all cases are used this will be 1:nrow(data)
)
raw input data of item responses
raw input data of data used as covariates
similar to tabdata
, however the responses have been transformed into
dummy coded variables
analogous to tabdatafull
, but for the raw input data instead of the
tabulated frequencies
if saved, extract the EM iteration history
expected probability of the unique response patterns
if supplied, the vector of survey weights used during estimation (NULL if missing)
a logical value indicating whether the model terminated within the convergence criteria
number of iterations it took to reach the convergence criteria
number of freely estimated parameters
vector containing uniquely estimated parameters
parameter covariance matrix (associated with parvec)
the condition number of the Hessian (if computed). Otherwise NA
a list of item parameter constraints to indicate which item parameters were equal during estimation
prior density distribution for the latent traits
posterior distribution for latent traits when using EM algorithm
if supplied, the data scoring key
number of latent traits/factors
number of items
number of groups
character vector of unique group names
a character vector indicating the group membership
a character vector indicating invariance
input from multipleGroup
a logical indicating whether the model passed the second-order test based on the Hessian matrix. Indicates whether model is a potential local maximum solution
logical; check whether the supplemented EM information matrix converged. Will be NA
if not applicable
estimation time, broken into different sections
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
extract.group
, extract.item
, mod2values
## Not run:
mod <- mirt(Science, 1)
extract.mirt(mod, 'logLik')
extract.mirt(mod, 'F')
#multiple group model
grp <- rep(c('G1', 'G2'), each = nrow(Science)/2)
mod2 <- multipleGroup(Science, 1, grp)
grp1 <- extract.group(mod2, 1) #extract single group model
extract.mirt(mod2, 'parvec')
extract.mirt(grp1, 'parvec')
## End(Not run)
Implements the set of fixed-item calibration methods described by Kim (2006). The initial
calibrated model must be fitted via mirt
; the approach is currently limited to
unidimensional models only, and should only be utilized when the new set of responses
is obtained from a population with similar distributional characteristics in the latent traits.
For more flexible calibration of items, including a fixed-item calibration variant involving
anchor items for equating, see multipleGroup
.
fixedCalib(
data,
model = 1,
old_mod,
PAU = "MWU",
NEMC = "MEM",
technical = list(),
...
)
data 
new data to be used for calibration. Note that to be consistent
with the 
model 
type of model to fit for the complete dataset (note that for the fixed items
in 
old_mod 
a model of class SingleGroupClass fitted using 
PAU 
prior ability update (PAU) approach. Supports none ( 
NEMC 
number of EM cycles (NEMC) to use for the tobeestimated parameters.
Supports one ( 
technical 
list of technical estimation arguments
(see 
... 
additional arguments to pass to 
Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43(4), 355-381.
## Not run:
# single factor
set.seed(12345)
J <- 50
a <- matrix(abs(rnorm(J,1,.3)), ncol=1)
d <- matrix(rnorm(J,0,.7),ncol=1)
itemtype <- rep('2PL', nrow(a))
# calibration data theta ~ N(0,1)
N <- 3000
dataset1 <- simdata(a, d, N = N, itemtype=itemtype)
# new data (again, theta ~ N(0,1))
dataset2 <- simdata(a, d, N = 1000, itemtype=itemtype)
# last 40% of experimental items not given to calibration group
# (unobserved; hence removed)
dataset1 <- dataset1[,-c(J:(J*.6))]
head(dataset1)
#
# calibrated model from dataset1 only
mod <- mirt(dataset1, model = 1)
coef(mod, simplify=TRUE)
# No Prior Weights Updating and One EM Cycle (NWU-OEM)
NWU_OEM <- fixedCalib(dataset2, model=1, old_mod=mod, PAU='NWU', NEMC='OEM')
coef(NWU_OEM, simplify=TRUE)
data.frame(coef(NWU_OEM, simplify=TRUE)$items[,c('a1','d')],
pop_a1=a, pop_d=d)
plot(NWU_OEM, type = 'empiricalhist')
# No Prior Weights Updating and Multiple EM Cycles (NWU-MEM)
NWU_MEM <- fixedCalib(dataset2, model = 1, old_mod = mod, PAU = 'NWU')
coef(NWU_MEM, simplify=TRUE)
data.frame(coef(NWU_MEM, simplify=TRUE)$items[,c('a1','d')],
pop_a1=a, pop_d=d)
plot(NWU_MEM, type = 'empiricalhist')
# One Prior Weights Updating and One EM Cycle (OWU-OEM)
OWU_OEM <- fixedCalib(dataset2, model=1, old_mod=mod, PAU='OWU', NEMC="OEM")
coef(OWU_OEM, simplify=TRUE)
data.frame(coef(OWU_OEM, simplify=TRUE)$items[,c('a1','d')], pop_a1=a, pop_d=d)
plot(OWU_OEM, type = 'empiricalhist')
# One Prior Weights Updating and Multiple EM Cycles (OWU-MEM)
OWU_MEM <- fixedCalib(dataset2, model = 1, old_mod = mod, PAU = 'OWU')
coef(OWU_MEM, simplify=TRUE)
data.frame(coef(OWU_MEM, simplify=TRUE)$items[,c('a1','d')],
pop_a1=a, pop_d=d)
plot(OWU_MEM, type = 'empiricalhist')
# Multiple Prior Weights Updating and Multiple EM Cycles (MWU-MEM)
MWU_MEM <- fixedCalib(dataset2, model = 1, old_mod = mod)
coef(MWU_MEM, simplify=TRUE)
data.frame(coef(MWU_MEM, simplify=TRUE)$items[,c('a1','d')],
pop_a1=a, pop_d=d)
plot(MWU_MEM, type = 'empiricalhist')
# factor scores distribution check
fs <- fscores(MWU_MEM)
hist(fs)
c(mean_calib=mean(fs[1:N, ]), sd_calib=sd(fs[1:N, ]))
c(mean_exper=mean(fs[-c(1:N), ]), sd_exper=sd(fs[-c(1:N), ]))
############################
## Item length constraint example for each participant in the experimental
## items group. In this example, all participants were forced to have a test
## length of J=30, though the item pool had J=50 total items.
# new experimental data (relatively extreme, theta ~ N(.5,1.5))
dataset2 <- simdata(a, d, N = 1000, itemtype=itemtype,
mu=.5, sigma=matrix(1.5))
# Add missing values to each participant in new dataset where individuals
# were randomly administered 10 experimental items, subject to the constraint
# that each participant received a test with J=30 items.
dataset2 <- t(apply(dataset2, 1, function(x){
NA_precalib <- sample(1:30, 10)
NA_experimental <- sample(31:50, 10)
x[c(NA_precalib, NA_experimental)] <- NA
x
}))
head(dataset2)
# check that all individuals had 30 items
all(rowSums(!is.na(dataset2)) == 30)
# Multiple Prior Weights Updating and Multiple EM Cycles (MWU-MEM)
MWU_MEM <- fixedCalib(dataset2, model = 1, old_mod = mod)
coef(MWU_MEM, simplify=TRUE)
data.frame(coef(MWU_MEM, simplify=TRUE)$items[,c('a1','d')],
pop_a1=a, pop_d=d)
plot(MWU_MEM, type = 'empiricalhist')
## factor scores check
fs <- fscores(MWU_MEM)
hist(fs)
c(mean_calib=mean(fs[1:N, ]), sd_calib=sd(fs[1:N, ]))
## shrinkage, but generally different from calibrated sample
c(mean_exper=mean(fs[-c(1:N), ]), sd_exper=sd(fs[-c(1:N), ]))
## End(Not run)
Create expected values for fixed effects parameters in latent regression models.
fixef(x)
x 
an estimated model object from the 
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Chalmers, R. P. (2015). Extended Mixed-Effects Item Response Models with the MH-RM Algorithm. Journal of Educational Measurement, 52, 200-222. doi:10.1111/jedm.12072
## Not run:
#simulate data
set.seed(1234)
N <- 1000
# covariates
X1 <- rnorm(N); X2 <- rnorm(N)
covdata <- data.frame(X1, X2)
Theta <- matrix(0.5 * X1 + 1 * X2 + rnorm(N, sd = 0.5))
# items and response data
a <- matrix(1, 20); d <- matrix(rnorm(20))
dat <- simdata(a, d, 1000, itemtype = '2PL', Theta=Theta)
#conditional model using X1 and X2 as predictors of Theta
mod1 <- mirt(dat, 1, 'Rasch', covdata=covdata, formula = ~ X1 + X2)
#latent regression fixed effects (i.e., expected values)
fe <- fixef(mod1)
head(fe)
# with mixedmirt()
mod1b <- mixedmirt(dat, covdata, 1, lr.fixed = ~ X1 + X2, fixed = ~ 0 + items)
fe2 <- fixef(mod1b)
head(fe2)
## End(Not run)
Computes MAP, EAP, ML (Embretson & Reise, 2000), EAP for sumscores (Thissen et al., 1995),
or WLE (Warm, 1989) factor scores with a multivariate normal
prior distribution using equally spaced quadrature. EAP scores for models with more than
three factors are generally not recommended since the integration grid becomes very large,
resulting in slower estimation and less precision if the quadpts
are too low.
Therefore, MAP scores should be used instead of EAP scores for higher dimensional models.
Multiple imputation variants are possible for each estimator if a parameter
information matrix was computed, which are useful if the sample size/number of items were small.
As well, if the model contained latent regression predictors this information will
be used in computing MAP and EAP estimates (for these models, full.scores=TRUE
will always be used). Finally, plausible value imputation is also available, and will also account
for latent regression predictor effects.
fscores(
object,
method = "EAP",
full.scores = TRUE,
rotate = "oblimin",
Target = NULL,
response.pattern = NULL,
append_response.pattern = FALSE,
na.rm = FALSE,
plausible.draws = 0,
plausible.type = "normal",
quadpts = NULL,
item_weights = rep(1, extract.mirt(object, "nitems")),
returnER = FALSE,
T_as_X = FALSE,
return.acov = FALSE,
mean = NULL,
cov = NULL,
covdata = NULL,
verbose = TRUE,
full.scores.SE = FALSE,
theta_lim = c(-6, 6),
MI = 0,
use_dentype_estimate = FALSE,
QMC = FALSE,
custom_den = NULL,
custom_theta = NULL,
min_expected = 1,
max_theta = 20,
start = NULL,
...
)
object 
a computed model object of class 
method 
type of factor score estimation method. Can be:

full.scores 
if 
rotate 
prior rotation to be used when estimating the factor scores. See

Target 
target rotation; see 
response.pattern 
an optional argument used to calculate the factor scores and standard errors for a given response vector or matrix/data.frame 
append_response.pattern 
logical; should the inputs from 
na.rm 
logical; remove rows with any missing values? This is generally not required due to the nature of computing factor scores, however for the "EAPsum" method this may be necessary to ensure that the sum scores correspond to the same composite score 
plausible.draws 
number of plausible values to draw for future researchers
to perform secondary analyses of the latent trait scores. Typically used in conjunction
with latent regression predictors (see 
plausible.type 
type of plausible values to obtain. Can be either 
quadpts 
number of quadrature points to use per dimension. If not specified, a suitable
one will be created which decreases as the number of dimensions increases
(and therefore for estimates such as EAP, will be less accurate). This is determined from
the switch statement

item_weights 
a userdefined weight vector used in the likelihood expressions to add more/less weight for a given observed response. Default is a vector of 1's, indicating that all the items receive the same weight 
returnER 
logical; return empirical reliability (also known as marginal reliability) estimates as numeric values? 
T_as_X 
logical; should the observed variance be equal to

return.acov 
logical; return a list containing covariance matrices instead of factors
scores? 
mean 
a vector for custom latent variable means. If NULL, the default for 'group' values from the computed mirt object will be used 
cov 
a custom matrix of the latent variable covariance matrix. If NULL, the default for 'group' values from the computed mirt object will be used 
covdata 
when latent regression model has been fitted, and the 
verbose 
logical; print verbose output messages? 
full.scores.SE 
logical; when 
theta_lim 
lower and upper range to evaluate latent trait integral for each dimension. If omitted, a range will be generated automatically based on the number of dimensions 
MI 
a number indicating how many multiple imputation draws to perform. Default is 0, indicating that no MI draws will be performed 
use_dentype_estimate 
logical; if the density of the latent trait was estimated in the model (e.g., via Davidian curves or empirical histograms), should this information be used to compute the latent trait estimates? Only applicable for EAPbased estimates (EAP, EAPsum, and plausible) 
QMC 
logical; use quasi-Monte Carlo integration? If 
custom_den 
a function used to define the integration density (if required). The NULL default assumes that the multivariate normal distribution with the 'GroupPars' hyperparameters are used. At the minimum must be of the form:
where Theta is a matrix of latent trait values (will be a grid of values
if 
custom_theta 
a matrix of custom integration nodes to use instead of the default, where each column corresponds to the respective dimension in the model 
min_expected 
when computing goodness of fit tests when 
max_theta 
the maximum/minimum value any given factor score estimate will achieve using any modal estimator method (e.g., MAP, WLE, ML) 
start 
a matrix of starting values to use for iterative estimation methods. Default
will start at a vector of 0's for each response pattern, or will start at the EAP
estimates (unidimensional models only). Must be in the form that matches

... 
additional arguments to be passed to 
The function will return either a table with the computed scores and standard errors,
the original data matrix with scores appended to the rightmost column, or the scores only. By
default the latent means and covariances are determined from the estimated object,
though these can be overwritten. Iterative estimation methods can be run
in parallel to decrease estimation times if a mirtCluster
object is available.
If the input object is a discrete latent class object estimated from mdirt
then the returned results will be with respect to the posterior classification for each
individual. The method inputs for 'DiscreteClass'
objects may only be 'EAP'
,
for posterior classification of each response pattern, or 'EAPsum'
for posterior
classification based on the raw sumscore. For more information on these algorithms refer to
the mirtCAT
package and the associated JSS paper (Chalmers, 2016).
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Chalmers, R. P. (2016). Generating Adaptive and Non-Adaptive Test Interfaces for Multidimensional Item Response Theory Applications. Journal of Statistical Software, 71(5), 1-39. doi:10.18637/jss.v071.i05
Embretson, S. E. & Reise, S. P. (2000). Item Response Theory for Psychologists. Erlbaum.
Thissen, D., Pommerich, M., Billeaud, K., & Williams, V. S. L. (1995). Item Response Theory for Scores on Tests Including Polytomous Items with Ordered Responses. Applied Psychological Measurement, 19, 39-49.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450.
mod <- mirt(Science, 1)
tabscores <- fscores(mod, full.scores = FALSE)
head(tabscores)
# convert scores into expected total score information with 95% CIs
E.total <- expected.test(mod, Theta=tabscores[,'F1'])
E.total_2.5 <- expected.test(mod, Theta=tabscores[,'F1'] +
tabscores[,'SE_F1'] * qnorm(.05/2))
E.total_97.5 <- expected.test(mod, Theta=tabscores[,'F1'] +
tabscores[,'SE_F1'] * qnorm(1 - .05/2))
data.frame(Total_score=rowSums(tabscores[,1:4]),
E.total, E.total_2.5, E.total_97.5) |> head()
## Not run:
fullscores <- fscores(mod)
fullscores_with_SE <- fscores(mod, full.scores.SE=TRUE)
head(fullscores)
head(fullscores_with_SE)
# convert scores into expected total score information with 95% CIs
E.total <- expected.test(mod, Theta=fullscores[,'F1'])
E.total_2.5 <- expected.test(mod, Theta=fullscores_with_SE[,'F1'] +
fullscores_with_SE[,'SE_F1'] * qnorm(.05/2))
E.total_97.5 <- expected.test(mod, Theta=fullscores_with_SE[,'F1'] +
fullscores_with_SE[,'SE_F1'] * qnorm(1 - .05/2))
data.frame(Total_score=rowSums(Science),
E.total, E.total_2.5, E.total_97.5) |> head()
# change method argument to use MAP estimates
fullscores <- fscores(mod, method='MAP')
head(fullscores)
# calculate MAP for a given response vector
fscores(mod, method='MAP', response.pattern = c(1,2,3,4))
# or matrix
fscores(mod, method='MAP', response.pattern = rbind(c(1,2,3,4), c(2,2,1,3)))
# return only the scores and their SEs
fscores(mod, method='MAP', response.pattern = c(1,2,3,4))
# use custom latent variable properties (diffuse prior for MAP is very close to ML)
fscores(mod, method='MAP', cov = matrix(1000), full.scores = FALSE)
fscores(mod, method='ML', full.scores = FALSE)
# EAPsum table of values based on total scores
(fs <- fscores(mod, method = 'EAPsum', full.scores = FALSE))
# convert expected counts back into marginal probability distribution
within(fs,
`P(y)` <- expected / sum(observed))
# list of error VCOV matrices for EAPsum (works for other estimators as well)
acovs <- fscores(mod, method = 'EAPsum', full.scores = FALSE, return.acov = TRUE)
acovs
# WLE estimation, run in parallel using available cores
if(interactive()) mirtCluster()
head(fscores(mod, method='WLE', full.scores = FALSE))
# multiple imputation using 30 draws for EAP scores. Requires information matrix
mod <- mirt(Science, 1, SE=TRUE)
fs <- fscores(mod, MI = 30)
head(fs)
# plausible values for future work
pv <- fscores(mod, plausible.draws = 5)
lapply(pv, function(x) c(mean=mean(x), var=var(x), min=min(x), max=max(x)))
## define a custom_den function. EAP with a uniform prior between -3 and 3
fun <- function(Theta, ...) as.numeric(dunif(Theta, min = -3, max = 3))
head(fscores(mod, custom_den = fun))
# custom MAP prior: standard truncated normal between -2 and 5
library(msm)
# need the :: scope for parallel to see the function (not required if no mirtCluster() defined)
fun <- function(Theta, ...) msm::dtnorm(Theta, mean = 0, sd = 1, lower = -2, upper = 5)
head(fscores(mod, custom_den = fun, method = 'MAP', full.scores = FALSE))
####################
# scoring via response.pattern input (with latent regression structure)
# simulate data
set.seed(1234)
N <- 1000
# covariates
X1 <- rnorm(N); X2 <- rnorm(N)
covdata <- data.frame(X1, X2)
Theta <- matrix(0.5 * X1 + 1 * X2 + rnorm(N, sd = 0.5))
# items and response data
a <- matrix(1, 20); d <- matrix(rnorm(20))
dat <- simdata(a, d, 1000, itemtype = '2PL', Theta=Theta)
# conditional model using X1 and X2 as predictors of Theta
mod <- mirt(dat, 1, 'Rasch', covdata=covdata, formula = ~ X1 + X2)
coef(mod, simplify=TRUE)
# all EAP estimates that include latent regression information
fs <- fscores(mod, full.scores.SE=TRUE)
head(fs)
# score only two response patterns
rp <- dat[1:2, ]
cd <- covdata[1:2, ]
fscores(mod, response.pattern=rp, covdata=cd)
fscores(mod, response.pattern=rp[2,], covdata=cd[2,]) # just one pattern
## End(Not run)
This function provides the four generalized item difficulty representations for polytomous response models described by Ali, Chang, and Anderson (2015). These estimates are used to gauge how difficult a polytomous item may be.
gen.difficulty(mod, type = "IRF", interval = c(-30, 30), ...)
mod 
a single factor model estimated by 
type 
type of generalized difficulty parameter to report.
Can be 
interval 
interval range to search for 
... 
additional arguments to pass to 
Phil Chalmers [email protected]
Ali, U. S., Chang, H.-H., & Anderson, C. J. (2015). Location indices for ordinal polytomous items based on item response theory (Research Report No. RR-15-20). Princeton, NJ: Educational Testing Service. http://dx.doi.org/10.1002/ets2.12065
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
mod <- mirt(Science, 1)
coef(mod, simplify=TRUE, IRTpars = TRUE)$items
gen.difficulty(mod)
gen.difficulty(mod, type = 'mean')
# also works for dichotomous items (though this is unnecessary)
dat <- expand.table(LSAT7)
mod <- mirt(dat, 1)
coef(mod, simplify=TRUE, IRTpars = TRUE)$items
gen.difficulty(mod)
gen.difficulty(mod, type = 'mean')
## End(Not run)
Given an estimated model from any of mirt's model fitting functions and an estimate of the
latent trait, impute plausible missing data values. Returns the original data in a
data.frame
without any NA values. If a list of Theta
values is supplied then a
list of complete datasets is returned instead.
imputeMissing(x, Theta, warn = TRUE, ...)
x 
an estimated model x from the mirt package 
Theta 
a matrix containing the estimates of the latent trait scores
(e.g., via 
warn 
logical; print warning messages? 
... 
additional arguments to pass 
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
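Conceptually, the imputation draws a missing response from the model-implied probability at the person's estimated trait value. The base-R sketch below illustrates this for a single dichotomous response with toy 2PL parameters (the values and the helper logic are hypothetical, not mirt internals):

```r
# Toy sketch of model-based imputation for one dichotomous response
# (hypothetical 2PL parameters; not the package's internal routine)
set.seed(1)
theta <- 0.3               # person's latent trait estimate (e.g., from fscores)
a <- 1.2; d <- -0.5        # toy slope and intercept
P <- plogis(a * theta + d) # model-implied probability of endorsement
imputed <- rbinom(1, 1, P) # plausible value for the missing response
imputed
```

Because the imputed value is a random draw, repeating the imputation and averaging results over several completed datasets (as the example below suggests) reduces the Monte Carlo error.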
## Not run:
dat <- expand.table(LSAT7)
(original <- mirt(dat, 1))
NAperson <- sample(1:nrow(dat), 20, replace = TRUE)
NAitem <- sample(1:ncol(dat), 20, replace = TRUE)
for(i in 1:20)
  dat[NAperson[i], NAitem[i]] <- NA
(mod <- mirt(dat, 1))
scores <- fscores(mod, method = 'MAP')
# re-estimate imputed dataset (good to do this multiple times and average over)
fulldata <- imputeMissing(mod, scores)
(fullmod <- mirt(fulldata, 1))
# with multipleGroup
set.seed(1)
group <- sample(c('group1', 'group2'), 1000, TRUE)
mod2 <- multipleGroup(dat, 1, group, TOL=1e-2)
fs <- fscores(mod2)
fulldata2 <- imputeMissing(mod2, fs)
## End(Not run)
Computes item-fit statistics for a variety of unidimensional and multidimensional models. Poorly fitting items should be inspected with the empirical plots/tables for unidimensional models; otherwise, itemGAM can be used to diagnose where the functional form of the IRT model was misspecified, or models can be refit using more flexible semi-parametric response models (e.g., itemtype = 'spline'). If the latent trait density was approximated (e.g., Davidian curves, empirical histograms, etc.) then passing use_dentype_estimate = TRUE will use the internally saved quadrature and density components (where applicable). Currently, only the S-X2 statistic is supported for mixture IRT models. Finally, where applicable the root mean-square error of approximation (RMSEA) is reported to help gauge the magnitude of item misfit.
itemfit(
x,
fit_stats = "S_X2",
which.items = 1:extract.mirt(x, "nitems"),
na.rm = FALSE,
p.adjust = "none",
group.bins = 10,
group.size = NA,
group.fun = mean,
mincell = 1,
mincell.X2 = 2,
return.tables = FALSE,
pv_draws = 30,
boot = 1000,
boot_dfapprox = 200,
S_X2.plot = NULL,
S_X2.plot_raw.score = TRUE,
ETrange = c(-2, 2),
ETpoints = 11,
empirical.plot = NULL,
empirical.CI = 0.95,
empirical.poly.collapse = FALSE,
method = "EAP",
Theta = NULL,
par.strip.text = list(cex = 0.7),
par.settings = list(strip.background = list(col = "#9ECAE1"), strip.border = list(col =
"black")),
auto.key = list(space = "right", points = FALSE, lines = TRUE),
...
)
x 
a computed model object of class SingleGroupClass or MultipleGroupClass
fit_stats 
a character vector indicating which fit statistics should be computed. Supported inputs are 'S_X2', 'Zh', 'X2', 'G2', 'PV_Q1', 'PV_Q1*', 'X2*', 'X2*_df', and 'infit'.
Note that 'S_X2' and 'Zh' cannot be computed when there are missing response data (i.e., these will require multiple-imputation/row-removal techniques). 
which.items 
an integer vector indicating which items to test for fit. Default tests all possible items 
na.rm 
logical; remove rows with any missing values? This is required for methods such
as S-X2 because they require the "EAPsum" method from fscores
p.adjust 
method to use for adjusting all p-values for each respective item fit
statistic (see p.adjust for available methods)
group.bins 
the number of bins to use for X2 and G2. For example,
setting group.bins = 10 will partition the latent trait scale into 10 bins
group.size 
approximate size of each group to be used in calculating the X2 statistic
group.fun 
function used when group.bins is used; determines the central tendency measure of the ability estimates within each bin
mincell 
the minimum expected cell size to be used in the S-X2 computations. Tables will be collapsed across items first if polytomous, and then across scores if necessary 
mincell.X2 
the minimum expected cell size to be used in the X2 computations. Tables will be collapsed if polytomous, however if this condition cannot be met then the group block will be omitted in the computations 
return.tables 
logical; return tables when investigating 'X2' and 'S_X2'? 
pv_draws 
number of plausible-value draws to obtain for PV_Q1 and PV_Q1* 
boot 
number of parametric bootstrap samples to create for PV_Q1* and X2* 
boot_dfapprox 
number of parametric bootstrap samples to create for the X2*_df statistic to approximate the scaling factor for X2* as well as the scaled degrees of freedom estimates 
S_X2.plot 
argument input is the same as empirical.plot, however the resulting image is constructed according to the S-X2 statistic's conditional sum-score information 
S_X2.plot_raw.score 
logical; use the raw-score information in the plot instead of the latent
trait scale score? Default is TRUE
ETrange 
range of integration nodes for Stone's X2* statistic 
ETpoints 
number of integration nodes to use for Stone's X2* statistic 
empirical.plot 
a single numeric value or character of the item name indicating which
item to plot (via itemplot)
empirical.CI 
a numeric value indicating the width of the empirical confidence interval
ranging between 0 and 1 (default of 0 plots no interval). For example, a 95%
interval would be plotted when empirical.CI = .95 
empirical.poly.collapse 
logical; collapse polytomous item categories to form expected scoring
functions for empirical plots? Default is FALSE
method 
type of factor score estimation method. See fscores for more options 
Theta 
a matrix of factor scores for each person used for statistics that require
empirical estimates. If supplied, arguments typically passed to fscores() are ignored and these values are used instead 
par.strip.text 
plotting argument passed to lattice
par.settings 
plotting argument passed to lattice
auto.key 
plotting argument passed to lattice
... 
additional arguments to be passed to internal functions
Phil Chalmers [email protected]
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29-51.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Chalmers, R. P. & Ng, V. (2017). Plausible-Value Imputation Statistics for Detecting Item Misfit. Applied Psychological Measurement, 41, 372-387. doi:10.1177/0146621617692079
Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38, 67-86.
Kang, T. & Chen, T. T. (2007). An investigation of the performance of the generalized S-X2 item-fit index for polytomous IRT models. ACT Research Report Series.
McKinley, R., & Mills, C. (1985). A comparison of several goodness-of-fit statistics. Applied Psychological Measurement, 9, 49-57.
Orlando, M. & Thissen, D. (2000). Likelihood-based item fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24, 50-64.
Reise, S. P. (1990). A comparison of item- and person-fit methods of assessing model-data fit in IRT. Applied Psychological Measurement, 14, 127-137.
Stone, C. A. (2000). Monte Carlo Based Null Distribution for an Alternative Goodness-of-Fit Test Statistics in IRT Models. Journal of Educational Measurement, 37, 58-75.
Wright, B. D. & Masters, G. N. (1982). Rating Scale Analysis. MESA Press.
Yen, W. M. (1981). Using simulation results to choose a latent trait model. Applied Psychological Measurement, 5, 245-262.
## Not run:
P <- function(Theta){exp(Theta^2 * 1.2 - 1) / (1 + exp(Theta^2 * 1.2 - 1))}
#make some data
set.seed(1234)
a <- matrix(rlnorm(20, meanlog=0, sdlog = .1), ncol=1)
d <- matrix(rnorm(20), ncol=1)
Theta <- matrix(rnorm(2000))
items <- rep('2PL', 20)
ps <- P(Theta)
baditem <- numeric(2000)
for(i in 1:2000)
  baditem[i] <- sample(c(0,1), 1, prob = c(1 - ps[i], ps[i]))
data <- cbind(simdata(a, d, 2000, items, Theta=Theta), baditem=baditem)
x <- mirt(data, 1)
raschfit <- mirt(data, 1, itemtype='Rasch')
fit <- itemfit(x)
fit
# pvalue adjustment
itemfit(x, p.adjust='fdr')
# two different fit stats (with/without pvalue adjustment)
itemfit(x, c('S_X2', 'X2'), p.adjust='fdr')
itemfit(x, c('S_X2', 'X2'))
# Conditional sum-score plot from S-X2 information
itemfit(x, S_X2.plot = 1) # good fit
itemfit(x, S_X2.plot = 2) # good fit
itemfit(x, S_X2.plot = 21) # bad fit
itemfit(x, 'X2') # just X2
itemfit(x, 'X2', method = 'ML') # X2 with maximum-likelihood estimates for traits
itemfit(x, group.bins=15, empirical.plot = 1, method = 'ML') #empirical item plot with 15 points
itemfit(x, group.bins=15, empirical.plot = 21, method = 'ML')
# PV and X2* statistics (parametric bootstrap stats not run to save time)
itemfit(x, 'PV_Q1')
if(interactive()) mirtCluster() # improve speed of bootstrap samples by running in parallel
# itemfit(x, 'PV_Q1*')
# itemfit(x, 'X2*') # Stone's 1993 statistic
# itemfit(x, 'X2*_df') # Stone's 2000 scaled statistic with df estimate
# empirical tables for X2 statistic
tabs <- itemfit(x, 'X2', return.tables=TRUE, which.items = 1)
tabs
#infit/outfit statistics. method='ML' agrees better with eRm package
itemfit(raschfit, 'infit', method = 'ML') #infit and outfit stats
#same as above, but inputting ML estimates instead (saves time for reuse)
Theta <- fscores(raschfit, method = 'ML')
itemfit(raschfit, 'infit', Theta=Theta)
itemfit(raschfit, empirical.plot=1, Theta=Theta)
itemfit(raschfit, 'X2', return.tables=TRUE, Theta=Theta, which.items=1)
# fit a new more flexible model for the misfitting item
itemtype <- c(rep('2PL', 20), 'spline')
x2 <- mirt(data, 1, itemtype=itemtype)
itemfit(x2)
itemplot(x2, 21)
anova(x, x2)
#
# similar example to Kang and Chen (2007)
a <- matrix(c(.8, .4, .7, .8, .4, .7, 1, 1, 1, 1))
d <- matrix(rep(c(2.0, 0.0, -1, -1.5), 10), ncol=4, byrow=TRUE)
dat <- simdata(a, d, 2000, itemtype = rep('graded', 10))
head(dat)
mod <- mirt(dat, 1)
itemfit(mod)
itemfit(mod, 'X2') # less useful given inflated Type I error rates
itemfit(mod, empirical.plot = 1)
itemfit(mod, empirical.plot = 1, empirical.poly.collapse=TRUE)
# collapsed tables (see mincell.X2) for X2 and G2
itemfit(mod, 'X2', return.tables = TRUE, which.items = 1)
mod2 <- mirt(dat, 1, 'Rasch')
itemfit(mod2, 'infit', method = 'ML')
# massive list of tables for S-X2
tables <- itemfit(mod, return.tables = TRUE)
#observed and expected total score patterns for item 1 (post collapsing)
tables$O[[1]]
tables$E[[1]]
# can also select specific items
# itemfit(mod, return.tables = TRUE, which.items=1)
# fit stats with missing data (run in parallel using all cores)
dat[sample(1:prod(dim(dat)), 100)] <- NA
raschfit <- mirt(dat, 1, itemtype='Rasch')
# use only valid data by removing rows with missing terms
itemfit(raschfit, c('S_X2', 'infit'), na.rm = TRUE)
# note that X2, G2, PV-Q1, and X2* do not require complete datasets
thetas <- fscores(raschfit, method = 'ML') # save for faster computations
itemfit(raschfit, c('X2', 'G2'), Theta=thetas)
itemfit(raschfit, empirical.plot=1, Theta=thetas)
itemfit(raschfit, 'X2', return.tables=TRUE, which.items=1, Theta=thetas)
## End(Not run)
This function uses a generalized additive model (GAM) to estimate response curves for items that do not seem to fit well in a given model. Using a stable auxiliary model, traceline functions for poorly fitting dichotomous or polytomous items can be inspected using point estimates (or plausible values) of the latent trait. Plots of the tracelines and their associated standard errors are available to help interpret the misfit. This function may also be useful when adding new items to an existing, well-established set of items, especially when the parametric form of the items under investigation is unknown.
itemGAM(
item,
Theta,
formula = resp ~ s(Theta, k = 10),
CI = 0.95,
theta_lim = c(-3, 3),
return.models = FALSE,
...
)
## S3 method for class 'itemGAM'
plot(
x,
y = NULL,
par.strip.text = list(cex = 0.7),
par.settings = list(strip.background = list(col = "#9ECAE1"), strip.border = list(col =
"black")),
auto.key = list(space = "right", points = FALSE, lines = TRUE),
...
)
item 
a single poorly fitting item to be investigated. Can be a vector or matrix 
Theta 
a list or matrix of latent trait estimates typically returned from fscores
formula 
an R formula to be passed to the gam function
CI 
a number ranging from 0 to 1 indicating the confidence interval range. Default provides the 95 percent interval 
theta_lim 
range of latent trait scores to be evaluated 
return.models 
logical; return a list of GAM models for each category? Useful when the GAMs should be inspected directly, but also when fitting multidimensional models (this is set to TRUE automatically for multidimensional models) 
... 
additional arguments to be passed to gam
x 
an object of class 'itemGAM' 
y 
a NULL value ignored by the plotting function
par.strip.text 
plotting argument passed to lattice
par.settings 
plotting argument passed to lattice
auto.key 
plotting argument passed to lattice
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 129. doi:10.18637/jss.v048.i06
## Not run:
set.seed(10)
N <- 1000
J <- 30
a <- matrix(1, J)
d <- matrix(rnorm(J))
Theta <- matrix(rnorm(N, 0, 1.5))
dat <- simdata(a, d, N, itemtype = '2PL', Theta=Theta)
# make a bad item
ps <- exp(Theta^2 + Theta) / (1 + exp(Theta^2 + Theta))
item1 <- sapply(ps, function(x) sample(c(0,1), size = 1, prob = c(1 - x, x)))
ps2 <- exp(2 * Theta^2 + Theta + .5 * Theta^3) / (1 + exp(2 * Theta^2 + Theta + .5 * Theta^3))
item2 <- sapply(ps2, function(x) sample(c(0,1), size = 1, prob = c(1 - x, x)))
# how the actual item looks in the population
plot(Theta, ps, ylim = c(0,1))
plot(Theta, ps2, ylim = c(0,1))
baditems <- cbind(item1, item2)
newdat <- cbind(dat, baditems)
badmod <- mirt(newdat, 1)
itemfit(badmod) # clearly a bad fit for the last two items
mod <- mirt(dat, 1) # fit a model that does not contain the bad items
itemfit(mod)
#### Pure nonparametric way of investigating the items
library(KernSmoothIRT)
ks <- ksIRT(newdat, rep(1, ncol(newdat)), 1)
plot(ks, item=c(1,31,32))
par(ask=FALSE)
# Using point estimates from the model
Theta <- fscores(mod)
IG0 <- itemGAM(dat[,1], Theta) # good item
IG1 <- itemGAM(baditems[,1], Theta)
IG2 <- itemGAM(baditems[,2], Theta)
plot(IG0)
plot(IG1)
plot(IG2)
# same as above, but with plausible values to obtain the standard errors
set.seed(4321)
ThetaPV <- fscores(mod, plausible.draws=10)
IG0 <- itemGAM(dat[,1], ThetaPV) # good item
IG1 <- itemGAM(baditems[,1], ThetaPV)
IG2 <- itemGAM(baditems[,2], ThetaPV)
plot(IG0)
plot(IG1)
plot(IG2)
## for polytomous test items
SAT12[SAT12 == 8] <- NA
dat <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
dat <- dat[,-32]
mod <- mirt(dat, 1)
# Kernel smoothing is very sensitive to which category is selected as 'correct'
# 5th category as correct
ks <- ksIRT(cbind(dat, SAT12[,32]), c(rep(1, 31), 5), 1)
plot(ks, items = c(1,2,32))
# 3rd category as correct
ks <- ksIRT(cbind(dat, SAT12[,32]), c(rep(1, 31), 3), 1)
plot(ks, items = c(1,2,32))
# splines approach
Theta <- fscores(mod)
IG <- itemGAM(SAT12[,32], Theta)
plot(IG)
set.seed(1423)
ThetaPV <- fscores(mod, plausible.draws=10)
IG2 <- itemGAM(SAT12[,32], ThetaPV)
plot(IG2)
# assuming a simple increasing parametric form (like in a standard IRT model)
IG3 <- itemGAM(SAT12[,32], Theta, formula = resp ~ Theta)
plot(IG3)
IG3 <- itemGAM(SAT12[,32], ThetaPV, formula = resp ~ Theta)
plot(IG3)
### multidimensional example by returning the GAM objects
mod2 <- mirt(dat, 2)
Theta <- fscores(mod2)
IG4 <- itemGAM(SAT12[,32], Theta, formula = resp ~ s(Theta1, k=10) + s(Theta2, k=10),
return.models=TRUE)
names(IG4)
plot(IG4[[1L]], main = 'Category 1')
plot(IG4[[2L]], main = 'Category 2')
plot(IG4[[3L]], main = 'Category 3')
## End(Not run)
Given an internal mirt item object extracted by using extract.item, compute the item information.
iteminfo(x, Theta, degrees = NULL, total.info = TRUE, multidim_matrix = FALSE)
x 
an extracted internal mirt object containing item information (see extract.item)
Theta 
a vector (unidimensional) or matrix (multidimensional) of latent trait values 
degrees 
a vector of angles in degrees that are between 0 and 90. Only applicable when the input object is multidimensional 
total.info 
logical; return the total information curve for the item? If FALSE, returns the information for each category instead 
multidim_matrix 
logical; compute the information matrix for each row in Theta? 
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
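For intuition, the information of a 2PL item has a simple closed form, I(theta) = a^2 * P(theta) * (1 - P(theta)). A minimal base-R sketch of that formula (illustrative only; iteminfo() handles arbitrary item types from the extracted object):

```r
# 2PL item information: I(theta) = a^2 * P * (1 - P)
# (illustrative formula, not the package internals)
info_2pl <- function(theta, a, d) {
  P <- plogis(a * theta + d)  # 2PL item response function
  a^2 * P * (1 - P)
}
info_2pl(0, a = 1.5, d = 0)   # information peaks at theta = 0 here: 0.5625
```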
mod <- mirt(Science, 1)
extr.2 <- extract.item(mod, 2)
Theta <- matrix(seq(-4, 4, by = .1))
info.2 <- iteminfo(extr.2, Theta)
#do something with the info?
plot(Theta, info.2, type = 'l', main = 'Item information')
## Not run:
#category information curves
cat.info <- iteminfo(extr.2, Theta, total.info = FALSE)
plot(Theta, cat.info[,1], type = 'l', ylim = c(0, max(cat.info)),
ylab = 'info', main = 'Category information')
for(i in 2:ncol(cat.info))
lines(Theta, cat.info[,i], col = i)
## Customized test information plot
T1 <- T2 <- 0
dat <- expand.table(LSAT7)
mod1 <- mirt(dat, 1)
mod2 <- mirt(dat, 1, 'Rasch')
for(i in 1:5){
T1 <- T1 + iteminfo(extract.item(mod1, i), Theta)
T2 <- T2 + iteminfo(extract.item(mod2, i), Theta)
}
plot(Theta, T2/T1, type = 'l', ylab = 'Relative Test Information', las = 1)
lines(Theta, T1/T1, col = 'red')
# multidimensional
mod <- mirt(dat, 2, TOL=1e-2)
ii <- extract.item(mod, 1)
Theta <- as.matrix(expand.grid(-4:4, -4:4))
iteminfo(ii, Theta, degrees=c(45,45)) # equal angle
iteminfo(ii, Theta, degrees=c(90,0)) # first dimension only
# information matrices
iteminfo(ii, Theta, multidim_matrix = TRUE)
iteminfo(ii, Theta[1, , drop=FALSE], multidim_matrix = TRUE)
## End(Not run)
itemplot displays various item-based IRT plots, with special options for plotting items that contain several 0 slope parameters. Supports up to three-dimensional models.
itemplot(
object,
item,
type = "trace",
degrees = 45,
CE = FALSE,
CEalpha = 0.05,
CEdraws = 1000,
drop.zeros = FALSE,
theta_lim = c(-6, 6),
shiny = FALSE,
rot = list(xaxis = 70, yaxis = 30, zaxis = 10),
par.strip.text = list(cex = 0.7),
npts = 200,
par.settings = list(strip.background = list(col = "#9ECAE1"), strip.border = list(col =
"black")),
auto.key = list(space = "right", points = FALSE, lines = TRUE),
...
)
object 
a computed model object of class SingleGroupClass or MultipleGroupClass, or a named list of such models for comparison plots 
item 
a single numeric value, or the item name, indicating which item to plot 
type 
plot type to use: 'trace' (default), 'info', 'infocontour', 'tracecontour', 'score', 'threshold', 'infotrace', 'infocat', 'infoSE', or 'RE' 
degrees 
the degrees argument to be used if there are two or three factors.
See iteminfo for more information 
CE 
logical; plot confidence envelope? 
CEalpha 
area remaining in the tail for confidence envelope. Default gives 95% confidence region 
CEdraws 
number of draws to use for the confidence envelope 
drop.zeros 
logical; drop slope values that are numerically close to zero to reduce
dimensionality? Useful in objects returned from bfactor
theta_lim 
lower and upper limits of the latent trait (theta) to be evaluated, and is
used in conjunction with npts
shiny 
logical; run interactive display for item plots using the shiny package 
rot 
a list of rotation coordinates to be used for 3 dimensional plots 
par.strip.text 
plotting argument passed to lattice
npts 
number of quadrature points to be used for plotting features. Larger values make plots look smoother 
par.settings 
plotting argument passed to lattice
auto.key 
plotting argument passed to lattice
... 
additional arguments to be passed to lattice
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
data(LSAT7)
fulldata <- expand.table(LSAT7)
mod1 <- mirt(fulldata, 1, SE=TRUE)
mod2 <- mirt(fulldata, 1, itemtype = 'Rasch')
mod3 <- mirt(fulldata, 2)
itemplot(mod1, 2)
itemplot(mod1, 2, CE = TRUE)
itemplot(mod1, 2, type = 'info')
itemplot(mod1, 2, type = 'info', CE = TRUE)
mods <- list(twoPL = mod1, onePL = mod2)
itemplot(mods, 1, type = 'RE')
# multidimensional
itemplot(mod3, 4, type = 'info')
itemplot(mod3, 4, type = 'info',
col.regions = colorRampPalette(c("white", "red"))(100))
itemplot(mod3, 4, type = 'infocontour')
itemplot(mod3, 4, type = 'tracecontour')
# polytomous items
pmod <- mirt(Science, 1, SE=TRUE)
itemplot(pmod, 3)
itemplot(pmod, 3, type = 'threshold')
itemplot(pmod, 3, CE = TRUE)
itemplot(pmod, 3, type = 'score')
itemplot(pmod, 3, type = 'score', CE = TRUE)
itemplot(pmod, 3, type = 'infotrace')
itemplot(pmod, 3, type = 'infocat')
# use the directlabels package to put labels on tracelines
library(directlabels)
plt <- itemplot(pmod, 3)
direct.label(plt, 'top.points')
# change colour theme of plots
bwtheme <- standard.theme("pdf", color=FALSE)
plot(pmod, type='trace', par.settings=bwtheme)
itemplot(pmod, 1, type = 'trace', par.settings=bwtheme)
# additional modifications can be made via update().
# See ?update.trellis for further documentation
(plt <- itemplot(pmod, 1))
update(plt, ylab = expression(Prob(theta))) # ylab changed
# infoSE plot
itemplot(pmod, 1, type = 'infoSE')
# uncomment to run interactive shiny applet
# itemplot(shiny = TRUE)
## End(Not run)
Function to compute generic item summary statistics that do not require prior fitting of IRT models. Contains information about coefficient alpha (and alpha if an item is deleted), mean/SD and frequency of total scores, reduced item-total correlations, average/SD of the correlations between items, response frequencies, and conditional mean/SD information given the unweighted sum scores.
itemstats(
data,
group = NULL,
use_ts = TRUE,
proportions = TRUE,
ts.tables = FALSE
)
data 
An object of class data.frame or matrix with the response patterns 
group 
optional grouping variable to condition on when computing summary information 
use_ts 
logical; include information that is conditional on a meaningful total score? 
proportions 
logical; include response proportion information for each item? 
ts.tables 
logical; include mean/sd summary information pertaining to the unweighted total score? 
Returns a list containing the summary statistics
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
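As a point of reference, coefficient alpha (one of the statistics reported) can be computed directly in base R; the sketch below uses made-up simulated data and a hypothetical helper, not the package's implementation:

```r
# Base-R sketch of coefficient alpha: k/(k-1) * (1 - sum(item variances)/var(total))
# (toy simulated data; itemstats() also reports item-deleted versions)
coef_alpha <- function(dat) {
  k <- ncol(dat)
  k / (k - 1) * (1 - sum(apply(dat, 2, var)) / var(rowSums(dat)))
}
set.seed(42)
theta <- rnorm(200)                                          # common factor
dat <- sapply(1:5, function(i) as.integer(theta + rnorm(200) > 0))
a_hat <- coef_alpha(dat)
a_hat
```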
# dichotomous data example
LSAT7full <- expand.table(LSAT7)
head(LSAT7full)
itemstats(LSAT7full)
# behaviour with missing data
LSAT7full[1:5,1] <- NA
itemstats(LSAT7full)
# data with no meaningful total score
head(SAT12)
itemstats(SAT12, use_ts=FALSE)
# extra total scores tables
dat <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,
5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
itemstats(dat, ts.tables=TRUE)
# grouping information
group <- gl(2, 300, labels=c('G1', 'G2'))
itemstats(dat, group=group)
#####
# polytomous data example
itemstats(Science)
# polytomous data with missing
newScience <- Science
newScience[1:5,1] <- NA
itemstats(newScience)
# unequal categories
newScience[,1] <- ifelse(Science[,1] == 1, NA, Science[,1])
itemstats(newScience)
merged <- data.frame(LSAT7full[1:392,], Science)
itemstats(merged)
The key2binary function will convert response pattern data to a dichotomous format, given a response key.
key2binary(fulldata, key, score_missing = FALSE)
fulldata 
an object of class data.frame, matrix, or table with the response patterns 
key 
a vector or matrix consisting of the 'correct' response to the items. Each
value/row corresponds to each column in fulldata
score_missing 
logical; should missing data elements be returned as incorrect (i.e., 0)?
If FALSE, missing data are returned as NA
Returns a numeric matrix with all the response patterns in dichotomous format
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
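The scoring logic for a single vector key amounts to comparing each column against its keyed response; a base-R sketch with a hypothetical helper (not the package implementation, which also supports matrix keys and missing-data handling):

```r
# Base-R sketch of keyed scoring: 1 if the response matches the key, else 0
# (score_key is a hypothetical helper, not mirt's key2binary)
score_key <- function(fulldata, key) {
  out <- sapply(seq_along(key), function(i) as.integer(fulldata[, i] == key[i]))
  colnames(out) <- colnames(fulldata)
  out
}
resp <- data.frame(Q1 = c("B", "B", "D"), Q2 = c("B", "A", "C"))
score_key(resp, key = c("B", "A"))
```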
data(SAT12)
head(SAT12)
key < c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5)
dicho.SAT12 <- key2binary(SAT12, key)
head(dicho.SAT12)
# multiple scoring keys
key2 <- cbind(c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5),
c(2,3,NA,1,rep(NA, 28)))
dicho.SAT12 <- key2binary(SAT12, key2)
# keys from raw character responses
resp <- as.data.frame(matrix(c(
"B","B","D","D","E",
"B","A","D","D","E",
"B","A","D","C","E",
"D","D","D","C","E",
"B","C","A","D","A"), ncol=5, byrow=TRUE))
key <- c("B", "D", "D", "C", "E")
d01 <- key2binary(resp, key)
head(d01)
# score/don't score missing values
resp[1,1] <- NA
d01NA <- key2binary(resp, key) # without scoring
d01NA
d01 <- key2binary(resp, key, score_missing = TRUE) # with scoring
d01
Lagrange (i.e., score) test to determine whether parameters should be freed from a more constrained baseline model.
lagrange(mod, parnum, SE.type = "Oakes", type = "Richardson", ...)
mod 
an estimated model 
parnum 
a vector, or list of vectors, containing one or more parameter
locations/sets of locations to be tested.
See objects returned from mod2values
SE.type 
type of information matrix estimator to use. See mirt for further details 
type 
type of numerical algorithm passed to numerical_deriv
... 
additional arguments to pass to internal functions 
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
dat <- expand.table(LSAT7)
mod <- mirt(dat, 1, 'Rasch')
(values <- mod2values(mod))
# test all fixed slopes individually
parnum <- values$parnum[values$name == 'a1']
lagrange(mod, parnum)
# compare to LR test for first two slopes
mod2 <- mirt(dat, 'F = 1-5
FREE = (1, a1)', 'Rasch')
coef(mod2, simplify=TRUE)$items
anova(mod, mod2)
mod2 <- mirt(dat, 'F = 1-5
FREE = (2, a1)', 'Rasch')
coef(mod2, simplify=TRUE)$items
anova(mod, mod2)
mod2 <- mirt(dat, 'F = 1-5
FREE = (3, a1)', 'Rasch')
coef(mod2, simplify=TRUE)$items
anova(mod, mod2)
# test the first two slopes and the last three slopes jointly
lagrange(mod, list(parnum[1:2], parnum[3:5]))
# test all 5 slopes and first + last jointly
lagrange(mod, list(parnum[1:5], parnum[c(1, 5)]))
## End(Not run)
Given a matrix or data.frame object consisting of Likert responses, return an object of the same dimensions with integer values.
likert2int(x, levels = NULL)
x 
a matrix of character values or data.frame of character/factor vectors 
levels 
a named character vector indicating which integer values
should be assigned to which elements. If omitted, the order of the elements
will be determined after converting each column in x to a factor 
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
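At its core, an explicit levels mapping is a named-vector lookup; a minimal base-R sketch of that idea (illustrative only — the package additionally handles factors, NA values, and mixed mapping schemes across columns):

```r
# Base-R sketch of a named-level lookup, the idea behind an explicit
# 'levels' mapping (not the package's implementation)
lvl <- c('Strongly Disagree' = 1, 'Disagree' = 2, 'Neutral' = 3,
         'Agree' = 4, 'Strongly Agree' = 5)
x <- c('Agree', 'Neutral', 'Strongly Disagree')
unname(lvl[x])   # 4 3 1
```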
## Not run:
# simulate data
dat1 <- matrix(sample(c('Disagree', 'Strongly Disagree', 'Agree',
'Neutral', 'Strongly Agree'), 1000*5, replace=TRUE),
nrow=1000, ncol=5)
dat1[2,2] <- dat1[3,3] <- dat1[1,3] <- NA # NAs added for flavour
dat2 <- matrix(sample(c('D', 'SD', 'A', 'N', 'SA'), 1000*5, replace=TRUE),
nrow=1000, ncol=5)
dat <- cbind(dat1, dat2)
# separately
intdat1 <- likert2int(dat1)
head(dat1)
head(intdat1)
# more useful with explicit levels
lvl1 <- c('Strongly Disagree'=1, 'Disagree'=2, 'Neutral'=3, 'Agree'=4,
'Strongly Agree'=5)
intdat1 <- likert2int(dat1, levels = lvl1)
head(dat1)
head(intdat1)
# second data
lvl2 <- c('SD'=1, 'D'=2, 'N'=3, 'A'=4, 'SA'=5)
intdat2 <- likert2int(dat2, levels = lvl2)
head(dat2)
head(intdat2)
# full dataset (using both mapping schemes)
intdat <- likert2int(dat, levels = c(lvl1, lvl2))
head(dat)
head(intdat)
#####
# data.frame as input with ordered factors
dat1 <- data.frame(dat1)
dat2 <- data.frame(dat2)
dat.old <- cbind(dat1, dat2)
colnames(dat.old) <- paste0('Item_', 1:10)
str(dat.old) # factors are leveled alphabetically by default
# create explicit ordering in factor variables
for(i in 1:ncol(dat1))
levels(dat1[[i]]) <- c('Strongly Disagree', 'Disagree', 'Neutral', 'Agree',
'Strongly Agree')
for(i in 1:ncol(dat2))
levels(dat2[[i]]) <- c('SD', 'D', 'N', 'A', 'SA')
dat <- cbind(dat1, dat2)
colnames(dat) <- colnames(dat.old)
str(dat) # note ordering
intdat <- likert2int(dat)
head(dat)
head(intdat)
## End(Not run)
Extract the observed-data log-likelihood.
## S4 method for signature 'SingleGroupClass'
logLik(object)
object 
an object of class SingleGroupClass 
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
x <- mirt(Science, 1)
logLik(x)
## End(Not run)
Data from Thissen (1982); contains 5 dichotomously scored items obtained from the Law School Admissions Test, section 6.
Phil Chalmers [email protected]
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186.
## Not run:
dat <- expand.table(LSAT6)
head(dat)
itemstats(dat)
model <- 'F = 1-5
CONSTRAIN = (1-5, a1)'
(mod <- mirt(dat, model))
M2(mod)
itemfit(mod)
coef(mod, simplify=TRUE)
# equivalently, but with a different parameterization
mod2 <- mirt(dat, 1, itemtype = 'Rasch')
anova(mod, mod2) #equal
M2(mod2)
itemfit(mod2)
coef(mod2, simplify=TRUE)
sqrt(coef(mod2)$GroupPars[2]) #latent SD equal to the slope in mod
## End(Not run)
Data from Bock & Lieberman (1970); contains 5 dichotomously scored items obtained from the Law School Admissions Test, section 7.
Phil Chalmers [email protected]
Bock, R. D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35(2), 179-197.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
## Not run:
dat <- expand.table(LSAT7)
head(dat)
itemstats(dat)
(mod <- mirt(dat, 1))
coef(mod)
## End(Not run)
Computes the M2 (Maydeu-Olivares & Joe, 2006) statistic when all data are dichotomous, the collapsed M2* statistic (collapsing over univariate and bivariate response categories; see Cai and Hansen, 2013), and the hybrid C2 statistic, which collapses only the bivariate moments (Cai and Monroe, 2014). The C2 variant is mainly useful when polytomous response models do not have sufficient degrees of freedom to compute M2*. This function also computes associated fit indices that are based on fitting the null model. Supports single and multiple-group models.
If the latent trait density was approximated (e.g., Davidian curves, empirical histograms, etc.) then passing use_dentype_estimate = TRUE will use the internally saved quadrature and density components (where applicable).
M2(
obj,
type = "M2*",
calcNull = TRUE,
na.rm = FALSE,
quadpts = NULL,
theta_lim = c(-6, 6),
CI = 0.9,
residmat = FALSE,
QMC = FALSE,
suppress = 1,
...
)
obj 
an estimated model object from the mirt package 
type 
type of fit statistic to compute. Options are "M2", "M2*" for the univariate and bivariate collapsed version of the M2 statistic ("M2" currently limited to dichotomous response data only), and "C2" for a hybrid between M2 and M2* where only the bivariate moments are collapsed 
calcNull 
logical; calculate statistics for the null model as well?
Allows for statistics such as the limited information TLI and CFI. Only valid when items all
have a suitable null model (e.g., those created via createItem will not) 
na.rm 
logical; remove rows with any missing values? The M2 family of statistics requires a complete dataset in order to be well defined 
quadpts 
number of quadrature points to use during estimation. If NULL, a suitable value will be chosen based on the rubric found in fscores
theta_lim 
lower and upper range to evaluate latent trait integral for each dimension 
CI 
numeric value from 0 to 1 indicating the range of the confidence interval for RMSEA. Default returns the 90% interval 
residmat 
logical; return the residual matrix used to compute the SRMSR statistic? Only the lower triangle of the residual correlation matrix will be returned (the upper triangle is filled with NA's) 
QMC 
logical; use quasi-Monte Carlo integration? Useful for higher dimensional models.
If TRUE, a larger default number of integration nodes is used 
suppress 
a numeric value indicating which parameter residual dependency combinations
to flag as being too high. Absolute values for the standardized residuals greater than
this value will be returned, while all values less than this value will be set to NA.
Must be used in conjunction with the argument residmat = TRUE
... 
additional arguments to pass 
Returns a data.frame object with the M2-type statistic, along with the degrees of freedom, p-value, RMSEA (with 90% confidence interval), SRMSR for each group, and optionally the TLI and CFI model fit statistics if calcNull = TRUE.
Phil Chalmers [email protected]
Cai, L. & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical and Statistical Psychology, 66, 245-276.
Cai, L. & Monroe, S. (2014). A new statistic for evaluating item response theory models for ordinal data. National Center for Research on Evaluation, Standards, & Student Testing. Technical Report.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Maydeu-Olivares, A. & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71, 713-732.
## Not run:
dat <- as.matrix(expand.table(LSAT7))
(mod1 <- mirt(dat, 1))
M2(mod1)
resids <- M2(mod1, residmat=TRUE) # lower triangle of residual correlation matrix
resids
summary(resids[lower.tri(resids)])
# M2 with missing data present
dat[sample(1:prod(dim(dat)), 250)] <- NA
mod2 <- mirt(dat, 1)
# Compute stats by removing missing data row-wise
M2(mod2, na.rm = TRUE)
# C2 statistic (useful when polytomous IRT models have too few df)
pmod <- mirt(Science, 1)
# This fails with too few df:
# M2(pmod)
# This, however, works:
M2(pmod, type = 'C2')
## End(Not run)
Given an estimated model and a prior density function, compute the marginal reliability (Thissen and Wainer, 2001). This is only available for unidimensional tests.
marginal_rxx(mod, density = dnorm, var_theta = 1, ...)
mod 
an object of class 
density 
a density function to use for integration. Default assumes the latent traits are from a normal (Gaussian) distribution 
var_theta 
variance of the Theta distribution (typically 1 for many fitted IRT models) 
... 
additional arguments passed to the density function 
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Thissen, D. and Wainer, H. (2001). Test Scoring. Lawrence Erlbaum Associates.
empirical_rxx
, extract.group
, testinfo
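Under one common textbook definition, the marginal reliability integrates the squared standard error of measurement over the assumed latent density. The sketch below is a hedged illustration of that definition only (the constant test information function is made up for the example, and the exact internals of marginal_rxx may differ in detail):

```r
# Hedged sketch: rxx = 1 - E[SE^2(theta)] / var(theta), where SE^2(theta) =
# 1/I(theta) and the expectation is taken over a standard normal prior.
# The test information function below is a hypothetical constant, for
# illustration only.
info <- function(theta) rep(9, length(theta))   # assumed I(theta) = 9

# numerically integrate SE^2(theta) weighted by the standard normal density
E_se2 <- integrate(function(th) (1 / info(th)) * dnorm(th),
                   lower = -6, upper = 6)$value
rxx <- 1 - E_se2 / 1   # var_theta = 1 for a standard normal prior
rxx                    # with constant I = 9, rxx is about 0.889
```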
dat <- expand.table(deAyala)
mod <- mirt(dat, 1)
# marginal estimate
marginal_rxx(mod)
## Not run:
# empirical estimate (assuming the same prior)
fscores(mod, returnER = TRUE)
# empirical rxx the alternative way, given theta scores and SEs
fs <- fscores(mod, full.scores.SE=TRUE)
head(fs)
empirical_rxx(fs)
## End(Not run)
Returns a matrix containing the MDIFF values (Reckase, 2009). Only supported for items of class 'dich' and 'graded'.
MDIFF(x, which.items = NULL, group = NULL)
x 
an object of class 'SingleGroupClass', or an object of class 'MultipleGroupClass' if a suitable

which.items 
a vector indicating which items to select. If NULL is used (the default) then MDIFF will be computed for all items 
group 
group argument to pass to 
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Reckase, M. D. (2009). Multidimensional Item Response Theory. Springer.
## Not run:
mod <- mirt(Science, 2)
MDIFF(mod)
mod <- mirt(expand.table(LSAT7), 2)
MDIFF(mod)
## End(Not run)
mdirt
fits a variety of item response models with discrete latent variables.
These include, but are not limited to, latent class analysis, multidimensional latent
class models, multidimensional discrete latent class models, DINA/DINO models,
grade of membership models, CRUM, and so on. If a response model is not defined explicitly
then customized models can be defined using the createItem
function.
mdirt(
data,
model,
customTheta = NULL,
structure = NULL,
item.Q = NULL,
nruns = 1,
method = "EM",
covdata = NULL,
formula = NULL,
itemtype = "lca",
optimizer = "nlminb",
return_max = TRUE,
group = NULL,
GenRandomPars = FALSE,
verbose = TRUE,
pars = NULL,
technical = list(),
...
)
data 
a 
model 
number of mutually exclusive classes to fit, or alternatively a more specific

customTheta 
input passed to 
structure 
an R formula allowing the profile probability patterns (i.e., the structural component of
the model) to be fitted according to a log-linear model. When 
item.Q 
a list of item-level Q-matrices indicating how the respective categories should be
modeled by the underlying attributes. Each matrix must represent a 
nruns 
a numeric value indicating how many times the model should be fit to the data
when using random starting values. If greater than 1, 
method 
estimation method. Can be 'EM' or 'BL' (see 
covdata 
a data.frame of data used for latent regression models 
formula 
an R formula (or list of formulas) indicating how the latent traits
can be regressed using external covariates in 
itemtype 
a vector indicating the itemtype associated with each item.
For discrete models this is limited to only 'lca' or items defined using a

optimizer 
optimizer used for the M-step, set to 
return_max 
logical; when 
group 
a factor variable indicating group membership used for multiple group analyses 
GenRandomPars 
logical; use random starting values 
verbose 
logical; turn on messages to the R console 
pars 
used for modifying starting values; see 
technical 
list of lowerlevel inputs. See 
... 
additional arguments to be passed to the estimation engine. See 
Posterior classification accuracy for each response pattern may be obtained
via the fscores
function. The summary()
function will display
the category probability values given the class membership, which can also
be displayed graphically with plot()
, while coef()
displays the raw coefficient values (and their standard errors, if estimated). Finally,
anova()
is used to compare nested models, while
M2
and itemfit
may be used for model fitting purposes.
The latent class IRT model with two latent classes has the form
$P(x = k|\theta_1, \theta_2, a1, a2) = \frac{exp(a1 \theta_1 + a2 \theta_2)}{
\sum_j^K exp(a1 \theta_1 + a2 \theta_2)}$
where the $\theta$
values generally take on discrete points (such as 0 or 1).
For proper identification, the first category slope parameters
($a1$
and $a2$
) are never freely estimated. Alternatively, supplying a different
grid of $\theta$
values will allow the estimation of similar models (multidimensional
discrete models, grade of membership, etc.). See the examples below.
When the item.Q
input is utilized, the above equation can be understood as
$P(x = k|\theta_1, \theta_2, a1, a2) = \frac{exp(a1 \theta_1 Q_{j1} + a2 \theta_2 Q_{j2})}{
\sum_j^K exp(a1 \theta_1 Q_{j1} + a2 \theta_2 Q_{j2})}$
where by construction Q
is a $K_i \times A$
matrix indicating whether the category should
be modeled according to the latent class structure. For the standard latent class model, the Qmatrix
has as many rows as categories, as many columns as the number of classes/attributes modeled,
and consists of 0's in the first row and 1's elsewhere. This can, of course, be overwritten by passing
an alternative item.Q
definition for each respective item.
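For illustration, a Q-matrix of this default form can be constructed with base R. The matrices below are illustrative assumptions only (a dichotomous item with two attributes), not values extracted from a fitted model:

```r
# Hypothetical default item-level Q-matrix for a dichotomous item (K = 2
# categories) measured by A = 2 attributes: one row per category, one column
# per attribute; 0's in the first row, 1's elsewhere
Qi <- matrix(c(0, 0,
               1, 1), nrow = 2, ncol = 2, byrow = TRUE)

# to model the second category by only the first attribute, zero out the
# second attribute in that row
Qi_restricted <- Qi
Qi_restricted[2, 2] <- 0

# an item.Q input would then be a list with one such matrix per item, e.g.
# item.Q <- list(Qi, Qi, Qi_restricted)
```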
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Proctor, C. H. (1970). A probabilistic formulation and statistical analysis for Guttman scaling. Psychometrika, 35, 73-78.
thetaComb
, fscores
, mirt.model
, M2
,
itemfit
, boot.mirt
, mirtCluster
,
wald
, coefmethod
, summarymethod
,
anovamethod
, residualsmethod
# LSAT6 dataset
dat <- expand.table(LSAT6)
# fit with 2-3 latent classes
(mod2 <- mdirt(dat, 2))
## Not run:
(mod3 <- mdirt(dat, 3))
summary(mod2)
residuals(mod2)
residuals(mod2, type = 'exp')
anova(mod2, mod3)
M2(mod2)
itemfit(mod2)
# generate classification plots
plot(mod2)
plot(mod2, facet_items = FALSE)
plot(mod2, profile = TRUE)
# available for polytomous data
mod <- mdirt(Science, 2)
summary(mod)
plot(mod)
plot(mod, profile=TRUE)
# classification based on response patterns
fscores(mod2, full.scores = FALSE)
# classify individuals either with the largest posterior probability.....
fs <- fscores(mod2)
head(fs)
classes <- 1:2
class_max <- classes[apply(apply(fs, 1, max) == fs, 1, which)]
table(class_max)
# ... or by probability sampling (i.e., plausible value draws)
class_prob <- apply(fs, 1, function(x) sample(1:2, 1, prob=x))
table(class_prob)
# plausible value imputations for stochastic classification in both classes
pvs <- fscores(mod2, plausible.draws=10)
tabs <- lapply(pvs, function(x) apply(x, 2, table))
tabs[[1]]
# fit with random starting points (run in parallel to save time)
if(interactive()) mirtCluster()
mod <- mdirt(dat, 2, nruns=10)
#
# Grade of membership model
# define a custom Theta grid for including a 'fuzzy' class membership
(Theta <- matrix(c(1, 0, .5, .5, 0, 1), nrow=3, ncol=2, byrow=TRUE))
(mod_gom <- mdirt(dat, 2, customTheta = Theta))
summary(mod_gom)
#
# Multidimensional discrete latent class model
dat <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
# define Theta grid for three latent classes
(Theta <- thetaComb(0:1, 3))
(mod_discrete <- mdirt(dat, 3, customTheta = Theta))
summary(mod_discrete)
# Located latent class model
model <- mirt.model('C1 = 1-32
C2 = 1-32
C3 = 1-32
CONSTRAIN = (1-32, a1), (1-32, a2), (1-32, a3)')
(mod_located <- mdirt(dat, model, customTheta = diag(3)))
summary(mod_located)
#
### DINA model example
# generate some suitable data for a two dimensional DINA application
# (first columns are intercepts)
set.seed(1)
Theta <- expand.table(matrix(c(1,0,0,0,
1,1,0,0,
1,0,1,0,
1,1,1,1), 4, 4, byrow=TRUE),
freq = c(200,200,100,500))
a <- matrix(c(rnorm(15, -1.5, .5), rlnorm(5, .2, .3), numeric(15), rlnorm(5, .2, .3),
numeric(15), rlnorm(5, .2, .3)), 15, 4)
guess <- plogis(a[11:15,1]) # population guess
slip <- 1 - plogis(rowSums(a[11:15,])) # population slip
dat <- simdata(a, Theta=Theta, itemtype = 'lca')
# first column is the intercept, 2nd and 3rd are attributes
theta <- cbind(1, thetaComb(0:1, 2))
theta <- cbind(theta, theta[,2] * theta[,3]) # DINA interaction of main attributes
model <- mirt.model('Intercept = 1-15
A1 = 1-5
A2 = 6-10
A1A2 = 11-15')
# last 5 items are DINA (first 10 are unidimensional CRUMs)
DINA <- mdirt(dat, model, customTheta = theta)
coef(DINA, simplify=TRUE)
summary(DINA)
M2(DINA) # fits well (as it should)
cfs <- coef(DINA, simplify=TRUE)$items[11:15,]
cbind(guess, estguess = plogis(cfs[,1]))
cbind(slip, estslip = 1 - plogis(rowSums(cfs)))
### DINO model example
theta <- cbind(1, thetaComb(0:1, 2))
# define theta matrix with negative interaction term
(theta <- cbind(theta, -theta[,2] * theta[,3]))
model <- mirt.model('Intercept = 1-15
A1 = 1-5, 11-15
A2 = 6-15
Yoshi = 11-15
CONSTRAIN = (11,a2,a3,a4), (12,a2,a3,a4), (13,a2,a3,a4),
(14,a2,a3,a4), (15,a2,a3,a4)')
# last five items are DINOs (first 10 are unidimensional CRUMs)
DINO <- mdirt(dat, model, customTheta = theta)
coef(DINO, simplify=TRUE)
summary(DINO)
M2(DINO) #doesn't fit as well, because not the generating model
## CRUM (analogous to MIRT model)
theta <- cbind(1, thetaComb(0:1, 2))
model <- mirt.model('Intercept = 1-15
A1 = 1-5, 11-15
A2 = 6-15')
CRUM <- mdirt(dat, model, customTheta = theta)
coef(CRUM, simplify=TRUE)
summary(CRUM)
# good fit, but oversaturated (main effects for items 11-15 can be set to 0)
M2(CRUM)
#
# multidimensional latent class model
dat <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
# 5 latent classes within 2 different sets of items
model <- mirt.model('C1 = 1-16
C2 = 1-16
C3 = 1-16
C4 = 1-16
C5 = 1-16
C6 = 17-32
C7 = 17-32
C8 = 17-32
C9 = 17-32
C10 = 17-32
CONSTRAIN = (1-16, a1), (1-16, a2), (1-16, a3), (1-16, a4), (1-16, a5),
(17-32, a6), (17-32, a7), (17-32, a8), (17-32, a9), (17-32, a10)')
theta <- diag(10) # defined explicitly. Otherwise, this profile is assumed
mod <- mdirt(dat, model, customTheta = theta)
coef(mod, simplify=TRUE)
summary(mod)
#
# multiple group with constrained group probabilities
dat <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
group <- rep(c('G1', 'G2'), each = nrow(SAT12)/2)
Theta <- diag(2)
# the latent class parameters are technically located in the (nitems + 1) location
model <- mirt.model('A1 = 1-32
A2 = 1-32
CONSTRAINB = (33, c1)')
mod <- mdirt(dat, model, group = group, customTheta = Theta)
coef(mod, simplify=TRUE)
summary(mod)
#
# Probabilistic Guttman Model (Proctor, 1970)
# example analysis can also be found in the sirt package (see ?prob.guttman)
data(data.read, package = 'sirt')
head(data.read)
Theta <- matrix(c(1,0,0,0,
1,1,0,0,
1,1,1,0,
1,1,1,1), 4, byrow=TRUE)
model <- mirt.model("INTERCEPT = 1-12
C1 = 1,7,9,11
C2 = 2,5,8,10,12
C3 = 3,4,6")
mod <- mdirt(data.read, model, customTheta=Theta)
summary(mod)
M2(mod)
itemfit(mod)
## End(Not run)
Returns a vector containing the MDISC values for each item in the model input object (Reckase, 2009).
MDISC(x, group = NULL)
x 
an object of class 'SingleGroupClass', or an object of class 'MultipleGroupClass' if a suitable

group 
group argument to pass to 
Phil Chalmers [email protected]
Chalmers, R., P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Reckase, M. D. (2009). Multidimensional Item Response Theory. Springer.
## Not run:
mod <- mirt(Science, 2)
MDISC(mod)
## End(Not run)
mirt
fits a maximum likelihood (or maximum a posteriori) factor analysis model
to any mixture of dichotomous and polytomous data under the item response theory paradigm
using either Cai's (2010) Metropolis-Hastings Robbins-Monro (MHRM) algorithm,
an EM algorithm approach outlined by Bock and Aitkin (1981) using rectangular or
quasi-Monte Carlo integration grids, or the stochastic EM (i.e., the first two stages
of the MHRM algorithm). Models containing 'explanatory' person or item level predictors
can only be included by using the mixedmirt
function, though latent
regression models can be fit using the formula
input in this function.
Tests that form a two-tier or bifactor structure should be estimated with the
bfactor
function, which uses a dimension reduction EM algorithm for
modeling item parcels. Multiple group analyses (useful for DIF and DTF testing) are
also available using the multipleGroup
function.
mirt(
data,
model = 1,
itemtype = NULL,
guess = 0,
upper = 1,
SE = FALSE,
covdata = NULL,
formula = NULL,
SE.type = "Oakes",
method = "EM",
optimizer = NULL,
dentype = "Gaussian",
pars = NULL,
constrain = NULL,
calcNull = FALSE,
draws = 5000,
survey.weights = NULL,
quadpts = NULL,
TOL = NULL,
gpcm_mats = list(),
grsm.block = NULL,
rsm.block = NULL,
monopoly.k = 1L,
key = NULL,
large = FALSE,
GenRandomPars = FALSE,
accelerate = "Ramsay",
verbose = TRUE,
solnp_args = list(),
nloptr_args = list(),
spline_args = list(),
control = list(),
technical = list(),
...
)
data 
a 
model 
a string to be passed (or an object returned from) 
itemtype 
type of items to be modeled, declared as a vector for each item or a single value
which will be recycled for each item. The
Additionally, user defined item classes can also be defined using the 
guess 
fixed pseudoguessing parameters. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector corresponding to each item 
upper 
fixed upper bound parameters for 4PL model. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector corresponding to each item 
SE 
logical; estimate the standard errors by computing the parameter information matrix?
See 
covdata 
a data.frame of data used for latent regression models 
formula 
an R formula (or list of formulas) indicating how the latent traits
can be regressed using external covariates in 
SE.type 
type of estimation method to use for calculating the parameter information matrix
for computing standard errors and
Note that both the 
method 
a character object specifying the estimation algorithm to be used. The default is
The 
optimizer 
a character indicating which numerical optimizer to use. By default, the EM
algorithm will use the Other options include the Newton-Raphson ( Additionally, estimation subroutines from the 
dentype 
type of density form to use for the latent trait parameters. Current options include
Note that when 
pars 
a data.frame with the structure of how the starting values, parameter numbers,
estimation logical values, etc, are defined. The user may observe how the model defines the
values by using 
constrain 
a list of user declared equality constraints. To see how to define the
parameters correctly use 
calcNull 
logical; calculate the Null model for additional fit statistics (e.g., TLI)? Only applicable if the data contains no NA's and the data is not overly sparse 
draws 
the number of Monte Carlo draws to estimate the loglikelihood for the MHRM algorithm. Default is 5000 
survey.weights 
an optional numeric vector of survey weights to apply for each case in the
data (EM estimation only). If not specified, all cases are weighted equally (the standard IRT
approach). The sum of the 
quadpts 
number of quadrature points per dimension (must be larger than 2).
By default the number of quadrature points uses the following scheme:

TOL 
convergence threshold for EM or MHRM; defaults are .0001 and .001. If

gpcm_mats 
a list of matrices specifying how the scoring coefficients in the (generalized)
partial credit model should be constructed. If omitted, the standard gpcm format will be used
(i.e., 
grsm.block 
an optional numeric vector indicating where the blocking should occur when
using the grsm, NA represents items that do not belong to the grsm block (other items that may
be estimated in the test data). For example, to specify two blocks of 3 with a 2PL item for
the last item: 
rsm.block 
same as 
monopoly.k 
a vector of values (or a single value to be repeated for each item) which indicate
the degree of the monotone polynomial fitted, where the monotone polynomial
corresponds to 
key 
a numeric vector of the response scoring key. Required when using nested logit item
types, and must be the same length as the number of items used. Items that are not nested logit
will ignore this vector, so use 
large 
a Alternatively, if the collapsed table of frequencies is desired for the purpose of saving computations
(i.e., only computing the collapsed frequencies for the data one time) then a character vector can
be passed with the argument

GenRandomPars 
logical; generate random starting values prior to optimization instead of using the fixed internal starting values? 
accelerate 
a character vector indicating the type of acceleration to use. Default
is 
verbose 
logical; print observed (EM) or completedata (MHRM) loglikelihood after each iteration cycle? Default is TRUE 
solnp_args 
a list of arguments to be passed to the 
nloptr_args 
a list of arguments to be passed to the 
spline_args 
a named list of lists containing information to be passed to the
This code input changes the 
control 
a list passed to the respective optimizers (i.e., 
technical 
a list containing lower level technical parameters for estimation. May be:

... 
additional arguments to be passed 
function returns an object of class SingleGroupClass
(SingleGroupClass-class)
Specification of the confirmatory item factor analysis model follows many of
the rules in the structural equation modeling framework for confirmatory factor analysis. The
variances of the latent factors are automatically fixed to 1 to help
facilitate model identification. All parameters may be fixed to constant
values or set equal to other parameters using the appropriate declarations.
Confirmatory models may also contain 'explanatory' person or item level predictors, though
including predictors is currently limited to the mixedmirt
function.
When specifying a single number greater than 1 as the model
input to mirt
an exploratory IRT model will be estimated. Rotation and target matrix options are available
if they are passed to generic functions such as summarymethod
and
fscores
. Factor means and variances are fixed to ensure proper identification.
If the model is an exploratory item factor analysis estimation will begin
by computing a matrix of quasi-polychoric correlations. A
factor analysis with nfact
is then extracted and item parameters are
estimated by $a_{ij} = f_{ij}/u_j$
, where $f_{ij}$
is the factor
loading for the jth item on the ith factor, and $u_j$
is
the square root of the factor uniqueness, $\sqrt{1 - h_j^2}$
. The
initial intercept parameters are determined by calculating the inverse
normal of the item facility (i.e., item easiness), $q_j$
, to obtain
$d_j = q_j / u_j$
. A similar implementation is also used for obtaining
initial values for polytomous items.
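The start-value computations described above can be sketched with toy numbers (the loading and facility values below are hypothetical, not taken from any fitted model):

```r
# Sketch of the initial-value computations: slope a = f/u and intercept
# d = q/u, where u is the square root of the item uniqueness
f <- 0.6               # hypothetical factor loading for one item
u <- sqrt(1 - f^2)     # square root of the uniqueness, sqrt(1 - h^2)
a <- f / u             # initial slope estimate

facility <- 0.7        # hypothetical proportion endorsing the item
q <- qnorm(facility)   # inverse normal of the item facility
d <- q / u             # initial intercept estimate
c(a = a, d = d)        # a = 0.75 here
```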
Internally the $g$
and $u$
parameters are transformed using a logit
transformation ($log(x/(1-x))$
), and can be reversed by using $1 / (1 + exp(-x))$
following convergence. This also applies when computing confidence intervals for these
parameters, and is done so automatically if coef(mod, rawug = FALSE)
.
As such, when applying prior distributions to these parameters it is recommended to use a prior
that ranges from negative infinity to positive infinity, such as the normally distributed
prior via the 'norm'
input (see mirt.model
).
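Base R's qlogis()/plogis() pair implements exactly this transformation, which can be verified directly (the value 0.2 below is an arbitrary illustration):

```r
# g and u are bounded in (0, 1), so internally they are estimated on the
# logit scale; qlogis()/plogis() are the forward and inverse transforms
g <- 0.2               # illustrative pseudo-guessing probability
logit_g <- qlogis(g)   # log(g / (1 - g)), unbounded scale
back <- plogis(logit_g)  # 1 / (1 + exp(-x)), back to the probability scale
c(logit_g = logit_g, back = back)
```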
Unrestricted full-information factor analysis is known to have problems with convergence, and some items may need to be constrained or removed entirely to allow for an acceptable solution. As a general rule dichotomous items with means greater than .95, or items that are only .05 greater than the guessing parameter, should be considered for removal from the analysis or treated with prior parameter distributions. The same type of reasoning is applicable when including upper bound parameters as well. For polytomous items, if categories are rarely endorsed then this will cause similar issues. Also, increasing the number of quadrature points per dimension, or using the quasi-Monte Carlo integration method, may help to stabilize the estimation process in higher dimensions. Finally, solutions that are not well defined also will have difficulty converging, and can indicate that the model has been misspecified (e.g., extracting too many dimensions).
For the MHRM algorithm, when the number of iterations grows very high (e.g., greater than 1500)
or when Max Change = .2500
values are repeatedly printed
to the console too often (indicating that the parameters were being constrained since they are
naturally moving in steps greater than 0.25) then the model may either be ill defined or have a
very flat likelihood surface, and genuine maximum-likelihood parameter estimates may be difficult
to find. Standard errors are computed following the model convergence by passing
SE = TRUE
, to perform an additional MHRM stage treating the maximum-likelihood
estimates as fixed points.
Additional functions are available in the package which can be useful pre and postestimation. These are:
mirt.model
Define the IRT model specification using special syntax. Useful for defining between and within group parameter constraints, prior parameter distributions, and specifying the slope coefficients for each factor
coefmethod
Extract raw coefficients from the model, along with their standard errors and confidence intervals
summarymethod
Extract standardized loadings from model. Accepts a rotate
argument for exploratory
item response models
anovamethod
Compare nested models using likelihood ratio statistics as well as information criteria such as the AIC and BIC
residualsmethod
Compute pairwise residuals between each item using methods such as the LD statistic (Chen & Thissen, 1997), as well as response pattern residuals
plotmethod
Plot various types of test level plots including the test score and information functions and more
itemplot
Plot various types of item level plots, including the score, standard error, and information functions, and more
createItem
Create a customized itemtype
that does not currently exist in the package
imputeMissing
Impute missing data given some computed Theta matrix
fscores
Find predicted scores for the latent traits using estimation methods such as EAP, MAP, ML, WLE, and EAPsum
wald
Compute Wald statistics following the convergence of a model with a suitable information matrix
M2
Limited-information goodness-of-fit test statistic to determine how well the model fits the data
itemfit
and personfit
Goodness of fit statistics at the item and person levels, such as the S-X2, infit, outfit, and more
boot.mirt
Compute estimated parameter confidence intervals via the bootstrap methods
mirtCluster
Define a cluster for the package functions to use for capitalizing on multicore architecture to utilize available CPUs when possible. Will help to decrease estimation times for tasks that can be run in parallel
The parameter labels use the following convention, here using two factors and $K$
as the total
number of categories (using $k$
for specific category instances).
Only one intercept estimated, and the latent variance of $\theta$
is freely estimated. If
the data have more than two categories then a partial credit model is used instead (see
'gpcm' below).
$P(x = 1|\theta, d) = \frac{1}{1 + exp(-(\theta + d))}$
Depending on the model $u$
may be equal to 1 and $g$
may be equal to 0.
$P(x = 1|\theta, \psi) = g + \frac{(u - g)}{
1 + exp(-(a_1 * \theta_1 + a_2 * \theta_2 + d))}$
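As a sketch, the trace line above can be evaluated directly for a unidimensional item; the parameter values below are arbitrary illustrations, not defaults from the package:

```r
# 2-4PL trace line: g + (u - g) / (1 + exp(-(a*theta + d))); plogis() is
# the standard logistic cdf, so it supplies the bracketed term
trace_4pl <- function(theta, a = 1.2, d = -0.5, g = 0.15, u = 0.95)
    g + (u - g) * plogis(a * theta + d)

p <- trace_4pl(c(-3, 0, 3))
round(p, 3)   # monotone increasing, bounded between g = 0.15 and u = 0.95
```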
Currently restricted to unidimensional models
$P(x = 1|\theta, \psi) = g + \frac{(u - g)}{
1 + exp(-(a_1 * \theta_1 + d))^S}$
where $S$
allows for asymmetry in the response function and
is constrained via a transformation to be greater than 0 (i.e., log(S)
is estimated rather than S
)
Complementary loglog model (see Shim, Bonifay, and Wiedermann, 2022)
$P(x = 1|\theta, b) = 1 - exp(-exp(\theta - b))$
Currently restricted to unidimensional dichotomous data.
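A minimal sketch of this response function (arbitrary parameter value) shows its asymmetry: at theta = b the probability is 1 - exp(-1), roughly 0.632 rather than the logistic model's 0.5:

```r
# Complementary log-log trace line P(x = 1 | theta, b) = 1 - exp(-exp(theta - b));
# unlike the logistic curve it is not symmetric about its midpoint
cloglog_p <- function(theta, b) 1 - exp(-exp(theta - b))

theta <- c(-2, 0, 2)
p <- cloglog_p(theta, b = 0)
round(p, 3)   # increasing in theta; p at theta = b is 1 - exp(-1)
```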
The graded model consists of sequential 2PL models,
$P(x = k | \theta, \psi) = P(x \ge k | \theta, \phi) - P(x \ge k + 1 | \theta, \phi)$
Note that $P(x \ge 1 | \theta, \phi) = 1$
while $P(x \ge K + 1 | \theta, \phi) = 0$
The unipolar loglogistic model (ULL; Lucke, 2015) is defined the same as the graded response model, however
$P(x \le k | \theta, \psi) = \frac{\lambda_k\theta^\eta}{1 + \lambda_k\theta^\eta}$
.
Internally the $\lambda$
parameters are exponentiated to keep them positive, and
therefore the reported estimates should be interpreted in log units
A more constrained version of the graded model where graded spacing is equal across item
blocks and only adjusted by a single 'difficulty' parameter (c) while the latent variance
of $\theta$
is freely estimated (see Muraki, 1990 for this exact form).
This is restricted to unidimensional models only.
For the gpcm the $d$
values are treated as fixed and ordered values
from $0:(K-1)$
(in the nominal model $d_0$
is also set to 0). Additionally, for
identification in the nominal model $ak_0 = 0$
, $ak_{(K-1)} = (K - 1)$
.
$P(x = k | \theta, \psi) =
\frac{exp(ak_{k-1} * (a_1 * \theta_1 + a_2 * \theta_2) + d_{k-1})}
{\sum_{k=1}^K exp(ak_{k-1} * (a_1 * \theta_1 + a_2 * \theta_2) + d_{k-1})}$
For the partial credit model (when itemtype = 'Rasch'
; unidimensional only) the above
model is further constrained so that $ak = (0,1,\ldots, K-1)$
, $a_1 = 1$
, and the
latent variance of $\theta_1$
is freely estimated. Alternatively, the partial credit model
can be obtained by constraining all the slope parameters in the gpcms to be equal.
More specific scoring functions may be included by passing a suitable list of matrices
to the gpcm_mats
input argument.
In the nominal model this parametrization helps to identify the empirical ordering of the
categories by inspecting the $ak$
values. Larger values indicate that the item category
is more positively related to the latent trait(s) being measured. For instance, if an item
was truly ordinal (such as a Likert scale), and had 4 response categories, we would expect
to see $ak_0 < ak_1 < ak_2 < ak_3$
following estimation. If on the other hand
$ak_0 > ak_1$
then it would appear that the second category is less related to the
trait than the first, and therefore the second category should be understood as the
'lowest score'.
NOTE: The nominal model can become numerically unstable if poor choices for the high and low
values are chosen, resulting in ak
values greater than abs(10)
or more. It is
recommended to choose high and low anchors that cause the estimated parameters to fall
between 0 and $K  1$
either by theoretical means or by reestimating
the model with better values following convergence.
The gpcmIRT model is the classical generalized partial credit model for unidimensional response
data. It will obtain the same fit as the gpcm
presented above, however the parameterization
allows for the Rasch/generalized rating scale model as a special case.
E.g., for a K = 4 category response model,
$P(x = 0 | \theta, \psi) = exp(0) / G$
$P(x = 1 | \theta, \psi) = exp(a(\theta - b1) + c) / G$
$P(x = 2 | \theta, \psi) = exp(a(2\theta - b1 - b2) + 2c) / G$
$P(x = 3 | \theta, \psi) = exp(a(3\theta - b1 - b2 - b3) + 3c) / G$
where
$G = exp(0) + exp(a(\theta - b1) + c) + exp(a(2\theta - b1 - b2) + 2c) +
exp(a(3\theta - b1 - b2 - b3) + 3c)$
Here $a$
is the slope parameter, the $b$
parameters are the threshold
values for each adjacent category, and $c$
is the socalled difficulty parameter when
a rating scale model is fitted (otherwise, $c = 0$
and it drops out of the computations).
The gpcmIRT can be constrained to the partial credit IRT model by either constraining all the slopes to be equal, or setting the slopes to 1 and freeing the latent variance parameter.
Finally, the rsm is a more constrained version of the (generalized) partial credit model where the spacing is equal across item blocks and only adjusted by a single 'difficulty' parameter (c). Note that this is analogous to the relationship between the graded model and the grsm (with an additional constraint regarding the fixed discrimination parameters).
The multidimensional sequential response model has the form
$P(x = k | \theta, \psi) = \prod (1 - F(a_1 \theta_1 + a_2 \theta_2 + d_{sk}))
F(a_1 \theta_1 + a_2 \theta_2 + d_{jk})$
where $F(\cdot)$
is the cumulative logistic function.
The Tutz variant of this model (Tutz, 1990) (via itemtype = 'Tutz'
)
assumes that the slope terms are all equal to 1 and the latent
variance terms are estimated (i.e., is a Rasch variant).
The ideal point model has the form, with the upper bound constraint on $d$
set to 0:
$P(x = 1 | \theta, \psi) = exp(-0.5 * (a_1 * \theta_1 + a_2 * \theta_2 + d)^2)$
Partially compensatory models consist of the product of 2PL probability curves.
$P(x = 1 | \theta, \psi) = g + (1 - g) (\frac{1}{1 + exp(-(a_1 * \theta_1 + d_1))} *
\frac{1}{1 + exp(-(a_2 * \theta_2 + d_2))})$
Note that constraining the slopes to be equal across items will reduce the model to Embretson's (a.k.a. Whitely's) multicomponent model (1980).
Nested logistic curves for modeling distractor items. Requires a scoring key. The model is broken into two components for the probability of endorsement. For successful endorsement the probability trace is the 1-4PL model, while for unsuccessful endorsement:
$P(x = 0 | \theta, \psi) =
(1 - P_{1-4PL}(x = 1 | \theta, \psi)) * P_{nominal}(x = k | \theta, \psi)$
which is the product of the complement of the dichotomous trace line with the nominal
response model. In the nominal model, the slope parameters defined above are constrained
to be 1's, while the last value of the $ak$
is freely estimated.
The (multidimensional) generalized graded unfolding model is a class of ideal point models useful for ordinal response data. The form is
$P(z=k|\theta,\psi)=\frac{exp\left[-z\sqrt{\sum_{d=1}^{D}
a_{id}^{2}(\theta_{jd}-b_{id})^{2}}+\sum_{k=0}^{z}\psi_{ik}\right]+
exp\left[-(M-z)\sqrt{\sum_{d=1}^{D}a_{id}^{2}(\theta_{jd}-b_{id})^{2}}+
\sum_{k=0}^{z}\psi_{ik}\right]}{\sum_{w=0}^{C}\left(exp\left[-w
\sqrt{\sum_{d=1}^{D}a_{id}^{2}(\theta_{jd}-b_{id})^{2}}+
\sum_{k=0}^{w}\psi_{ik}\right]+exp\left[-(M-w)
\sqrt{\sum_{d=1}^{D}a_{id}^{2}(\theta_{jd}-b_{id})^{2}}+
\sum_{k=0}^{w}\psi_{ik}\right]\right)}$
where $\theta_{jd}$
is the location of the $j$
th individual on the $d$
th dimension,
$b_{id}$
is the difficulty location of the $i$
th item on the $d$
th dimension,
$a_{id}$
is the discrimination of the $i$
th item on the $d$
th dimension
(where the discrimination values are constrained to be positive),
$\psi_{ik}$
is the $k$
th subjective response category threshold for the $i$
th item,
assumed to be symmetric about the item and constant across dimensions, where
$\psi_{ik} = \sum_{d=1}^D a_{id} t_{ik}$
$z = 1,2,\ldots, C$
(where $C$
is the number of categories minus 1),
and $M = 2C + 1$
.
Spline response models attempt to model the response curves using non-linear and potentially non-monotonic patterns. The form is
$P(x = 1|\theta, \eta) = \frac{1}{1 + exp(-(\eta_1 * X_1 + \eta_2 * X_2 + \cdots + \eta_n * X_n))}$
where the $X_n$ are from the spline design matrix $X$ organized from the grid of $\theta$ values. B-splines with a natural or polynomial basis are supported, and the intercept input is set to TRUE by default.
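For instance, a spline response curve can be requested for a single item via the 'spline' itemtype (a sketch using the LSAT7 data shipped with the package):

```r
library(mirt)

# dichotomous LSAT7 data included with mirt
dat <- expand.table(LSAT7)

# model the last item with a spline trace line, 2PL for the rest
splmod <- mirt(dat, 1, itemtype = c(rep('2PL', 4), 'spline'))
plot(splmod, type = 'trace') # inspect the potentially non-monotonic curve
```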
Monotone polynomial model for polytomous response data of the form
$P(x = k | \theta, \psi) =
\frac{exp(\sum_1^k (m^*(\psi) + \xi_{c-1}))}
{\sum_1^C exp(\sum_1^K (m^*(\psi) + \xi_{c-1}))}$
where $m^*(\psi)$ is the monotone polynomial function without the intercept.
To access examples, vignettes, and exercise files that have been generated with knitr please visit https://github.com/philchalmers/mirt/wiki.
Phil Chalmers [email protected]
Andrich, D. (1978). A rating scale formulation for ordered response categories. Psychometrika, 43, 561-573.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.
Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-Information Item Factor Analysis. Applied Psychological Measurement, 12(3), 261-280.
Bock, R. D. & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.
Cai, L. (2010a). High-Dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika, 75, 33-57.
Cai, L. (2010b). Metropolis-Hastings Robbins-Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35, 307-335.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Chalmers, R. P. (2015). Extended Mixed-Effects Item Response Models with the MH-RM Algorithm. Journal of Educational Measurement, 52, 200-222. doi:10.1111/jedm.12072
Chalmers, R. P. (2018). Numerical Approximation of the Observed Information Matrix with Oakes' Identity. British Journal of Mathematical and Statistical Psychology. doi:10.1111/bmsp.12127
Chalmers, R. P. & Flora, D. (2014). Maximum-likelihood Estimation of Noncompensatory IRT Models with the MH-RM Algorithm. Applied Psychological Measurement, 38, 339-358. doi:10.1177/0146621614520958
Chen, W. H. & Thissen, D. (1997). Local dependence indices for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.
Falk, C. F. & Cai, L. (2016). Maximum Marginal Likelihood Estimation of a Monotonic Polynomial Generalized Partial Credit Model with Applications to Multiple Group Analysis. Psychometrika, 81, 434-460.
Lord, F. M. & Novick, M. R. (1968). Statistical theory of mental test scores. Addison-Wesley.
Lucke, J. F. (2015). Unipolar item response models. In S. P. Reise & D. A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 272-284). New York, NY: Routledge/Taylor & Francis Group.
Maydeu-Olivares, A., Hernandez, A. & McDonald, R. P. (2006). A Multidimensional Ideal Point Item Response Theory Model for Binary Data. Multivariate Behavioral Research, 41, 445-471.
Muraki, E. (1990). Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59-71.
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.
Muraki, E. & Carlson, E. B. (1995). Full-information factor analysis for polytomous item responses. Applied Psychological Measurement, 19, 73-90.
Ramsay, J. O. (1975). Solving implicit equations in psychometric data analysis. Psychometrika, 40, 337-360.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Danish Institute for Educational Research.
Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000). A General Item Response Theory Model for Unfolding Unidimensional Polytomous Responses. Applied Psychological Measurement, 24, 3-32.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monographs, 34.
Shim, H., Bonifay, W., & Wiedermann, W. (2022). Parsimonious asymmetric item response theory modeling with the complementary log-log link. Behavior Research Methods, 55, 200-219.
Suh, Y. & Bolt, D. (2010). Nested logit models for multiple-choice item response data. Psychometrika, 75, 454-473.
Sympson, J. B. (1977). A model for testing with multidimensional items. Proceedings of the 1977 Computerized Adaptive Testing Conference.
Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186.
Tutz, G. (1990). Sequential item response models with ordered response. British Journal of Mathematical and Statistical Psychology, 43, 39-55.
Varadhan, R. & Roland, C. (2008). Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm. Scandinavian Journal of Statistics, 35, 335-353.
Whitely, S. E. (1980). Multicomponent latent trait models for ability tests. Psychometrika, 45(4), 470-494.
Wood, R., Wilson, D. T., Gibbons, R. D., Schilling, S. G., Muraki, E., & Bock, R. D. (2003). TESTFACT 4 for Windows: Test Scoring, Item Statistics, and Full-information Item Factor Analysis [Computer software]. Lincolnwood, IL: Scientific Software International.
Woods, C. M., & Lin, N. (2009). Item Response Theory With Estimation of the Latent Density Using Davidian Curves. Applied Psychological Measurement, 33(2), 102-117.
See also: bfactor, multipleGroup, mixedmirt, expand.table, key2binary, mod2values, extract.item, iteminfo, testinfo, probtrace, simdata, averageMI, fixef, extract.mirt, itemstats
# load LSAT section 7 data and compute 1 and 2 factor models
data <- expand.table(LSAT7)
itemstats(data)
(mod1 <- mirt(data, 1))
coef(mod1)
summary(mod1)
plot(mod1)
plot(mod1, type = 'trace')
## Not run:
(mod2 <- mirt(data, 1, SE = TRUE)) #standard errors via the Oakes method
(mod2 <- mirt(data, 1, SE = TRUE, SE.type = 'SEM')) #standard errors with SEM method
coef(mod2)
(mod3 <- mirt(data, 1, SE = TRUE, SE.type = 'Richardson')) #with numerical Richardson method
residuals(mod1)
plot(mod1) #test score function
plot(mod1, type = 'trace') #trace lines
plot(mod2, type = 'info') #test information
plot(mod2, MI=200) #expected total score with 95% confidence intervals
# estimated 3PL model for item 5 only
(mod1.3PL <- mirt(data, 1, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL')))
coef(mod1.3PL)
# internally g and u pars are stored as logits, so usually a good idea to include normal prior
# to help stabilize the parameters. For a value around .182 use a mean
# of -1.5 (since 1 / (1 + exp(-(-1.5))) == .182)
model <- 'F = 1-5
          PRIOR = (5, g, norm, -1.5, 3)'
mod1.3PL.norm <- mirt(data, model, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL'))
coef(mod1.3PL.norm)
#limited information fit statistics
M2(mod1.3PL.norm)
# unidimensional ideal point model
idealpt <- mirt(data, 1, itemtype = 'ideal')
plot(idealpt, type = 'trace', facet_items = TRUE)
plot(idealpt, type = 'trace', facet_items = FALSE)
# two factors (exploratory)
mod2 <- mirt(data, 2)
coef(mod2)
summary(mod2, rotate = 'oblimin') #oblimin rotation
residuals(mod2)
plot(mod2)
plot(mod2, rotate = 'oblimin')
anova(mod1, mod2) #compare the two models
scoresfull <- fscores(mod2) #factor scores for each response pattern
head(scoresfull)
scorestable <- fscores(mod2, full.scores = FALSE) #save factor score table
head(scorestable)
# confirmatory (as an example, model is not identified since you need 3 items per factor)
# Two ways to define a confirmatory model: with mirt.model, or with a string
# these model definitions are equivalent
cmodel <- mirt.model('
   F1 = 1,4,5
   F2 = 2,3')
cmodel2 <- 'F1 = 1,4,5
            F2 = 2,3'
cmod <- mirt(data, cmodel)
# cmod <- mirt(data, cmodel2) # same as above
coef(cmod)
anova(cmod, mod2)
# check if identified by computing information matrix
(cmod <- mirt(data, cmodel, SE = TRUE))
###########
# data from the 'ltm' package in numeric format
itemstats(Science)
pmod1 <- mirt(Science, 1)
plot(pmod1)
plot(pmod1, type = 'trace')
plot(pmod1, type = 'itemscore')
summary(pmod1)
# Constrain all slopes to be equal with the constrain = list() input or mirt.model() syntax
# first obtain parameter index
values <- mirt(Science, 1, pars = 'values')
values #note that slopes are numbered 1,5,9,13, or index with values$parnum[values$name == 'a1']
(pmod1_equalslopes <- mirt(Science, 1, constrain = list(c(1,5,9,13))))
coef(pmod1_equalslopes)
# using mirt.model syntax, constrain all item slopes to be equal
model <- 'F = 1-4
          CONSTRAIN = (1-4, a1)'
(pmod1_equalslopes <- mirt(Science, model))
coef(pmod1_equalslopes)
anova(pmod1_equalslopes, pmod1) #significantly worse fit with almost all criteria
pmod2 <- mirt(Science, 2)
summary(pmod2)
plot(pmod2, rotate = 'oblimin')
itemplot(pmod2, 1, rotate = 'oblimin')
anova(pmod1, pmod2)
# unidimensional fit with a generalized partial credit and nominal model
(gpcmod <- mirt(Science, 1, 'gpcm'))
coef(gpcmod)
# for the nominal model the lowest and highest categories are assumed to be the
# theoretically lowest and highest categories that relate to the latent trait(s)
(nomod <- mirt(Science, 1, 'nominal'))
coef(nomod) #ordering of ak values suggests that the items are indeed ordinal
anova(gpcmod, nomod)
itemplot(nomod, 3)
# generalized graded unfolding model
(ggum <- mirt(Science, 1, 'ggum'))
coef(ggum, simplify=TRUE)
plot(ggum)
plot(ggum, type = 'trace')
plot(ggum, type = 'itemscore')
# monotonic polynomial models
(monopoly <- mirt(Science, 1, 'monopoly'))
coef(monopoly, simplify=TRUE)
plot(monopoly)
plot(monopoly, type = 'trace')
plot(<