Title: | Tools for Creating Conditional Probability Tables |
---|---|
Description: | Provides support parameterized tables for Bayesian networks, particularly the IRT-like DiBello tables. Also, provides some tools for visualing the networks. |
Authors: | Russell Almond |
Maintainer: | Russell Almond <[email protected]> |
License: | Artistic-2.0 |
Version: | 0.8-4 |
Built: | 2024-12-24 20:49:35 UTC |
Source: | https://github.com/ralmond/CPTtools |
Provides support parameterized tables for Bayesian networks, particularly the IRT-like DiBello tables. Also, provides some tools for visualing the networks.
The DESCRIPTION file:
Package: | CPTtools |
Title: | Tools for Creating Conditional Probability Tables |
Version: | 0.8-4 |
Date: | 2024/12/23 |
Author: | Russell Almond |
Maintainer: | Russell Almond <[email protected]> |
Authors@R: | person(given = "Russell", family = "Almond", role = c("aut", "cre"), email = "[email protected]", comment = c(ORCID = "0000-0002-8876-9337")) |
Description: | Provides support parameterized tables for Bayesian networks, particularly the IRT-like DiBello tables. Also, provides some tools for visualing the networks. |
License: | Artistic-2.0 |
Encoding: | UTF-8 |
LazyData: | true |
Roxygen: | list(markdown = TRUE) |
RoxygenNote: | 7.1.1 |
URL: | http://pluto.coe.fsu.edu/RNetica |
Depends: | methods, R (>= 3.5.0) |
Imports: | grDevices, graphics, stats, utils, einsum, tidyr, dplyr |
Suggests: | ggplot2, lattice, knitr, rmarkdown, tidyverse, googlesheets4, cowplot, vdiffr, withr, kableExtra, testthat (>= 3.0.0) |
VignetteBuilder: | knitr |
Support: | c( 'Bill & Melinda Gates Foundation grant "Games as Learning/Assessment: Stealth Assessment" (no. 0PP1035331, Val Shute, PI)', 'National Science Foundation grant "DIP: Game-based Assessment and Support of STEM-related Competencies" (no. 1628937, Val Shute, PI)', 'National Science Foundation grant "Mathematical Learning via Architectural Design and Modeling Using E-Rebuild." (no. 1720533, Fengfeng Ke, PI)', 'Institute of Educational Statistics Grant: "Exploring adaptive cognitive and affective learning support for next-generation STEM learning games." (no. R305A170376-20, Val Shute and Russell Almond, PIs)' ) |
Config/testthat/edition: | 3 |
Config/pak/sysreqs: | libicu-dev |
Repository: | https://ralmond.r-universe.dev |
RemoteUrl: | https://github.com/ralmond/CPTtools |
RemoteRef: | HEAD |
RemoteSha: | c3b8e8383adcf67c5c03e7d567dcf639a59af342 |
CPTtools is a collection of various bits of R code useful for processing Bayes net output. Some were designed to work with ETS's proprietary StatShop code, and some with RNetica. The code collected in this package is all free from explicit dependencies on the specific Bayes net package and will hopefully be useful with other systems as well.
The majority of the code are related to building conditional probability
tables (CPTs) for Bayesian networks. The package has two output
representations for a CPT. The first is a data.frame
object
where the first several columns are factor variables corresponding the
the parent variables, and the remaining columns are numeric variables
corresponding to the state of the child variables. The rows represent
possible configurations of the parent variables. An example is shown below.
S1 S2 Full Partial None 1 High High 0.81940043 0.15821522 0.02238436 2 Medium High 0.46696668 0.46696668 0.06606664 3 Low High 0.14468106 0.74930671 0.10601223 4 High Medium 0.76603829 0.14791170 0.08605000 5 Medium Medium 0.38733177 0.38733177 0.22533647 6 Low Medium 0.10879020 0.56342707 0.32778273 7 High Low 0.65574465 0.12661548 0.21763987 8 Medium Low 0.26889642 0.26889642 0.46220715 9 Low Low 0.06630741 0.34340770 0.59028489 10 High LowerYet 0.39095414 0.07548799 0.53355787 11 Medium LowerYet 0.11027649 0.11027649 0.77944702 12 Low LowerYet 0.02337270 0.12104775 0.85557955
The second representation is a table (matrix
) with just the
numeric part. Two approaches to building these tables from parameters
are described below. The more flexible discrete partial credit model is
used for the basis of the parameterized networks in the
Peanut
package.
In addition to the code for building partial credit networks, this package contains some code for building Bayesian network structures from (inverse) correlation matrixes, and graphical displays for Bayes net output. The latter includes some diagnostic plots and additional diagnostic tests.
The original parameterization for creating conditional probability tables based on Almond et al (2001) proved to be insufficiently flexible. Almond (2015) describes a newer parameterization based on three steps:
Translate the parent variables onto a numeric effective theta
scale (effectiveThetas
).
Combine the parent effective thetas into a single effective
theta using a combination rule (Compensatory
,
OffsetConjunctive
).
Convert the effective theta for each row of the table into
conditional probabilities using a link function
(gradedResponse
, partialCredit
,
normalLink
).
The partialCredit
link function is particularly flexible
as it allows different parameterizations and different combination rules
for each state of the child variable. This functionality is best
captured by the two high level functions:
calcDPCTable
Creates the probability table for the discrete partial credit model given the parameters.
mapDPC
Finds an MAP estimate for the parameters given an observed table of counts.
This parameterization serves as basis for the model used in the
Peanut
package.
The first two steps of the discrete partial credit framework outlined above are due to a suggestion by Lou DiBello (Almond et al, 2001). This lead to an older framework, in which the link function was hard coded into the conditional probability table formation. The models were called DiBello-XX, where XX is the name of the link function. Almond et al. (2015) describes several additional examples.
calcDDTable
Calculates DiBello-Dirichlet model probability and parameter tables.
calcDNTable
Creates the probability table for
DiBello-Normal distribution. This is equivalent to using the
normalLink
in the DPC framework. This also uses a link
scale parameter.
calcDSTable
Creates the probability table for
DiBello-Samejima distribution. This is equivalent to using the
gradedResponse
in the DPC framework.
calcDSllike
Calculates the log-likelihood for data from a DiBello-Samejima (Normal) distribution.
Diez (1993) and Srinivas (1993) describe an older parametric framework for Bayes nets based on the noisy-or or noisy-max function. These are also available.
calcNoisyAndTable
Calculate the conditional probability table for a Noisy-And or Noisy-Min distribution.
calcNoisyOrTable
Calculate the conditional probability table for a Noisy-Or distribution.
Almond (2010) noted that in many cases the best information about the relationship among variables came from a procedure that produces a correlation matrix (e.g., a factor analysis). Applying a trick from Whittaker (1990), connecting pairs of nodes corresponding to nonzero entries in an inverse correlation matrix produces an undirected graphical model. Ordering in the nodes in a perfect ordering allows the undirected model to be converted into a directed model (Bayesian network). The conditional probability tables can then be created through a series of regressions.
The following functions implement this protocol:
structMatrix
Finds graphical structure from a covariance matrix.
mcSearch
Orders variables using Maximum Cardinality search.
buildParentList
Builds a list of parents of nodes in a graph.
buildRegressions
Creates a series of regressions from a covariance matrix.
buildRegressionTables
Builds conditional probability tables from regressions.
These functions are a grab bag of lower level utilities useful for building CPTs:
areaProbs
Translates between normal and categorical probabilities.
numericPart
Splits a mixed data frame into a numeric matrix and a factor part..
dataTable
Constructs a table of counts from a setof discrete observations..
eThetaFrame
Constructs a data frame showing the effective thetas for each parent combination..
effectiveThetas
Assigns effective theta levels for categorical variable.
getTableStates
Gets meta data about a conditional probability table..
rescaleTable
Rescales the numeric part of the table.
scaleMatrix
Scales a matrix to have a unit diagonal.
scaleTable
Scales a table according to the Sum and Scale column.
Almond et al. (2009) suggested using hanging barplots for displaying
Bayes net output and gives several examples. The function
stackedBars
produces the simple version of this plot and
the function compareBars
compares two distributions
(e.g., prior and posterior). The function
buildFactorTab
is useful for building the data and the
function colorspread
is useful for building color
gradients.
Madigan, Mosurski and Almond (1997) describe a graphical weight of
evidence balance sheet (see also Almond et al, 2015, Chapter 7; Almond
et al, 2013). The function woeHist
calculates the weights of
evidence for a series of observations and the function
woeBal
produces a graphical display.
Sinharay and Almond (2006) propose a graphical fit test for
conditional probability tables (see also, Almond et al, 2015, Chapter
10). The function OCP
implements this test, and the
function betaci
creates the beta credibility intervals
around which the function is built.
The key to Bayesian network models are the assumptions of conditional
independence which underlie the model. The function
localDepTest
tests these assumptions based on observed
(or imputed) data tables.
The function mutualInformation
calculates the mutual
information of a two-way table, a measure of the strength of
association. This is similar to the measure used in many Bayes net
packages (e.g., MutualInfo
).
Two data sets are provided with this package:
ACED
Data from ACED field trial (Shute, Hansen, and Almond, 2008). This example is based on a field trial of a Bayesian network based Assessment for Learning system, and contains both item-level response and high-level network summaries. A complete description of the Bayes net can be found at http://ecd.ralmond.net/ecdwiki/ACED/ACED.
MathGrades
Grades on 5 mathematics tests from Mardia, Kent and Bibby (from Whittaker, 1990).
Complete index of all functions.
Index of help topics:
ACED.scores Data from ACED field trial CPA Representation of a conditional probability table as an array. CPF Representation of a conditional probability table as a data frame. CPTtools-package Tools for Creating Conditional Probability Tables Compensatory DiBello-Samejima combination function EAPBal Produces a graphical balance sheet for EAP or other univarate statistics. Language_modal Accuracy and Expected Accuracy Matrixes for Langauge Test. MathGrades Grades on 5 mathematics tests from Mardia, Kent and Bibby OCP Observable Characteristic Plot OCP.CPF Plots a conditional probability distribuiton with data overlayed OffsetConjunctive Conjunctive combination function with one difficulty per parent. areaProbs Translates between normal and categorical probabilities barchart.CPF This function produces a plot of a conditional probability frame. betaci Credibility intervals for a proportion based on beta distribution buildFactorTab Builds probability tables from Scored Bayes net output. buildParentList Builds a list of parents of nodes in a graph buildRegressionTables Builds conditional probability tables from regressions buildRegressions Creates a series of regressions from a covariance matrix calcDDTable Calculates DiBello-Dirichlet model probability and parameter tables calcDNTable Creates the probability table for DiBello-Normal distribution calcDPCTable Creates the probability table for the discrete partial credit model calcDSTable Creates the probability table for DiBello-Samejima distribution calcDSllike Calculates the log-likelihood for data from a DiBello-Samejima (Normal) distribution calcNoisyAndTable Calculate the conditional probability table for a Noisy-And or Noisy-Min distribution calcNoisyOrTable Calculate the conditional probability table for a Noisy-Or distribution colorspread Produces an ordered palate of colours with the same hue. compareBars Produces comparison stacked bar charts for two sets of groups cptChi2 Compare and observed to an expected conditional probability table. dataTable Constructs a table of counts from a set of discrete observations. defaultAlphas Generates default values for Alpha and Beta parameters eThetaFrame Constructs a data frame showing the effective thetas for each parent combination. effectiveThetas Assigns effective theta levels for categorical variable ewoe.CPF Calculates the expected weight of evidence from a conditional probability frame. expTable Builds expected contingency tables. fcKappa Functions for measuring rater agreement. getTableStates Gets meta data about a conditional probability table. gkGamma Calculates the Goodman and Kruskal gamma measure of association. gradedResponse A link function based on Samejima's graded response isMonotonic Tests to see if a sequence is ascending or descending isOffsetRule Distinguishes Offset from ordinary rules. localDepTest Tests for conditional independence between two variables given a third mapDPC Finds an MAP estimate for a discrete partial credit CPT mcSearch Orders variables using Maximum Cardinality search mutualInformation Calculates Mutual Information for a two-way table. normalLink Link function using a normal regression. normalize Normalizes a conditional probability table. numericPart Splits a mixed data frame into a numeric matrix and a factor part. parseProbVec Parses Probability Vector Strings partialCredit A link function based on the generalized partial credit model plotsimplex Produces a simplex plot of probability vectors. proflevelci Produce cumulative sum credibility intervals readHistory Reads a file of histories of marginal distributions. rescaleTable Rescales the numeric part of the table scaleMatrix Scales a matrix to have a unit diagonal scaleTable Scales a table according to the Sum and Scale column. stackedBarplot Produces a hanging barplot stackedBars Produces a stacked, staggered barplot structMatrix Finds graphical structure from a covariance matrix woeBal Weight of Evidence Balance Sheet woeHist Creates weights of evidence from a history matrix.
We are grateful to support from the following projects for supporting the work in the development and maintenance of this package.
Bill \& Melinda Gates Foundation grant "Games as Learning/Assessment: Stealth Assessment" (\#0PP1035331, Val Shute, PI)
National Science Foundation grant "DIP: Game-based Assessment and Support of STEM-related Competencies" (\#1628937, Val Shute, PI).
National Scient Foundation grant "Mathematical Learning via Architectual Design and Modeling Using E-Rebuild." (\#1720533, Fengfeng Ke, PI).
Russell Almond
Maintainer: Russell Almond <[email protected]>
Almond, R.G. (2015). An IRT-based Parameterization for Conditional Probability Tables. Paper submitted to the 2015 Bayesian Application Workshop at the Uncertainty in Artificial Intelligence conference.
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer.
Almond, R. G. (2010). ‘I can name that Bayesian network in two matrixes.’ International Journal of Approximate Reasoning. 51, 167-178.
Almond, R. G., Shute, V. J., Underwood, J. S., and Zapata-Rivera, J.-D (2009). Bayesian Networks: A Teacher's View. International Journal of Approximate Reasoning. 50, 450-460.
Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001 Jaakkola and Richardson (eds)., Morgan Kaufmann, 137–143.
Diez, F. J. (1993) Parameter adjustment in Bayes networks. The generalized noisy OR-gate. In Heckerman and Mamdani (eds) Uncertainty in Artificial Intelligence 93. Morgan Kaufmann. 99–105.
Muraki, E. (1992). A Generalized Partial Credit Model: Application of an EM Algorithm. Applied Psychological Measurement, 16, 159-176. DOI: 10.1177/014662169201600206
Samejima, F. (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph No. 17, 34, (No. 4, Part 2).
Shute, V. J., Hansen, E. G., & Almond, R. G. (2008). You can't fatten a hog by weighing it—Or can you? Evaluating an assessment for learning system called ACED. International Journal of Artificial Intelligence and Education, 18(4), 289-316.
Sinharay, S. and Almond, R.G. (2006). Assessing Fit of Cognitively Diagnostic Models: A case study. Educational and Psychological Measurement. 67(2), 239–257.
Srinivas, S. (1993) A generalization of the Noisy-Or model, the generalized noisy OR-gate. In Heckerman and Mamdani (eds) Uncertainty in Artificial Intelligence 93. Morgan Kaufmann. 208–215.
Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley.
Madigan, D., Mosurski, K. and Almond, R. (1997) Graphical explanation in belief networks. Journal of Computational Graphics and Statistics, 6, 160-181.
Almond, R. G., Kim, Y. J., Shute, V. J. and Ventura, M. (2013). Debugging the Evidence Chain. In Almond, R. G. and Mengshoel, O. (Eds.) Proceedings of the 2013 UAI Application Workshops: Big Data meet Complex Models and Models for Spatial, Temporal and Network Data (UAI2013AW), 1-10. http://ceur-ws.org/Vol-1024/paper-01.pdf
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## New Discrete Partial Credit framework: ## Complex model, different rules for different levels cptPC2 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, list(full=log(1),partial=log(c(S1=1,S2=.75))), betas=list(full=c(0,999),partial=1.0), rule=list("OffsetDisjunctive","Compensatory")) ## Graded Response using the older DiBello-Samejima framework. cptGraded <- calcDSTable(list(S1=skill1l),gradeL, 0.0, 0.0, dinc=c(.3,.4,.3)) ## Building a Bayes net from a correlation matrix. data(MathGrades) pl <- buildParentList(structMatrix(MathGrades$var),"Algebra") rt <- buildRegressions(MathGrades$var,MathGrades$means,pl) tabs <- buildRegressionTables(rt, MathGrades$pvecs, MathGrades$means, sqrt(diag(MathGrades$var))) ## Stacked Barplots: margins.prior <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) margins.post <- data.frame( Trouble=c(Novice=.03,Semester1=.15,Semester2=.39,Semseter3=.32,Semester4=.11), NDK=c(Novice=.00,Semester1=.03,Semester2=.28,Semseter3=.52,Semester4=.17), Model=c(Novice=.10,Semester1=.25,Semester2=.37,Semseter3=.23,Semester4=.05)) stackedBars(margins.post,3, main="Marginal Distributions for NetPASS skills", sub="Baseline at 3rd Semester level.", cex.names=.75, col=hsv(223/360,.2,0.10*(5:1)+.5)) compareBars(margins.prior,margins.post,3,c("Prior","Post"), main="Margins before/after Medium Trouble Shooting Task", sub="Observables: cfgCor=Medium, logCor=High, logEff=Medium", legend.loc = "topright", cex.names=.75, col1=hsv(h=.1,s=.2*1:5-.1,alpha=1), col2=hsv(h=.6,s=.2*1:5-.1,alpha=1)) ## Weight of evidence balance sheets sampleSequence <- read.csv(system.file("testFiles","SampleStudent.csv", package="CPTtools"), header=TRUE,row.names=1) woeBal(sampleSequence[,c("H","M","L")],c("H"),c("M","L"),lcex=1.25) ### Observable Characteristic Plot pi <- c("+"=.15,"-"=.85) nnn <- c("(0,0,0)"=20,"(0,0,1)"=10, "(0,1,0)"=10,"(0,1,0)"=5, "(1,0,0)"=10,"(1,0,1)"=10, "(1,1,1)"=10,"(1,1,1)"=25) xx1 <- c("(0,0,0)"=2,"(0,0,1)"=5, "(0,1,0)"=1,"(0,1,1)"=3, "(1,0,0)"=0,"(1,0,1)"=2, "(1,1,0)"=5,"(1,1,1)"=24) grouplabs <- c(rep("-",3),"+") grouplabs1 <- rep(grouplabs,each=2) OCP2 (xx1,nnn,grouplabs1,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), setlabs=c("Low Skill3","High Skill3"),setat=-.8, main="Data for which Skill 3 is relevant")
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## New Discrete Partial Credit framework: ## Complex model, different rules for different levels cptPC2 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, list(full=log(1),partial=log(c(S1=1,S2=.75))), betas=list(full=c(0,999),partial=1.0), rule=list("OffsetDisjunctive","Compensatory")) ## Graded Response using the older DiBello-Samejima framework. cptGraded <- calcDSTable(list(S1=skill1l),gradeL, 0.0, 0.0, dinc=c(.3,.4,.3)) ## Building a Bayes net from a correlation matrix. data(MathGrades) pl <- buildParentList(structMatrix(MathGrades$var),"Algebra") rt <- buildRegressions(MathGrades$var,MathGrades$means,pl) tabs <- buildRegressionTables(rt, MathGrades$pvecs, MathGrades$means, sqrt(diag(MathGrades$var))) ## Stacked Barplots: margins.prior <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) margins.post <- data.frame( Trouble=c(Novice=.03,Semester1=.15,Semester2=.39,Semseter3=.32,Semester4=.11), NDK=c(Novice=.00,Semester1=.03,Semester2=.28,Semseter3=.52,Semester4=.17), Model=c(Novice=.10,Semester1=.25,Semester2=.37,Semseter3=.23,Semester4=.05)) stackedBars(margins.post,3, main="Marginal Distributions for NetPASS skills", sub="Baseline at 3rd Semester level.", cex.names=.75, col=hsv(223/360,.2,0.10*(5:1)+.5)) compareBars(margins.prior,margins.post,3,c("Prior","Post"), main="Margins before/after Medium Trouble Shooting Task", sub="Observables: cfgCor=Medium, logCor=High, logEff=Medium", legend.loc = "topright", cex.names=.75, col1=hsv(h=.1,s=.2*1:5-.1,alpha=1), col2=hsv(h=.6,s=.2*1:5-.1,alpha=1)) ## Weight of evidence balance sheets sampleSequence <- read.csv(system.file("testFiles","SampleStudent.csv", package="CPTtools"), header=TRUE,row.names=1) woeBal(sampleSequence[,c("H","M","L")],c("H"),c("M","L"),lcex=1.25) ### Observable Characteristic Plot pi <- c("+"=.15,"-"=.85) nnn <- c("(0,0,0)"=20,"(0,0,1)"=10, "(0,1,0)"=10,"(0,1,0)"=5, "(1,0,0)"=10,"(1,0,1)"=10, "(1,1,1)"=10,"(1,1,1)"=25) xx1 <- c("(0,0,0)"=2,"(0,0,1)"=5, "(0,1,0)"=1,"(0,1,1)"=3, "(1,0,0)"=0,"(1,0,1)"=2, "(1,1,0)"=5,"(1,1,1)"=24) grouplabs <- c(rep("-",3),"+") grouplabs1 <- rep(grouplabs,each=2) OCP2 (xx1,nnn,grouplabs1,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), setlabs=c("Low Skill3","High Skill3"),setat=-.8, main="Data for which Skill 3 is relevant")
ACED (Adaptive Content with Evidence-Based Diagnosis; Shute, Hansen and Almond, 2008) is a Bayes net based assessment system which featured: (a) adaptive item selection and (b) extended feedback for incorrect items. This data contains both item level and pretest/posttest data from a field trial of the ACED system.
data("ACED")
data("ACED")
ACED contains 3 primary data.frame
objects and some supplementary data.
All of the data tables have two variables which can serve as keys. SubjID
and AltID
.
Either can be used as a primary key in joins. Note that the first two digits of the AltID
gives the session (i.e., class) that the student was in. Note also that students in
the control group have only pretest and posttest data; hence they do not appear in
ACED.items
, ACED.scores
or ACED.splitHalves
.
ACED.scores
is data frame with 230 observations on 74
variables. These are mostly high-level scores from the Bayesian
network.
Cond_code
a factor giving the experimental condition for this student, the levels are “adaptive_acc”, “adaptive_full”, “linear_full”, and “control”. Note that there are no control students in this data set.
Sequencing
a factor describing whether the sequence of items
was Linear
or Adaptive
Feedback
a factor describing whether the feedback for
incorrect items was Extended
or AccuracyOnly
All_Items
a numeric vector giving the number of items in ACED
Correct
a numeric vector giving the number of items the student got correct
Incorr
a numeric vector giving the number of items the student got incorrect
Remain
a numeric vector giving the number of items not reached or skipped
ElapTime
a difftime vector giving the total time spent on ACED (in seconds)
The next group of columns give “scores” for each of the nodes in
the Bayesian network. Each node has four scores, and the columns are
names p
nodeScoreType where node is
replaced by one of the short codes in ACED.allSkills
.
p
nodeH
a numeric vector giving the probability node is in the high state
p
nodeM
a numeric vector giving the probability node is in the medium state
p
nodeL
a numeric vector giving the probability node is in the low state
EAP
node
the expected a posteriori value of node
assuming an equal interval scale, where L=0
, M=1
and
H=2
MAP
node
a factor vector giving maximum a
posteriori value of node, i.e.,
which.max(p
nodeH, p
nodeM, p
nodeL)
.
ACED.skillNames
a list with two components, long
and short
giving
the long (spelled out in CamelCase) and short names for the skills is a character vector giving the abbreviations
used for the node/skill/attributes names.
ACED.items
is data frame with 230 observations on 73
variables. These are mostly item-level scores from the field trial.
The first two columns are SubjID
and AltID
. The remaining columns
correspond to ACED internal tasks, and are coded 1 for correct, 0 for incorrect, and
NA
for not reached.
ACED.taskNames
is essentially the row names from ACED.items
.
The naming scheme for the tasks reflect the skills measured by the task.
The response variable names
all start with t
(for task) followed by the name of one or more
skills tapped by the task (if there is more than one, then the first one is
“primary”.) This is followed by a numeric code, 1, 2 or 3,
giving the difficulty (easy, medium or hard) and a letter (a, b or
c) used to indicate alternate tasks following the same task model.
ACED.prePost
is data frame with 290 observations on 32
variables giving the results of the pretest and posttest.
SubjID
ID assigned by study team, “S” followed by 3 digits. Primary key.
AltID
ID assigned by the ACED software vendor. Pattern is “sXX-YY”, where XX is the session and YY is a student with the session.
Classroom
A factor correpsonding to the student's class.
Gender
A factor giving the student's gender (I'm not sure if this is self-report or administrative records.)
Race
A factor giving the student's self-reported race. The codebook is lost.
Level_Code
a factor variable describing the academic
track of the student with levels Honors
, Academic
,
Regular
, Part 1
, Part 2
and ELL
. The
codes Part 1
and Part 2
refer to special education
students in Part 1 (mainstream classroom) or Part 2 (sequestered).
pre_scaled
scale score (after equating) on pretest
post_scaled
scale score (after equating) on posttest
gain_scaled
post_scaled - pre_scaled
Form_Order
a factor variables describing whether
(AB
) Form A was the pretest and Form B was the posttest or
(BA
) vise versa.
PreACorr
number of correct items on Form A for students who took Form A as a pretest
PostBCorr
number of correct items on Form B for students who took Form B as a posttest
PreBCorr
number of correct items on Form B for students who took Form B as a pretest
PostACorr
number of correct items on Form A for students who took Form A as a posttest
PreScore
a numeric vector with either the non-missing
value from PreACorr
and PreBCorr
PostScore
a numeric vector with either the non-missing
value from PostACorr
and PostBCorr
Gain
PostScore - PreScore
preacorr_adj
PreACorr
adjusted to put forms A
and B on the same scale
postbcorr_adj
PostBCorr
adjusted to put forms A
and B on the same scale
prebcorr_adj
PreBCorr
adjusted to put forms A
and B on the same scale
postacorr_adj
PostACorr
adjusted to put forms A
and B on the same scale
Zpreacorr_adj
standardized version of preacorr_adj
Zpostbcorr_adj
standardized version of postbcorr_adj
Zprebcorr_adj
standardized version of prebcorr_adj
Zpostacorr_adj
standardized version of postacorr_adj
scale_prea
score on Form A for students who took Form A as a pretest scaled to range 0-100
scale_preb
score on Form B for students who took Form B as a pretest scaled to range 0-100
pre_scaled
scale score on pretest (whichever form)
scale_posta
score on Form A for students who took Form A as a posttest scaled to range 0-100
scale_postb
score on Form B for students who took Form B as a posttest scaled to range 0-100
Flagged
a logical variable (codebook lost)
Cond
a factor describing the experimental condition
with levels Adaptive/Accuracy
, Adaptive/Extended
,
Linear/Extended
and Control
. Note that controls are included in this data set.
Sequencing
a factor describing whether the sequence of items
was Linear
or Adaptive
Feedback
a factor describing whether the feedback for
incorrect items was Extended
or AccuracyOnly
ACED.Qmatrix
is a logical matrix whose rows correspond to skills (long names)
and whose columns correspond to tasks which is true if the skill is required for solving
the task (according to the expert).
ACED.QEM
is a reduced Q-matrix containing the 15 evidence models
(unique rows in the $Q$-matrix). The entries are character values with "0" indicating skill not needed, "+"
indicating skill is needed and "++" indicating skill is primary. The Tasks
column lists
the tasks corresponding to this evidence model (1, 2 and 3 again represent difficulty level, and the
letters indicating variants). The Anchor
column is used to identify subscales for scale identification.
ACED.splithalves
is a list of two datasets labeled “A” and “B”. Both
have the same structure as ACED.scores
(less the datas giving study condition). These were created
by splitting the 62 items into 2 subforms with 31 items each. For the most part, each item was paired with
an variant which differed only by the last letter. The scores are the results of Bayes net scoring with
half of the items.
ACED.pretest
and ACED.posttest
are raw data from the external pretest and posttest given
before and after the study. Each is a list with four components:
Araw
Unscored responses for students who took that form as pre(post)test. The first row is the key.
Ascored
The scored responses for the Araw
students; correct is 1, incorrect is 0.
Braw
Unscored responses for students who took that form as pre(post)test. The first row is the key.
Bscored
The scored responses for the Araw
students; correct is 1, incorrect is 0.
Because of the counterbalancing each student should appear in either Form A or Form B in the pretest
and in the other group in the posttest. Note that the A and B forms here have no relationship with the A and B
forms in ACED.splithalves
.
ACED is a Bayesian network based Assessment for Learning learning system, thus it served as both a assessment and a tutoring system. It had two novel features which could be turned on and off, elaborated feedback (turned off, it provided accuracy only feedback) and adaptive sequencing of items (turned off, it scheduled items in a fixed linear sequence).
It was originally built to cover all algebraic sequences (arithmetic, geometric and other recursive), but only the branch of the system using geometric sequences was tested. Shute, Hansen and Almond (2008) describe the field trial. Students from a local middle school (who studied arithmetic, but not geometric sequences as part of their algebra curriculum) were recruited for the study. The students were randomized into one of four groups:
Adaptive/Accuracy
Adaptive sequencing was used, but students only received correct/incorrect feedback.
Adaptive/Extended
Adaptive sequencing was used, but students received extended feedback for incorrect items.
Linear/Extended
The fixed linear sequencing was used, but students received extended feedback for incorrect items.
Control
The students did independent study and did not use ACED.
Because students in the control group were not exposed to the ACED
task, neither the Bayes net level scores nor the item level scores are
available for those groups, and those students are excluded from
ACED.scores
and ACED.items
. The students are in the
same order in all of the data sets, with the 60 control students
tacked onto the end of the ACED.prePost
data set.
All of the students (including the control students) were given a
25-item pretest and a 25-item posttest with items similar to the ones
used in ACED. The design was counterbalanced, with half of the
students receiving Form A as the pretest and Form B as the posttest
and the other half the other way around, to allow the two forms to be
equated using the pretest data. The details are buried in
ACED.prePost
.
Note that some irregularities were observed with the English Language
Learner (ACED.prePost$Level_code=="ELL"
) students. Their
teachers were allowed to translated words for the students, but in
many cases actually wound up giving instruction as part of the
translation.
Shute, V. J., Hansen, E. G., & Almond, R. G. (2008). You can't fatten a hog by weighing it—Or can you? Evaluating an assessment for learning system called ACED. International Journal of Artificial Intelligence and Education, 18(4), 289-316.
Thanks to Val Shute for permission to use the data.
ACED development and data collection was sponsored by National Science Foundation Grant No. 0313202.
A more detailed description, including a Q-matrix can be found at the ECD Wiki: http://ecd.ralmond.net/ecdwiki/ACED/ACED.
data(ACED)
data(ACED)
Maps between a continuous (normal) variable and a discrete variable by
establishing a set of bins to maintain a particular probability
vector. The pvecToCutpoints
function returns the cut points
separating the bins, the pvecToMidpoints
returns a central
point from each bin, and the areaProbs
calculates the fraction
of a normal curve falling into a particular bin.
pvecToCutpoints(pvec, mean = 0, std = 1) pvecToMidpoints(pvec, mean = 0, std = 1) areaProbs(pvec, condmean, condstd, mean = 0, std = 1)
pvecToCutpoints(pvec, mean = 0, std = 1) pvecToMidpoints(pvec, mean = 0, std = 1) areaProbs(pvec, condmean, condstd, mean = 0, std = 1)
pvec |
A vector of marginal probabilities for the categories of the discrete variable. Elements should be ordered from smallest to largest. |
mean |
The mean of the continuous variable. |
std |
The standard deviation of the continuous variable. |
condmean |
The conditional mean of the continuous variable. |
condstd |
The conditional standard deviation of the continuous variable. |
Let be a discrete variable whose states
are given by
names(pvec)[k]
and for which the
marginal probability is given by
pvec[k]
.
Let be a continuous normal variable with mean
mean
and
standard deviation std
. These function map between and
.
The function pvecToCutpoints
produces a series of cutpoints,
, such that setting
to
when
produces the marginal probability
specified by
pvec
. Note that is always
-Inf
and is always
Inf
(where is
length(pvec)
).
The function pvecToMidpoints
produces the midpoints (with
respect to the normal density) of the intervals defined by
pvecToCutpoints
. In particular, if ,
then the values returned are
.
The function areaProbs
inverts these calculations. If
condmean
is and
condstd
is
, then this function calculates
by calculating the area under the normal curve.
For pvecToCutpoints
, a vector of length one greater than
pvec
giving the endpoints of the bins. Note that the first and
last values are always infinite.
For pvecToCutpoints
, a vector of length the same length as
pvec
giving the midpoint of the bins.
For areaProbs
a vector of probabilities of the same length as
pvec
.
Variables are given from lowest to highest state, for example ‘Low’, ‘Medium’, ‘High’. StatShop expects variables in the opposite order.
The function effectiveThetas
does something similar, but
assumes all probability values are equally weighted.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Almond, R.G. ‘I Can Name that Bayesian Network in Two Matrixes.’ International Journal of Approximate Reasoning, 51, 167–178.
probs <- c(Low=.05,Med=.9,High=.05) cuts <- pvecToCutpoints(probs) mids <- pvecToMidpoints(probs) areaProbs(probs,1,.5)
probs <- c(Low=.05,Med=.9,High=.05) cuts <- pvecToCutpoints(probs) mids <- pvecToMidpoints(probs) areaProbs(probs,1,.5)
This is an extension of barchart.array
for
displaying conditional probability tables. In particular, it will
display the output of calcDPCFrame
.
## S3 method for class 'CPF' barchart(x, data = NULL, ..., baseCol = "firebrick", auto.key=TRUE, par.settings)
## S3 method for class 'CPF' barchart(x, data = NULL, ..., baseCol = "firebrick", auto.key=TRUE, par.settings)
x |
A conditional probaiblity frame (see |
data |
Ignore this value, used for compatability with
|
... |
Other arguments passed on to
|
baseCol |
This should be a specification of a color. The color
is designed as a gradient starting at the base color and getting
progressively lighter. If its value is |
auto.key |
This is the |
par.settings |
This is the |
The function barchart.array
and the function
as.CPA
to convert the conditional probability frame to an array
do 90 percent of the work.
A few minor touches:
The function takes special care of one row [un]conditional probability frames.
The function overrides the default colors using
colorspread
to produce varying intensities of the
same color.
The function adds the dimension names, so that the labels indicate which variables they belong to.
The function sets the default value of auto.key to TRUE
so that a legend for the colors is produced.
Note that the color change is brought about internally by modifying
par.settings
. To suppress this behavior, set baseCol
to
null, and the user value for par.settings
will be passed
through unchanged.
An object of class lattice that when printed will produce the graph.
Russell Almond
as.CPA
, colorspread
,
barchart.array
, calcDPCFrame
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) barchart.CPF(cpfTheta) cptComp <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Compensatory") barchart.CPF(cptComp,layout=c(3,1)) cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") barchart.CPF(cptPC1,baseCol="slateblue")
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) barchart.CPF(cpfTheta) cptComp <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Compensatory") barchart.CPF(cptComp,layout=c(3,1)) cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") barchart.CPF(cptPC1,baseCol="slateblue")
This generates upper and lower bounds for a highest posterior density credibility interval for a beta distribution by looking at appropriate quantiles of the beta distribution. This is designed to work with sums of classification probabilities.
betaci(sumData, totals = NULL, limits = c(lower = 0.025, upper = 0.975), a = 0.5, b = 0.5)
betaci(sumData, totals = NULL, limits = c(lower = 0.025, upper = 0.975), a = 0.5, b = 0.5)
sumData |
A vector or matrix of counts or sums proportions. Note these do not need to be integers, sums of classification probabilities work here. |
totals |
Total number of individuals as reference for
|
limits |
The upper and lower credibility limits. |
a |
Value for the |
b |
Value for the |
This function computes the upper and lower bounds of a credibility
interval for a beta distribution based on sumData
successes out
of totals
trials. Note that as a beta distribution is used for
the basic calculations, neither sumData
nor totals
need
be integers.
To avoid problems with zero cells (or cells with values equal to
totals
), a small prior is added to the beta calculations. By
default a Jeffrey's prior is added to the data. Thus the
final returned value is:
where prob
varies over the values in limits
. Note that
a
and b
can be scalars or an array conformable with
totals
.
If totals
is not supplied, and sumData
is a vector, then the
sum is used as the total. If the totals
is not supplied and sumData
is a
matrix, then the row sums are used. For higher dimensional arrays, it is necessary to explicitly
specify a total.
A list of the same length as limits
with the same names. Each
component is a quantile of the posterior distribution which has the
same shape as sumData
.
Note that limits
is not limited to length 2, although this is
the most obvious application.
A previous verison used colSums(data)
rather than rowSums()
. Using the row sums
makes more sense as this corresponds to the CPF format.
Russell Almond
See OCP
for an application.
x <- matrix(c(7,4,2,31),2,2) ## Use row sums as totals betaci(x) ## fixed totals x <- c(7,2,31) nn <- c(30,15,35) betaci(x,nn,c(Q1=.25,Q2=.5,Q3=.75)) ## Prior varies according to cell. pi0 <- c(.2,.2,.8) betaci(x,nn,a=pi0,b=1-pi0)
x <- matrix(c(7,4,2,31),2,2) ## Use row sums as totals betaci(x) ## fixed totals x <- c(7,2,31) nn <- c(30,15,35) betaci(x,nn,c(Q1=.25,Q2=.5,Q3=.75)) ## Prior varies according to cell. pi0 <- c(.2,.2,.8) betaci(x,nn,a=pi0,b=1-pi0)
Looks for margin statistics in scored Bayes net output, and puts them into tables with rows representing variables and columns representing variable states.
The marginTab
function does this for a single individual. The
buildMarginTab
uses the grand mean across all individuals and
the buildFactorTab
breaks down groups according to a factor
variable. The function build2FactorTab
builds a three-way table.
buildFactorTab(data, fact, stateNames, skillName, reverse = TRUE, stem="P", sep=".") build2FactorTab(data, fact1, fact2, stateNames, skillName, reverse = TRUE, stem="P",sep=".") buildMarginTab(data, stateNames, skillNames, reverse = TRUE, stem="P",sep=".") marginTab(datarow, stateNames, skillNames, reverse = TRUE, stem="P",sep=".")
buildFactorTab(data, fact, stateNames, skillName, reverse = TRUE, stem="P", sep=".") build2FactorTab(data, fact1, fact2, stateNames, skillName, reverse = TRUE, stem="P",sep=".") buildMarginTab(data, stateNames, skillNames, reverse = TRUE, stem="P",sep=".") marginTab(datarow, stateNames, skillNames, reverse = TRUE, stem="P",sep=".")
data |
A data sets of StatShop statistics for many individuals. |
datarow |
One row of such a data set. |
fact |
A factor variable according to which to split the
data. Length should be the same as the length of |
fact1 |
A factor variable according to which to split the data. |
fact2 |
A factor variable according to which to split the data. |
stateNames |
Names of the variable states. |
skillName , skillNames
|
Name(s) of the proficiency variable(s) to be used. |
reverse |
Reverse the order of the states for display (i.e., convert from StatShop order of highest first to more natural order of lowest first. |
stem |
A character string giving a prefix used to indicate variable names. |
sep |
A character string giving a separator used to separate prefix from variable names. |
This looks for columns marked “<stem><sep><skillName>” in the
data frame, and builds them into a matrix. It is assumed that all
variables have the same number of states and that they are in the same
order, and the order is the same as given in stateNames
.
The functions buildFactorTab
and build2FactorTab
really
expect their skillNames
argument to be a single variable name.
However, they should work with multiple variables if suitable values
are chosen for the state names.
For marginTab
a matrix with columns corresponding to
skillNames
and rows corresponding to stateNames
giving
the probabilities for a single individual.
For buildMarginTab
a matrix with columns corresponding to
skillNames
and rows corresponding to stateNames
giving
the average probabilities for the entire data set.
For buildFactorTab
a matrix with columns corresponding to
the unique values of fact
and rows corresponding to
stateNames
entries give the average probabilities across the
groups.
For build2FactorTab
a 3 dimensional array with the first
dimension corresponding to the unique values of fact1
, the
second dimension corresponding to the unique values of fact2
and the last dimension corresponding to stateNames
entries give
the average probabilities across the groups.
Russell Almond
data(ACED) marginTab(ACED.scores[1,], c("H","M","L"), ACED.skillNames$short, reverse = TRUE, stem="P",sep=".") buildMarginTab(ACED.scores, c("H","M","L"), ACED.skillNames$short[1:4], reverse = TRUE, stem="P",sep=".") buildFactorTab(ACED.scores, ACED.scores$Cond_code, c("H","M","L"), "sgp", reverse = TRUE, stem="P", sep=".") build2FactorTab(ACED.scores, ACED.scores$Sequencing, ACED.scores$Feedback, c("H","M","L"), "sgp", reverse = TRUE, stem="P",sep=".")
data(ACED) marginTab(ACED.scores[1,], c("H","M","L"), ACED.skillNames$short, reverse = TRUE, stem="P",sep=".") buildMarginTab(ACED.scores, c("H","M","L"), ACED.skillNames$short[1:4], reverse = TRUE, stem="P",sep=".") buildFactorTab(ACED.scores, ACED.scores$Cond_code, c("H","M","L"), "sgp", reverse = TRUE, stem="P", sep=".") build2FactorTab(ACED.scores, ACED.scores$Sequencing, ACED.scores$Feedback, c("H","M","L"), "sgp", reverse = TRUE, stem="P",sep=".")
Takes an incidence matrix describing a graph, and an order of the
nodes, and builds a list of parents for each node. If the ord
argument is not supplied, the ordering is constructed through maximum
cardinality search (see mcSearch
).
buildParentList(sm, start = colnames(sm)[1], ord = mcSearch(sm, start))
buildParentList(sm, start = colnames(sm)[1], ord = mcSearch(sm, start))
sm |
A logical matrix whose rows and columns correspond to nodes (variables) and a true value indicates an edge between the variables. |
start |
The name of the first element. |
ord |
A vector of size equal to the number of columns of
|
The sm
argument should be an incidence matrix for a graph, with
row and column names set to the names of the nodes/variables.
A node is a parent of node
if
They are neighbors, that is sm[i,j] == TRUE
.
The node appears before node
in
ord
,
that is ord[i] < ord[j]
.
The argument start
is used only if ord
is not supplied,
in which case it is passed to mcSearch
.
A list with as many elements as there are columns in sm
, and
whose elements appear in the order specified by ord
. Each
element of that list is a character vector giving the names of the
parents.
Russell Almond
Almond, R. G. (2010). ‘I can name that Bayesian network in two matrixes.’ International Journal of Approximate Reasoning. 51, 167-178.
data(MathGrades) MG.struct <- structMatrix(MathGrades$var) parents <- buildParentList(MG.struct) # Arbitrary start parentsa <- buildParentList(MG.struct, "Algebra") # Put algebra first.
data(MathGrades) MG.struct <- structMatrix(MathGrades$var) parents <- buildParentList(MG.struct) # Arbitrary start parentsa <- buildParentList(MG.struct, "Algebra") # Put algebra first.
This function takes a covariance matrix and list of variables and their parents and returns a collection of regression model parameters for each variable regressed on its parents. This is a compact representation of a normal graphical model.
buildRegressions(Sigma, means = 0, parents = buildParentList(structMatrix(Sigma)))
buildRegressions(Sigma, means = 0, parents = buildParentList(structMatrix(Sigma)))
Sigma |
A covariance matrix among a collection of continuous variables. |
means |
The means of those variables |
parents |
A named list of length equal to |
This function performs one regression for each element of the
parents
list. The name of the dependent variable for each
regression is given by names(parents)
and the independent
variables is given by the values of parents
. (The function
buildParentList()
builds a suitable list of elements.)
If means
is not supplied, then the variables are assumed to be
centered, otherwise the given vector is used as the means.
A list of length equal to parents
whose elements are also a
list having the following structure
b |
A vector of slopes for the regression with names equal to the
names of the parent variables. Note that if there are no parents,
this will be |
a |
The intercept from the regression. |
std |
The residual standard deviation from the regression. |
Russell Almond
Almond, R. G. (2010). ‘I can name that Bayesian network in two matrixes.’ International Journal of Approximate Reasoning. 51, 167-178.
buildParentList
,
buildRegressionTables
data(MathGrades) pl <- buildParentList(structMatrix(MathGrades$var),"Algebra") rt <- buildRegressions(MathGrades$var,MathGrades$mean,pl)
data(MathGrades) pl <- buildParentList(structMatrix(MathGrades$var),"Algebra") rt <- buildRegressions(MathGrades$var,MathGrades$mean,pl)
Takes a description of a normal graphical model as a series of regressions and a description of the corresponding discrete variables and builds a series of conditional probability tables for the corresponding Bayesian network.
buildRegressionTables(regs, pvecs, mean = 0, std = 1)
buildRegressionTables(regs, pvecs, mean = 0, std = 1)
regs |
A list with names corresponding to the variables in the model giving a series of regression coefficients (see Details). |
pvecs |
A list with names corresponding to the variables in the model giving a series of probability vectors in order from highest state to lowest state (see Details). |
mean |
A vector of means for the continuous variables. |
std |
A vector of standard deviations for the continuous variables. |
The regs
argument should be a list whose names are the names of
the variables. Each element should have the following fields:
A vector of slopes for the regression with names equal to the
names of the parent variables. Note that if there are no parents,
this should be numeric(0)
.
The intercept from the regression.
The residual standard deviation from the regression.
The function buildRegressions()
creates an appropriate list.
The pvecs
should be a list whose names are the names of the
variables. Each element should be a named vector of probabilities in
order from the Highest to the Lowest state, e.g.
c(High=.2,Med=.5,Low=.3)
.
The values mean
and std
should either be scalars or
vectors of the same length as the number of elements in regs
and pvecs
. If vectors, they should have names corresponding to
the variable names. Note that the order of the elements does not need
to be the same in all four arguments, but that the names of all four
arguments need to be identical (unless mean
or std
are
given as scalars, in which case they will be appropriately
replicated.)
A list of conditional probability tables whose names correspond to
regs
. Each conditional probability table is expressed as a
data frame whose columns correspond to either parent variables or
states of the child variable and whose rows correspond to
configurations of the parent variable.
Variables are given from highest to lowest state, for
example ‘High’, ‘Medium’, ‘Low’. This is the
order expected by StatShop. Note that pvecToCutpoints
expects probability vectors in the opposite order.
Russell Almond
Almond, R. G. (2010). ‘I can name that Bayesian network in two matrixes.’ International Journal of Approximate Reasoning. 51, 167-178.
Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley.
data(MathGrades) pl <- buildParentList(structMatrix(MathGrades$var),"Algebra") rt <- buildRegressions(MathGrades$var,MathGrades$means,pl) tabs <- buildRegressionTables(rt, MathGrades$pvecs, MathGrades$means, sqrt(diag(MathGrades$var)))
data(MathGrades) pl <- buildParentList(structMatrix(MathGrades$var),"Algebra") rt <- buildRegressions(MathGrades$var,MathGrades$means,pl) tabs <- buildRegressionTables(rt, MathGrades$pvecs, MathGrades$means, sqrt(diag(MathGrades$var)))
The DiBello–Dirichlet model creates a hyper-Dirichlet prior
distribution by interpolating between an masterProfile
and a
noviceProfile
. This function builds the hyper-Dirichlet
parameter table, or with normalization, the conditional probability
table for this distribution type.
calcDDTable(skillLevels, obsLevels, skillWeights, masterProfile, noviceProfile = 0.5, rule = "Compensatory") calcDDFrame(skillLevels, obsLevels, skillWeights, masterProfile, noviceProfile = 0.5, rule = "Compensatory")
calcDDTable(skillLevels, obsLevels, skillWeights, masterProfile, noviceProfile = 0.5, rule = "Compensatory") calcDDFrame(skillLevels, obsLevels, skillWeights, masterProfile, noviceProfile = 0.5, rule = "Compensatory")
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. As a special case, can also be a vector of integers. |
skillWeights |
A numeric vector of the same length as
|
masterProfile |
The Dirichlet prior for “experts” (see
Details). Its length should match |
noviceProfile |
The Dirichlet prior for “novices” (see
Details). Its length should match |
rule |
Function for computing effective theta (see Details). |
Assume for the moment that there are two possible skill profiles:
“expert” and “novice”. This model presumes a
conditional probability table for the outcome given skill profile with
two rows each of which is an independent categorical distribution.
The natural conjugate prior is an independent Dirichlet distribution
for each row. The parameters for this distribution are given in the
masterProfile
and noviceProfile
arguments.
If there is more than one parent variable or if the parent variable has more than one state, the situation becomes muddier. The “expert” state is obviously the one with all the variables at the highest levels and the “novice” is the one with all variables at the lowest levels. If we can assign an integer between 0 and 1 to each of the intermediate states, then we can interpolate between them to produce Dirichlet priors for each row.
This distribution type uses the DiBello effective theta technique to
come up with the interpolation. Each parent variable state is
assigned a ‘theta’ value using the
effectiveThetas
function to assign a numeric value to
each one. These are then combined using the function rule
in
the rule argument. The resulting theta values are then scaled to a
range of 0–1. The prior for that row is a weighted combination of
the masterProfile
and noviceProfile
.
The combination of the individual effective theta values into a joint
value for effective theta is done by the function reference by
rule
. This should be a function of three arguments:
theta
— the vector of effective theta values for each parent,
alphas
— the vector of discrimination parameters, and
beta
— a scalar value giving the difficulty. The initial
distribution supplies three functions appropriate for use with
calcDSTable
: Compensatory
,
Conjunctive
, and Disjunctive
. Note that
the beta
argument is effectively ignored because of the later
scaling of the output.
Normally obslevel
should be a character vector giving state
names. However, in the special case of state names which are integer
values, R will “helpfully” convert these to legal variable
names by prepending a letter. This causes other functions which rely
on the names()
of the result being the state names to break.
As a special case, if the value of obsLevel
is of type numeric,
then calcDSFrame()
will make sure that the correct values are
preserved.
For calcDDTable
, a matrix whose rows correspond configurations
of the parent variable states (skillLevels
) and whose columns
correspond to obsLevels
. Each row of the table is the
parameters of a Dirichlet distribution, so the whole matrix is the
parameters for a hyper-Dirichlet distribution. The order of the
parent rows is the same as is produced by applying expand.grid
to skillLevels
.
For calcDDFrame
a CPF
, a data frame with additional columns
corresponding to the entries in skillLevels
giving the parent
value for each row.
Unlike calcDSTable
, there is not a corresponding
DiBello-Dirichlet distribution support in StatShop. This function is
used to model the parameters of an unconstrained hyper-Dirichlet
distribution.
This was originally designed for use in Situational Judgment Tests where experts might not agree on the “key”.
Note: Zeros in the masterProfile
indicate responses that a
master would never make. They will result in zero probability
of mastery for any response which yields that outcome.
Almond, R.G. and Roberts, R. (Draft) Bayesian Scoring for Situational Judgment Tests. Unpublished white paper.
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001 Jaakkola and Richardson (eds)., Morgan Kaufmann, 137–143.
effectiveThetas
,Compensatory
,
calcDNTable
, calcDSTable
,
expand.grid
skill1l <- c("High","Medium","Low") skill2l <- c("High","Low") option5L <- c("A","B","C","D","E") ## Expert responses eProfile <- c(A=7,B=15,C=3,D=0,E=0) paramT <- calcDDTable(list(S1=skill1l,S2=skill2l), option5L, c(S1=2,S2=1), masterProfile=eProfile+0.5) paramF <- calcDDFrame(list(S1=skill1l,S2=skill2l), option5L, c(S1=2,S2=1), masterProfile=5*eProfile+0.5, noviceProfile=2)
skill1l <- c("High","Medium","Low") skill2l <- c("High","Low") option5L <- c("A","B","C","D","E") ## Expert responses eProfile <- c(A=7,B=15,C=3,D=0,E=0) paramT <- calcDDTable(list(S1=skill1l,S2=skill2l), option5L, c(S1=2,S2=1), masterProfile=eProfile+0.5) paramF <- calcDDFrame(list(S1=skill1l,S2=skill2l), option5L, c(S1=2,S2=1), masterProfile=5*eProfile+0.5, noviceProfile=2)
The calcDNTable
function takes a description of input and
output variables for a Bayesian network distribution and a collection
of IRT-like parameter (discrimination, difficulty) and calculates a
conditional probability table using the DiBello–Normal distribution
(see Details). The calcDNFrame
function
returns the value as a data frame with labels for the parent states.
calcDNTable(skillLevels, obsLevels, lnAlphas, beta, std, rule = "Compensatory") calcDNFrame(skillLevels, obsLevels, lnAlphas, beta, std, rule = "Compensatory")
calcDNTable(skillLevels, obsLevels, lnAlphas, beta, std, rule = "Compensatory") calcDNFrame(skillLevels, obsLevels, lnAlphas, beta, std, rule = "Compensatory")
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. Can also be a vector of integers (see Details). |
lnAlphas |
A vector of log slope parameters. Its length should
be either 1 or the length of |
beta |
A vector of difficulty (-intercept) parameters. Its length
should be either 1 or the length of |
std |
The log of the residual standard deviation (see Details). |
rule |
Function for computing effective theta (see Details). |
The DiBello–Normal distribution (Almond et al, 2015) is a variant of the DiBello–Samejima distribution (Almond et al, 2001) for creating conditional probability tables for Bayesian networks which uses a regression-like (probit) link function in place of Samejima's graded response link function. The basic procedure unfolds in three steps.
Each level of each input variable is assigned an “effective theta” value — a normal value to be used in calculations.
For each possible skill profile (combination of states of the parent variables) the effective thetas are combined using a combination function. This produces an “effective theta” for that skill profile.
Taking the effective theta value as the mean, the probability that the examinee will fall into each category.
The parent (conditioning) variables are described by the
skillLevels
argument which should provide for each parent
variable in order the names of the states ranked from highest to
lowest value. These are calculated through the function
effectiveThetas
which gives equally spaced points on the
probability curve. Note that for the DiBello-Normal distribution, Step 1 and
Step 3 are inverses of each other (except for rounding error).
The combination of the individual effective theta values into a joint
value for effective theta is done by the function reference by
rule
. This should be a function of three arguments:
theta
— the vector of effective theta values for each parent,
alphas
— the vector of discrimination parameters, and
beta
— a scalar value giving the difficulty. The initial
distribution supplies five functions appropriate for use with
calcDSTable
: Compensatory
,
Conjunctive
, and Disjunctive
,
OffsetConjunctive
, and OffsetDisjunctive
.
The last two have a slightly different parameterization: alpha
is assumed to be a scalar and betas
parameter is vector
valued. Note that the discrimination and difficulty parameters are
built into the structure function and not the probit curve.
The effective theta values are converted to probabilities by assuming
that the categories for the consequence variable (obsLevels
)
are generated by taking equal probability divisions of a standard
normal random variable. However, a person with a given pattern of
condition variables is drawn from a population with mean at effective
theta and standard deviation of exp(std)
. The returned numbers
are the probabilities of being in each category.
Normally obslevel
should be a character vector giving state
names. However, in the special case of state names which are integer
values, R will “helpfully” convert these to legal variable
names by prepending a letter. This causes other functions which rely
on the names()
of the result being the state names to break.
As a special case, if the value of obsLevel
is of type numeric,
then calcDNFrame()
will make sure that the correct values are
preserved.
For calcDNTable
, a matrix whose rows correspond configurations
of the parent variable states (skillLevels
) and whose columns
correspond to obsLevels
. Each row of the table is a
probability distribution, so the whole matrix is a conditional
probability table. The order of the parent rows is the same as is
produced by applying expand.grid
to skillLevels
.
For calcDNFrame
a CPF
, a data frame with additional columns
corresponding to the entries in skillLevels
giving the parent
value for each row.
This distribution class was developed primarily for modeling
relationships among proficiency variables. For models for
observables, see calcDSTable
.
This function has largely been superceeded by calls to calcDPCTable
with normalLink
as the link function.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001 Jaakkola and Richardson (eds)., Morgan Kaufmann, 137–143.
effectiveThetas
,Compensatory
,
OffsetConjunctive
,eThetaFrame
,
calcDSTable
, calcDNllike
,
calcDPCTable
,
expand.grid
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") skill3l <- c("Advanced","Proficient","Basic","Developing") cptSkill3 <- calcDNTable(list(S1=skill1l,S2=skill2l),skill3l, log(c(S1=1,S2=.75)),1.0,log(0.5), rule="Compensatory") cpfSkill3 <- calcDNFrame(list(S1=skill1l,S2=skill2l),skill3l, log(c(S1=1,S2=.75)),1.0,log(0.5), rule="Compensatory")
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") skill3l <- c("Advanced","Proficient","Basic","Developing") cptSkill3 <- calcDNTable(list(S1=skill1l,S2=skill2l),skill3l, log(c(S1=1,S2=.75)),1.0,log(0.5), rule="Compensatory") cpfSkill3 <- calcDNFrame(list(S1=skill1l,S2=skill2l),skill3l, log(c(S1=1,S2=.75)),1.0,log(0.5), rule="Compensatory")
The calcDPCTable
function takes a description of input and
output variables for a Bayesian network distribution and a collection
of IRT-like parameter (discrimination, difficulty) and calculates a
conditional probability table using the discrete partial credit distribution
(see Details). The calcDPCFrame
function
returns the value as a data frame with labels for the parent states.
calcDPCTable(skillLevels, obsLevels, lnAlphas, betas, rules = "Compensatory", link="partialCredit", linkScale=NULL, Q=TRUE, tvals=lapply(skillLevels, function (sl) effectiveThetas(length(sl)))) calcDPCFrame(skillLevels, obsLevels, lnAlphas, betas, rules = "Compensatory", link="partialCredit", linkScale=NULL, Q=TRUE, tvals=lapply(skillLevels, function (sl) effectiveThetas(length(sl))))
calcDPCTable(skillLevels, obsLevels, lnAlphas, betas, rules = "Compensatory", link="partialCredit", linkScale=NULL, Q=TRUE, tvals=lapply(skillLevels, function (sl) effectiveThetas(length(sl)))) calcDPCFrame(skillLevels, obsLevels, lnAlphas, betas, rules = "Compensatory", link="partialCredit", linkScale=NULL, Q=TRUE, tvals=lapply(skillLevels, function (sl) effectiveThetas(length(sl))))
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. As a special case, can also be a vector of integers. |
lnAlphas |
A list of vectors of log slope parameters. Its
length should be 1 or |
betas |
A list of vectors of difficulty (-intercept) parameters. Its
length should be 1 or |
rules |
A list of functions for computing effective theta (see
Details). Its length should be |
link |
The function that converts a table of effective thetas to probabilities |
linkScale |
An optional scale parameter for the |
Q |
This should be a Q matrix indicating which parent variables are relevant for which state transitions. It should be a number of states minus one by number of parents logical matrix. As a special case, if all variable are used for all levels, then it can be a scalar value. |
tvals |
A list of the same length as |
The discrete graded response model is a generalization of the
DiBello–Samejima mechanism for creating conditional
probability tables for Bayesian network models using IRT-like
parameters (calcDSTable
). The basic procedure unfolds
in three steps.
Each level of each input variable is assigned an “effective theta” value — a normal value to be used in calculations.
For each possible skill profile (combination of states of
the parent variables) the effective thetas are combined using a
one of the rule
functions. This produces an
“effective theta” for that skill profile.
The effective theta table is input into the link
function to produce a probability
distribution over the states of the outcome variables.
The parent (conditioning) variables are described by the
skillLevels
argument which should provide for each parent
variable in order the names of the states ranked from highest to
lowest value. The default implementation uses the function
effectiveThetas
to calculate equally spaced points on
the normal curve. This can be overridden by supplying a tvals
argument. This should be a list of the same length as
skillLevels
with each element having the same length as the
corresponding element of skillLevels
.
The tvals
(either default or user supplied) are used to create
a table of rows with values ,
corresponding to all possible combinations of the parent variables
(using
expand.grid
).
Let be the child variable of the distribution, and assume that
it can take on
possible states labeled
through
in increasing order. (Note: that
calcDPCTable
assumes variable states are ordered the other direction: from highest
to lowest.) For each state but the lowest state (the last one in the
input order) defines a combination rule
. Applying these
functions to the rows of the table produces a table of effective
thetas for each configuration of the parent variables and each child
state except for the lowest. (The metaphor is this theta represents
the “ability level” required to reach that output state.)
Note that the s do not need to have the same
parameters or even the same functional form. The argument
rules
should contain a list of the names of the combination
functions, the first one corresponding to , and so
forth in descending order. As a special case, if
rules
has only
one element, than it is used for all of the transitions. Similarly,
the lnAlphas
and betas
should also be lists of the
parameters of the combination functions corresponding to the
transitions between the levels. The betas[[m]]
represent
difficulties (negative intercepts) and the exp(lnAlphas[[m]])
represent slopes for the transition to level (following the
highest to lowest order). Again if these lists have length one, the
same value is used for all transitions.
The length of the elements of lnAlphas
and betas
is
determined by the specific choice of combination function. The
functions Compensatory
, Conjunctive
, and
Disjunctive
all assume that there will be one
lnAlpha
for each parent variable, but a single beta
.
The functions OffsetConjunctive
, and
OffsetDisjunctive
both assume that there will be one
beta
for each parent variable, but a single lnAlpha
.
The code link
function is then applied to the table of
effective theta values to produce a conditional probability
distribution. Two link functions are currently supported:
partialCredit
is based on the generalized partial credit
model (Muraki, 1992), gradedResponse
is a modified
version of the graded response model (Samejima, 1969). (The
modification corrects for problems when the curves cross.) A third
planned link function is based on a normal error model, this will
require the extra linkScale
parameter.
The Q
matrix is used in situations where some of the parent
variables are not relevant for one or more parent transitions. If
parent k is relevant for the transition between state m+1 and m
(remember that states are coded from highest to lowest) then
Q[m,k]
should be TRUE
. In particular,
eTheta[,Q[m,]]
is passed to the combination rule, not all of
theta. If there are false entries in Q
the corresponding sets
of alphas and betas need to have the correct length. Generally
speaking, Q
matrixes with FALSE
entries are not
appropriate with the gradedResponse
link. As a special
case if Q=TRUE
, then all parent variables are used for all
state transitions.
Normally obslevel
should be a character vector giving state
names. However, in the special case of state names which are integer
values, R will “helpfully” convert these to legal variable
names by prepending a letter. This causes other functions which rely
on the names()
of the result being the state names to break.
As a special case, if the value of obsLevel
is of type numeric,
then calcDSFrame()
will make sure that the correct values are
preserved.
For calcDPCTable
, a matrix whose rows correspond configurations
of the parent variable states (skillLevels
) and whose columns
correspond to obsLevels
. Each row of the table is a
probability distribution, so the whole matrix is a conditional
probability table. The order of the parent rows is the same as is
produced by applying expand.grid
to skillLevels
.
For calcDPCFrame
a CPF
, a data frame with additional columns
corresponding to the entries in skillLevels
giving the parent
value for each row.
The framework set up by this function is completely expandable. The
link
and the elements of rules
can be any value that is
suitable for the first argument of do.call
.
Elements of rules
are called with the expression
do.call(rules[[kk]],list(thetas,exp(lnAlphas[[kk]]),betas[[kk]]))
where thetas
is the matrix of effective theta values produced
in the first step of the algorithm, and the return function should be
a vector of effective thetas, one for each row of thetas
.
The link
function is called with the expression
do.call(link,list(et,linkScale,obsLevels))
where et
is
the matrix of effective thetas produced in the second step. It should
return a conditional probability table with the same number of rows
and one more column than et
. All of the rows should sum to 1.0.
Russell Almond
Almond, R.G. (2015). An IRT-based Parameterization for Conditional Probability Tables. Paper submitted to the 2015 Bayesian Application Workshop at the Uncertainty in Artificial Intelligence conference.
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Muraki, E. (1992). A Generalized Partial Credit Model: Application of an EM Algorithm. Applied Psychological Measurement, 16, 159-176. DOI: 10.1177/014662169201600206
Samejima, F. (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph No. 17, 34, (No. 4, Part 2).
effectiveThetas
,Compensatory
,
OffsetConjunctive
,eThetaFrame
,
calcDNTable
, calcDSTable
,
expand.grid
, gradedResponse
,
partialCredit
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Simple binary model, these three should be the same. cptCorrect <- calcDPCTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Compensatory", link="partialCredit") cptCorrect2 <- calcDPCTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Compensatory", link="gradedResponse") cptCorrect1 <- calcDSTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Compensatory") stopifnot (all (abs(cptCorrect2-cptCorrect1) <.001)) stopifnot (all (abs(cptCorrect-cptCorrect1) <.001)) ## Conjunctive uses multiple betas, not multiple alphas. cptConj <- calcDPCTable(list(S1=skill1l,S2=skill2l),correctL, log(1),c(S1=0.5,S2=.7),rule="OffsetConjunctive") ## Test for no parent case cptTheta <- calcDPCTable(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) ## Simple model, Skill 1 needed for step 1, Skill 2 for Step 2. cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") ##Variant using Q-matrix cptPC1a <- calcDPCTable(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0),partial=c(S2=0)), Q=matrix(c(TRUE,FALSE,FALSE,TRUE),2,2), rule="OffsetDisjunctive") stopifnot(all(abs(as.vector(numericPart(cptPC1))-as.vector(cptPC1a))<.0001)) ## Complex model, different rules for different levels cptPC2 <- calcDPCTable(list(S1=skill1l,S2=skill2l),pcreditL, list(full=log(1),partial=log(c(S1=1,S2=.75))), betas=list(full=c(0,999),partial=1.0), rule=list("OffsetDisjunctive","Compensatory")) ## Graded Response Model, typically uses different difficulties cptGraded <- calcDPCTable(list(S1=skill1l),gradeL, log(1),betas=list(A=2,B=1,C=0,D=-1), rule="Compensatory",link="gradedResponse") ## Partial credit link is somewhat different cptPC5 <- calcDPCTable(list(S1=skill1l),gradeL, log(1),betas=list(A=2,B=1,C=0,D=-1), rule="Compensatory",link="partialCredit") cptPC5a <- calcDPCTable(list(S1=skill1l),gradeL, log(1),betas=1, rule="Compensatory",link="partialCredit") ## Need to be careful when using different slopes (or non-increasing ## difficulties) with graded response link as curves may cross. cptCross <- calcDPCTable(list(S1=skill1l),pcreditL, log(1),betas=list(full=-1,partial=1), rule="Compensatory",link="gradedResponse") stopifnot (all(abs(cptCross[,"Partial"])<.001))
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Simple binary model, these three should be the same. cptCorrect <- calcDPCTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Compensatory", link="partialCredit") cptCorrect2 <- calcDPCTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Compensatory", link="gradedResponse") cptCorrect1 <- calcDSTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Compensatory") stopifnot (all (abs(cptCorrect2-cptCorrect1) <.001)) stopifnot (all (abs(cptCorrect-cptCorrect1) <.001)) ## Conjunctive uses multiple betas, not multiple alphas. cptConj <- calcDPCTable(list(S1=skill1l,S2=skill2l),correctL, log(1),c(S1=0.5,S2=.7),rule="OffsetConjunctive") ## Test for no parent case cptTheta <- calcDPCTable(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) ## Simple model, Skill 1 needed for step 1, Skill 2 for Step 2. cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") ##Variant using Q-matrix cptPC1a <- calcDPCTable(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0),partial=c(S2=0)), Q=matrix(c(TRUE,FALSE,FALSE,TRUE),2,2), rule="OffsetDisjunctive") stopifnot(all(abs(as.vector(numericPart(cptPC1))-as.vector(cptPC1a))<.0001)) ## Complex model, different rules for different levels cptPC2 <- calcDPCTable(list(S1=skill1l,S2=skill2l),pcreditL, list(full=log(1),partial=log(c(S1=1,S2=.75))), betas=list(full=c(0,999),partial=1.0), rule=list("OffsetDisjunctive","Compensatory")) ## Graded Response Model, typically uses different difficulties cptGraded <- calcDPCTable(list(S1=skill1l),gradeL, log(1),betas=list(A=2,B=1,C=0,D=-1), rule="Compensatory",link="gradedResponse") ## Partial credit link is somewhat different cptPC5 <- calcDPCTable(list(S1=skill1l),gradeL, log(1),betas=list(A=2,B=1,C=0,D=-1), rule="Compensatory",link="partialCredit") cptPC5a <- calcDPCTable(list(S1=skill1l),gradeL, log(1),betas=1, rule="Compensatory",link="partialCredit") ## Need to be careful when using different slopes (or non-increasing ## difficulties) with graded response link as curves may cross. cptCross <- calcDPCTable(list(S1=skill1l),pcreditL, log(1),betas=list(full=-1,partial=1), rule="Compensatory",link="gradedResponse") stopifnot (all(abs(cptCross[,"Partial"])<.001))
These functions take data
which represent draws from a
categorical data with the given DiBello–Samejima distribution and
returns the log-likelihood of the given data.
calcDSllike(data, parents, skillLevels, child, obsLevels, lnAlphas, beta, dinc = 0, rule = "Compensatory") calcDNllike(data, parents, skillLevels, child, obsLevels, lnAlphas, beta, std, rule = "Compensatory")
calcDSllike(data, parents, skillLevels, child, obsLevels, lnAlphas, beta, dinc = 0, rule = "Compensatory") calcDNllike(data, parents, skillLevels, child, obsLevels, lnAlphas, beta, std, rule = "Compensatory")
data |
A data frame whose columns contain variables corresponding
to |
parents |
A vector of names for the columns in |
child |
The name of the child variable, should refer to a column
in |
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. |
lnAlphas |
A vector of log slope parameters. Its length should
be either 1 or the length of |
beta |
A vector of difficulty (-intercept) parameters. Its length
should be either 1 or the length of |
dinc |
Vector of difficulty increment parameters (see
|
rule |
Function for computing effective theta (see
|
std |
The log of the residual standard deviation (see Details). |
This function assumes that the observed data are independent draws
from a Bayesian network. This function calculates the log-likelihood
of a single conditional probability table. First, it calculates a
table of counts corresponding states of the parent and child variables
using the function dataTable
. Next it calculates the
conditional probability for each cell using the function
calcDSTable
or calcDNTable
.
It then calculates the log-likelihood as the sum of
where this value is set to zero if
is zero (this allows cells with zero probability as
long as the count is also zero).
A real giving the log-likelihood of the observed data.
This function is primarily about testing the log likelihood calculations used internally in StatShop.
This function is largely superceeded by the likelihood calculation internal
to mapDPC
. In particular, if probs
is the result of the
call to calcDPCTable
, and postTable
is the expected
contingency table (e.g., the output of expTable
). Then the
log likelihood is
-2*sum(as.vector(postTable)*as.vector(log(probs)))
.
Russell Almond
http://comet.research.ets.org/~ralmond/StatShop
dataTable
, calcDSTable
,
Compensatory
,OffsetConjunctive
,
eThetaFrame
, calcDNTable
skill1l <- c("High","Medium","Low") skill3l <- c("High","Better","Medium","Worse","Low") correctL <- c("Correct","Incorrect") x <- read.csv(system.file("testFiles", "randomPinned100.csv", package="CPTtools"), header=TRUE, as.is=TRUE) x[,"Skill1"] <- ordered(x[,"Skill1"],skill1l) x[,"Skill3"] <- ordered(x[,"Skill3"],skill3l) x[,"Comp.Correct"] <- ordered(x[,"Comp.Correct"],correctL) like <- calcDSllike(x,c("Skill1","Skill3"), list(Skill1=skill1l, Skill3=skill3l), "Comp.Correct", correctL, log(c(0.45,-0.4)),-1.9,rule="Compensatory")
skill1l <- c("High","Medium","Low") skill3l <- c("High","Better","Medium","Worse","Low") correctL <- c("Correct","Incorrect") x <- read.csv(system.file("testFiles", "randomPinned100.csv", package="CPTtools"), header=TRUE, as.is=TRUE) x[,"Skill1"] <- ordered(x[,"Skill1"],skill1l) x[,"Skill3"] <- ordered(x[,"Skill3"],skill3l) x[,"Comp.Correct"] <- ordered(x[,"Comp.Correct"],correctL) like <- calcDSllike(x,c("Skill1","Skill3"), list(Skill1=skill1l, Skill3=skill3l), "Comp.Correct", correctL, log(c(0.45,-0.4)),-1.9,rule="Compensatory")
The calcDSTable
function takes a description of input and
output variables for a Bayesian network distribution and a collection
of IRT-like parameter (discrimination, difficulty) and calculates a
conditional probability table using the DiBello-Samejima distribution
(see Details). The calcDSFrame
function
returns the value as a data frame with labels for the parent states.
calcDSTable(skillLevels, obsLevels, lnAlphas, beta, dinc = 0, rule = "Compensatory") calcDSFrame(skillLevels, obsLevels, lnAlphas, beta, dinc = 0, rule = "Compensatory")
calcDSTable(skillLevels, obsLevels, lnAlphas, beta, dinc = 0, rule = "Compensatory") calcDSFrame(skillLevels, obsLevels, lnAlphas, beta, dinc = 0, rule = "Compensatory")
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. As a special case, can also be a vector of integers. |
lnAlphas |
A vector of log slope parameters. Its length should
be either 1 or the length of |
beta |
A vector of difficulty (-intercept) parameters. Its length
should be either 1 or the length of |
dinc |
Vector of difficulty increment parameters (see Details). |
rule |
Function for computing effective theta (see Details). |
The DiBello–Samejima model is a mechanism for creating conditional probability tables for Bayesian network models using IRT-like parameters. The basic procedure unfolds in three steps.
Each level of each input variable is assigned an “effective theta” value — a normal value to be used in calculations.
For each possible skill profile (combination of states of the parent variables) the effective thetas are combined using a combination function. This produces an “effective theta” for that skill profile.
The effective theta is input into Samejima's graded-response model to produce a probability distribution over the states of the outcome variables.
The parent (conditioning) variables are described by the
skillLevels
argument which should provide for each parent
variable in order the names of the states ranked from highest to
lowest value. The original method (Almond et al., 2001)
used equally spaced points on the interval for the
effective thetas of the parent variables. The current implementation
uses the function
effectiveThetas
to calculate equally
spaced points on the normal curve.
The combination of the individual effective theta values into a joint
value for effective theta is done by the function reference by
rule
. This should be a function of three arguments:
theta
— the vector of effective theta values for each parent,
alphas
— the vector of discrimination parameters, and
beta
— a scalar value giving the difficulty. The initial
distribution supplies five functions appropriate for use with
calcDSTable
: Compensatory
,
Conjunctive
, and Disjunctive
,
OffsetConjunctive
, and OffsetDisjunctive
.
The last two have a slightly different parameterization: alpha
is assumed to be a scalar and betas
parameter is vector
valued. Note that the discrimination and difficulty parameters are
built into the structure function and not the IRT curve.
The Samejima graded response link function describes a series of curves:
for , where
(a scale factor to make the logistic
curve match more closely with the probit curve). The probability for
any given category is then the difference between two adjacent
logistic curves. Note that because a difficulty parameter was
included in the structure function, we have the further constraint
that
.
To remove the parameter restriction we work with the difference
between the parameters: . The value of
is set at
-sum(dinc)/2
to center the d values. Thus the
dinc
parameter (which is required only if
length(obsLevels)>2
) should be of length
length(obsLevels)-2
. The first value is the difference between
the d values for the two highest states, and so forth.
Normally obslevel
should be a character vector giving state
names. However, in the special case of state names which are integer
values, R will “helpfully” convert these to legal variable
names by prepending a letter. This causes other functions which rely
on the names()
of the result being the state names to break.
As a special case, if the value of obsLevel
is of type numeric,
then calcDSFrame()
will make sure that the correct values are
preserved.
For calcDSTable
, a matrix whose rows correspond configurations
of the parent variable states (skillLevels
) and whose columns
correspond to obsLevels
. Each row of the table is a
probability distribution, so the whole matrix is a conditional
probability table. The order of the parent rows is the same as is
produced by applying expand.grid
to skillLevels
.
For calcDSFrame
a CPF
, a data frame with additional columns
corresponding to the entries in skillLevels
giving the parent
value for each row.
This distribution class is not suitable for modeling relationship
among proficiency variable, primarily because the normal mapping used
in the effective theta calculation and the Samejima graded response
models are not inverses. For those model, the function
calcDNTable
, which uses a probit link function, is
recommended instead.
This function has largely been superceeded by calls to calcDPCTable
with gradedResponse
as the link function.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001 Jaakkola and Richardson (eds)., Morgan Kaufmann, 137–143.
Samejima, F. (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph No. 17, 34, (No. 4, Part 2).
effectiveThetas
,Compensatory
,
OffsetConjunctive
,eThetaFrame
,
calcDNTable
, calcDSllike
,
calcDPCTable
, expand.grid
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") gradeL <- c("A","B","C","D","E") cptCorrect <- calcDSTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Conjunctive") cpfCorrect <- calcDSFrame(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Conjunctive") cptGraded <- calcDSTable(list(S1=skill1l),gradeL, 0.0, 0.0, dinc=c(.3,.4,.3))
## Set up variables skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") gradeL <- c("A","B","C","D","E") cptCorrect <- calcDSTable(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Conjunctive") cpfCorrect <- calcDSFrame(list(S1=skill1l,S2=skill2l),correctL, log(c(S1=1,S2=.75)),1.0,rule="Conjunctive") cptGraded <- calcDSTable(list(S1=skill1l),gradeL, 0.0, 0.0, dinc=c(.3,.4,.3))
Calculates the conditional probability table for a noisy-and distribution. This follows a logical model where all inputs must be true for the output to be true; however, some "noise" is allowed that produces random deviations from the pure logic. The noisy-min is a generalization in which all variables are ordered, and the weakest of the parent variables drives the conditional probabilities of the child variable.
calcNoisyAndTable(skillLevels, obsLevels = c("True", "False"), bypass = rep(0, length(skillLevels)), noSlip=1, thresholds = sapply(skillLevels, function(states) states[1])) calcNoisyAndFrame(skillLevels, obsLevels = c("True", "False"), bypass = rep(0, length(skillLevels)), noSlip=1, thresholds = sapply(skillLevels, function(states) states[1]))
calcNoisyAndTable(skillLevels, obsLevels = c("True", "False"), bypass = rep(0, length(skillLevels)), noSlip=1, thresholds = sapply(skillLevels, function(states) states[1])) calcNoisyAndFrame(skillLevels, obsLevels = c("True", "False"), bypass = rep(0, length(skillLevels)), noSlip=1, thresholds = sapply(skillLevels, function(states) states[1]))
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. As a special case, can also be a vector of integers. Its length should be 2, and the first value is considered to be logically equivalent to "true". |
noSlip |
A scalar value between 0 and 1. This represents the probability that the output will be true when all of the inputs are true (e.g., 1 - the probability that an examinee will make a careless error). |
bypass |
A vector of the same length as |
thresholds |
If the input variables have more than two states, values that are equal to or higher than this threshold are considered true. It is assumed that the states of the variables are ordered from highest to lowest. |
The noisy-and distribution assumes that both the input and output
variables are binary. Basically, the output should be true if all of
the inputs are true. Let if the
th input is
true, and let
be the
bypass
parameter corresponding
to the th input variable. (If the
's represent a
skill, then
represents the probability that an examinee who
lacks that skill will bypass the need for that skill in solving the
problem.) Then the probability of the true state for the output
variable will be:
where (the
noSlip
parameter) is the probability that
the output will be true even when all of the inputs are true.
It is assumed that all variables are ordered from highest to lowest
state, so that the first state corresponds to "true" the others to
false. If the input variable has more than two states, then it can be
reduced to a binary variable by using the threshold
argument.
Any values which are equal to or higher than the threshold
for
that variable are assumed to be true. (In this case, higher means
closer to the the beginning of the list of possible values.)
The noisy-min is a generalization
For calcNoisyAndTable
, a matrix whose rows correspond configurations
of the parent variable states (skillLevels
) and whose columns
correspond to obsLevels
. Each row of the table is a
probability distribution, so the whole matrix is a conditional
probability table. The order of the parent rows is the same as is
produced by applying expand.grid
to skillLevels
.
For calcNoisyAndFrame
a data frame with additional columns
corresponding to the entries in skillLevels
giving the parent
value for each row.
This is related to the DINA and NIDA models, but uses a slightly
different parameterization. In particular, if the noSlip
parameter is omitted, it is a noisy input deterministic and-gate
(NIDA), and if the bypass
parameters are omitted, it is similar
to a deterministic input noisy and-gate (DINA), except is lacks a
guessing parameter.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Pearl, J. (1988) Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
Diez, F. J. (1993) Parameter adjustment in Bayes networks. The generalized noisy OR-gate. In Heckerman and Mamdani (eds) Uncertainty in Artificial Intelligence 93. Morgan Kaufmann. 99–105.
Srinivas, S. (1993) A generalization of the Noisy-Or model, the generalized noisy OR-gate. In Heckerman and Mamdani (eds) Uncertainty in Artificial Intelligence 93. Morgan Kaufmann. 208–215.
calcDSTable
, calcDNTable
,
calcDPCTable
, expand.grid
,
calcNoisyOrTable
## Logical and table and <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), c("Right","Wrong")) ## DINA, logical-and except that is allows for a small chance of slipping. dina <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), noSlip=.9) ##NIDA, logical-and except that inputs can randomly be bypassed nida <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), bypass=c(.3,.4)) ##Full Noisy And distribution noisyAnd <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), noSlip=.9,bypass=c(.3,.4)) thresh <- calcNoisyAndFrame(list(c("H","M","L"),c("H","M","L")), c("Right","Wrong"), threshold=c("M","H"))
## Logical and table and <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), c("Right","Wrong")) ## DINA, logical-and except that is allows for a small chance of slipping. dina <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), noSlip=.9) ##NIDA, logical-and except that inputs can randomly be bypassed nida <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), bypass=c(.3,.4)) ##Full Noisy And distribution noisyAnd <- calcNoisyAndFrame(list(c("True","False"),c("True","False")), noSlip=.9,bypass=c(.3,.4)) thresh <- calcNoisyAndFrame(list(c("H","M","L"),c("H","M","L")), c("Right","Wrong"), threshold=c("M","H"))
Calculates the conditional probability table for a noisy-and distribution. This follows a logical model where at least one inputs must be true for the output to be true; however, some "noise" is allowed that produces random deviations from the pure logic.
calcNoisyOrTable(skillLevels, obsLevels = c("True", "False"), suppression = rep(0, length(skillLevels)), noGuess = 1, thresholds = sapply(skillLevels, function(states) states[1])) calcNoisyOrFrame(skillLevels, obsLevels = c("True", "False"), suppression = rep(0, length(skillLevels)), noGuess = 1, thresholds = sapply(skillLevels, function(states) states[1]))
calcNoisyOrTable(skillLevels, obsLevels = c("True", "False"), suppression = rep(0, length(skillLevels)), noGuess = 1, thresholds = sapply(skillLevels, function(states) states[1])) calcNoisyOrFrame(skillLevels, obsLevels = c("True", "False"), suppression = rep(0, length(skillLevels)), noGuess = 1, thresholds = sapply(skillLevels, function(states) states[1]))
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. As a special case, can also be a vector of integers. Its length should be 2, and the first value is considered to be logically equivalent to "true". |
suppression |
A vector of the same length as |
noGuess |
A scalar value between 0 and 1. This represents the probability that the the output will be false even when all of the inputs are false (e.g., 1-guessing probability). |
thresholds |
If the input variables have more than two states, values that are equal to or higher than this threshold are considered true. It is assumed that the states of the variables are ordered from highest to lowest. |
The noisy-or distribution assumes that both the input and output
variables are binary. Basically, the output should be true if any of
the inputs are true. Let if the
th input is
true, and let
be the
suppression
parameter
corresponding to the th input variable. (If the
's
represent a skill, then
represents the probability that an
examinee who has that skill will fail to correctly apply it.)
Then the probability of the true state for the output
variable will be:
where (the
noGuess
parameter) is the probability that
the output will be false even when all of the inputs are false.
It is assumed that all variables are ordered from highest to lowest
state, so that the first state corresponds to "true" the others to
false. If the input variable has more than two states, then it can be
reduced to a binary variable by using the threshold
argument.
Any values which are equal to or higher than the threshold
for
that variable are assumed to be true. (In this case, higher means
closer to the the beginning of the list of possible values.)
For calcNoisyOrTable
, a matrix whose rows correspond configurations
of the parent variable states (skillLevels
) and whose columns
correspond to obsLevels
. Each row of the table is a
probability distribution, so the whole matrix is a conditional
probability table. The order of the parent rows is the same as is
produced by applying expand.grid
to skillLevels
.
For calcNoisyOrFrame
a data frame with additional columns
corresponding to the entries in skillLevels
giving the parent
value for each row.
This is related to the DINO and NIDO models, but uses a slightly
different parameterization. In particular, if the noSlip
parameter is omitted, it is a noisy input deterministic and-gate
(NIDO), and if the bypass
parameters are omitted, it is similar
to a deterministic input noisy and-gate (DINO), except is lacks a
slip parameter.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Pearl, J. (1988) Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
Diez, F. J. (1993) Parameter adjustment in Bayes networks. The generalized noisy OR-gate. In Heckerman and Mamdani (eds) Uncertainty in Artificial Intelligence 93. Morgan Kaufmann. 99–105.
Srinivas, S. (1993) A generalization of the Noisy-Or model, the generalized noisy OR-gate. In Heckerman and Mamdani (eds) Uncertainty in Artificial Intelligence 93. Morgan Kaufmann. 208–215.
calcDSTable
, calcDNTable
,
calcDPCTable
, expand.grid
,
calcNoisyOrTable
## Logical or table or <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), c("Right","Wrong")) ## DINO, logical-or except that is allows for a small chance of slipping. dino <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), noGuess=.9) ##NIDO, logical-or except that inputs can randomly be bypassed nido <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), suppression=c(.3,.4)) ##Full Noisy Or distribution noisyOr <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), noGuess=.9,suppression=c(.3,.4)) thresh <- calcNoisyOrFrame(list(c("H","M","L"),c("H","M","L")), c("Right","Wrong"), threshold=c("M","H"))
## Logical or table or <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), c("Right","Wrong")) ## DINO, logical-or except that is allows for a small chance of slipping. dino <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), noGuess=.9) ##NIDO, logical-or except that inputs can randomly be bypassed nido <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), suppression=c(.3,.4)) ##Full Noisy Or distribution noisyOr <- calcNoisyOrFrame(list(c("True","False"),c("True","False")), noGuess=.9,suppression=c(.3,.4)) thresh <- calcNoisyOrFrame(list(c("H","M","L"),c("H","M","L")), c("Right","Wrong"), threshold=c("M","H"))
This takes a colour specification, and produces an ordered series of colours by manipulating the saturate (and possibly value) of the color, leaving the hue constant. This produces a colour palate suitable for plotting ordered factors, which looks good on a colour display, but also reproduces well on a grayscale printer (or for persons with limited colour perception).
colorspread(col, steps, maxsat = FALSE, rampval = FALSE)
colorspread(col, steps, maxsat = FALSE, rampval = FALSE)
col |
A color in any format suitable as input to
|
steps |
A integer describing the number of colors to create. |
maxsat |
A logical value. If true, the final color in the series will have saturation 1, instead of whatever is appropriate for the input. |
rampval |
A logical value. If true, the value as well as the saturation of the color is ramped. |
The colour is converted to a RGB value using
col2rgb
and then to an HSV value using
rgb2hsv
. The saturation is then scaled into
steps
equal intervals. If requested, the value
is
scaled as well.
A character vectors of length steps
giving the colour palate
from lowest to highest intensity. This is suitable to passing to the
col
argument of most graphics functions.
Many of the built-in colours come with 4 intensity variants are meant to
work well together. In some cases an expression like
paste("firebrick",1:4,sep="")
may work better than
colorspread. To see the built-in colours, use the
colors
function.
Russell Almond
compareBars
, link{stackedBars}
barplot(rep(1,4),col=colorspread("slategray",4)) barplot(rep(1,4),col=colorspread("slategray",4,maxsat=TRUE)) barplot(rep(1,4),col=colorspread("violetred",4)) barplot(rep(1,4),col=colorspread("violetred",4,rampval=TRUE))
barplot(rep(1,4),col=colorspread("slategray",4)) barplot(rep(1,4),col=colorspread("slategray",4,maxsat=TRUE)) barplot(rep(1,4),col=colorspread("violetred",4)) barplot(rep(1,4),col=colorspread("violetred",4,rampval=TRUE))
This produces set of stacked bar charts grouped for comparison between
two groups. For example, if suppose that there is a set of
probabilities over a collection of proficiency variables measures both
before and after obtaining a certain piece of evidence. The
compareBars
function would produce stacked bar charts which
compare the prior and posterior probabilities for each variable.
compareBars(data1, data2, profindex, groupNames = c(deparse(data1), deparse(data2)), ..., ylim = c(min(offsets) - 0.25, max(1 + offsets)), cex.names = par("cex.axis"), digits = 2, legend.loc = c(0,1), legend.cex = par("cex"), col = par("col"), col1 = NULL, col2 = NULL, main = NULL, sub = NULL, xlab = NULL, ylab = NULL, rotlab = FALSE) compareBars2(data1, data2, profindex, groupNames=c("Prior","Post"), error.bars=2, scale=100, err.col="gray20", ..., ylim = NULL)
compareBars(data1, data2, profindex, groupNames = c(deparse(data1), deparse(data2)), ..., ylim = c(min(offsets) - 0.25, max(1 + offsets)), cex.names = par("cex.axis"), digits = 2, legend.loc = c(0,1), legend.cex = par("cex"), col = par("col"), col1 = NULL, col2 = NULL, main = NULL, sub = NULL, xlab = NULL, ylab = NULL, rotlab = FALSE) compareBars2(data1, data2, profindex, groupNames=c("Prior","Post"), error.bars=2, scale=100, err.col="gray20", ..., ylim = NULL)
data1 |
Data set with first (prior) values |
data2 |
Data set with second (posterior) values |
profindex |
Index of one of the proficiency levels which will be used as the baseline for the stacked bar charts. |
groupNames |
Names of the groups represented by |
... |
Other arguments to |
ylim |
Default limits for Y axis. |
cex.names |
Character magnification for names. |
digits |
Number of digits for overlaid numeric variables. |
legend.loc |
Location for legend, see
|
legend.cex |
Character magnification for legend. |
col |
The normal graphics |
col1 |
Color scale for the first data set. This should be a vector of colors equal to the number of groups. |
col2 |
Color scale for the second data set. This should be a vector of colors equal to the number of groups. |
main |
Character scalar giving main title (see
|
sub |
Character scalar giving sub title (see
|
xlab |
Character scalar giving x-axis label (see
|
ylab |
Character scalar giving x-axis label (see
|
rotlab |
If |
error.bars |
The number of standard errors for error bars. |
err.col |
The color for error bars. |
scale |
Scales data as probabilities ( |
Invisibly returns the x-co-ordinates of the bars.
The function compareBars2
is a somewhat experimental extension
to compareBars
which adds error bars to the posterior. The
result is not entirely satisfactory, and this function may change
with future releases.
Russell Almond
Almond, R. G., Shute, V. J., Underwood, J. S., and Zapata-Rivera, J.-D (2009). Bayesian Networks: A Teacher's View. International Journal of Approximate Reasoning. 50, 450-460.
stackedBars
, colorspread
,
buildFactorTab
, barplot
margins.prior <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) margins.post <- data.frame( Trouble=c(Novice=.03,Semester1=.15,Semester2=.39,Semseter3=.32,Semester4=.11), NDK=c(Novice=.00,Semester1=.03,Semester2=.28,Semseter3=.52,Semester4=.17), Model=c(Novice=.10,Semester1=.25,Semester2=.37,Semseter3=.23,Semester4=.05)) foo <- compareBars(margins.prior,margins.post,3,c("Prior","Post"), main="Margins before/after Medium Trouble Shooting Task", sub="Observables: cfgCor=Medium, logCor=High, logEff=Medium", legend.loc = "topright", cex.names=.75, col1=hsv(h=.1,s=.2*1:5-.1,alpha=1), col2=hsv(h=.6,s=.2*1:5-.1,alpha=1)) compareBars2(margins.prior,25*margins.post,3,c("Prior","Post"), main="Margins before/after Medium Trouble Shooting Task", sub="Observables: cfgCor=Medium, logCor=High, logEff=Medium", legend.loc = "topright", cex.names=.75, col1=hsv(h=.1,s=.2*1:5-.1,alpha=1), col2=hsv(h=.6,s=.2*1:5-.1,alpha=1))
margins.prior <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) margins.post <- data.frame( Trouble=c(Novice=.03,Semester1=.15,Semester2=.39,Semseter3=.32,Semester4=.11), NDK=c(Novice=.00,Semester1=.03,Semester2=.28,Semseter3=.52,Semester4=.17), Model=c(Novice=.10,Semester1=.25,Semester2=.37,Semseter3=.23,Semester4=.05)) foo <- compareBars(margins.prior,margins.post,3,c("Prior","Post"), main="Margins before/after Medium Trouble Shooting Task", sub="Observables: cfgCor=Medium, logCor=High, logEff=Medium", legend.loc = "topright", cex.names=.75, col1=hsv(h=.1,s=.2*1:5-.1,alpha=1), col2=hsv(h=.6,s=.2*1:5-.1,alpha=1)) compareBars2(margins.prior,25*margins.post,3,c("Prior","Post"), main="Margins before/after Medium Trouble Shooting Task", sub="Observables: cfgCor=Medium, logCor=High, logEff=Medium", legend.loc = "topright", cex.names=.75, col1=hsv(h=.1,s=.2*1:5-.1,alpha=1), col2=hsv(h=.6,s=.2*1:5-.1,alpha=1))
These functions take a vector of “effective theta” values for a collection of parent variables and calculates the effective theta value for the child variable according to the named rule. Used in calculating DiBello–Samejima and DiBello–Normal probability tables. These all have one slope parameter (alpha) per parent variable.
Compensatory(theta, alphas, beta) Conjunctive(theta, alphas, beta) Disjunctive(theta, alphas, beta)
Compensatory(theta, alphas, beta) Conjunctive(theta, alphas, beta) Disjunctive(theta, alphas, beta)
theta |
A matrix of effective theta values whose columns correspond to parent variables and whose rows correspond to possible skill profiles. |
alphas |
A vector of discrimination parameters in the same order
as the columns of |
beta |
A difficulty (-intercept) parameter. |
For Compensatory
, the combination function for each row is:
where is the number of parents. (The
is a variance stabilization parameter.)
For Conjunctive
, the combination function for each row is:
For Disjunctive
, the combination function for each row is:
A vector of normal deviates corresponding to the effective theta
value. Length is the number of rows of thetas
.
These functions expect the unlogged discrimination parameters, while
calcDSTable
expect the log of the discrimination parameters.
The rationale is that log discrimination is bound away from zero, and
hence a more natural space for MCMC algorithms. However, it is poor
programming design, as it is liable to catch the unwary.
These functions are meant to be used as structure functions in the DiBello–Samejima and DiBello–Normal models. Other structure functions are possible and can be excepted by those functions as long as they have the same signature as these functions.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001 Jaakkola and Richardson (eds)., Morgan Kaufmann, 137–143.
effectiveThetas
,calcDSTable
,
calcDNTable
, calcDPCTable
,
OffsetConjunctive
, eThetaFrame
thetas <- expand.grid(list(S1=seq(1,-1), S2 = seq(1, -1))) Compensatory(thetas, c(S1=1.25,S2=.75), 0.33) Conjunctive(thetas, c(S1=1.25,S2=.75), 0.33) Disjunctive(thetas, c(S1=1.25,S2=.75), 0.33) skill <- c("High","Medium","Low") eThetaFrame(list(S1=skill,S2=skill), c(S1=1.25,S2=.75), 0.33, "Compensatory") eThetaFrame(list(S1=skill,S2=skill), c(S1=1.25,S2=.75), 0.33, "Conjunctive") eThetaFrame(list(S1=skill,S2=skill), c(S1=1.25,S2=.75), 0.33, "Disjunctive")
thetas <- expand.grid(list(S1=seq(1,-1), S2 = seq(1, -1))) Compensatory(thetas, c(S1=1.25,S2=.75), 0.33) Conjunctive(thetas, c(S1=1.25,S2=.75), 0.33) Disjunctive(thetas, c(S1=1.25,S2=.75), 0.33) skill <- c("High","Medium","Low") eThetaFrame(list(S1=skill,S2=skill), c(S1=1.25,S2=.75), 0.33, "Compensatory") eThetaFrame(list(S1=skill,S2=skill), c(S1=1.25,S2=.75), 0.33, "Conjunctive") eThetaFrame(list(S1=skill,S2=skill), c(S1=1.25,S2=.75), 0.33, "Disjunctive")
A conditional probability table for a node can be represented as a
array with the first dimensions representing the parent
variables and the last dimension representing the
states of the node. Given a set of values for the parent variables,
the values in the last dimension contain the conditional probabilities
corresponding conditional probabilities. A
CPA
is a special
array
object which represents a conditional
probability table.
is.CPA(x) as.CPA(x)
is.CPA(x) as.CPA(x)
x |
Object to be tested or coerced into a |
One way to store a conditional probability table is as an array in
which the first dimensions represent the parent variables, and
the
dimension represents the child variable. Here is an
example with two parents variables,
and
, and a single
child variable,
:
, , C=c1
b1 | b2 | b3 | |
a1 | 0.07 | 0.23 | 0.30 |
a2 | 0.12 | 0.25 | 0.31 |
a3 | 0.17 | 0.27 | 0.32 |
a4 | 0.20 | 0.29 | 0.33 |
, , C=c2
b1 | b2 | b3 | |
a1 | 0.93 | 0.77 | 0.70 |
a2 | 0.88 | 0.75 | 0.69 |
a3 | 0.83 | 0.73 | 0.68 |
a4 | 0.80 | 0.71 | 0.67 |
[Because R stores (and prints) arrays in column-major order, the last value (in this case tables) is the one that sums to 1.]
The CPA
class is a subclass of the
array
class (formally, it is class
c("CPA","array")
). The CPA
class interprets the
dimnames
of the array in terms of the conditional probability
table. The first values of
names(dimnames(x))
are the
input names of the edges (see NodeInputNames()
or the
variable names (or the parent variable, see
NodeParents()
, if the input names were not specified),
and the last value is the name of the child variable. Each of the
elements of dimnames(x)
should give the state names (see
NodeStates()
) for the respective value. In particular,
the conversion function as.CPF()
relies on the existence
of this meta-data, and as.CPA()
will raise a warning if an
array without the appropriate dimnames is supplied.
Although the intended interpretation is that of a conditional
probability table, the normalization constraint is not enforced. Thus
a CPA
object could be used to store likelihoods, probability
potentials, contingency table counts, or other similarly shaped
objects. The function normalize
scales the values of a
CPA
so that the normalization constraint is enforced.
The method NodeProbs()
returns a CPA
object.
The function as.CPA()
is designed to convert between
CPF
s (that is, conditional probability tables stored as
data frames) and CPA
s. It assumes that the factors variables
in the data frame represent the parent variables, and the numeric
values represent the states of the child variable. It also assumes
that the names of the numeric columns are of the form
varname.state
, and attempts to derive variable and
state names from that.
If the argument to as.CPA(x)
is an array, then it assumes that
the dimnames(x)
and names(dimnames(x))
are set to the
states of the variables and the names of the variables respectively.
A warning is issued if the names are missing.
The function is.CPA()
returns a logical value indicating
whether or not the is(x,"CPA")
is true.
The function as.CPA
returns an object of class
c("CPA","array")
, which is essentially an array with the
dimnames set to reflect the variable names and states.
The obvious way to print a CPA
would be to always show the
child variable as the rows in the individual tables, with the parents
corresponding to rows and tables. R, however, internally stores
arrays in column-major order, and hence the rows in the printed tables
always correspond to the second dimension. A new print method for
CPA
would be nice.
This is an S3 object, as it just an array with a special interpretation.
Russell Almond
NodeProbs()
, Extract.NeticaNode
,
CPF
, normalize()
# Note: in R 4.0, the factor() call is required. arf <- data.frame(A=factor(rep(c("a1","a2"),each=3)), B=factor(rep(c("b1","b2","b3"),2)), C.c1=1:6, C.c2=7:12, C.c3=13:18, C.c4=19:24) arfa <- as.CPA(arf) arr1 <- array(1:24,c(4,3,2), dimnames=list(A=c("a1","a2","a3","a4"),B=c("b1","b2","b3"), C=c("c1","c2"))) arr1a <- as.CPA(arr1) ## Not run: ## Requires RNetica as.CPA(node[]) ## End(Not run)
# Note: in R 4.0, the factor() call is required. arf <- data.frame(A=factor(rep(c("a1","a2"),each=3)), B=factor(rep(c("b1","b2","b3"),2)), C.c1=1:6, C.c2=7:12, C.c3=13:18, C.c4=19:24) arfa <- as.CPA(arf) arr1 <- array(1:24,c(4,3,2), dimnames=list(A=c("a1","a2","a3","a4"),B=c("b1","b2","b3"), C=c("c1","c2"))) arr1a <- as.CPA(arr1) ## Not run: ## Requires RNetica as.CPA(node[]) ## End(Not run)
A conditional probability table for a node can be represented as a
data frame with a number of factor variables representing the parent
variables and the remaining numeric values representing the
conditional probabilities of the states of the nodes given the parent
configuration. Each row represents one configuration and the
corresponding conditional probabilities. A CPF
is a special
data.frame
object which represents a conditional
probability table.
is.CPF(x) as.CPF(x)
is.CPF(x) as.CPF(x)
x |
Object to be tested or coerced into a |
One way to store a conditional probability table is a table in which the first several columns indicate the states of the parent variables, and the last several columns indicate probabilities for several child variables. Consider the following example:
A | B | C.c1 | C.c2 | C.c3 | C.c4 | |
[1,] | a1 | b1 | 0.03 | 0.17 | 0.33 | 0.47 |
[2,] | a2 | b1 | 0.05 | 0.18 | 0.32 | 0.45 |
[3,] | a1 | b2 | 0.06 | 0.19 | 0.31 | 0.44 |
[4,] | a2 | b2 | 0.08 | 0.19 | 0.31 | 0.42 |
[5,] | a1 | b3 | 0.09 | 0.20 | 0.30 | 0.41 |
[6,] | a2 | b3 | 0.10 | 0.20 | 0.30 | 0.40 |
In this case the first two columns correspond to parent variables
and
. The variable
has two possible states and
the variable
has three. The child variable is
and it
has for possible states. The numbers in each row give the conditional
probabilities for those states give the state of the child variables.
The class CPF
is a subclass of data.frame
(formally, it is class c("CPF","data.frame")
). Although the
intended interpretation is that of a conditional probability table, the
normalization constraint is not enforced. Thus a CPF
object
could be used to store likelihoods, probability potentials,
contingency table counts, or other similarly shaped objects. The
function normalize
scales the numeric values of CPF
so
that each row is normalized.
The [
method for a NeticaNode
returns a CPF
(if the node is not deterministic).
The function as.CPF()
is designed to convert between
CPA
s (that is, conditional probability tables stored as
arrays) and CPF
s. In particular, as.CPF
is designed to
work with the output of NodeProbs()
or a similarly formatted
array. It assumes that names(dimnames(x))
are the names of the
variables, and dimnames(x)
is a list of character vectors
giving the names of the states of the variables. (See
CPA
for details.) This general method should work with
any numeric array for which both dimnames(x)
and
names(dimnames(x))
are specified.
The argument x
of as.CPF()
could also be a data frame,
in which case it is permuted so that the factor variable are first and
the class tag "CDF"
is added to its class.
The function is.CPF()
returns a logical value indicating
whether or not the is(x,"CDF")
is true.
The function as.CPF
returns an object of class
c("CPF","data.frame")
, which is essentially a data frame with
the first couple of columns representing the parent variables, and the
remaining columns representing the states of the child
variable.
The parent variable list is created with a call
expand.grid(dimnames(x)[1:(p-1)])
. This produces
conditional probability tables where the first parent variable varies
fastest. The Netica GUI displays tables in which the last parent
variable varies fastest.
Note, this is an S3 class, as it is basically a data.frame with special structure.
Change in R 4.0. Note that under R 4.0, character vectors are no longer automaticall coerced into factors when added to data frames. This is probably a good thing, as the code can assume that a column that is a factor is an index for a variable, and one that is a character is a comment or some other data.
Russell Almond
NodeProbs()
, Extract.NeticaNode
,
CPA
, normalize()
# Note: in R 4.0, the factor() call is required. arf <- data.frame(A=factor(rep(c("a1","a2"),each=3)), B=factor(rep(c("b1","b2","b3"),2)), C.c1=1:6, C.c2=7:12, C.c3=13:18, C.c4=19:24) arf <- as.CPF(arf) arr <- array(1:24,c(2,3,4), dimnames=list(A=c("a1","a2"),B=c("b1","b2","b3"), C=c("c1","c2","c3","c4"))) arrf <- as.CPF(arr) stopifnot( is.CPF(arrf), all(levels(arrf$A)==c("a1","a2")), all(levels(arrf$B)==c("b1","b2","b3")), nrow(arrf)==6, ncol(arrf)==6 ) ##Warning, this is not the same as arf, rows are permuted. as.CPF(as.CPA(arf)) ## Not run: ## Requires RNetica as.CPF(NodeProbs(node)) ## End(Not run)
# Note: in R 4.0, the factor() call is required. arf <- data.frame(A=factor(rep(c("a1","a2"),each=3)), B=factor(rep(c("b1","b2","b3"),2)), C.c1=1:6, C.c2=7:12, C.c3=13:18, C.c4=19:24) arf <- as.CPF(arf) arr <- array(1:24,c(2,3,4), dimnames=list(A=c("a1","a2"),B=c("b1","b2","b3"), C=c("c1","c2","c3","c4"))) arrf <- as.CPF(arr) stopifnot( is.CPF(arrf), all(levels(arrf$A)==c("a1","a2")), all(levels(arrf$B)==c("b1","b2","b3")), nrow(arrf)==6, ncol(arrf)==6 ) ##Warning, this is not the same as arf, rows are permuted. as.CPF(as.CPA(arf)) ## Not run: ## Requires RNetica as.CPF(NodeProbs(node)) ## End(Not run)
Each row of a conditional probability frame (CPF) is a
probability distribution. Observed data comes in the form of a
contingency table (or expected contingency table) where the rows
represent configurations of the parent variables and the columns
represent the state of the child variable. The cptChi2
calculates the chi-squared data fit statistic for each row of the
table, and sums them together. This provides a test of the
plausibility that the data came from the proposed model.
cptChi2(obs, exp) cptKL(est, exp) cptG2(obs, exp)
cptChi2(obs, exp) cptKL(est, exp) cptG2(obs, exp)
obs |
A |
est |
A |
exp |
A |
The cannonical use for this plot is to test the conditional
probability model used in particular Bayesian network. The exp
argument is the proposed table. If the parents are fully observed,
then the obs
argument would be a contingency table with
the rows representing parent configurations and the columns the data.
If the parents are not fully observed, then obs can be replaced by an
expected contingincy table (see Almond, et al, 2015).
To avoid potential problems with zero cells, the data table is
modified by adding the prior. That is the observed value is taken as
obs+exp
. This should not have a big effect if the sample size
is large.
The function 'cptG2' is an alternate version which uses the deviance statistic to produce a chi-squared value.
The function 'cptKL' gives the Kulbach-Liebler distance between two probability distributions, so requires the first argument to be normalized.
The for 'cptChi2' and 'cptG2', the sum of the chi-squres for all of the tables with the degrees of
freedom. The degrees of freedom is given as an attribute df
.
For 'cptKL', it is the Kulbach-Leibler distance.
This is called a chi-square statistic, and the percentage points of
the chi-square distribution can be used for reference. However, it is
unclear whether the statistic is actually distributed according to the
chi-square distribution under the hypothesis that the data come from
the distribution given in exp
. In particular, if an expected
table is used instead of an actual table, the distribution is likely
to not be exactly chi-squared.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 10.
Sinharay, S. and Almond, R.G. (2006). Assessing Fit of Cognitively Diagnostic Models: A case study. Educational and Psychological Measurement. 67(2), 239–257.
betaci
, OCP
, barchart.CPF
,
OCP.CPF
skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) ## Compatable data datTheta <- cpfTheta ## Copy to get the shape datTheta[1,] <- rmultinom(1,25,cpfTheta[1,]) cptChi2(datTheta,cpfTheta) cptG2(datTheta,cpfTheta) cptKL(normalize(datTheta),cpfTheta) ## Incompatable data datThetaX <- cpfTheta ## Copy to get the shape datThetaX[1,] <- rmultinom(1,100,c(.05,.45,.5)) cptChi2(datThetaX,cpfTheta) cptG2(datThetaX,cpfTheta) cptKL(normalize(datThetaX),cpfTheta) cptComp <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Compensatory") cptConj <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Conjunctive") datComp <- cptComp for (i in 1:nrow(datComp)) ## Each row has a random sample size. datComp[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1)+1,cptComp[i,3:4]) datComp <- as.CPF(datComp) datConj <- cptConj for (i in 1:nrow(datConj)) ## Each row has a random sample size. datConj[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1)+1,cptConj[i,3:4]) datConj <- as.CPF(datConj) ## Compatable cptChi2(datConj,cptConj) cptG2(datConj,cptConj) cptKL(normalize(datConj),cptConj) ## Incompatable cptChi2(datComp,cptConj) cptG2(datComp,cptConj) cptKL(normalize(datComp),cptConj) cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") cptPC2 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=1,S2=0)), rule="OffsetDisjunctive") datPC1 <- cptPC1 for (i in 1:nrow(datPC1)) ## Each row has a random sample size. datPC1[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC1[i,3:5]) datPC1 <- as.CPF(datPC1) datPC2 <- cptPC2 for (i in 1:nrow(datPC2)) ## Each row has a random sample size. datPC2[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC2[i,3:5]) datPC2 <- as.CPF(datPC2) ## Compatable cptChi2(datPC1,cptPC1) cptG2(datPC1,cptPC1) cptKL(normalize(datPC1),cptPC1) ## Inompatable cptChi2(datPC2,cptPC1) cptG2(datPC2,cptPC1) cptKL(normalize(datPC2),cptPC1)
skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) ## Compatable data datTheta <- cpfTheta ## Copy to get the shape datTheta[1,] <- rmultinom(1,25,cpfTheta[1,]) cptChi2(datTheta,cpfTheta) cptG2(datTheta,cpfTheta) cptKL(normalize(datTheta),cpfTheta) ## Incompatable data datThetaX <- cpfTheta ## Copy to get the shape datThetaX[1,] <- rmultinom(1,100,c(.05,.45,.5)) cptChi2(datThetaX,cpfTheta) cptG2(datThetaX,cpfTheta) cptKL(normalize(datThetaX),cpfTheta) cptComp <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Compensatory") cptConj <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Conjunctive") datComp <- cptComp for (i in 1:nrow(datComp)) ## Each row has a random sample size. datComp[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1)+1,cptComp[i,3:4]) datComp <- as.CPF(datComp) datConj <- cptConj for (i in 1:nrow(datConj)) ## Each row has a random sample size. datConj[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1)+1,cptConj[i,3:4]) datConj <- as.CPF(datConj) ## Compatable cptChi2(datConj,cptConj) cptG2(datConj,cptConj) cptKL(normalize(datConj),cptConj) ## Incompatable cptChi2(datComp,cptConj) cptG2(datComp,cptConj) cptKL(normalize(datComp),cptConj) cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") cptPC2 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=1,S2=0)), rule="OffsetDisjunctive") datPC1 <- cptPC1 for (i in 1:nrow(datPC1)) ## Each row has a random sample size. datPC1[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC1[i,3:5]) datPC1 <- as.CPF(datPC1) datPC2 <- cptPC2 for (i in 1:nrow(datPC2)) ## Each row has a random sample size. datPC2[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC2[i,3:5]) datPC2 <- as.CPF(datPC2) ## Compatable cptChi2(datPC1,cptPC1) cptG2(datPC1,cptPC1) cptKL(normalize(datPC1),cptPC1) ## Inompatable cptChi2(datPC2,cptPC1) cptG2(datPC2,cptPC1) cptKL(normalize(datPC2),cptPC1)
This constructs a table of counts in a special format useful for conditional probability tables. The rows correspond to configurations of the parent variables and the columns correspond to possible states of the child variables.
dataTable(data, parents, child, childStates)
dataTable(data, parents, child, childStates)
data |
A data frame whose columns contain variables corresponding
to |
parents |
A vector of names for the columns in |
child |
The name of the child variable, should refer to a column
in |
childStates |
A character vector giving names of levels for the output variables from highest to lowest. |
Apply the function table
to generate a table of
counts for the indicated variables (subsetting the table if
necessary). Then reformats this into a matrix whose columns
correspond to the child variable.
A matrix whose columns correspond to childStates
and whose rows
correspond to the possible combinations of parents
.
Russell Almond
skill1l <- c("High","Medium","Low") skill3l <- c("High","Better","Medium","Worse","Low") correctL <- c("Correct","Incorrect") x <- read.csv(system.file("testFiles", "randomPinned100.csv", package="CPTtools"), header=TRUE, as.is=TRUE) x[,"Skill1"] <- ordered(x[,"Skill1"],skill1l) x[,"Skill3"] <- ordered(x[,"Skill3"],skill3l) x[,"Comp.Correct"] <- ordered(x[,"Comp.Correct"],correctL) tab <- dataTable(x, c("Skill1","Skill3"),"Comp.Correct",correctL) data.frame(expand.grid(list(Skill1=skill1l,Skill3=skill3l)),tab)
skill1l <- c("High","Medium","Low") skill3l <- c("High","Better","Medium","Worse","Low") correctL <- c("Correct","Incorrect") x <- read.csv(system.file("testFiles", "randomPinned100.csv", package="CPTtools"), header=TRUE, as.is=TRUE) x[,"Skill1"] <- ordered(x[,"Skill1"],skill1l) x[,"Skill3"] <- ordered(x[,"Skill3"],skill3l) x[,"Comp.Correct"] <- ordered(x[,"Comp.Correct"],correctL) tab <- dataTable(x, c("Skill1","Skill3"),"Comp.Correct",correctL) data.frame(expand.grid(list(Skill1=skill1l,Skill3=skill3l)),tab)
The expected shape of the Alpha and Beta parameters depends on
the type of rule. If isOffsetRule(rule)
is true,
then the betas should be a vector of the same length as
pnames
and alphas should be a scalar. If it is false,
then alphas is the vector and betas is the scalar.
defaultAlphas(rule, pnames, states=c("Yes","No"), link="partialCredit") defaultBetas(rule, pnames, states=c("Yes","No"), link="partialCredit")
defaultAlphas(rule, pnames, states=c("Yes","No"), link="partialCredit") defaultBetas(rule, pnames, states=c("Yes","No"), link="partialCredit")
rule |
A rule (e.g., |
pnames |
A character vector giving the names of the parent variables. |
states |
A character vector giving the names of the states of the child variable. |
link |
A link function (e.g., |
Rules come in two styles:
Compensatory
These take multiple slopes (alphas) and a single intercept (beta).
OffsetConjunctive
These take multiple intercepts (betas) and a single slope (alpha).
The value of getOffsetRules()
determines which
rules use which style. These functions are useful for getting
the shape right when the parameters have not yet been established.
There may or may not be different values for alpha or beta for
each state transition depending on the link function. If
multiples are allowed then a list of alphas or betas is returned,
corresponding to the _transitions_ between the states (so the
length of the list is one less than the length of states
).
The shape of the result is as follows:
With this link function both alphas and betas will be lists if the number of states is greater than 2.
With this link function, the betas are a list and the alphas are a single vector. Note that the betas should be in increasing order, but this is not enforced.
Both alphas and betas are vectors.
Either a named vector (the names are pnames
) of the
default value (1 for alphas, 0 for betas) or scalar value, or a
named list of such. If a list, then the names follow the names
of the states with the last one omitted.
As more link functions are added, this needs to be expanded.
Russell Almond
calcDPCTable
, getOffsetRules
,
partialCredit
, gradedResponse
,
normalLink
defaultAlphas("Compensatory",c("S1","S2"),c("Yes","No")) defaultBetas("Compensatory",c("S1","S2"),c("Yes","No")) defaultAlphas("Compensatory",c("S1","S2"),c("Yes","Maybe","No")) defaultBetas("Compensatory",c("S1","S2"),c("Yes","Maybe","No")) defaultAlphas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "gradedResponse") defaultBetas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "gradedResponse") defaultAlphas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "normalLink") defaultBetas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "normalLink") defaultAlphas("OffsetConjunctive",c("S1","S2"),c("Yes","No")) defaultBetas("OffsetConjunctive",c("S1","S2"),c("Yes","No")) defaultAlphas("OffsetConjunctive",c("S1","S2"), c("Yes","Maybe","No"),"gradedResponse") defaultBetas("OffsetConjunctive",c("S1","S2"), c("Yes","Maybe","No"),"gradedResponse")
defaultAlphas("Compensatory",c("S1","S2"),c("Yes","No")) defaultBetas("Compensatory",c("S1","S2"),c("Yes","No")) defaultAlphas("Compensatory",c("S1","S2"),c("Yes","Maybe","No")) defaultBetas("Compensatory",c("S1","S2"),c("Yes","Maybe","No")) defaultAlphas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "gradedResponse") defaultBetas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "gradedResponse") defaultAlphas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "normalLink") defaultBetas("Compensatory",c("S1","S2"),c("Yes","Maybe","No"), "normalLink") defaultAlphas("OffsetConjunctive",c("S1","S2"),c("Yes","No")) defaultBetas("OffsetConjunctive",c("S1","S2"),c("Yes","No")) defaultAlphas("OffsetConjunctive",c("S1","S2"), c("Yes","Maybe","No"),"gradedResponse") defaultBetas("OffsetConjunctive",c("S1","S2"), c("Yes","Maybe","No"),"gradedResponse")
This provides a graph of the history of any given measure of
performance. The input should be a list of values hist
and the
list of the contexts
or game levels associated with them.
EAPBal(hist, contexts = names(hist), obs = NULL, varname = deparse(substitute(hist)), elim = c(-1, 1), etic = 4, title = paste("EAP Balance Sheet:", varname), col = "slategray", posCol = "cyan", negCol = "red", stripCol = c("white", "lightgray"), lcex = 0.65)
EAPBal(hist, contexts = names(hist), obs = NULL, varname = deparse(substitute(hist)), elim = c(-1, 1), etic = 4, title = paste("EAP Balance Sheet:", varname), col = "slategray", posCol = "cyan", negCol = "red", stripCol = c("white", "lightgray"), lcex = 0.65)
hist |
A vector of doubles giving the value of the statistic at each time point. |
contexts |
A list of names for the events associated with the
values. Defaults to the names of the |
obs |
Observable values associated with each level. If supplied
it should be a vector as long as |
varname |
The name of the variable which is being monitored. |
elim |
A vector of length two giving the minimum and maximum EAP value. |
etic |
How many tic marks to use on the EAP axis. |
title |
The main title for the overall plot. |
col |
Color for the EAP bars. |
posCol |
Color for positive movements. |
negCol |
Color for negative movements. |
stripCol |
The colors to be used for the time step labels. Setting this to a vector of two colors creates alternate color stripes. Set this to "white" to disable that effect. |
lcex |
Character expansion size for labels. |
Madigan, Mosurski and Almond (1997) described a graphical weight of evidence balance sheet. The key to this display is following a particular node (or more properly hypothesis involving a node) across time as evidence is entered into the system. There are two columns, one showing the probability distribution, and one showing the weight of evidence—the change in probabilities.
Often, the nodes in an ordered categorical value are assigned numeric values and an expected a posteriori or EAP value is calculated instead. The EAP balance sheet is similar to the woe balance sheet, except that it now uses a single numeric value.
The input is a sequence of EAP value, labeled by the event which causes the change. The output is a graphical display in three columns.
The midpoints of the bars (see barplot
) are
returned invisibly.
Russell Almond
Madigan, D., Mosurski, K. and Almond, R. (1997) Graphical explanation in belief networks. Journal of Computational Graphics and Statistics, 6, 160-181.
sampleSequence <- read.csv(system.file("testFiles","SampleStudent.csv", package="CPTtools"), header=TRUE,row.names=1) SolveGeometricProblems <- sampleSequence$H-sampleSequence$L names(SolveGeometricProblems) <- rownames(sampleSequence) EAPBal(SolveGeometricProblems, lcex=1.25) EAPBal(SolveGeometricProblems,rownames(sampleSequence), varname="Solve Geometric Problems", obs=sampleSequence[,"Acc"],lcex=1.25)
sampleSequence <- read.csv(system.file("testFiles","SampleStudent.csv", package="CPTtools"), header=TRUE,row.names=1) SolveGeometricProblems <- sampleSequence$H-sampleSequence$L names(SolveGeometricProblems) <- rownames(sampleSequence) EAPBal(SolveGeometricProblems, lcex=1.25) EAPBal(SolveGeometricProblems,rownames(sampleSequence), varname="Solve Geometric Problems", obs=sampleSequence[,"Acc"],lcex=1.25)
Calculates a vector of normal quantiles corresponding to effective theta levels for a categorical variable for use in a DiBello-Samejima distribution.
effectiveThetas(nlevels)
effectiveThetas(nlevels)
nlevels |
Integer giving the number of levels of the categorical variable. |
The DiBello–Samejima models map levels of categorical values into effective “theta” values, or corresponding continuous values. These can then be input into IRT equations to calculate cell probabilities.
The default algorithm works by assuming that the categories are created by cutting the normal distribution into equal probability intervals. The value returned for each interval is the midpoint (wrt the normal measure) of that interval.
A vector of normal quantiles of length nlevels
.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 8.
effectiveThetas(5)
effectiveThetas(5)
This evaluates the combination function but not the link function of
of an effective theta distribution. It produces a table of effective
thetas one for each configuration of the parent values according to the
combination function given in the model
argument.
eThetaFrame(skillLevels, lnAlphas, beta, rule = "Compensatory")
eThetaFrame(skillLevels, lnAlphas, beta, rule = "Compensatory")
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
lnAlphas |
A vector of log slope parameters. Its length should
be either 1 or the length of |
beta |
A vector of difficulty (-intercept) parameters. Its length
should be either 1 or the length of |
rule |
Function for computing effective theta (see Details). |
The DiBello framework for creating for creating conditional probability tables for Bayesian network models using IRT-like parameters unfolds in three steps.
Each level of each input variable is assigned an “effective theta” value — a normal value to be used in calculations.
For each possible skill profile (combination of states of
the parent variables) the effective thetas are combined using a
combination function. This produces an “effective theta”
for that skill profile. The function rule
determines the
rule for combination.
The effective theta is input into a link function (e.g., Samejima's graded-response function) to produce a probability distribution over the states of the outcome variables.
This function applies the first two of those steps and returns a data frame with the original skill levels and the effective thetas.
The parent (conditioning) variables are described by the
skillLevels
argument which should provide for each parent
variable in order the names of the states ranked from highest to
lowest value. The original method (Almond et al., 2001)
used equally spaced points on the interval for the
effective thetas of the parent variables. The current implementation
uses the function
effectiveThetas
to calculate equally
spaced points on the normal curve.
The combination of the individual effective theta values into a joint
value for effective theta is done by the function reference by
rule
. This should be a function of three arguments:
theta
— the vector of effective theta values for each parent,
alphas
— the vector of discrimination parameters, and
beta
— a scalar value giving the difficulty. The initial
distribution supplies five functions appropriate for use with
calcDSTable
: Compensatory
,
Conjunctive
, and Disjunctive
,
OffsetConjunctive
, and OffsetDisjunctive
.
The last two have a slightly different parameterization: alpha
is assumed to be a scalar and betas
parameter is vector
valued. Note that the discrimination and difficulty parameters are
built into the structure function and not the IRT curve.
For a data frame with one column for each parent variable and an
additional column for the effective theta values. The number of rows
is the product of the number of states in each of the components of
the skillLevels
argument.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001 Jaakkola and Richardson (eds)., Morgan Kaufmann, 137–143.
effectiveThetas
,Compensatory
,
OffsetConjunctive
,calcDNTable
,
calcDSTable
,
calcDPCTable
,
expand.grid
skill <- c("High","Medium","Low") eThetaFrame(list(S1=skill,S2=skill), log(c(S1=1.25,S2=.75)), 0.33, "Compensatory") eThetaFrame(list(S1=skill,S2=skill), log(c(S1=1.25,S2=.75)), 0.33, "Conjunctive") eThetaFrame(list(S1=skill,S2=skill), log(c(S1=1.25,S2=.75)), 0.33, "Disjunctive") eThetaFrame(list(S1=skill,S2=skill), log(1.0), c(S1=0.25,S2=-0.25), "OffsetConjunctive") eThetaFrame(list(S1=skill,S2=skill), log(1.0), c(S1=0.25,S2=-0.25), "OffsetDisjunctive")
skill <- c("High","Medium","Low") eThetaFrame(list(S1=skill,S2=skill), log(c(S1=1.25,S2=.75)), 0.33, "Compensatory") eThetaFrame(list(S1=skill,S2=skill), log(c(S1=1.25,S2=.75)), 0.33, "Conjunctive") eThetaFrame(list(S1=skill,S2=skill), log(c(S1=1.25,S2=.75)), 0.33, "Disjunctive") eThetaFrame(list(S1=skill,S2=skill), log(1.0), c(S1=0.25,S2=-0.25), "OffsetConjunctive") eThetaFrame(list(S1=skill,S2=skill), log(1.0), c(S1=0.25,S2=-0.25), "OffsetDisjunctive")
The expected weight of evidence (EWOE) is a measure of how
much information about a hypothesis can be learned from a
potential observation. The hypothesis corresponds to a
grouping of the rows of the CPF
(and the
negation of the hypothesis to the remaining rows).
ewoe.CPF(cpf, pos = 1L, neg = NULL)
ewoe.CPF(cpf, pos = 1L, neg = NULL)
cpf |
A conditional probability frame ( |
pos |
An expression for selecting rows of the |
neg |
An expression for selecting the rows corresponding to
the complement of the hypothesis. (The default value is
|
Good (1985) defines the weight of evidence for a hypothesis
as
The expected weight of evidence (Good and Card, 1971) looks at potential future observations to find which might have the highest weight of evidence. The expected weight of evidence is
In this calculation, the potential evidence corresponds to the
columns of the (numericPart
) of cpf
. The
hypothesis is found by splitting the rows. The pos
and
neg
arguments can be any way of specifying a set of rows.
This is similar to the mutualInformation
, only EWOE
works for a binary hypothesis, while
mutualInformation
works for any number of states.
A numeric value giving the weight of evidence in _centibans_ (where the logs are taken base 10 and the result is multiplied by 100).
Russell Almond
Good, I.J. (1985). Weight of Evidence: A brief survey. In Bernardo, J., DeGroot, M., Lindley, D. and Smith, A. (eds). Bayesian Statistics 2. North Holland. 249–269.
Good, I. J. and Card, W. (1971). The Diagnostic Process with Special Reference to Errors. Methods of Information in Medicine, 10, 176–188.
CPF
, mutualInformation
,
expTable
ACED <- dplyr::inner_join(ACED.scores,ACED.items,by="SubjID") expcr <- expTable(ACED,"cr","tCommonRatio1a", pvecregex="P\\.<var>\\.\\.<state>") ewoe.CPF(expcr,"H") ewoe.CPF(expcr,c("H","M"))
ACED <- dplyr::inner_join(ACED.scores,ACED.items,by="SubjID") expcr <- expTable(ACED,"cr","tCommonRatio1a", pvecregex="P\\.<var>\\.\\.<state>") ewoe.CPF(expcr,"H") ewoe.CPF(expcr,c("H","M"))
The table
function in base-R builds contingency tables from cross-classifying factors. If the factor is not known, but instead estimated using a Bayesian classifier, then instead of a single column with the factor value, there will be several columns with giving the probability that the individual is in each state. The table built is then an expected contingency table.
expTable(data, pvecVars, facVars, pvecregex = "<var>\\.<state>") pvecTable(data, pvarName, regx = sub("<var>",pvarName,"<var>\\.<state>")) catTable(data, fvarName, cc = contrasts(as.factor(dplyr::pull(data, fvarName)), contrasts = FALSE))
expTable(data, pvecVars, facVars, pvecregex = "<var>\\.<state>") pvecTable(data, pvarName, regx = sub("<var>",pvarName,"<var>\\.<state>")) catTable(data, fvarName, cc = contrasts(as.factor(dplyr::pull(data, fvarName)), contrasts = FALSE))
data |
This is a data frame where columns corresponding to probability vectors have a regular naming pattern. |
pvecVars , pvarName
|
The names (or name for |
facVars , fvarName
|
The names (or name for |
pvecregex , regx
|
A regular expression template for finding the column names. The string “<var>” is replaced with the variable name ( |
cc |
This is a contrasts matrix (see |
For an individual, $i$, let $Y_i$ be a fully observed variable which takes on states ${ y_1, ..., y_m}$. Let $S_i$ be a latent with states ${s_1, ..., s_k}$, and let $P(S_i)$ is an estimate of $S_i$, so it is a vector over the states.
The expected matrix is formed as follows:
Initialize a $k$ by $m$ (rows correspond to states of $S$ and columns to states of $Y$) matrix with 0.
For each individual, add $P(S_i)$ to the column corresponding to $Y_i$
The result is the expected value of the contingency table.
The general case can be handled with an Einstein sum (see
einsum
), with a rule “za,zb,zc,... -> abc...”.
The assumption is that the estimates of the latent variable are
saved in columns with a regular naming convention. For example,
the ACED data set uses the names P.cr..H
,
P.cr..M
and P.cr..L
are the high, medium and
low probabilities for the common ratio variable. The regular
expression which will capture the names of the states is
“P\.cr\.\.(\w+)”, where “\w+” is one or more
word constituent characters. The parentheses around the
last part are used to extract the state names.
Internally, the function substitutes the value of
pvecName
for “<var>”, and the “(\w+)” is
substituted for “<state>”. If this regular expression
doesn't work for grabbing the state names, a better expression
can be substituted, but it should be the first sub-expression
marked with parentheses. Note also that
the period has a special meaning in regular expressions so it
needs to be quoted. Note also, that the backslash needs to be
quoted in R strings.
The functions pvecTable
and catTable
return a
matrix with rows corresponding to the original data, and columns
to the states of the variable.
The function expTable
produces an array whose dimensions
correspond to the states of probability and factor variables.
Russell Almond
Observable Analysis paper (in preparation).
mutualInformation
, gkGamma
,
table
, contrasts
,
einsum
data(ACED) ACED.joined <- dplyr::inner_join(ACED.scores,ACED.items,by="SubjID") head(pvecTable(ACED.joined,"cr",regx="P\\.cr\\.\\.<state>")) head(catTable(ACED.joined,"tCommonRatio1a")) expTable(ACED.joined,"cr","tCommonRatio1a", pvecregex="P\\.<var>\\.\\.<state>")
data(ACED) ACED.joined <- dplyr::inner_join(ACED.scores,ACED.items,by="SubjID") head(pvecTable(ACED.joined,"cr",regx="P\\.cr\\.\\.<state>")) head(catTable(ACED.joined,"tCommonRatio1a")) expTable(ACED.joined,"cr","tCommonRatio1a", pvecregex="P\\.<var>\\.\\.<state>")
The functions take a “confusion matrix”, a square matrix were
the rows and columns represent classifications by two different
raters, and compute measures of rater agreement. Cohen's kappa
(fcKappa
) is corrected for random labeling of ratings. Goodman
and Kruskall's lambda (gkLambda
) is corrected for labeling
every subject at the modal category.
fcKappa(tab, weights = c("None", "Linear", "Quadratic"), W=diag(nrow(tab))) gkLambda(tab, weights = c("None", "Linear", "Quadratic"), W=diag(nrow(tab))) accuracy(tab, weights = c("None", "Linear", "Quadratic"), W=diag(nrow(tab)))
fcKappa(tab, weights = c("None", "Linear", "Quadratic"), W=diag(nrow(tab))) gkLambda(tab, weights = c("None", "Linear", "Quadratic"), W=diag(nrow(tab))) accuracy(tab, weights = c("None", "Linear", "Quadratic"), W=diag(nrow(tab)))
tab |
A square matrix whose rows and columns represent rating categories from two raters or classifiers and whose cells represent observed (or expected) counts. If one classifier is regarded as “truth” it should be represented by columns. |
weights |
A character scalar which should be one of
“None”, “Linear”, or |
W |
A square matrix of the same size as tab giving the weights
(see details). If missing and |
Lets say the goal is to classify a number of subjects into
categories, and that two raters: Rater 1, and Rater 2 do the
classification. (These could be human raters or machine
classification algorithms. In particular a Bayes net modal
predication is a classier.) Let
be the probability that
Rater 1 places a subject into Category
and Rater 2 places the
same subject into Cateogry
. The
matrix,
tab
, is the confusion matrix.
Note that tab
could be a matrix of probabilities or a matrix of
counts, which can be easily turned into a matrix of probabilities by
dividing by the total. In the case of a Bayes net, expected counts
could be used instead. For example, if Rater 1 was a Bayes net and
the predicted probabilities for the three categories was
, and and Rater 2 was the true category which for this
subject was 1, then that subject would contribute .5 to
,
.3 to
, and .2 to
.
In both cases, , is a measure of agreement between
the two raters. If scaled as probabilities, the highest possible
agreement is
and the lowest
. This is the
accuracy
function.
However, raw agreement has a problem as a measure of the quality of a rating, it depends on the distributions of the categories in the population of interest. In particular, if a majority of the subject are of one category, then it is very easy to match just by labeling everything as the most frequent category.
The most well-known correct is the Fliess-Cohen kappa
(fcKappa
). This adjusts the agreement rate for the probability
that the raters will agree by chance. Let be the row
sums, and Let
be the column sums. The probability of a
chance agreement, is then
. So the adjusted
agreement is:
So kappa answers the question how much better do the raters do than chance agreement.
Goodman and Kruskal (1952) offered another way of normalizing. In
this case, let Rater 1 be the true category and Rater 2 be the estimated
category. Now look at a classifier which always classifies somebody in
Category ; that classifier will be right with probability
. The best such classifer will be
. So
the adjusted agreement becomes:
Goodman and Kruskal's lambda (gkLambda
) is appropriate when
there is a different treatment associated with each category. In this
case, lambda describes how much better one could do than treating
every subject as if they were in the modal category.
Weights are used if the misclassification costs are not equal in all
cases. If the misclassification cost is , then the weight
is defined as
. Weighted
agreeement is defined as
.
If the categories are ordered, there are three fairly standard weighting schemes (especially for kappa).
if
, 0 otherwise. (Diagonal
matrix.)
. Penalty increases
with number of categories of difference.
. Penalty increases
with square of number of categories of difference.
Indeed, quadratic weighted kappa is something of a standard in comparing two human raters or a single machine classification algorithm to a human rater.
The argument weights
can be used to select one of these three
weighting schemes. Alternatively, the weight matrix W
can be
specified directly.
A real number between -1 and 1; with higher numbers indicating more agreement.
Russell Almond
Almond, R.G., Mislevy, R.J. Steinberg, L.S., Yan, D. and Willamson, D. M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 7.
Fleiss, J. L., Levin, B. and Paik, M. C. (2003). Statistical Methods for Rates and Proportions. Wiley. Chapter 18.
Goodman, Leo A., Kruskal, William H. (1954). Measures of Association for Cross Classifications. Journal of the American Statistical Association. 49 (268), 732–764.
## Example from Almond et al. (2015). read <- matrix(c(0.207,0.029,0,0.04,0.445,0.025,0,0.025,0.229),3,3, dimnames=list(estimated=c("Advanced","Intermediate","Novice"), actual=c("Advanced","Intermediate","Novice"))) stopifnot (abs(fcKappa(read)-.8088) <.001) stopifnot (abs(gkLambda(read)-.762475) <.001) fcKappa(read,"Linear") fcKappa(read,"Quadratic") gkLambda(read,"Linear") gkLambda(read,"Quadratic")
## Example from Almond et al. (2015). read <- matrix(c(0.207,0.029,0,0.04,0.445,0.025,0,0.025,0.229),3,3, dimnames=list(estimated=c("Advanced","Intermediate","Novice"), actual=c("Advanced","Intermediate","Novice"))) stopifnot (abs(fcKappa(read)-.8088) <.001) stopifnot (abs(gkLambda(read)-.762475) <.001) fcKappa(read,"Linear") fcKappa(read,"Quadratic") gkLambda(read,"Linear") gkLambda(read,"Quadratic")
Fetches the names of the parent variables, or the names of the states of the child variable from a conditional probability table.
getTableStates(table) getTableParents(table)
getTableStates(table) getTableParents(table)
table |
A conditional probability table expressed as a data frame. |
These functions assume that table
is a conditional probability
table (or a set of hyper-Dirichlet parameters) which is shaped like a
data frame. Columns in the data frame which are factors are assumed
to hold values for the parent (conditioning) variables. Columns in
the data frame which are numeric are assumed to correspond to possible
states of the child (dependent) variable.
For getTableParents()
, a character vector giving the names of
the parent variables (factor columns).
For getTableStates()
, a character vector giving the names of
the child states (numeric columns).
StatShop usually assumes that the states are ordered from the highest to the lowest possible values, for example ‘High’, ‘Med’, ‘Low’.
Russell Almond
#conditional table # Note in R 4.0, parents must be explicitly made into factors X2.ptf <- data.frame(Theta=factor(c("Expert","Novice")), correct=c(4,2), incorrect=c(2,4)) #Unconditional table Theta.ptf <- data.frame(Expert=3,Novice=3) stopifnot( identical(getTableStates(X2.ptf), c("correct","incorrect")), identical(getTableStates(Theta.ptf), c("Expert","Novice")), identical(getTableParents(X2.ptf), "Theta"), identical(getTableParents(Theta.ptf), character(0)) )
#conditional table # Note in R 4.0, parents must be explicitly made into factors X2.ptf <- data.frame(Theta=factor(c("Expert","Novice")), correct=c(4,2), incorrect=c(2,4)) #Unconditional table Theta.ptf <- data.frame(Expert=3,Novice=3) stopifnot( identical(getTableStates(X2.ptf), c("correct","incorrect")), identical(getTableStates(Theta.ptf), c("Expert","Novice")), identical(getTableParents(X2.ptf), "Theta"), identical(getTableParents(Theta.ptf), character(0)) )
Goodman and Kruskal introduce a second coefficient, gamma
(gkGamma
) as a measure of association for two ordered
categorical variables which do not necessarily have the same number of
states. Thus it is useful as a measure of association between an
observed variable and a latent proficiency variable.
gkGamma(tab)
gkGamma(tab)
tab |
A matrix whose rows and columns represent rating two different classifications and whose cells represent observed (or expected) counts. |
Let and
be ordinal variables (both oriented in the
same direction either high to low or low to high). Then
tab
is
a cross-tabulation of the two variables (e.g., what is produced by
the table
funciton). In the case that one of the
variables is a classification by a Bayesian network, this could also
be an expected tabulation. It also could be normalized by dividing by
the total to give probabilities.
Let and
be two randomly chosen data points.
Classify this pair according to the following rule:
If both relationships point the same way,
that is either both and
or
both
and
, then the relationship is
concordant.
If both relationships point in
opposite ways,
that is either both and
or
both
and
, then the relationship is
discordant.
If both relationships point in
opposite ways,
that is either both and
or
both
and
, then the relationship is
discordant.
If one or both variables are the same,
that is either or
or
both, then the relationship is tied.
Let be the proportion of pairs which are concordant,
be the proportion of pairs which are discordant, and
be the proportion of pairs which are tied. Then gamma is
defined as the proportion of concordant pairs minus the proportion of
discordant pairs normalized by the number of non-tied pairs, that is:
Like a correlation coefficient it runs from -1 (perfect negative
associations) through 0 (no association) to +1 (perfect association).
It is comparable to Kendall's tau (see
cor(,method="kendall")
).
A numeric scalar.
Like a correlation coefficient it runs from -1 (perfect negative
associations) through 0 (no association) to +1 (perfect association).
It is comparable to Kendall's tau (see
cor(,method="kendall")
).
Russell Almond
Goodman, Leo A., Kruskal, William H. (1954). Measures of Association for Cross Classifications. Journal of the American Statistical Association. 49 (268), 732–764.
Other similar statistics (useful for square tables):
fcKappa
, gkLambda
Other measures of correlation: cor
.
Building contingency tables: table
.
## Example from Goodman and Kruskall (1963) nab <- matrix(c(8,0,0,5,8,4,3,1,14,3,0,4),3,4) stopifnot(all.equal(.6122,gkGamma(nab),tolerance=.0001)) ## Using built-in data set US Judges, rounding to produce ordered ## categories. table(round(USJudgeRatings[,1:2])) gkGamma(table(round(USJudgeRatings[,1:2]))) ## Kendall's Tau for comparison cor(round(USJudgeRatings[,1:2]),method="kendall")
## Example from Goodman and Kruskall (1963) nab <- matrix(c(8,0,0,5,8,4,3,1,14,3,0,4),3,4) stopifnot(all.equal(.6122,gkGamma(nab),tolerance=.0001)) ## Using built-in data set US Judges, rounding to produce ordered ## categories. table(round(USJudgeRatings[,1:2])) gkGamma(table(round(USJudgeRatings[,1:2]))) ## Kendall's Tau for comparison cor(round(USJudgeRatings[,1:2]),method="kendall")
This function converts a matrix of effective theta values into a conditional probability table by applying Samejima's graded response model to each row of the table.
gradedResponse(et, linkScale = NULL, obsLevels = NULL)
gradedResponse(et, linkScale = NULL, obsLevels = NULL)
et |
A matrix of effective theta values. There should be one row in this table for each configuration of the parent variables of the conditional probability table and one column for each state of the child variables except for the last. |
linkScale |
Unused. For compatibility with other link functions. |
obsLevels |
A character vector giving the names of the child variable states.
If supplied, it should have length |
This function takes care of the third step in the algorithm of
calcDPCTable
. Its input is a matrix of effective theta
values (comparable to the last column of the output of
eThetaFrame
), one column for each of the child variable
states (obsLevels
) except for the last one. Each row
represents a different configuration of the parent variables. The
output is the conditional probability table.
Let be the child variable of the distribution, and assume that
it can take on
possible states labeled
through
in increasing order. The graded response model defines a
set of functions
for
,
where
The conditional probabilities for each child state given the effective thetas for the parent variables is then given by
The matrix
et
is the values of
. This function then performs the rest of the
generalized partial credit model. This is a generalization of Muraki
(1992), because the functions
are not restricted to
be the same functional form for all
.
If supplied obsLevels
is used for the column names.
A matrix with one more column than et
giving the conditional
probabilities for each configuration of the parent variables (which
correspond to the rows).
The linkScale
parameter is unused. It is for compatibility
with other link function choices.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Muraki, E. (1992). A Generalized Partial Credit Model: Application of an EM Algorithm. Applied Psychological Measurement, 16, 159-176. DOI: 10.1177/014662169201600206
Samejima, F. (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph No. 17, 34, (No. 4, Part 2).
I also have planned a manuscript that describes these functions in more detail.
Other Link functions:
gradedResponse
,normalLink
Functions which directly use the link function:
eThetaFrame
, calcDPCTable
,
mapDPC
Earlier version of the graded response link:
calcDSTable
## Set up variables skill1l <- c("High","Medium","Low") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Get some effective theta values. et <- effectiveThetas(3) gradedResponse(matrix(et,ncol=1),NULL,correctL) gradedResponse(outer(et,c(Full=1,Partial=-1)),NULL,pcreditL) gradedResponse(outer(et,c(A=2,B=1,C=0,D=-1)),NULL,gradeL)
## Set up variables skill1l <- c("High","Medium","Low") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Get some effective theta values. et <- effectiveThetas(3) gradedResponse(matrix(et,ncol=1),NULL,correctL) gradedResponse(outer(et,c(Full=1,Partial=-1)),NULL,pcreditL) gradedResponse(outer(et,c(A=2,B=1,C=0,D=-1)),NULL,gradeL)
These functions take a vector and check to see if it is in increasing
or decreasing order. The function isMonotonic
returns true if
the sequence is either increasing (nondecreasing) or decreasing
(nonincreasing), with the parenthesized values used if strict
is false. For isMonotonic
an attribute of the return value
tell the direction of the series.
isMonotonic(vec, strict = TRUE) isIncreasing(vec) isNondecreasing(vec) isDecreasing(vec) isNonincreasing(vec)
isMonotonic(vec, strict = TRUE) isIncreasing(vec) isNondecreasing(vec) isDecreasing(vec) isNonincreasing(vec)
vec |
A vector of sortable objects (i.e., ones for which the comparison operators are defined. The vector should have length of at least 2. |
strict |
A logical value. If true, then the function will check for strict monotonicity. |
These function use the following definitions:
vec[1] < vec[2] <
... < vec[k]
.
vec[1] <= vec[2] <=
... <= vec[k]
.
vec[1] > vec[2] >
... > vec[k]
.
vec[1] >= vec[2] >=
... >= vec[k]
.
A sequence is monotonic if it is either nondecreasing or
nonincreasing. It is strictly monotonic if it is either
increasing or decreasing. Note that the function isMonotonic
by default checks for strict monotonicity; which is usually more
useful in the context of Bayesian networks.
It is often of interest to check if the direction is increasing or
decreasing. The function isMonotonic
returns an attribute
"direction"
which is positive for increasing series and
negative for decreasing series.
The functions isIncreasing
, isNondecreasing
,
isDecreasing
, and isNonincreasing
a logical according to
whether or not the corresponding condition holds for vec
.
The function isMonotonic
returns a logical value with an
additional attribute "direction"
. The expression
attr(isMonotonic(vec),"direction"
will be positive if the
series is increasing and negative if it is decreasing. If all values
are equal, and strict
is false, the direction will be 0.
If vec
has length less than 2, then NA
will be returned.
Because isMonotonic
returns an extra attribute,
isTRUE(isMonotonic(vec)
will always be false. The expression
isTRUE(c(isMonotonic(vec)))
is likely to be more useful.
Russell Almond
isIncreasing(1:3) isNondecreasing(c(1,1,3)) isDecreasing(3:1) isNonincreasing(c(3,1,1)) isMonotonic(1:3) isMonotonic(c(1,1,3)) #FALSE isMonotonic(c(1,1,3),strict=FALSE) isMonotonic(3:1) isMonotonic(c(3,3,1),strict=FALSE) isMonotonic(c(0,0,0)) #FALSE isMonotonic(c(0,0,0),FALSE) #TRUE isMonotonic(c(3,0,3)) # FALSE
isIncreasing(1:3) isNondecreasing(c(1,1,3)) isDecreasing(3:1) isNonincreasing(c(3,1,1)) isMonotonic(1:3) isMonotonic(c(1,1,3)) #FALSE isMonotonic(c(1,1,3),strict=FALSE) isMonotonic(3:1) isMonotonic(c(3,3,1),strict=FALSE) isMonotonic(c(0,0,0)) #FALSE isMonotonic(c(0,0,0),FALSE) #TRUE isMonotonic(c(3,0,3)) # FALSE
An offset rule is one where there is one intercept (beta) parameter for each parent and there is a single slope parameters. As opposed to a regression-style rule where there is a different slope (alpha) for each parent and single intercept. This function distinguishes between the twho types of rules.
isOffsetRule(rl) getOffsetRules() setOffsetRules(newval)
isOffsetRule(rl) getOffsetRules() setOffsetRules(newval)
rl |
A character vector of rule names to test to see if these are offset rules. |
newval |
A character vector of rule names to be considered as offset rules. |
The Compensatory
rule acts more or less like a
regression, with a slope (or discrimination) parameter for each parent
variable, and a single intercept or difficulty parameter. The
Conjunctive
and Disjunctive
follow the same pattern. In
contrast the OffsetConjunctive
rule, has a different
intercept for each parent and a single slope parameter. The
OffsetDisjunctive
rule follows the same pattern.
The isOffsetRule()
is true if the argument references a
function which follows the offset parameterization and false if it
follow the regression parameterization. Currently it returns true
only for “OffsetConjunctive”, and “OffsetDisjunctive”,
but using this test should continue to work if the number of rules in
CPTtools expands.
The expression getOffsetRules()
returns the list of currently
known offset-style rules. The function setOffsetRules()
allows
this list to be manipulated to add a new rule to the list of
offset-style rules.
The expression isOffsetRule(rl)
returns a logical vector of the
same length as rl
, with each element TRUE
or
FALSE
depending on whether or the corresponding element of
rl
is the name of an offset rule. If rl
is not a
character vector, then the function returns FALSE
.
The expression getOffsetRule()
returns the names of the current
offset rules.
It makes sense in certain situation to use anonymous function objects as rules, however, it is impossible to test whether or not a function is in the offset rule list. Therefore, it is recommended that only rule names be used with this function.
One consequence of this rule is that when given a function argument,
isOffsetRule
returns FALSE
.
Russell G. Almond
Compensatory
, OffsetConjunctive
,
stopifnot( all(isOffsetRule(c("OffsetConjunctive","Conjunctive"))==c(TRUE,FALSE)), isOffsetRule(OffsetDisjunctive)==TRUE, all(getOffsetRules() == c("OffsetConjunctive","OffsetDisjunctive")) ) setOffsetRules(c(getOffsetRules(),"myOffsetRule")) stopifnot (isOffsetRule("myOffsetRule"))
stopifnot( all(isOffsetRule(c("OffsetConjunctive","Conjunctive"))==c(TRUE,FALSE)), isOffsetRule(OffsetDisjunctive)==TRUE, all(getOffsetRules() == c("OffsetConjunctive","OffsetDisjunctive")) ) setOffsetRules(c(getOffsetRules(),"myOffsetRule")) stopifnot (isOffsetRule("myOffsetRule"))
This is an example used in Chapter 7 of Almond, et. al. (2015), based on a conceptual
assessment described in Mislevy (1995). The assessment has four reporting variables:
_Reading_, _Writing_, _Speaking_, and _Listening_, each of which can take on states
'Novice', 'Intermediate', or 'Advanced'.
To estimate the reliability 1000 students were simulated from
the network (see NetworkTester
and accuracy, using Maximum A Prior
estimates, ('Langauge_modal') and expected accuracy ('Language_exp') estimates were calculated.
data("Language_modal") data("Language_exp") data("language16") data("language24")
data("Language_modal") data("Language_exp") data("language16") data("language24")
The format for both 'Language_modal' and 'Language_exp' is: List of 4 $ Reading A 3x3 matrix giving the accuracy of the Reading measure. $ Writing A 3x3 matrix giving the accuracy of the Writing measure. $ Speaking A 3x3 matrix giving the accuracy of the Speaking measure. $ Listening A 3x3 matrix giving the accuracy of the Listening measure.
All cases, the table has been normalized so that all entries sum to 1.
The data sets 'language16' and 'language24' are simulated data (1000 cases each) from both the 16 task test described above, and a 24 task variant which has 6 copies of each task. Both data frames have the following columns (not necessarily in this order).
A case number.
The "true" (simulated) ability.
The simulated responses for the type D (Listening) tasks. 'TaskD5' is only in 'language24'.
The simulated reponses for the type C (Speaking) tasks. 'TaskC3', 'TaskC4', and 'TaskC5' are only in the 'langauge24' data set.
The simulated responses for the type B (Writing) tasks. 'TaskB3', 'TaskB4', and 'TaskB5' are only in the 'language24' data set.
The simulated responses for the type A (Reading) tasks. 'TaskA5' is only available in the 'langauge24' data set.
The estimated probability that this subject is in the 'Novice', 'Intermediate' and 'Advanced' categories respectively of the Listening skill.
The estimated probability that this subject is in the 'Novice', 'Intermediate' and 'Advanced' categories respectively of the Speaking skill.
The estimated probability that this subject is in the 'Novice', 'Intermediate' and 'Advanced' categories respectively of the Writing skill.
The estimated probability that this subject is in the 'Novice', 'Intermediate' and 'Advanced' categories respectively of the Reading skill.
The most likely state of the skill variables given the relevant data.
The simulation code used to build 'language16' and 'language24' is in 'vignette("SimulationStudies",package="RNetica")'.
The sample language assessment contains:
5 Reading tasks which only tap the Reading attribute.
3 Reading/Writing integrated tasks which require a written response to a textual prompt.
3 Reading/Speaking/Listening Integrated tasks which require a spoken response to an aural stimulus, with textual instructions.
5 Listening tasks which have both aural stimulus and instructions.
Because different numbers of tasks (which have different strengths of evidence) are available, different amounts of information are available about each of the four target variables. A number of different measures of reliability can be calculated from the (expected) accuracy matrixes.
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 7.
Mislevy, R. J. (1995) Test theory and language learning in assessment. Language Testing, 12, 341–369.
data(Language_modal) fcKappa(Language_modal$Reading) gkLambda(Language_modal$Reading)
data(Language_modal) fcKappa(Language_modal$Reading) gkLambda(Language_modal$Reading)
The function ciTest
takes a 3-way contingency table and tests
for the conditional independence of the first two variables given the
third. The function localDepTest
is a wrapper which builds the
table from factor variables. In psychometrics, when the third
variable represents a latent proficiency and the first two item
responses, this is sometimes called local independence.
localDepTest(obs1, obs2, prof) ciTest(tab)
localDepTest(obs1, obs2, prof) ciTest(tab)
obs1 |
A factor variable representing the first observable outcome. |
obs2 |
A factor variable representing the second observable outcome. |
prof |
A factor variable representing the proficiency level, or
any variable that is thought to render |
tab |
A three-way table (see |
Let 1 and 2 represent obs1
and obs2
respectively and let
3 represent prof
. In the case of ciTest
, 1, 2 and 3
represent the first second and 3rd dimensions of tab
. These
function then compare the undirected model [13][23] (1 and 2 are
conditionally independent given 3) to the unrestricted model [123].
The result is a chi-square statistic comparing the two models, high
values of the chi-square statistic indicate a better fit of the
unrestricted model compared to the conditional independence model.
Note that the Cochran-Mantel-Haenszel statistic (see
mantelhaen.test
) is similar, but it assumes that
there is no three-way interaction, so it essentially tests [13][23]
versus [12][13][23].
A list with three elements:
G2 |
The chi-square comparison between the two models. |
df |
The degrees of freedom for the test. |
p |
The percentage point for |
Russell Almond
Bishop, Feinberg and Holland (1975). Discrete Multivariate Analysis: Theory and Practice. MIT Press.
buildFactorTab
, mantelhaen.test
,
UCBAdmissions
data(UCBAdmissions) ciTest(UCBAdmissions) data(ACED) ACED.items$Correct <- rowSums(ACED.items[,-(1:3)],na.rm=TRUE) table(ACED.items$tCommonRatio1a,ACED.items$tCommonRatio2a, cut(ACED.items$Correct,3)) localDepTest(ACED.items$tCommonRatio1a,ACED.items$tCommonRatio2a, cut(ACED.items$Correct,3))
data(UCBAdmissions) ciTest(UCBAdmissions) data(ACED) ACED.items$Correct <- rowSums(ACED.items[,-(1:3)],na.rm=TRUE) table(ACED.items$tCommonRatio1a,ACED.items$tCommonRatio2a, cut(ACED.items$Correct,3)) localDepTest(ACED.items$tCommonRatio1a,ACED.items$tCommonRatio2a, cut(ACED.items$Correct,3))
This finds a set of parameters for a given discrete partial credit
model which produces the best fit to the first argument. It is
assumed that the first argument is a table produced by adding observed
counts to a prior conditional probability table. Thus the result is
a maximum a posterior (MAP) estimate of the CPT. The parametric
structure of the table is given by the rules
and link
parameters.
mapDPC(postTable, skillLevels, obsLevels, lnAlphas, betas, rules = "Compensatory", link = "partialCredit", linkScale=NULL, Q=TRUE, tvals=lapply(skillLevels, function (sl) effectiveThetas(length(sl))), gamma=0.001, ...)
mapDPC(postTable, skillLevels, obsLevels, lnAlphas, betas, rules = "Compensatory", link = "partialCredit", linkScale=NULL, Q=TRUE, tvals=lapply(skillLevels, function (sl) effectiveThetas(length(sl))), gamma=0.001, ...)
postTable |
A table of cell counts which should have the same structure as the
output of |
skillLevels |
A list of character vectors giving names of levels for each of the condition variables. |
obsLevels |
A character vector giving names of levels for the output variables from highest to lowest. As a special case, can also be a vector of integers. |
lnAlphas |
A list of vectors of initial values for the log slope
parameters. Its length should be 1 or |
betas |
A list of vectors of initial values for the difficulty
(-intercept) parameters. Its length should be 1 or
|
rules |
A list of functions for computing effective theta (see
Details). Its length should be |
link |
The function that converts a table of effective thetas to probabilities |
linkScale |
Initial values for the optional scale parameter for
the |
Q |
This should be a Q matrix indicating which parent variables are relevant for which state transitions. It should be a number of states minus one by number of parents logical matrix. As a special case, if all variable are used for all levels, then it can be a scalar value. |
tvals |
A list of the same length as |
gamma |
This is a weight given to a penalty to keep the parameters close to zero. |
... |
Additional arguments passed to the optim function. |
The purpose of this function is to try to estimate the values of a
discrete partial credit model. The structure of the model is given by
the rules
and link
arguments: the form of the table
produces is like the output of calcDPCTable(skillLevels,
obsLevels, lnAlphas, betas, rules, link, linkScale)
. It tries to
find the values of lnAlphas
and betas
(and if used
linkScale
) parameters which are most likely to have generated
the data in the postTable
argument. The lnAlphas
,
betas
and linkScale
arguments provide the initial values
for those parameters.
Let be the value in the
th row and the
th
column of the conditional probability table output from
calcDPCTable(skillLevels, obsLevels, lnAlphas, betas,
rules, link, linkScale)
, and let be the corresponding
elements of
postTable
. The mapDPC
function uses
optim
to find the value of the parameters that
minimizes the deviance,
A list with components:
A vector of the same structure as lnAlphas
containing the estimated values.
A veto of the same structure as betas
containing the estimated values.
If the linkScale
was supplied, the estimated
value.
An integer code. 0
indicates successful
completion, positive values are various error codes (see
optim
).
The deviance of the fit DPC model.
The list is the output of the optim
) function,
which has other components in the output. See the documentation of
that function for details.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Muraki, E. (1992). A Generalized Partial Credit Model: Application of an EM Algorithm. Applied Psychological Measurement, 16, 159-176. DOI: 10.1177/014662169201600206
Samejima, F. (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph No. 17, 34, (No. 4, Part 2).
I also have planned a manuscript that describes these functions in more detail.
optim
, calcDPCTable
,
Compensatory
,
OffsetConjunctive
, gradedResponse
,
partialCredit
pLevels <- list(Skill1=c("High","Med","Low")) obsLevels <- c("Full","Partial","None") trueLnAlphas <- list(log(1),log(.25)) trueBetas <- list(2, -.5) priorLnAlphas <- list(log(.5),log(.5)) priorBetas <- list(1, -1) truedist <- calcDPCTable(pLevels,obsLevels,trueLnAlphas,trueBetas, rules="Compensatory",link="partialCredit") prior <- calcDPCTable(pLevels,obsLevels,priorLnAlphas,priorBetas, rules="Compensatory",link="partialCredit") post1 <- prior + round(truedist*1000) map1 <- mapDPC(post1,pLevels,obsLevels,priorLnAlphas,priorBetas, rules="Compensatory",link="partialCredit") if (map1$convergence != 0) { warning("Optimization did not converge:", map1$message) } postLnAlphas <- map1$lnAlphas postBetas <- map1$betas fitdist <- calcDPCTable(pLevels,obsLevels,map1$lnAlphas,map1$betas, rules="Compensatory",link="partialCredit") ## Tolerance for recovery test. tol <- .01 maxdistdif <- max(abs(fitdist-truedist)) if (maxdistdif > tol) { stop("Posterior and True CPT differ, maximum difference ",maxdistdif) } if (any(abs(unlist(postLnAlphas)-unlist(trueLnAlphas))>tol)) { stop("Log(alphas) differ by more than tolerance") } if (any(abs(unlist(postBetas)-unlist(trueBetas))>tol)) { stop("Betas differ by more than tolerance") }
pLevels <- list(Skill1=c("High","Med","Low")) obsLevels <- c("Full","Partial","None") trueLnAlphas <- list(log(1),log(.25)) trueBetas <- list(2, -.5) priorLnAlphas <- list(log(.5),log(.5)) priorBetas <- list(1, -1) truedist <- calcDPCTable(pLevels,obsLevels,trueLnAlphas,trueBetas, rules="Compensatory",link="partialCredit") prior <- calcDPCTable(pLevels,obsLevels,priorLnAlphas,priorBetas, rules="Compensatory",link="partialCredit") post1 <- prior + round(truedist*1000) map1 <- mapDPC(post1,pLevels,obsLevels,priorLnAlphas,priorBetas, rules="Compensatory",link="partialCredit") if (map1$convergence != 0) { warning("Optimization did not converge:", map1$message) } postLnAlphas <- map1$lnAlphas postBetas <- map1$betas fitdist <- calcDPCTable(pLevels,obsLevels,map1$lnAlphas,map1$betas, rules="Compensatory",link="partialCredit") ## Tolerance for recovery test. tol <- .01 maxdistdif <- max(abs(fitdist-truedist)) if (maxdistdif > tol) { stop("Posterior and True CPT differ, maximum difference ",maxdistdif) } if (any(abs(unlist(postLnAlphas)-unlist(trueLnAlphas))>tol)) { stop("Log(alphas) differ by more than tolerance") } if (any(abs(unlist(postBetas)-unlist(trueBetas))>tol)) { stop("Betas differ by more than tolerance") }
Summary statistics for a data set consisting of marks on five mathematics exams originally presented in Mardia, Kent and Bibby (1979). The raw data are not given, but rather the summary statistics reported in Whittaker (1990) are provided.
data(MathGrades)
data(MathGrades)
A list consisting of the following components:
Names of the five tests.
Average score on each test.
Covariance matrix.
Correlation matrix.
Diagonal of the inverse covariance matrix.
Partial correlation (scaled inverse correlation) matrix.
A set of marginal distributions for discrete variables corresponding to the five assessments.
Summary statistics reported here are taken from Whittaker (1990).
Original data on 88 students is reported in Mardia, Kent and Bibby (1979).
Mardia, K.V. and Kent, J.T. and Bibby, J.M. (1979) Multivariate Analysis. Academic Press.
Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley.
data(MathGrades) ##Note: Some of these tests may return false due to machine precision ##issues. round(scaleMatrix(MathGrades$var),2) == MathGrades$cor round(diag(solve(MathGrades$cor)),2) == MathGrades$icdiag round(scaleMatrix(solve(MathGrades$cor)),2) == MathGrades$pcor
data(MathGrades) ##Note: Some of these tests may return false due to machine precision ##issues. round(scaleMatrix(MathGrades$var),2) == MathGrades$cor round(diag(solve(MathGrades$cor)),2) == MathGrades$icdiag round(scaleMatrix(solve(MathGrades$cor)),2) == MathGrades$pcor
Takes a graph described by an incidence matrix, and creates an ordering of the nodes using maximum cardinality search (Tarjan and Yannakakis, 1984). A primary application of this method is to chose an ordering of nodes in an undirected graph to make a corresponding directed graph.
mcSearch(sm, start = colnames(sm)[1])
mcSearch(sm, start = colnames(sm)[1])
sm |
A logical matrix whose rows and columns correspond to nodes (variables) and a true value indicates an edge between the variables. |
start |
The name of the first element. |
The sm
argument should be an incidence matrix for a graph, with
row and column names set to the names of the nodes/variables.
The function returns an ordering of the nodes where each node is chosen so that it has a maximum number of neighbors among those nodes higher in the order.
Ties are broken by chosing the first node in the graph (using the
order of the columns) matching the criteria. One special case is the
first node which is always an arbitrary choice. The start
argument can be used to force a particular selection.
A vector of lenght equal to the number of rows whose values correspond to the order of the variable in the final order.
If the graph is triangulated, then the ordering produced by
mcSearch
should be perfect — for each node in the
order, the set of neighbors of that node which preceed it in the
ordering is completely connected. Perfect node orderings are useful
in going from undirected to directed graphical representations. If
we take an undirected graph and a perfect ordering, a define as the
parents of an node all of its neighbors which are previous in
the order and define as its children all of the nodes which are later
in the order, then when converting the graph back to the undirected
form no additional “moralization” edges will be required.
Thus, this function can be used to generate orders for
buildParentList
.
Graphical models generally exist only over triangulated graphs.
Therefore, and incidence matrix which has been produced through the
use of the structMatrix
function should alway work.
Tarjan and Yannakakis (1984) prove that the maximum cardinality
search always produces a perfect ordering when the graph is
triangulated.
When the graph is not triangulated, the maximum cardinality search algorithm can be used to generate “fill-ins” to triangulate the graph. Lauritzen and Spiegelhalter (1988) note this use. While maximum cardinality search will produce an ordering quickly, the ordering itself has now particular optimality properties as far as the triangulated graph which it creates (Almond, 1995).
Russell Almond
Almond, R.G. (1995). Graphical Belief Modeling. Chapman and Hall.
Lauritzen, S.L. and D.J. Spiegelhalter (1988). Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion). Journal of the Royal Statistical Society,Series B, 50, 205-247.
Tarjan, R.E. and M. Yannakakis (1984). Simple Linear-Time Algorithms to test Chordality of Graphs, Test Acyclicity of Hypergraphs, and Selectively Reduce Acyclic Hypergraphs. Siam J. Comput. 13, 566-579.
data(MathGrades) MG.struct <- structMatrix(MathGrades$var) ord <- mcSearch(MG.struct) # Arbitrary start orda <- mcSearch(MG.struct, "Algebra") # Put algebra first. names(sort(orda)) # names of the variables in the chosen order. # Sort rows and columns of structure matrix by MC order MG.struct[order(orda), order(orda)]
data(MathGrades) MG.struct <- structMatrix(MathGrades$var) ord <- mcSearch(MG.struct) # Arbitrary start orda <- mcSearch(MG.struct, "Algebra") # Put algebra first. names(sort(orda)) # names of the variables in the chosen order. # Sort rows and columns of structure matrix by MC order MG.struct[order(orda), order(orda)]
Calculates the mutual information for a two-way table of observed counts or a joint probability distribution. The mutual information is a measure of association between two random variables.
mutualInformation(table)
mutualInformation(table)
table |
A two way table or probability distribution. Possibly
the output of the |
This is basically the Kullback-Leibler distance between the joint probability distribution and the probability distribution created by assuming the marginal distributions are independent. This is given in the following formula:
For square matrixes, the maximum mutual information, which should
should correspond to a diagnoal matrix, is log(d)
where
d
is the number of dimensions.
Russell Almond
https://en.wikipedia.org/wiki/Mutual_information
Shannon (1948) “A Mathematical Theory of Communication.”
table
, CPF
,
ewoe.CPF
, expTable
## UCBAdmissions is a three way table, so we need to ## make it a two way table. mutualInformation(apply(UCBAdmissions,c(1,2),sum)) apply(UCBAdmissions,3,mutualInformation) apply(UCBAdmissions,2,mutualInformation) apply(UCBAdmissions,1,mutualInformation) print(c(mutualInformation(matrix(1,2,2)),0)) print(c(mutualInformation(diag(2)), mutualInformation(1-diag(2)), log(2))) print(c(mutualInformation(diag(3)),log(3)))
## UCBAdmissions is a three way table, so we need to ## make it a two way table. mutualInformation(apply(UCBAdmissions,c(1,2),sum)) apply(UCBAdmissions,3,mutualInformation) apply(UCBAdmissions,2,mutualInformation) apply(UCBAdmissions,1,mutualInformation) print(c(mutualInformation(matrix(1,2,2)),0)) print(c(mutualInformation(diag(2)), mutualInformation(1-diag(2)), log(2))) print(c(mutualInformation(diag(3)),log(3)))
A conditional probability table (CPT) represents a collection of probability distribution, one for each configuration of the parent variables. This function normalizes the CPT, insuring that the probabilities in each conditional distribution sum to 1.
normalize(cpt) ## S3 method for class 'CPF' normalize(cpt) ## S3 method for class 'data.frame' normalize(cpt) ## S3 method for class 'CPA' normalize(cpt) ## S3 method for class 'array' normalize(cpt) ## S3 method for class 'matrix' normalize(cpt) ## S3 method for class 'table' normalize(cpt) ## Default S3 method: normalize(cpt)
normalize(cpt) ## S3 method for class 'CPF' normalize(cpt) ## S3 method for class 'data.frame' normalize(cpt) ## S3 method for class 'CPA' normalize(cpt) ## S3 method for class 'array' normalize(cpt) ## S3 method for class 'matrix' normalize(cpt) ## S3 method for class 'table' normalize(cpt) ## Default S3 method: normalize(cpt)
cpt |
A conditional probability table stored in either array
( |
The normalize
function is a generic function which attempts to
normalize a conditional probability distribution.
A conditional probability table in RNetica is represented in one of
two ways. In the conditional probability array (CPA
) the table
is represented as a dimensional array. The first
dimensions correspond to configurations of the parent variables and
the last dimension the child value. The
normalize.CPA
method
adjusts the data value so that the sum across all of the child states
is . Thus,
apply(result,1:p,sum)
should result in a
matrix of 's. The method
normalize.array
first coerces
its argument into a CPA
and then applies the
normalize.CPA
method.
The second way to represent a conditional probability table in RNetica
is to use a data frame (CPF
). Here the factor variables
correspond to a configuration of the parent states, and the numeric
columns correspond to states of the child variable. Each row
corresponds to a particular configuration of parent variables and the
numeric values should sum to one. The normalize.CPF
function
makes sure this constraint holds. The method
normalize.data.frame
first applies as.CPF()
to make the
data frame into a CPF.
The method normalize.matrix
ensures that the row sums are 1.
It does not change the class.
The default method only works for numeric objects. It ensures that
the total sum is .
NA
's are not allowed and will produce a result that is all
NA
s.
An object with similar properties to cpt
, but adjusted so that
probabilities sum to one.
For normalize.CPA
and normalize.array
an normalized
CPA
array.
For normalize.CPF
and normalize.data.fram
an normalized
CPF
data frame.
For normalize.matrix
an matrix whose row sums are .
For normalize.default
a numeric vector whose values sum to
.
May be other functions for CPTs later.
Russell Almond
n14 <- normalize(1:4) normalize(matrix(1:6,2,3)) normalize(array(1:24,c(4,3,2))) arr <- array(1:24,c(4,3,2), list(a=c("A1","A2","A3","A4"), b=c("B1","B2","B3"), c=c("C1","C2"))) arr <- as.CPA(arr) normalize(arr) arf <- as.CPF(arr) normalize(arf) df2 <- data.frame(parentval=c("a","b"), prob.true=c(1,1),prob.false=c(1,1)) normalize(df2) ## Admit is the response variable, so force it to be the last dimension. UCBA1 <- aperm(UCBAdmissions,3:1) normalize(UCBA1)
n14 <- normalize(1:4) normalize(matrix(1:6,2,3)) normalize(array(1:24,c(4,3,2))) arr <- array(1:24,c(4,3,2), list(a=c("A1","A2","A3","A4"), b=c("B1","B2","B3"), c=c("C1","C2"))) arr <- as.CPA(arr) normalize(arr) arf <- as.CPF(arr) normalize(arf) df2 <- data.frame(parentval=c("a","b"), prob.true=c(1,1),prob.false=c(1,1)) normalize(df2) ## Admit is the response variable, so force it to be the last dimension. UCBA1 <- aperm(UCBAdmissions,3:1) normalize(UCBA1)
This link function assumes that the effective theta for this distribution defines the mean of a normal distribution in a generalized regression model. The link scale parameter describes the residual variance.
normalLink(et, linkScale = NULL, obsLevels = NULL)
normalLink(et, linkScale = NULL, obsLevels = NULL)
et |
A matrix of effective theta values. There should be one row in this table for each configuration of the parent variables of the conditional probability table and one column for each state of the child variables except for the last. |
linkScale |
The residual standard deviation parameter. This
value must be supplied and should be a positive number; the default
value of |
obsLevels |
An optional character vector giving the names of the
child variable states. If supplied, it should have length
|
This function takes care of the third step in the algorithm of
calcDPCTable
. Its input is a matrix of effective theta
values (comparable to the last column of the output of
eThetaFrame
), one column for each of the child variable
states (obsLevels
) except for the last one. Each row
represents a different configuration of the parent variables. The
output is the conditional probability table. The use of this function
makes calcDPCTable
behave like calcDNTable
.
The idea behind this link function was first proposed in Almond
(2010), and it is more completely described in Almond et al. (2015).
The motivation comes from assuming that the child variable is created
by taking cuts on an underlying continuous variable. The marginal
distribution of this variable is a standard normal. The conditional
distribution is like a regression prediction with the effective theta
from the parent variables (the et
argument) as the expected
value and the linkScale
parameter as the residual standard
deviation.
The calculation works as follows: First cut points are set related to
the categories of the child variable. Let m
be the number of
categories (this should be one more than the number of columns of
et
) and the length of obsLevels
if that is supplied).
Then the cut points are set at cuts <- qnorm(((m -
1):1)/m)
.
Then for each row of the conditional probability table i
, the
probability of being in state k
is calculated by
pnorm(cuts[k]-et[i, 1])/linkScale) -
pnorm(cuts[k-1]-et[i, 1])/linkScale)
with the pnorm
expression set to 0 or 1 at the endpoints. Note that only the first
column of et
is used in the calculation.
A matrix with one more column than et
giving the conditional
probabilities for each configuration of the parent variables (which
correspond to the rows).
The motivation for the normal link function originally came from the
observation of odd behavior when variables given a DiBello-Samejima
distribution (that is, using the gradedResponse
link
function) were used as parent variables for other variables with a
DiBello-Samejima distribution.
One potential reason for the odd behavior was that the graded response
link function was not an inverse of the procedure used to assign the
effectiveThetas
to the parent variables. Thus, using a
probit link function (normalLink
) was thought to be better for
parent variables than using a logistic link function
(gradedResponse
), at the same time the convention of
assigning parent values based on quantiles of the normal distribution
started. This made the normalLink
and effectiveThetas
approximate inverses (information is still lost through
discritization). Note that in the current implementation the scale
factor of 1.7 has been added to both the partialCredit
and gradedResponse
functions to make the logistic
function closer to the normal distribution and a better inverse for
the effective theta procedure.
Russell Almond
Almond, R. G. (2010). ‘I can name that Bayesian network in two matrixes.’ International Journal of Approximate Reasoning. 51, 167-178.
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Other link functions: partialCredit
,
gradedResponse
.
Functions which directly use the link function:
eThetaFrame
, calcDPCTable
,
mapDPC
Earlier version of the graded response link:
calcDNTable
skill1l <- c("High","Medium","Low") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Get some effective theta values. et <- effectiveThetas(3) normalLink(matrix(et,ncol=1),.5,correctL) normalLink(matrix(et,ncol=1),.3,correctL) normalLink(matrix(et,nrow=3,ncol=2),.5,pcreditL) normalLink(matrix(et,nrow=3,ncol=2),.8,pcreditL) normalLink(matrix(et,nrow=3,ncol=4),.5,gradeL) normalLink(matrix(et,nrow=3,ncol=4),.25,gradeL)
skill1l <- c("High","Medium","Low") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Get some effective theta values. et <- effectiveThetas(3) normalLink(matrix(et,ncol=1),.5,correctL) normalLink(matrix(et,ncol=1),.3,correctL) normalLink(matrix(et,nrow=3,ncol=2),.5,pcreditL) normalLink(matrix(et,nrow=3,ncol=2),.8,pcreditL) normalLink(matrix(et,nrow=3,ncol=4),.5,gradeL) normalLink(matrix(et,nrow=3,ncol=4),.25,gradeL)
The function numericPart()
converts a data.frame
to a
matrix
, by dropping columns which contain non-numeric data.
The function factorPart
grabs the state information by
selecting only columns which are factors.
numericPart(table) factorPart(table)
numericPart(table) factorPart(table)
table |
A |
The primary purpose is to split a conditional probability distribution in data frame format (with a set of factor rows identifying the states of the parent variables) to a table of just the numbers, and a data frame of just the factors so that they can be tackled separately.
A matrix containing just the numeric (or factor) columns of the data frame.
Russell Almond
data.frame
, matrix
,
data.matrix
name <-c("Shahrazad", "Marguerite") height <- c(34, 36) weight <- c(28, 26) twins <- data.frame(name=as.factor(name),height=height, weight=weight) numericPart(twins) factorPart(twins)
name <-c("Shahrazad", "Marguerite") height <- c(34, 36) weight <- c(28, 26) twins <- data.frame(name=as.factor(name),height=height, weight=weight) numericPart(twins) factorPart(twins)
The observable characteristic plot is an analogue of the item characteristic curve for latent class models. Estimates of the class success probability for each latent is plotted. Reference lines are added for various groups which are expected to have roughly the same probability of success.
OCP(x, n, lclabs, pi, pilab = names(pi), lcnames = names(x), a = 0.5, b = 0.5, reflty = 1, ..., newplot = TRUE, main = NULL, sub = NULL, xlab = "Latent Classes", ylab = "Probability", cex = par("cex"), xlim = NULL, ylim = NULL, cex.axis = par("cex.axis")) OCP2(x, n, lclabs, pi, pilab = names(pi), lcnames = names(x), set1 = seq(1, length(x)-1, 2), setlabs = c("Set1", "Set2"), setat = -1, a = 0.5, b = 0.5, reflty = 1, ..., newplot = TRUE, main = NULL, sub = NULL, xlab = "Latent Classes", ylab = "Probability", cex = par("cex"), xlim = NULL, ylim = NULL, cex.axis = par("cex.axis"))
OCP(x, n, lclabs, pi, pilab = names(pi), lcnames = names(x), a = 0.5, b = 0.5, reflty = 1, ..., newplot = TRUE, main = NULL, sub = NULL, xlab = "Latent Classes", ylab = "Probability", cex = par("cex"), xlim = NULL, ylim = NULL, cex.axis = par("cex.axis")) OCP2(x, n, lclabs, pi, pilab = names(pi), lcnames = names(x), set1 = seq(1, length(x)-1, 2), setlabs = c("Set1", "Set2"), setat = -1, a = 0.5, b = 0.5, reflty = 1, ..., newplot = TRUE, main = NULL, sub = NULL, xlab = "Latent Classes", ylab = "Probability", cex = par("cex"), xlim = NULL, ylim = NULL, cex.axis = par("cex.axis"))
x |
Vector of success counts for each latent class. |
n |
Vector of of the same size as |
lclabs |
Character vector of plotting symbols for each latent class |
pi |
Vector of probability estimates for various levels. |
pilab |
Character vector of names for each level. |
lcnames |
Character vector of names for each latent class for axis. |
set1 |
For |
setlabs |
Character vectors of set labels for |
setat |
Numeric scalar giving the x-coordinate for set labels. |
a |
The first parameter for beta pseudo-prior, passed to
|
b |
The second parameter for beta pseudo-prior, passed to
|
reflty |
Provides the line type (see
|
... |
Additional graphics parameters passed to
|
newplot |
Logical. If true, a new plotting window is created. Otherwise, information is added to existing plotting window. |
main |
Character scalar giving main title (see
|
sub |
Character scalar giving sub title (see
|
xlab |
Character scalar giving x-axis label (see
|
ylab |
Character scalar giving x-axis label (see
|
cex |
Scalar giving character expansion for plotting symbols (see
|
xlim |
Plotting limits for x-axis (see
|
ylim |
Plotting limits for y-axis (see
|
cex.axis |
Scalar giving character expansion for latent class
names (axis tick labels; |
Most cognitively diagnostic models for assessments assume that students with different patterns of skills will have different patterns of success and failures. The goal of this plot type is to empirically check to see if the various groups conform to their expected probability.
The assumption is that the population is divided into a number of
latent classes (or groups of latent classes) and that the class
label for each member of the class is known. The latent classes are
partitioned into levels, where each level is assumed to have
roughly the same proportion of success. (This is often
c("-","+")
for negative and positive skill patterns, but there
could be multiple levels.)
The key idea of the observable characteristic plot is to compare a
credibility interval for the success probability in each latent class,
to the modelled success probability given by its level. A beta
credibility interval is computed for each latent class using
betaci(x,n,a,b)
, with the expected value set at
. The plotting symbols given in
lclabs
are
plotted at the mean (usually, these correspond to the latent class is
in), with vertical error bars determined by betaci
. Horizontal
reference lines are added at the values given by pi
. The idea
is that if the error bars cross the reference line for the appropriate
level, then the latent class fits the model, if not it does not.
The variant OCP2
is the same except that the latent classes are
further partitioned into two sets, and the labels for the latent
classes for different sets are plotted on different lines. This is
particularly useful for testing whether attributes which are not
thought to be relevant to a given problem are actually irrelevant.
The output of betaci
is returned invisibly.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 10.
Sinharay, S. and Almond, R.G. (2006). Assessing Fit of Cognitively Diagnostic Models: A case study. Educational and Psychological Measurement. 67(2), 239–257.
Sinharay, S., Almond, R. G. and Yan, D. (2004). Assessing fit of models with discrete proficiency variables in educational assessment. ETS Research Report. http://www.ets.org/research/researcher/RR-04-07.html
nn <- c(30,15,20,35) pi <- c("+"=.15,"-"=.85) grouplabs <- c(rep("-",3),"+") x <- c("(0,0)"=7,"(0,1)"=4,"(1,0)"=2,"(1,1)"=31) OCP (x,nn,grouplabs,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), main="Data that fit the model") x1 <- c("(0,0)"=7,"(0,1)"=4,"(1,0)"=11,"(1,1)"=31) OCP (x1,nn,grouplabs,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), main="Data that don't fit the model") nnn <- c("(0,0,0)"=20,"(0,0,1)"=10, "(0,1,0)"=10,"(0,1,0)"=5, "(1,0,0)"=10,"(1,0,1)"=10, "(1,1,1)"=10,"(1,1,1)"=25) xx <- c("(0,0,0)"=5,"(0,0,1)"=2, "(0,1,0)"=2,"(0,1,1)"=2, "(1,0,0)"=2,"(1,0,1)"=0, "(1,1,0)"=9,"(1,1,1)"=21) grouplabs1 <- rep(grouplabs,each=2) OCP2 (xx,nnn,grouplabs1,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), setlabs=c("Low Skill3","High Skill3"),setat=-.8, main="Data for which Skill 3 is irrelevant") xx1 <- c("(0,0,0)"=2,"(0,0,1)"=5, "(0,1,0)"=1,"(0,1,1)"=3, "(1,0,0)"=0,"(1,0,1)"=2, "(1,1,0)"=5,"(1,1,1)"=24) OCP2 (xx1,nnn,grouplabs1,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), setlabs=c("Low Skill3","High Skill3"),setat=-.8, main="Data for which Skill 3 is relevant")
nn <- c(30,15,20,35) pi <- c("+"=.15,"-"=.85) grouplabs <- c(rep("-",3),"+") x <- c("(0,0)"=7,"(0,1)"=4,"(1,0)"=2,"(1,1)"=31) OCP (x,nn,grouplabs,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), main="Data that fit the model") x1 <- c("(0,0)"=7,"(0,1)"=4,"(1,0)"=11,"(1,1)"=31) OCP (x1,nn,grouplabs,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), main="Data that don't fit the model") nnn <- c("(0,0,0)"=20,"(0,0,1)"=10, "(0,1,0)"=10,"(0,1,0)"=5, "(1,0,0)"=10,"(1,0,1)"=10, "(1,1,1)"=10,"(1,1,1)"=25) xx <- c("(0,0,0)"=5,"(0,0,1)"=2, "(0,1,0)"=2,"(0,1,1)"=2, "(1,0,0)"=2,"(1,0,1)"=0, "(1,1,0)"=9,"(1,1,1)"=21) grouplabs1 <- rep(grouplabs,each=2) OCP2 (xx,nnn,grouplabs1,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), setlabs=c("Low Skill3","High Skill3"),setat=-.8, main="Data for which Skill 3 is irrelevant") xx1 <- c("(0,0,0)"=2,"(0,0,1)"=5, "(0,1,0)"=1,"(0,1,1)"=3, "(1,0,0)"=0,"(1,0,1)"=2, "(1,1,0)"=5,"(1,1,1)"=24) OCP2 (xx1,nnn,grouplabs1,pi,c("-","+"),ylim=c(0,1), reflty=c(2,4), setlabs=c("Low Skill3","High Skill3"),setat=-.8, main="Data for which Skill 3 is relevant")
The observable characteristic plot is a plot that compares a
conditional probabiltiy distribution with a set of data nominally from
that distribution. The exp
argument is the conditional
distribution in CPF
format. This is plotted as a
barchart. The obs
argument is the data in CPF
(rows not normalized). The data are plotted as cumulative probability
estimates with error bars overlaid on the barplot. The function
OCP2.CPF
does the same thing, but contrasts two different data
sets.
OCP.CPF(obs, exp, ..., baseCol = "chocolate", limits = c(lower = 0.025, upper = 0.975), Parents = getTableParents(exp), States = getTableStates(exp), nstates=length(States), key=list(points=FALSE,rectangles=TRUE,text=States), par.settings = list(), point.pch = paste((nstates-1):0), point.col = "black", point.cex = 1.25, line.col = "black", line.type=1) OCP2.CPF(obs1, obs2, exp, ..., baseCol = "chocolate", limits = c(lower = 0.025, upper = 0.975), Parents = getTableParents(exp), States = getTableStates(exp), nstates=length(States), key=list(points=FALSE,rectangles=TRUE,text=States), par.settings = list(), point1.pch = paste((nstates-1):0), point1.col = "black", point1.cex = 1.25, line1.col = "black", line1.type=1, point2.pch = paste((nstates-1):0), point2.col = "cyan", point2.cex = 1.25, line2.col = "cyan", line2.type=2)
OCP.CPF(obs, exp, ..., baseCol = "chocolate", limits = c(lower = 0.025, upper = 0.975), Parents = getTableParents(exp), States = getTableStates(exp), nstates=length(States), key=list(points=FALSE,rectangles=TRUE,text=States), par.settings = list(), point.pch = paste((nstates-1):0), point.col = "black", point.cex = 1.25, line.col = "black", line.type=1) OCP2.CPF(obs1, obs2, exp, ..., baseCol = "chocolate", limits = c(lower = 0.025, upper = 0.975), Parents = getTableParents(exp), States = getTableStates(exp), nstates=length(States), key=list(points=FALSE,rectangles=TRUE,text=States), par.settings = list(), point1.pch = paste((nstates-1):0), point1.col = "black", point1.cex = 1.25, line1.col = "black", line1.type=1, point2.pch = paste((nstates-1):0), point2.col = "cyan", point2.cex = 1.25, line2.col = "cyan", line2.type=2)
obs , obs1 , obs2
|
A |
exp |
A |
... |
Other arguments passed to |
baseCol |
The base color to use for the color gradients in the bars. |
limits |
A vector of two probabilities representing the quantiles of the beta distribution to be used in error bars. |
Parents |
A character vector giving the names of the parent variables. |
States |
A character vector giving the names of the child states. |
nstates |
An integer giving the nubmer of states. |
par.settings |
Setting passed to |
key |
Settings passed to |
point.pch , point1.pch , point2.pch
|
Characters used to plot data points. |
point.col , point1.col , point2.col
|
The colour to be used for the overlay points. |
point.cex , point1.cex , point2.cex
|
The size (cex) to be used for the overlay points. |
line.col , line1.col , line2.col
|
The colour to be used for the overlay lines. |
line.type , line1.type , line2.type
|
The type (lty) for the overlay lines. |
The cannonical use for this plot is to test the conditional
probability model used in particular Bayesian network. The exp
argument is the proposed table. If the parents are fully observed,
then the obs
argument would be a contingency table with
the rows representing parent configurations and the columns the data.
If the parents are not fully observed, then obs can be replaced by an
expected contingincy table (see Almond, et al, 2015).
The bars are coloured using an intesity scale starting from a base
colour. The baseColor
argument gives the darkest colour in the
scale. The base plot is very similar to barchart.CPF
.
Point and interval estimates are made for the cumulative probabilities using a
simple Bayesian estimator (this avoids potential issues with 0 cell
counts). Basically, the expected values (exp
) are added to the
data as a prior. The intervals are formued using quantiles of the
beta distribution. The limits
argument gives the quantiles
used.
The point and interval estimaters are plotted over the barchart using
a numeric label (going from 0 smallest to , were
is
the number of states of the child variable) and line running from the
lower to upper bound. If the line overlaps the corresponding bar in
the barchart, then the data are consistent with the model (at least in
that cell).
The parameters point.pch
point.col
, point.cex
,
line.col
, and line.type
control the appearance of the
points and lines. These may either be scalars or vectors which match
the number of states of the child variable.
The function OCP2.CPF
is similar but it meant for situations in
which two different data distributions are to be compared. In this
one (obs1
, point1.*
, line1.*
) is plotted in the
upper half of the bar, and the other (obs2
, point2.*
,
line2.*
) in the lower half. Two important applications of this
particular variation:
Two different data sets are related to the presence/absence of an additional parent variable. Thus, this is a test of whether or not the additional parent is relevant.
Two different data sets represent a focal and reference demographic group. This is a test of whether or not the item behaves similarly for both groups.
Returns an object of class
lattice::trellis.object
. The default print
method will plot this function.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 10.
Sinharay, S. and Almond, R.G. (2006). Assessing Fit of Cognitively Diagnostic Models: A case study. Educational and Psychological Measurement. 67(2), 239–257.
Sinharay, S., Almond, R. G. and Yan, D. (2004). Assessing fit of models with discrete proficiency variables in educational assessment. ETS Research Report. http://www.ets.org/research/researcher/RR-04-07.html
betaci
, OCP
, barchart.CPF
,
cptChi2
skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) ## Compatible data datTheta <- cpfTheta ## Copy to get the shape datTheta[1,] <- rmultinom(1,25,cpfTheta[1,]) OCP.CPF(datTheta,cpfTheta) ## Incompatible data datThetaX <- cpfTheta ## Copy to get the shape datThetaX[1,] <- rmultinom(1,100,c(.05,.45,.5)) OCP.CPF(datThetaX,cpfTheta) OCP2.CPF(datTheta,datThetaX,cpfTheta) cptComp <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Compensatory") cptConj <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Conjunctive") datComp <- cptComp for (i in 1:nrow(datComp)) ## Each row has a random sample size. datComp[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1),cptComp[i,3:4]) datConj <- cptConj for (i in 1:nrow(datConj)) ## Each row has a random sample size. datConj[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1),cptConj[i,3:4]) ## Compatible OCP.CPF(datConj,cptConj) ## Incompatible OCP.CPF(datComp,cptConj) OCP2.CPF(datConj,datComp,cptConj) cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") cptPC2 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=1,S2=0)), rule="OffsetDisjunctive") datPC1 <- cptPC1 for (i in 1:nrow(datPC1)) ## Each row has a random sample size. datPC1[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC1[i,3:5]) datPC2 <- cptPC2 for (i in 1:nrow(datPC2)) ## Each row has a random sample size. datPC2[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC2[i,3:5]) ## Compatible OCP.CPF(datPC1,cptPC1) ## Incompatible OCP.CPF(datPC2,cptPC1) OCP2.CPF(datPC1,datPC2,cptPC1)
skill1l <- c("High","Medium","Low") skill2l <- c("High","Medium","Low","LowerYet") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") cpfTheta <- calcDPCFrame(list(),skill1l,numeric(),0,rule="Compensatory", link="normalLink",linkScale=.5) ## Compatible data datTheta <- cpfTheta ## Copy to get the shape datTheta[1,] <- rmultinom(1,25,cpfTheta[1,]) OCP.CPF(datTheta,cpfTheta) ## Incompatible data datThetaX <- cpfTheta ## Copy to get the shape datThetaX[1,] <- rmultinom(1,100,c(.05,.45,.5)) OCP.CPF(datThetaX,cpfTheta) OCP2.CPF(datTheta,datThetaX,cpfTheta) cptComp <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Compensatory") cptConj <- calcDPCFrame(list(S2=skill2l,S1=skill1l),correctL, lnAlphas=log(c(1.2,.8)), betas=0, rule="Conjunctive") datComp <- cptComp for (i in 1:nrow(datComp)) ## Each row has a random sample size. datComp[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1),cptComp[i,3:4]) datConj <- cptConj for (i in 1:nrow(datConj)) ## Each row has a random sample size. datConj[i,3:4] <- rmultinom(1,rnbinom(1,mu=15,size=1),cptConj[i,3:4]) ## Compatible OCP.CPF(datConj,cptConj) ## Incompatible OCP.CPF(datComp,cptConj) OCP2.CPF(datConj,datComp,cptConj) cptPC1 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=999,S2=0)), rule="OffsetDisjunctive") cptPC2 <- calcDPCFrame(list(S1=skill1l,S2=skill2l),pcreditL, lnAlphas=log(1), betas=list(full=c(S1=0,S2=999),partial=c(S2=1,S2=0)), rule="OffsetDisjunctive") datPC1 <- cptPC1 for (i in 1:nrow(datPC1)) ## Each row has a random sample size. datPC1[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC1[i,3:5]) datPC2 <- cptPC2 for (i in 1:nrow(datPC2)) ## Each row has a random sample size. datPC2[i,3:5] <- rmultinom(1,rnbinom(1,mu=25,size=1),cptPC2[i,3:5]) ## Compatible OCP.CPF(datPC1,cptPC1) ## Incompatible OCP.CPF(datPC2,cptPC1) OCP2.CPF(datPC1,datPC2,cptPC1)
These functions take a vector of “effective theta” values for a collection of parent variables and calculates the effective theta value for the child variable according to the named rule. Used in calculating DiBello–Samejima and DiBello–Normal probability tables. These versions have a single slope parameter (alpha) and one difficulty parameter per parent variable.
OffsetConjunctive(theta, alpha, betas) OffsetDisjunctive(theta, alpha, betas)
OffsetConjunctive(theta, alpha, betas) OffsetDisjunctive(theta, alpha, betas)
theta |
A matrix of effective theta values whose columns correspond to parent variables and whose rows correspond to possible skill profiles. |
alpha |
A single common discrimination parameter. (Note these
function expect discrimination parameters and not log discrimination
parameters as used in |
betas |
A vector of difficulty (-intercept) parameters. Its
length should be the same as the number of columns in |
For OffsetConjunctive
, the combination function for each row is:
For OffsetDisjunctive
, the combination function for each row is:
A vector of normal deviates corresponding to the effective theta
value. Length is the number of rows of thetas
.
These functions expect the unlogged discrimination parameters, while
calcDSTable
expect the log of the discrimination parameters.
The rationale is that log discrimination is bound away from zero, and
hence a more natural space for MCMC algorithms. However, it is poor
programming design, as it is liable to catch the unwary.
These functions are meant to be used as structure functions in the DiBello–Samejima and DiBello–Normal models. Other structure functions are possible and can be excepted by those functions as long as they have the same signature as these functions.
Note that the offset conjunctive and disjunctive model don't really
make much sense in the no parent case. Use Compensatory
instead.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001 Jaakkola and Richardson (eds)., Morgan Kaufmann, 137–143.
effectiveThetas
,calcDSTable
,
calcDNTable
,calcDPCTable
,
Compensatory
,
eThetaFrame
skill <- c("High","Medium","Low") thetas <- expand.grid(list(S1=seq(1,-1), S2 = seq(1, -1))) OffsetDisjunctive(thetas, 1.0, c(S1=0.25,S2=-0.25)) OffsetConjunctive(thetas, 1.0, c(S1=0.25,S2=-0.25)) eThetaFrame(list(S1=skill,S2=skill), 1.0, c(S1=0.25,S2=-0.25), "OffsetConjunctive") eThetaFrame(list(S1=skill,S2=skill), 1.0, c(S1=0.25,S2=-0.25), "OffsetDisjunctive")
skill <- c("High","Medium","Low") thetas <- expand.grid(list(S1=seq(1,-1), S2 = seq(1, -1))) OffsetDisjunctive(thetas, 1.0, c(S1=0.25,S2=-0.25)) OffsetConjunctive(thetas, 1.0, c(S1=0.25,S2=-0.25)) eThetaFrame(list(S1=skill,S2=skill), 1.0, c(S1=0.25,S2=-0.25), "OffsetConjunctive") eThetaFrame(list(S1=skill,S2=skill), 1.0, c(S1=0.25,S2=-0.25), "OffsetDisjunctive")
This takes a bunch of strings of the form "[High:.3,Med:.5,Low:.2]"
and parses it into a vector c(High=.3,Med=.5,Low=.2)
.
parseProbVec(pVec) parseProbVecRow(splitrow)
parseProbVec(pVec) parseProbVecRow(splitrow)
pVec |
A string of the form |
splitrow |
A collection of strings |
StatShop outputs marginal distributions in the format
[state0:val0,state1:val1,...]
. This
function takes a vector of strings containing probability vectors and
parses them, returning a matrix of the values, with column names given
by the names of the states.
The function parseProbVecRow()
is
an internal function which parses a single row (after it has been
split on the commas).
A matrix containing the values. The rows correspond to the elements
of pVec
. The columns correspond to the state names.
Russell Almond
http://research.ets.org/~ralmond/StatShop/dataFormats.html
parseProbVec(c(Good = "[High:.8,Med:.15,Low:.05]", Bad = "[High:.15,Med:.35,Low:.5]", Ugly = "[High:.01,Med:.09,Low:.9]"))
parseProbVec(c(Good = "[High:.8,Med:.15,Low:.05]", Bad = "[High:.15,Med:.35,Low:.5]", Ugly = "[High:.01,Med:.09,Low:.9]"))
This function converts a matrix of effective theta values into a conditional probability table by applying the generalized partial credit model to each row of the table.
partialCredit(et, linkScale = NULL, obsLevels = NULL)
partialCredit(et, linkScale = NULL, obsLevels = NULL)
et |
A matrix of effective theta values. There should be one row in this table for each configuration of the parent variables of the conditional probability table and one column for each state of the child variables except for the last. |
linkScale |
Unused. For compatibility with other link functions. |
obsLevels |
An optional character vector giving the names of the child variable states.
If supplied, it should have length |
This function takes care of the third step in the algorithm of
calcDPCTable
. Its input is a matrix of effective theta
values (comparable to the last column of the output of
eThetaFrame
), one column for each of the child variable
states (obsLevels
) except for the last one. Each row
represents a different configuration of the parent variables. The
output is the conditional probability table.
Let be the child variable of the distribution, and assume that
it can take on
possible states labeled
through
in increasing order. The generalized partial
credit model defines a set of functions
for
, where
The conditional probabilities for each child state is calculated by taking the differences between the curves.
The matrix
et
is the values of
. This function then performs the rest of the
generalized partial credit model. The original Samejima
(1969) development assumed that all of the functions
had the same linear form
, with the
strictly increasing (note that in CPTtools, the states are ordered
from highest to lowest, so that they should be strictly decreasing).
This meant that the curves would never cross. The general notation of
calcDPCTable
does not ensure the curves do not cross,
which could result in negative probabilities. This function handles
this case by forcing negative probabilities to zero (and adjusting the
probabilities for the other state to be properly normalized).
If supplied obsLevels
is used for the column names.
A matrix with one more column than et
giving the conditional
probabilities for each configuration of the parent variables (which
correspond to the rows).
The development here follows Muraki (1992) rather than Samejima (1969).
The linkScale
parameter is unused. It is for compatibility
with other link function choices.
Russell Almond
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 8.
Muraki, E. (1992). A Generalized Partial Credit Model: Application of an EM Algorithm. Applied Psychological Measurement, 16, 159-176. DOI: 10.1177/014662169201600206
I also have planned a manuscript that describes these functions in more detail.
Other Link functions:
gradedResponse
,normalLink
Functions which directly use the link function:
eThetaFrame
, calcDPCTable
,
mapDPC
## Set up variables skill1l <- c("High","Medium","Low") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Get some effective theta values. et <- effectiveThetas(3) partialCredit(matrix(et,ncol=1),NULL,correctL) partialCredit(outer(et,c(Full=1,Partial=-1)),NULL,pcreditL) partialCredit(outer(et,c(A=2,B=1,C=0,D=-1)),NULL,gradeL)
## Set up variables skill1l <- c("High","Medium","Low") correctL <- c("Correct","Incorrect") pcreditL <- c("Full","Partial","None") gradeL <- c("A","B","C","D","E") ## Get some effective theta values. et <- effectiveThetas(3) partialCredit(matrix(et,ncol=1),NULL,correctL) partialCredit(outer(et,c(Full=1,Partial=-1)),NULL,pcreditL) partialCredit(outer(et,c(A=2,B=1,C=0,D=-1)),NULL,gradeL)
A dimensional probability vector likes on a
-simplex.
This simplex can be projected onto 2-dimension, with vertices
corresponding to the state of the variable
(
simplex_vertex_projection
calculates these co-ordinates).
The simplexplot
function produces the plot.
plotsimplex(data, radius = 1, order = NULL, labels_cex = 1, labels = colnames(data), show_labels = TRUE, points_col = "#00000044", points_pch = 19, points_cex = 1, label_offset = 0.9, projection_vertexes = simplex_vertex_projection(ncol(data), radius), show_points = TRUE, show_circle = TRUE, circle_col = "lightgray", show_edges = TRUE, edges_col = "lightgray", ...) simplex_vertex_projection(nvert, r=1)
plotsimplex(data, radius = 1, order = NULL, labels_cex = 1, labels = colnames(data), show_labels = TRUE, points_col = "#00000044", points_pch = 19, points_cex = 1, label_offset = 0.9, projection_vertexes = simplex_vertex_projection(ncol(data), radius), show_points = TRUE, show_circle = TRUE, circle_col = "lightgray", show_edges = TRUE, edges_col = "lightgray", ...) simplex_vertex_projection(nvert, r=1)
data |
A data frame or matrix with rows representing individuals, and columns states. The row sums should be one. |
radius , r
|
The radius of the circle on which to plot. |
order |
Order of the states. |
labels_cex |
Graphical parameter controlling label size. |
labels |
Labels for the vertices. |
show_labels |
Logical, should labels be shown. |
points_col |
Colours for the points. |
points_pch |
Glyph for the points. |
points_cex |
Size for the points. |
label_offset |
Location for the labels. |
projection_vertexes |
A function for computing the vertex locations. |
show_points |
Logical, should the points be plotted. |
show_circle |
Logical, should the reference cicle be plotted. |
circle_col |
Logical, colour for the circle. |
show_edges |
Logical, should the edges be shown. |
edges_col |
The colour for the edges. |
... |
extra arguments are ignored. |
nvert |
The number of vertices. |
This function is adapted from the archetypes::simplexplot
function. As the probability vector is already a simplex, we
don't need the archetypes overhead.
The function simplex_vertex_projection
returns a matrix
giving the co-ordinates of the vertices.
The function simplexplot
is normally called for its
side-effects, however, it invisibly returns a data structure
containing plot details.
Russell Almond
See Section 6 in "Probabilistic Archetypal Analysis" by Seth and Eugster (2014), http://arxiv.org/abs/1312.7604.
This is a simplified version of the code in the archetypes package. https://cran.r-project.org/package=archetypes
data("language16") ptab <- pvecTable(language16,"Reading") plotsimplex(ptab)
data("language16") ptab <- pvecTable(language16,"Reading") plotsimplex(ptab)
Produces credibility intervals for hanging barplots. Assumes that each column represents a sum of proportions and produces corresponding intervals for the cumulative sums. Values hanging below and above the reference line are treated separately, and returned values below the reference are negative.
proflevelci(data, profindex, limits=list(lower=.025,upper=.975),a=.5, b=.5)
proflevelci(data, profindex, limits=list(lower=.025,upper=.975),a=.5, b=.5)
data |
A matrix of data values where each column refers to a bar in barplot. The values should be scaled so that the column sum is the number of individuals in that group. |
profindex |
The level in the chart which corresponds to the
reference (proficiency) line. This should be a positive integer
less than |
limits |
The upper and lower credibility limits. |
a |
Value for the |
b |
Value for the |
For a stacked bar plot, the natural comparisons involve not category probabilities but the sum of the category probabilities up through the current bar. For hanging bar plots, this should go in both directions. So for example, if the categories are “Below Basic”, “Basic”, “Proficient”, and “Advanced”, and the zero line is to be put between “Basic” and “Proficient”, then we need credibility intervals for Pr(“Basic” or “Below Basic”), Pr(“Basic”), Pr(“Proficient”), Pr(“Proficient” or “Advanced”).
The proflevelci
function splits the states up into those above
the line and below the line using profindex
. It then generates
credibility intervals using betaci
for the cumulative
sums in each group. The primary purpose is to create confidence
intervals for stacked bar charts (see compareBars2
).
A list of data sets of the same length as the limits
argument.
Each data set has the same shape as the data
argument and
represents a quantile of the data associated with the value in limits.
With the default limits of lower
and upper
, the result
is a list of two elements
lower |
Gives the lower bounds of the confidence interval. |
upper |
Gives the upper bounds of the confidence interval. |
Russell Almond
margins <- data.frame ( Trouble=c(Novice=19,Semester1=24,Semester2=28,Semseter3=20,Semester4=9), NDK=c(Novice=1,Semester1=9,Semester2=35,Semseter3=41,Semester4=14), Model=c(Novice=19,Semester1=28,Semester2=31,Semseter3=18,Semester4=4) ) proflevelci(margins,3,limits=c(lower=.025,upper=.975))
margins <- data.frame ( Trouble=c(Novice=19,Semester1=24,Semester2=28,Semseter3=20,Semester4=9), NDK=c(Novice=1,Semester1=9,Semester2=35,Semseter3=41,Semester4=14), Model=c(Novice=19,Semester1=28,Semester2=31,Semseter3=18,Semester4=4) ) proflevelci(margins,3,limits=c(lower=.025,upper=.975))
In running a typical Bayes net engine, as each piece of
evidence comes in, updated marginal distributions for several
variables are output. This function reads a such a log,
expressed as a data frame, and creates a data structure
suitable for doing weight of evidence analyses. The
probabilities are in the pvec (parseProbVec
)
format.
readHistory(histdat, obscol = "Item", valcol = "Result", probcol="Probability")
readHistory(histdat, obscol = "Item", valcol = "Result", probcol="Probability")
histdat |
A data frame which has columns corresponding to
|
.
obscol |
Name of the column with the name of the observable at each time point. |
valcol |
Name of the column with the value of the observable at each time point. |
probcol |
Name of the column with the probabity vectors. |
The assumption is that the histdat
data frame contains
a history of measurements on a latent variable. The
probcol column should contain a probability vector of
the form: [High:0.527,Medium:0.447,Low:0.025]
. This
function parses this column (see parseProbVec
) and
builds a matrix with columns corresponding to the states of the
latent variable.
The rows are given names of the form, <obscol>=<valcol>
,
where <obscol>
and <valcol>
are the values in
the respective columns.
A matrix whose column names are taken from the probability
vectors and row names are taken from the obscol
and
valcol
fields.
The previous version (a) directly read the CSV file, and (b) had the names of the columns hard coded. This version will break any code that relied on the old version. However, it is hopefully more generally useful.
Russell Almond
http://research.ets.org/~ralmond/StatShop/dataFormats.html
testFiles <- system.file("testFiles",package="CPTtools") allcorrect <- readHistory(read.csv(file.path(testFiles, "CorrectSequence.csv"),as.is=TRUE), probcol="Margin.sequences.") woeHist(allcorrect,"High",c("Medium","Low")) allincorrect <- readHistory(read.csv(file.path(testFiles, "InCorrectSequence.csv"),as.is=TRUE), probcol="Margin.sequences.")
testFiles <- system.file("testFiles",package="CPTtools") allcorrect <- readHistory(read.csv(file.path(testFiles, "CorrectSequence.csv"),as.is=TRUE), probcol="Margin.sequences.") woeHist(allcorrect,"High",c("Medium","Low")) allincorrect <- readHistory(read.csv(file.path(testFiles, "InCorrectSequence.csv"),as.is=TRUE), probcol="Margin.sequences.")
Takes a table representing a conditional probability distribution or a
set of hyper-Dirichlet parameters and rescales the numeric part of the
table. The function rescaleTable()
scales the table by
scaleFactor
, the function normalizeTable()
scales the
function by the sum of the rows, making the result a conditional
probability table.
rescaleTable(table, scaleFactor) normalizeTable(table)
rescaleTable(table, scaleFactor) normalizeTable(table)
table |
A data frame describing a conditional probability table. Assumes that the conditions are expressed as factor variables, and all numeric columns represent states of the child variable. |
scaleFactor |
A scalar or vector of length equal to the number of
rows of |
For rescaleTable()
, every numeric column of table
is
multiplied by scaleFactor
. This can be used to create a set of
hyper-Dirichlet parameters by multiplying a conditional probability
table by the effective sample size.
For normalizeTable()
, the scaleFactor
is set to be
1/rowSums(table)
(excluding the factor variables) so that the
resulting table is a proper conditional probability table.
A data frame of the same shape as table
with the numeric
entries suitably scaled.
The function scaleTable
does a similar rescaling, only
it works with a separate ‘Sum’ and ‘Scale’ columns in
the table.
Russell Almond
#conditional table X2.ptf <- data.frame(Theta=c("Expert","Novice"), correct=c(4,2), incorrect=c(2,4)) X2.t99 <- rescaleTable(X2.ptf,99/6) #Reweight to effective samples size of 99 X2.t31 <- rescaleTable(X2.ptf,c(3,1)) #Weight expert prior 3 times more than #novice prior. X2.dtf <- normalizeTable(X2.ptf) #Unconditional table Theta.ptf <- data.frame(Expert=3,Novice=3) Theta.t100 <- rescaleTable(Theta.ptf,100/6) #Reweight to effective #sample size of 100 Theta.dtf <- normalizeTable(Theta.ptf)
#conditional table X2.ptf <- data.frame(Theta=c("Expert","Novice"), correct=c(4,2), incorrect=c(2,4)) X2.t99 <- rescaleTable(X2.ptf,99/6) #Reweight to effective samples size of 99 X2.t31 <- rescaleTable(X2.ptf,c(3,1)) #Weight expert prior 3 times more than #novice prior. X2.dtf <- normalizeTable(X2.ptf) #Unconditional table Theta.ptf <- data.frame(Expert=3,Novice=3) Theta.t100 <- rescaleTable(Theta.ptf,100/6) #Reweight to effective #sample size of 100 Theta.dtf <- normalizeTable(Theta.ptf)
Creates a correlation matrix from a covariance matrix by scaling rows and columns to have a unit diagonal. Also can be used to create a partial correlation matrix from an inverse covariance/correlation matrix.
scaleMatrix(X)
scaleMatrix(X)
X |
A square, positive definite matrix (covariance matrix). |
Divides rows and columns by square root of the diagonal elements.
A matrix of the same size and shape as the original with a unit diagonal.
Russell Almond
data(MathGrades) ## Create a correlation matrix from a covariance matrix. round(scaleMatrix(MathGrades$var),2) == MathGrades$cor ## Create a partial correlation matrix from a correlation matrix round(scaleMatrix(solve(MathGrades$cor)),2) == MathGrades$pcor ##Note: Some of these tests may return false due to machine precision ##issues.
data(MathGrades) ## Create a correlation matrix from a covariance matrix. round(scaleMatrix(MathGrades$var),2) == MathGrades$cor ## Create a partial correlation matrix from a correlation matrix round(scaleMatrix(solve(MathGrades$cor)),2) == MathGrades$pcor ##Note: Some of these tests may return false due to machine precision ##issues.
Takes a matrix or vector with a Sum
and Scale
column and
rescales it by multiplying each remaining element by the value of
for that row.
If the last two rows are not named Sum
and Scale
then it
simply returns its argument.
scaleTable(table)
scaleTable(table)
table |
A matrix or vector in which the last two columns are named "Scale" and "Sum". |
The parameters of a Dirichlet distribution can be stored in two ways,
one is to have each cell in the table represent a pseudo count. The
other was is to have each row represent a probability vector and use
an additional pseudo sample size (the Scale
column). If the
probability vector is reported in a some other metric (say as a
percentage or as a fraction of some smaller sample) the the Sum
column is used to store the row sum.
Rescaled table with Sum
and Scale
columns removed. This
makes some attempt to preserve the type of the table
argument
as a matrix, row vector or numeric object.
Used by the function compareDS
to compare tables which may be
in different formats.
Russell Almond
http://research.ets.org/~ralmond/StatShop/dataFormats.html
c1 <- matrix(c(70,20,10,10,20,70),nrow=2,byrow=TRUE, dimnames=list(NULL,c("H","M","L"))) s1 <- matrix(c(7,2,1,10,100,1,2,7,10,100),nrow=2,byrow=TRUE, dimnames=list(NULL,c("H","M","L","Sum","Scale"))) ## 1 row matrixes need special handling (c1[1,] is a vector not a matrix) c1r1 <- matrix(c1[1,],nrow=1,dimnames=list(NULL,c("H","M","L"))) s1r1 <- matrix(s1[1,],nrow=1,dimnames=list(NULL,c("H","M","L","Sum","Scale"))) stopifnot( identical(c1,scaleTable(s1)), identical(c1[1,],scaleTable(s1[1,])), identical(c1r1,scaleTable(s1r1)) ) # This should have no effect when run on matrixes without the Sum and # Scale column. stopifnot( identical(c1,scaleTable(c1)), identical(c1[1,],scaleTable(c1[1,])), identical(c1r1,scaleTable(c1r1)) )
c1 <- matrix(c(70,20,10,10,20,70),nrow=2,byrow=TRUE, dimnames=list(NULL,c("H","M","L"))) s1 <- matrix(c(7,2,1,10,100,1,2,7,10,100),nrow=2,byrow=TRUE, dimnames=list(NULL,c("H","M","L","Sum","Scale"))) ## 1 row matrixes need special handling (c1[1,] is a vector not a matrix) c1r1 <- matrix(c1[1,],nrow=1,dimnames=list(NULL,c("H","M","L"))) s1r1 <- matrix(s1[1,],nrow=1,dimnames=list(NULL,c("H","M","L","Sum","Scale"))) stopifnot( identical(c1,scaleTable(s1)), identical(c1[1,],scaleTable(s1[1,])), identical(c1r1,scaleTable(s1r1)) ) # This should have no effect when run on matrixes without the Sum and # Scale column. stopifnot( identical(c1,scaleTable(c1)), identical(c1[1,],scaleTable(c1[1,])), identical(c1r1,scaleTable(c1r1)) )
This produces a series of stacked bar plots staggered so that the baseline corresponds to a particular state level. This is primarily designed for producing plots of probability vectors coming out of Bayes net scoring.
stackedBarplot(height, width = 1, space = 0.2, offset = 0, names.arg = NULL, legend.text = NULL, horiz = FALSE, density = NULL, angle = 45, col = NULL, border = par("fg"), main = NULL, sub = NULL, xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, xpd = TRUE, axis = TRUE, axisnames = TRUE, cex.axis = par("cex.axis"), cex.names = par("cex.axis"), newplot = TRUE, axis.lty = 0, ...)
stackedBarplot(height, width = 1, space = 0.2, offset = 0, names.arg = NULL, legend.text = NULL, horiz = FALSE, density = NULL, angle = 45, col = NULL, border = par("fg"), main = NULL, sub = NULL, xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, xpd = TRUE, axis = TRUE, axisnames = TRUE, cex.axis = par("cex.axis"), cex.names = par("cex.axis"), newplot = TRUE, axis.lty = 0, ...)
height |
A matrix giving the heights of the bars. The columns represent bars, and the rows represent groups within each bar. |
width |
A numeric vector of length equal to the number of columns
in |
space |
A numeric vector of length equal to the number of columns
in |
offset |
A numeric vector of length equal to the number of
columns in |
names.arg |
If not missing, used as axis labels (see
|
legend.text |
If node null, a legend is generated (see
|
horiz |
A logical value. If true, stacked bars are printed horizontally instead of vertically. |
density |
Density of shading lines (see
|
angle |
Angle of shading lines (see
|
col |
Color used for each bar. This should be a scalar or a vector
of length equal to the number of rows of |
border |
Color for the rectangle borders (see
|
main |
Main title for plot (see |
sub |
Subtitle for plot (see |
xlab |
X-axis label for plot (see |
ylab |
Y-axis label for plot (see |
xlim |
Limits in user co-ordinates for the x-axis. Should be a vector of length 2. |
ylim |
Limits in user co-ordinates for the y-axis. Should be a vector of length 2. |
xpd |
A logical value controlling clipping. (see
|
axis |
A logical value. If true, a numeric scale is printed on the appropriate axis. |
axisnames |
A logical value. If true, column names are printed on the appropriate axis. |
cex.axis |
Character size used for the numeric axis labels (see
|
cex.names |
Character size used for the text (column names) axis
labels (see |
newplot |
A logical value. If true a new graphics region is created. If false, the plot is placed on top of the existing graphics region. |
axis.lty |
A value passed as the |
... |
This is a more detailed version of the stackedBars
graph which allows finer control. It is used mainly by
compareBars
.
There are some differences from stackedBars
. First,
height
can be any value, not just a vector of probability.
Second, offset
is given as a numeric value in the units of
height, rather than as an index into the array of heights. Most of
the rest of the arguments merely expose the graphical arguments to the
user.
The midpoints of the bars are returned invisibly.
Russell Almond
compareBars
, colorspread
,
buildMarginTab
, marginTab
,
barplot
,stackedBars
margins <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) margins <- as.matrix(margins) baseline <- apply(margins[1:2,],2,sum) stackedBarplot(margins,offset=-baseline, main="Marginal Distributions for NetPASS skills", sub="Baseline at 2nd Semester level.", col=hsv(223/360,.2,0.10*(5:1)+.5))
margins <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) margins <- as.matrix(margins) baseline <- apply(margins[1:2,],2,sum) stackedBarplot(margins,offset=-baseline, main="Marginal Distributions for NetPASS skills", sub="Baseline at 2nd Semester level.", col=hsv(223/360,.2,0.10*(5:1)+.5))
This produces a series of stacked bar plots staggered so that the baseline corresponds to a particular state level. This is primarily designed for producing plots of probability vectors coming out of Bayes net scoring.
stackedBars(data, profindex, ..., ylim = c(min(offsets) - 0.25, max(1 + offsets)), cex.names = par("cex.axis"), percent=TRUE, digits = 2*(1-percent), labrot=FALSE)
stackedBars(data, profindex, ..., ylim = c(min(offsets) - 0.25, max(1 + offsets)), cex.names = par("cex.axis"), percent=TRUE, digits = 2*(1-percent), labrot=FALSE)
data |
A |
profindex |
The index of the proficiency which should be used as a baseline. |
... |
Graphical arguments passed to |
ylim |
Default limits for Y axis. |
cex.names |
Magnification for names. |
percent |
Logical value. If true data values are treated as percentages instead of probabilities. |
digits |
Number of digits for overlaid numeric variables. |
labrot |
If true, labels are rotated 90 degrees. |
This plot type assumes that each column in its first argument is a
probability vector. It then produces a stacked bar for each column.
The baseline of the bar is offset by the probability for being in the
category marked by profindex
or below.
The probability values are overlaid on the bars.
Returns the midpoints of the bars invisibly.
Russell Almond
This plot type was initially developed in Jody Underwood's Evolve project.
Almond, R. G., Shute, V. J., Underwood, J. S., and Zapata-Rivera, J.-D (2009). Bayesian Networks: A Teacher's View. International Journal of Approximate Reasoning. 50, 450-460.
compareBars
, colorspread
,
buildMarginTab
, marginTab
,
barplot
,stackedBarplot
margins <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) stackedBars(margins,3, main="Marginal Distributions for NetPASS skills", sub="Baseline at 3rd Semester level.", cex.names=.75, col=hsv(223/360,.2,0.10*(5:1)+.5)) stackedBars(margins,3, main="Marginal Distributions for NetPASS skills", sub="Baseline at 3rd Semester level.", percent=FALSE,digits=2, cex.names=.75, col=hsv(223/360,.2,0.10*(5:1)+.5))
margins <- data.frame ( Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semseter3=.20,Semester4=.09), NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semseter3=.41,Semester4=.14), Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semseter3=.18,Semester4=.04) ) stackedBars(margins,3, main="Marginal Distributions for NetPASS skills", sub="Baseline at 3rd Semester level.", cex.names=.75, col=hsv(223/360,.2,0.10*(5:1)+.5)) stackedBars(margins,3, main="Marginal Distributions for NetPASS skills", sub="Baseline at 3rd Semester level.", percent=FALSE,digits=2, cex.names=.75, col=hsv(223/360,.2,0.10*(5:1)+.5))
This function finds an undirected graphical representation of a multivariate normal distribution with the given covariance matrix, by associating edges with non-zero entries. Graphical structure is given as an adjacency matrix.
structMatrix(X, threshold = 0.1)
structMatrix(X, threshold = 0.1)
X |
A variance matrix. |
threshold |
A numeric value giving the threshold for a value to be considered “non-zero”. |
For a multivariate normal model, zero entries in the inverse covariance matrix correspond to conditional independence statements true in the multivariate normal distribution (Whitaker, 1990; Dempster, 1972). Thus, every non-zero entry in the inverse correlation matrix corresponds to an edge in an undirected graphical model for the structure.
The threshold
parameter is used to determine how close to zero
a value must be to be considered zero. This allows for both
estimation error and numerical precision when inverting the covariance
matrix.
An adjacency matrix of the same size and shape as X
. In this
matrix result[i,j]
is TRUE
if and only if Node
and Node
are neighbors in the graph.
Models of this kind are known as “Covariance Selection Models” and were first studied by Dempster (1972).
Russell Almond
Dempster, A.P. (1972) Covariance Selection. Biometrics, 28, 157–175.
Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley.
scaleMatrix
, mcSearch
,
buildParentList
data(MathGrades) MG.struct <- structMatrix(MathGrades$var)
data(MathGrades) MG.struct <- structMatrix(MathGrades$var)
Creates a weight of evidence balance sheet from a history of marginal distributions.
woeBal(hist, pos, neg, obs=NULL, title = "Evidence Balance Sheet", col = rev(colorspread("slategray",ncol(hist),maxsat=TRUE)), posCol="cyan", negCol="red", stripCol=c("white","lightgray"), lcex = 0.65)
woeBal(hist, pos, neg, obs=NULL, title = "Evidence Balance Sheet", col = rev(colorspread("slategray",ncol(hist),maxsat=TRUE)), posCol="cyan", negCol="red", stripCol=c("white","lightgray"), lcex = 0.65)
hist |
A matrix whose rows represent time points (after tests) and columns represent probabilities. |
pos |
An expression for selecting rows of the |
neg |
An expression for selecting the rows corresponding to
the complement of the hypothesis. (The default value is
|
obs |
An optional character vector of the same length as the
number of rows of |
title |
Title for plot |
col |
A list of color values for probability bars. |
posCol |
The color to be used for bars showing positive weights of evidence. |
negCol |
The color to be used for bars showing negative weights of evidence. |
stripCol |
The colors to be used for the time step labels. Setting this to a vector of two colors creates alternate color stripes. Set this to "white" to disable that effect. |
lcex |
Character expansion size for labels. |
This constructs a weight of evidence balance sheet (Madigan,
Mosurski, and Almond, 1997) showing the changes
to the probability distribution and weight of evidence for each
change in the probability. The probabilities are given in the
hist
argument in which each row should be a probability
distribution for the target variable. The labels for the plot are
taken from the row labels of the hist
argument.
Madigan, Mosurski and Almond (1997) note that the definition of weight
of evidence is somewhat problematic if the hypothesis variable is not
binary. In that case, they recommend partitioning the states into a
positive and negative set. The pos
and neg
are meant to describe that partition. They can be any expression
suitable for selecting columns from the hist
matrix. This
function calls woeHist()
to calculate weights of evidence.
The row names of hist
are printed left-justified in the
leftmost column. If observed values (obs
) are supplied, they
are printed right justified in the same column.
The midpoints of the bars (see barplot
) are
returned invisibly.
Starts a new plotting page and creates three side-by-side plots, one for the labels, one for the probability bars and one for the weight of evidence bars.
Russell Almond
Good, I. (1971) The probabilistic explication of information, evidence, surprise, causality, explanation and utility. In Proceedings of a Symposium on the Foundations of Statistical Inference. Holt, Rinehart and Winston, 108-141.
Madigan, D., Mosurski, K. and Almond, R. (1997) Graphical explanation in belief networks. Journal of Computational Graphics and Statistics, 6, 160-181.
Almond, R. G., Kim, Y. J., Shute, V. J. and Ventura, M. (2013). Debugging the Evidence Chain. In Almond, R. G. and Mengshoel, O. (Eds.) Proceedings of the 2013 UAI Application Workshops: Big Data meet Complex Models and Models for Spatial, Temporal and Network Data (UAI2013AW), 1-10. http://ceur-ws.org/Vol-1024/paper-01.pdf
Almond, R.G., Mislevy, R.J., Steinberg, L.S., Williamson, D.M. and Yan, D. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 7.
readHistory
, woeHist
,
barplot
, Colors
sampleSequence <- read.csv(system.file("testFiles","SampleStudent.csv", package="CPTtools"), header=TRUE,row.names=1) woeBal(sampleSequence[,c("H","M","L")],c("H"),c("M","L"),lcex=1.25) woeBal(sampleSequence[,c("H","M","L")],c("H"),c("M","L"), obs=sampleSequence[,"Acc"],lcex=1.25)
sampleSequence <- read.csv(system.file("testFiles","SampleStudent.csv", package="CPTtools"), header=TRUE,row.names=1) woeBal(sampleSequence[,c("H","M","L")],c("H"),c("M","L"),lcex=1.25) woeBal(sampleSequence[,c("H","M","L")],c("H"),c("M","L"), obs=sampleSequence[,"Acc"],lcex=1.25)
Takes a matrix providing the probability distribution for the target variable at several time points and returns a weight of evidence for all time points except the first.
woeHist(hist, pos=1L, neg=NULL)
woeHist(hist, pos=1L, neg=NULL)
hist |
A matrix whose rows represent time points (after tests) and columns represent probabilities. |
pos |
An expression for selecting rows of the |
neg |
An expression for selecting the rows corresponding to
the complement of the hypothesis. (The default value is
|
Good (1971) defines the Weight Of Evidence (WOE) as:
Where is used to indicate the negation of the
hypothesis. Good recommends taking the log base 10 and multiplying by
100, and calls the resulting units centibans. The second
definition of weight of evidence as a difference in log odd leads
naturally to the idea of an incremental weight of evidence for each
new observation.
Following Madigan, Mosurski and Almond (1997), all that is needed to
calculate the WOE is the marginal distribution for the hypothesis
variable at each time point. They also note that the definition is
somewhat problematic if the hypothesis variable is not binary. In
that case, they recommend partitioning the states into a
positive and negative set. The pos
and neg
are meant to describe that partition. They can be any expression
suitable for selecting columns from the hist
matrix.
A vector of weights of evidence of length one less than the number of
rows of hist
(i.e., the result of applying diff()
to the
vector of log odds.)
Russell Almond
Good, I. (1971) The probabilistic explication of information, evidence, surprise, causality, explanation and utility. In Proceedings of a Symposium on the Foundations of Statistical Inference. Holt, Rinehart and Winston, 108-141.
Madigan, D., Mosurski, K. and Almond, R. (1997) Graphical explanation in belief networks. Journal of Computational Graphics and Statistics, 6, 160-181.
testFiles <- system.file("testFiles",package="CPTtools") allcorrect <- readHistory(read.csv(file.path(testFiles, "CorrectSequence.csv"),as.is=TRUE), probcol="Margin.sequences.") woeHist(allcorrect,"High",c("Medium","Low")) woeHist(allcorrect,1:2,3)
testFiles <- system.file("testFiles",package="CPTtools") allcorrect <- readHistory(read.csv(file.path(testFiles, "CorrectSequence.csv"),as.is=TRUE), probcol="Margin.sequences.") woeHist(allcorrect,"High",c("Medium","Low")) woeHist(allcorrect,1:2,3)