If you have a license key from Norsys, set the value of
"NETICA_LICENSE_KEY"
in ~/.Renviron
.
This exercise uses the language assessment proposed in @Mislevy1995.
Based on that paper, the following Netica files were built (and are
distributed in the “sampleNets” directory of the RNetica
package.)
Model | Filename | Variables | Replications |
---|---|---|---|
Proficiency | LanguagePM.dne | Reading, Writing, Speaking, Listening | NA |
Reading | ReadingTask.dne | TaskA | 5 |
Writing | WritingTask.dne | TaskB | 3 |
Speaking | SpeakingTask.dne | TaskC | 3 |
Listening | ListeningTask.dne | TaskD | 5 |
pmname <- "LanguagePM"
tasks <- c("Reading","Writing","Speaking","Listening")
reps.form16<-c(5,3,3,5)
Load the proficiency and evidence models.
pmfile <- system.file(paste0("sampleNets/",
pmname,".dne"),
package="RNetica")
pm <- ReadNetworks(pmfile,session=sess)
emfiles <- system.file(paste0("sampleNets/",
tasks,
"Task.dne"),
package="RNetica")
ems <- ReadNetworks(emfiles,session=sess)
Putting the proficiency variables and observable variables into node sets will make it easier to find them later.
While the proficiency model is a complete Bayesian network, the
evidence models (with “Task” in their name) are fragments. In
particular, they have unconnected edges corresponding to the node of the
proficiency variables. The AdjoinNetwork()
function copies
the evidence model into the active network and reconnects the edges. A
network containing the proficiency model plus one or more evidence
models is a motif.
For testing purposes, motifs with just the proficiency variable and the evidence model for a single task are useful.
For evaluating an assessment form, a motif consisting of the
proficiency model plus evidence models for all of the tasks is what is
needed. To construct this, adjoin copies of the evidence models
according the the form design given in the reps
variable.
## If reruning this code, delete the motif.
if (exists("motif16") && is.active(motif16))
DeleteNetwork(motif16)
motif16 <- CopyNetworks(pm,"motif16")
for (whichTask in 1:length(reps.form16)) {
for (iTasks in 1:reps.form16[whichTask]) {
newobs <- AdjoinNetwork(motif16,ems[[whichTask]])
}
}
observables <- NetworkNodesInSet(motif16,"Observables")
names(observables)
#> [1] "TaskD4" "TaskD3" "TaskD2" "TaskD1" "TaskD" "TaskC2" "TaskC1" "TaskC"
#> [9] "TaskB2" "TaskB1" "TaskB" "TaskA4" "TaskA3" "TaskA2" "TaskA1" "TaskA"
proficiencies <- NetworkNodesInSet(motif16,"Proficiencies")
names(proficiencies)
#> [1] "Listening" "Speaking" "Writing" "Reading"
Finally, compile the network.
First set some constants. B
is the number of cases to
simulate.
B <- 1000L
casefile16 <- tempfile("motif16test",fileext=".cas")
filestream <- CaseFileStream(casefile16, session=sess)
rng <- NewNeticaRNG(123456779, session=sess)
allvars <- c(proficiencies,observables)
The sample data and true values for the proficiency variables. However, need a space for the margin scores. Pre-allocating the space makes the simulation run faster.
Loop through the simulees.
Generate the data using the motif.
Write out the cases.
Retract the generated proficiency variables.
Collect the marginal distributions for the proficiency variables.
Unset all variables.
WithOpenCaseStream(filestream,
WithRNG(rng,
for (b in 1L:B) {
## 1. Generate the data
GenerateRandomCase(allvars,rng=rng)
## 2. Write the findings both proficiency and observable variables.
WriteFindings(allvars,filestream,b)
## 3. Retract the proficiency "findings", this leaves the marginal "scores" of the remaining variables.
lapply(proficiencies,RetractNodeFinding)
## 4. Save the marginal distributions
motif16.margins[b,] <- unlist(lapply(proficiencies,NodeBeliefs))
motif16.modes[b,] <- unlist(lapply(proficiencies,NodeMode))
## 5. Now clear out the rest of the nodes.
RetractNetFindings(motif16)
}))
Read the case data back in, and bind the raw data to the scores. Use
the orderVars
function to convert to factors.
motif16.data <- orderVars(read.CaseFile(casefile16,session=sess),
allvars)
motif16.modes <- orderVars(as.data.frame(motif16.modes),proficiencies)
motif16.data <- data.frame(motif16.data,motif16.margins,
mode=motif16.modes)
head(motif16.data)
#> IDnum Listening Speaking Writing Reading TaskD4 TaskD3
#> 1 1 Novice Intermediate Novice Intermediate Wrong Wrong
#> 2 2 Novice Novice Intermediate Advanced Wrong Wrong
#> 3 3 Intermediate Intermediate Intermediate Novice Wrong Wrong
#> 4 4 Novice Advanced Novice Novice Wrong Wrong
#> 5 5 Intermediate Intermediate Advanced Advanced Wrong Wrong
#> 6 6 Intermediate Advanced Intermediate Novice Right Wrong
#> TaskD2 TaskD1 TaskD TaskC2 TaskC1 TaskC TaskB2 TaskB1
#> 1 Wrong Wrong Wrong Good Good Poor PoorW_On_ PoorW_On_
#> 2 Wrong Wrong Wrong Okay Poor Okay PoorW_On_ GoodW_Off_
#> 3 Right Wrong Right Good Very_Good Okay PoorW_On_ GoodW_Off_
#> 4 Wrong Wrong Wrong Very_Good Poor Very_Good PoorW_Off_ PoorW_On_
#> 5 Wrong Wrong Right Okay Okay Good GoodW_On_ GoodW_On_
#> 6 Wrong Right Wrong Very_Good Good Very_Good PoorW_Off_ PoorW_On_
#> TaskB TaskA4 TaskA3 TaskA2 TaskA1 TaskA Listening.Novice
#> 1 GoodW_On_ Okay Good Poor Okay Poor 8.506357e-01
#> 2 PoorW_On_ Very_Good Very_Good Very_Good Very_Good Good 9.196874e-01
#> 3 PoorW_Off_ Poor Okay Poor Okay Poor 4.029975e-04
#> 4 PoorW_Off_ Good Poor Poor Good Okay 9.583628e-01
#> 5 GoodW_Off_ Very_Good Okay Good Okay Very_Good 3.311799e-02
#> 6 PoorW_On_ Poor Poor Good Okay Poor 9.516245e-05
#> Listening.Intermediate Listening.Advanced Speaking.Novice
#> 1 0.14856082 0.0008034986 0.197699726
#> 2 0.08000454 0.0003080814 0.275341392
#> 3 0.79419041 0.2054065764 0.054948598
#> 4 0.03918812 0.0024490482 0.002450181
#> 5 0.95899987 0.0078821173 0.095165730
#> 6 0.65141857 0.3484863043 0.012975344
#> Speaking.Intermediate Speaking.Advanced Writing.Novice Writing.Intermediate
#> 1 0.48786390 0.31443641 0.08341378 0.6577616
#> 2 0.71005255 0.01460609 0.83588433 0.1572442
#> 3 0.61901480 0.32603660 0.16053411 0.5409212
#> 4 0.03919352 0.95835632 0.47419086 0.4334946
#> 5 0.88375419 0.02108006 0.05525145 0.6505799
#> 6 0.27726373 0.70976090 0.09513711 0.7480154
#> Writing.Advanced Reading.Novice Reading.Intermediate Reading.Advanced
#> 1 0.258824587 5.645623e-01 0.43536362 7.412684e-05
#> 2 0.006871484 7.038870e-08 0.06890685 9.310931e-01
#> 3 0.298544705 9.542122e-01 0.04578766 1.103481e-07
#> 4 0.092314564 7.033880e-01 0.29656067 5.129754e-05
#> 5 0.294168621 2.271402e-05 0.52052850 4.794488e-01
#> 6 0.156847462 9.386655e-01 0.06133413 3.301953e-07
#> mode.Listening mode.Speaking mode.Writing mode.Reading
#> 1 Novice Intermediate Intermediate Novice
#> 2 Novice Intermediate Novice Advanced
#> 3 Intermediate Intermediate Intermediate Novice
#> 4 Novice Advanced Novice Novice
#> 5 Intermediate Intermediate Intermediate Intermediate
#> 6 Intermediate Advanced Intermediate Novice
The basic tool used to see how accurately the latent variables can be estimated is the confusion matrix. Let akm be the count of the number of cases where the simulated state was k and the estimated state was m. Several statistics are available using that matrix.
The basic confusion matrix can be built using the
table()
function.
motif16.cm <-
lapply(names(proficiencies), function (p)
table(motif16.data[,paste0(c("","mode."),p)]))
names(motif16.cm) <- names(proficiencies)
motif16.cm$Reading
#> mode.Reading
#> Reading Novice Intermediate Advanced
#> Novice 216 46 0
#> Intermediate 37 398 45
#> Advanced 0 41 217
The function CPTtools::accuracy
gives the fraction of
cases on the diagonal.
sapply(motif16.cm,CPTtools::accuracy)
#> Listening Speaking Writing Reading
#> 0.840 0.687 0.603 0.831
The Goodman–Kruskal lambda statistic corrects for agreement by simply guessing the most populuous category.
round(sapply(motif16.cm,CPTtools::gkLambda),3)
#> Listening Speaking Writing Reading
#> 0.672 0.330 0.191 0.675
And the Fleis-Cohen kappa corrects for random agreement.
round(sapply(motif16.cm,CPTtools::fcKappa),3)
#> Listening Speaking Writing Reading
#> 0.740 0.478 0.329 0.733
These are all explained in greater detail in the vignette
vignette("MeasuringAgreement", package="CPTtools")
.
Using the marginal distributions instead of the modal predictions, we
can construct an expected confusion matrix. This is done using the
function CPTtools::expTable
.
motif16.em <-
lapply(names(proficiencies), function (p)
CPTtools::expTable(motif16.data,p,p,"<var>\\.<state>"))
names(motif16.em) <- names(proficiencies)
round(motif16.em$Reading,3)
#> Reading
#> Reading Novice Intermediate Advanced
#> Novice 206.702 51.083 0.168
#> Intermediate 55.238 372.955 58.196
#> Advanced 0.060 55.962 199.636
The same statistics can be used for looking at agreement:
round(data.frame(
Accuracy=sapply(motif16.em,CPTtools::accuracy),
Lambda=sapply(motif16.em,CPTtools::gkLambda),
Kappa =sapply(motif16.em,CPTtools::fcKappa)),
3)
#> Accuracy Lambda Kappa
#> Listening 0.757 0.513 0.609
#> Speaking 0.583 0.167 0.325
#> Writing 0.501 -0.010 0.197
#> Reading 0.779 0.570 0.651
Now use six tasks of each type.
motif24.addreps <- c(1,3,3,1)
reps.form24 <- reps.form16+motif24.addreps
## If reruning this code, delete the motif.
if (exists("motif24") && is.active(motif24))
DeleteNetwork(motif24)
motif24 <- CopyNetworks(motif16,"motif24")
for (whichTask in 1:length(motif24.addreps)) {
for (iTasks in 1:motif24.addreps[whichTask]) {
newobs <- AdjoinNetwork(motif24,ems[[whichTask]])
}
}
observables <- NetworkNodesInSet(motif24,"Observables")
## Same names, but different nodes in different networks.
proficiencies <- NetworkNodesInSet(motif24,"Proficiencies")
CompileNetwork(motif24)
B <- 1000L
casefile24 <- tempfile("motif24test",fileext=".cas")
filestream <- CaseFileStream(casefile24, session=sess)
allvars <- c(proficiencies,observables)
rng <- NewNeticaRNG(677912356, session=sess)
varstatenames <-
names(unlist(lapply(proficiencies,NodeBeliefs)))
motif24.margins <- matrix(NA_real_,B,length(varstatenames))
colnames(motif24.margins) <- varstatenames
motif24.modes <- matrix(NA_character_,B,length(proficiencies))
colnames(motif24.modes) <- names(proficiencies)
WithOpenCaseStream(filestream,
WithRNG(rng,
for (b in 1L:B) {
## 1. Generate the data
GenerateRandomCase(allvars,rng=rng)
## 2. Write the findings both proficiency and observable variables.
WriteFindings(allvars,filestream,b)
## 3. Retract the proficiency "findings", this leaves the marginal "scores" of the remaining variables.
lapply(proficiencies,RetractNodeFinding)
## 4. Save the marginal distributions
motif24.margins[b,] <- unlist(lapply(proficiencies,NodeBeliefs))
motif24.modes[b,] <- unlist(lapply(proficiencies,NodeMode))
## 5. Now clear out the rest of the nodes.
RetractNetFindings(motif24)
}))
motif24.data <- orderVars(read.CaseFile(casefile24,session=sess),
allvars)
motif24.modes <- orderVars(as.data.frame(motif16.modes),proficiencies)
motif24.data <- data.frame(motif24.data,motif16.margins,
mode=motif24.modes)
head(motif24.data)
#> IDnum Listening Speaking Writing Reading TaskD5 TaskD4
#> 1 1 Intermediate Intermediate Intermediate Advanced Wrong Wrong
#> 2 2 Advanced Novice Intermediate Intermediate Wrong Right
#> 3 3 Intermediate Intermediate Novice Novice Right Wrong
#> 4 4 Advanced Advanced Novice Novice Right Right
#> 5 5 Novice Novice Intermediate Advanced Wrong Wrong
#> 6 6 Novice Intermediate Advanced Novice Wrong Wrong
#> TaskD3 TaskD2 TaskD1 TaskD TaskC5 TaskC4 TaskC3 TaskB5
#> 1 Wrong Right Wrong Wrong Very_Good Okay Good GoodW_On_
#> 2 Wrong Right Right Right Very_Good Good Good PoorW_On_
#> 3 Wrong Right Right Wrong Good Good Good PoorW_Off_
#> 4 Wrong Right Right Right Good Very_Good Very_Good PoorW_Off_
#> 5 Wrong Wrong Wrong Wrong Poor Poor Good GoodW_Off_
#> 6 Wrong Wrong Wrong Wrong Okay Poor Poor PoorW_On_
#> TaskB4 TaskB3 TaskA5 TaskC2 TaskC1 TaskC TaskB2
#> 1 GoodW_On_ GoodW_Off_ Good Okay Good Good GoodW_On_
#> 2 PoorW_Off_ GoodW_On_ Okay Good Good Okay GoodW_On_
#> 3 PoorW_Off_ PoorW_On_ Poor Okay Very_Good Good PoorW_Off_
#> 4 PoorW_Off_ PoorW_Off_ Poor Very_Good Very_Good Very_Good PoorW_Off_
#> 5 GoodW_On_ GoodW_Off_ Good Poor Poor Good GoodW_On_
#> 6 GoodW_On_ PoorW_On_ Okay Poor Poor Poor PoorW_Off_
#> TaskB1 TaskB TaskA4 TaskA3 TaskA2 TaskA1 TaskA
#> 1 GoodW_On_ GoodW_On_ Very_Good Okay Very_Good Okay Good
#> 2 GoodW_On_ GoodW_Off_ Very_Good Very_Good Good Good Poor
#> 3 GoodW_Off_ PoorW_Off_ Poor Poor Good Poor Poor
#> 4 PoorW_On_ PoorW_Off_ Poor Okay Poor Good Poor
#> 5 GoodW_On_ GoodW_On_ Good Very_Good Good Good Very_Good
#> 6 PoorW_On_ PoorW_Off_ Poor Poor Okay Okay Good
#> Listening.Novice Listening.Intermediate Listening.Advanced Speaking.Novice
#> 1 8.506357e-01 0.14856082 0.0008034986 0.197699726
#> 2 9.196874e-01 0.08000454 0.0003080814 0.275341392
#> 3 4.029975e-04 0.79419041 0.2054065764 0.054948598
#> 4 9.583628e-01 0.03918812 0.0024490482 0.002450181
#> 5 3.311799e-02 0.95899987 0.0078821173 0.095165730
#> 6 9.516245e-05 0.65141857 0.3484863043 0.012975344
#> Speaking.Intermediate Speaking.Advanced Writing.Novice Writing.Intermediate
#> 1 0.48786390 0.31443641 0.08341378 0.6577616
#> 2 0.71005255 0.01460609 0.83588433 0.1572442
#> 3 0.61901480 0.32603660 0.16053411 0.5409212
#> 4 0.03919352 0.95835632 0.47419086 0.4334946
#> 5 0.88375419 0.02108006 0.05525145 0.6505799
#> 6 0.27726373 0.70976090 0.09513711 0.7480154
#> Writing.Advanced Reading.Novice Reading.Intermediate Reading.Advanced
#> 1 0.258824587 5.645623e-01 0.43536362 7.412684e-05
#> 2 0.006871484 7.038870e-08 0.06890685 9.310931e-01
#> 3 0.298544705 9.542122e-01 0.04578766 1.103481e-07
#> 4 0.092314564 7.033880e-01 0.29656067 5.129754e-05
#> 5 0.294168621 2.271402e-05 0.52052850 4.794488e-01
#> 6 0.156847462 9.386655e-01 0.06133413 3.301953e-07
#> mode.Listening mode.Speaking mode.Writing mode.Reading
#> 1 Novice Intermediate Intermediate Novice
#> 2 Novice Intermediate Novice Advanced
#> 3 Intermediate Intermediate Intermediate Novice
#> 4 Novice Advanced Novice Novice
#> 5 Intermediate Intermediate Intermediate Intermediate
#> 6 Intermediate Advanced Intermediate Novice
Rather than calculating the expected values by hand, the Netica
NeticaTester
object can be used instead.
This function uses the case file we generated above, but marks certain nodes as targets, whose values will be estimated and the estimated values compared with the actual (simulated) values.
[Note, the network is inferred from the nodes in the list of target
variables, proficiencies
. In this case, these are the nodes
in the motif24
network. However, that doesn’t matter as the
extra observables won’t be referenced.]
filestream16 <- CaseFileStream(casefile16,session=sess)
motif16.test <-
testNetwork(proficiencies,
OpenCaseStream(filestream16))
#> Warning in oldstream$open(): Stream is already open: nothing done.
summary(motif16.test)
#> ErrorRate LogLoss QuadraticLoss kappa QWK lambda
#> Listening 0.160 0.3826550 0.2335446 0.7402665 0.8343082 0.6721311
#> Speaking 0.313 0.6567707 0.4155516 0.4775627 0.6452752 0.3297645
#> Writing 0.397 0.8083365 0.4999925 0.3287138 0.4793781 0.1914460
#> Reading 0.169 0.3554127 0.2212610 0.7331514 0.8367263 0.6750000
#> LinearLambda
#> Listening 0.6721311
#> Speaking 0.3233405
#> Writing 0.1527495
#> Reading 0.6750000