By Dr Matthew Carroll, Research Division, Cambridge University Press & Assessment
When we produce an assessment, we need to know that it is measuring what it is intended to measure. We need to make sure that the results tell us something meaningful, so that any decisions based on those results are well-informed. This, broadly speaking, is the concept of assessment ‘validity’.
There are various ways of thinking about validity – colleagues have written about some of them on our blog 'In pursuit of reliability and validity'.
One way is to consider ‘predictive validity’. In practice, this means whether a test’s results can be used to predict outcomes in a related – but independent – measure.
For example, we might expect that students who scored higher on a mathematics baseline test at the start of the year should achieve higher grades at the end of the year. If this happened, we could conclude that the baseline test had good predictive validity. If, on the other hand, there was no relationship, we might question whether the baseline test really was measuring what we intended it to.
In CEM’s baseline tests, MidYIS and Yellis, predictive validity is a key consideration, because one of their uses is to predict performance in high-stakes examinations. Students with higher scores on MidYIS or Yellis should go on to attain higher grades at GCSE or IGCSE. And, as we seek to ensure the tests work well for their intended purposes, we need to consider their predictive validity regularly.
With this in mind, we recently carried out some research to explore the relationships between MidYIS/Yellis and IGCSE grades. Specifically, as the use of MidYIS and Yellis grows internationally, we wanted to know whether their predictive validity holds around the world. Their well-established use in the UK means that we have a good sense of their predictive validity here, but what about elsewhere? We would certainly expect it to hold – the tests should have good predictive validity regardless of where they are taken.
"CEM tests have good predictive validity for IGCSEs, whether you’re in the UK or anywhere else in the world."
To explore this, we looked at all students who had taken IGCSEs – anywhere in the world – in 2018 or 2019, and identified those who had sat MidYIS or Yellis in the preceding years. We then matched their IGCSE grades to their MidYIS or Yellis scores, and grouped them into students based in the UK and those based elsewhere.
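As an illustration of the kind of matching this involves – the file names, column names and grouping rule below are assumptions made for the sake of example, not CEM's actual data pipeline – a minimal sketch in Python might look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical illustration only: the files and column names are assumptions,
# not CEM's actual data model.
igcse = pd.read_csv("igcse_grades.csv")      # student_id, subject, grade, country, year
baseline = pd.read_csv("cem_baseline.csv")   # student_id, school, test ("MidYIS"/"Yellis"), score

# Keep 2018/2019 IGCSE entries and join each one to the student's earlier baseline score.
igcse_recent = igcse[igcse["year"].isin([2018, 2019])]
matched = igcse_recent.merge(baseline, on="student_id", how="inner")

# Group students into UK-based and rest-of-world for later comparison.
matched["region"] = np.where(matched["country"] == "UK", "UK", "Rest of world")
```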
Once this matching process was complete, we looked at the relationships between CEM test scores and IGCSE grades for fourteen subjects, covering a range of disciplines from the sciences to languages to the arts and humanities. For each subject, we were able to include data on thousands of students, from hundreds of schools, and from around fifty countries.
With this large dataset, we analysed relationships between test scores and IGCSE grades in two ways. First, we looked at correlations, and then we used a more complex statistical method known as mixed effects regression. We’ll focus here on the correlation results.
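Before turning to the correlations, for readers curious what a mixed effects regression of this kind might look like in practice, here is a minimal sketch using statsmodels. The variable names and the model structure (a random intercept for each school) are assumptions for illustration, not the exact model fitted in the research:

```python
import statsmodels.formula.api as smf

# 'matched' is the hypothetical dataset sketched above, restricted to one subject,
# with IGCSE grades coded numerically.
# Fixed effect: baseline test score predicting IGCSE grade.
# Random effect: an intercept for each school, acknowledging that students
# are clustered within schools.
model = smf.mixedlm("grade ~ score", data=matched, groups="school")
result = model.fit()
print(result.summary())
```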
Correlations measure the strength of a linear relationship between two quantities: if one tends to increase as the other does, the correlation coefficient is positive, and the closer it is to 1, the stronger the relationship.
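As a simple worked example – the numbers here are made up purely to show the calculation, not drawn from the study – a correlation coefficient can be computed like this:

```python
from scipy.stats import pearsonr

# Made-up illustrative data: baseline test scores and numeric IGCSE grades
# for six hypothetical students.
baseline_scores = [95, 102, 108, 115, 121, 130]
igcse_grades = [4, 5, 5, 6, 7, 8]

r, p_value = pearsonr(baseline_scores, igcse_grades)
print(f"correlation coefficient r = {r:.2f}")  # positive: grades rise with scores
```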
We found that in both the UK and the rest of the world, for all subjects, higher CEM test scores were associated with higher IGCSE grades. The median correlation coefficient was 0.55 in the UK, and 0.53 in the rest of the world. The mean correlation coefficient was 0.54 in the UK, and 0.50 in the rest of the world.
For context, these correlations are of similar magnitude to the relationships between national standardised test scores and GCSE grades, relationships which are used to guide GCSE awarding in England (see 'Analysis of use of Key Stage 2 data in GCSE predictions').
For some subjects, like Physics and Computer Science, there was virtually no difference in correlation strength between the UK and the rest of the world. In other subjects, like English Literature and Biology, the correlation was slightly higher in the UK; in a couple of subjects, History and Geography, it was slightly higher in the rest of the world. But overall, the differences were small.
There are two main points to note. First, as expected, higher MidYIS/Yellis scores were associated with higher IGCSE grades. Second, there was very little difference in these relationships between the UK and the rest of the world.
And this is exactly what we would hope to see – CEM tests have good predictive validity for IGCSEs, whether you’re in the UK or anywhere else in the world.
This is just a brief summary of part of the research – we will soon be releasing a report that will provide more details on the methods we used and the results from both the correlation analysis and the mixed effects regressions.