Never mind the table, look at the sample size!
This morning I was just thinking that it’s been a while since I’ve filed anything in the category marked bad statistics when I glanced at today’s copy of the Times Higher and found something that’s given me an excuse to rectify my lapse. Last week saw the publication of said organ’s new Student Experience Survey which ranks British Universities in order of the responses given by students to questions about various aspects of the teaching, social life and so on. I had a go at this table a few years ago, but they still keep trotting it out. Here are the main results, sorted in decreasing order:
| Rank | University | Score | Responses |
|---|---|---|---|
| 1 | University of East Anglia | 84.8 | 119 |
| 2 | University of Oxford | 84.2 | 259 |
| 3 | University of Sheffield | 83.9 | 192 |
| 3 | University of Cambridge | 83.9 | 245 |
| 5 | Loughborough University | 82.8 | 102 |
| 6 | University of Bath | 82.7 | 159 |
| 7 | University of Leeds | 82.5 | 219 |
| 8 | University of Dundee | 82.4 | 103 |
| 9 | York St John University | 81.2 | 88 |
| 10 | Lancaster University | 81.1 | 100 |
| 11 | University of Southampton | 80.9 | 191 |
| 11 | University of Birmingham | 80.9 | 198 |
| 11 | University of Nottingham | 80.9 | 270 |
| 14 | Cardiff University | 80.8 | 113 |
| 14 | Newcastle University | 80.8 | 125 |
| 16 | Durham University | 80.3 | 188 |
| 17 | University of Warwick | 80.2 | 205 |
| 18 | University of St Andrews | 79.8 | 109 |
| 18 | University of Glasgow | 79.8 | 131 |
| 20 | Queen’s University Belfast | 79.2 | 101 |
| 21 | University of Hull | 79.1 | 106 |
| 22 | University of Winchester | 79 | 106 |
| 23 | Northumbria University | 78.9 | 100 |
| 23 | University of Lincoln | 78.9 | 103 |
| 23 | University of Strathclyde | 78.9 | 107 |
| 26 | University of Surrey | 78.8 | 102 |
| 26 | University of Leicester | 78.8 | 105 |
| 26 | University of Exeter | 78.8 | 130 |
| 29 | University of Chester | 78.7 | 102 |
| 30 | Heriot-Watt University | 78.6 | 101 |
| 31 | Keele University | 78.5 | 102 |
| 32 | University of Kent | 78.4 | 110 |
| 33 | University of Reading | 78.1 | 101 |
| 33 | Bangor University | 78.1 | 101 |
| 35 | University of Huddersfield | 78 | 104 |
| 36 | University of Central Lancashire | 77.9 | 121 |
| 37 | Queen Mary, University of London | 77.8 | 103 |
| 37 | University of York | 77.8 | 106 |
| 39 | University of Edinburgh | 77.7 | 170 |
| 40 | University of Manchester | 77.4 | 252 |
| 41 | Imperial College London | 77.3 | 148 |
| 42 | Swansea University | 77.1 | 103 |
| 43 | Sheffield Hallam University | 77 | 102 |
| 43 | Teesside University | 77 | 103 |
| 45 | Brunel University | 76.6 | 110 |
| 46 | University of Portsmouth | 76.4 | 107 |
| 47 | University of Gloucestershire | 76.3 | 53 |
| 47 | Robert Gordon University | 76.3 | 103 |
| 47 | Aberystwyth University | 76.3 | 104 |
| 50 | University of Essex | 76 | 103 |
| 50 | University of Glamorgan | 76 | 108 |
| 50 | Plymouth University | 76 | 112 |
| 53 | University of Sunderland | 75.9 | 100 |
| 54 | Canterbury Christ Church University | 75.8 | 102 |
| 55 | De Montfort University | 75.7 | 103 |
| 56 | University of Bradford | 75.5 | 52 |
| 56 | University of Sussex | 75.5 | 102 |
| 58 | Nottingham Trent University | 75.4 | 103 |
| 59 | University of Roehampton | 75.1 | 102 |
| 60 | University of Ulster | 75 | 101 |
| 60 | Staffordshire University | 75 | 102 |
| 62 | Royal Veterinary College | 74.8 | 50 |
| 62 | Liverpool John Moores University | 74.8 | 102 |
| 64 | University of Bristol | 74.7 | 137 |
| 65 | University of Worcester | 74.4 | 101 |
| 66 | University of Derby | 74.2 | 101 |
| 67 | University College London | 74.1 | 102 |
| 68 | University of Aberdeen | 73.9 | 105 |
| 69 | University of the West of England | 73.8 | 101 |
| 69 | Coventry University | 73.8 | 102 |
| 71 | University of Hertfordshire | 73.7 | 105 |
| 72 | London School of Economics | 73.5 | 51 |
| 73 | Royal Holloway, University of London | 73.4 | 104 |
| 74 | University of Stirling | 73.3 | 54 |
| 75 | King’s College London | 73.2 | 105 |
| 76 | Bournemouth University | 73.1 | 103 |
| 77 | Southampton Solent University | 72.7 | 102 |
| 78 | Goldsmiths, University of London | 72.5 | 52 |
| 78 | Leeds Metropolitan University | 72.5 | 106 |
| 80 | Manchester Metropolitan University | 72.2 | 104 |
| 81 | University of Liverpool | 72 | 104 |
| 82 | Birmingham City University | 71.8 | 101 |
| 83 | Anglia Ruskin University | 71.7 | 102 |
| 84 | Glasgow Caledonian University | 71.1 | 100 |
| 84 | Kingston University | 71.1 | 102 |
| 86 | Aston University | 71 | 52 |
| 86 | University of Brighton | 71 | 106 |
| 88 | University of Wolverhampton | 70.9 | 103 |
| 89 | Oxford Brookes University | 70.5 | 106 |
| 90 | University of Salford | 70.2 | 102 |
| 91 | University of Cumbria | 69.2 | 51 |
| 92 | Napier University | 68.8 | 101 |
| 93 | University of Greenwich | 68.5 | 102 |
| 94 | University of Westminster | 68.1 | 101 |
| 95 | University of Bedfordshire | 67.9 | 100 |
| 96 | University of the Arts London | 66 | 54 |
| 97 | City University London | 65.4 | 102 |
| 97 | London Metropolitan University | 65.4 | 103 |
| 97 | The University of the West of Scotland | 65.4 | 103 |
| 100 | Middlesex University | 65.1 | 104 |
| 101 | University of East London | 61.7 | 51 |
| 102 | London South Bank University | 61.2 | 50 |
| | Average scores | 75.5 | 11459 |

YouthSight is the source of the data that have been used to compile the table of results for the Times Higher Education Student Experience Survey, and it retains the ownership of those data. Each higher education institution's score has been indexed to give a percentage of the maximum score attainable. For each of the 21 attributes, students were given a seven-point scale and asked how strongly they agreed or disagreed with a number of statements based on their university experience.
My current employer, the University of Sussex, comes out right on the average (75.5) and is consequently right in the middle of this league table. However, let’s look at this in a bit more detail. The number of students whose responses produced the score of 75.5 was just 102. That’s by no means the smallest sample in the survey, either. The University of Sussex has over 13,000 students. The score in this table is therefore obtained from less than 1% of the relevant student population. How representative can the results be, given that the sample is so incredibly small?
What is conspicuous by its absence from this table is any measure of the “margin-of-error” of the estimated score. What I mean by this is how much the sample score would change for Sussex if a different set of 102 students were involved. Unless every Sussex student scores exactly 75.5 then the score will vary from sample to sample. The smaller the sample, the larger the resulting uncertainty.
Given a survey of this type it should be quite straightforward to calculate the spread of scores from student to student within a sample from a given University in terms of the standard deviation, σ, as well as the mean score. Unfortunately, this survey does not include this information. However, let’s suppose for the sake of argument that the standard deviation for Sussex is quite small, say 10% of the mean value, i.e. 7.55. I imagine that it’s much larger than that, in fact, but this is just meant to be by way of an illustration.
If you have a sample size of N then the standard error of the mean is going to be roughly (σ⁄√N) which, for Sussex, is about 0.75. Assuming everything has a normal distribution, this would mean that the “true” score for the full population of Sussex students has a 95% chance of being within two standard errors of the mean, i.e. between 74 and 77. This means Sussex could really be as high as 43rd place or as low as 67th, and that’s making very conservative assumptions about how much one student differs from another within each institution.
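Just to show the arithmetic, here is a minimal sketch in Python; the standard deviation of 7.55 and the sample size of 102 are the illustrative assumptions above, not figures published with the survey.

```python
# Minimal sketch of the standard-error arithmetic; sigma is an assumed
# illustrative value (10% of the mean), not a published figure.
import math

mean_score = 75.5   # Sussex's published score
sigma = 7.55        # assumed student-to-student spread
n = 102             # number of respondents

standard_error = sigma / math.sqrt(n)        # roughly 0.75
lower = mean_score - 2 * standard_error
upper = mean_score + 2 * standard_error

print(f"standard error ≈ {standard_error:.2f}")
print(f"approximate 95% interval: {lower:.1f} to {upper:.1f}")   # ~74 to 77
```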
That example is just for illustration, and the figures may well be wrong, but my main gripe is that I don’t understand how these guys can get away with publishing results like this without listing the margin of error at all. Perhaps it’s because that would make it obvious how unreliable the rankings are? Whatever the reason, we’d never get away with publishing results without errors in a serious scientific journal.
This sampling uncertainty almost certainly accounts for the big changes from year to year in these tables. For instance, the University of Lincoln is 23rd in this year’s table, but last year was way down in 66th place. Has something dramatic happened there to account for this meteoric rise? I doubt it. It’s more likely to be just a sampling fluctuation.
In fact I seriously doubt whether any of the scores in this table is significantly different from the mean score; the range from top to bottom is only 61 to 85 showing a considerable uniformity across all 102 institutions listed. What a statistically literate person should take from this table is that (a) it’s a complete waste of time and (b) wherever you go to University you’ll probably have a good experience!
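To make the point concrete, here is a toy simulation: it assumes every institution has exactly the same “true” score (75.5) and the same per-student spread (the illustrative 7.55 used above), so any differences in the resulting table are pure sampling noise. The numbers are assumptions for illustration only.

```python
# Toy simulation: if every university had the same true score, how much would
# the published sample means differ purely through sampling noise?
# sigma and the sample size are illustrative assumptions, not survey figures.
import random
import statistics

true_score = 75.5
sigma = 7.55
n_respondents = 100
n_universities = 102

sample_means = [
    statistics.mean(random.gauss(true_score, sigma) for _ in range(n_respondents))
    for _ in range(n_universities)
]

print(f"lowest sample mean:  {min(sample_means):.1f}")
print(f"highest sample mean: {max(sample_means):.1f}")
# The spread is typically a couple of points either side of 75.5, which is
# enough to move an institution a long way up or down the middle of the table.
```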
April 29, 2013 at 12:50 pm
And you didn’t even get to the further issue of how these small numbers of responding students were selected from their respective cohorts . . .
April 29, 2013 at 2:13 pm
Exactly. If, for instance, a large number of people were asked and a small number responded (which is the way these things are often done in my experience), then the sample is almost certainly non-representative. People are much more likely to respond to a survey if they have strong opinions.
My university places considerable weight in tenure and promotion decisions on student surveys of teaching quality. Response rates for these surveys are sometimes quite low. The people in charge assure me that they know the samples are representative, but I’ve never managed to get a clear statement of what data this is based on.
April 29, 2013 at 5:11 pm
It’s easy. If the response is favourable, the sample must be representative. If not, it must be biased.
April 29, 2013 at 9:07 pm
That must be a good thing, as haven’t a lot of universities historically closed ranks when students have complained or given feedback? The students who do give feedback might well be either very positive or negative, but at least they will hopefully be acknowledged this time. It might not be a big sample size but it is a start.
Having studied at a couple of places on the list I always felt a few lecturers regarded students as something of an inconvenience.
I’m not into having to pay for education, but one good thing to have happened is that universities are now much more open to scrutiny.
How you interact with people affects how much you get paid/promoted in other workplaces, so why not universities?
Not surprised UEA is at the top, lovely city and campus.
April 29, 2013 at 1:38 pm
Peter, you say you don’t understand how these guys can get away with publishing results like these without the associated errors.
What, to me, is even more unfathomable is why university managements can get away with beating departments over the heads with them. Every year we have to write a report explaining why we’ve dropped x places in the NSS results (and what we plan to do about it) or write a piece for dissemination to the rest of the university on what aspects of our ‘excellent practice’ were responsible for us improving our position by y places.
I have been told that I am not allowed to use the argument that tables of data such as NSS and THE are statistically meaningless.
April 29, 2013 at 8:49 pm
Peter, you really should act in a more responsible manner and stop denigrating the efforts of those involved in producing league tables. If you carry on in this vein, you’ll end up concluding that the results from REF should carry error estimates. Clearly no right-thinking person could support such a suggestion.
April 29, 2013 at 9:27 pm
I’m fighting a losing battle over using error bars in our REF modelling, instead of taking our “critical friends” scores as Gospel, when trying to work out how selective to be.
April 29, 2013 at 8:50 pm
peter
if we’re willing to assume the scores do not encode any information (i.e. they won’t change year-to-year because of any true improvements or declines in institutional quality), and THES claims the “methodology” is unchanged from 2012, then surely the variation in scores for institutes between 2012 and 2013 is a good proxy for the error in any particular score?
i managed to track down the 2012 spreadsheet and a quick match of the institutes shows that the standard deviation of the differences in their scores between the two years is just 2.2 (with maximum changes of +5.1 and -6.1, so <3-sigma).
a scatter that small doesn't seem very likely given the sample sizes, unless a lot of the score criteria are defaulted to specified values for each institute.
ian
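For what it’s worth, a rough sketch of the comparison described in the comment above might look like this; the CSV file names and column headings are hypothetical placeholders, since the actual spreadsheets may be laid out differently.

```python
# Rough sketch of the 2012-vs-2013 comparison described above. The file names
# and column headings ("University", "Score") are hypothetical placeholders.
import csv
import statistics

def load_scores(path):
    """Read a {university: score} mapping from a CSV export of the table."""
    with open(path, newline="") as f:
        return {row["University"]: float(row["Score"]) for row in csv.DictReader(f)}

scores_2012 = load_scores("student_experience_2012.csv")
scores_2013 = load_scores("student_experience_2013.csv")

diffs = [scores_2013[u] - scores_2012[u] for u in scores_2013 if u in scores_2012]

print(f"matched institutions: {len(diffs)}")
print(f"std dev of year-to-year change: {statistics.stdev(diffs):.1f}")
print(f"largest rise: {max(diffs):+.1f}, largest fall: {min(diffs):+.1f}")
```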
April 29, 2013 at 9:10 pm
…there is something very odd going on – if you look at the number of respondents – there are large spikes in the frequencies of institutions with ~100 or ~50 responses.
are they culling responses to remove outliers in an attempt to suppress the variance?
in fact if you just look at the standard deviation of the difference in scores between 2012/2013 for those institutions with > 120 responses the variation is even lower – just 1.2.
i’ve mailed you the 2012 spreadsheet in case you want to play.
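A quick way to see those spikes is to bin the respondent counts; this snippet assumes the same hypothetical 2013 CSV layout as the sketch above, with a "Resp." column.

```python
# Tally institutions by respondent-count band to expose the spikes at ~50 and
# ~100 responses. Uses the hypothetical CSV file name from the earlier sketch.
import csv
from collections import Counter

with open("student_experience_2013.csv", newline="") as f:
    n_resp = [int(row["Resp."]) for row in csv.DictReader(f)]

bands = Counter(
    "~50" if n < 80 else "~100" if n < 120 else "120+"
    for n in n_resp
)
print(bands)   # expect large counts in the ~50 and especially the ~100 bands
```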
April 29, 2013 at 9:20 pm
here’s some description of the “methodology” used for this polling:
http://www.youthsight.com/media-centre/announcements/congratulations-to-uea-top-dogs-in-this-years-student-experience-survey-fieldwork-once-again-by-youthsight/
this may explain the spikes in the respondent numbers if they repeatedly polled or chased respondents until they had either >50 (to be included at all) or >100 (which they appear to view as a threshold).
the use of a 7-point scale does make me wonder if the minimal variance is because the respondents choose the middle point for most responses.
April 29, 2013 at 9:21 pm
Not quite right, because the sample sizes are different in the two years. Also I think the individual scores are integers so there might be a peculiar effect of that.
April 30, 2013 at 5:29 pm
I noticed the spikes in response numbers too. Assuming no correlation between sample size and quality of institution I guess it should be possible to model the 2013 results on their own as containing an error (which should reduce with sample size as 1⁄√n) and an underlying population variance which wouldn’t. I’ll see if I get a chance to look at this. As a tangential point, this is how we estimate confusion noise in Herschel maps, which would mean if this analysis goes anywhere I’d be able to claim it as REF impact!
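One possible sketch of that decomposition: treat each published score as a true score plus sampling noise with variance σ²/n, and fit the squared deviations from the overall mean against 1/n to estimate both components. It assumes the same hypothetical CSV layout as above and is an illustration of the idea, not the Herschel confusion-noise pipeline.

```python
# Sketch of the suggested decomposition: Var(observed score) ≈
# intrinsic variance between institutions + sigma^2 / n sampling variance,
# so a least-squares fit of squared deviations against 1/n estimates both.
# File name and column headings are hypothetical placeholders.
import csv
import numpy as np

with open("student_experience_2013.csv", newline="") as f:
    rows = list(csv.DictReader(f))

score = np.array([float(r["Score"]) for r in rows])
n = np.array([int(r["Resp."]) for r in rows], dtype=float)

dev2 = (score - score.mean()) ** 2           # squared deviation per institution
A = np.column_stack([np.ones_like(n), 1.0 / n])
(intrinsic_var, sigma_sq), *_ = np.linalg.lstsq(A, dev2, rcond=None)

print(f"intrinsic spread between institutions ≈ {np.sqrt(max(intrinsic_var, 0)):.1f}")
print(f"implied per-student sigma ≈ {np.sqrt(max(sigma_sq, 0)):.1f}")
```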