Never mind the table, look at the sample size!
This morning I was just thinking that it’s been a while since I’ve filed anything in the category marked bad statistics when I glanced at today’s copy of the Times Higher and found something that’s given me an excuse to rectify my lapse. Last week saw the publication of said organ’s new Student Experience Survey which ranks British Universities in order of the responses given by students to questions about various aspects of the teaching, social life and so on. I had a go at this table a few years ago, but they still keep trotting it out. Here are the main results, sorted in decreasing order:
| Rank | University | Score | Responses |
|---|---|---|---|
| 1 | University of East Anglia | 84.8 | 119 |
| 2 | University of Oxford | 84.2 | 259 |
| 3 | University of Sheffield | 83.9 | 192 |
| 3 | University of Cambridge | 83.9 | 245 |
| 5 | Loughborough University | 82.8 | 102 |
| 6 | University of Bath | 82.7 | 159 |
| 7 | University of Leeds | 82.5 | 219 |
| 8 | University of Dundee | 82.4 | 103 |
| 9 | York St John University | 81.2 | 88 |
| 10 | Lancaster University | 81.1 | 100 |
| 11 | University of Southampton | 80.9 | 191 |
| 11 | University of Birmingham | 80.9 | 198 |
| 11 | University of Nottingham | 80.9 | 270 |
| 14 | Cardiff University | 80.8 | 113 |
| 14 | Newcastle University | 80.8 | 125 |
| 16 | Durham University | 80.3 | 188 |
| 17 | University of Warwick | 80.2 | 205 |
| 18 | University of St Andrews | 79.8 | 109 |
| 18 | University of Glasgow | 79.8 | 131 |
| 20 | Queen’s University Belfast | 79.2 | 101 |
| 21 | University of Hull | 79.1 | 106 |
| 22 | University of Winchester | 79 | 106 |
| 23 | Northumbria University | 78.9 | 100 |
| 23 | University of Lincoln | 78.9 | 103 |
| 23 | University of Strathclyde | 78.9 | 107 |
| 26 | University of Surrey | 78.8 | 102 |
| 26 | University of Leicester | 78.8 | 105 |
| 26 | University of Exeter | 78.8 | 130 |
| 29 | University of Chester | 78.7 | 102 |
| 30 | Heriot-Watt University | 78.6 | 101 |
| 31 | Keele University | 78.5 | 102 |
| 32 | University of Kent | 78.4 | 110 |
| 33 | University of Reading | 78.1 | 101 |
| 33 | Bangor University | 78.1 | 101 |
| 35 | University of Huddersfield | 78 | 104 |
| 36 | University of Central Lancashire | 77.9 | 121 |
| 37 | Queen Mary, University of London | 77.8 | 103 |
| 37 | University of York | 77.8 | 106 |
| 39 | University of Edinburgh | 77.7 | 170 |
| 40 | University of Manchester | 77.4 | 252 |
| 41 | Imperial College London | 77.3 | 148 |
| 42 | Swansea University | 77.1 | 103 |
| 43 | Sheffield Hallam University | 77 | 102 |
| 43 | Teesside University | 77 | 103 |
| 45 | Brunel University | 76.6 | 110 |
| 46 | University of Portsmouth | 76.4 | 107 |
| 47 | University of Gloucestershire | 76.3 | 53 |
| 47 | Robert Gordon University | 76.3 | 103 |
| 47 | Aberystwyth University | 76.3 | 104 |
| 50 | University of Essex | 76 | 103 |
| 50 | University of Glamorgan | 76 | 108 |
| 50 | Plymouth University | 76 | 112 |
| 53 | University of Sunderland | 75.9 | 100 |
| 54 | Canterbury Christ Church University | 75.8 | 102 |
| 55 | De Montfort University | 75.7 | 103 |
| 56 | University of Bradford | 75.5 | 52 |
| 56 | University of Sussex | 75.5 | 102 |
| 58 | Nottingham Trent University | 75.4 | 103 |
| 59 | University of Roehampton | 75.1 | 102 |
| 60 | University of Ulster | 75 | 101 |
| 60 | Staffordshire University | 75 | 102 |
| 62 | Royal Veterinary College | 74.8 | 50 |
| 62 | Liverpool John Moores University | 74.8 | 102 |
| 64 | University of Bristol | 74.7 | 137 |
| 65 | University of Worcester | 74.4 | 101 |
| 66 | University of Derby | 74.2 | 101 |
| 67 | University College London | 74.1 | 102 |
| 68 | University of Aberdeen | 73.9 | 105 |
| 69 | University of the West of England | 73.8 | 101 |
| 69 | Coventry University | 73.8 | 102 |
| 71 | University of Hertfordshire | 73.7 | 105 |
| 72 | London School of Economics | 73.5 | 51 |
| 73 | Royal Holloway, University of London | 73.4 | 104 |
| 74 | University of Stirling | 73.3 | 54 |
| 75 | King’s College London | 73.2 | 105 |
| 76 | Bournemouth University | 73.1 | 103 |
| 77 | Southampton Solent University | 72.7 | 102 |
| 78 | Goldsmiths, University of London | 72.5 | 52 |
| 78 | Leeds Metropolitan University | 72.5 | 106 |
| 80 | Manchester Metropolitan University | 72.2 | 104 |
| 81 | University of Liverpool | 72 | 104 |
| 82 | Birmingham City University | 71.8 | 101 |
| 83 | Anglia Ruskin University | 71.7 | 102 |
| 84 | Glasgow Caledonian University | 71.1 | 100 |
| 84 | Kingston University | 71.1 | 102 |
| 86 | Aston University | 71 | 52 |
| 86 | University of Brighton | 71 | 106 |
| 88 | University of Wolverhampton | 70.9 | 103 |
| 89 | Oxford Brookes University | 70.5 | 106 |
| 90 | University of Salford | 70.2 | 102 |
| 91 | University of Cumbria | 69.2 | 51 |
| 92 | Napier University | 68.8 | 101 |
| 93 | University of Greenwich | 68.5 | 102 |
| 94 | University of Westminster | 68.1 | 101 |
| 95 | University of Bedfordshire | 67.9 | 100 |
| 96 | University of the Arts London | 66 | 54 |
| 97 | City University London | 65.4 | 102 |
| 97 | London Metropolitan University | 65.4 | 103 |
| 97 | The University of the West of Scotland | 65.4 | 103 |
| 100 | Middlesex University | 65.1 | 104 |
| 101 | University of East London | 61.7 | 51 |
| 102 | London South Bank University | 61.2 | 50 |
| | Average scores | 75.5 | 11459 |

YouthSight is the source of the data that have been used to compile the table of results for the Times Higher Education Student Experience Survey, and it retains the ownership of those data. Each higher education institution's score has been indexed to give a percentage of the maximum score attainable. For each of the 21 attributes, students were given a seven-point scale and asked how strongly they agreed or disagreed with a number of statements based on their university experience.
My current employer, the University of Sussex, comes out right on the average (75.5) and is consequently right in the middle of this league table. However, let’s look at this in a bit more detail. The number of students whose responses produced the score of 75.5 was just 102. That’s by no means the smallest sample in the survey, either. The University of Sussex has over 13,000 students. The score in this table is therefore obtained from less than 1% of the relevant student population. How representative can the results be, given that the sample is so incredibly small?
What is conspicuous by its absence from this table is any measure of the “margin-of-error” of the estimated score. What I mean by this is how much the sample score would change for Sussex if a different set of 102 students were involved. Unless every Sussex student scores exactly 75.5 then the score will vary from sample to sample. The smaller the sample, the larger the resulting uncertainty.
Given a survey of this type it should be quite straightforward to calculate the spread of scores from student to student within a sample from a given University in terms of the standard deviation, σ, as well as the mean score. Unfortunately, this survey does not include this information. However, let’s suppose for the sake of argument that the standard deviation for Sussex is quite small, say 10% of the mean value, i.e. 7.55. I imagine that it’s much larger than that, in fact, but this is just meant to be by way of an illustration.
If you have a sample size of N then the standard error of the mean is going to be roughly (σ⁄√N) which, for Sussex, is about 0.75. Assuming everything has a normal distribution, this would mean that the “true” score for the full population of Sussex students has a 95% chance of being within two standard errors of the mean, i.e. between 74 and 77. This means Sussex could really be as high as 43rd place or as low as 67th, and that’s making very conservative assumptions about how much one student differs from another within each institution.
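Just to show the arithmetic, here is a minimal sketch in Python; the standard deviation of 7.55 and the sample size of 102 are the illustrative assumptions above, not figures published with the survey.

```python
# Minimal sketch of the standard-error arithmetic; sigma is an assumed
# illustrative value (10% of the mean), not a published figure.
import math

mean_score = 75.5   # Sussex's published score
sigma = 7.55        # assumed student-to-student spread
n = 102             # number of respondents

standard_error = sigma / math.sqrt(n)        # roughly 0.75
lower = mean_score - 2 * standard_error
upper = mean_score + 2 * standard_error

print(f"standard error ≈ {standard_error:.2f}")
print(f"approximate 95% interval: {lower:.1f} to {upper:.1f}")   # ~74 to 77
```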
That example is just for illustration, and the figures may well be wrong, but my main gripe is that I don’t understand how these guys can get away with publishing results like this without listing the margin of error at all. Perhaps it’s because that would make it obvious how unreliable the rankings are? Whatever the reason, we’d never get away with publishing results without errors in a serious scientific journal.
This sampling uncertainty almost certainly accounts for the big changes from year to year in these tables. For instance, the University of Lincoln is 23rd in this year’s table, but last year was way down in 66th place. Has something dramatic happened there to account for this meteoric rise? I doubt it. It’s more likely to be just a sampling fluctuation.
In fact I seriously doubt whether any of the scores in this table is significantly different from the mean score; the range from top to bottom is only 61 to 85 showing a considerable uniformity across all 102 institutions listed. What a statistically literate person should take from this table is that (a) it’s a complete waste of time and (b) wherever you go to University you’ll probably have a good experience!
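To make the point concrete, here is a toy simulation: it assumes every institution has exactly the same “true” score (75.5) and the same per-student spread (the illustrative 7.55 used above), so any differences in the resulting table are pure sampling noise. The numbers are assumptions for illustration only.

```python
# Toy simulation: if every university had the same true score, how much would
# the published sample means differ purely through sampling noise?
# sigma and the sample size are illustrative assumptions, not survey figures.
import random
import statistics

true_score = 75.5
sigma = 7.55
n_respondents = 100
n_universities = 102

sample_means = [
    statistics.mean(random.gauss(true_score, sigma) for _ in range(n_respondents))
    for _ in range(n_universities)
]

print(f"lowest sample mean:  {min(sample_means):.1f}")
print(f"highest sample mean: {max(sample_means):.1f}")
# The spread is typically a couple of points either side of 75.5, which is
# enough to move an institution a long way up or down the middle of the table.
```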
April 29, 2013 at 12:50 pm
And you didn’t even get to the further issue of how these small numbers of responding students were selected from their respective cohorts . . .
April 29, 2013 at 2:13 pm
Exactly. If, for instance, a large number of people were asked and a small number responded (which is the way these things are often done in my experience), then the sample is almost certainly non-representative. People are much more likely to respond to a survey if they have strong opinions.
My university places considerable weight in tenure and promotion decisions on student surveys of teaching quality. Response rates for these surveys are sometimes quite low. The people in charge assure me that they know the samples are representative, but I’ve never managed to get a clear statement of what data this is based on.
April 29, 2013 at 5:11 pm
It’s easy. If the response is favourable, the sample must be representative. If not, it must be biased.
April 29, 2013 at 9:07 pm
That must be a good thing, as haven’t a lot of universities historically closed ranks when students have complained or given feedback? The students who do give feedback might well be either very positive or negative, but at least they will hopefully be acknowledged this time. It might not be a big sample size but it is a start.
Having studied at a couple of places on the list I always felt a few lecturers regarded students as something of an inconvenience.
I’m not into having to pay for education, but one good thing to have happened is that universities are now much more open to scrutiny.
How you interact with people affects how much you get paid/promoted in other workplaces, so why not universities?
Not surprised UEA is at the top, lovely city and campus.
April 29, 2013 at 1:38 pm
Peter, you say you don’t understand how these guys can get away with publishing results like these without the associated errors.
What, to me, is even more unfathomable is why university managements can get away with beating departments over the heads with them. Every year we have to write a report explaining why we’ve dropped x places in the NSS results (and what we plan to do about it) or write a piece for dissemination to the rest of the university on what aspects of our ‘excellent practice’ were responsible for us improving our position by y places.
I have been told that I am not allowed to use the argument that tables of data such as NSS and THE are statistically meaningless.
April 29, 2013 at 8:49 pm
Peter, you really should act in a more responsible manner and stop denigrating the efforts of those involved in producing league tables. If you carry on in this vein, you’ll end up concluding that the results from REF should carry error estimates. Clearly no right-thinking person could support such a suggestion.
April 29, 2013 at 9:27 pm
I’m fighting a losing battle over using error bars in our REF modelling, instead of taking our “critical friends” scores as Gospel, when trying to work out how selective to be.
April 29, 2013 at 8:50 pm
peter
if we’re willing to assume the scores do not encode any information (i.e. they won’t change year-to-year because of any true improvements or declines in institutional quality), and THES claims the “methodology” is unchanged from 2012, then surely the variation in scores for institutes between 2012 and 2013 is a good proxy for the error in any particular score?
i managed to track down the 2012 spreadsheet and a quick match of the institutes shows that the standard deviation of the differences in their scores between the two years is just 2.2 (with maximum changes of +5.1 and -6.1, so <3-sigma).
a scatter that small doesn't seem very likely given the sample sizes, unless a lot of the score criteria are defaulted to specified values for each institute.
ian
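For what it’s worth, a rough sketch of the comparison described in the comment above might look like this; the CSV file names and column headings are hypothetical placeholders, since the actual spreadsheets may be laid out differently.

```python
# Rough sketch of the 2012-vs-2013 comparison described above. The file names
# and column headings ("University", "Score") are hypothetical placeholders.
import csv
import statistics

def load_scores(path):
    """Read a {university: score} mapping from a CSV export of the table."""
    with open(path, newline="") as f:
        return {row["University"]: float(row["Score"]) for row in csv.DictReader(f)}

scores_2012 = load_scores("student_experience_2012.csv")
scores_2013 = load_scores("student_experience_2013.csv")

diffs = [scores_2013[u] - scores_2012[u] for u in scores_2013 if u in scores_2012]

print(f"matched institutions: {len(diffs)}")
print(f"std dev of year-to-year change: {statistics.stdev(diffs):.1f}")
print(f"largest rise: {max(diffs):+.1f}, largest fall: {min(diffs):+.1f}")
```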
April 29, 2013 at 9:10 pm
…there is something very odd going on – if you look at the number of respondents – there are large spikes in the frequencies of institutions with ~100 or ~50 responses.
are they culling responses to remove outliers in an attempt to suppress the variance?
in fact if you just look at the standard deviation of the difference in scores between 2012/2013 for those institutions with > 120 responses the variation is even lower – just 1.2.
i’ve mailed you the 2012 spreadsheet in case you want to play.
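A quick way to see those spikes is to bin the respondent counts; this snippet assumes the same hypothetical 2013 CSV layout as the sketch above, with a "Resp." column.

```python
# Tally institutions by respondent-count band to expose the spikes at ~50 and
# ~100 responses. Uses the hypothetical CSV file name from the earlier sketch.
import csv
from collections import Counter

with open("student_experience_2013.csv", newline="") as f:
    n_resp = [int(row["Resp."]) for row in csv.DictReader(f)]

bands = Counter(
    "~50" if n < 80 else "~100" if n < 120 else "120+"
    for n in n_resp
)
print(bands)   # expect large counts in the ~50 and especially the ~100 bands
```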
April 29, 2013 at 9:20 pm
here’s some description of the “methodology” used for this polling:
http://www.youthsight.com/media-centre/announcements/congratulations-to-uea-top-dogs-in-this-years-student-experience-survey-fieldwork-once-again-by-youthsight/
this may explain the spikes in the respondent numbers if they repeatedly polled or chased respondents until they had either >50 (to be included at all) or >100 (which they appear to view as a threshold).
the use of a 7-point scale does make me wonder if the minimal variance is because the respondents choose the middle point for most responses.
April 29, 2013 at 9:21 pm
Not quite right, because the sample sizes are different in the two years. Also I think the individual scores are integers so there might be a peculiar effect of that.
April 30, 2013 at 5:29 pm
I noticed the spikes in response numbers too. Assuming no correlation between sample size and quality of institution I guess it should be possible to model the 2013 results on their own as containing an error (which should reduce with sample size as 1⁄√n) and an underlying population variance which wouldn’t. I’ll see if I get a chance to look at this. As a tangential point, this is how we estimate confusion noise in Herschel maps, which would mean if this analysis goes anywhere I’d be able to claim it as REF impact!
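One possible sketch of that decomposition: treat each published score as a true score plus sampling noise with variance σ²/n, and fit the squared deviations from the overall mean against 1/n to estimate both components. It assumes the same hypothetical CSV layout as above and is an illustration of the idea, not the Herschel confusion-noise pipeline.

```python
# Sketch of the suggested decomposition: Var(observed score) ≈
# intrinsic variance between institutions + sigma^2 / n sampling variance,
# so a least-squares fit of squared deviations against 1/n estimates both.
# File name and column headings are hypothetical placeholders.
import csv
import numpy as np

with open("student_experience_2013.csv", newline="") as f:
    rows = list(csv.DictReader(f))

score = np.array([float(r["Score"]) for r in rows])
n = np.array([int(r["Resp."]) for r in rows], dtype=float)

dev2 = (score - score.mean()) ** 2           # squared deviation per institution
A = np.column_stack([np.ones_like(n), 1.0 / n])
(intrinsic_var, sigma_sq), *_ = np.linalg.lstsq(A, dev2, rcond=None)

print(f"intrinsic spread between institutions ≈ {np.sqrt(max(intrinsic_var, 0)):.1f}")
print(f"implied per-student sigma ≈ {np.sqrt(max(sigma_sq, 0)):.1f}")
```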