Why we should be cautious about university rankings

James Raftery says Belinda Bozzoli takes a step backward in her call for them to be embraced


There is an urgent need for more public awareness and critical discussion of the international university ranking systems. In this regard, Belinda Bozzoli takes a step backward in her article "It's time we embraced and consciously competed in world university rankings, however flawed they may be" [Politicsweb, 28 August 2014]. On higher education in general, her voice is a potentially useful one. And in the piece of 28 August, she expresses some valid concerns - e.g., the dangers of ever-increasing class sizes and the need for appropriate investment. But she sees rankings-fixation as a price we must be prepared to pay for articulating these concerns, and here I disagree.

For academics, the rankings are part of a bigger picture. Their relatively recent emergence is one of three great pressures affecting performance evaluation and management strategy in South African universities. Another is the government's policy of funding institutions by a formula that rewards numbers of publications in accredited journals, regardless of their quality.

The third is an international imperative, since the 1980s, for universities to more closely resemble businesses and the corporate world, where measurement (as opposed to judgment) is a natural paradigm. The robustness of that imperative is perhaps surprising, given the spectacular corporate failures of the last 10-15 years.

There are now numerous bodies that rank universities. They are largely commercial concerns, not public services, and not all of their products were designed to be used as league tables. Each of them adopts a set of preferred "indicators" (such as "research"), as well as metrics that encode an institution's performance in an indicated category as a single number.

There are already obvious dangers here, as academic activity should be judged, rather than measured, but crudeness is not the only problem. Metrics notoriously distort, over time, the data that they set out to capture. In academia, they can lead to a cynical downward revision of intellectual goals, as scholars struggle to meet short-term productivity targets, risking a loss of public trust in the long run. Physicist Peter Higgs says, for instance, that his prediction of the Higgs boson in 1964 - the source of his recent Nobel prize - would not be achievable today, because academics are expected to "keep churning out papers" [The Guardian, 6 December 2013].

Performance metrics quickly become better detectors of glitter than of gold, and this is no longer a rarefied concern. The resulting damage to science has attracted comment in a leading article in The Economist ["How science goes wrong", 19 October 2013], which concludes that "the false trails laid down by shoddy research are an unforgivable barrier to understanding". Another shocking symptom of the publish-or-perish culture is the rise of published papers that are in fact computer-generated gibberish [see Ian Sample's article on this in The Guardian, 26 February 2014]. There is clearly more at stake here than Bozzoli's Rugby World Cup analogy would suggest.

But the story gets worse. It is a time-consuming responsibility to judge an institution (or, for that matter, a person). And the 21st century attention span is not well attuned to a list of universities, each accompanied by an ARRAY of numbers, corresponding to different indicators. Judgment-shy officials will prefer the crude simplicity of a SINGLE number to an array, as two arrays can't easily be compared. Thus, the temptation arises to attach WEIGHTS to multiple indicators, and then combine them into an overall magical score, yielding a linear ranking of institutions (or again, people).

Even if the metrics are trustworthy (and that's a big "if"), the weights attached to them are always arbitrary. The ranking systems don't agree on the choice of indicators (let alone metrics), but if they did, there would still be no correct way to assign weights. The overall scores and relative positions of institutions will always be dubious functions of this subjectivity. That is presumably one of the "flaws" alluded to - but unaccountably dismissed - in Bozzoli's apologia.
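The arbitrariness is easy to demonstrate. The sketch below (with invented scores and weights, purely for illustration) shows two hypothetical universities whose relative positions are decided not by the data but by the weighting chosen:

```python
# Hypothetical indicator scores, already normalized to a 0-100 scale.
scores = {
    "University A": {"research": 90, "teaching": 60},
    "University B": {"research": 70, "teaching": 85},
}

def overall(weights):
    """Combine each university's indicator scores into a single weighted score."""
    return {
        name: sum(weights[k] * v for k, v in inds.items())
        for name, inds in scores.items()
    }

# One ranker weights research heavily; another favours teaching.
research_heavy = overall({"research": 0.7, "teaching": 0.3})
teaching_heavy = overall({"research": 0.4, "teaching": 0.6})

# Identical underlying data, opposite "winners".
print(max(research_heavy, key=research_heavy.get))  # University A
print(max(teaching_heavy, key=teaching_heavy.get))  # University B
```

Neither set of weights is more "correct" than the other; the linear ordering is an artefact of the choice.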

Not every ranking system resorts to weighting indicators for the sake of a linear outcome. The SCImago and the Leiden rankings resist this temptation, balking perhaps at its presumption. As a consequence, those two systems command less media attention than the ones mentioned by Bozzoli - the so-called "big three", viz. the Shanghai system (whose real name is the Academic Ranking of World Universities, or ARWU), the Times Higher Education (THE) system, and the Quacquarelli Symonds (or QS) Rankings.

Eschewing nuance, the big three order the world's universities from top to bottom. Only the first few hundred positions are made public, however, and within these limits, the lower echelons are sometimes announced just as groupings - presumably because of a lack of confidence in the suppressed detail.

In the QS rankings, "reputation surveys" account for 50 percent of an institution's score, attracting charges of opacity. In the THE system, such surveys count for 33 percent, and income for a further 10.75 percent. The bad news for most South African universities is that both systems penalize high student-staff ratios, QS more severely. The Shanghai Rankings offer greater transparency, as they avoid reputation surveys, but instead they reward too generously the existence of academic "celebrities".

They attach a weight of 30 percent to the presence of Nobel Prize or Fields Medal Laureates among the staff or alumni of a university, and a further 20 percent to the presence of so-called "highly cited researchers". All of the big three count publications and/or citations, but in the Shanghai system, 20 percent of a university's score comes from its output in just two journals - "Nature" and "Science". [For more about the ranking methods, see A.P. Matthews, "South African universities in the world rankings", Scientometrics 92 (2012), 675-695.]

Simon Marginson, who serves on the advisory boards of both the Shanghai and the THE Rankings, says that, in social science terms, such products are "rubbish". He argues that the THE and QS are fatally flawed outside their selection of the top 50 universities, and he criticizes Shanghai's preoccupation with Nobel Laureates. He recommends that we "collectively ... critique and discredit the bad social science at the base of multi-indicator rankings", adding that the predilection of governments to use rankings as a proxy for quality makes speaking out even more important [The Australian, 16 October 2013]. Ellen Hazelkorn, an advisor to the Higher Education Authority in Ireland, writes that the rankings are "replete with perceptions of conflict of interest and self-interest, along with self-appointed auditors - all of which, in this post-global financial crisis age, would almost certainly provoke concern in other sectors" [University World News, 4 April 2014].

Lists of the top 50 universities tend to include the most famous names. Critics argue that the rankings start by reflecting inequalities in society and then help to entrench them. This point is elaborated, for instance, in Hazelkorn's book "Rankings and the Reshaping of Higher Education: The Battle for World-Class Excellence" (Palgrave Macmillan, 2011). Certainly, already famous universities are the best placed to attract new celebrities to their ranks. Without their reputation surveys and celebrity dividends, the THE, QS and Shanghai outcomes would probably become more volatile - as other ranking systems already are - and that might reduce their influence. But the truth about the quality of tertiary institutions is inherently complex, and no-one is well-served by questionable stabilizing criteria that hide the complexity.

The list of highly cited researchers (HCRs) is updated from time to time by the commercial company Thomson Reuters. Recent delays and stability issues with this list have attracted attention on Richard Holmes' blog "University Ranking Watch", and elsewhere. Bozzoli concedes that much "gaming" of the system occurs in the form of HCR trading - which might be the quickest way "up" for South African universities if we had the money - but she asserts that the purchase of celebrities at the University of Queensland has had a good trickle-down effect. Be that as it may, other institutions - most notably King Abdulaziz University in Jeddah, Saudi Arabia - have made spectacular progress in the Shanghai rankings by becoming the SECONDARY affiliation of multiple HCRs based at other universities, and this kind of manipulation can hardly be beneficial [see Paul Jump's article in Times Higher Education, 17 July 2014].

We should remember the extent to which the mere EXISTENCE of signed-up celebrities - as opposed to their current activities - affects an institution's standing in the Shanghai system, and the consequent volatility that can result from relocations. According to the aforementioned article by Matthews, for instance, the University of Cape Town would fall 130 places if it lost its (presumably few) Nobel alumni and HCRs, and it is hard to see in this a correct measure of the real impact of such a loss.

Academics are frequently reminded that the rankings matter to the parents of prospective students, but the Shanghai system makes no attempt to evaluate anything other than research. The quality of education is deliberately excluded from its remit. Holmes, in an only partly satirical list of 20 ways for a university to rise in the rankings, puts "get rid of students" as his first recommendation [University Ranking Watch, 22 December 2013]. And it is doubtful that the ranking of one's alma mater enhances one's employability - see D.D. Guttenplan, "Re-evaluating the college rankings game" [New York Times, 1 June 2014].

A significant indicator in the THE system is the average number of citations per paper published over the past six years. This provides an instructive example of how bibliometrics can go wrong, as it is a fraction whose denominator is the total number of publications at a given university (regardless of staff size). If that number is unusually LOW, and if a few of the papers are highly cited, a misleadingly large score can be achieved. University Ranking Watch [2 June 2014] sees in this the only plausible explanation for the meteoric ascent of India's Panjab University into the THE's top 250 in 2013, given the THE's own simultaneous negative verdict on that institution's research, and the involvement of a few of its staff with the international Large Hadron Collider project (one of whose outputs - a paper on the Higgs boson - has already accumulated 10,000 citations).
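The arithmetic of the distortion is worth spelling out. In this toy illustration (the numbers are invented, not drawn from any real institution), a small total output containing one blockbuster paper dwarfs a larger, consistently cited output:

```python
def citations_per_paper(citation_counts):
    """Average citations across all of an institution's papers."""
    return sum(citation_counts) / len(citation_counts)

# Institution X: 40 papers, each cited a solid 30 times.
steady = [30] * 40

# Institution Y: only 5 papers, four barely cited, one attached to a
# huge collaboration whose paper has 10,000 citations.
spike = [2, 3, 1, 4, 10_000]

print(citations_per_paper(steady))  # 30.0
print(citations_per_paper(spike))   # 2002.0
```

On this indicator, the institution with the thinner and patchier record scores roughly 67 times higher, simply because its denominator is small.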

The important point is that such embarrassments are not isolated anomalies. The quality of a university, like that of an individual's work, is simply not reducible to a single number, and no pretence that it is will eliminate absurdities of this kind. Incidentally, on the perils of counting citations and the meaninglessness of journal "impact factors", see David Colquhoun's spirited article "How to get good science" [Physiology News No 69, Winter 2007], where he argues that bibliometry is "as much witchcraft as homeopathy".

Bozzoli writes "... if you accept that these quantitative measures ... are but a proxy for the real issues of quality, then they become far less worrying." Indeed they might, but she provides no convincing justification for her premise and, as indicated above, there are good reasons to reject it. She says, in effect, that universities with the "best" staff (and low class sizes) should perform well in hypothetically reliable rankings, hence the converse should obtain in the systems that actually exist. She adds that it does obtain at Harvard, which comes top in most of the rankings - but she admits in a different context that Harvard's endowment equates to one percent of the USA's gross domestic product. Beyond the carefully chosen examples in her article, the argumentative burden of the "rankist" is formidable.

The word "best" demands explication but, once outsourced to the rankings, it becomes circular in any defense of the rankings themselves. In shrugging this burden off, Bozzoli exemplifies a depressing fatalism over the "numbers culture" besetting academia worldwide. With no good yardstick to hand, she would have us endorse bad ones, as long as they enthrall a significant clientele whose ignorance or cynicism can be counted on. In her own words, "if you can't beat them, join them". The abdication of leadership in this prescription is disappointing.

Bozzoli also claims that our higher education sector is "mired in fruitless and paralyzing debates" about criticisms of the rankings, while the ranking outcomes are not taken seriously in our universities. Although there may be differences between institutions, my impression is that local academics already experience the numbers culture as an ever-present tyranny. We owe it to the taxpayer and to our students to believe in what we publish, but the pressures of bibliometry - of which the rankings are an outgrowth - make it harder and harder to hold on to that aim.

Meanwhile, academic managers are largely prevented from developing a much-needed critique of the whole system, being hostages to their own need to compete in it. We have reached a point where we need fewer score-sheets and better arguments. Reacting to parallel developments in the UK, historian Stefan Collini sounds a warning that may be equally pertinent here. In his book "What are universities for?" (Penguin, 2012), he reminds us that "we are merely custodians for the present generation of a complex intellectual inheritance which we did not create - and which is not ours to destroy".

James Raftery is a professor of mathematics at the University of Pretoria. He writes in his personal capacity.
