By Charles Kowalski
In 1997 the Harvard Business Review judged a bland-sounding management concept called the ‘balanced scorecard’ (BSC) to be one of the most influential management ideas in the journal’s history (Head 2011). The BSC used ‘key performance indicators’ (KPIs) to measure individuals’ ‘performance’, with IT expected to intensify this process. For academics this meant that KPIs came to be used to measure ‘research performance’ through the Research Excellence Framework (REF) audit in 2014 and the preceding Research Assessment Exercises.
For REF 2014 the KPIs related to: the ‘impact’ of research, which accounted for 20% of the score and pertained to research that would benefit business or ‘civil society’; the ‘research environment’, worth 15% of the score, which related to the number of PhD students; and ‘research outputs’, which accounted for 65% of the score. Each subject-specific ‘unit of assessment’ (UoA) had a REF sub-panel made up of those deemed experts in their discipline. Outputs were awarded scores ranging from 4* to 1*. Only 4* and 3* work attracted ‘quality-related’ (QR) funding and the status sought by universities operating in an increasingly cut-throat market for undergraduate and postgraduate – especially high-fee-paying international postgraduate – ‘customers’.
Before ‘outputs’ are submitted to the relevant REF sub-panel, an internal process of grading and selection occurs. This process is of course subjective, and the internal reviewers may not be familiar with all the sub-fields within their discipline. Anecdotes abound of articles eventually accepted by journals with a high ‘impact rating’ being marked as 1* or 2* outputs, with the author concerned then excluded from the REF or, if there was time, forced to produce another ‘output’ to compensate for the ‘lower-value’ one. This is not to endorse the impact ratings of journals (citations can stem from fashionable topics or the names of authors), but it does point to the flawed process of internal review. The problem is of course intensified when a discipline is returned not to a matching UoA but to that of a cognate discipline.
When the work of a department is submitted to a REF panel, many in the department can be excluded, while the work of academics at overseas universities can be submitted provided they are on at least a 0.2 (20% of full-time) contract. The raw REF results take no account of staff not submitted or of those whose ‘outputs’ are simply purchased. The data can be re-interpreted to take these into account by publications like the Times Higher Education, but funding – and status – come from the raw results.
The grading of the outputs by the worthies of the REF sub-panels is as problematic as the internal review. For REF 2014, 191,232 ‘outputs’ had to be read and graded by about 1,000 assessors, with panels of 10–30 members each having to read and grade hundreds of ‘outputs’ in one year. Each member of the physics UoA sub-panel, for example, had to read 640 articles (Sayer 2014). Where books are involved the process is even more demanding, of course. Given this, it is not surprising that grades are awarded on the basis of skim-reading and subjective impressions, as some panel members admit (Sayer 2014). Furthermore, given the heterogeneity within some disciplines, serious questions arise about the competence and prejudices of panel members. The 27 academics on the history UoA sub-panel had to grade nearly 7,000 pieces of work (including monographs) from more than 1,750 researchers, without having specialist knowledge of all the subjects within history. Yet, despite lacking such knowledge, they graded a mountain of work, determining whether, in their eyes, it was ‘internationally excellent’ or not (Sayer 2014). Problems of prejudice are compounded by the fact that ‘outputs’ are not reviewed anonymously.
Submitting a Freedom of Information (FOI) request to see how decisions were made will not get very far. The 2008 Research Assessment Exercise sub-panels shredded all documents showing how they reached their decisions, precisely to prevent FOI requests from throwing light on the process (Sayer 2014).
With REF 2020 things will be different: big data will replace UoA peer-review sub-panels through an intensified use of KPIs/metrics. As Holmwood (2014) argues:
The ‘metricisation’ of the REF is a ‘Big Data’ project, with every academic contributing data points by publishing and citing publications that are available for online searches. Moreover, the current system is so costly that private companies – for example, Thomson Reuters – may offer to provide metric data at a lower price. Professional judgment by a panel of peers would be replaced by ‘crowd-sourced’ judgments […T]he metricisation of the REF would allow it to be the sole production of management together with a contracted ‘Big Data’ company.
For neoliberals this is to be celebrated for two reasons. First, it overcomes an outdated patronage system which, for them, may protect vested interests within professional groups – i.e. academics on a sub-panel being favourable to their disciplinary peers. All of which overlooks the messier reality, where intra-disciplinary prejudices, lack of specialist knowledge and lack of sympathy may well hold sway. Second, it allows market transparency, with potential employers and managers able to see the immediate ‘worth’ of the human capital they may invest, or have invested, in. This, in turn, can lead to intensified micro-management, which was a key objective of the BSC and its realisation through KPIs. In place of old-fashioned professional autonomy there would be micro-management to increase performance and ‘impact’.
Neoliberals need to be careful, though. In their submission to the Metrics Review Call for Evidence by HEFCE (the Higher Education Funding Council for England), Martin, Nightingale and Rafols (2014), who broadly support the use of metrics, argue that:
metrics should be used to inform, rather than substitute for, expert judgment. Metrics, by themselves, cannot be set up to provide an algorithm to make decisions, nor can one assume that past performance (what is measured is inevitably in the past) is a reliable guide to future prospects.
Their concern is that the use of metrics alone will lead to ‘gaming’, with researchers publishing only on ‘fashionable topics’ or in the more integrated fields – with, for example, oncology ‘performing well’ and epidemiology not doing well. This leads them to argue that ‘a likely consequence of the inappropriate use of metrics is the suppression of diversity and creativity and hence also of socio-economic impact of research’. In other words, increased academic ‘performance’ may well not lead to increased ‘impact’. Instead, gaming academics could thrive through what, it is suggested here, we can call auto-patronage.
The ability to practise auto-patronage would vary by subject: the sciences would be most able (albeit with declining ‘impact’ as a result); the humanities least able; and the social sciences would have divided fortunes. Nonetheless, this does not stop the zealous champion of neoliberalism in higher education, David Eastwood (Vice-Chancellor of Birmingham University), celebrating the new metrics regime for ‘internationalising’ research ‘excellence’ (Holmwood 2014). The £410,000-a-year marketeer wants a future of auto-patronage, limited creativity and, ironically, limited ‘impact’.
Simon Head (2011) ‘The Grim Threat to British Universities’
John Holmwood (Dec. 2014)
Ben Martin, Paul Nightingale and Ismael Rafols (Dec. 2014) ‘Response to the Metrics Review Call for Evidence’
Derek Sayer (Dec. 2014) ‘One Scholar’s Crusade Against the REF’