Prospective Students Can Explore Ranking Data

May 4, 2018

Ranking university programs can be useful: it helps students decide which school to attend, it helps prospective professors decide where to apply for jobs, and it lets university administrators determine which of their units are performing exceptionally well.

What does it really mean for one department to be ranked higher than another? Does it mean that they publish more papers? That more of their graduates create successful companies? It isn’t clear that there’s any single right answer to these questions.

It is clear, however, that ranking can be done badly. According to the Computing Research Association (CRA), that is what has happened with the US News and World Report rankings for computer science, which are perhaps the most widely used and influential. The CRA issued a statement describing a number of problems with the methods used by US News and World Report, including poor tracking of the venues where computer scientists publish research papers. The statement concludes: “Anyone with knowledge of CS research will see these rankings for what they are–nonsense–and ignore them. But others may be seriously misled.”

Beyond the problems identified by the CRA, the US News rankings are also hard to interpret because the criteria they are based on are not public. We don’t know the formula they use, nor do we have access to the data that is fed into it. This makes it hard for people such as prospective students to benefit from the rankings, because it just isn’t clear what it means for one computer science department to be ranked above another.

Computer science professor Emery Berger, at the University of Massachusetts at Amherst, has come up with a better way to do rankings, called CSRankings. His method is transparent: anyone can inspect both the formula that it uses and the data that is fed into the formula. The entire implementation of his ranking system is available as open source software!

CSRankings is based on the idea that the best computer science departments are the ones that publish the most articles at “top tier” conferences. These conferences might accept only 10-20% of the papers submitted for publication each year and they are where the best researchers tend to submit their best work.

By counting only top-tier publications, instead of total publications, CSRankings avoids the problem of inflating the ranking of researchers who publish a large number of low-quality publications. The CSRankings system is carefully designed to be a zero-sum game: the total credit that it gives to a top-tier paper cannot be inflated by adding authors to a paper.
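
To make the zero-sum idea concrete, here is a minimal sketch. It assumes that each top-tier paper carries one unit of credit that is split evenly among its authors. The function name and the author names are hypothetical, and the actual CSRankings implementation may differ in its details.

```python
from collections import defaultdict
from fractions import Fraction


def author_credit(papers):
    """Split one unit of credit per paper evenly among its authors.

    `papers` is a list of author-name lists, one list per top-tier paper.
    The even split is an assumption made for this illustration.
    """
    credit = defaultdict(Fraction)
    for authors in papers:
        share = Fraction(1, len(authors))
        for name in authors:
            credit[name] += share
    return dict(credit)


# Hypothetical data: the same paper listed with two authors, then with four.
two_authors = author_credit([["Ada", "Ben"]])
four_authors = author_credit([["Ada", "Ben", "Cy", "Di"]])

# The total credit for the paper stays at 1 however many authors are listed.
assert sum(two_authors.values()) == 1
assert sum(four_authors.values()) == 1
```

Under this assumption, adding an author only shrinks each person's share; the paper as a whole never counts for more than one.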

The openness of the CSRankings system and its data set is a huge advantage. The best thing is that the CSRankings web site allows everyone to explore the data.

Let’s say that a prospective student is interested in operating systems and formal verification. That person can select only those two areas of interest and the site will show the departments that publish heavily in top-tier conferences in those specific areas. A prospective student can then drill down at the department level and see who the key players are in those areas and read their code and papers.
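
The kind of filtering described above can be sketched in a few lines. This is only an illustration: the records are made up, the function name is hypothetical, and summing credit across the selected areas is an assumption rather than a description of how the site combines areas.

```python
from collections import defaultdict


def rank_departments(records, selected_areas):
    """Rank departments by summed credit in the selected areas only.

    `records` holds (department, area, credit) tuples. Summing credit is an
    assumption for this sketch; the real site may combine areas differently.
    """
    totals = defaultdict(float)
    for department, area, credit in records:
        if area in selected_areas:
            totals[department] += credit
    # Highest total first; ties are broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Made-up records, purely for illustration.
records = [
    ("Dept A", "operating systems", 4.0),
    ("Dept A", "machine learning", 9.0),
    ("Dept B", "operating systems", 2.5),
    ("Dept B", "formal verification", 3.0),
    ("Dept C", "machine learning", 7.0),
]

ranking = rank_departments(records, {"operating systems", "formal verification"})
for department, total in ranking:
    print(department, total)
```

In this made-up data, Dept A has the most credit overall, but Dept B comes out ahead once only operating systems and formal verification are counted, and Dept C drops out entirely.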

This is a fundamentally different use of a ranking system. The ultimate purpose of the rankings is to guide us toward accurate data that can be used to make informed decisions.