Posted by: bluesyemre | June 23, 2017

#MicrosoftAcademic is on the verge of becoming a #bibliometric superpower by Sven E. Hug and Martin P. Brändle


Last year, the new Microsoft Academic service was launched. Sven E. Hug and Martin P. Brändle look at how it compares with more established competitors such as Google Scholar, Scopus, and Web of Science. While there are reservations about the availability of instructions for novice users, Microsoft Academic has impressive semantic search functionality, broad coverage, structured and rich metadata, and solid citation analysis features. Moreover, accessing raw data is relatively cheap. Given these benefits and its fast pace of development, Microsoft Academic is on the verge of becoming a bibliometric superpower.

In 2016, Microsoft released a new academic search engine. This happened quietly, as if the company was afraid of embarrassing itself again, as it did years ago in its loss to Google Scholar in the search engine race. However, there is absolutely no need to hide the new database, Microsoft Academic, since it has the potential to outduel Google Scholar, Web of Science, and Scopus. In fact, with 168 million records as of early 2017, the database has already outstripped Web of Science (59 million records) and Scopus (66 million records) in terms of coverage. Nothing can be reliably said with regard to Google Scholar, as its size has not been disclosed (estimates range from 160 to 200 million records). To access the new database, one can use the Microsoft Academic search interface or the Academic Knowledge API.

Search less, research more?

First trials show that the search interface of Microsoft Academic returns relatively few but very accurate results. This is due to its semantic search engine, which leverages entities associated with a paper (e.g. fields of study, journal, author, affiliation). In contrast, most other scholarly databases rely on search terms, which are also employed by Microsoft Academic but only if semantic search fails. Much like library databases, Microsoft Academic offers a range of filtering and sorting options to refine search results. This is very convenient and a plus compared to Google Scholar, which provides only very limited refinement options.

However, the search interface in its current stage is not without pitfalls and drawbacks. Above all, tutorials and instructions are virtually absent, leaving first-time users puzzled. For example, who would have guessed that the small symbols showing up in the search slot represent the entities that constitute the database (e.g. a laboratory flask for “field of study”)? And who would have known that natural language queries such as “papers about bibliometrics after 1977 citing Eugene Garfield” can be performed? Also, queries require some patience and can be choppy at times, as it takes the engine a while to suggest supplementary search terms and eventually display the results.

Microsoft recognises that semantic search is not yet widely adopted and that users need time to adapt. Hence, the new database may not yet live up to its slogan: “research more, search less”. However, Microsoft Academic is being developed at a relentless pace. Just recently, a social networking site for academics was integrated. Hopefully, the performance of, and instructions for, the search interface will soon be further improved.

Beyond searching: citation analysis

To fully tap the wealth of Microsoft Academic, one has to employ the Academic Knowledge API, which comes at relatively low cost ($0.25 per 1,000 queries). We have examined the API from the perspective of bibliometrics (i.e. the quantitative study of scholarly communication) and found that the metadata is structured and rich and can easily be retrieved, handled, and processed. The API allows retrieving aggregated citation counts and frequency distributions of citations. These features enable the calculation of a wide range of indicators and are a major advantage of Microsoft Academic over Google Scholar. First studies have shown that citation analyses with Microsoft Academic, Scopus, and Web of Science yield similar results with respect to the h-index, average-based and distribution-based indicators, and rank correlations of citation counts.
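To give a flavour of the kind of analysis the API enables, the sketch below builds an Evaluate-style request URL and computes an h-index from per-paper citation counts. The endpoint URL, the `Composite(AA.AuN==…)` query expression, and the `CC` (citation count) attribute follow the Academic Knowledge API documentation as it stood at the time; treat them as assumptions and verify against the current docs before use. The h-index calculation itself is standard and API-independent.

```python
# Sketch: building an Academic Knowledge API "Evaluate" request and
# computing an h-index from the citation counts it returns. Endpoint
# and attribute names are assumptions based on the 2017-era API docs.
from urllib.parse import urlencode

BASE = "https://api.labs.cognitive.microsoft.com/academic/v1.0/evaluate"

def build_query(author, attributes="Ti,Y,CC", count=100):
    """Build an Evaluate request URL for all papers by one author."""
    expr = f"Composite(AA.AuN=='{author}')"  # AA.AuN: normalised author name
    return BASE + "?" + urlencode({"expr": expr,
                                   "attributes": attributes,
                                   "count": count})

def h_index(citation_counts):
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for i, c in enumerate(sorted(citation_counts, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# Example with a mock API response (a real call would send the URL from
# build_query() with an Ocp-Apim-Subscription-Key header):
response = {"entities": [{"Ti": "paper a", "CC": 10},
                         {"Ti": "paper b", "CC": 4},
                         {"Ti": "paper c", "CC": 1}]}
counts = [e["CC"] for e in response["entities"]]
print(h_index(counts))  # -> 2
```

Because the API returns citation counts per paper rather than only aggregates, distribution-based indicators can be computed the same way, which is exactly the advantage over Google Scholar noted above.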

However, there are some limitations regarding the available metadata. First, the database does not provide the document type of a publication, which is often used for normalising indicators. Second, the fields of study – there are more than 50,000 of them! – cannot readily be employed for bibliometric analyses as they represent the semantics of a paper rather than traditional research fields.

