kevinderrane / data-analytics-thesis

License: MIT License

R 100.00%
ranking-algorithms evaluation-metrics citation-cartels algorithms thesis

Data-Analytics-Thesis

My thesis on ranking algorithms, submitted in partial fulfilment of the requirements for the degree of Master of Science in Computer Science (Data Analytics).

This thesis was awarded first-class honours.

Abstract

Citation analysis is an important tool for evaluating researchers and their scientific work. The most common evaluation metrics in use today are the impact factor for journals and the h-index for authors. In recent years these metrics have increasingly been used to decide whether a researcher is considered for a job, promoted, or considered for a government grant. The problem is that these metrics are easily manipulated by self-citations and, more seriously, by the recent emergence of citation cartels. Self-citations are easy to spot; citation cartels are not. This research project introduces alternative approaches, based on Google’s PageRank algorithm, for evaluating researchers and journals. The ArnetCite citation dataset, compiled by Václav Belák, was used. First, the way these algorithms ranked papers was compared with raw citation counts. Next, the robustness of the algorithms against author self-citations was determined. Then four of the lowest-ranking papers under both algorithms were chosen, and a citation cartel was formed among them by modifying existing entries to create synthetic citation data with cartel features. The performance of the algorithms was measured by how robust their scores remained when recalculated after the cartel was created. The methodologies and results of the algorithms are discussed, along with limitations and future work.
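The comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis code (the repository itself is written in R) and it does not use the ArnetCite data. It assumes a plain power-iteration PageRank with a damping factor of 0.85. The paper IDs, the toy citation graph, and the function names `pagerank` and `citation_counts` are all hypothetical. The sketch scores a small graph by raw citation count and by PageRank, then adds a four-paper cartel in which every member cites every other member, and recomputes both scores.

```python
def pagerank(edges, nodes, damping=0.85, iterations=100):
    """Power-iteration PageRank over a directed citation graph.

    edges: iterable of (citing, cited) paper-ID pairs.
    nodes: list of all paper IDs.
    """
    # Build de-duplicated outgoing citation sets for every paper.
    out_links = {node: set() for node in nodes}
    for citing, cited in edges:
        out_links[citing].add(cited)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread uniformly.
        dangling = sum(rank[p] for p in nodes if not out_links[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in nodes}
        for p in nodes:
            if out_links[p]:
                share = damping * rank[p] / len(out_links[p])
                for q in out_links[p]:
                    new_rank[q] += share
        rank = new_rank
    return rank


def citation_counts(edges, nodes):
    """Raw number of incoming citations per paper."""
    counts = {node: 0 for node in nodes}
    for _, cited in edges:
        counts[cited] += 1
    return counts


# Hypothetical toy graph: A and B are well cited, E-H are uncited.
nodes = ["A", "B", "C", "D", "E", "F", "G", "H"]
baseline = [
    ("B", "A"), ("C", "A"), ("D", "A"), ("C", "B"), ("D", "B"),
    ("E", "A"), ("F", "B"), ("G", "A"), ("H", "C"),
]

# Synthetic cartel: the four uncited papers all cite each other.
cartel = ["E", "F", "G", "H"]
cartel_edges = [(x, y) for x in cartel for y in cartel if x != y]
with_cartel = baseline + cartel_edges

if __name__ == "__main__":
    for label, edges in (("baseline", baseline), ("with cartel", with_cartel)):
        counts = citation_counts(edges, nodes)
        ranks = pagerank(edges, nodes)
        print(label)
        for node in nodes:
            print(f"  {node}: citations={counts[node]}  pagerank={ranks[node]:.4f}")
```

Comparing the two printed tables shows how far each score moves for the cartel members and for the legitimately cited papers. That shift is the kind of quantity the robustness evaluation above is concerned with. The sketch makes no claim about which metric is more robust.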
