elmadany / orca

This project forked from ubc-nlp/orca

ORCA is a large-scale Arabic Language Understanding Evaluation Benchmark

Home Page: https://orca.dlnlp.ai/

License: Apache License 2.0

Python 92.55% Jupyter Notebook 7.45%

orca's Introduction




In this work, we introduce ORCA, a publicly available benchmark for Arabic language understanding evaluation. ORCA is carefully constructed to cover diverse Arabic varieties and a wide range of challenging Arabic understanding tasks exploiting 60 different datasets across seven NLU task clusters. To measure current progress in Arabic NLU, we use ORCA to offer a comprehensive comparison between 18 multilingual and Arabic language models.

ORCA Task Clusters

We arrange ORCA into seven NLU task clusters: (1) sentence classification, (2) structured prediction, (3) semantic textual similarity and paraphrase, (4) topic classification, (5) natural language inference, (6) word sense disambiguation, and (7) question answering. In the tables below, MSA denotes Modern Standard Arabic and DA denotes Dialectal Arabic.

(1) Natural Language Inference (NLI)

| Task | Variation | Metric | Reference |
|------|-----------|--------|-----------|
| ANS Stance | MSA | Macro F1 | (Khouja, 2020) |
| Baly Stance | MSA | Macro F1 | (Baly et al., 2018) |
| XNLI | MSA | Macro F1 | (Conneau et al., 2018) |

(2) Question Answering (QA)

| Task | Variation | Metric | Reference |
|------|-----------|--------|-----------|
| Question Answering | MSA | Macro F1 | (Abdul-Mageed et al., 2020a) |

(3) Semantic Textual Similarity and Paraphrase (STSP)

| Task | Variation | Metric | Reference |
|------|-----------|--------|-----------|
| Emotion Regression | MSA | Spearman Correlation | (Saif et al., 2018) |
| MQ2Q | MSA | Macro F1 | (Seelawi et al., 2019) |
| STS | MSA | Macro F1 | (Cer et al., 2017) |
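
The Emotion Regression task is scored with Spearman correlation, which is the Pearson correlation of the rank-transformed gold and predicted scores. The following is a minimal, dependency-free sketch of that computation for checking predictions locally. The function names `average_ranks` and `spearman` are illustrative and not part of ORCA, and the leaderboard's own scoring code may differ (for example, it may rely on `scipy.stats.spearmanr`).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every item tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(gold, pred):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(gold) != len(pred) or len(gold) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rank_gold, rank_pred = average_ranks(gold), average_ranks(pred)
    mean_gold = sum(rank_gold) / len(rank_gold)
    mean_pred = sum(rank_pred) / len(rank_pred)
    cov = sum((a - mean_gold) * (b - mean_pred) for a, b in zip(rank_gold, rank_pred))
    var_gold = sum((a - mean_gold) ** 2 for a in rank_gold)
    var_pred = sum((b - mean_pred) ** 2 for b in rank_pred)
    if var_gold == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_gold * var_pred)


if __name__ == "__main__":
    # Hypothetical gold intensities and model outputs, for illustration only.
    gold_scores = [0.10, 0.40, 0.35, 0.80]
    predicted = [0.20, 0.50, 0.30, 0.90]
    print(spearman(gold_scores, predicted))
```

Because only the ranks matter, a model whose outputs are on a different scale from the gold scores can still reach a perfect correlation as long as it orders the examples correctly.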

(4) Sentence Classification (SC)

| Task | Variation | Metric | Reference |
|------|-----------|--------|-----------|
| Abusive | DA | Macro F1 | (Mulki et al., 2019) |
| Adult | DA | Macro F1 | (Mubarak et al., 2021) |
| Age | DA | Macro F1 | (Abdul-Mageed et al., 2020b) |
| ANS Claim | MSA | Macro F1 | (Khouja, 2020) |
| Dangerous | DA | Macro F1 | (Alshehri et al., 2020) |
| Dialect Binary | DA | Macro F1 | (Farha, 2020), (Zaidan, 2014), (Abdul-Mageed et al., 2020c), (Bouamor et al., 2019), (Abdelali et al., 2020), (El-Haj, 2020) |
| Dialect Country | DA | Macro F1 | (Farha, 2020), (Zaidan, 2014), (Abdul-Mageed et al., 2020c), (Bouamor et al., 2019), (Abdelali et al., 2020), (El-Haj, 2020) |
| Dialect Region | DA | Macro F1 | (Farha, 2020), (Zaidan, 2014), (Abdul-Mageed et al., 2020c), (Bouamor et al., 2019), (Abdelali et al., 2020), (El-Haj, 2020) |
| Emotion | DA | Macro F1 | (Abdul-Mageed et al., 2020b) |
| Gender | DA | Macro F1 | (Abdul-Mageed et al., 2020b) |
| Hate Speech | DA | Macro F1 | (Mubarak et al., 2020) |
| Irony | DA | Macro F1 | (Ghanem et al., 2019) |
| Machine Generation | MSA | Macro F1 | (Nagoudi et al., 2020) |
| Offensive | DA | Macro F1 | (Mubarak et al., 2020) |
| Sarcasm | DA | Macro F1 | (Farha and Magdy, 2020) |
| Sentiment Analysis | DA | Macro F1 | (Abdul-Mageed et al., 2020c) |

(5) Structured Prediction (SP)

| Task | Variation | Metric | Reference |
|------|-----------|--------|-----------|
| Aqmar NER | MSA | Macro F1 | (Mohit, 2012) |
| Arabic NER Corpus | MSA | Macro F1 | (Benajiba and Rosso, 2007) |
| Dialect Part Of Speech | DA | Macro F1 | (Darwish et al., 2018) |
| MSA Part Of Speech | MSA | Macro F1 | (Liang et al., 2020) |

(6) Topic Classification (TC)

| Task | Variation | Metric | Reference |
|------|-----------|--------|-----------|
| Topic | MSA | Macro F1 | (Abbas et al., 2011), (Chouigui et al., 2017), (Saad, 2010) |

(7) Word Sense Disambiguation (WSD)

| Task | Variation | Metric | Reference |
|------|-----------|--------|-----------|
| Word Sense Disambiguation | MSA | Macro F1 | (El-Razzaz, 2021) |
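
Most tasks in the tables above are scored with Macro F1, the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal, dependency-free sketch for classification labels. The function name `macro_f1` and the example labels are illustrative, and the leaderboard's exact implementation is not specified here (a common alternative is `sklearn.metrics.f1_score` with `average="macro"`).

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("need two non-empty sequences of equal length")
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); a class that is never predicted
        # correctly contributes 0 to the average.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical sentiment labels, for illustration only.
    gold_labels = ["pos", "pos", "neg", "neg", "neu"]
    predictions = ["pos", "neg", "neg", "neg", "pos"]
    print(round(macro_f1(gold_labels, predictions), 4))
```

In this example the per-class F1 scores are 0.5 (`pos`), 0.8 (`neg`), and 0.0 (`neu`). The missed `neu` class pulls the average down even though it has a single example, which is why Macro F1 suits the imbalanced datasets common in these clusters.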

How to use ORCA

Install Requirements

    pip install datasets transformers seqeval

Fine-tuning a model on ORCA tasks

We provide a Google Colab Notebook that includes instructions for fine-tuning any model on ORCA tasks.
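
One step that often comes up when fine-tuning on the structured prediction tasks (NER and part-of-speech tagging) is aligning word-level labels with subword tokens. The following is a minimal sketch, assuming the word indices come from a fast tokenizer's `word_ids()` method in `transformers` and that `-100` is used as the label value ignored by the loss. The function name `align_labels` and the example ids are hypothetical, and the notebook may handle this step differently.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss


def align_labels(word_ids, word_labels, label_all_subwords=False):
    """Expand word-level label ids to one label per subword token.

    word_ids: one entry per token, giving the index of the word it came
        from, or None for special tokens such as [CLS] and [SEP].
    word_labels: one label id per word in the original sentence.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens never contribute to the loss.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First subword of a word carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Later subwords are either labelled like the word or ignored.
            aligned.append(word_labels[word_id] if label_all_subwords else IGNORE_INDEX)
        previous = word_id
    return aligned


if __name__ == "__main__":
    # Hypothetical tokenization: three words split into 2, 1, and 3 subwords.
    example_word_ids = [None, 0, 0, 1, 2, 2, 2, None]
    example_labels = [3, 0, 7]
    print(align_labels(example_word_ids, example_labels))
```

Labelling only the first subword keeps evaluation at the word level, which matches how sequence-labelling metrics such as those in `seqeval` are usually computed.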

Submitting your results on ORCA test

We design a public leaderboard for scoring PLMs on ORCA. Our leaderboard is interactive and offers rich meta-data about the various datasets involved as well as the language models we evaluate.

You can evaluate your models using the ORCA leaderboard: https://orca.dlnlp.ai


Citation

If you use ORCA for your scientific publication, or if you find the resources in this repository useful, please cite our paper as follows (to be updated):

@inproceedings{elmadany2022orca,
  title = {{ORCA: A Challenging Benchmark for Arabic Language Understanding}},
  author = {Elmadany, AbdelRahim and Nagoudi, El Moatez Billah and Abdul-Mageed, Muhammad},
  booktitle = {61st Annual Meeting of the Association for Computational Linguistics (ACL'23)},
  address = {Toronto, Canada},
  publisher = {Association for Computational Linguistics},
  url = {https://arxiv.org/pdf/2212.10758.pdf},
  year = {2023}
}


Acknowledgments

We gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, the Canada Foundation for Innovation, Compute Canada, and UBC ARC-Sockeye. We also thank the Google TensorFlow Research Cloud (TFRC) program for providing us with free TPU access.

Contributors

nagoudi, elmadany
