Name: Data Science for Social Impact Research Group @ University of Pretoria
Type: Organization
Bio: We are the Data Science for Social Impact research group at the Computer Science Department, University of Pretoria.
Twitter: dsfsi_research
Location: University of Pretoria, South Africa
Blog: https://dsfsi.github.io
Data Science for Social Impact Research Group @ University of Pretoria's Projects
Zondo Commission or State Capture Commission Transcripts
A RoBERTa-based language model designed specifically for Setswana, trained on the new PuoData dataset.
Curated corpora for Setswana. Used to train PuoBERTa.
South African Member of Parliament Data
StatsSA statistical language glossary in machine-readable format
TextAugment: Text Augmentation Library
A template for MSc, MIT and PhD that meets UP requirements.
The dataset contains editions of the South African government magazine Vuk'uzenzele. The data was scraped from PDFs obtained from the Vuk'uzenzele website and placed in the data/raw folder.
This repository is an initial pipeline for reading, processing, labelling and classifying unstructured annual reports of South African (SA) banks, with the aim of identifying financial risk. It builds on the Corporate Financial Information Environment-Final Report Structure Extractor (CFIE–FRSE) work of El-Haj et al., which created a corpus of annual reports of United Kingdom (UK) companies.
Dataset of South African Disinformation [Fake News] Website Data collected in 2020
IsiZulu News (articles and headlines) and Siswati News (headlines) Corpora - za-isizulu-siswati-news-2022
DSFSI South African Terminology Lists and Lexicon Project