amirmohammadkz / personality_detection

BB-SVM model for automatic personality detection of the essays dataset (Big-Five personality labeled traits)

Home Page: https://sentic.net/personality-detection-using-bagged-svm-over-bert.pdf

License: MIT License

Python 100.00%
personality-traits personality-detection bert svm essays-dataset machine-learning scikit-learn personality-profiling sentiment-analysis

personality_detection's Introduction

BB-SVM model for automatic personality detection of the Essays dataset (Big-Five personality labelled traits)

This repository contains a Bagging SVM over BERT model for classifying the Essays dataset.

Installation

See requirements.txt for the list of required packages, which can be installed via:

pip install -r requirements.txt

The pinned versions are the ones used in the paper. Newer versions of the required packages may change the results; some experiments showed that a newer scikit-learn improves accuracy. Please also check the bert-as-service requirements (e.g. a TensorFlow version above 1.10 and below 2 is required). The code runs on Python 3.7. User feedback indicates that it does not run on Python versions above 3.8.
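The version constraints above can be checked before starting the pipeline. The following is a minimal sketch and not part of the repository: the helper names parse_version, tensorflow_ok and python_ok are hypothetical, and the bounds are only the ones stated in this section. It treats 1.10.x as acceptable, which is an assumption about how the lower bound should be read.

```python
import sys


def parse_version(text):
    """Turn '1.15.0' into (1, 15, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def tensorflow_ok(version_text):
    """True if the version is at least a 1.10.x release and below 2 (assumed reading)."""
    return (1, 10) < parse_version(version_text) < (2,)


def python_ok(version_info=sys.version_info):
    """True for Python 3.x up to 3.8; versions above 3.8 were reported not to work."""
    return (3,) <= tuple(version_info[:2]) <= (3, 8)


if __name__ == "__main__":
    print("Python version acceptable:", python_ok())
```

The TensorFlow check takes the version as a string, so it can be fed from tensorflow.__version__ once the package is installed.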

Usage

1- Run shrink_data.py to split each document into sub-documents, so that BERT can process every sub-document in full.

python shrink_data.py
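The idea behind this step can be sketched as follows. This is an illustration only, not the logic of shrink_data.py: the function name split_into_subdocuments and the limit of 200 words are hypothetical, and the sketch counts whitespace-separated words, whereas BERT's own limit applies to WordPiece tokens.

```python
def split_into_subdocuments(text, max_words=200):
    """Split a document into consecutive chunks of at most max_words words.

    max_words=200 is an arbitrary illustrative limit, not a value from the repository.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    essay = "one two three four five"
    for chunk in split_into_subdocuments(essay, max_words=2):
        print(chunk)
```

Each chunk is short enough to be encoded on its own, and the per-chunk embeddings can then be combined for the whole essay.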

2- Run BERT server on all layers from cmd/terminal (more information here)

bert-serving-start -model_dir uncased_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=NONE -show_tokens_to_client -pooling_layer -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1

3- Run process_data_with_sentence_bert.py to extract BERT word embeddings.

python process_data_with_sentence_bert.py

4- Run svm_result_calculator.py to predict the personality traits. (You can edit svm.py to switch Bagging on or off.)

python svm_result_calculator.py
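For readers unfamiliar with the technique, a bagged SVM can be sketched with scikit-learn as below. This is a hedged illustration, not the contents of svm.py: the function name build_classifier, the default SVC settings, the number of estimators and the synthetic data are all assumptions. In the real pipeline the features would be the BERT embeddings from step 3, with one binary classifier per trait.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def build_classifier(use_bagging=True, n_estimators=10, seed=0):
    """Return a plain SVM, or an ensemble of SVMs trained on bootstrap samples."""
    svm = SVC()  # default settings; the paper's hyperparameters are not assumed here
    if not use_bagging:
        return svm
    # The estimator is passed positionally so the call works across
    # scikit-learn versions that name this parameter differently.
    return BaggingClassifier(svm, n_estimators=n_estimators, random_state=seed)


# Synthetic stand-in for (embedding, trait label) pairs.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = build_classifier(use_bagging=True)
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
print("held-out accuracy:", clf.score(X_test, y_test))
```

Calling build_classifier(use_bagging=False) returns the single SVM, which mirrors the choice between using Bagging or not.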

Running Time

On an Intel Core i7-4720 HQ CPU, our fine-tuning model only takes about 7 minutes to train.

Citation

If you use this code in your work, please cite the paper "Personality Trait Detection Using Bagged SVM over BERT Word Embedding Ensembles" using the following entry:

@inproceedings{kazameinipersonality,
  title={Personality Trait Detection Using Bagged SVM over BERT Word Embedding Ensembles},
  author={Kazameini, Amirmohammad and Fatehi, Samin and Mehta, Yash and Eetemadi, Sauleh and Cambria, Erik},
  booktitle={Proceedings of the The Fourth Widening Natural Language Processing Workshop},
  Organization = {Association for Computational Linguistics},
  year={2020}}
