This repository contains a simple but efficient implementation of an argument-based recommender system, which makes use of NLP techniques and a taxonomy and lexicon of connectors to extract argument graphs from the proposals and citizen debates available in the Decide Madrid e-participation platform.

Home Page: https://argrecsys.github.io/arg-miner/

License: Apache License 2.0

Java 2.17% HTML 52.71% Jupyter Notebook 45.12% Batchfile 0.01%
argument-mining argument-based-recommender-systems nlp arguments connectors recommender-systems decide-madrid

arg-miner's Introduction

Extraction and use of arguments in Recommender Systems

Papers

This work (v1.0) was presented as a paper at the Joint Workshop of the 3rd Edition of Knowledge-aware and Conversational Recommender Systems (KaRS) and the 5th Edition of Recommendation in Complex Environments (ComplexRec), co-located with the 15th ACM Conference on Recommender Systems (RecSys 2021). Virtual Event, Amsterdam, The Netherlands, September 25, 2021. Paper and presentation slides can be found here.

This work (v2.0) was presented as a paper at the 23rd Annual International Conference on Digital Government Research. Virtual Event, Seoul National University, South Korea, June 15, 2022. Paper and presentation slides can be found here.

Part of this work (v2.6.4) has been submitted as a paper to Government Information Quarterly. A draft of the paper can be found here.

Solution

The system is composed of two main modules:

  • Argument Miner: automatic argument extractor based on NLP techniques and a lexicon of connectors.
  • Argument-based Recommender System: recommends proposals and arguments based on topics and aspects of interest to the user.

Resources

This project uses the two-level taxonomy of argument relations and the set of linguistic connectors (in English and Spanish) published in the argrecsys/connectors repository.

Algorithm

We propose a heuristic method that automatically identifies and extracts arguments from textual content, and we evaluate it on citizen proposals and comments from the Decide Madrid e-participation platform. The method follows a simple but effective algorithm that addresses the three basic tasks of argument mining: argument detection, argument constituent identification, and argument relation recognition using the two-level taxonomy.

[Figure: arg-algorithm]
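
The three tasks can be illustrated with a minimal Python sketch. It is not the repository's implementation, which is written in Java and relies on syntactic trees and sentence patterns. The three-entry `LEXICON`, the function name `extract_argument`, and the assumption that the claim always precedes the connector are all hypothetical simplifications; the category, subcategory, and intent values are taken from the example outputs in this README.

```python
import re
from typing import Optional

# Hypothetical three-entry lexicon, for illustration only.
# The project's real lexicon lists 318 Spanish connectors.
LEXICON = {
    "but": {"category": "CONTRAST", "subCategory": "OPPOSITION", "intent": "attack"},
    "because": {"category": "CAUSE", "subCategory": "REASON", "intent": "support"},
    "due to": {"category": "CAUSE", "subCategory": "REASON", "intent": "support"},
}

# Try longer connectors first so multi-word connectors are not shadowed.
_CONNECTORS = sorted(LEXICON, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in _CONNECTORS) + r")\b",
    re.IGNORECASE,
)


def extract_argument(sentence: str) -> Optional[dict]:
    """Split a sentence at its first connector into claim, linker, and premise."""
    match = _PATTERN.search(sentence)
    if match is None:
        return None  # argument detection: no connector, no argument

    # Constituent identification (assumed order: claim, connector, premise).
    claim = sentence[: match.start()].strip(" ,.;")
    premise = sentence[match.end():].strip(" ,.;")
    if not claim or not premise:
        return None

    # Relation recognition: look the connector up in the two-level taxonomy.
    connector = match.group(1).lower()
    return {
        "claim": claim,
        "premise": premise,
        "linker": {"value": connector, **LEXICON[connector]},
    }


argument = extract_argument(
    "We are almost forced to use public transport in the city "
    "but pets are not allowed in EMT"
)
```

A plain string split like this cannot handle nested or compound sentences, which is presumably why the actual system works on parse trees.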

Preliminary Results

We present some statistics from a preliminary offline test on the automatic identification and extraction of arguments from the citizen proposals available in the Decide Madrid database:

  • Using the full list of 318 Spanish connectors, 11,645 arguments were identified and extracted (9,676 from simple sentences and 1,969 from compound sentences).
  • By source, 2,025 arguments were extracted from the proposals and 9,620 from the proposal comments.
  • Of the 11,645 arguments extracted (some proposals had more than one argument), 5,136 (44.1%) were identified with connectors from the CONTRAST category, 4,669 (40.1%) from the CONSEQUENCE category, 1,316 (11.3%) from the CAUSE category, 505 (4.3%) from the ELABORATION category, and 19 (0.2%) from the CLARIFICATION category.
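
The figures above are internally consistent, which the following short Python check shows. The counts are copied from the list; the variable names are illustrative only.

```python
# Argument counts per connector category, as reported above.
counts = {
    "CONTRAST": 5136,
    "CONSEQUENCE": 4669,
    "CAUSE": 1316,
    "ELABORATION": 505,
    "CLARIFICATION": 19,
}

total = sum(counts.values())

# Share of each category, as a percentage rounded to one decimal place.
shares = {category: round(100 * n / total, 1) for category, n in counts.items()}

# The two other breakdowns must add up to the same total.
by_sentence_type = 9676 + 1969  # simple + compound sentences
by_source = 2025 + 9620         # proposals + proposal comments

for category, share in shares.items():
    print(f"{category:<14}{counts[category]:>7,}{share:>7.1f}%")
```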

Descriptive and Network Analysis

  • The descriptive analysis of the result of the automatic annotation can be seen in the following report.
  • Network analysis to find the argumentative threads within the proposals can be seen in the following report.

Outputs

All the results generated by the Argument Miner and Recommender System can be consulted in the following folder.

Example in JSON format of an argument extracted from a citizen proposal about public transportation by the Argument Miner.

{
    "5717-0-1-1": {
        "proposalID": 5717,
        "majorClaim": {
            "entities": "[]",
            "text": "Allowing pets on public transport",
            "nouns": "[pets, transport]"
        },
        "sentence": "We are almost forced to use public transport in the city but pets are not allowed in EMT",
        "claim": {
            "entities": "[]",
            "text": "We are almost forced to use public transport in the city",
            "nouns": "[use, transport, city]"
        },
        "linker": {
            "value": "but", "intent": "attack",
            "category": "CONTRAST", "subCategory": "OPPOSITION"
        },
        "premise": {
            "entities": "[EMT]",
            "text": "pets are not allowed in EMT",
            "nouns": "[pets]"
        },
        "mainVerb": "forced",
        "pattern": {
            "value": "[S]-[conj_LNK]-[S]-[PUNCT]",
            "depth": 1
        },
        "syntacticTree": "(sentences 
                            (S (sn (PRP We)) (group.verb (VBP are) ... ))
                            (conj but)
                            (S (sn (NNS pets)) (group.verb (VBP are) (RB not) ... ))
                            (PUNCT .))"
    }
}
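
Note that the `entities` and `nouns` fields in the record above are bracketed strings rather than JSON arrays. A consumer could turn them back into lists with a small helper like the Python sketch below. The helper name `parse_bracket_list` is hypothetical, and the sketch assumes that individual items never contain commas.

```python
from typing import List


def parse_bracket_list(value: str) -> List[str]:
    """Turn a string such as "[pets, transport]" into ["pets", "transport"]."""
    inner = value.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # Assumption: items are comma-separated and contain no commas themselves.
    return [item.strip() for item in inner.split(",") if item.strip()]


# Fields copied from the example record above.
claim = {
    "entities": "[]",
    "text": "We are almost forced to use public transport in the city",
    "nouns": "[use, transport, city]",
}
premise = {
    "entities": "[EMT]",
    "text": "pets are not allowed in EMT",
    "nouns": "[pets]",
}

claim_nouns = parse_bracket_list(claim["nouns"])
premise_entities = parse_bracket_list(premise["entities"])
```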

Example in XML format of recommendations of citizen proposals and arguments about public transportation generated by the Argument-based Recommender System.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<recommendations>
    <proposals quantity="5">
        <proposal id="20307" topics="buses" categories="mobility" date="2017-12-10" districts="Tetuán">
            Urban buses connecting San Chinarro and Las Tablas with Cuatro Caminos</proposal>
        <proposal id="1432" topics="environment" categories="mobility" date="2015-09-18" districts="city">
            Public transportation in Madrid Río</proposal>
        <proposal id="5717" topics="pets" categories="mobility" date="2015-11-18" districts="city">
            Allowing pets on public transport</proposal>
        <proposal id="4671" topics="public transport" categories="mobility" date="2015-11-05" districts="city">
            Public transport price</proposal>
        <proposal id="2769" topics="transport pass" categories="mobility" date="2015-10-07" districts="city">
            The Transport Pass should expire in one month</proposal>
    </proposals>
    <topics quantity="1">
        <topic value="transport" aspects="subway,use,price,transports" quantity="4">
            <aspect value="subway" quantity="2">
                <argument id="20307-1">
                    <claim>The PAU of Norte Sanchinarro Las Tablas are poorly served by public transport</claim>
                    <connector category="cause" subcategory="reason" intent="support">due to</connector>
                    <premise>the ineffectiveness of light subway</premise>
                </argument>
                <argument id="1432-1">
                    <claim>The Madrid Rio park was created promising that public transport would reach there</claim>
                    <connector category="contrast" subcategory="opposition" intent="attack">but</connector>
                    <premise>it is false, the Legazpi subway is far away and buses are non-existent</premise>
                </argument>
            </aspect>
            <aspect value="use" quantity="1">
                <argument id="5717-1">
                    <claim>The use of public transport in the city is almost forced</claim>
                    <connector category="contrast" subcategory="opposition" intent="attack">but</connector>
                    <premise>in EMT pets are not allowed</premise>
                </argument>
            </aspect>
            <aspect value="price" quantity="1">
                <argument id="4671-1">
                    <claim>Lower the price of transportation</claim>
                    <connector category="cause" subcategory="reason" intent="support">because</connector>
                    <premise>it is very expensive</premise>
                </argument>
            </aspect>
            <aspect value="transport" quantity="1">
                <argument id="2769-1">
                    <claim>The Madrid Transport Pass expires in 30 days</claim>
                    <connector category="contrast" subcategory="opposition" intent="attack">but</connector>
                    <premise>not all months have 30 days, there are several months that have 31 days</premise>
                </argument>
            </aspect>
        </topic>
    </topics>
</recommendations>
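
A recommendation file with this layout can be read using Python's standard `xml.etree.ElementTree` module. The sketch below groups arguments by aspect; the function name `arguments_by_aspect` is hypothetical, and `SAMPLE` is a shortened copy of the example above containing only two of its aspects.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example above (two aspects only).
SAMPLE = """<recommendations>
    <topics quantity="1">
        <topic value="transport" aspects="use,price" quantity="2">
            <aspect value="use" quantity="1">
                <argument id="5717-1">
                    <claim>The use of public transport in the city is almost forced</claim>
                    <connector category="contrast" subcategory="opposition" intent="attack">but</connector>
                    <premise>in EMT pets are not allowed</premise>
                </argument>
            </aspect>
            <aspect value="price" quantity="1">
                <argument id="4671-1">
                    <claim>Lower the price of transportation</claim>
                    <connector category="cause" subcategory="reason" intent="support">because</connector>
                    <premise>it is very expensive</premise>
                </argument>
            </aspect>
        </topic>
    </topics>
</recommendations>"""


def arguments_by_aspect(xml_text: str) -> dict:
    """Map each aspect value to the list of arguments recommended for it."""
    root = ET.fromstring(xml_text)
    result = {}
    for aspect in root.iter("aspect"):
        arguments = []
        for argument in aspect.findall("argument"):
            connector = argument.find("connector")
            arguments.append({
                "id": argument.get("id"),
                "claim": argument.findtext("claim"),
                "connector": connector.text,
                "intent": connector.get("intent"),
                "premise": argument.findtext("premise"),
            })
        result[aspect.get("value")] = arguments
    return result


grouped = arguments_by_aspect(SAMPLE)
```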

The other results (in file form) of the argument extractor and recommender system can be viewed here.

Dependencies

The implemented solutions depend on or make use of the following libraries and .jar files:

Documentation

Please read the contributing and code of conduct documentation.

Authors

Created on Apr 10, 2021
Created by:

License

This project is licensed under the terms of the Apache License 2.0.

Acknowledgements

This work was supported by the Spanish Ministry of Science and Innovation (PID2019-108965GB-I00).

arg-miner's People

Contributors

ansegura7
