
cloud-usage's Introduction





ASReview: Active learning for Systematic Reviews

Systematically screening large amounts of textual data is time-consuming and often tiresome. The rapidly evolving field of Artificial Intelligence (AI) has allowed the development of AI-aided pipelines that assist in finding relevant texts for search tasks. A well-established approach to increasing efficiency is screening prioritization via Active Learning.

The Active learning for Systematic Reviews (ASReview) project, published in Nature Machine Intelligence, implements different machine learning algorithms that interactively query the researcher. ASReview LAB is designed to accelerate the screening of textual data, so that a human has to read a minimum of records while missing no or very few relevant ones (false negatives). ASReview LAB saves time, increases the quality of output, and strengthens the transparency of work when screening large amounts of textual data to retrieve relevant information. Active learning can support decision-making in any discipline or industry.

ASReview software implements three different modes:

  • Oracle: Screen textual data in interaction with the active learning model. The reviewer is the 'oracle', making the labeling decisions.
  • Exploration: Explore or demonstrate ASReview LAB with a completely labeled dataset. This mode is suitable for teaching purposes.
  • Simulation: Evaluate the performance of active learning models on fully labeled data. Simulations can be run in ASReview LAB or via the command line interface with more advanced options.

Installation

The ASReview software requires Python 3.8 or later. Detailed step-by-step instructions to install Python and ASReview are available for Windows and macOS users.

pip install asreview

Upgrade ASReview with the following command:

pip install --upgrade asreview

To install ASReview LAB with Docker, see Install with Docker.
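The Python requirement stated above can be checked before running pip. The following is a minimal sketch, not part of ASReview; the helper name meets_minimum is hypothetical.

```python
import sys


def meets_minimum(version=None, minimum=(3, 8)):
    """Return True if the given (or running) Python version is at least `minimum`.

    `version` is any sequence starting with (major, minor); it defaults to
    the interpreter that runs this script.
    """
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum():
        print("Python version is recent enough for ASReview.")
    else:
        print("ASReview requires Python 3.8 or later; please upgrade first.")
```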

How it works

ASReview LAB explained - animation

Getting started

Getting Started with ASReview LAB.

ASReview LAB

Citation

If you wish to cite the underlying methodology of the ASReview software, please use the following publication in Nature Machine Intelligence:

van de Schoot, R., de Bruin, J., Schram, R. et al. An open source machine learning framework for efficient and transparent systematic reviews. Nat Mach Intell 3, 125–133 (2021). https://doi.org/10.1038/s42256-020-00287-7

For citing the software, please refer to the specific release of the ASReview software on Zenodo https://doi.org/10.5281/zenodo.3345592. The menu on the right can be used to find the citation format of your preference.

For more scientific publications on the ASReview software, go to asreview.ai/papers.

Contact

For an overview of the team working on ASReview, see ASReview Research Team. ASReview LAB is maintained by Jonathan de Bruin and Yongchao Terry Ma.

The best resources to find an answer to your question or ways to get in contact with the team are:


License

The ASReview software has an Apache 2.0 LICENSE. The ASReview team accepts no responsibility or liability for the use of the ASReview tool or any direct or indirect damages arising out of the application of the tool.

cloud-usage's People

Contributors

abelsiqueira, pre-commit-ci[bot], rensvandeschoot, zoneout215


cloud-usage's Issues

When I put more than 3 datasets into the K8s, it ends up with an error


To Reproduce
I followed README.md with 4 datasets from Synergy-datasets in the data folder: van de Schoot, Leenaars, Hall, and Menon.
This leads to errors in all of the workers:

Received 'asreview simulate data/Hall_2012.csv -s output/simulation/Hall_2012/state_files/sim_Hall_2012_4905.asreview --prior_record_id 4905 231 5214 36 4214 788 8582 7303 3880 4806 7253 --seed 165\n'
Traceback (most recent call last):
  File "/root/.local/bin/asreview", line 8, in <module>
    sys.exit(main())
  File "/root/.local/lib/python3.8/site-packages/asreview/__main__.py", line 48, in main
    entry.load()().execute(sys.argv[2:])
  File "/root/.local/lib/python3.8/site-packages/asreview/entry_points/simulate.py", line 137, in execute
    raise ProjectExistsError("Project already exists.")
asreview.project.ProjectExistsError: Project already exists.
Traceback (most recent call last):
  File "/app/worker-receiver.py", line 53, in <module>
    worker = Worker()
  File "/app/worker-receiver.py", line 37, in __init__
    channel.start_consuming()
  File "/usr/local/lib/python3.8/site-packages/pika/adapters/blocking_connection.py", line 1883, in start_consuming
    self._process_data_events(time_limit=None)
  File "/usr/local/lib/python3.8/site-packages/pika/adapters/blocking_connection.py", line 2044, in _process_data_events
    self.connection.process_data_events(time_limit=time_limit)
  File "/usr/local/lib/python3.8/site-packages/pika/adapters/blocking_connection.py", line 851, in process_data_events
    self._dispatch_channel_events()
  File "/usr/local/lib/python3.8/site-packages/pika/adapters/blocking_connection.py", line 567, in _dispatch_channel_events
    impl_channel._get_cookie()._dispatch_events()
  File "/usr/local/lib/python3.8/site-packages/pika/adapters/blocking_connection.py", line 1510, in _dispatch_events
    consumer_info.on_message_callback(self, evt.method,
  File "/app/worker-receiver.py", line 41, in callback
    subprocess.run(
  File "/usr/local/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'asreview simulate data/Hall_2012.csv -s output/simulation/Hall_2012/state_files/sim_Hall_2012_4905.asreview --prior_record_id 4905 231 5214 36 4214 788 8582 7303 3880 4806 7253 --seed 165

Additional information

  • OS: Mac M1
  • ASReview version ..2
  • CPU limit in worker.yml is 500
  • Memory limit in worker.yml is 552
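The traceback above suggests two separate effects: the simulate command fails because its state file already exists, and the resulting CalledProcessError is not caught, so the worker stops consuming. One possible guard is sketched below. It is not the actual worker-receiver.py; the function names state_file_of and run_job are hypothetical, and only the -s flag seen in the logged command is assumed.

```python
import shlex
import subprocess
from pathlib import Path


def state_file_of(command):
    """Return the path that follows the -s flag in a command string, or None."""
    args = shlex.split(command)
    if "-s" in args and args.index("-s") + 1 < len(args):
        return Path(args[args.index("-s") + 1])
    return None


def run_job(command, runner=subprocess.run):
    """Run one queued command and return 'skipped', 'ok', or 'failed'.

    A job whose state file already exists is skipped instead of re-run,
    and a failing command is reported instead of crashing the worker.
    """
    state_file = state_file_of(command)
    if state_file is not None and state_file.exists():
        return "skipped"
    try:
        runner(shlex.split(command), check=True)
    except subprocess.CalledProcessError:
        return "failed"
    return "ok"
```

Whether skipping is the right reaction depends on why the state file exists. If it is left over from an interrupted run, deleting it and re-running may be more appropriate than skipping.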

Fix limitation of one data file

Currently, the data folder can hold only one data file (at least with Makita's ARFI template), because the mkdir for the second file runs after the simulate for the first file.
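One way to avoid this ordering problem is to move every mkdir command ahead of the simulate commands before the jobs are sent out. The sketch below assumes the generated jobs are a plain list of shell command lines, one per line; that layout and the function name mkdirs_first are assumptions, not taken from the repository.

```python
def mkdirs_first(lines):
    """Reorder command lines so that all mkdir commands come first.

    The relative order inside each group is kept, and blank lines are dropped.
    """
    commands = [line for line in lines if line.strip()]
    mkdirs = [line for line in commands if line.split()[0] == "mkdir"]
    others = [line for line in commands if line.split()[0] != "mkdir"]
    return mkdirs + others


if __name__ == "__main__":
    jobs = [
        "mkdir -p output/simulation/A",
        "asreview simulate data/A.csv -s output/simulation/A/sim_A.asreview",
        "mkdir -p output/simulation/B",
        "asreview simulate data/B.csv -s output/simulation/B/sim_B.asreview",
    ]
    for line in mkdirs_first(jobs):
        print(line)
```

With the directories created up front, the order in which workers pick up the simulate commands no longer matters for directory creation.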

During minikube start there is an Error on Linux, should I worry or not?

After minikube start I get this error, but everything works fine afterwards.

m/rabbitmq/cluster-operator/releases/latest/download/cluster-operator.yml"
😄 minikube v1.30.1 on Ubuntu 20.04 (amd64)
✨ Automatically selected the docker driver. Other choices: ssh, none
📌 Using Docker driver with root privileges
👍 Starting control plane node minikube in cluster minikube
🚜 Pulling base image ...
🔥 Creating docker container (CPUs=2, Memory=16000MB) ...| E0528 08:15:17.810888 3192134 network_create.go:102] failed to find free subnet for docker network minikube after 20 attempts: no free private network subnets found with given parameters (start: "192.168.49.0", step: 9, tries: 20)

❗ Unable to create dedicated network, this might result in cluster IP change after restart: un-retryable: no free private network subnets found with given parameters (start: "192.168.49.0", step: 9, tries: 20)

🐳 Preparing Kubernetes v1.26.3 on Docker 23.0.2 ...| E0528 08:15:28.094721 3192134 start.go:131] Unable to get host IP: network inspect: docker network inspect minikube --format "{"Name": "{{.Name}}","Driver": "{{.Driver}}","Subnet": "{{range .IPAM.Config}}{{.Subnet}}{{end}}","Gateway": "{{range .IPAM.Config}}{{.Gateway}}{{end}}","MTU": {{if (index .Options "com.docker.network.driver.mtu")}}{{(index .Options "com.docker.network.driver.mtu")}}{{else}}0{{end}}, "ContainerIPs": [{{range $k,$v := .Containers }}"{{$v.IPv4Address}}",{{end}}]}": exit status 1
stdout:

stderr:
Error response from daemon: network minikube not found
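The log reports that no free subnet was found starting at 192.168.49.0 with step 9 and 20 tries. To see which candidates collide with networks already in use on the host, the probing can be approximated as below. This is a rough sketch, not minikube's implementation: it assumes each candidate is a /24 and that the step advances the third octet, and the function names are hypothetical. The in-use networks have to be collected separately, for example from the subnets of existing Docker networks.

```python
import ipaddress


def candidate_subnets(start="192.168.49.0", step=9, tries=20, prefix=24):
    """List the subnets probed under the assumptions stated above."""
    first = ipaddress.ip_network(f"{start}/{prefix}")
    base = int(first.network_address)
    size = first.num_addresses  # 256 addresses for a /24
    return [
        ipaddress.ip_network(f"{ipaddress.ip_address(base + i * step * size)}/{prefix}")
        for i in range(tries)
    ]


def free_subnets(in_use, candidates=None):
    """Return the candidates that do not overlap any network in `in_use`."""
    if candidates is None:
        candidates = candidate_subnets()
    used = [ipaddress.ip_network(net, strict=False) for net in in_use]
    return [c for c in candidates if not any(c.overlaps(u) for u in used)]


if __name__ == "__main__":
    # Hypothetical example: one wide network covering the whole 192.168.x.x range.
    print(free_subnets(["192.168.0.0/16"]))
```

If a single wide network on the host covers the whole probed range, every candidate overlaps it, which would be consistent with the message in the log.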

Workers do not receive messages from RabbitMQ, but all the pods are running

I ran kubectl apply -f tasker.yml, and the tasker logs show that all commands have been sent to the workers. However, all the workers show this output:

Logging as default_user_G08RHRdf8edvEBcNsM1
[*] Waiting for messages. CTRL+C to exit
^C

The size of the dataset is not a problem for the RabbitMQ heartbeat, because I have already run simulations via K8s on this data. I may have made a mistake in the K8s setup, so I would like to know how I can follow a message from the Tasker to a Worker, to make sure that every step works.
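One way to see where messages stop is to ask RabbitMQ how many messages are waiting in the queue and how many consumers are attached to it. The sketch below uses a passive queue declaration from the pika client, which reports these counts without creating or changing the queue. The host, queue name, and credentials are placeholders that must be taken from the actual tasker and worker configuration, and the function names are hypothetical.

```python
def queue_status(channel, queue_name):
    """Return ready-message and consumer counts for an existing queue.

    `channel` is an open pika channel. passive=True only inspects the
    queue and fails if it does not exist.
    """
    reply = channel.queue_declare(queue=queue_name, passive=True)
    return {
        "queue": queue_name,
        "ready": reply.method.message_count,
        "consumers": reply.method.consumer_count,
    }


def report(host, queue_name, user, password):
    """Open a connection, read the queue status, and close the connection."""
    import pika  # third-party client, already used by the worker

    credentials = pika.PlainCredentials(user, password)
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host=host, credentials=credentials)
    )
    try:
        return queue_status(connection.channel(), queue_name)
    finally:
        connection.close()
```

As a rough reading of the result: messages ready with zero consumers would suggest the workers are listening on a different queue or broker, while zero messages ready with consumers attached would suggest the tasker published elsewhere or the messages were already consumed.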
