Is your feature request related to a problem? Please describe.
The aggregated data used by our backend needs to be stored in MongoDB. Currently, we store the papers as PDFs and other information in TXT or CSV files, which the backend cannot read directly.
Describe the solution you'd like
Implement a module in this repository that connects to a local MongoDB instance and stores the paper data the same way as defined here.
The MongoDB collection and document models should always match those in the backend repository.
If there are features that are not yet implemented in this repository (e.g., institutions), ignore them for now, but provide a general interface so that they can be included later (e.g., via a callback function).
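One way the callback idea could look is sketched below. This is only an illustration: `build_paper_document`, `Extender`, and `add_institutions` are hypothetical names, and the field names (`title`, `year`, `venue`) are placeholders, since the real ones have to follow the backend models.

```python
from typing import Any, Callable, Dict, Iterable, Optional

# Hypothetical type: an extender receives the raw paper record and returns
# extra fields that are merged into the document.
Extender = Callable[[Dict[str, Any]], Dict[str, Any]]


def build_paper_document(
    paper: Dict[str, Any],
    extenders: Optional[Iterable[Extender]] = None,
) -> Dict[str, Any]:
    """Build a MongoDB-ready document from a raw paper record."""
    # Placeholder field names; the real ones must match the backend models.
    document = {
        "title": paper["title"],
        "year": paper["year"],
        "venue": paper.get("venue"),
    }
    # Each callback can contribute fields for features added later.
    for extend in extenders or ():
        document.update(extend(paper))
    return document


def add_institutions(paper: Dict[str, Any]) -> Dict[str, Any]:
    """Example extender for a feature that does not exist yet."""
    return {"institutions": paper.get("institutions", [])}


paper = {"title": "An Example Paper", "year": 2021, "venue": "ACL"}
print(build_paper_document(paper))
print(build_paper_document(paper, extenders=[add_institutions]))
```

With this shape, the storage code does not need to change when institutions are implemented; a new extender is passed in instead.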
There should be an environment variable that lets the CLI connect to an online MongoDB instance for later deployment. We won't use it now, but it will be important later.
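A minimal sketch of how the connection string could be resolved, assuming a variable named `NLPLAND_MONGODB_URI` (a made-up name for illustration; any name would work). When the variable is unset or empty, the function falls back to the standard local address.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name; not defined anywhere in the repository yet.
ENV_VAR = "NLPLAND_MONGODB_URI"
# Default address of a MongoDB server running on the local machine.
DEFAULT_URI = "mongodb://localhost:27017"


def resolve_mongo_uri(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the URI from the environment, or the local default."""
    env = os.environ if environ is None else environ
    # An unset or empty variable falls back to the local database.
    return env.get(ENV_VAR) or DEFAULT_URI


print(resolve_mongo_uri({}))
```

Passing the environment as an argument keeps the function easy to test without modifying `os.environ`.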
Additional context
For example, there could be a file created under nlpland/modules/database.py where we connect to a local database and store the paper information. For testing purposes, we can store a few papers from each conference and year.
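A rough sketch of what such a module might contain, using pymongo. The database name `nlpland`, the collection name `papers`, and the choice of `title` plus `year` as the lookup key are all assumptions made for illustration; the real names and keys have to follow the backend models.

```python
from typing import Any, Dict, Iterable


def get_collection(uri: str, db_name: str = "nlpland", collection_name: str = "papers"):
    """Return a pymongo collection. Database and collection names are placeholders."""
    # Imported lazily so store_papers can be tested without pymongo installed.
    from pymongo import MongoClient

    return MongoClient(uri)[db_name][collection_name]


def store_papers(collection: Any, papers: Iterable[Dict[str, Any]]) -> int:
    """Insert or update each paper and return how many were processed."""
    count = 0
    for paper in papers:
        # Assumed key: title and year. Upserting avoids duplicates on re-runs.
        key = {"title": paper["title"], "year": paper["year"]}
        collection.update_one(key, {"$set": paper}, upsert=True)
        count += 1
    return count
```

Usage could look like `store_papers(get_collection("mongodb://localhost:27017"), sample_papers)`, where `sample_papers` is a small list with a few papers per conference and year.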