📚 Import BibTeX publications and Jupyter Notebook blog posts into your Markdown website or book. Convert BibTeX into a Markdown website.

Home Page: https://docs.hugoblox.com/reference/content-types/#automatically-import-publications-from-bibtex

License: MIT License

Python 56.76% TeX 3.50% Makefile 0.46% Jupyter Notebook 39.28%

academic hugo cli bibtex gatsby nextjs jekyll biblatex latex markdown

academic-file-converter's Introduction

Academic File Converter

📚 Easily import publications and Jupyter notebooks to your Markdown-formatted website or book

Features

Import Jupyter notebooks as blog posts or book chapters
Import publications (such as books, conference proceedings, and journals) from your reference manager to your Markdown-formatted website or book
- Simply export a BibTeX file from your reference manager, such as Zotero, and provide this as the input to the converter tool
Compatible with all static website generators such as Next, Astro, Gatsby, Hugo, etc.
Easy to use - 100% Python, no dependency on complex software such as Pandoc
Automate file conversions using a GitHub Action

Community

📚 View the documentation below
💬 Chat live with the community on Discord
🐦 Twitter: @GetResearchDev @GeorgeCushen #MadeWithAcademic

❤️ Support Open Research & Open Source

We are on a mission to foster open research by developing open source tools like this.

To help us develop this open source software sustainably under the MIT license, we ask all individuals and businesses that use it to help support its ongoing maintenance and development via sponsorship and contributing.

Support the open research movement:

⭐️ Star this project on GitHub
❤️ Become a GitHub Sponsor and unlock perks
☕️ Donate a coffee
👩‍💻 Contribute

Installation

Open your Terminal or Command Prompt app and enter one of the installation commands below.

With Pipx

For the easiest installation, install with Pipx:

pipx install academic

Pipx will automatically install the required Python version for you in a dedicated environment.

With Pip

To install using Python's Pip tool, ensure you have Python 3.11+ installed and then run:

pip3 install -U academic

Usage

Open your Command Line or Terminal app and use the cd command to navigate to the folder containing the files you wish to convert, for example:

cd ~/Documents/my_website

Import publications

Download references from your reference manager, such as Zotero, in the BibTeX format.

Say we downloaded our publications to a file named my_publications.bib within the website folder. Let's import them into the content/publication/ folder:

academic import my_publications.bib content/publication/ --compact

Optional arguments:

--compact Generate minimal markdown without comments or empty keys
--overwrite Overwrite any existing publications in the output folder
--normalize Normalize tags by converting them to lowercase and capitalizing the first letter (e.g. "sciEnCE" -> "Science")
--featured Flag these publications as featured (to appear in your website's Featured Publications section)
--verbose or -v Show verbose messages
--help Help
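
The effect of --normalize can be illustrated in a few lines of Python. This is only a minimal sketch of the behavior described in the option list, not the converter's actual implementation; the function name normalize_tag is hypothetical.

```python
def normalize_tag(tag: str) -> str:
    """Lowercase a tag and capitalize its first letter."""
    tag = tag.strip()
    # str.capitalize() uppercases the first character and lowercases the rest.
    return tag.capitalize()


tags = ["sciEnCE", "machine LEARNING", "ai"]
normalized = [normalize_tag(t) for t in tags]
print(normalized)
```

Note that under this rule only the first letter of a multi-word tag is capitalized.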

Import full text and cover image

After importing publications, we suggest you:

Edit the Markdown body of each publication to add the full text directly to the page (if the publication is open access), or otherwise, to add supplementary notes for each publication
Add an image named featured to each publication's folder to visually represent your publication on the page and for sharing on social media
Add the publication PDF to each publication folder (for open access publications), to enable your website visitors to download your publication

Learn more in the Hugo Blox Docs.

Import blog posts from Jupyter Notebooks

Say we have our notebooks in a notebooks folder within the website folder. Let's import them into the content/post/ folder:

academic import 'notebooks/*.ipynb' content/post/ --verbose

Optional arguments:

--overwrite Overwrite any existing blog posts in the output folder
--verbose or -v Show verbose messages
--help Help

Contribute

Interested in contributing to open source and open research?

Learn how to contribute code on GitHub.

Check out the open issues and contribute a Pull Request.

For local development, clone this repository and use Poetry to install and run the converter using the following commands:

git clone https://github.com/GetRD/academic-file-converter.git
cd academic-file-converter
poetry install
poetry run academic import tests/data/article.bib output/publication/ --overwrite --compact
poetry run academic import 'tests/data/**/*.ipynb' output/post/ --overwrite --verbose

When preparing a contribution, run the following checks and ensure that they all pass:

Lint: make lint
Format: make format
Test: make test
Type check: make type

Help beta test the dev version

You can help test the latest development version by installing the latest main branch from GitHub:

pip3 install -U git+https://github.com/GetRD/academic-file-converter.git

License

Licensed under the MIT License.

academic-file-converter's People

andreas-h fredericloulergue christianbrodbeck laszewsk nimdvir liuq harvester57 demol-ee laurentperrinet ceandrade gobikannanp sjamieson matthewturk iiegn joe4dev mohannadcse rschmehl jayhesselberth highlando yng87 drt24 lsy3 darmajalayame panggilajanamaku astro-informatics markusdr astrochun jeffscottlevine bilalqtech aludwig25 danielschemmel flyn-org ttjpleizier mrustl rk-mlu qtm-iisc arichardson yongjiguan sekisekula allansp84 halexand minghao2016 akademiekonometri aiclouds dilipsinghal66 dpastorgalan vadimiljin trenchmortar solliolli gaybro8777 openuniversity nirvananimbusa ric-bianchi armfazh 7yl4r bbest jacquesdurden quantumpl iowaguy dyurovsky stevenmayher edu2all ostaadt haipinglu bidouilles ustab bearpaw website-templete-inspiration chandreshiit cmssliangxin ionizing skdsusit reveurmichael aruznieto lirenfeng2018 tlml jgieseler bdai6 ranulfobezerra szarnyasg griseljimenez zichengc20 sebakorb aleksiknuutila opmorgan danieledipompeo rufai5 dsachar snel-repo hugopicanco amarmandavia djungarian-infoseeker jihaechoi dev-random-tech leejedev monellz

academic-file-converter's Issues

volume and number are missing in MLA and APA with Journal publications

If you look at the exported BibTeX of a journal publication type, the volume and number are listed as

number = {3},
volume = {12},

(the numbers are just sample data of course)
but it is not added in the index.md!
This info should be in the publication list of Academic for both MLA and APA style.
Is it possible to include this info in the index.md files and view them on the generated Hugo Academic website?
Thanks for any updates.

import assets: include MathJax & Reveal extensions

Problem, as defined in #13:

[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (MathMenu.js, line 0)
[Error] Refused to execute http://localhost:1313/extensions/MathMenu.js?V=2.7.4 as script because "X-Content-Type: nosniff" was given and its Content-Type is not a script MIME type.
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (MathZoom.js, line 0)
[Error] Refused to execute http://localhost:1313/extensions/MathZoom.js?V=2.7.4 as script because "X-Content-Type: nosniff" was given and its Content-Type is not a script MIME type.

Similar situation with other dynamically loaded scripts such as Reveal extensions.

Newline character needed

Hello all,

Every time I update/clean install this Python package, I have to manually add a newline character \n at line https://github.com/sourcethemes/academic-admin/blob/e144c4fe6e4df3357764cfb37481343d1ff91cc9/academic/cli.py#L334

My code looks like f.write(source_file.read() + '\n')

Otherwise, some comments at the beginning of the CSS and JS files are not properly escaped, thus ending in a dysfunctional main.min.js script, and in errors in the Console tab of Developer Tools in Chrome/Firefox.
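
The workaround described above can be sketched as follows. This is a hypothetical helper, not the code in cli.py: it assumes the asset files are concatenated as strings and simply guarantees that each one ends with a newline, so that a trailing single-line comment cannot swallow the start of the next file.

```python
def concatenate_sources(sources: list[str]) -> str:
    """Join file contents, making sure each one ends with a newline first."""
    parts = []
    for text in sources:
        if not text.endswith("\n"):
            text += "\n"
        parts.append(text)
    return "".join(parts)


# The first file ends with a single-line comment and no trailing newline.
gmaps = "var g = 1;\n//# sourceMappingURL=gmaps.min.js.map"
other = "/* @preserve */\nvar h = 2;\n"

bundle = concatenate_sources([gmaps, other])
print(bundle)
```

Appending the newline only when it is missing avoids piling up blank lines between files that already end cleanly.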

Am I the only one having this issue ?

Thanks !

Unable to install

Hi,

First, thumbs up for what looks like a great tool. Unfortunately, I didn't succeed with the installation. I have python3 set up and ran pip3 install -U academic successfully, but trying to import returns the following error: academic: command not found

Sorry to bother you with probably a trivial question, but what did I do wrong (Academic was installed with git, hugo is installed version v0.50)?

Improve CI in GitHub Action

Building upon #49 where the GitHub Action for CI was created.

See workflow at https://github.com/wowchemy/hugo-academic-cli/actions

Todo:

Formatting (Isort and Black) will not currently provide errors, so requires reading the log, even if workflow is successful. Auto commit these formatting improvements in CI if there are formatting changes.
Complete implementation of code coverage (initially with low coverage threshold)
Fix the 2 format flow warnings below:

/home/runner/.local/share/virtualenvs/hugo-academic-cli-jnkQB8Ka/lib/python3.8/site-packages/isort/main.py:914: UserWarning: W0501: The following deprecated CLI flags were used and ignored: --recursive!
  warn(
/home/runner/.local/share/virtualenvs/hugo-academic-cli-jnkQB8Ka/lib/python3.8/site-packages/isort/main.py:918: UserWarning: W0500: Please see the 5.0.0 Upgrade guide: https://pycqa.github.io/isort/docs/upgrade_guides/5.0.0/
  warn(

Italics are stripped from titles

I have a bibtex entry from my bibtex repository that looks like this:

@article{kamvar2018something,
  title = {Something in the agar does not compute: on the discriminatory power of mycelial compatibility in {\textit{Sclerotinia sclerotiorum}}},
  volume = {44},
  issn = {1983-2052},
  url = {https://doi.org/10.1007/s40858-018-0263-8},
  doi = {10.1007/s40858-018-0263-8},
  number = {1},
  journal = {Tropical Plant Pathology},
  publisher = {Springer Science and Business Media LLC},
  author = {Kamvar, Zhian N. and Everhart, Sydney E.},
  year = {2018},
  month = {Nov},
  pages = {32–40}
}

However, when I try to import it with academic, the curly braces get stripped away and the species name goes from Sclerotinia sclerotiorum to \textitSclerotinia sclerotiorum:

@article{kamvar2018something,
 author = {Kamvar, Zhian N. and Everhart, Sydney E.},
 doi = {10.1007/s40858-018-0263-8},
 issn = {1983-2052},
 journal = {Tropical Plant Pathology},
 month = {Nov},
 number = {1},
 pages = {32–40},
 publisher = {Springer Science and Business Media LLC},
 title = {Something in the agar does not compute: on the discriminatory power of mycelial compatibility in \textitSclerotinia sclerotiorum},
 url = {https://doi.org/10.1007/s40858-018-0263-8},
 volume = {44},
 year = {2018}
}

Is there a way to get the bibtex parser to replace common tags like textit, textbf, and textsc with their html equivalents?
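
One possible approach is sketched below. It is not part of the converter: the tag mapping and the function name latex_text_to_html are assumptions, and it only works if it runs on the raw field, before the parser has stripped the braces.

```python
import re

# Assumed mapping from LaTeX text commands to HTML open/close tags.
LATEX_TO_HTML = {
    "textit": ("<em>", "</em>"),
    "textbf": ("<strong>", "</strong>"),
    "textsc": ('<span style="font-variant: small-caps">', "</span>"),
}

# Matches \textit{...}, \textbf{...} or \textsc{...} whose argument has no nested braces.
_COMMAND = re.compile(r"\\(textit|textbf|textsc)\{([^{}]*)\}")


def _to_html(match: re.Match) -> str:
    open_tag, close_tag = LATEX_TO_HTML[match.group(1)]
    return f"{open_tag}{match.group(2)}{close_tag}"


def latex_text_to_html(s: str) -> str:
    """Replace common LaTeX text commands with HTML, then drop leftover braces."""
    previous = None
    # Repeat until nothing changes so that nested commands are resolved inside-out.
    while previous != s:
        previous = s
        s = _COMMAND.sub(_to_html, s)
    return s.replace("{", "").replace("}", "")


title = r"mycelial compatibility in {\textit{Sclerotinia sclerotiorum}}"
print(latex_text_to_html(title))
```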

When importing thesis metadata, also import whether it is a B.S., M.S., or Ph.D. thesis

So, when using academic import to import Thesis metadata, I would love for the importer to also distinguish between B.S., M.S. or Ph.D. Thesis (which is already part of the metadata in my references).

Alternatively, which metadata does academic import look for when populating the "publication" section? Maybe I could add the metadata there. Ultimately, I would love the information to appear here:

When BibTeX type is book, set Markdown `publication` to BibTeX `publisher`

The key publisher is not converted into the Markdown key publication when the BibTeX entry type is @book but it works fine when the entry type is an @article.
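
The requested behavior could look roughly like the sketch below. The function name publication_name is hypothetical, and the entry is assumed to be a plain dict with an ENTRYTYPE key plus lowercase field names; the fallback order is an illustration, not the converter's actual logic.

```python
def publication_name(entry: dict) -> str:
    """Pick the value for the Markdown `publication` key from a BibTeX entry."""
    entry_type = entry.get("ENTRYTYPE", "").lower()
    if entry_type == "book":
        # Books have no journal, so fall back to the publisher.
        return entry.get("publisher", "")
    # Other types: prefer the journal, then the book title of the proceedings.
    return entry.get("journal") or entry.get("booktitle", "")


book = {"ENTRYTYPE": "book", "title": "Some Book", "publisher": "Springer"}
article = {"ENTRYTYPE": "article", "journal": "Tropical Plant Pathology"}
print(publication_name(book))
print(publication_name(article))
```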

TypeError: combining() argument must be a unicode character, not str

Python 3.8.
Most recent version of academic-admin

PS C:\Users\Brian\Documents\GitHub\me> academic import --bibtex 'C:\Users\Brian\Downloads\MyLibrary.bib'
Traceback (most recent call last):
  File "c:\python38\lib\runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\python38\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Python38\Scripts\academic.exe\__main__.py", line 9, in <module>
  File "c:\python38\lib\site-packages\academic\cli.py", line 50, in main
    parse_args(sys.argv[1:])  # Strip command name, leave just args.
  File "c:\python38\lib\site-packages\academic\cli.py", line 104, in parse_args
    import_bibtex(known_args.bibtex,
  File "c:\python38\lib\site-packages\academic\cli.py", line 126, in import_bibtex
    bib_database = bibtexparser.load(bibtex_file, parser=parser)
  File "c:\python38\lib\site-packages\bibtexparser\__init__.py", line 71, in load
    return parser.parse_file(bibtex_file)
  File "c:\python38\lib\site-packages\bibtexparser\bparser.py", line 177, in parse_file
    return self.parse(file.read(), partial=partial)
  File "c:\python38\lib\site-packages\bibtexparser\bparser.py", line 155, in parse
    self._expr.parseFile(bibtex_file_obj)
  File "c:\python38\lib\site-packages\bibtexparser\bibtexexpression.py", line 286, in parseFile
    return self.main_expression.parseFile(file_obj, parseAll=True)
  File "c:\python38\lib\site-packages\pyparsing.py", line 2561, in parseFile
    return self.parseString(file_contents, parseAll)
  File "c:\python38\lib\site-packages\pyparsing.py", line 1935, in parseString
    loc, tokens = self._parse(instring, 0)
  File "c:\python38\lib\site-packages\pyparsing.py", line 1675, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "c:\python38\lib\site-packages\pyparsing.py", line 4762, in parseImpl
    return super(ZeroOrMore, self).parseImpl(instring, loc, doActions)
  File "c:\python38\lib\site-packages\pyparsing.py", line 4688, in parseImpl
    loc, tmptokens = self_expr_parse(instring, preloc, doActions)
  File "c:\python38\lib\site-packages\pyparsing.py", line 1675, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "c:\python38\lib\site-packages\pyparsing.py", line 4235, in parseImpl
    ret = e._parse(instring, loc, doActions)
  File "c:\python38\lib\site-packages\pyparsing.py", line 1708, in _parseNoCache
    tokens = fn(instring, tokensStart, retTokens)
  File "c:\python38\lib\site-packages\pyparsing.py", line 1314, in wrapper
    ret = func(*args[limit[0]:])
  File "c:\python38\lib\site-packages\bibtexparser\bparser.py", line 203, in <lambda>
    lambda s, l, t: self._add_entry(
  File "c:\python38\lib\site-packages\bibtexparser\bparser.py", line 299, in _add_entry
    d = self.customization(d)
  File "c:\python38\lib\site-packages\bibtexparser\customization.py", line 508, in convert_to_unicode
    record[val] = latex_to_unicode(record[val])
  File "c:\python38\lib\site-packages\bibtexparser\latexenc.py", line 67, in latex_to_unicode
    string = _replace_all_latex(string, itertools.chain(
  File "c:\python38\lib\site-packages\bibtexparser\latexenc.py", line 55, in _replace_all_latex
    string = _replace_latex(string, l.rstrip(), u)
  File "c:\python38\lib\site-packages\bibtexparser\latexenc.py", line 37, in _replace_latex
    if unicodedata.combining(unicod):
TypeError: combining() argument must be a unicode character, not str

Bibtexparser bug: 'str' object is not callable

I've had no luck with the publication import feature, so far. Running
academic import --bibtex file.bib
results in a
TypeError: 'str' object is not callable (with a long traceback), even when file.bib is as simple as the following:

@article{dummy,
	author = {T. Author},
	journal = {J. of Sci.},
	pages = {25--33},
	title = {Some short title},
	volume = {116},
	year = {2012}
}

import --assets does not include webfonts

I've successfully run academic import --assets and this downloads some assets and makes necessary changes to the project.

I've found two small bugs afterwards:

Headers of the original files don't seem to get converted properly. In my case it was on line 111 of main.min.js I found:

//# sourceMappingURL=gmaps.min.js.map/* @preserve

Which I had to convert to:

//# sourceMappingURL=gmaps.min.js.map
/* @preserve

When that was done, I did end up with more errors when loading the page concerning missing files:

[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (index.json, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (bootstrap.min.css.map, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (instantsearch.min.js.map, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-brands-400.woff2, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (academicons.ttf, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-solid-900.woff2, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-regular-400.woff2, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-solid-900.woff, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (academicons.woff, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-regular-400.woff, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-brands-400.woff, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-solid-900.ttf, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (academicons.svg, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-brands-400.ttf, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-regular-400.ttf, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-brands-400.svg, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-regular-400.svg, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (fa-solid-900.svg, line 0)
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (MathMenu.js, line 0)
[Error] Refused to execute http://localhost:1313/extensions/MathMenu.js?V=2.7.4 as script because "X-Content-Type: nosniff" was given and its Content-Type is not a script MIME type.
[Error] Failed to load resource: the server responded with a status of 404 (Not Found) (MathZoom.js, line 0)
[Error] Refused to execute http://localhost:1313/extensions/MathZoom.js?V=2.7.4 as script because "X-Content-Type: nosniff" was given and its Content-Type is not a script MIME type.

New release?

Hi, first of all thank you for this incredible template and these scripts.

I would suggest pushing a new release to PyPI, since the current one has some bugs with publications.
For example, the cite button does not work.
The latest available release saves the .bib file under the full cite key, while Academic expects only cite.bib.
The code on master fixes this, but it's not available on PyPI.

Example

Parsing entry Author2019
Creating folder content/publication/author-2019
Saving citation to content/publication/author-2019/author-2019.bib
Saving Markdown to 'content/publication/author-2019/index.md'

Can you create a new release?

Remove dependency on Hugo

Background

Merging of #68 introduced a dependency on Hugo by using the hugo new command.

Users have reported that the Academic tool can now be more difficult to use given the dependencies that are needed to run the tool locally.

Make Hugo an optional requirement again so that users can continue to generate publication files locally and upload them to GitHub without needing to install Hugo and its dependencies locally. This change would provide a portable solution, making it easier to also use the Academic CLI with other static site generators.

Proposal

Based on #73 (comment)

Python 3.8.5 - parsing issues

Hi! I am getting the following error when trying to use academic to import a bibtex file with publications into my blog:

TypeError: combining() argument must be a unicode character, not str

I have previously managed to use the tool, can this be related to updates in python?

import assets: download fonts too

Add FA fonts
Add Academicons fonts

Local assets do not seem to be considered

Hello,
I'm delighted to find out there is a possibility to use the academic theme without having to give away the privacy of my visitors.
However, after installing academic-admin and running academic import --assets, my extension Privacy Badger tells me there still are trackers on my website.
What could have gone wrong?
Thanks

Error when checking for hugo in docker

I'm trying to use Academic to import my publications from a bibtex file but it turns out that the latest addition to check for a docker file breaks the script for me.

I get the following error:

ERROR:
        Can't find a suitable configuration file in this directory or any
        parent. Are you in the right directory?

        Supported filenames: docker-compose.yml, docker-compose.yaml

Traceback (most recent call last):
  File "C:\ProgramData\Miniconda3\Scripts\academic-script.py", line 9, in <module>
    sys.exit(main())
  File "C:\ProgramData\Miniconda3\lib\site-packages\academic\cli.py", line 43, in main
    parse_args(sys.argv[1:])  # Strip command name, leave just args.
  File "C:\ProgramData\Miniconda3\lib\site-packages\academic\cli.py", line 101, in parse_args
    import_bibtex(
  File "C:\ProgramData\Miniconda3\lib\site-packages\academic\import_bibtex.py", line 55, in import_bibtex
    parse_bibtex_entry(
  File "C:\ProgramData\Miniconda3\lib\site-packages\academic\import_bibtex.py", line 100, in parse_bibtex_entry
    page.load("index.md")
  File "C:\ProgramData\Miniconda3\lib\site-packages\academic\editFM.py", line 23, in load
    file = open(self.path, "r").readlines()
FileNotFoundError: [Errno 2] No such file or directory: 'content\\publication\\hill-life-2020\\index.md'

I am using Miniconda as my Python environment and I do have Docker installed and running in the background, but I do not have a docker-compose.yml file in the current directory that I am working in. I am running on Windows 10.

Commenting out the lines 8 - 11 in the function hugo_in_docker_or_local() in utils.py and just leaving hugo = "" allows the script to work for me.

Migrate config/params TOML to YAML

This may not belong here, but I've found myself having to manually diff changes between config.toml and params.toml between Wowchemy versions.

I'm not sure what the failure modes might be, but adding new config/params options anytime those new changes are made would be good, I think?

The use of migrating from TOML to YAML is that ruamel.yaml supports round-trip comments. I don't believe the toml pypi package supports that.

Add support for Biber - different date format

Hi,

Recently started using HUGO and the academic theme, so far so good!

However, I've hit a mini-hurdle when trying to use your script to convert a .bib to many .bibs for my publications record.

Essentially, I use biblatex. It seems the only real problem this causes is parsing months. In biblatex, months are already a 2-digit number, e.g. 04 rather than apr.

Would it be worth adding an extra level of robustness to your python script to parse numeric months?
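
A tolerant month parser could be sketched as follows. This is not the code from the fork mentioned in this issue; parse_month is a hypothetical helper that accepts both BibTeX-style names (apr, April) and biblatex-style numbers (04).

```python
from typing import Optional

_MONTH_ABBREVIATIONS = [
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec",
]
# Map "jan" -> 1, ..., "dec" -> 12.
_MONTHS = {name: number for number, name in enumerate(_MONTH_ABBREVIATIONS, start=1)}


def parse_month(value: str) -> Optional[int]:
    """Return the month number (1-12), or None if the value is not recognized."""
    text = value.strip().strip('{}"').lower()
    if text.isdigit():
        number = int(text)
        return number if 1 <= number <= 12 else None
    # Compare only the first three letters so "april" and "apr" both match.
    return _MONTHS.get(text[:3])


for raw in ["04", "apr", "Nov", "{7}", "13"]:
    print(raw, "->", parse_month(raw))
```

Returning None for unknown values lets the caller decide whether to skip the month or fall back to a default date.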

I've made the changes to my fork and will create a pull request referencing this issue for you to decide on.

Thanks,

Chas

AttributeError: 'NoneType' object has no attribute 'info'

Just getting familiar with Academic and academic-admin to consider using to update my academic website. It all looks great!

But I'm having some issues importing my publications from a bib file. For my current website I do this using bib2html, with some bespoke scripts. academic-admin looks like a much better way to do this so I'm keen to switch, but it doesn't seem to be working for me at present. Not sure if I have installed correctly(?), although I did follow the instructions on the website.

Here is the traceback when running inside a cloned version of academic:

> academic import --bibtex mybibs.bib -v
08:32:38PM DEBUG: Found Entry: ['article', 'id', {'YEAR': '2019', 'JOURNAL': 'Journal', 'TITLE': 'Title here', 'AUTHOR': 'A. N. Other'}]
08:32:38PM DEBUG: Apply customizations and return dict
Traceback (most recent call last):
  File "/usr/local/bin/academic", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/academic/cli.py", line 94, in main
    import_bibtex(args.bibtex, pub_dir=args.publication_dir, featured=args.featured, overwrite=args.overwrite, normalize=args.normalize)
  File "/usr/local/lib/python3.7/site-packages/academic/cli.py", line 111, in import_bibtex
    parse_bibtex_entry(entry, pub_dir=pub_dir, featured=featured, overwrite=overwrite, normalize=normalize)
  File "/usr/local/lib/python3.7/site-packages/academic/cli.py", line 116, in parse_bibtex_entry
    log.info(f"Parsing entry {entry['ID']}")
AttributeError: 'NoneType' object has no attribute 'info'

The test bib file simply contains

@article{id,
    AUTHOR      = "A. N. Other",
    TITLE       = "Title here",
    JOURNAL     = "Journal",
    YEAR        = "2019"
}

I'm running Python 3.7 (installed via anaconda) on Mac OS X Mojave.

import assets: add source maps

Problem defined in #13 . Specifically, at least gmaps.min.js.map should be included.

Add tests and CI with Github Actions

Add flake8 and pytest framework for testing.
Add at least one test for importing BibTeX with a test bib file.
Setup CI with Github Actions

hugo command usage in BibTeX import does not check cmd return

Usage of hugo with subprocess should ideally wait for process completion and check for errors. This would eliminate the sleep cruft there too.

https://github.com/wowchemy/hugo-academic-cli/blob/93acb29fd7833d3932868bf96875f284a4dabcb5/academic/import_bibtex.py#L80-L82

Currently when the hugo command fails to create the index.md (if hugo isn't installed for example) an error without this needed info is thrown: FileNotFoundError: [Errno 2] No such file or directory: 'content/publication/.../index.md'.

It would be better to show the error from the hugo subprocess:

[tylar@tylar-pc www_marinebon2]$ python
Python 3.8.3 (default, May 17 2020, 18:15:42)
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> subprocess.call(f"{hugo} new {markdown_path} --kind publication", shell=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'hugo' is not defined
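
The suggestion above (wait for the subprocess and surface its error) can be sketched with subprocess.run. The helper name run_checked is hypothetical, and the demonstration uses the Python interpreter as a stand-in so that the example runs without Hugo installed.

```python
import subprocess
import sys


def run_checked(cmd: list) -> str:
    """Run a command, wait for it to finish, and raise with its stderr on failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except FileNotFoundError as error:
        # The executable itself is missing (e.g. hugo is not installed).
        raise RuntimeError(f"Command not found: {cmd[0]}") from error
    except subprocess.CalledProcessError as error:
        raise RuntimeError(
            f"{cmd[0]} exited with code {error.returncode}: {error.stderr.strip()}"
        ) from error
    return result.stdout


# For the case in this issue, the call would look something like:
#   run_checked(["hugo", "new", markdown_path, "--kind", "publication"])
out = run_checked([sys.executable, "-c", "print('ok')"])
print(out.strip())
```

Because subprocess.run blocks until the process exits, no sleep is needed before reading the generated file, and passing the arguments as a list avoids shell=True.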

Interpret \textsc{} and other commands when building publication lists

It is common to use commands such as \textsc{} in BibTeX entries. It would be nice if this were cleanly handled when academic-admin generates index pages from a BibTeX file, but I am not sure how this could be done.

adding shortjournal with biblatex

Biblatex uses shortjournal keyword which is useful. Could this be added using a command switch?

LF will be replaced by CRLF

When I run academic import --bibtex <path_to_your/publications.bib>, there is no error.
But when I then try to update my GitHub repository by running 'git add .', I get the following message:

warning: LF will be replaced by CRLF in content/publication/papers.bib.
The file will have its original line endings in your working directory

Please tell me more about this operation. How can I show my publications on the website using the academic Python library?

Thank you very much!

Incorrect parsing of unquoted months

An unquoted month field in a BibTeX entry causes a crash. For example:
@article{ZhengLocationBasedHandshake2017,
  title = {Location {{Based Handshake}} and {{Private Proximity Test}} with {{Location Tags}}},
  volume = {14},
  copyright = {All rights reserved},
  number = {4},
  journal = {IEEE Transactions on Dependable and Secure Computing},
  author = {Zheng, Yao and Li, Ming and Lou, Wenjing and Hou, Y. Thomas},
  month = jul,
  year = {2017},
  keywords = {journal},
  pages = {406--419},
  file = {/Users/yao/Cloud/Zotero/storage/ZD5TTQHE/Zheng et al. - 2017 - Location Based Handshake and Private Proximity Tes.pdf}
}
It causes
bibtexparser.bibdatabase.UndefinedString: 'jul'

It has an easy fix. Replace
parser = BibTexParser()
with
parser = BibTexParser(common_strings=True)

Import --assets is incomplete

I am trying to build a static site, that is truly independent from external resources. From the descriptions it sounds like this tool is intended to exactly do this.

However after downloading the css files, the website still downloads fonts from gstatic.com, and it still connects to identity.netlify.com (however this might be due to another reason, I guess). Also I got the issue that after performing the command all icons (presumably from font-awesome) stop working.

Therefore I think it would be good to clearly state the purpose of this tool in the documentation and what it does and what not. Better yet, it would be awesome if it actually would download all external dependencies and build a self-sustained site.

Invalid escape character in "index.md" (e.g for spanish characters)

Trying to render my website in hugo 0.52 (using the R package "blogdown") fails due to:

Error: Error building site: "content\publication\rn-915\index.md:4:1": unmarshal failed: 
Near line 3 (last key parsed 'authors'): invalid escape character 'a'; 
only the following escape characters are allowed: \b, \t, \n, \f, \r, \", \\, \uXXXX, 
and \UXXXXXXXX

E.g. the entry "RN915" of example.bib is translated by running academic import --bibtex "example.bib" (version 0.26) into the following "index.md" and "rn-915.bib" in the content/publication folder:
rn-915.zip

I attached more problematic publications in "example.bib" and my conda "environment.yml" file
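
The error message indicates that a backslash reached a TOML basic string unescaped. A minimal sketch of escaping a value before writing it to the front matter follows; the helper name toml_basic_string and the sample author string are made up for illustration, and the sketch covers only backslashes and double quotes.

```python
def toml_basic_string(value: str) -> str:
    """Quote a value as a TOML basic string, escaping backslashes and quotes."""
    # Escape backslashes first so the backslashes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


author = r"J. P\'erez and \aa"  # hypothetical value with leftover LaTeX commands
print("authors = [" + toml_basic_string(author) + "]")
```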

Add support for Google Scholar export URL as input source?

Is there any interest in supporting imports directly from Google Scholar BibTeX exports ?

`content/publication` to bibtex/CSL

So, I've been thinking about how moving towards using hugo new ... and Netlify CMS might mean that folks don't have citations, but are still adding publications.

Maybe, along with #48, this effort can move towards adding citations after a new publication has been added.

I don't think this idea's been fully fleshed out, but basically it's summarized with – having something like --export-bibtex which scrapes the contents of content/publication (or the like) and assembles a bibtex or JSON CSL as you alluded to in #48.

Failed to install

I tried to install the academic package as described in the tutorial, but I get this error:

ERROR: Cannot uninstall 'ruamel-yaml'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

How can I resolve this issue, please?

LaTeX commands in TITLE field (bibtex files) are not preserved when importing

After importing a bibtex file with the following entry:

@article {MR3750218,
    AUTHOR = {Alarc\'{o}n, Antonio and Forstneri\v{c}, Franc},
     TITLE = {Complete densely embedded complex lines in {$\Bbb C^2$}},
   JOURNAL = {Proc. Amer. Math. Soc.},
  FJOURNAL = {Proceedings of the American Mathematical Society},
    VOLUME = {146},
      YEAR = {2018},
    NUMBER = {3},
     PAGES = {1059--1067},
      ISSN = {0002-9939},
   MRCLASS = {32H02 (32E10 32M17 53C42)},
  MRNUMBER = {3750218},
MRREVIEWER = {Yu Kawakami},
       DOI = {10.1090/proc/13873},
       URL = {https://doi.org/10.1090/proc/13873},
}

I obtained the following markdown file for the publication entry

---
title: "Complete densely embedded complex lines in $Bbb C^2$"
date: 2018-01-01
publishDate: 2019-08-23T19:28:30.647940Z
authors: ["Antonio Alarcón", "Franc Forstnerič"]
publication_types: ["2"]
abstract: ""
featured: false
publication: "*Proc. Amer. Math. Soc.*"
url_pdf: "https://doi.org/10.1090/proc/13873"
doi: "10.1090/proc/13873"
---

Observe how the characters in the authors' names are correctly transformed, but the backslash of the LaTeX command \Bbb (for blackboard letters) is missing in the title. This produces a badly formatted title.

Compare the two images below. The first one is with "$Bbb C^2$" while the second one is "$\Bbb C^2$" (which is the correct one).

The character "\" inside $$ in the title should be doubled instead of removed. See, for instance, the warning at the end of the "Manually managing content" section of the Academic documentation.
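The doubling described above could be sketched as follows. This is only an illustration, not the converter's actual code: the helper name `escape_math_backslashes` is hypothetical, and it assumes the title still contains its backslashes when it reaches this step and that math is delimited by single dollar signs.

```python
import re


def escape_math_backslashes(title: str) -> str:
    """Double every backslash inside $...$ spans; leave other text untouched."""

    def _double(match: re.Match) -> str:
        return match.group(0).replace("\\", "\\\\")

    # Match inline math delimited by single dollar signs.
    return re.sub(r"\$[^$]*\$", _double, title)


title = r"Complete densely embedded complex lines in $\Bbb C^2$"
escaped = escape_math_backslashes(title)
print(escaped)
```

The doubled backslash is what a double-quoted front matter string needs so that a single backslash reaches the math renderer.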

BibTeX superscripts in math environment parsed wrong

When converting a bibtex file with a superscript somewhere, the curly brackets are not parsed. For example, if the title contains:

A $^{32}$S isotope

This ends up being:

A $^32$S isotope

In the clean_bibtex_str(s) routine on line 276 of cli.py these curly brackets would be replaced with nothing by the command in line 280:

s = s.replace("{", "").replace("}", "")

However, printing out s at the beginning of this routine shows that the title never arrives with a curly bracket in it. I am not sure whether this is a problem with bibtexparser or with academic-admin, but any fix would be very welcome. Alternatively, how would you work around such a title?
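One possible direction for the brace-stripping step is to leave math spans alone. The sketch below is an assumption-laden illustration: `strip_braces_outside_math` is a hypothetical helper, not part of cli.py, and since the report says the braces are already gone before clean_bibtex_str runs, it would only help once the braces survive parsing.

```python
import re


def strip_braces_outside_math(s: str) -> str:
    """Remove { and } everywhere except inside $...$ spans."""
    # Splitting on a capturing group keeps the math spans at odd indices.
    parts = re.split(r"(\$[^$]*\$)", s)
    return "".join(
        part if i % 2 == 1 else part.replace("{", "").replace("}", "")
        for i, part in enumerate(parts)
    )


cleaned = strip_braces_outside_math("A $^{32}$S {isotope}")
print(cleaned)
```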

Journal newcommand shorthands and extra tilde

For astronomy publications, the BibTeX files often contain tilde in author names and the journal is not spelled out (often using a \newcommand). For example see below:

@ARTICLE{2016ApJS..226....5L,
   author = {{Ly}, C. and {Malhotra}, S. and {Malkan}, M.~A. and {Rigby}, J.~R. and 
        {Kashikawa}, N. and {de los Reyes}, M.~A. and {Rhoads}, J.~E.
        },
    title = "{The Metal Abundances across Cosmic Time (MACT) Survey. I. Optical Spectroscopy in the Subaru Deep Field}",
  journal = {\apjs},
archivePrefix = "arXiv",
   eprint = {1602.01089},
 keywords = {galaxies: abundances, galaxies: distances and redshifts, galaxies: evolution, galaxies: ISM, galaxies: photometry, galaxies: star formation},
     year = 2016,
    month = sep,
   volume = 226,
      eid = {5},
    pages = {5},
      doi = {10.3847/0067-0049/226/1/5},
   adsurl = {https://ui.adsabs.harvard.edu/abs/2016ApJS..226....5L},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

In the above case, the journal shows up as "apjs" and the author list contains the extra tildes.

I think there are a number of ways to fix these issues, and perhaps it would make more sense to provide a scrubbed BibTeX file that is based on some third-party open-source python code. Thoughts?
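As one illustration of a scrubbing step, the sketch below expands a journal macro through a lookup table and replaces the LaTeX non-breaking-space tilde in author names. The table `JOURNAL_MACROS` and both helper names are hypothetical, and the table holds a single illustrative entry rather than a complete macro list.

```python
# Hypothetical lookup table; a real one would cover many more journal macros.
JOURNAL_MACROS = {
    r"\apjs": "The Astrophysical Journal Supplement Series",
}


def expand_journal(journal: str) -> str:
    """Return the full journal name for a known macro, else the input unchanged."""
    return JOURNAL_MACROS.get(journal.strip(), journal)


def clean_author(name: str) -> str:
    """Replace the LaTeX non-breaking space (~) and drop protective braces."""
    return name.replace("~", " ").replace("{", "").replace("}", "")


print(expand_journal(r"\apjs"))
print(clean_author("{Malkan}, M.~A."))
```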

Support custom publication types

Hi. How are you intending to deal with https://github.com/sourcethemes/academic-admin/blob/master/academic/cli.py#L115?
I can try to help!

academic: command not found

This is probably me being a numbnut, but the admin tools seem to install ok, but then I get "academic: command not found" if I try to use them?

I'm on Ubuntu 18.04.4 LTS

Tags are being imported with additional quotes

When the import is run, all tags are imported as '"Tag 1"', '"Tag2"' etc. The additional double quotes inside the single quotes cause issues when using the tags with the website.
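A minimal sketch of a cleanup step for this symptom, assuming the tags arrive as strings that still carry their own quote characters. The helper `clean_tag` is hypothetical and does not reflect how the importer currently builds its tag list.

```python
def clean_tag(tag: str) -> str:
    """Strip surrounding whitespace and any wrapping quote characters."""
    return tag.strip().strip("\"'").strip()


tags = ['"Tag 1"', '"Tag2"']
cleaned = [clean_tag(t) for t in tags]
print(cleaned)
```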

UnicodeDecodeError

I have several Chinese entries in my bib file (here's a demo: demo.zip). When I ran the import command on the demo bib file, I got the following UnicodeDecodeError:

D:\personalPage>academic import --bibtex content/publication/demo.bib
Traceback (most recent call last):
  File "d:\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Anaconda3\Scripts\academic.exe\__main__.py", line 9, in <module>
  File "d:\anaconda3\lib\site-packages\academic\cli.py", line 76, in main
    import_bibtex(args.bibtex, pub_dir=args.publication_dir, featured=args.featured, overwrite=args.overwrite)
  File "d:\anaconda3\lib\site-packages\academic\cli.py", line 91, in import_bibtex
    bib_database = bibtexparser.load(bibtex_file, parser=parser)
  File "d:\anaconda3\lib\site-packages\bibtexparser\__init__.py", line 71, in load
    return parser.parse_file(bibtex_file)
  File "d:\anaconda3\lib\site-packages\bibtexparser\bparser.py", line 165, in parse_file
    return self.parse(file.read(), partial=partial)
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa1 in position 56: illegal multibyte sequence

I am using Windows 10 and installed python 3 with Anaconda, by the way, and it works well for entries in English.

I'd appreciate any suggestions for resolving this issue. Thanks.
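The 'gbk' codec in the traceback suggests the file was opened with the platform's default encoding. A possible workaround, assuming the bib file is actually saved as UTF-8, is to pass the encoding explicitly when opening it. The helper `read_bibtex_text` below is hypothetical and only demonstrates the file-reading step.

```python
import os
import tempfile


def read_bibtex_text(path: str, encoding: str = "utf-8") -> str:
    """Read a BibTeX file with an explicit encoding instead of the OS default."""
    with open(path, encoding=encoding) as f:
        return f.read()


# Demo: write a small UTF-8 file with a Chinese title, then read it back.
sample = "@article{demo,\n  title = {中文标题},\n}\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bib")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    text = read_bibtex_text(path)

print(text)
```

With bibtexparser 1.x, the resulting string could then be handed to `bibtexparser.loads` rather than passing the file object to `bibtexparser.load`.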

import --bibtex creates strange filenames for *.bib files

I just installed using pip install academic; academic -h tells me I'm using version v0.3.0.

When I run academic import --normalize --bibtex publications.bib, the *.bib files which are created all have the filename {slugify(entry['ID'])}.bib. In my site, when clicking on the download link in the cite popup, I do get prompted for the download, and the download works okay, but the filename is again {slugify(entry['ID'])}.bib, which is ugly and confusing.
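A filename that literally contains braces is what Python produces when a string meant to be an f-string lacks its `f` prefix. Whether that is the actual cause here is an assumption; the snippet below only demonstrates the difference, with a hypothetical stand-in for `slugify`.

```python
entry = {"ID": "Smith2018"}


def slugify(value: str) -> str:
    # Hypothetical stand-in for the project's own slugify helper.
    return value.lower()


# Without the f prefix, the braces are kept verbatim.
literal = "{slugify(entry['ID'])}.bib"
# With the f prefix, the expression inside the braces is evaluated.
formatted = f"{slugify(entry['ID'])}.bib"

print(literal)
print(formatted)
```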

Base publication slug (URL) on title rather than BibTeX ID

Currently, we use the ID provided by BibTeX. An ID based on the title may be more intuitive, but BibTeX does not provide one, so we would need to implement the algorithm below to generate a unique ID from the title.

Requirements:

Consider up to the first 5 words of the title (after removal of stop words?)
Strip special characters
Convert all characters to lowercase
Slugify: replace whitespace, underscores, and periods with hyphens
Reduce multiple consecutive hyphens to one
Either check that the generated ID is unique and append a numeric ID if it clashes, or else append a hash to the name
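The requirements above could be sketched roughly as follows. This is one possible reading, not a settled design: the function name `title_slug`, the short `STOP_WORDS` set, and the choice of a numeric suffix over a hash are all assumptions, and non-ASCII characters are simply dropped.

```python
import re

# Illustrative stop-word list; a real implementation would likely use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def title_slug(title: str, existing: set, max_words: int = 5) -> str:
    """Build a unique slug from the first few significant words of a title."""
    # Lowercase, then turn runs of whitespace, underscores, and periods into hyphens.
    s = re.sub(r"[\s_.]+", "-", title.lower())
    # Strip everything that is not an ASCII letter, digit, or hyphen.
    s = re.sub(r"[^a-z0-9-]", "", s)
    # Dropping empty pieces also collapses consecutive hyphens.
    words = [w for w in s.split("-") if w and w not in STOP_WORDS]
    slug = "-".join(words[:max_words])
    # Append a numeric suffix until the slug is unique.
    candidate, n = slug, 2
    while candidate in existing:
        candidate = f"{slug}-{n}"
        n += 1
    existing.add(candidate)
    return candidate


seen = set()
first = title_slug("The Metal Abundances across Cosmic Time (MACT) Survey. I.", seen)
second = title_slug("The Metal Abundances across Cosmic Time (MACT) Survey. I.", seen)
print(first, second)
```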

import --assets is broken

ᐅ academic import --assets
Downloading jquery.min.js from https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js...
Downloading bootstrap.min.js from https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.1.3/js/bootstrap.min.js...
Downloading highlight.min.js from https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/highlight.min.js...
Downloading MathJax.js from https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/MathJax.js?config=TeX-AMS_CHTML-full...
Downloading isotope.pkgd.min.js from https://cdnjs.cloudflare.com/ajax/libs/jquery.isotope/3.0.4/isotope.pkgd.min.js...
Downloading imagesloaded.pkgd.min.js from https://cdnjs.cloudflare.com/ajax/libs/jquery.imagesloaded/4.1.3/imagesloaded.pkgd.min.js...
Downloading autotrack.js from https://cdnjs.cloudflare.com/ajax/libs/autotrack/2.4.1/autotrack.js...
Downloading gmaps.min.js from https://cdnjs.cloudflare.com/ajax/libs/gmaps.js/0.4.25/gmaps.min.js...
Downloading leaflet.js from https://cdnjs.cloudflare.com/ajax/libs/leaflet/1.2.0/leaflet.js...
Downloading jquery.fancybox.min.js from https://cdnjs.cloudflare.com/ajax/libs/fancybox/3.2.5/jquery.fancybox.min.js...
Downloading fuse.min.js from https://cdnjs.cloudflare.com/ajax/libs/fuse.js/3.2.1/fuse.min.js...
Downloading jquery.mark.min.js from https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/jquery.mark.min.js...
Downloading instantsearch.min.js from https://cdnjs.cloudflare.com/ajax/libs/instantsearch.js/2.10.2/instantsearch.min.js...
Downloading anchor.min.js from https://cdnjs.cloudflare.com/ajax/libs/anchor-js/4.1.1/anchor.min.js...
Merging JS assets into static/js/vendor/main.min.js
Traceback (most recent call last):
  File "/usr/local/bin/academic", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/academic/cli.py", line 75, in main
    import_assets()
  File "/usr/local/lib/python3.6/site-packages/academic/cli.py", line 282, in import_assets
    merge_files(js_files, JS_FILENAME)
  File "/usr/local/lib/python3.6/site-packages/academic/cli.py", line 323, in merge_files
    with open(destination, 'w', encoding='utf-8') as f:
IsADirectoryError: [Errno 21] Is a directory: 'static/js/vendor/main.min.js'
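The error means a directory already exists at the path that should be the merged file. One plausible cause, which is an assumption here, is that the full destination path was created as a directory instead of only its parent. A sketch of a `merge_files` that creates just the parent directory (the body is illustrative, not the code from cli.py):

```python
import os
import tempfile


def merge_files(sources, destination):
    """Concatenate text files into destination, creating only its parent folder."""
    parent = os.path.dirname(destination)
    if parent:
        # Create the folder that holds the file, never the file path itself.
        os.makedirs(parent, exist_ok=True)
    with open(destination, "w", encoding="utf-8") as out:
        for src in sources:
            with open(src, encoding="utf-8") as f:
                out.write(f.read())
                out.write("\n")


# Demo with two small files merged into a nested destination.
with tempfile.TemporaryDirectory() as tmp:
    sources = []
    for name, text in [("a.js", "var a = 1;"), ("b.js", "var b = 2;")]:
        path = os.path.join(tmp, name)
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        sources.append(path)
    destination = os.path.join(tmp, "static", "js", "vendor", "main.min.js")
    merge_files(sources, destination)
    with open(destination, encoding="utf-8") as f:
        merged = f.read()

print(merged)
```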

unable to run code

unable to run -- some conflict with url parse after installation.

baiao:academic_sh Simone$ pip install -U academic
Requirement already up-to-date: academic in /usr/local/lib/python2.7/site-packages (0.2.2)
Requirement already satisfied, skipping upgrade: bibtexparser in /usr/local/lib/python2.7/site-packages (from academic) (1.0.1)
Requirement already satisfied, skipping upgrade: requests in /usr/local/lib/python2.7/site-packages (from academic) (2.20.1)
Requirement already satisfied, skipping upgrade: toml in /usr/local/lib/python2.7/site-packages (from academic) (0.10.0)
Requirement already satisfied, skipping upgrade: pyparsing in /usr/local/lib/python2.7/site-packages (from bibtexparser->academic) (2.3.0)
Requirement already satisfied, skipping upgrade: future in /usr/local/lib/python2.7/site-packages (from bibtexparser->academic) (0.17.1)
Requirement already satisfied, skipping upgrade: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python2.7/site-packages (from requests->academic) (3.0.4)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in /usr/local/lib/python2.7/site-packages (from requests->academic) (2018.10.15)
Requirement already satisfied, skipping upgrade: urllib3<1.25,>=1.21.1 in /usr/local/lib/python2.7/site-packages (from requests->academic) (1.24.1)
Requirement already satisfied, skipping upgrade: idna<2.8,>=2.5 in /usr/local/lib/python2.7/site-packages (from requests->academic) (2.7)

baiao:academic_sh Simone$ academic academic import --bibtex ./content/articles/shpubs_2018.bib
Traceback (most recent call last):
  File "/usr/local/bin/academic", line 7, in <module>
    from academic.cli import main
  File "/usr/local/lib/python2.7/site-packages/academic/cli.py", line 11, in <module>
    from urllib.parse import urlparse
ImportError: No module named parse
baiao:academic_sh Simone$ pip install urlparse
Collecting urlparse
Could not find a version that satisfies the requirement urlparse (from versions: )
No matching distribution found for urlparse
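The paths in the log point to a Python 2.7 installation, while `urllib.parse` exists only in Python 3. In Python 2 the equivalent module was `urlparse` in the standard library, which is why pip finds no package of that name. A hedged sketch of a version guard that would give a clearer message; `require_python3` is a hypothetical helper, not something the tool provides.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Fail early with a readable message when run under Python 2."""
    if version_info[0] < 3:
        raise RuntimeError(
            "This tool requires Python 3; found Python %d.%d"
            % (version_info[0], version_info[1])
        )


require_python3()

# Safe to import once the interpreter is known to be Python 3.
from urllib.parse import urlparse

print(urlparse("https://example.com/path").netloc)
```

In practice, installing with the Python 3 pip (often `pip3`) is the likely fix on the user's side.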

conda package – add to readme?

Hello, I have created a conda-forge package for academic-admin, which is available from https://anaconda.org/conda-forge/academic

I was wondering whether you think it could be a good idea to mention the availability from conda-forge in the readme, and add a badge showing the latest conda-forge package version? I would happily file a PR for this.

Add logger and verbose mode

Add Python's logger for proper logging.
Convert all or most print statements to appropriate level (e.g. Info) logging statements
Add -v and --verbose arguments for verbose mode.
Only output info type messages in verbose mode
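A minimal sketch of the items above using the standard `logging` and `argparse` modules. The logger name "academic" and the helper names are assumptions for illustration.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="academic")
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="show info-level messages"
    )
    return parser


def configure_logging(verbose: bool) -> logging.Logger:
    """Show info messages only in verbose mode; warnings and errors always."""
    logger = logging.getLogger("academic")
    logger.setLevel(logging.INFO if verbose else logging.WARNING)
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler())
    return logger


args = build_parser().parse_args(["-v"])
log = configure_logging(args.verbose)
log.info("Importing publications")  # replaces a former print() call
```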

Config file refactoring broke the package

Hello :)

Since the changes introduced in HugoBlox/hugo-blox-builder@ae8a86a, and given that there's no config.toml in the root folder anymore, the tool exits with the message:

Please navigate to your website directory (where config.toml resides) and re-run.

Add support for front matter options via `extras` field in reference managers

Goals

Add support for front matter options (e.g. projects, categories) via reference manager's extras field.

This could simplify the process of importing publications, effectively using the reference manager as the CMS for publications.

Details

Configuration of front matter options via the reference manager's extras field will streamline the import process and alleviate the need to edit any of the automatically generated Markdown files. Hence, all publication editing can be performed in the user's existing reference management tool.

For example, this would enable users to set front matter options such as categories within their reference manager.

There are many limitations to the extras field in reference managers. We'd need to design a data structure and consider how multiple front matter options, including lists and nested key-value pairs, are stored in a flat extras field.
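To make the design question concrete, here is one hypothetical encoding for a flat extras field: one `key: value` pair per line, comma-separated values for lists, and dotted keys for nesting. None of this is an agreed format; `parse_extras` and its conventions are purely illustrative.

```python
def parse_extras(extra: str) -> dict:
    """Parse 'key: value' lines into a (possibly nested) front matter dict."""
    front_matter = {}
    for line in extra.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # ignore lines that are not key: value pairs
        items = [v.strip() for v in value.split(",") if v.strip()]
        # Dotted keys such as image.caption become nested dictionaries.
        *parents, leaf = key.strip().split(".")
        target = front_matter
        for parent in parents:
            target = target.setdefault(parent, {})
        # Several comma-separated items become a list; a single item stays a string.
        target[leaf] = items if len(items) > 1 else (items[0] if items else "")
    return front_matter


extras = "projects: deep-learning, vision\nfeatured: true\nimage.caption: Figure 1"
parsed = parse_extras(extras)
print(parsed)
```

A limitation of this sketch is that a one-element list cannot be told apart from a plain string, which is exactly the kind of ambiguity the data structure design would need to settle.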

Handle linebreaks in BibTeX leading to invalid TOML strings

First of all: it is so cool that you made this. Thanks a lot!

Now to my problem: my BibTeX files contain newlines inside field values. The tool runs, but when I then run Hugo, I get errors like this one:

ERROR 2018/10/19 21:47:29 failed to parse page metadata for "publication/Behrens2018AMT/index.md": Near line 1 (last key parsed 'title'): strings cannot contain newlines for publication/Behrens2018AMT/index.md

This happens with the following BibTeX entry:

@Article{Behrens2018AMT,
  title =	 {{GOME-2A} retrievals of tropospheric {NO2} in
                  different spectral ranges -- influence of
                  penetration depth},
  AUTHOR =	 {Behrens, L. K. and Hilboll, A. and Richter, A. and
                  Peters, E. and Eskes, H. and Burrows, J. P.},
  DOI =		 {10.5194/amt-11-2769-2018},
  journal =	 {Atmos. Meas. Tech.},
  NUMBER =	 5,
  PAGES =	 {2769--2795},
  VOLUME =	 11,
  YEAR =	 2018,
}

which leads to this markdown:

+++
title = "GOME-2A retrievals of tropospheric NO2 in
different spectral ranges -- influence of
penetration depth"
date = 2018-01-01
authors = ["L. K. Behrens", "A. Hilboll", "A. Richter", "E. Peters", "H. Eskes", "J. P. Burrows"]
publication_types = ["2"]
abstract = ""
selected = "false"
publication = "*Atmos. Meas. Tech.*"
doi = "10.5194/amt-11-2769-2018"
+++
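A simple way to avoid the multi-line string is to collapse all runs of whitespace in a field value before writing it. The helper name `single_line` is hypothetical; the sketch only shows the normalization step.

```python
def single_line(value: str) -> str:
    """Collapse newlines, tabs, and repeated spaces into single spaces."""
    return " ".join(value.split())


raw_title = (
    "GOME-2A retrievals of tropospheric NO2 in\n"
    "                  different spectral ranges -- influence of\n"
    "                  penetration depth"
)
title = single_line(raw_title)
print(title)
```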

Automatically convert Jupyter notebooks to Markdown

Use the Python nbconvert module to clean up Jupyter notebooks and import them as blog posts.

Algo idea:

Recursively locate all Jupyter notebooks in a site's notebooks/ folder
Parse each notebook into a Markdown page bundle within content/post/ folder using nbconvert's Python function
- defaults to the h1 heading (falling back to filename) as the Hugo page title, and the conversion date for date
- ~~If the first cell is Jupyter 'raw' type and begins with --- (YAML) use it as the page's front matter and hide that cell from the output~~
- Fetch front matter options from notebook metadata field

Tests:

Add a unit test on a Jupyter notebook and check the Markdown output
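The title rule in the algorithm above (first h1 heading, falling back to the filename) can be sketched with the standard library alone, since a notebook file is JSON. This does not use nbconvert, and `notebook_title` is a hypothetical helper. It assumes the nbformat 4 layout, in which `cells` is a list and each cell's `source` is a string or a list of strings.

```python
import re
from pathlib import Path


def notebook_title(notebook: dict, filename: str) -> str:
    """Return the first level-1 Markdown heading, or the file stem if none exists."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        for line in source.splitlines():
            # A single '#' followed by whitespace marks an h1 heading.
            match = re.match(r"#\s+(.+)", line)
            if match:
                return match.group(1).strip()
    return Path(filename).stem


nb = {
    "cells": [
        {"cell_type": "code", "source": "# not a heading, just a comment"},
        {"cell_type": "markdown", "source": ["## Setup\n", "# My Analysis\n"]},
    ]
}
print(notebook_title(nb, "notebooks/analysis.ipynb"))
print(notebook_title({"cells": []}, "notebooks/analysis.ipynb"))
```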

Error converting month to number

Hello,
I was just about to write my own script because I could not find one before. I am happy that you did the work now so everyone can use this awesome script.

I found a small bug that at least for me prevents the script from running. It exits with the following error:
Traceback (most recent call last):
  File "/home/palladion/.local/lib/python3.6/site-packages/academic/cli.py", line 187, in month2number
    return str(list(calendar.month_abbr).index(month_abbr)).zfill(2)
ValueError: 'mar' is not in list

At least in my locale, calendar.month_abbr gives "Mar" instead of the expected lowercase "mar", so the month cannot be found in the list and the exception is thrown.

I hope this helps. Thank you
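Because `calendar.month_abbr` depends on the active locale, one possible fix is to compare against a fixed list of English abbreviations after lowercasing the input. The sketch below reuses the name `month2number` from the traceback, but the body is an illustration, not the project's actual fix.

```python
# Fixed English abbreviations, independent of the system locale.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def month2number(month: str) -> str:
    """Convert a month name or abbreviation to a zero-padded number string."""
    key = month.strip().lower()[:3]
    # list.index raises ValueError for unknown months, as before.
    return str(MONTHS.index(key) + 1).zfill(2)


print(month2number("mar"), month2number("Mar"), month2number("September"))
```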