Comments (8)
Thanks!
- We should add a middleware that normalizes field names.
- We could consider a default lower-case mapping.
from python-bibtexparser.
Maybe something like this (Works For Me™)?
from typing import Collection, Union

import bibtexparser
from bibtexparser.library import Library
from bibtexparser.model import Block, Entry


class NormalizeFieldNames(bibtexparser.middlewares.middleware.BlockMiddleware):
    """Lower-case every field key of each entry."""

    def __init__(self, allow_inplace_modification: bool = True):
        super().__init__(allow_inplace_modification=allow_inplace_modification,
                         allow_parallel_execution=True)

    def transform_entry(self, entry: Entry, library: "Library") -> Union[Block, Collection[Block], None]:
        # Note: keys that differ only in capitalization will collide after this step.
        for field in entry.fields:
            field.key = field.key.lower()
        return entry
Usage example:
library = bibtexparser.parse_file(
    filename,
    append_middleware=[
        NormalizeFieldNames(),
        bibtexparser.middlewares.SeparateCoAuthors(),
        bibtexparser.middlewares.SplitNameParts(),
    ],
)
That's probably alright. Would you be willing to convert it to a PR (adding a test)? I think this is quite a common use case that we should support.
Fully agree with @tdegeus, and would appreciate a PR by @Technologicat
Just one remark: We'd have to be able to handle "new" duplicates somehow (i.e., if two field keys exist in the original block which differ only in their capitalization). That's particularly important now that we're pushing the use of entries as dicts. In principle, we have an entry type DuplicateFieldKeyBlock that should be used here, but I am also happy to support additional suggestions. These would probably have to be enabled with a corresponding constructor parameter (e.g. raising an exception). Does this make sense?
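To make the "new duplicates" concern concrete, here is a minimal plain-Python sketch that finds field keys which collide once lower-cased. It deliberately does not use any bibtexparser classes; `find_case_collisions` is a hypothetical helper name used only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_case_collisions(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Group keys by their lower-cased form and keep groups with more than one key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        groups[key.lower()].append(key)
    # Only groups with several original spellings are "new" duplicates.
    return {low: originals for low, originals in groups.items() if len(originals) > 1}


# "Author" and "AUTHOR" are distinct before normalization but collide afterwards.
print(find_case_collisions(["Author", "title", "AUTHOR", "year"]))
# → {'author': ['Author', 'AUTHOR']}
```

A middleware could run a check like this before renaming keys and then decide what to do with the affected entry.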
@tdegeus: Sure.
@MiWeiss: Good point about conflicting keys. But I'll need a bit more information about the desired way to tackle it.
Roughly how this went: yesterday I suddenly needed to extract some data from BibTeX in Python.
Within an hour, I had installed bibtexparser, upgraded it to 2.x, run into this issue (since my data files happened to use capitalized keys), written the simplest possible field key normalizer, and posted a copy here. So it's fair to say I'm kind of new to this project :)
A solution would be to issue a warning (similar to library.failed_blocks) and use the last key value.
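A rough sketch of that policy, written against plain `(key, value)` pairs rather than bibtexparser's own field objects, might look like the following. The function name `normalize_field_keys` and the warning text are assumptions for illustration, not part of the library.

```python
import warnings
from typing import Dict, Iterable, Tuple


def normalize_field_keys(fields: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Lower-case field keys; on a collision, warn and let the last value win."""
    normalized: Dict[str, str] = {}
    for key, value in fields:
        low = key.lower()
        if low in normalized:
            warnings.warn(
                f"Duplicate field key after normalization: {low!r}; keeping the last value"
            )
        # Plain assignment overwrites any earlier value, so the last one wins.
        normalized[low] = value
    return normalized


fields = [("Author", "First"), ("AUTHOR", "Second"), ("Year", "2024")]
print(normalize_field_keys(fields))
# → {'author': 'Second', 'year': '2024'}
```

Raising an exception instead of warning would be a small change at the collision branch, which is why a constructor parameter selecting the behavior seems feasible.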
@csware: Thanks. Yes, that's one possible solution, and probably the simplest one that works.
Considering alternatives, what about the DuplicateFieldKeyBlock mentioned by @MiWeiss? EDIT: Never mind, I think I understand what you all meant now.
Implemented, using @csware's suggestion of emitting a warning and letting the last value win. Please review.