Comments (7)
I'm also considering whether or not we should even detect these legacy encodings by default.
from chardet.
If we want to deprecate those, we should save that removal until a major version bump and communicate it well.
Yeah. And I don't think we should remove them entirely. We should just not check for them by default.
So in that case, I'd like to propose a different API than what you might be thinking about for these issues.
I think a Configuration object that works like this might be good:
from chardet import config

cfg = config.Configuration().enable(
    'EUC-TW'
).prerefine(
    'GB2312', 'Shift_JIS', 'EUC-TW',
)
The selections would then be immutable, and the object could be passed like so:
import chardet
chardet.detect(input_bytes, config=cfg)
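To make the proposal a bit more concrete, here is a minimal sketch of how such an object could behave. Everything in it is an assumption for illustration: chardet does not ship a `config` module or a `Configuration` class, and the thread does not spell out what `prerefine` should do, so the sketch only records the names it is given. "Immutable" here means each method returns a new object instead of modifying the one it was called on.

```python
class Configuration(object):
    """Hypothetical configuration object; not part of chardet."""

    def __init__(self, enabled=frozenset(), prerefined=()):
        # Store hashable, read-only containers so shared state cannot be edited.
        self._enabled = frozenset(enabled)
        self._prerefined = tuple(prerefined)

    @property
    def enabled(self):
        return self._enabled

    @property
    def prerefined(self):
        return self._prerefined

    def enable(self, *encodings):
        # Return a new object rather than mutating self.
        return Configuration(self._enabled | set(encodings), self._prerefined)

    def prerefine(self, *encodings):
        # Semantics are not defined in the thread; this only keeps the order given.
        return Configuration(self._enabled, self._prerefined + encodings)


base = Configuration()
cfg = base.enable('EUC-TW').prerefine('GB2312', 'Shift_JIS', 'EUC-TW')
```

Because `enable` and `prerefine` return fresh objects, `base` stays empty after the chained calls, so one configuration can be shared safely across many `detect` calls.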
Could you lay out the advantages of that over just adding kwargs to chardet.detect?
From my perspective I could see a configuration object being really handy if you were making loads of calls to chardet.detect at various points in your code, but my intuition is that most people are just calling it once (or repeatedly in a loop).
So, for one, I've grown to hate kwargs. Since we're supporting Python 2, we cannot make them keyword-only arguments, which means that people will call them positionally. That makes it hard to deprecate features when we need to. With configuration objects as described above, we can make a method emit a DeprecationWarning, and removing the feature is as easy as making the method a no-op.
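A rough sketch of the deprecation path described here, using the standard `warnings` module. The `Configuration` class and the `enable_legacy` method are hypothetical names made up for this example, not anything chardet provides.

```python
import warnings


class Configuration(object):
    """Hypothetical sketch; not chardet's actual API."""

    def __init__(self, enabled=frozenset()):
        self.enabled = frozenset(enabled)

    def enable(self, *encodings):
        return Configuration(self.enabled | set(encodings))

    def enable_legacy(self, *encodings):
        # Hypothetical method being phased out: warn, then delegate.
        warnings.warn(
            "enable_legacy() is deprecated; use enable() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.enable(*encodings)


# Record warnings so the deprecation is visible regardless of default filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = Configuration().enable_legacy('EUC-TW')
```

Once the deprecation period is over, the body of `enable_legacy` could shrink to `return self`, which keeps existing callers working while dropping the behavior. A positional argument to `detect` offers no comparable hook, since there is no method call to attach the warning to.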
That's a really good point about deprecation. I also hate people calling keyword arguments positionally, so I think you've got the right approach there.