Comments (8)
Ah, right, I forgot about paths. Falling back with a deprecation warning sounds like the way to go.
from pip.
Are you able to share the contents of the requirements.txt file you were using?
from pip.
@matthewhughes934
The contents of my requirements.txt are as follows:
# server
supervisor==4.2.5
gunicorn==21.2.0
gevent==23.9.1
# web
Werkzeug==2.3.7
celery==5.2.7
click==8.1.7
dataclasses_json==0.6.4
Flask==2.3.3
Flask_Cors==3.0.10
Flask_Login==0.6.2
Flask_Migrate==4.0.5
Flask_RESTful==0.3.9
flask_sqlalchemy==3.0.5
SQLAlchemy==2.0.0
minio==7.2.4
psycopg2-binary==2.9.9
python-dotenv==1.0.1
redis==5.0.2
requests==2.31.0
# rag
langchain==0.1.16
llama-index==0.10.30
llama-index-core==0.10.30 # 这个必须手指定,不然构建的时候会去获取最新的版本,可能会有bug。
llama-index-retrievers-bm25==0.1.3
llama-index-storage-index-store-redis==0.1.2
llama-index-storage-kvstore-redis==0.1.3
llama-index-storage-docstore-mongodb==0.1.3
llama-index-vector-stores-milvus==0.1.10
llama-index-vector-stores-qdrant==0.2.5
llama-parse==0.4.1
rank-bm25==0.2.2
ragas==0.1.1
qdrant-client==1.9.0
pymongo==4.6.3
motor==3.4.0
asyncpg==0.29.0
spacy==3.7.4
jieba==0.42.1
./zh_core_web_sm-3.7.0-py3-none-any.whl
scikit-learn==1.4.2
# data loader 相关依赖
pypdf==4.2.0
pdfminer-six==20231228
PyMuPDF==1.24.2
docx2txt==0.8
python-docx==1.1.0
openpyxl==3.1.2
# 评估相关
dashscope==1.19.2
zhipuai==2.1.0
from pip.
@matthewhughes934
I modified pip\_internal\utils\encoding.py and added the ignore parameter to its data.decode method, which resolved the issue.
from pip.
It’s probably best to always use ascii with replace. We only allow ASCII in requirements, and anything else (e.g. comments) is ignored by the parser anyway.
A PR would be very welcome.
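To make the two error handlers mentioned in this thread concrete, here is a minimal sketch that uses only the standard `bytes.decode` call. The sample line is made up for illustration, and this is not pip's actual code path.

```python
# A requirement followed by a non-ASCII comment, encoded as UTF-8.
# The line is a made-up sample, not taken from pip.
raw = "ragas==0.1.1  # 评估\n".encode("utf-8")

# errors="ignore" silently drops every byte that is not valid ASCII.
ignored = raw.decode("ascii", errors="ignore")

# errors="replace" substitutes U+FFFD for each undecodable byte.
replaced = raw.decode("ascii", errors="replace")

# Either way, the ASCII requirement in front of the comment survives intact.
requirement_ignored = ignored.split("#", 1)[0].strip()
requirement_replaced = replaced.split("#", 1)[0].strip()

# ascii() keeps the output printable on consoles that cannot show U+FFFD.
print(ascii(ignored))
print(ascii(replaced))
```

When the non-ASCII bytes sit only in a comment, both handlers leave the requirement itself untouched; they differ only in whether the lost bytes leave a visible marker.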
from pip.
I modified pip\_internal\utils\encoding.py and added the ignore parameter to its data.decode method, which resolved the issue.
I guess the underlying issue was: the file looks to be UTF-8 encoded but you're working in an environment that uses a simplified Chinese locale, and so uses GBK for decoding. I guess an alternative solution would be to run Python in UTF-8 mode (https://docs.python.org/3/using/windows.html#utf-8-mode)
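A small sketch of the mismatch described above: the same bytes decode cleanly as UTF-8 but can fail, or turn into mojibake, under another codec. The helper name `try_decode` and the sample line are hypothetical. `gbk` is tried purely for illustration, and no particular outcome is claimed for it.

```python
import locale
import sys


def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if strict decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Sample line (made up): UTF-8 bytes containing a Chinese comment.
raw = "jieba==0.42.1  # 分词。\n".encode("utf-8")

# The encoding Python falls back to when none is given explicitly.
print("preferred encoding:", locale.getpreferredencoding(False))
# 1 when Python runs in UTF-8 mode (python -X utf8 or PYTHONUTF8=1).
print("utf8_mode flag:", sys.flags.utf8_mode)

# ascii() keeps the output printable regardless of the console encoding.
for codec in ("utf-8", "gbk", "ascii"):
    print(codec, "->", ascii(try_decode(raw, codec)))
```

Running the script once normally and once with `python -X utf8` should show whether the preferred encoding changes on the machine in question.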
from pip.
It’s probably best to always use ascii with replace. We only allow ASCII in requirements, and anything else (e.g. comments) is ignored by the parser anyway. A PR would be very welcome.
👍 happy to get a PR up. I'm wondering two things:
- If I change auto_decode: are there places where we want decoding to fail (per errors="strict") or would it be ok to always replace? Or is there code elsewhere that should be changed?
- 🤔 Is there any potential for issues with multi-byte/non-ascii-extended encodings: I have no idea how common these might be, but I guess a consequence could be instead of getting a 'failed to decode' error you could get an error about pip failing to install a package named "����"
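The second concern can be reproduced with the standard decoder alone. This is only an illustration: the package name below is invented, and the snippet does not go through pip's parser.

```python
# A made-up requirement whose name contains a non-ASCII character.
raw = "pakét==1.0\n".encode("utf-8")

# Strict decoding fails loudly with a decode error.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace" decoding succeeds, but the name is silently altered:
# "é" is two bytes in UTF-8, so it becomes two U+FFFD characters.
replaced = raw.decode("ascii", errors="replace")

print(strict_failed)
print(ascii(replaced))
```

So always replacing would trade an early decoding error for a later, less obvious one about a name containing replacement characters.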
from pip.
We only allow ASCII in requirements, and anything else (e.g. comments) are ignored by the parser anyway.
Unfortunately, requirements aren't the only things in a requirement file. --requirement <path to file to include> could include arbitrary Unicode characters, and for that matter a simple local pathname is valid (and could be Unicode).
However, the documentation states that requirement files should be UTF-8 by default, so this seems like a simple bug in auto_decode - https://github.com/pypa/pip/blob/main/src/pip/_internal/utils/encoding.py#L35 should be using UTF-8. (And arguably the BOM detection in there is in violation of the spec, but IMO it's not worth changing).
Of course, even though this is technically a bug fix, it is still a breaking change, potentially, so we need to consider how we handle that. (We could fall back to the system encoding if UTF8 fails, with a deprecation warning - this won't avoid mojibake, but it will catch outright encoding failures).
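A rough sketch of that fallback idea, under stated assumptions: `decode_requirements` and its `fallback_encoding` parameter are hypothetical names, not pip's `auto_decode`, and BOM detection is left out to keep it short.

```python
import locale
import warnings
from typing import Optional


def decode_requirements(data: bytes, fallback_encoding: Optional[str] = None) -> str:
    """Decode as UTF-8 first; fall back to another encoding with a warning.

    Hypothetical helper for illustration only, not pip's implementation.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the fallback defaults to the locale's preferred encoding.
        encoding = fallback_encoding or locale.getpreferredencoding(False)
        warnings.warn(
            f"Requirements file is not valid UTF-8; decoding as {encoding}. "
            "This fallback is deprecated.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Still strict: an outright failure in the fallback codec propagates.
        return data.decode(encoding)


# Usage: bytes that are invalid UTF-8 but valid Latin-1 trigger the fallback.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    text = decode_requirements(b"caf\xe9==1.0\n", fallback_encoding="latin-1")

print(ascii(text), len(caught))
```

As the comment notes, this only catches outright decoding failures. Bytes that happen to be valid UTF-8 in a file written with another encoding would still come out as mojibake.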
from pip.