Comments (10)
Thanks for reporting. Last time I looked, boto3 was unfinished and had many issues.
But if it works better, we could (should) switch to boto3 of course. This will need careful testing (and updating all code) though, as the two libraries are not API-compatible IIRC.
from smart_open.
Similar problem for me. Perhaps smart_open isn't being used recursively for the objects contained within models like LsiModel (the LsiModel.Projection object is split into two pickles, and smart_open is not being used to open them separately). I have no problem opening these pickles with smart_open using the exact URIs given in the warning message.
lsi = LsiModel.load(s3_uri('lsi_model_both_130K'))
WARNING:root:failed to load projection from s3://XYZ@bucket-name/lsi_model_both_130K.projection: [Errno 2] No such file or directory: 's3://XYZ@bucket-name/lsi_model_both_130K.projection.u.npy'
from smart_open.
That's right, the numpy objects are loaded using numpy's native load operation, which doesn't support S3 or smart_open.
The easiest workaround is to store your object using pickle (a single file) and smart_open, avoiding gensim's load/save, which split the object into multiple files. Or pass separately=[] to lsi.save(); that should work too.
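The single-file workaround could be sketched roughly as below. The helper names save_single_file and load_single_file and the opener parameter are made up for illustration; they are not part of gensim or smart_open. The opener defaults to the built-in open so the snippet runs on local paths, and the assumption is that smart_open's open function is passed in for S3 URIs.

```python
import pickle


def save_single_file(obj, uri, opener=open):
    """Pickle obj into one file so no side files (e.g. .npy) are created.

    opener is a hypothetical hook: any callable with the signature
    opener(uri, mode) that returns a binary file object.
    """
    with opener(uri, "wb") as fout:
        pickle.dump(obj, fout, protocol=pickle.HIGHEST_PROTOCOL)


def load_single_file(uri, opener=open):
    """Load an object previously written by save_single_file."""
    with opener(uri, "rb") as fin:
        return pickle.load(fin)
```

With smart_open installed, the calls would presumably look like save_single_file(lsi, s3_uri('lsi_model_both_130K'), opener=smart_open.open), followed by load_single_file with the same URI and opener. Only unpickle files you wrote yourself, since pickle can execute arbitrary code on load.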
from smart_open.
Perfect! Hadn't thought of the single-file workaround. Thanks!
from smart_open.
It happens with Python >= 2.7.9. A solution to this issue would be to set calling_format=ProtocolIndependentOrdinaryCallingFormat() when calling smart_open and to pass calling_format on to boto.connect_s3. Additionally, you need to set S3Connection.DefaultHost to the S3 endpoint with the region name, e.g. S3Connection.DefaultHost='s3-eu-west-1.amazonaws.com'. Since smart_open currently does not support calling_format, adding support for it will require a pull request. I could provide one, if needed.
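For context, here is a likely explanation of why the calling format matters. Python 2.7.9 turned on HTTPS certificate verification by default, and a wildcard certificate name such as *.s3.amazonaws.com covers only a single DNS label. A bucket name with dots therefore fails hostname matching when the bucket is placed in the hostname, while an "ordinary" (path-style) request keeps the bucket in the URL path. The sketch below illustrates this with a simplified wildcard check. The function names are hypothetical, and this is not boto's or smart_open's actual code.

```python
def virtual_hosted_host(bucket, endpoint="s3.amazonaws.com"):
    """Hostname used when the bucket name is placed in front of the endpoint."""
    return "{}.{}".format(bucket, endpoint)


def path_style_url(bucket, key, endpoint="s3.amazonaws.com"):
    """URL used when the bucket name is kept in the path instead."""
    return "https://{}/{}/{}".format(endpoint, bucket, key)


def wildcard_matches(pattern, hostname):
    """Simplified certificate-name check: '*' covers exactly one leftmost label."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] != "*" and pattern_labels[0] != host_labels[0]:
        return False
    return pattern_labels[1:] == host_labels[1:]


# A bucket without dots adds one label, so the wildcard still matches.
plain = virtual_hosted_host("my-bucket")
# A bucket with dots adds several labels, so the wildcard no longer matches.
dotted = virtual_hosted_host("com.test-gensim.wow")

print(plain, wildcard_matches("*.s3.amazonaws.com", plain))
print(dotted, wildcard_matches("*.s3.amazonaws.com", dotted))
print(path_style_url("com.test-gensim.wow", "model.pkl", "s3-eu-west-1.amazonaws.com"))
```

In the path-style URL the hostname is just the regional endpoint, which is why the suggestion above pairs the ordinary calling format with setting the default host to that endpoint.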
from smart_open.
@evdoks Thanks for the troubleshooting. A PR and a test would be greatly appreciated.
from smart_open.
@evdoks IIRC, in one of the recent PRs (merged already), we added passing parameters from smart_open directly through to S3. Does that help with passing calling_format here?
from smart_open.
calling_format was explicitly added in #83.
from smart_open.
@ziky90 We replaced boto with boto3 in the S3 subsystem. Could you please check if the problem with dots in bucket names still occurs?
from smart_open.
I am closing this issue because it works fine with a bucket name containing dots (like com.test-gensim.wow) on current master fbc82cc.
from smart_open.