Comments (10)

piskvorky commented on May 18, 2024

Thanks for reporting. Last time I looked, boto3 was unfinished and had many issues.

But if it works better, we could (should) switch to boto3 of course. This will need careful testing (and updating all code) though, as the two libraries are not API-compatible IIRC.

from smart_open.

hobson commented on May 18, 2024

Similar problem for me. Perhaps smart_open isn't applied recursively to the objects contained within models like LsiModel: the LsiModel.Projection object is split into two pickles, and smart_open isn't used to open them separately. I have no problem opening these pickles with smart_open using the exact URIs given in the warning message.

lsi = LsiModel.load(s3_uri('lsi_model_both_130K'))
WARNING:root:failed to load projection from s3://XYZ@bucket-name/lsi_model_both_130K.projection: [Errno 2] No such file or directory: 's3://XYZ@bucket-name/lsi_model_both_130K.projection.u.npy'

piskvorky commented on May 18, 2024

That's right, the numpy objects are loaded using numpy's native load operation, which doesn't support s3 or smart_open.

The easiest workaround is to store your object using pickle (a single file) and smart_open, avoiding gensim's load/save, which split the object into multiple files. Alternatively, pass separately=[] to lsi.save(); that should work too.
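
A minimal sketch of the single-file workaround, assuming the model can be pickled as one object. The helper names and the `opener` parameter are hypothetical. The demo uses the built-in `open` on a temporary local path so that it runs without S3 access, and a plain dict stands in for the trained model.

```python
import os
import pickle
import tempfile


def save_single_file(obj, uri, opener=open):
    """Pickle obj into exactly one file; opener is any open()-like callable."""
    with opener(uri, "wb") as fout:
        pickle.dump(obj, fout, protocol=pickle.HIGHEST_PROTOCOL)


def load_single_file(uri, opener=open):
    """Read back an object written by save_single_file."""
    with opener(uri, "rb") as fin:
        return pickle.load(fin)


# Stand-in for a trained model (hypothetical data, not a real LsiModel).
model = {"num_topics": 2, "projection": [[0.1, 0.2], [0.3, 0.4]]}

# Local round trip; no side files such as *.npy are created.
path = os.path.join(tempfile.mkdtemp(), "lsi_model.pkl")
save_single_file(model, path)
restored = load_single_file(path)
```

For an S3 URI, the same helpers would presumably be called with `opener=smart_open.open` (older releases exposed this function as `smart_open.smart_open`), so that both the write and the read go through smart_open rather than numpy's loader.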

hobson commented on May 18, 2024

Perfect! Hadn't thought of the single-file workaround. Thanks!

evdoks commented on May 18, 2024

It happens with Python >= 2.7.9. A solution to this issue would be to set calling_format=ProtocolIndependentOrdinaryCallingFormat() when calling smart_open, and to have smart_open pass calling_format on to boto.connect_s3. Additionally, you need to set S3Connection.DefaultHost to the S3 endpoint with the region name, e.g. S3Connection.DefaultHost='s3-eu-west-1.amazonaws.com'. Since smart_open does not currently support calling_format, adding support for it will require a pull request. I could provide one, if needed.
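
A rough illustration of why switching the calling format can help, assuming the underlying failure is the HTTPS certificate verification that Python 2.7.9 turned on by default. With subdomain-style addressing the bucket name becomes part of the hostname, and a wildcard certificate name covers only a single DNS label, so a bucket name containing dots no longer matches. The certificate names, the bucket names, and all helper functions below are illustrative assumptions, not boto or smart_open code.

```python
def wildcard_matches(cert_name, hostname):
    """Simplified hostname check: a leading '*' covers exactly one DNS label."""
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] != "*" and cert_labels[0] != host_labels[0]:
        return False
    return cert_labels[1:] == host_labels[1:]


def subdomain_style_host(bucket, endpoint):
    """Subdomain-style addressing: the bucket name is part of the hostname."""
    return bucket + "." + endpoint


def ordinary_style_host(bucket, endpoint):
    """Ordinary (path-style) addressing: the bucket goes in the URL path."""
    return endpoint


endpoint = "s3-eu-west-1.amazonaws.com"   # regional endpoint named in the comment
cert_names = ["*." + endpoint, endpoint]  # assumed names on the server certificate


def host_is_covered(hostname):
    """True if any assumed certificate name matches the hostname."""
    return any(wildcard_matches(name, hostname) for name in cert_names)


plain = host_is_covered(subdomain_style_host("mybucket", endpoint))
dotted = host_is_covered(subdomain_style_host("my.dotted.bucket", endpoint))
fixed = host_is_covered(ordinary_style_host("my.dotted.bucket", endpoint))
```

Under these assumptions `plain` and `fixed` come out true while `dotted` is false, which is consistent with the ordinary calling format plus an explicit regional host avoiding the hostname mismatch.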

tmylk commented on May 18, 2024

@evdoks Thanks for the troubleshooting. A PR and a test would be greatly appreciated.

piskvorky commented on May 18, 2024

@evdoks IIRC, in one of the recent PRs (merged already), we added passing parameters from smart_open directly through to S3.

Does that help with passing calling_format here?

tmylk commented on May 18, 2024

calling_format was explicitly added in #83.

mpenkov commented on May 18, 2024

@ziky90 We replaced boto with boto3 in the S3 subsystem. Could you please check if the problem with dots in bucket names still occurs?

menshikh-iv commented on May 18, 2024

I'm closing this issue because it works fine with bucket names containing dots (like com.test-gensim.wow) on current master (fbc82cc).
