Comments (10)
Is there an internal buffer in boto that needs to be flushed?
from smart_open.
I don't think so... that's not quite what I meant, sorry.
I'm trying to handle the situation where I pass the file object into a third-party library that I don't control. That library calls the "flush" method. Ordinarily I do:
f = open('test.csv', 'w')
library.library_method(f) # everything works ok - library_method calls flush - no errors
vs
f = smart_open('s3://bucket/blah/test.csv', 'w')
library.library_method(f) # fails - flush doesn't exist
I.e. it would be nice if smart_open could be used as a direct replacement for open. As I said above, flush doesn't have to do anything - it can just be a no-op. If it can actually be flushed - great.
from smart_open.
I guess we can do that, sure. Any other methods of that sort? Can you add them all in a PR?
Plus add clear motivation & comments to those no-op methods too, so someone doesn't remove them in the future by accident.
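A documented no-op of that sort could look roughly like the sketch below. BufferedUploadWriter and library_method are hypothetical names used only for illustration; this is not smart_open's actual writer class.

```python
import io


class BufferedUploadWriter:
    """Hypothetical stand-in for a remote writer (not smart_open's real class)."""

    def __init__(self):
        # Assumption: data is collected locally and uploaded on close().
        self._buffer = io.BytesIO()
        self.closed = False

    def write(self, data):
        return self._buffer.write(data)

    def flush(self):
        # Deliberate no-op, do not remove.
        # Third-party code written against the built-in open() calls flush()
        # on any file object it receives. There is nothing to flush before
        # close() in this sketch, but the method has to exist.
        pass

    def close(self):
        self.closed = True


def library_method(f):
    # Hypothetical third-party function that insists on calling flush().
    f.write(b"a,b\n1,2\n")
    f.flush()


writer = BufferedUploadWriter()
library_method(writer)  # no AttributeError, because flush() exists
writer.close()
```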
from smart_open.
@tmylk I have opened a PR resolving this issue, as you asked. Please review.
from smart_open.
This request is similar to the earlier request in #64 (comment) to inherit from, or duck-type, the io.IOBase interface. See the Python docs for the list of other methods that need to be added. They differ only slightly across Python versions.
from smart_open.
Yes - I was thinking a sensible implementation might inherit from io.IOBase.
I'll find some time to work on this and open a PR.
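As a rough sketch of what inheriting buys, assuming a hypothetical RemoteWriter class (not the class smart_open ended up with): a subclass of io.BufferedIOBase only has to define write and writable, and flush, writelines, isatty, the context-manager methods and the rest of the file interface come from the standard library.

```python
import io


class RemoteWriter(io.BufferedIOBase):
    """Hypothetical writer; only write() and writable() are defined here."""

    def __init__(self):
        self._chunks = []

    def writable(self):
        return True

    def write(self, b):
        # Keep the bytes in memory; a real implementation would upload them.
        self._chunks.append(bytes(b))
        return len(b)


w = RemoteWriter()
w.write(b"hello ")
w.writelines([b"wor", b"ld"])  # inherited: calls write() for each item
w.flush()                      # inherited from io.IOBase: nothing to do here

# fileno() is inherited too, but the base class has no descriptor to return.
try:
    w.fileno()
    has_descriptor = True
except io.UnsupportedOperation:
    has_descriptor = False
```

The last few lines also show why a method can appear in dir() and still be unusable: the base class defines fileno, but raises UnsupportedOperation when there is no underlying file descriptor.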
from smart_open.
@menshikh-iv This is done.
(smartopen)sergeyich:issue152 misha$ python
Python 3.6.3 (default, Oct 4 2017, 06:09:15)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import smart_open
>>> f = smart_open.smart_open('s3://private/key.txt')
>>> dir(f)
['__abstractmethods__', '__class__', '__del__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_abc_cache', '_abc_negative_cache', '_abc_negative_cache_version', '_abc_registry', '_buffer', '_buffer_size', '_checkClosed', '_checkReadable', '_checkSeekable', '_checkWritable', '_content_length', '_current_pos', '_eof', '_fill_buffer', '_line_terminator', '_object', '_raw_reader', '_read_from_buffer', 'close', 'closed', 'detach', 'fileno', 'flush', 'isatty', 'raw', 'read', 'read1', 'readable', 'readinto', 'readinto1', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'terminate', 'truncate', 'writable', 'write', 'writelines']
from smart_open.
Thanks @mpenkov
Works fine with current master fbc82cc
from smart_open.
Hello Everyone,
I see from @mpenkov's comment that the f object has the 'fileno' method implemented. However, after migrating to the new open function, I no longer see 'fileno' as a method on this object. This is unfortunate, as I would like to pass the return value of smart_open.open() as the stdin parameter to a Popen call. Because it is not implemented, the call throws this error:
[ERROR] UnsupportedOperation: fileno
Is there a reason the fileno operation is no longer supported? Or am I missing something?
from smart_open.
@peter-t-kim I'm not sure at which stage we got rid of fileno. I don't think it's a product of the change to the open function (that change happened at a different abstraction level). You can confirm this by checking out the commit from my previous comment (fbc82cc) and then running:
In [6]: fin = open('s3://bucket/key') # redacted
In [7]: 'fileno' in dir(fin)
Out[7]: True
In [8]: fin.fileno()
---------------------------------------------------------------------------
UnsupportedOperation Traceback (most recent call last)
<ipython-input-8-63d0a20bf555> in <module>
----> 1 fin.fileno()
UnsupportedOperation: fileno
When reading from S3, smart_open does not create any new file descriptors. The file descriptors get created by the OS when opening local files: in the case of reading from S3, we bypass that abstraction level entirely. So, I don't think it's possible to even have a file descriptor in your situation.
If you want to pipe data from S3 into a subprocess, you have to do something like this.
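One way to do this, not necessarily the approach referred to above, is to give the subprocess a real OS pipe with stdin=subprocess.PIPE and copy the stream into it. In the sketch below an io.BytesIO object stands in for the stream returned by smart_open, and the child process is a small Python one-liner that counts the bytes it receives; both are assumptions made so the example runs without S3 access.

```python
import io
import shutil
import subprocess
import sys

# Stand-in for a remote stream: it has read() but no usable file descriptor.
source = io.BytesIO(b"line one\nline two\n")

# Hypothetical child process: reads all of stdin and prints the byte count.
child_code = "import sys; print(len(sys.stdin.buffer.read()))"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,   # the OS creates a pipe, so the child gets a real fd
    stdout=subprocess.PIPE,
)

# Copy the stream into the child's stdin in chunks, then signal end of input.
shutil.copyfileobj(source, proc.stdin)
proc.stdin.close()

# The child only writes a short line after it has read everything, so reading
# stdout afterwards cannot deadlock here. A child that produces a lot of output
# while still reading would need its stdout drained concurrently.
output = proc.stdout.read()
proc.stdout.close()
proc.wait()
```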
from smart_open.