Comments (29)
Adding the bug tag since it looks like there is a real bug here as well. Even when credentials are set via environment variables, reading from buckets doesn't always work.
from turicreate.
Update: Bug resolved; waiting on docs.
from turicreate.
@davidswaven Stay tuned - I can't give an exact date but we're putting together a 4.1 release including all current fixes.
from turicreate.
We are hoping for this week.
from turicreate.
Turi Create 4.1 is now available.
from turicreate.
Thanks for reporting this. We will look into this issue and update the instructions on using S3.
from turicreate.
The fix to support s3 regions other than the default is in #90. Leaving this issue open for now, to update the documentation as well.
from turicreate.
Docs updated with #183.
from turicreate.
Hi, any idea when this will be released? Thanks.
from turicreate.
I'm able to make it work from my laptop. Thanks.
Unfortunately, from an EC2 Linux server that has an IAM role, I have no way to access my SFrame stored in an S3 bucket.
If I don't provide any environment variable, I get the following (and expected) error:
KeyError('No access key found. Please set the environment variable AWS_ACCESS_KEY_ID.',)
If I provide an access key and secret key that have access to the bucket, I get the following (unexpected) error:
IOError: s3://{my-bucket}/{my_sframe_folder_path} not found.: iostream error
The same code was working in lib sframe 2.1.
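A quick pre-flight check can confirm which credential variables the Python process actually sees before any S3 call is made. This is a minimal sketch, not part of Turi Create: `AWS_ACCESS_KEY_ID` comes from the error message above, while `AWS_SECRET_ACCESS_KEY` is the standard AWS companion variable and is assumed here to be the one that is read. The helper name `missing_aws_credentials` is hypothetical.

```python
import os

# AWS_ACCESS_KEY_ID is named in the KeyError above. AWS_SECRET_ACCESS_KEY is
# the standard AWS companion variable (assumed, not confirmed by this thread).
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_credentials(environ=None):
    """Return the names of required credential variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_aws_credentials()` with no arguments inspects the current process environment. An empty list only means the variables are set, not that the keys are valid for the bucket.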
from turicreate.
Thanks @davidswaven - sounds like there is still a (now more obscure) bug here. I'll reopen this issue to track that.
from turicreate.
@davidswaven Are you still able to repro this on the latest Turi Create (either 4.3.2 or 5.0b2)? If so, by any chance do you have capital letters in your bucket name? I think we may have issues specific to that case.
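To rule out the capital-letters case (and a malformed URL) before filing a report, a small check like the one below may help. It is only a sketch using the standard library. The helper name `check_s3_url` and its warning strings are hypothetical, and it does not validate full S3 bucket naming rules.

```python
from urllib.parse import urlparse


def check_s3_url(url):
    """Return a list of warnings for an S3 URL (empty if nothing looks wrong)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        return ["scheme is not s3"]
    bucket = parsed.netloc  # netloc keeps the original letter case
    if not bucket:
        # Happens for URLs such as "s3:/bucket/key" with a single slash.
        return ["no bucket found; expected the form s3://bucket/key"]
    if bucket != bucket.lower():
        return ["bucket name contains uppercase letters"]
    return []
```

For example, `check_s3_url("s3:/bucket_name/path/to/sframe")` reports a missing bucket, because with a single slash the bucket name is parsed as part of the path.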
from turicreate.
Closing for now -- please reopen if this has not been fixed.
from turicreate.
We are experiencing the same issue using the latest version of Turi Create within EC2. When trying to access an S3 bucket within a Linux instance on AWS EC2, we receive the following errors:
Traceback (most recent call last):
  File "/root/venv/lib64/python3.6/site-packages/turicreate/data_structures/sframe.py", line 808, in __init__
    self.__proxy__.load_from_sframe_index(url)
  File "turicreate/cython/cy_sframe.pyx", line 71, in turicreate.cython.cy_sframe.UnitySFrameProxy.load_from_sframe_index
  File "turicreate/cython/cy_sframe.pyx", line 74, in turicreate.cython.cy_sframe.UnitySFrameProxy.load_from_sframe_index
OSError: s3:/bucket_name/path/to/sframe not found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "index.py", line 3, in <module>
    ratings = tc.load_sframe("s3:/bucket_name/path/to/sframe")
  File "/root/venv/lib64/python3.6/site-packages/turicreate/data_structures/sframe.py", line 83, in load_sframe
    sf = SFrame(data=filename)
  File "/root/venv/lib64/python3.6/site-packages/turicreate/data_structures/sframe.py", line 812, in __init__
    raise ValueError('Unknown input type: ' + format)
  File "/root/venv/lib64/python3.6/site-packages/turicreate/cython/context.py", line 49, in __exit__
    raise exc_type(exc_value)
OSError: s3:/bucket_name/path/to/sframe not found.

Traceback (most recent call last):
  File "save_model.py", line 23
    model.save("s3://bucket_name/path/to/save/file.model)
  File "/root/venv/lib64/python3.6/site-packages/turicreate/toolkits/_model.py", line 443, in save
    return glconnect.get_unity().save_model(self, _make_internal_url(location))
  File "turicreate/cython/cy_unity.pyx", line 97, in turicreate.cython.cy_unity.UnityGlobalProxy.save_model
  File "turicreate/cython/cy_unity.pyx", line 103, in turicreate.cython.cy_unity.UnityGlobalProxy.save_model
OSError: Unable to create directory structure at s3://id:key:bucket_name/path/to/file.model. Ensure that you have write permission to this location, or try again with a different path.
Despite the error messages, we are able to successfully access our S3 bucket using these credentials with AWS CLI.
from turicreate.
Can you share a small repro script that we can try out and reproduce the issue?
from turicreate.
@srikris. Here is a small script to reproduce.
import turicreate as tc
ratings = tc.SFrame.read_csv("s3://path")
model = tc.recommender.create(ratings, target="rating", verbose=False)
model.save("s3://path")
For the environment, we spun up an Amazon Linux AMI 2018.03.0 (HVM), SSD Volume Type EC2 instance and installed Python 3.6. We created an IAM user that had full S3 access for the credentials.
from turicreate.
@oakesjessica Thanks for reporting this. We have found the bug. We will keep you posted!
from turicreate.
@oakesjessica I think we have identified the issue. Fix is up for PR. Thanks!
from turicreate.
Fixed with #1416.
from turicreate.
@srikris, @znation. Thank you!
from turicreate.
The fix for OSError: Unable to create directory structure is now available in Turi Create 5.3.1.
from turicreate.
@srikris, @znation. Thank you for the updated fix in 5.3.1. However, we are still having issues reading and writing directly to our S3 bucket. The traceback error paths are the same as above but with different errors.
>>> tc.SFrame.read_csv("s3://bucket/to/file.csv")
Traceback (most recent call last):
  File "/root/venv/lib64/python3.6/site-packages/turicreate/data_structures/sframe.py", line 1037, in _read_csv_impl
    errors = proxy.load_from_csvs(internal_url, parsing_config, type_hints)
  File "turicreate/cython/cy_sframe.pyx", line 76, in turicreate.cython.cy_sframe.UnitySFrameProxy.load_from_csvs
  File "turicreate/cython/cy_sframe.pyx", line 84, in turicreate.cython.cy_sframe.UnitySFrameProxy.load_from_csvs
RuntimeError: No files corresponding to the specified path (s3://bucket/to/file.csv).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/venv/lib64/python3.6/site-packages/turicreate/data_structures/sframe.py", line 1504, in read_csv
    **kwargs)[0]
  File "/root/venv/lib64/python3.6/site-packages/turicreate/data_structures/sframe.py", line 1037, in _read_csv_impl
    errors = proxy.load_from_csvs(internal_url, parsing_config, type_hints)
  File "/root/venv/lib64/python3.6/site-packages/turicreate/cython/context.py", line 49, in __exit__
    raise exc_type(exc_value)
RuntimeError: No files corresponding to the specified path (s3://bucket/to/file.csv).

>>> model.save('s3://bucket/to/save_model.model')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/venv/lib64/python3.6/site-packages/turicreate/toolkits/_model.py", line 443, in save
    return glconnect.get_unity().save_model(self, _make_internal_url(location))
  File "turicreate/cython/cy_unity.pyx", line 97, in turicreate.cython.cy_unity.UnityGlobalProxy.save_model
  File "turicreate/cython/cy_unity.pyx", line 103, in turicreate.cython.cy_unity.UnityGlobalProxy.save_model
OSError: Maximum retry time reached
We used the same repro script provided above to test this with 5.3.1. CLI commands such as aws s3 cp s3://bucket/to/file.csv ./ succeed, so it doesn't seem to be our credentials. Is there another hidden issue, or is something missing on our end?
from turicreate.
Hmmm... Can you try adding the following line of code before you perform any S3 access (say, immediately after the import)?
tc.config.set_runtime_config('TURI_FILEIO_INSECURE_SSL_CERTIFICATE_CHECKS', 1)
from turicreate.
Setting that config did allow us to successfully save to our S3 bucket. However, each save took at least 30 minutes to finish. Is there something we can do to speed this up? Using a p2 instance did not seem to help.
Unfortunately, we are still getting the same retrieval error, RuntimeError: No files corresponding to the specified path (s3://bucket/to/file.csv), even though the file does exist.
from turicreate.
The S3 write path could be optimized. I don't think we are taking advantage of parallel uploading capabilities. A workaround is to write it out to local disk and use awscli to upload it.
The read issue is odd though. It is surprising that you can write to the bucket, but not read from it. What region is your bucket in? Does the bucket name have uppercase characters?
Can you help us with some diagnosis steps?
import turicreate as tc
tc.config.set_log_level(2)
print(tc.config.get_server_log_location() + ".0")
# attempt to read the CSV here
A log file will be produced in the location printed by the print statement.
You might need to strip it of S3 path information before attaching it here, or you can email it to me at [email protected]
Thanks!
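Stripping S3 path information from a log before sharing it could be done with something like the sketch below. It is an illustration only: the function names and the placeholder text are hypothetical, and the pattern simply replaces every whitespace-delimited token starting with `s3:/` or `s3://`, which also swallows any trailing punctuation attached to the URL. That includes internal URLs of the form `s3://id:key:bucket_name/...` seen in the errors above.

```python
import re

# Matches s3:/ or s3:// followed by any run of non-whitespace characters.
_S3_URL = re.compile(r"s3:/{1,2}\S+")


def redact_s3_paths(text, placeholder="s3://<redacted>"):
    """Replace every S3 URL in text with a fixed placeholder."""
    return _S3_URL.sub(placeholder, text)


def redact_log_file(src_path, dst_path):
    """Write a redacted copy of the log at src_path to dst_path."""
    with open(src_path, encoding="utf-8", errors="replace") as src:
        cleaned = redact_s3_paths(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(cleaned)
```

It is still worth skimming the redacted copy by hand, since bucket names or keys can appear in a log outside of a URL.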
from turicreate.
@ylow. Cool, I am currently using the workaround you suggested so I'll just keep using that until the upload method is optimized more. Our bucket is in the us-east-1 region and does not contain uppercase characters. Sure, I will email you the log file. Thank you!
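For anyone else hitting slow saves, the local-disk workaround discussed here might look roughly like this. It is a sketch under assumptions: the helper names and the local path are hypothetical, the AWS CLI is assumed to be installed and configured, and the saved model is assumed to be a directory (hence `--recursive`).

```python
import subprocess


def build_upload_command(local_dir, s3_url):
    """Build the AWS CLI command that recursively copies local_dir to s3_url."""
    return ["aws", "s3", "cp", local_dir, s3_url, "--recursive"]


def upload_saved_model(local_dir, s3_url):
    """Upload an already-saved model directory; raises CalledProcessError on failure."""
    subprocess.run(build_upload_command(local_dir, s3_url), check=True)


# Intended use (requires turicreate and a configured AWS CLI):
#   model.save("/tmp/my_model.model")   # hypothetical local path
#   upload_saved_model("/tmp/my_model.model", "s3://bucket/to/save_model.model")
```

Keeping the command construction separate from the `subprocess.run` call makes it easy to print and inspect the exact command before running it.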
from turicreate.
Is it resolved?
from turicreate.
@franz101 - good question.
@davidswaven or @oakesjessica - we recently rewrote much of our S3 code to use AWS's SDK. I suspect this issue is likely now fixed. Please try using the most recent version of TuriCreate and let us know if the issue has been resolved.
from turicreate.
This issue should have been fixed in 6.2. I haven't heard back here, so I'm going to close this issue.
from turicreate.