Comments (2)
Okay, I guess I can't attach...
From a3203122831aadbdd7ec641057c949cd7f65db3b Mon Sep 17 00:00:00 2001
From: Grant Olson
Date: Mon, 19 Jul 2010 17:02:24 -0400
Subject: [PATCH] Find that works for buckets with many thousand items

---
 lib/aws/s3/object.rb |   50 +++++++++++++++++++++-----------------------------
 1 files changed, 21 insertions(+), 29 deletions(-)

diff --git a/lib/aws/s3/object.rb b/lib/aws/s3/object.rb
index bcdf9e1..95b5296 100644
--- a/lib/aws/s3/object.rb
+++ b/lib/aws/s3/object.rb
@@ -143,41 +143,33 @@ module AWS
         # Returns the object whose key is name in the specified bucket. If the specified key does not
         # exist, a NoSuchKey exception will be raised.
         def find(key, bucket = nil)
-          # N.B. This is arguably a hack. From what the current S3 API exposes, when you retrieve a bucket, it
-          # provides a listing of all the files in that bucket (assuming you haven't limited the scope of what it returns).
-          # Each file in the listing contains information about that file. It is from this information that an S3Object is built.
+          # Bucket results come in chunks, 1000 by default.
+          # if the key isn't in the first chunk, we need to look through
+          # subsequent chunks until we find it.
           #
-          # If you know the specific file that you want, S3 allows you to make a get request for that specific file and it returns
-          # the value of that file in its response body. This response body is used to build an S3Object::Value object.
-          # If you want information about that file, you can make a head request and the headers of the response will contain
-          # information about that file. There is no way, though, to say, give me the representation of just this given file the same
-          # way that it would appear in a bucket listing.
-          #
-          # When fetching a bucket, you can provide options which narrow the scope of what files should be returned in that listing.
-          # Of those options, one is marker which is a string and instructs the bucket to return only object's who's key comes after
-          # the specified marker according to alphabetic order. Another option is max-keys which defaults to 1000 but allows you
-          # to dictate how many objects should be returned in the listing. With a combination of marker and max-keys you can
-          # *almost* specify exactly which file you'd like it to return, but marker is not inclusive. In other words, if there is a bucket
-          # which contains three objects who's keys are respectively 'a', 'b' and 'c', then fetching a bucket listing with marker set to 'b' will only
-          # return 'c', not 'b'.
-          #
-          # Given all that, my hack to fetch a bucket with only one specific file, is to set the marker to the result of calling String#previous on
-          # the desired object's key, which functionally makes the key ordered one degree higher than the desired object key according to
-          # alphabetic ordering. This is a hack, but it should work around 99% of the time. I can't think of a scenario where it would return
-          # something incorrect.
-
           # We need to ensure the key doesn't have extended characters but not uri escape it before doing the lookup and comparing since if the object exists,
           # the key on S3 will have been normalized
-          key = key.remove_extended unless key.valid_utf8?
-          bucket = Bucket.find(bucket_name(bucket), :marker => key.previous, :max_keys => 1)
-          # If our heuristic failed, trigger a NoSuchKey exception
-          if (object = bucket.objects.first) && object.key == key
-            object
-          else
-            raise NoSuchKey.new("No such key `#{key}'", bucket)
+          key = key.remove_extended unless key.valid_utf8?
+          bkt_name = bucket_name bucket
+          partial_bucket = Bucket.find(bkt_name)
+
+          while not partial_bucket.nil?
+            last_key = nil
+            partial_bucket.each do |s3object|
+              last_key = s3object.key
+              return s3object if last_key == key.to_s
+            end
+            if partial_bucket.is_truncated
+              partial_bucket = Bucket.find(bkt_name, :marker => last_key)
+            else
+              partial_bucket = nil
+            end
           end
+
+          raise NoSuchKey.new("No such key `#{key}'", bucket)
         end
+
         # Makes a copy of the object with key to copy_key, preserving the ACL of the existing object if the :copy_acl option is true (default false).
         def copy(key, copy_key, bucket = nil, options = {})
           bucket = bucket_name(bucket)
-- 
1.6.5.1
from aws-s3.
I found this article helpful for getting past the 1000-object listing limit: http://jakanapes.com/blog/2010/11/01/s3s-object-limit/
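For anyone who wants to see the paging idea from the patch in isolation, here is a minimal sketch that runs without S3 or the aws-s3 gem. `FakeBucket`, `Page` and `find_key` are made-up names for illustration only; they are not part of aws-s3. The fake listing returns keys in sorted order, only those strictly after the marker, and at most 1000 per page, which is how the patch comments describe bucket listings.

```ruby
# Hypothetical stand-in for one page of a bucket listing (not the aws-s3 API).
Page = Struct.new(:keys, :truncated)

# Hypothetical in-memory bucket that pages its keys like a listing would.
class FakeBucket
  def initialize(keys, page_size = 1000)
    @keys = keys.sort
    @page_size = page_size
  end

  # Return up to page_size keys that sort strictly after marker
  # (the marker itself is excluded, as the patch comments describe).
  def list(marker = nil)
    rest = marker ? @keys.select { |k| k > marker } : @keys
    Page.new(rest.first(@page_size), rest.size > @page_size)
  end
end

# Walk the listing page by page until the key turns up.
# Returns the key if found, nil otherwise.
def find_key(bucket, key)
  marker = nil
  loop do
    page = bucket.list(marker)
    return key if page.keys.include?(key)
    # Stop when there are no more pages (or a page came back empty).
    return nil unless page.truncated && !page.keys.empty?
    # The last key seen becomes the marker for the next request.
    marker = page.keys.last
  end
end

bucket = FakeBucket.new((1..2500).map { |i| format("key-%05d", i) })
puts find_key(bucket, "key-02345").inspect
puts find_key(bucket, "missing").inspect
```

Note that this kind of loop makes one listing request per 1000 keys until it reaches the key, so a lookup near the end of a large bucket costs many requests. The empty-page guard is an addition in this sketch; the patch as posted would pass a nil marker if a truncated page came back empty.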