
tomhoag commented on June 21, 2024

I just went through the AWS docs on this recently for another project of mine.

Can you tell me more about how the objects are being created and which buckets they are being stored in?

With AWS S3, you can set up expiration and transition rules on a bucket. The rules use object prefixes or tags to determine which objects should be acted upon. The rules are time-based, keyed off the object creation time. (It gets a bit more involved if the objects are versioned.)

If the boto code is creating buckets and putting objects into them, it would probably be best to take a closer look at the AWS S3 Python library to see if it supports S3 rule creation. If there is a small number of static buckets that are used day in and day out for storing web scrapings, it might be easier to use the AWS console to create the rules.
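For what it's worth, here is a minimal sketch of creating an expiration rule from Python. It assumes boto3 (the current AWS SDK for Python) rather than the older boto package, and the bucket name, the `scrapes/` prefix, the 30-day window, and both helper function names are made up for illustration. Note that `put_bucket_lifecycle_configuration` replaces the bucket's whole lifecycle configuration, so any existing rules would need to be included in the list.

```python
def build_expiration_rule(prefix, days, rule_id=None):
    """Return a lifecycle rule dict that expires objects under `prefix` after `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        # Hypothetical naming scheme for the rule ID; any unique string works.
        "ID": rule_id or f"expire-{prefix.strip('/').replace('/', '-')}-{days}d",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


def apply_lifecycle_rules(bucket, rules):
    """Replace the bucket's lifecycle configuration with `rules` (needs AWS credentials)."""
    import boto3  # imported here so the rule builder works without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )


if __name__ == "__main__":
    rule = build_expiration_rule("scrapes/", days=30)
    print(rule)
    # Uncomment to apply against a real bucket (hypothetical name):
    # apply_lifecycle_rules("example-scrape-bucket", [rule])
```

Keeping the rule builder separate from the AWS call means the rule can be inspected or unit tested without touching a real bucket.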

In the latter case, the only possible change would be standardizing the object prefixes/tags that boto is using so that the rules don't have to be overly complicated.
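As a rough illustration of what standardized prefixes could look like, the sketch below builds every object key through one helper so that a single prefix-based rule covers all scraped objects. The `scrapes/<source>/<date>/` layout, the function name, and the example source name are assumptions, not something the project currently does.

```python
from datetime import datetime, timezone

# Hypothetical top-level prefix that a lifecycle rule could filter on.
SCRAPE_PREFIX = "scrapes"


def build_object_key(source, filename, when=None):
    """Return a key like 'scrapes/<source>/YYYY/MM/DD/<filename>'."""
    if not source or "/" in source:
        raise ValueError("source must be a non-empty name without '/'")
    when = when or datetime.now(timezone.utc)
    return f"{SCRAPE_PREFIX}/{source}/{when:%Y/%m/%d}/{filename}"


if __name__ == "__main__":
    print(build_object_key("example-site", "listing.html"))
```

With every upload going through a helper like this, one rule with the prefix `scrapes/` would match everything, and a narrower rule such as `scrapes/example-site/` stays possible.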

One other thought: rules can also be used to transition S3 objects into low-cost, slow-access AWS Glacier storage. If there's any thought that the scraped data might someday be useful, it may be worthwhile to transition it to Glacier before deleting it from S3.
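A transition-then-delete rule might look roughly like the sketch below. It only builds the rule dictionary in the shape boto3's `put_bucket_lifecycle_configuration` accepts; the function name, the prefix, and the 30-day and 365-day values are placeholders chosen for illustration.

```python
def build_archive_then_expire_rule(prefix, archive_after_days, expire_after_days):
    """Return a rule that moves objects to Glacier, then deletes them later."""
    if not 0 <= archive_after_days < expire_after_days:
        raise ValueError("archive_after_days must be >= 0 and less than expire_after_days")
    return {
        # Hypothetical ID scheme; any unique string works.
        "ID": f"archive-{archive_after_days}d-expire-{expire_after_days}d",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Move matching objects to the GLACIER storage class first...
        "Transitions": [{"Days": archive_after_days, "StorageClass": "GLACIER"}],
        # ...then delete them once they reach the expiration age.
        "Expiration": {"Days": expire_after_days},
    }


if __name__ == "__main__":
    print(build_archive_then_expire_rule("scrapes/", 30, 365))
```

Both ages are counted from object creation, so the expiration age has to be larger than the transition age for the archive step to have any effect.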

from home-data-gallery.
