Comments (2)

RobinL commented on July 16, 2024

How do we feel about the following API?

    from etl_manager.meta import TableMeta, get_existing_database_from_glue_catalogue

    # Note I am not going to attempt to read current tables from Glue and create table objects
    db = get_existing_database_from_glue_catalogue('my_database')

    t = TableMeta(name="table1", location="somewhere")
    t.add_column(name="employee_id2", type="character", description="a new description")

    db.add_table(t)

    # Will not replace existing tables unless overwrite is set to True
    db.append_tables_to_glue_database(overwrite=False)

isichei commented on July 16, 2024

Yeah, fine by me. On top of that, this should be used to fix #117. I'd imagine you could have something like:

    def create_glue_database(self, delete_if_exists=False):
        """
        Creates a database in Glue based on the database object calling this method.
        By default, will error out if the database exists, unless delete_if_exists is set to True (default is False).
        """

        if delete_if_exists:
            self.delete_glue_database()

        db = get_existing_database_from_glue_catalogue(self.name)
        if db:
            existing_tables = db._tables
        else:
            db = {"DatabaseInput": {"Description": self.description, "Name": self.name}}
            _glue_client.create_database(**db)
            existing_tables = []

        # Only create tables that are not already present in the Glue database
        for tab in [t for t in self._tables if t not in existing_tables]:
            glue_table_def = tab.glue_table_definition(self.s3_database_path)
            _glue_client.create_table(DatabaseName=self.name, TableInput=glue_table_def)

There are some issues with the above (the indentation, probably, for one). But we would probably want to parameterise the function so that it updates only new tables, a given list of tables, or all of them. Anyway, I thought I'd add this as it will define what is returned from get_existing_database_from_glue_catalogue.
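To make the parameterisation idea a bit more concrete, here is a rough sketch of how the "new tables only / a given list / all of them" choice could be decided. It works purely on table names and does not touch Glue; the function name `select_tables_to_write` and its `only` parameter are made up for illustration and are not part of etl_manager, while `overwrite` just mirrors the flag proposed earlier in this thread.

```python
from typing import Iterable, List, Optional


def select_tables_to_write(
    defined: Iterable[str],
    existing: Iterable[str],
    only: Optional[Iterable[str]] = None,
    overwrite: bool = False,
) -> List[str]:
    """Hypothetical helper: pick which table names should be written.

    defined   -- table names held by the local database object
    existing  -- table names already present in the Glue database
    only      -- optional explicit subset; None means consider every defined table
    overwrite -- if False, tables that already exist in Glue are skipped
    """
    defined = list(defined)
    existing_set = set(existing)

    if only is None:
        candidates = defined
    else:
        only_set = set(only)
        unknown = only_set - set(defined)
        if unknown:
            raise ValueError(f"Tables not defined in this database: {sorted(unknown)}")
        # Keep the order in which the tables were defined
        candidates = [name for name in defined if name in only_set]

    if overwrite:
        return candidates
    return [name for name in candidates if name not in existing_set]


defined = ["table1", "table2", "table3"]
existing = ["table1"]

print(select_tables_to_write(defined, existing))                  # ['table2', 'table3']
print(select_tables_to_write(defined, existing, overwrite=True))  # ['table1', 'table2', 'table3']
print(select_tables_to_write(defined, existing, only=["table1", "table3"]))  # ['table3']
```

Comparing by name rather than by table object sidesteps the question of whether two table objects compare equal, but it does assume table names are unique within a database.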
