Giter Site home page Giter Site logo

osarchiver's Introduction

OSArchiver: OpenStack databases archiver

OSArchiver is a python package that aims to archive and remove soft deleted data from OpenStack databases. The package is shiped with a main script called osarchiver that reads a configuration file and run the archivers.

Philosophy

  • OSArchiver doesn't have any knowledge of Openstack business objects
  • OSArchiver purely relies on the common way of how OpenStack marks data as deleted by setting the column 'deleted_at' to a datetime. It means that a row is archivable/removable if the 'deleted_at' column is not NULL

Limitations

  • Support Mysql/MariaDB as db backend.
  • python >= 3.5

Design

OSArchiver reads an INI configuration file in which you can define:

  • archivers: a section that hold one source and a non mandatory list of destinations
  • sources: a section that define a source of where the data should be read (basically the OS DB)
  • destinations: a section that define where the data should be archived

How does it works:

                                   .----------.
        .--------------------------| Archiver |-----------------------------.
        |                          '----------'                             |
        |                                                                   |
        |                                                                   |
        |                                                                   |
        v                        _______________                            v
   .--------.                    \              \                    .-------------.
   | Source |-------------------->) ARCHIVE DATA )------------------>| Desinations |
   '--------'                    /______________/                    '-------------'
        |                                |                                  |
        |                                |                                  |
        |                                |                                  |
        |                                |                                  |
        |                                |                                  |
        |                                v                                  |
        |                  .--------------------------.                     |
        v                 ( No error and delete_data=1 )                    |
                           '--------------------------'                     |
    _.-----._                            |                    _.-----._     |
  .-         -.                          |                  .-         -.   |   ___   
  |-_       _-|                          |                  |-_       _-|   |  |   |\ 
  |  ~-----~  |                          |                  |  ~-----~  |<--'->|   ' ___   
  |           |                          |                  |           |      | SQL|   |\ 
  `._       _.'                          |                  `._       _.'      |____|   '-|---.
     "-----"                             |                     "-----"              | CSV |   |
  OpenStack DB                           v                  Archiving DB            |_____|   |
        ^                        _______________                                              v
        |                        \              \                                 .-----------------------.
        '-------------------------) DELETE DATA  )                               ( remote_store configured )
                                 /______________/                                 '-----------------------'
                                                                                              |
                                                                                              v
                                                                                         __________ 
                                                                                        [_|||||||_°]
                                                                                        [_|||||||_°]
                                                                                        [_|||||||_°]

                                                                                 Remote Storage (Swift, ...)

Installation

git clone https://github.com/ovh/osarchiver.git
cd osarchiver
pip install -r requirements.txt
pip setup.py install

osarchiver script

# osarchiver --help
usage: osarchiver [-h] --config CONFIG [--log-file LOG_FILE]
                  [--log-level {info,warn,error,debug}] [--debug] [--dry-run]

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       Configuration file to read
  --log-file LOG_FILE   Append log to the specified file
  --log-level {info,warn,error,debug}
                        Set log level
  --debug               Enable debug mode
  --dry-run             Display what would be done without really deleting or
                        writing data

Configuration

The configuation is an INI file containing several sections. You configure your differents archivers in this configuration file. An example is available at the root of the repository.

DEFAULT section:

  • Drescription: default section that define default/fallback value for options
  • Format [DEFAULT]
  • configuration parameters: all the parameters of archiver, source, destination and backend section can be added in this section, those will be the fallback value if the value is not set in a section.

Archiver section:

  • Description: defines where to read data and where to archive them and/or delete.
  • Format [archiver:name]
  • configuration parameters:
    • src: name of the src section
    • dst: comma separated list of destination section names
    • enable: 1 or 0, if set to 0 the archiver is ignored and not run

Example:

[archiver:My_Archiver]
src: os_prod
dst: file, db

[src:os_prod]
...

[dst:file]
...

[dst:db]
....

Source section:

  • Description: defines where the OpenStack database are. It supports for now one backend (db) but it may be easily extended
  • Format [src:name]
  • configuration parameters:
    • backend*: the name of backend to use, only db is supported
    • retention: 12 MONTH
    • archive_data: 0 or 1 if set to 1 expect a dest to archive the data else won't run the archiving step just the delete step.
    • delete_data: 0 or 1 if set to 1 will run the delete step. If the archive step fails the delete step is not run to prevent loose of data.
    • backend specific options

Destination section:

  • Description: defines where the data should be written. It supports for now two backends (db for datatabase and file [csv, sql]) and may be extended
  • Format [dst:name]
  • configuration parameters:
    • backend*: the name of backend to use, db or file
    • backend specific options

Backends options:

db

  • Description: is the database (mysql/mariadb) backend
  • options:
    • host: DB host to connect to
    • port: port of MariaDB server is running on
    • user: login of MariaDB server to connect with
    • password: password of user
    • delete_limit: apply a LIMIT to DELETE statement
    • select_limit: apply a LIMIT to SELECT statement
    • bulk_insert: data are inserted in DB every builk_insert rows
    • deleted_column: name of column that holds the date of soft delete, is also used to filter table to archive, it means that the table must have the deleted_column to be archived
    • where: the literal SQL where applied to the select statement Ex: where=${deleted_column} <= SUBDATE(NOW(), INTERVAL ${retention})
    • retention: how long time of data to keep in database (SQL format: 12 MONTH, 1 DAY, etc..)
    • excluded_databases: comma, cariage return or semicolon separated regexp of DB to exclude when specfiying '*' as database. The following DB are akways ignored: 'mysql', 'performance_schema', 'information_schema'
    • excluded_tables: comma, cariage return or semicolon separated regexp of DB to exclude when specifying '' as table. Ex: shadow_.,.*_archived
    • db_suffix: a non mendatory suffix to apply to the archiving DB. The default suffix '_archive' is applied if you archive on same host than source without setting a db_suffix or table_suffix (avoid reading and writing on the same db.table)
    • table_suffix: apply a suffix to the archinving table if specified

file

  • Description: is the file archiving destination type, it writes SQL data in a file using one or several formats (supported: SQL, CSV)
    • directory: the directory path where to archive data. You may use the {date} keyword to append automaticaly the date to the directory path. (/backup/archive_{date})
    • formats: a comma, semicolon or cariage return separated list that define the format in witch archive the data (csv, sql)

You've developed a new cool feature ? Fixed an annoying bug ? We'd be happy

to hear from you !

Have a look in CONTRIBUTING.md

Related links

License

See https://github.com/ovh/osarchiver/blob/master/LICENSE

osarchiver's People

Contributors

arnaudmorin avatar pslestang avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.