data-engineering-collective / minimalkv

A minimal key-value store interface for binary data (maintained fork of simplekv).

Home Page: https://minimalkv.readthedocs.io/en/latest/index.html

License: Other

Python 98.65% HCL 1.00% Dockerfile 0.05% Shell 0.30%

minimalkv's Introduction

minimal key-value storage api

minimalkv is an API for very basic key-value stores used for small, frequently accessed data or large binary blobs. Its basic interface is easy to implement and it supports a number of backends, including the filesystem, SQLAlchemy, MongoDB, Redis and Amazon S3/Google Storage.

Installation

minimalkv is available on PyPI and can be installed through pip:

pip install minimalkv

or via conda on conda-forge:

conda install -c conda-forge minimalkv

Documentation

The documentation for minimalkv is available at https://minimalkv.readthedocs.io.

License

minimalkv is licensed under the terms of the BSD-3-Clause license.

minimalkv's People

Contributors

amerkel2, crepererum, criemen, damianbarabonkovqc, danilobellini, dependabot[bot], felix-marczinowski-by, fhoehle, fjetter, fmarczin, fuhrysteve, hoffmann, janjagusch, johanols, jtilly, jvrsantacruz, marco-neumann-by, marthakelly, matthias-bach-by, mbr, mganesh1308, mlondschien, sharathpuranik, siboehm, simonbohnen, simonbohnenqc, thomasmarwitzqc, usha-nemani-by, xhochy, zerosteiner

minimalkv's Issues

Handling of colon in Azure KeyValueStores

Hi everyone,
minimalkv currently prohibits colons (":") in keys because the character is not allowed by VALID_KEY_RE_EXTENDED.
Azure stores support colons in their paths, but as far as I know such keys cannot currently be read through minimalkv. That is, if another application writes a file with a colon in its name, you cannot read that file with minimalkv (I'm no expert here, though; maybe there is a way to quote it).

I can create a PR that allows colons for Azure stores only. Would this be OK, or has it been decided not to allow this so that the allowed filenames stay uniform across store classes?
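To make the proposal concrete, here is a minimal sketch of per-store key validation. The two patterns are simplified stand-ins and not minimalkv's actual VALID_KEY_RE_EXTENDED; the names `DEFAULT_KEY_RE`, `AZURE_KEY_RE` and `is_valid_key` are hypothetical.

```python
import re

# Simplified stand-in for the shared key pattern (an assumption, not the
# real VALID_KEY_RE_EXTENDED): letters, digits and a few safe symbols.
DEFAULT_KEY_RE = re.compile(r"[0-9a-zA-Z/._\-]+")

# Hypothetical Azure-only variant that additionally admits ":".
AZURE_KEY_RE = re.compile(r"[0-9a-zA-Z/._\-:]+")


def is_valid_key(key: str, pattern: "re.Pattern[str]" = DEFAULT_KEY_RE) -> bool:
    """Return True if the whole key matches the given pattern."""
    return pattern.fullmatch(key) is not None


key = "logs/2021-01-01T12:00:00.json"
print(is_valid_key(key))                # False: ":" is rejected by default
print(is_valid_key(key, AZURE_KEY_RE))  # True: the Azure variant allows it
```

Keeping the stricter pattern as the default would leave every other store class unchanged; only a store that opts in would accept the wider key set.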

Boto3 Store is untested code

In SimpleKV, the Boto3 store was merged while the CI had been down for months: mbr/simplekv#106

So there was never a successful test run on the CI. I spent an hour on it but couldn't figure out how to get it to connect to a MinIO instance without bigger changes to the code. This is because the client is created implicitly (unlike in the Boto store, where it is created explicitly). I couldn't find a way to point the client to a local endpoint without modifying the global ~/.aws/config file.

The boto store still works, but the boto library hasn't seen any updates since 2018, so we should definitely move to boto3 at some point.
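One possible direction, sketched under assumptions: create the boto3 client explicitly so that tests can point it at a local MinIO endpoint. `boto3.client` accepts `endpoint_url` and credential keyword arguments; the helper names, the endpoint and the credentials below are placeholders, and how the Boto3 store would accept such a client is left open here.

```python
from typing import Any, Dict


def minio_client_kwargs(endpoint_url: str, access_key: str, secret_key: str) -> Dict[str, Any]:
    """Collect keyword arguments for an explicitly configured S3 client."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_client(**kwargs: Any):
    """Create the client; boto3 is imported lazily so it stays optional."""
    import boto3

    return boto3.client(**kwargs)


# Placeholder endpoint and credentials for a local MinIO container.
kwargs = minio_client_kwargs("http://127.0.0.1:9000", "minio", "miniostorage")
# client = make_client(**kwargs)  # needs boto3 and a running MinIO instance
```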

Ambiguous license

The README.rst states that minimalkv is licensed under MIT. However, the LICENSE file itself looks like the BSD-3 license. So which license is minimalkv distributed under? Would you mind making this a bit clearer?

Thank you for a great software package!

`get_store_from_url` fails due to missing import `redis`

Without redis installed, we currently get

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/quantcore/thek/utils.py:698: in inner_f
    return f(*args, **kwargs)
src/quantcore/thek/view.py:134: in __init__
    if self.exists and partition_on is None:
src/quantcore/thek/kartothek_dataset.py:107: in exists
    return DatasetMetadata.exists(uuid=self.uuid, store=self.store())
/opt/conda/envs/quantcore.thek/lib/python3.7/site-packages/minimalkv/_get_store.py:55: in get_store_from_url
    return get_store(**url2dict(url))
/opt/conda/envs/quantcore.thek/lib/python3.7/site-packages/minimalkv/_get_store.py:145: in get_store
    store = create_store(type, params)
/opt/conda/envs/quantcore.thek/lib/python3.7/site-packages/minimalkv/_store_creation.py:23: in create_store
    return _create_store_hfs(type, params)
/opt/conda/envs/quantcore.thek/lib/python3.7/site-packages/minimalkv/_store_creation.py:114: in _create_store_hfs
    from minimalkv._hstores import HFilesystemStore
/opt/conda/envs/quantcore.thek/lib/python3.7/site-packages/minimalkv/_hstores.py:6: in <module>
    from minimalkv.memory.redisstore import RedisStore
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    import re
    from io import BytesIO
    from typing import IO, Iterator, List, Optional, Union
>   from redis import StrictRedis
E   ModuleNotFoundError: No module named 'redis'

when running get_store_from_url in an environment without redis-py installed. This is not surprising, as the import is not placed inside the class or function scope, as is done for GoogleCloudStore, for example. Similarly for the GitCommitStore. The question here, IMO, is how this worked before.
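A minimal sketch of the function-scope import pattern mentioned above, assuming a hypothetical helper `require` and a hypothetical `LazyRedisStore` class (neither is part of minimalkv). The optional dependency is only imported when a store is actually constructed, so importing the module that defines the class does not need redis.

```python
import importlib
from types import ModuleType


def require(module_name: str, hint: str = "") -> ModuleType:
    """Import an optional dependency on first use, with a clearer error."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this store but is not installed. {hint}"
        ) from exc


class LazyRedisStore:
    """Hypothetical store: defining or importing it never touches redis."""

    def __init__(self, **connection_kwargs):
        # The import happens here, not at module level.
        redis = require("redis", "Try: pip install redis")
        self._client = redis.StrictRedis(**connection_kwargs)
```

With this layout, a module that merely references `LazyRedisStore` can be imported in an environment without redis-py; the error only surfaces when a Redis-backed store is requested.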

Add remaining tests to Github Actions

Currently being skipped:

  • azure
  • boto3 (test through minio)
  • mongodb
  • redis
  • sqlalchemy

Delete, because we cannot test it:

  • boto (doesn't work with Python3)

This is how it used to be done in the simplekv Travis configuration:
https://github.com/mbr/simplekv/blob/master/.travis.yml

before_script:
- bash .travis/start_minio.sh
- docker run -p 10000:10000 mcr.microsoft.com/azure-storage/azurite azurite-blob --blobHost 0.0.0.0 &
- docker run -d --name fake-gcs-server -p 4443:4443 fsouza/fake-gcs-server -scheme http
- psql -c 'create database simplekv_test;' -U postgres
- psql -c 'ALTER ROLE travis CONNECTION LIMIT -1;' -U postgres
- mysql -e 'create database simplekv_test;'
- mysql -e 'set global max_connections = 200;'

Support registering new backends

As a user of the minimalkv framework, I want to create and register new backends and use them through the get_store_from_url function without changing the minimalkv library. This is currently not possible, as the extract_params function hard-codes the known storage types:

https://github.com/data-engineering-collective/minimalkv/blob/main/minimalkv/_urls.py#L70-L122

What I'm imagining is a registration function that makes minimalkv aware of the new storage type, similar to how fsspec does it.
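A rough sketch of what such a registration mechanism could look like. None of these names exist in minimalkv; `register_backend`, `create_store_from_url`, the `toymem` scheme and `InMemoryStore` are all made up for illustration.

```python
from typing import Any, Callable, Dict
from urllib.parse import urlsplit

# Maps a URL scheme to a factory that builds a store from the URL.
_BACKENDS: Dict[str, Callable[[str], Any]] = {}


def register_backend(scheme: str, factory: Callable[[str], Any]) -> None:
    """Make a new storage type known under the given URL scheme."""
    _BACKENDS[scheme] = factory


def create_store_from_url(url: str) -> Any:
    """Look up the factory for the URL's scheme and call it."""
    scheme = urlsplit(url).scheme
    try:
        factory = _BACKENDS[scheme]
    except KeyError:
        raise ValueError(f"Unknown storage type: {scheme!r}") from None
    return factory(url)


class InMemoryStore(dict):
    """Toy backend used only for this illustration."""


register_backend("toymem", lambda url: InMemoryStore())
store = create_store_from_url("toymem://")
print(type(store).__name__)  # InMemoryStore
```

The lookup table replaces the hard-coded list of storage types, so a third-party package could call the registration function at import time without any change to the library.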
