pywbem / nocasedict

A case-insensitive ordered dictionary for Python

License: GNU Lesser General Public License v2.1

nocasedict's Introduction

nocasedict - A case-insensitive ordered dictionary for Python

Overview

Class NocaseDict is a case-insensitive ordered dictionary that preserves the original lexical case of its keys.

Example:

$ python
>>> from nocasedict import NocaseDict

>>> dict1 = NocaseDict({'Alpha': 1, 'Beta': 2})

>>> dict1['ALPHA']  # Lookup by key is case-insensitive
1

>>> print(dict1)  # Keys are returned with the original lexical case
NocaseDict({'Alpha': 1, 'Beta': 2})

The NocaseDict class supports the functionality of the built-in dict class of Python 3.8 on all Python versions it supports.

Limitation: Functionality added to the dict class in Python 3.9 or later is not yet supported. This affects:

  • d | other - Added in Python 3.9.
  • d |= other - Added in Python 3.9.

The case-insensitivity is achieved by matching any key values as their casefolded values. By default, the casefolding is performed with str.casefold() for unicode string keys and with bytes.lower() for byte string keys. The default casefolding can be overridden with a user-defined casefold method.
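
The matching scheme described above can be illustrated with a minimal sketch. This is not the nocasedict implementation: the class name TinyNocaseDict, its casefold constructor parameter, and the internal layout are assumptions made for illustration, and the default shown here handles only unicode string keys. In this sketch, setting an existing key again replaces the stored original case, which is a choice of the sketch and not a statement about NocaseDict.

```python
class TinyNocaseDict:
    """Hypothetical illustration only; not the nocasedict implementation."""

    def __init__(self, casefold=str.casefold):
        # casefold: callable that maps a key to its case-insensitive form
        self._casefold = casefold
        # Maps casefolded key -> (key in its original case, value)
        self._data = {}

    def __setitem__(self, key, value):
        self._data[self._casefold(key)] = (key, value)

    def __getitem__(self, key):
        return self._data[self._casefold(key)][1]

    def keys(self):
        return [orig_key for orig_key, _ in self._data.values()]


d = TinyNocaseDict()
d['Alpha'] = 1
print(d['ALPHA'])  # 1
print(d.keys())    # ['Alpha']

# A user-defined casefold function (here: plain lowercasing)
d2 = TinyNocaseDict(casefold=str.lower)
d2['Beta'] = 2
print(d2['bETA'])  # 2
```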

Functionality can be added using mixin classes:

  • HashableMixin mixin class: Adds case-insensitive hashability.
  • KeyableByMixin mixin generator function: Adds ability to get the key from an attribute of the value object.

Why yet another case-insensitive dictionary: We found that all previously existing case-insensitive dictionary packages on Pypi either had flaws, were not well maintained, or did not support the Python versions we needed.

Installation

To install the latest released version of the nocasedict package into your active Python environment:

$ pip install nocasedict

This will also install any prerequisite Python packages.

For more details and alternative ways to install, see Installation.

Documentation

Change History

Contributing

For information on how to contribute to the nocasedict project, see Contributing.

License

The nocasedict project is provided under the GNU Lesser General Public License (LGPL) version 2.1, or (at your option) any later version.

nocasedict's People

Contributors

andy-maier

nocasedict's Issues

allow_unnamed_keys or not?

The current NocaseDict class has an attribute allow_unnamed_keys that controls whether it allows None as a key. By default, it does not allow None as a key.

The standard Python dict always allows None as a key.

Pywbem uses this to disallow unnamed keys in most cases, except for CIMInstanceName.keybindings where they are allowed.

Proposal: Remove the allow_unnamed_keys attribute from NocaseDict and always allow None as a key. The places in pywbem where unnamed keys should be disallowed need to be changed to check for that, instead of delegating the check to NocaseDict.

DISCUSSION

Test with latest package levels does not use latest indirect dependents

Actual behavior

The installation of packages is done with pip install --upgrade. That does not upgrade already installed versions of indirect dependents. The Ubuntu used by Travis has quite a number of packages preinstalled in the virtualenv it provides, so these packages stay at versions that are not the latest, if they satisfy the minimum requirements of the other packages.

Expected behavior

Testing with latest package levels should upgrade all packages including indirect dependencies.

This can be achieved by adding the pip option --upgrade-strategy eager.

Execution environment

  • nocasedict version: 1.0.1
  • Python version: any
  • Operating System (type+version): any

Use OrderedDict only before py37

Starting with Python 3.7, the standard dict is guaranteed to be ordered, so the use of OrderedDict can be replaced by the standard dict on py>=3.7.
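
A minimal sketch of such a version-dependent choice, assuming a module-level alias (the name ODict is hypothetical and not taken from the nocasedict source):

```python
import sys
from collections import OrderedDict

# From Python 3.7 on, the insertion order of dict is a language guarantee
if sys.version_info >= (3, 7):
    ODict = dict
else:
    ODict = OrderedDict

d = ODict()
d['b'] = 1
d['a'] = 2
print(list(d.keys()))  # ['b', 'a']
```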

Testcases that specify no expected warning tolerate warnings

Actual behavior

Testcases that specify no expected warning tolerate warnings that occur nevertheless.

Expected behavior

Testcases that specify no expected warning should verify that no warning occurs and fail otherwise.
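
One way to implement such a check is to record all warnings during the call and fail if any were recorded. The following is a sketch using only the standard warnings module; the helper name call_expecting_no_warning is hypothetical and not part of the nocasedict test code.

```python
import warnings


def call_expecting_no_warning(func, *args, **kwargs):
    """Call func and raise AssertionError if it issues any warning."""
    with warnings.catch_warnings(record=True) as caught:
        # Make sure no warning is suppressed by an existing filter
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    if caught:
        messages = [str(w.message) for w in caught]
        raise AssertionError(
            "Unexpected warning(s): {}".format(messages))
    return result
```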

Execution environment

  • nocasedict version: 1.0.0
  • Python version: any
  • Operating System (type+version): any

Make NocaseDict derived from dict

Currently, NocaseDict is derived from object, but it should ideally be derived from dict. The issue is that NocaseDict, at least in its current implementation, should not inherit the dict data. This can probably be handled though with some changes in the implementation.

DISCUSSION: Do we want NocaseDict to be derived from dict, for type checking?

COMMENT/KS: It is confusing when it is not inherited and probably not very pythonic. Thought it over last night and we should not be calling it a dict if it does not inherit from dict.

raise AttributeError for missing lower() method

The NocaseDict class currently raises TypeError if the dict keys do not have a lower() method. The nocaselist.NocaseList class raises AttributeError in that case, which is the better solution because in some cases TypeError is also raised for other issues, and this allows distinguishing the cases better.

Change to raise AttributeError instead, for missing lower() method on key objects.

Issues with UserWarning about unpreserved order of items in input

Actual behavior

The UserWarning that is issued to indicate that the order of items in input to the NocaseDict initialization or to its update() method is not preserved has two issues:

  1. The source code location is incorrect when issued as a result of NocaseDict object initialization: It does not reference the user code, but the NocaseDict init method.

  2. When the input is a standard dict, the warning is issued even when that dict contains only one item.

Expected behavior

  1. Correct location in user code.

  2. Warning is issued only when the dict contains more than one item.
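
Both expectations could be met along the following lines. This is a sketch with a hypothetical stand-in class (OrderWarningDict), not the NocaseDict code. It assumes that initialization and update() share one internal helper, so that the same stacklevel value points at the user code in both cases.

```python
import warnings


class OrderWarningDict:
    """Hypothetical stand-in for a dict class that warns on unordered input."""

    def __init__(self, *args):
        self._data = {}
        # Call chain: user code -> __init__ -> _update -> warnings.warn
        self._update(args, stacklevel=3)

    def update(self, *args):
        # Call chain: user code -> update -> _update -> warnings.warn
        self._update(args, stacklevel=3)

    def _update(self, args, stacklevel):
        for arg in args:
            # With at most one item there is no order that could be lost
            if isinstance(arg, dict) and len(arg) > 1:
                warnings.warn(
                    "Order of items in input dict is not preserved",
                    UserWarning, stacklevel=stacklevel)
            self._data.update(arg)
```

With stacklevel=1 the warning would reference the line in _update(), and with stacklevel=2 the calling method of the class; stacklevel=3 attributes it to the caller of that method.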

Execution environment

  • nocasedict version: 1.0.0
  • Python version: any
  • Operating System (type+version): any

Consider using `casefold` instead of `lower`

I am comparing various implementations of case-insensitive dictionaries in Python, and have found that this one uses str.lower() rather than str.casefold(). See e.g. pydicti, which uses casefold() if available, and falls back to lower() if not. If both fail, the key is used verbatim.

The advantage of casefold is that it has better Unicode support than lower.
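
A small example of the difference, using the German sharp s, for which casefold() produces "ss" while lower() leaves the character unchanged:

```python
word_upper = 'STRASSE'
word_eszett = 'straße'  # contains the German sharp s

print(word_upper.lower() == word_eszett.lower())        # False
print(word_upper.casefold() == word_eszett.casefold())  # True
print(word_eszett.casefold())                           # strasse
```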

Use of obj.name in NocaseDict() init

The current NocaseDict init method supports an iterable of pywbem CIM objects whose name attribute is used as dict key. That ability is very convenient for pywbem and needs to be retained.

Options are:

  • NocaseDict continues to support the functionality, but in a cleaned up way. It would be an additional functionality on top of what the standard Python dict supports.
  • NocaseDict no longer supports it, and the pywbem code handles it. That is possible because NocaseDict objects so far were not created by pywbem users.

COMMENT/ks: Actually, it is used in pywbemcli, but only in tests, and the tests can be changed. It is used to generate keybindings for test_instances.py and scopes for test_qualdecl.py.

DISCUSSION

Minor issues with Python 3 Sphinx theme

After switching to the Python 3 theme for the docs, there are some minor issues with it:

  • It shows an empty "Copyright" text at the bottom right end of each page.
  • It shows a link "Python" to www.python.org at the top left and bottom left end of each page.
  • There is a reddish background color on some areas, e.g. on the Parameter or Raises sections of method/function descriptions.

Use Python 3 theme in Sphinx generated docs

We are currently using the "classic" Sphinx theme. This theme is clearly associated with Python 2.

Since Python 2 went EOL, we should change to use the Python 3 standard theme ("python-docs-theme" package on Pypi).

Python 2 vs Python 3 semantics of keys() etc, iter..(), view..() methods

The current implementation of keys() etc. and iter..() is consistent with Python 2, but misses the view..() methods added in Python 2.7.

In Python 3, the iter..() and view..() methods were dropped and the keys() etc. methods return a view.

Proposal: Implement the behavior of the standard dict class for each of the major Python versions. If a common behavior across the major Python versions is needed, users can use the six package.

Stronger typing

It would be nice to have stronger typing when using the package. Currently, Mypy does not find any types when importing from nocasedict.

A good first step would be to add the py.typed marker, and ensure that all function arguments and return values have type hints. The project could benefit from Mypy linting itself, but that is not required.

Implement fromkeys() from Python 2 dict

The standard dict class in Python 2 supports a method fromkeys(). That method is not supported in Python 3.

  fromkeys(...)
      dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.
      v defaults to None.

Implement the method when on Python 2.

Validate tests against standard dict

In order to verify that the error handling of NocaseDict is compatible with the standard dict, support could be added to run the NocaseDict tests against a standard dict.

Hashable or not

The current NocaseDict code supports a __hash__() method that calculates a hash value from the set of tuples of lower cased dict key and dict item value. That makes it a hashable object. Hashable objects can be used as members in sets or as keys in mappings (dicts).

NocaseDict objects are mutable, but the hash value is calculated under the assumption that the object does not change its value while used in a set or as a key, i.e. it is calculated just once for a particular object. For this reason, the mutable standard types in Python (e.g. dict) are not hashable. Changing the value while in a set or used as a key can have strange effects (there are forum threads full of that).

Right now, we document that the NocaseDict objects that are used in a set or as a mapping key must not change while being used that way. However, there is no enforcement of that.
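
The kind of strange effect meant here can be reproduced with any mutable object whose hash depends on its value. The class HashableBox below is a hypothetical stand-in, not NocaseDict:

```python
class HashableBox:
    """Hypothetical mutable object whose hash depends on its value."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, HashableBox) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


box = HashableBox(1)
members = {box}
print(box in members)  # True

# Changing the value changes the hash, but the set still has the object
# stored under the old hash value
box.value = 2
print(box in members)             # False
print(list(members)[0] is box)    # True: the object is still in the set
```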

Pywbem is using NocaseDict objects in sets (I believe, need to double check).

DISCUSSION: Should we continue supporting NocaseDict objects being hashable, or should we remove that functionality because it is considered too dangerous for the general user?

Byte-string keys no longer work

Actual behavior

Creating a NocaseDict with byte-string keys generates an error since #122 was merged (version 1.0.5):

from nocasedict import NocaseDict
NocaseDict([(b'MyKey', b'myValue')])

>>> AttributeError: 'bytes' object has no attribute 'casefold'

Expected behavior

Byte-string keys worked fine in versions 1.0.4 and lower due to use of .lower() (which exists on both byte-strings and unicode strings). Either the SemVer should be bumped, and the API changed to generate an error for byte-string keys, or a compatibility wrapper should be added.
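
A compatibility wrapper of the kind suggested here could look as follows. This is a sketch; the function name casefold_key is hypothetical and not part of the nocasedict API. It relies on the fact that bytes objects have a lower() method but no casefold() method.

```python
def casefold_key(key):
    """Hypothetical helper: casefold unicode strings, lowercase byte strings."""
    if isinstance(key, bytes):
        return key.lower()  # bytes has no casefold() method
    return key.casefold()


print(casefold_key('MyKey'))   # mykey
print(casefold_key(b'MyKey'))  # b'mykey'
```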

Execution environment

  • nocasedict version: 1.1.0
  • Python version: 3.11
  • Operating System (type+version): Linux

Clarify what the rules are for implementing __sizeof__()

The sys.getsizeof(obj) function returns the memory size of obj in Bytes. It does that by calling __sizeof__() on the object and adding the GC overhead if the object is GC-managed.

The rules for whether a user-defined class like NocaseDict has to implement __sizeof__() are not documented in the Python docs.

Here is a comparison between dict and NocaseDict that suggests that not implementing __sizeof__() is incorrect. On the other hand, dictionaries are referencing their key and value objects, so if multiple dictionaries reference the same objects it would not make too much sense to attribute their sizes to the dictionaries.

import sys
from nocasedict import NocaseDict
nd = NocaseDict()
d = dict()
print("len dict NocaseDict")
for x in range(0, 25):
    print(x, sys.getsizeof(d), sys.getsizeof(nd))
    key = 'key' + str(x)
    nd[key] = x
    d[key] = x

resulting in:

len dict NocaseDict
0 232 48
1 232 48
2 232 48
3 232 48
4 232 48
5 232 48
6 360 48
7 360 48
8 360 48
9 360 48
10 360 48
11 640 48
12 640 48
13 640 48
14 640 48
15 640 48
16 640 48
17 640 48
18 640 48
19 640 48
20 640 48
21 640 48
22 1176 48
23 1176 48
24 1176 48
25 1176 48
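
If the decision were to include the internal container in the reported size, a __sizeof__() method could be sketched as below. The class SizedWrapper is hypothetical and only assumes that the data is held in an internal dict; whether the key and value objects should be counted as well is exactly the open question, and this sketch does not count them.

```python
import sys


class SizedWrapper:
    """Hypothetical dict wrapper; not the NocaseDict implementation."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = value

    def __sizeof__(self):
        # Own basic size plus the size of the internal dict container.
        # The key and value objects themselves are not counted.
        return object.__sizeof__(self) + sys.getsizeof(self._data)


w = SizedWrapper()
size_empty = sys.getsizeof(w)
for x in range(25):
    w['key' + str(x)] = x
print(sys.getsizeof(w) > size_empty)  # True
```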

Release as 0.5.0 instead of 1.0.0, and dev status beta.

In order to better be able to fix yet unknown issues and in case the use of NocaseDict in pywbem reveals a need for changes, this first release should be done by going back to version number 0.5.0 and dev status "5 - Beta", I think.

DISCUSSION

AttributeError: 'Path' object has no attribute 'write_bytes' from virtualenv on py34

Actual behavior

See Travis run: https://travis-ci.org/github/pywbem/nocasedict/jobs/732205320

Testcase test1: Pip install from repo root directory: /home/travis/build/pywbem/nocasedict
Saving location of current virtualenv: /home/travis/virtualenv/python3.4.8
Before creating virtualenv: /home/travis/build/pywbem/nocasedict/tmp_installtest/virtualenvs/nocasedict_test_test1
Python version: Python 3.4.8 from /home/travis/virtualenv/python3.4.8/bin/python
Pip version: pip 19.1.1 from /home/travis/virtualenv/python3.4.8/lib/python3.4/site-packages/pip (python 3.4)
Creating virtualenv: /home/travis/build/pywbem/nocasedict/tmp_installtest/virtualenvs/nocasedict_test_test1
Error: Command failed with rc=1: virtualenv -p /home/travis/virtualenv/python3.4.8/bin/python  /home/travis/build/pywbem/nocasedict/tmp_installtest/virtualenvs/nocasedict_test_test1, output follows:
AttributeError: 'Path' object has no attribute 'write_bytes'
Makefile:585: recipe for target 'installtest' failed

Expected behavior

Success

Execution environment

  • nocasedict version: 1.0.0
  • Python version: 3.4
  • Operating System (type+version): Ubuntu (Travis)

Optimize pip backtracking in "make develop"

"make develop" causes pip to backtrack through package versions, down to:

Using cached tox-4.0.1-py3-none-any.whl.metadata (4.9 kB)
Using cached sphinx-6.0.0-py3-none-any.whl.metadata (6.2 kB)
Using cached pyproject_api-1.5.1-py3-none-any.whl.metadata (2.6 kB)

Docs and README review

README:

  • Add examples for hashable and keyableby.
  • Add a statement about compatibility with standard dict.

Docs:

  • Add example for case-insensitive access to docstring of NocaseDict class.
  • Improve wording "preserved order".
  • Add a statement about compatibility with standard dict.

Remove TODO in lt/gt/le/ge comparisons

The current implementation raises TypeError for these comparisons. That is consistent with dict, and also the message is consistent.

However, there are still TODOs asking for decision on whether to support the comparisons.
These TODOs should be removed.

2.0.1: pytest fails in almost all units with `TypeError: exceptions must be derived from Warning, not <class 'NoneType'>`

I'm packaging your module as an rpm package so I'm using the typical PEP517 based build, install and test cycle used on building packages from non-root account.

  • python3 -sBm build -w --no-isolation
  • because I'm calling build with --no-isolation, only locally installed modules are used during all steps
  • install .whl file in </install/prefix> using installer module
  • run pytest with $PYTHONPATH pointing to sitearch and sitelib inside </install/prefix>
  • build is performed in env which is cut off from access to the public network (pytest is executed with -m "not network")
  • Python version: 3.9.18
  • Operating System (type+version): Linux x86/64
  • pytest 8.1.1

Please let me know if you need more details or want me to perform some diagnostics.

Implement has_key() from Python 2 dict

The standard dict class in Python 2 supports a method has_key(key). That method is not supported in Python 3.

  has_key(...)
      D.has_key(k) -> True if D has a key k, else False

Implement the method when on Python 2.

Workflow based upload to Pypi

Currently, the project still uses "make upload" to upload to Pypi. The disadvantage is that this requires a personal ID on Pypi. Change this to workflow-based publishing, as is done in the pywbem repo.
