Giter Site home page Giter Site logo

cython-lint's Introduction

Hello everybody,

I work at Quansight Labs and spend time on:

  • Polars
  • pandas
  • assorted consluting / training

I've also written some assorted little tools:

  • cython-lint (the world's first Cython linter?)
  • absolufy-imports (superseded by ruff)
  • nbQA (mostly superseded by ruff, but still gets ~100k downloads/month for some reason)
  • auto-walrus (hopefully won't be implemented in ruff)
  • polars-upgrade (the only way is up)
  • polars-xdt (eXtra stuff for Dates 'n' Times)
  • narwhals (🎵singing in the ocean, causing a commotion🎶)

and a tutorial on how to write Polars Plugins: https://marcogorelli.github.io/polars-plugins-tutorial/. Don't be scared

cython-lint's People

Contributors

cclauss avatar dalcinl avatar grawlinson avatar jeremiedbb avatar kmaehashi avatar marcogorelli avatar pre-commit-ci[bot] avatar webknjaz avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

cython-lint's Issues

Support IWYU checks in pyx files

Cython implicitly brings all cimport objects in a pxd file into the namespace of the corresponding pyx file. For example:

# foo.pxd
from lib.stdint cimport uint8_t

# foo.pyx
cdef f(uint8_t x):  # uint8_t is known because it's cimported in the pxd file
    pass

However, this can lead to inconsistent cimports across a package where some types are not cimported in the file that they are used. It would be helpful if cython-lint could raise an error in such cases.

Since I could imagine that approach not being favored by everyone, it would also be acceptable to have a configuration option that allows cython-lint to do the exact opposite and disallow any duplicate cimports between pyx and pxd files. Although not my preference, there may be some users of cython-lint with a preference for that. My first choice would be for cython-lint to be a more opinionated linter (a la black) in this situation, but I think the configuration approach would be an acceptable resolution as well. The truly undesirable case is when some cimports are present in both files while others are not, making it impossible to know a priori where to look.

#no-cython-lint directive is not honored

Version: 0.11 (it would be nice to have cython-lint --version!)

Cython file:

def _read_string(GenericStream st, size_t n):
    cdef object obj = st.read_string(n, &d_ptr, True)  # no-cython-lint

Output:

/tmp/foo.pyx:2:17: 'obj' defined but unused

Output log file

Analogous to flake8, I wanted to output all the possible warnings in a .log file. However, I've seen that this is not possible yet.
My proposal is to add this feature with the same argument as in flake8:

# Call on flake8
flake8 --output-file=flake8.log src

# Proposal for call on cython-lint
cython-lint --output-file=cylint.log src

I believe this is a nice feature to add, as other linters like pylint also allow to create log files by using --output=pylint.log.

Tox lists non-existing envs

Hi, I noticed that tox has docs-related envs enumerated while they aren't actually declared. Could be a good first issue to implement or delete those.

Config to globally ignore errors

Unless I missed something, the current way to ignore errors is to add # no-cython-lint on a line by line basis.
I can become cumbersome on a big project, where we'd like to ignore some errors globally. In flake8 you can pass --ignore=<coma separated list of errors to ignore>. What do you think about adding this functionality to cython-lint ?

As a second step, it would be even nicer to be able to write the config for cython-lint in the setup.cfg file of a project, e.g

[cython-lint]
max-line-length=88
ignore=E24,E121,...

Request: New lint to catch potential memory leaks

Recently I found a bug in a cython code base which caused a memory leak. The issue was the author had written a __cdel__ instead of __del__ or __dealloc__. I think it would be valuable to have a rule that we could potentially opt into, which would allow us to find __cinit__ methods without a matching __dealloc__ implementation. I took a pass and didn't see anything like this documented in the current rule set.

Triple quotes

Does double-quote-cython-strings handle conversion of ''' to """ in docstrings or multiline strings?

Failing to pass in paths results in confusing error message

Hi 👋 , thanks for the great tool! I noticed that failing to pass in a path or paths to cython-lint results in the confusing error,

Traceback (most recent call last):
  File "/Users/nicholas/micromamba/bin/cython-lint", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/nicholas/micromamba/lib/python3.11/site-packages/cython_lint/cython_lint.py", line 898, in main
    config = {k.replace('-', '_'): v for k, v in _get_config(paths).items()}
                                                 ^^^^^^^^^^^^^^^^^^
  File "/Users/nicholas/micromamba/lib/python3.11/site-packages/cython_lint/cython_lint.py", line 852, in _get_config
    root = pathlib.Path(os.path.commonpath(paths))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen posixpath>", line 531, in commonpath
ValueError: commonpath() arg is an empty sequence

I believe a simple remedy for this would be to replace the line

parser.add_argument('paths', nargs='*')

with

parser.add_argument('paths', nargs='+')

This should produce a more helpful error message with users make silly mistakes (like me).

turn off E251 check

example of offending code:

cdef inline int string_to_dts(
    str val,
    npy_datetimestruct* dts,
    NPY_DATETIMEUNIT* out_bestunit,
    int* out_local,
    int* out_tzoffset,
    bint want_exc,
    format: str | None = None,
    bint exact = True,
) except? -1:

this one may not be generally applicable to cython

double-quote-cython-strings producing uncythonizable f-strings

System

Ubuntu 22.04

cython == 3.0.10
cython-lint == 0.16.2
python == 3.12.3

Minimum working example

file test.pyx

def function():
    bar = 10
    a = f"foo [{'bar'}]"
    print(a)

run

double-quote-cython-strings test.pyx

changes test.pyx to

def function():
    bar = 10
    a = f"foo [{"bar"}]"
    print(a)

running

cythonize -i test.pyx

leads to error

Error compiling Cython file:
------------------------------------------------------------
...
def function():
    a = f"foo [{"bar"}]"
                 ^
------------------------------------------------------------

test.pyx:2:17: empty expression not allowed in f-string

Error compiling Cython file:
------------------------------------------------------------
...
def function():
    a = f"foo [{"bar"}]"
                 ^
------------------------------------------------------------

test.pyx:2:17: missing '}' in format string expression

Error compiling Cython file:
------------------------------------------------------------
...
def function():
    a = f"foo [{"bar"}]"
                 ^
------------------------------------------------------------

Note that leaving the cython file untouched (using single quotes inside the f-string) cythonizes correctly. This might also be a cython issue

Deprecation warning when building wheel

setuptools complains about a deprecated parameter, as shown below:

/usr/lib/python3.10/site-packages/setuptools/config/setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
  warnings.warn(msg, warning_class)

Unused imports trigger even if included in __all__.

For flake8, an import is considered "used" if it appears in a module's __all__. It seems that cython-lint is unaware of __all__ when considering which imports are used or unused.

Example:

# file.pyx
import numpy

__all__ = ["numpy"]
$ cython-lint file.pyx
file.pyx:1:8: 'numpy' imported but unused

Rename the above to file.py -- flake8 is clean.
Renaming the above to file.py and remove __all__:

$ flake8 file.py
file.py:1:1: F401 'numpy' imported but unused

Document error: `Only file sources for code supported`

When we have two layers of includes, I receive the error:

RuntimeError('Only file sources for code supported')

Is this valid? If so, would it be possible to document it in the README?

Input:

watershed.pyx:

include "heap_watershed.pxi"

heap_watershed.pxi

include "heap_general.pxi"

heap_general.pxi can be empty.

Methods with same name as an import cause shadowing errors

If I use a method name that matches a name that has been imported, I get "shadows global import" errors from cython-lint. Maybe this is just me, but I don't see this as unusual practice.

Simple failing example:

from libc.math cimport floor

cdef class Foo:
    cdef double f

    def __init__(self, f):
        self.f = f

    cpdef Foo floor(self):
        return Foo(floor(self.f))

    def __repr__(self):
        return f'Foo({self.f})'

Ignore multiple lines

It would be useful to have, in addition to # no-cython-lint to ignore a single line, to have something like

# cython-lint off
pass
# cython-lint on

to ignore multiple lines of code.

[FR] Support a standalone config file `.cython-lint.yml`

Hi, I'd like to request a feature that is a missing bit for a specific use case.

For projects that use mass maintenance techniques similar to https://github.com/jaraco/skeleton (where there's an upstream Git skeleton that is periodically git merged into downstream), it's important for tools to have their own config files that are auto-detected. The reason is that this maintenance approach causes lower number of merge conflicts compared to having many tool configs in one file.

Examples of modern tools supporting separate non-pyproject config locations (even in the post-PEP 518 world) in the TOML format are Hatch, pip-tools, Poetry, Ruff, Towncrier.

Having experience taking part in implementing this for pip-tools, and leading the review/debates, I want to point out one implementation detail that may come up. It's the naming and the presence of the tool section in a non-pyproject config.
We debated the following options:

  1. Only having it in pyproject.toml as [tool.<tool-name>] and bare keys in the .<tool>.toml one.
  2. Having [<tool-name>] in .<tool>.toml.
  3. Having [tool.<tool-name>] in .<tool>.toml.

We chose the third option. This allows copying the config section across files without changes. It also allows co-existing with other tools. It's also consistent, so the config reader doesn't have to implement multiple fallbacks for reading the config.

This I why I propose having .cython-lint.yml, which would take precedence over pyproject.toml (even when it's empty!). The section header for this file would be the same — [tool.cython-lint].

Here are some of our earlier discussions: jazzband/pip-tools#1863 (comment).

P.S. Many tools also have a --config CLI flag for passing arbitrary file paths. It's not necessary for my use case, but some people may want it at some point.

Expected CNameDeclaratorNode or CFuncDeclaratorNode, got Cython.Compiler.Nodes.CPtrDeclaratorNode

So I've added the pre-commit hook in https://github.com/ionelmc/python-hunter but I get this:

Traceback (most recent call last):
  File "/home/ionel/.cache/pre-commit/repoejaiyjhm/py_env-python3.10/bin/cython-lint", line 8, in <module>
    sys.exit(main())
  File "/home/ionel/.cache/pre-commit/repoejaiyjhm/py_env-python3.10/lib/python3.10/site-packages/cython_lint.py", line 355, in main
    ret |= _main(content, path)
  File "/home/ionel/.cache/pre-commit/repoejaiyjhm/py_env-python3.10/lib/python3.10/site-packages/cython_lint.py", line 270, in _main
    names, imported_names, ret = _traverse_file(
  File "/home/ionel/.cache/pre-commit/repoejaiyjhm/py_env-python3.10/lib/python3.10/site-packages/cython_lint.py", line 223, in _traverse_file
    ret |= visit_funcdef(node, filename, lines, violations=violations)
  File "/home/ionel/.cache/pre-commit/repoejaiyjhm/py_env-python3.10/lib/python3.10/site-packages/cython_lint.py", line 106, in visit_funcdef
    err_msg(
  File "/home/ionel/.cache/pre-commit/repoejaiyjhm/py_env-python3.10/lib/python3.10/site-packages/cython_lint.py", line 52, in err_msg
    raise CythonLintError(
cython_lint.CythonLintError: Unexpected error, please report bug. Expected CNameDeclaratorNode or CFuncDeclaratorNode, got <Cython.Compiler.Nodes.CPtrDeclaratorNode object at 0x7f1f8ba45090>
<Cython.Compiler.Nodes.CPtrDeclaratorNode object at 0x7f1f8ba45090>

Ignore unused imports

In flake8, users can ignore F401 to avoid warning about unused imports. I tried cython-lint --ignore F401 but it seems to do nothing. From the source of cython-lint, it doesn't seem like there is a way to disable this check right now.

f'\'{_import[0]}\' imported but unused',

Invalid defined but unused error

I am not 100% sure whether this is a bug or not. Maybe cython-lint knows something about that pointer cast I don't.

Input:

def __getstate__():
    cdef np.intp_t size = cself.tree_buffer.size() * sizeof(ckdtreenode)
    cdef np.ndarray tree = np.asarray(<char[:size]> cself.tree_buffer.data())

    return tree

Output:

/tmp/foo.pyx:2:20: 'size' defined but unused

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.