Comments (8)

MarcoGorelli commented on June 29, 2024

I started this here: #57683

I have a little script to do this, which seems to be working well enough. Before I go too far, just checking people are OK with it.

Pros of keeping docstring substitutions:

  • shorter docstrings within the code
  • helps dedupe some parameter definitions

Pros of removing docstring substitutions:

  • docstrings become simple and easy-to-review
  • drastically faster docstring checking thanks to ruff
  • blazingly fast docstring formatting becomes possible
  • unblocks using darglint rules (which may make their way into ruff, see astral-sh/ruff#458), and any other fast docstring rules in ruff (they already have pydocstyle ones)
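
For anyone unfamiliar with the pattern, here is a minimal sketch of a substituted docstring next to the plain equivalent. The names `use_shared_doc`, `_shared_doc`, `series_head`, `frame_head` and `frame_head_plain` are hypothetical and only for illustration; this is not the actual pandas decorator, just the general idea of filling placeholders in a shared template at import time.

```python
import inspect

# Hypothetical shared template; %(klass)s is filled in per function.
_shared_doc = """
Return the first n rows of the %(klass)s.

Parameters
----------
n : int, default 5
    Number of rows to return.
"""


def use_shared_doc(template, **params):
    """Hypothetical decorator: set the docstring from a shared template."""
    def decorator(func):
        func.__doc__ = template % params
        return func
    return decorator


@use_shared_doc(_shared_doc, klass="Series")
def series_head(rows, n=5):
    return rows[:n]


@use_shared_doc(_shared_doc, klass="DataFrame")
def frame_head(rows, n=5):
    return rows[:n]


def frame_head_plain(rows, n=5):
    """
    Return the first n rows of the DataFrame.

    Parameters
    ----------
    n : int, default 5
        Number of rows to return.
    """
    return rows[:n]


# After normalising indentation, the generated and the plain docstring match.
same = inspect.cleandoc(frame_head.__doc__) == inspect.cleandoc(frame_head_plain.__doc__)
```

The substituted version only exists at runtime, so a static tool reading the source sees the template and the decorator call rather than the final text. The plain version is longer but is exactly what a linter or reviewer reads.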

I don't think ruff replaces all of the validate_docstrings script, and that script can be used with or without docstring substitutions. Removing docstring substitutions at least means we can start removing the checks from validate_docstrings that ruff already covers, and cut down on CI like this.

from pandas.

dontgoto commented on June 29, 2024

Wasn't aware of this issue and wrote #57826 after I introduced a docstring error in my first PR and got annoyed at the speed of the docstring checks. It reduces the runtime of the check_code.sh docstrings down to 2-3 minutes again.

I also have another PR ready to go once #57826 is merged. That second PR brings the validation down to about 10 seconds. Main issue was the way the temp files were created and how pep8 was called via the shell.

So maybe with a 10 second runtime it's not necessary to do any functionality changes to the docstring validation and sharing in the end?

MarcoGorelli commented on June 29, 2024

yeah it needs to generate all these docstrings, write them to temporary files, run validation against them...

tbh I think we should rip all the docstring sharing logic, just have plain old docstrings without substitutions, and let ruff do all the linting and validation in a matter of seconds. this shouldn't be done manually though, maybe I should see if it's automatable

doctests can be validated with pytest --doctest-modules
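
As a rough sketch of what that looks like with a plain docstring: the example below uses the standard library `doctest` module to run the examples of a single function in-process. `add_one` is a made-up function for illustration; `pytest --doctest-modules` collects the same kind of `>>>` examples across whole modules.

```python
import doctest


def add_one(x):
    """
    Add one to a number.

    Examples
    --------
    >>> add_one(1)
    2
    """
    return x + 1


# Find the examples in the docstring and run them without temp files or a shell.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(add_one, name="add_one", globs={"add_one": add_one}):
    runner.run(test)

# summarize() returns a result with .failed and .attempted counts.
results = runner.summarize(verbose=False)
```

Because the docstring is written out in full in the source, nothing has to be generated before the examples can be checked.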

mroeschke commented on June 29, 2024

> tbh I think we should rip all the docstring sharing logic, just have plain old docstrings without substitutions,

Big +1

MarcoGorelli commented on June 29, 2024

cool - no idea how difficult this actually is but I'll give it a go next week and see

WillAyd commented on June 29, 2024

> yeah it needs to generate all these docstrings, write them to temporary files, run validation against them...

I get we have a lot of docstrings but I am surprised that this would take 22 minutes.

Not a hill I'd want to die on but I do think the shared docstrings serve a good purpose in keeping things consistent

mroeschke commented on June 29, 2024

> So maybe with a 10 second runtime it's not necessary to do any functionality changes to the docstring validation and sharing in the end?

I think the docstring sharing is still worth removing to:

  1. Make it easier for new contributors to work on docstrings by reducing complexity
  2. Allow ruff to take over docstring formatting and validation

datapythonista commented on June 29, 2024

I'm happy to move towards removing or reducing the docstring sharing. But based on my experience, getting rid of it fully may take years. If we want to proactively work on duplicating the docstrings that are currently reused, I'd start with the ones with higher complexity and see how that goes, before opening issues to remove all of them.
