Comments (8)
I started this here: #57683
I have a little script to do this, which seems to be working well enough. Before I go too far, I'm just checking that people are OK with it.
Pros of keeping docstring substitutions:
- shorter docstrings within the code
- helps dedupe some parameter definitions
Pros of removing docstring substitutions:
- docstrings become simple and easy-to-review
- drastically faster docstring checking thanks to ruff
- blazingly fast docstring formatting becomes possible
- unblocks using darglint rules (which may make their way into ruff!) astral-sh/ruff#458, and any other fast docstring rules in ruff (they already have pydocstyle ones)
I don't think ruff replaces all of the validate_docstrings script, but that script can be used with or without docstring substitutions. But removing docstring substitutions at least means we can start removing things from validate_docstrings which are covered by ruff and cut down on CI like this
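For readers unfamiliar with the trade-off listed above, here is a minimal sketch of what a docstring substitution looks like next to its plain equivalent. The `substitute` decorator, `AXIS_DOC`, and both functions are hypothetical names made up for illustration; this is not the decorator pandas uses, just the same idea in a few lines.

```python
import inspect

# Hypothetical shared parameter description, reused by several functions.
AXIS_DOC = "axis : int, default 0\n    Axis along which to operate."


def substitute(**params):
    """Hypothetical decorator: fill ``{name}`` placeholders in a docstring."""

    def decorator(func):
        # Dedent first so the substituted text lines up with the template.
        func.__doc__ = inspect.cleandoc(func.__doc__).format(**params)
        return func

    return decorator


@substitute(axis=AXIS_DOC)
def total_shared(values, axis=0):
    """
    Return the sum of the values.

    Parameters
    ----------
    {axis}
    """
    return sum(values)


def total_plain(values, axis=0):
    """
    Return the sum of the values.

    Parameters
    ----------
    axis : int, default 0
        Axis along which to operate.
    """
    return sum(values)


# The rendered text is the same; only what a reviewer sees in the source differs.
same = inspect.getdoc(total_shared) == inspect.getdoc(total_plain)
```

The shared version is shorter in the source, but the full text only exists after the module has been imported and the decorator has run.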
from pandas.
Wasn't aware of this issue and wrote #57826 after I introduced a docstring error in my first PR and got annoyed at the speed of the docstring checks. It reduces the runtime of the check_code.sh docstrings down to 2-3 minutes again.
I also have another PR ready to go once #57826 is merged. That second PR brings the validation down to about 10 seconds. Main issue was the way the temp files were created and how pep8 was called via the shell.
So maybe with a 10 second runtime it's not necessary to do any functionality changes to the docstring validation and sharing in the end?
from pandas.
yeah it needs to generate all these docstrings, write them to temporary files, run validation against them...
tbh I think we should rip all the docstring sharing logic, just have plain old docstrings without substitutions, and let ruff do all the linting and validation in a matter of seconds. this shouldn't be done manually though, maybe I should see if it's automatable
doctests can be validated with pytest --doctest-modules
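As a rough illustration of what a doctest run checks, the sketch below uses the standard library `doctest` module on a single function. The function `double` is a made-up example, and this is only a stand-in for the kind of check `pytest --doctest-modules` performs across whole modules, not a description of how pytest implements it.

```python
import doctest


def double(x):
    """
    Return twice the input.

    Examples
    --------
    >>> double(2)
    4
    >>> double("ab")
    'abab'
    """
    return 2 * x


# Parse the ">>>" examples out of the docstring and run them,
# comparing the actual output with the expected output written below each one.
parser = doctest.DocTestParser()
test = parser.get_doctest(double.__doc__, {"double": double}, "double", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

`results.failed` and `results.attempted` report how many examples failed and how many were run.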
from pandas.
tbh I think we should rip all the docstring sharing logic, just have plain old docstrings without substitutions,
Big +1
from pandas.
cool - no idea how difficult this actually is but I'll give it a go next week and see
from pandas.
yeah it needs to generate all these docstrings, write them to temporary files, run validation against them...
I get we have a lot of docstrings but I am surprised that this would take 22 minutes.
Not a hill I'd want to die on, but I do think the shared docstrings serve a good purpose in keeping things consistent.
from pandas.
So maybe with a 10 second runtime it's not necessary to do any functionality changes to the docstring validation and sharing in the end?
I think the docstring sharing is still worth removing to:
- Make it easier for new contributors to work on docstrings due to less complexity
- Allow ruff to take over docstring formatting and validation
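To make the second point concrete: a static tool works from the source text without importing the module, so it only sees whatever is literally written in the docstring. The sketch below uses the standard library `ast` module as a stand-in for such a tool (it is not how ruff is implemented); `SOURCE`, `plain`, `templated`, and the `{x_param}` placeholder are all hypothetical.

```python
import ast

# Hypothetical module source: one plain docstring, one with a placeholder
# that would normally be filled in at import time.
SOURCE = '''
def plain(x):
    """Return x unchanged.

    Parameters
    ----------
    x : object
        Value to return.
    """
    return x


def templated(x):
    """Return x unchanged.

    Parameters
    ----------
    {x_param}
    """
    return x
'''

# Read the docstrings straight from the syntax tree; nothing is executed.
tree = ast.parse(SOURCE)
docs = {
    node.name: ast.get_docstring(node)
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef)
}

# Docstrings whose parameter section is still an unfilled placeholder.
unresolved = [name for name, doc in docs.items() if "{x_param}" in doc]
```

Here `unresolved` contains only `templated`: the static view of that function has no parameter documentation to format or validate.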
from pandas.
I'm happy to move towards removing or reducing the docstring sharing. But based on my experience, fully getting rid of it may take years. If we want to proactively work on duplicating the docstrings that are currently reused, I'd start with the ones with higher complexity and see how that goes, before opening issues to remove all of them.
from pandas.