pandas-stubs's Introduction


Pandas Stubs

A collection of pandas stub files, initially generated using stubgen, then fixed where necessary and partially completed by hand.

Announcement: pandas-stubs has moved! This repository is now deprecated!

As of July 2022, the pandas-stubs package is no longer sourced from this repository but from a repository owned and maintained by the core pandas team: https://github.com/pandas-dev/pandas-stubs

This is the result of a strategic effort led by the core pandas team to integrate the Microsoft type stub repository with the current pandas-stubs repository.

All future development will take place in the new repository, and both the PyPI and conda-forge distributions will be sourced from there.

If you're having any problems with the current package, please try switching over and report any issues on the new GitHub page. The new package is available on both PyPI and conda-forge.

Related issue: 172

Motivation

Provide rudimentary coverage of pandas code through static type checking, to alleviate the problems mentioned in issues 14468 and 26766. This approach was taken to accelerate development: compared to refactoring the existing pandas codebase, creating stub files is relatively unconstrained.

Due to the extensive pandas API, the quality of the proposed annotations is, for the most part, not suitable for integration into the original codebase, but they can be very useful as a way of achieving some type safety during development.

Installation

This works only for legacy versions. Any version higher than those shown here will be installed from the new repository.

The easiest way is via PyPI. This will add .pyi files to the pandas package location; they are removed when uninstalling:

pip install pandas-stubs==1.2.0.62

Another way to install is using Conda:

conda install -c conda-forge pandas-stubs=1.2.0.62

Alternatively, if you want a cleaner PYTHONPATH or wish to modify the annotations, manual options are:

  • cloning the repository along with the files, or
  • including it as a submodule to your project repository,

and then configuring a type checker with the correct paths.
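For example, with a manual checkout, a mypy configuration pointing at the stub directory might look like this (the path below is an assumption; adjust it to wherever you cloned or submoduled the repository):

```ini
; mypy.ini — hypothetical example for a manual checkout.
; third_party/3 is the directory that contains the pandas/ stub package.
[mypy]
mypy_path = pandas-stubs/third_party/3
```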

Usage

Let’s take this example piece of code in a file round.py:

import pandas as pd

decimals = pd.DataFrame({'TSLA': 2, 'AMZN': 1})
prices = pd.DataFrame(data={'date': ['2021-08-13', '2021-08-07', '2021-08-21'],
                            'TSLA': [720.13, 716.22, 731.22], 'AMZN': [3316.50, 3200.50, 3100.23]})
rounded_prices = prices.round(decimals=decimals)

Mypy won't see any issues with that, but after installing pandas-stubs and running it again:

mypy round.py

we get the following error message:

round.py:6: error: Argument "decimals" to "round" of "DataFrame" has incompatible type "DataFrame"; expected "Union[int, Dict[Union[int, str], int], Series]"

After confirming with the docs, we can fix the code:

decimals = pd.Series({'TSLA': 2, 'AMZN': 1})

Version Compatibility

The aim of the current release is to cover the most common parts of the 1.2.0 API; however, it can provide partial functionality for other versions as well. Future versions will cover new pandas releases.

Versioning

The versions follow a pattern MAJOR.MINOR.PATCH.STUB_VERSION where the first three parts correspond to a specific pandas API version, while STUB_VERSION is used to distinguish between the versions of the stubs themselves.
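As a concrete illustration of that scheme, a version string can be split into its two parts like this (a hypothetical helper, not part of the package):

```python
def split_stub_version(version: str):
    """Split e.g. '1.2.0.62' into the pandas API version and the stub counter."""
    major, minor, patch, stub = version.split(".")
    return ".".join((major, minor, patch)), int(stub)

# "1.2.0.62" targets the pandas 1.2.0 API; 62 distinguishes stub releases for it.
pandas_api, stub_release = split_stub_version("1.2.0.62")
```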

Type checkers

As of now, mypy is the only type checker the stubs have been tested with.

Development

Testing using tox

Tox will automatically run all the types of tests described below. It creates isolated temporary environments for each declared version of Python and installs pandas-stubs the same way it would normally be installed with pip or conda.
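For reference, a minimal tox.ini consistent with that workflow might declare the same environments (a sketch under assumptions; the repository's actual configuration will declare more settings):

```ini
; Hypothetical minimal tox.ini matching the environment list below.
[tox]
envlist = pep8, py36, py37, py38, py39
```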

Usage is as simple as:

tox

The last few lines of the output should look like this (assuming all Python versions are available):

  pep8: commands succeeded
  py36: commands succeeded
  py37: commands succeeded
  py38: commands succeeded
  py39: commands succeeded

Partial testing

Test the stub files internal consistency:

mypy --config-file mypy.ini third_party/3/pandas

Test the stub files against actual code examples (this will use the stubs from the third_party/3/pandas dir):

mypy --config-file mypy.ini tests/snippets

Test the installed stub files against actual code examples. You'll need to install the library beforehand - the .pyi files from your env will be used:

mypy --config-file mypy_env.ini tests/snippets

Test that the code examples work when actually run with pandas:

pytest tests/snippets

Disclaimer

This project provides additional functionality for the pandas library. pandas is available under its own license.

This project is not owned, endorsed, or sponsored by AQR Capital Management, NumFOCUS, LLC, Lambda Foundry, Inc., or the PyData Development Team.

pandas-stubs's People

Contributors

aholmes, aneeshusa, bzoracler, dependabot[bot], hpomorski, ilikeavocadoes, joanna-sendorek, joannasendorek, johanvergeer, johnflavin, nanne-aben, pawellipski, saaketp, ste-pool, tdsmith, zkrolikowski-vl


pandas-stubs's Issues

Create tests for the stubs

Design tests that would verify the stubs in real-life use cases. A possible approach is using TypeCheckSuite from mypy.test.testcheck.

Add type stubs for `read_csv` from `pandas.io.parser`

This declaration: https://github.com/VirtusLab/pandas-stubs/blob/master/third_party/3/pandas/io/parsers.pyi#L16 needs to be updated to accurately reflect the real interface from the documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

Tests must be created in a file test_parsers.py, placed in the following directory: https://github.com/VirtusLab/pandas-stubs/tree/master/tests/snippets The tests should cover common use cases. Look at the other tests in that directory to learn the convention. Tests can be run using the tox command.

Series.__getitem__ claims incompatible type

Series.__getitem__(key) claims an incompatible type, while Series.get(key) type-checks successfully.

Small example to trigger the bug:
$ cat test.py:

from typing import List
import pandas as pd

a: pd.Series = pd.Series({'key': [0,1,2,3]})
b: List[int]

b = a['key']
b = a.__getitem__('key')
b = a.get('key')

$ mypy test.py:

test.py:7: error: Incompatible types in assignment (expression has type "Union[Union[str, int, float, bool], Union[Union[Any, bool_, Any, Any], Union[Any, np.char?, Any, Any, Any, Any, Any, signedinteger[_8Bit], signedinteger[_16Bit], signedinteger[_32Bit], signedinteger[_64Bit]], Union[Any, Any, Any, Any, Any, Any, unsignedinteger[_8Bit], unsignedinteger[_16Bit], unsignedinteger[_32Bit], unsignedinteger[_64Bit]], Union[Any, Any, Any, Any, Any, floating[_16Bit], floating[_32Bit], floating[_64Bit], Any], Union[Any, Any, Any, Any, complexfloating[_32Bit, _32Bit], complexfloating[_64Bit, _64Bit], Any, Any]], Decimal, ByteString, Fraction, DateOffset, Interval, Number, datetime, timedelta]", variable has type "List[int]")
test.py:8: error: Incompatible types in assignment (expression has type "Union[Union[str, int, float, bool], Union[Union[Any, bool_, Any, Any], Union[Any, np.char?, Any, Any, Any, Any, Any, signedinteger[_8Bit], signedinteger[_16Bit], signedinteger[_32Bit], signedinteger[_64Bit]], Union[Any, Any, Any, Any, Any, Any, unsignedinteger[_8Bit], unsignedinteger[_16Bit], unsignedinteger[_32Bit], unsignedinteger[_64Bit]], Union[Any, Any, Any, Any, Any, floating[_16Bit], floating[_32Bit], floating[_64Bit], Any], Union[Any, Any, Any, Any, complexfloating[_32Bit, _32Bit], complexfloating[_64Bit, _64Bit], Any, Any]], Decimal, ByteString, Fraction, DateOffset, Interval, Number, datetime, timedelta]", variable has type "List[int]")
Found 2 errors in 1 file (checked 1 source file)

Environment

python                    3.7.9                h7579374_0    defaults
pandas                    1.2.4            py37h2531618_0    defaults
pandas-stubs              1.1.0.7          py37h89c1867_0    conda-forge
mypy                      0.902            py37h5e8e339_0    conda-forge
mypy_extensions           0.4.3                    py37_0    defaults

Thanks for your efforts!

os.PathLike not compatible with pandas._typing.FilePathOrBuffer

This issue occurs when using df.to_csv(some_path_like) where some_path_like is typed as os.PathLike.

minimal example:

import os
import pandas as pd

def write(path: os.PathLike):
    df = pd.DataFrame()
    df.to_csv(path)

When using this, I get the mypy error: error: Argument 1 to "to_csv" of "NDFrame" has incompatible type "_PathLike[Any]"; expected "Union[str, Path, IO[str], None]"

You use Optional[FilePathOrBuffer[AnyStr]]=..., so I don't actually know if this is an issue with pandas._typing, whether it could be fixed here, or whether it is supposed to work like that.

Timestamp does not have a signature

The current type annotations for pandas.Timestamp (https://github.com/VirtusLab/pandas-stubs/blob/master/third_party/3/pandas/_libs/tslibs/timestamps.pyi) do not have a signature, while in reality it does: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html

This leads to mypy complaining with the error:

error: Too many arguments for "Timestamp"

for code like:

import pandas
pandas.Timestamp("2021-01-01")

Same problem for Timedelta:

import pandas
pandas.Timedelta("1s")

`DataFrame.__getitem__(self, key: Sequence[Column])` stub definition

I believe that there are two issues with the overload DataFrame.__getitem__(self, key: Sequence[Column]). However, I don't really know the intention behind this hint (specifically what Sequence[Column] is supposed to be) - so I hope some light could be shed on this stub. My interpretation is that the intended meaning of Sequence[Column] is something like typing.List[Column] and not typing.Tuple[Column], and the following issues are based on this interpretation.


1. Return type hint

@overload
def __getitem__(self, key: Sequence[Column]) -> Series: ...

Column = Union[int, str]

It looks like the return hint should be DataFrame, not Series. E.g.

>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], 8: [7, 8, 9]})
>>> df
   A  B  8
0  1  4  7
1  2  5  8
2  3  6  9
>>> df[["A", 8]]
   A  8
0  1  7
1  2  8
2  3  9
>>> df[["A", 8]].__class__
<class 'pandas.core.frame.DataFrame'>
>>> df["A"].__class__
<class 'pandas.core.series.Series'>
>>> df[["A"]].__class__
<class 'pandas.core.frame.DataFrame'>

2. Type hint of key

Sequence is from the base typing module,

from typing import Any, Hashable, IO, Iterable, List, Optional, Sequence, Tuple, Union, Dict, Mapping, Type, \
overload, Iterator, Callable, AnyStr

and according to https://docs.python.org/3/glossary.html#term-sequence

sequence

An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and bytes.

Since typing.Sequence covers tuple objects, the stub conflicts with the following behaviour:

>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], 8: [7, 8, 9]})
>>> 
>>> df[("A", 8)]
Traceback (most recent call last):
  File "~/[venv]/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: ('A', 8)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "~/[venv]/lib/python3.8/site-packages/pandas/core/frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)
  File "~/[venv]/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: ('A', 8)
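As a runtime sanity check (independent of pandas), tuples do satisfy the Sequence ABC just like lists, which is the root of the conflict described above:

```python
from collections.abc import Sequence

# Tuples, lists, and even strings all count as sequences at runtime,
# which is why typing.Sequence[Column] also matches tuple keys.
assert isinstance(("A", 8), Sequence)
assert isinstance(["A", 8], Sequence)
assert isinstance("A", Sequence)
```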

tuple objects get treated the same as Column objects in the following overload:

@overload
def __getitem__(self, key: Column) -> Series: ...

E.g.

>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], ("C", int, (99,)): [7, 8, 9]})
>>> df
   A  B  (C, <class 'int'>, (99,))
0  1  4                          7
1  2  5                          8
2  3  6                          9
>>> df[("C", int, (99,))]
0    7
1    8
2    9
Name: (C, <class 'int'>, (99,)), dtype: int64
>>> df[("C", int, (99,))].__class__
<class 'pandas.core.series.Series'>
>>> df[("C", int, (99,))].to_frame()
              C
  <class 'int'>
          (99,)
0             7
1             8
2             9

Type annotation for pipe()

First of all, great project! Very happy user here :)

Now to my point: currently pipe() is annotated to return Any. This means the following doesn't result in a mypy error, even though it really should.

import pandas as pd

def foo(df: pd.DataFrame) -> pd.DataFrame:
    return df

df: int = (
    pd.DataFrame({'a': [1]})
    .pipe(foo)
)

I think we can do better if we define pipe() as follows:

from typing import TypeVar, Callable, Tuple, Union

V = TypeVar("V")

def pipe(self, func: Union[Callable[..., V], Tuple[Callable[..., V], str]], *args, **kwargs) -> V: ...

Now the first example throws a mypy error.

There's one edge case that I came across though: apparently it's possible to pass something other than a callable to pipe().

df = (
    pd.DataFrame({'a': [1]})
    .pipe(1)
)

After which df == 1. Not sure why anyone would want to do that, but sure. We can incorporate that using:

from typing import TypeVar, Callable, Tuple, Union, overload

V = TypeVar("V")

@overload
def pipe(self, func: Union[Callable[..., V], Tuple[Callable[..., V], str]], *args, **kwargs) -> V: ...

@overload
def pipe(self, func: V, *args, **kwargs) -> V: ...

I can open a PR if you're interested. First wanted to check what you thought on the matter. Also, I noticed pipe() is defined in a number of places, and I wasn't sure if we need to annotate all of them (I suppose we should?).
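To make the intended semantics concrete, here is a self-contained runtime sketch of pipe() on a hypothetical stand-in class (Pipeable is not a real pandas type; it only models the behaviour the proposed overloads describe, including the documented tuple form (callable, keyword)):

```python
from typing import Any, Callable, Tuple, TypeVar, Union

V = TypeVar("V")

class Pipeable:
    """Hypothetical stand-in for DataFrame, modelling pipe()'s behaviour."""

    def pipe(
        self,
        func: Union[Callable[..., V], Tuple[Callable[..., V], str]],
        *args: Any,
        **kwargs: Any,
    ) -> V:
        # The tuple form (callable, keyword) passes self under that keyword.
        if isinstance(func, tuple):
            target, keyword = func
            kwargs[keyword] = self
            return target(*args, **kwargs)
        # The plain form passes self as the first positional argument.
        return func(self, *args, **kwargs)

obj = Pipeable()
assert obj.pipe(lambda df: 42) == 42                      # V inferred as int
assert obj.pipe((lambda *, data: data, "data")) is obj    # tuple form
```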

df.rename(columns=...) does not accept dict variable

import pandas as pd


df = pd.DataFrame(columns=["a"])

col_map = {"a": "b"}

df.rename(columns=col_map)

The above minimal example raises what appears to be a false type error on the last line due to passing columns=col_map:

Argument "columns" to "rename" of "DataFrame" has incompatible type "Dict[str, str]"; expected "Union[Mapping[Optional[Hashable], Any], Callable[[Optional[Hashable]], Optional[Hashable]], None]"mypy(error)

If instead I pass the dictionary as a literal value, the error disappears:

df.rename(columns={"a": "b"})  # all is well here
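The difference arises because a dict literal is inferred directly against the parameter's expected type, while a standalone variable is inferred as Dict[str, str], which is invariant in its key type. One workaround (hypothetical; it only changes the annotation, not runtime behaviour) is to widen the variable's key type to match what the stub expects:

```python
from typing import Dict, Hashable, Optional

# Annotate the mapping with the wider key type the stub expects, so mypy
# does not infer the invariant Dict[str, str].
col_map: Dict[Optional[Hashable], str] = {"a": "b"}
assert col_map["a"] == "b"  # runtime behaviour is unchanged
```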

Question: how to use with pyright?

Hi, I write Python with:

  • a src layout for my project, and
  • poetry with virtualenvs.in-project.

Thus, pyright is generally happy with my pyrightconfig.json as follows:

{
  "include": ["src"],
  "venvPath": ".venv",
  "verboseOutput": true,
  "executionEnvironments": [{ "root": "src" }]
}

However, I am struggling to get anything pandas related type-checked. I have tried 2 solutions:

  1. As a submodule:
[submodule "stubs/submodules/pandas-stubs"]
    path = stubs/submodules/pandas-stubs
    url = https://github.com/VirtusLab/pandas-stubs.git

and adding a symlink:

cd stubs/third_party
ln -s ../submodules/pandas-stubs/third_party/3/pandas pandas

My guess is that pyright is expecting stubs/third_party/pandas if I write:

{
  "include": ["src"],
  "stubPath": "stubs/third_party",
  "venvPath": ".venv",
  "verboseOutput": true,
  "executionEnvironments": [{ "root": "src" }]
}
  2. As a pip install.

In either case, I can't get anything out of pyright --verifytypes pandas. What am I doing wrong here?

DataFrame.from_dict false positive while linting

I am getting this error while linting with mypy 0.902 on Python 3.8.10:

error: Argument 1 to "from_dict" of "DataFrame" has incompatible type "Dict[str, List[Any]]"; expected "Dict[str, Union[Union[Union[ExtensionArray, ndarray], Index, Series], Series, Dict[Union[int, str], Any]]]"

This is a false alarm because in pandas own examples they have this code

>>> data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
>>> pd.DataFrame.from_dict(data)

The offending line in the stubs is here

@classmethod
def from_dict(cls: Any, data: Dict[str, Union[AnyArrayLike, Series, Dict[Column, Dtype]]], other_kwargs:...

It appears that data should be extended to include List in addition to Dict and Series.

In fact, looking at the pandas source code, the data argument is passed into DataFrame.__init__ directly, and it appears that any data passed to DataFrame.__init__ that doesn't match a more specific type gets passed to maybe_iterable_to_list.

Should Iterable or List be added to the list of valid types here?

If someone has an answer, I can write a PR

Wrong type stub for pandas.to_datetime()

We use the datetime conversion function of pandas like:

date = pd.to_datetime(times, unit="s", origin=pd.Timestamp("01/01/2000"))

However, having the stubs installed, mypy complains that

error: Argument "origin" to "to_datetime" has incompatible type "Timestamp"; expected "str"

The pandas docs list, after some strings, also "Timestamp convertible" (and passing in that pd.Timestamp for origin actually works at runtime): https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html

Positive check: casting the pd.Timestamp to str before passing it to origin removes the mypy error.

pandas.api.types imports

Importing from pandas.api.types causes a mypy error:

"pandas.api.types" has no attribute "is_numeric_dtype"

Minimal example:

import pandas as pd
from pandas.api.types import is_numeric_dtype, is_string_dtype

srs = pd.Series([1, 2, 3])

is_numeric_dtype(srs)
>>> True
is_string_dtype(srs)
>>> False

arg-type error with `itertuples(name=None)`

pandas.DataFrame.itertuples allows name=None. But with the following code:

# df is a pandas.DataFrame.
df.itertuples(name=None)

Mypy reports the following error:

error: Argument "name" to "itertuples" of "DataFrame" has incompatible type "None"; expected "str"  [arg-type]

pd.DataFrame.eval returns pd.DataFrame

df = pd.DataFrame({"a": [1, 2, 3]})
df.eval('b = a * 3')
Out[9]: 
   a  b
0  1  3
1  2  6
2  3  9
type(df.eval('b = a * 3'))
Out[10]: pandas.core.frame.DataFrame

As you can see above, pd.DataFrame.eval also returns a pd.DataFrame; the documentation is not accurate in this case.

Add "time" attribute to "DatetimeIndex"

The time attribute of DatetimeIndex seems not to be declared in the stubs.

Minimal working example:

# file tmp.py
import pandas as pd

dr = pd.date_range(start='2020-1-1', periods=3)
print(dr.time)

Then executing mypy tmp.py results in

tmp.py:5: error: "DatetimeIndex" has no attribute "time"
Found 1 error in 1 file (checked 1 source file)

while executing the code with python tmp.py works perfectly. Tested with:

Python 3.9.7
numpy==1.21.2
pandas==1.3.2
pandas-stubs==1.2.0.22

DataFrame.sort_values overload signature is incomplete with respect to the ascending argument

python 3.6
pandas 1.1.5
pandas-stubs 1.2.0.1

The docs state that a list of bools is accepted for the ascending argument, but the stubs raise an error in that instance.

Minimal example of some (valid) code that fails type checking:

import pandas

df = pandas.DataFrame(
    [[1, 'a'], [2, 'b'], [2, 'c']],
    columns=['num', 'char']
)
print(df)
df = df.sort_values(
    by=['num', 'char'],
    ascending=[True, False],
)
print(df)

Output:

   num char
0    1    a
1    2    b
2    2    c
   num char
0    1    a
2    2    c
1    2    b

mypy error:

error: No overload variant of "sort_values" of "DataFrame" matches argument types "List[str]", "List[bool]", "bool"  [call-overload]
note: Possible overload variant:
note:     def sort_values(self, by: Union[str, List[str]], axis: Union[Literal[0], Literal[1], Union[Literal['index'], Literal['columns']]] = ..., ascending: bool = ..., kind: Union[Literal['quicksort'], Literal['mergesort'], Literal['heapsort']] = ..., na_position: Union[Literal['first'], Literal['last']] = ..., ignore_index: bool = ..., key: Optional[Callable[[Series], Union[Series, Union[Union[ExtensionArray, Any], Index, Series]]]] = ..., *, inplace: Literal[False] = ...) -> DataFrame
note:     <1 more non-matching overload not shown>

A workaround if you came here from a search engine:

from typing import Any

import pandas

df = pandas.DataFrame(
    [[1, 'a'], [2, 'b'], [2, 'c']],
    columns=['num', 'char']
)
print(df)
ascending: Any = [True, False]
df = df.sort_values(
    by=['num', 'char'],
    ascending=ascending,
)
print(df)

about the code version

The pandas project has several branches and versions, while pandas-stubs has only one branch. My question is: which version of the pandas source code does pandas-stubs correspond to?

Thanks.

DataFrame methods should return DataFrame *or* None rather than Optional[DataFrame]

Problem summary:

I noticed a few methods that are returning Optional[DataFrame], e.g.:

def drop(self, ...) -> Optional[DataFrame]: ...
def rename(self, ...) -> Optional[DataFrame]: ...

from pandas/core/frame.pyi.

It appears that these methods may return None because, when inplace=True, a DataFrame is not returned; most of the time, though, these methods will actually return a DataFrame.

The current API as it is defined makes it difficult to adopt these stubs because the user must always assert that the dataframe is not None, which makes the code overly verbose (especially when using method chaining).

Proposal

I'm proposing that the stubs be updated to use typing overload. This would allow us to specify different type signatures for inplace=True and inplace=False.

Are my assumptions correct? Does this sound reasonable? I can open a PR for this if there is interest.
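A minimal, self-contained sketch of that overload pattern (FrameLike is a hypothetical stand-in for DataFrame, and drop's real signature has many more parameters):

```python
from typing import Literal, Optional, overload

class FrameLike:
    """Hypothetical stand-in for DataFrame to illustrate the overloads."""

    @overload
    def drop(self, labels: str, *, inplace: Literal[True]) -> None: ...
    @overload
    def drop(self, labels: str, *, inplace: Literal[False] = ...) -> "FrameLike": ...

    def drop(self, labels: str, *, inplace: bool = False) -> Optional["FrameLike"]:
        if inplace:
            return None          # mutates in place, nothing returned
        return FrameLike()       # returns a new frame

# With these overloads, mypy treats .drop("a") as FrameLike (chainable),
# while .drop("a", inplace=True) is None — no assert-not-None needed.
assert FrameLike().drop("a") is not None
assert FrameLike().drop("a", inplace=True) is None
```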

Pandas Timestamp object operations not recognized

Hi there,

I have the following code:

import pandas as pd
pts: pd.Timestamp = pd.to_datetime("4-5-2019")

td: pd.Timedelta = pts - pts

Running mypy I'm getting the following error:
pmypy.py:4: error: Unsupported left operand type for - ("Timestamp")

Python Version 3.9.4
pandas Version: 1.3.2
pandas-stubs Version: 1.2.0.11
mypy Version: 0.910

add __add__ for pd.Timestamp

I'm getting a

error: Unsupported left operand type for + ("Timestamp")

When calling pd.Timestamp("20200101") + BDay(1)
Looking at the stub, there's no declaration for __add__. I might be being naive (I'm new to stub libraries), but I assume adding a stub for this method would resolve the error?

Create a release versioning system parallel to Pandas

We'd like to have separate branches for each pandas "major.minor" version starting from 1.0: namely 1.0, 1.1 and 1.2. See PEP 440. Maintaining a separate version for each patch version would be intractable.

This requires introducing code changes to individual API versions. Without this the versioning doesn't bring any benefit.

Repository tags and releases do not reflect PyPi versions

Hi,

I updated pandas-stubs from 1.1.0.7 to 1.1.0.11 and wanted to see what specifically changed. I came here hoping to find a tag or release, but cannot find either. It would really help with discoverability and adoption if the repository tags (at minimum; releases would be nice too) reflected the release versions on PyPI, and vice versa.

`DataFrame.set_index` stub return hint inconsistent with method definition

Return hint of the stub is DataFrame:

def set_index(self, keys: Union[Label, IndexArray, List[Union[Label, IndexArray]]], drop: bool = ..., append: bool = ..., inplace: bool = ..., verify_integrity: bool = ...) -> DataFrame: ...

Based on the return value of Pandas 1.0.x API, the return hint should be Optional[DataFrame]:

https://github.com/pandas-dev/pandas/blob/1.0.x/pandas/core/frame.py#L4366-L4367

Note that this method's docstring in the 1.0.x API was incorrect (https://github.com/pandas-dev/pandas/blob/1.0.x/pandas/core/frame.py#L4210-L4213); this was fixed in commit pandas-dev/pandas@b110046, resulting in

https://github.com/pandas-dev/pandas/blob/b110046099014a63217042478393e715a083604b/pandas/core/frame.py#L4536-L4539

        Returns
        -------
        DataFrame or None
            Changed row labels or None if ``inplace=True``.

pd.concat(dict) claims incompatible type

Just tried out pandas-stubs for the first time and got a type error on a line that works as intended.

error: Argument 1 to "concat" has incompatible type "Dict[str, DataFrame]"; expected "Union[Iterable[DataFrame], Mapping[Optional[Hashable], DataFrame]]"

>>> pd.concat({"a": pd.DataFrame([1, 2, 3]), "b": pd.DataFrame([4, 5, 6])}, axis=1)
   a  b
   0  0
0  1  4
1  2  5
2  3  6

Is this an issue with the type declaration or am I misusing pd.concat?
