Why doesn't in search the values when a Series contains strings?

This is probably very basic, but the in operator does not seem to work for a Series containing objects or strings. Why doesn't it search the values in that case?

>>> import pandas as pd 

>>> s = pd.Series(['a', 'b', 'c']) 
>>> 'a' in s 
False 
>>> 'a' in s.astype('S1') 
False

The documentation for Series.__contains__ is rather sparse:

[In 1]: s.__contains__? 
Signature: s.__contains__(key) 
Docstring: True if the key is in the info axis 
File:  c:\...\lib\site-packages\pandas\core\generic.py 
Type:  method

My first thought was that the in operator only checks the "index":

>>> 1 in s 
True

But then: why does it (seem to) work with other types?

>>> 1.2 in pd.Series([1.3, 1.2]) 
True 

>>> 1 in pd.Series([1.3, 1.2]) # also works for index 
True

I'm not interested in a solution; I know I can simply use whatever in s.values or np.any(s.eq(whatever)). I'd like to know why it behaves this way (or whether I'm missing something).

Source

2017-06-06 MSeifert

It behaves this way because a Series is more like an OrderedDict than a list.

Just as 1 in {0: 5, 1: 10} is true, so is 1 in pd.Series([5, 10]): the index is RangeIndex(start=0, stop=2, step=1), and the index elements act like keys.
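To make the dict analogy concrete, here is a minimal sketch that puts a plain dict next to a Series with the default RangeIndex. The variable names are only illustrative; the two explicit value checks at the end are the ones already mentioned in the question.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c'])
d = {0: 'a', 1: 'b', 2: 'c'}  # dict with the same keys as the default RangeIndex

# Membership on a dict tests the keys; on a Series it tests the index labels.
key_in_dict = 1 in d          # True: 1 is a key
label_in_series = 1 in s      # True: 1 is an index label

# The stored values are not consulted in either case.
value_in_dict = 'a' in d      # False: 'a' is a value, not a key
value_in_series = 'a' in s    # False: 'a' is a value, not a label

# Asking about the values has to be spelled out explicitly.
value_found = 'a' in s.values
value_found_eq = bool(s.eq('a').any())

print(key_in_dict, label_in_series, value_in_dict, value_in_series)
print(value_found, value_found_eq)
```

Seen this way, 'a' in s returning False is consistent with the dict rather than a special case for strings.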

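I can see why the case of

>>> 1.2 in pd.Series([1.3, 1.2]) 
True

might be a little confusing, but it is just a coincidence based on the numbers you chose: 1.2 is coerced to an int before being compared against either a RangeIndex or an Int64Index, so you are really asking 1 in ser.index. Personally I don't like this behaviour, but that's what it's doing.

>>> 1.9 in pd.Series([1.3, 1.2]) 
True 
>>> 1.2 in pd.Series([1.3, 1.2], index=[10, 20]) 
False

To make the coercion even more obvious: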

In [54]: np.inf in pd.Series([1.3, 1.2]) 
--------------------------------------------------------------------------- 
OverflowError        Traceback (most recent call last) 
<ipython-input-54-b069ecc5baf6> in <module>() 
----> 1 np.inf in pd.Series([1.3, 1.2]) 

[...] 
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.__contains__ (pandas/_libs/index.c:3924)() 

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.__contains__ (pandas/_libs/hashtable.c:13569)() 

OverflowError: cannot convert float infinity to integer
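The float-to-int coercion is what this answer observed on the pandas build in the traceback, and it is an assumption on my part that it may not hold in every pandas version. A small sketch that does not depend on it is to state explicitly whether the labels or the values are being tested, here with the non-default index from the example above:

```python
import pandas as pd

ser = pd.Series([1.3, 1.2], index=[10, 20])

# Label membership: only the index [10, 20] is consulted.
label_present = 10 in ser.index
float_is_label = 1.2 in ser.index   # 1.2 is not a label, coerced or not

# Value membership: Series.isin compares against the stored values.
value_present = bool(ser.isin([1.2]).any())

print(label_present, float_is_label, value_present)
```

Writing ser.index or ser.isin(...) at the call site avoids relying on what the bare in operator does with a float key.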

Source

2017-06-06 01:33:28 DSM
