Python 3, pandas.DataFrame: how to select specific rows by a set of rules

I have a pandas.DataFrame and would like to select specific rows from it according to a few rules.

The following code generates my current DataFrame:

import datetime 
import pandas as pd 
import numpy as np 

today = datetime.date.today() 
dates = list() 
for k in range(10): 
    a_day = today - datetime.timedelta(days=k) 
    dates.append(np.datetime64(a_day)) 

np.random.seed(5) 
df = pd.DataFrame(np.random.randint(100, size=(10, 3)), 
        columns=('other1', 'actual', 'other2'), 
        index=['{}'.format(i) for i in range(10)]) 

df.insert(0, 'dates', dates) 
df['err_m'] = np.random.rand(10, 1)*0.1 
df['std'] = np.random.rand(10, 1)*0.05 
df['gain'] = np.random.rand(10, 1)

I want to select rows from it according to the following rules:

1. compute the sum of 'err_m' and 'std', then sort the df so that the sum is descending 
2. from the result of step 1, select the part where 'actual' is > 50

Thanks.

Source

2017-03-02 aura

Create a new column holding the sum, then sort by that column:

df['errsum'] = df['err_m'] + df['std']
# Return a new DataFrame sorted by 'errsum', largest first
df_sorted = df.sort_values('errsum', ascending=False)

Then select the rows you want:

# Boolean mask: True where 'actual' exceeds 50 (rule 2 in the question)
selector = df_sorted['actual'] > 50
# Keep only the matching rows; the descending 'errsum' order is preserved
df_sorted[selector]
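The two steps can also be combined so that the original frame is left unmodified. The following is only a minimal sketch under stated assumptions: the function name `select_by_rules`, its `threshold` parameter, and the small `demo` frame are hypothetical and introduced here for illustration. Only the column names `err_m`, `std`, and `actual` come from the question.

```python
import numpy as np
import pandas as pd


def select_by_rules(df, threshold=50):
    """Hypothetical helper: sort by err_m + std (descending),
    then keep only rows where 'actual' is above the threshold."""
    ranked = (
        df.assign(errsum=df["err_m"] + df["std"])  # helper column on a copy
          .sort_values("errsum", ascending=False)   # largest sum first
    )
    # Boolean indexing keeps the sorted order of the remaining rows
    return ranked[ranked["actual"] > threshold]


# Stand-in data (an assumption, not the question's exact frame)
rng = np.random.default_rng(5)
demo = pd.DataFrame({
    "actual": rng.integers(0, 100, size=10),
    "err_m": rng.random(10) * 0.1,
    "std": rng.random(10) * 0.05,
})

result = select_by_rules(demo)
print(result)
```

Because `assign` returns a new DataFrame, `demo` itself does not gain an `errsum` column. If the helper column is not wanted in the output, `result.drop(columns="errsum")` removes it.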

Source

2017-03-02 13:58:53 LoicM

Thanks, that solves my problem. – aura
