How do I get the precision of the top 5 topics in a classifier?

I have 22465 test documents that I am classifying into 88 different topics. I use predict_proba to get the top 5 predicted topics. How can I print the precision for these 5 topics?

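import numpy as np
from sklearn import metrics
from sklearn.linear_model import LogisticRegression

model1 = LogisticRegression()
model1 = model1.fit(matrix, labels)

y_train_pred = model1.predict_log_proba(matrix_test)
order = np.argsort(y_train_pred, axis=1)
print(order[:, -5:])  # column indices of the 5 most probable topics per document

n = model1.classes_[order[:, -5:]]  # the corresponding topic labels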

For accuracy:

z = 0
for x, y in zip(label_tmp_test, n):
    if x in y:
        z = z + 1
print(z)
print(z / 22465)  # accuracy when the true topic is among the top 5
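
The loop above can also be written without an explicit Python loop. This is only a minimal sketch on made-up toy data: the function name top_k_accuracy and the arrays classes, scores and y_true are hypothetical stand-ins for model1.classes_, the predict_log_proba output and label_tmp_test.

```python
import numpy as np

def top_k_accuracy(y_true, log_proba, classes, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    y_true = np.asarray(y_true)
    # Column indices of the k largest scores in each row.
    top_idx = np.argsort(log_proba, axis=1)[:, -k:]
    # Map column indices back to class labels, shape (n_samples, k).
    top_labels = np.asarray(classes)[top_idx]
    # A sample is a hit when its true label appears anywhere in its row.
    hits = (top_labels == y_true[:, None]).any(axis=1)
    return hits.mean()

# Toy data (hypothetical): 4 samples, 3 classes, k=2.
classes = np.array(["a", "b", "c"])
scores = np.log(np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.2, 0.3, 0.5],
]))
y_true = ["a", "a", "b", "c"]
acc = top_k_accuracy(y_true, scores, classes, k=2)
print(acc)
```

Dividing by the number of rows through hits.mean() also avoids hard-coding the test-set size.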

But how can I find the precision for the top 5 topics in the same way?

For precision, this is what I am doing, but the scikit-learn metrics refuse to work on it:

q=model1.predict(mat_tmp_test) 
print(metrics.precision_score(n, q))

Source

2016-03-20 minks

For precision the methodology is almost the same. You simply focus on one particular label (precision is a per-label metric). Say you want to compute the precision for label L:

TP = 0.
FP = 0.
for x, y in zip(label_tmp_test, n):
    if x == L:  # this is the label we are interested in
        if L in y:  # correct prediction is among selected ones
            TP = TP + 1  # we get one more true positive instance
    else:  # this is some other label
        if L in y:  # if we predicted that this is a particular label
            FP = FP + 1  # we have created another false positive

print(TP / (TP + FP))
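
The same counting can be wrapped in a small function so it is easy to call for any label. This is a rough sketch on toy data, not a scikit-learn API: the name top_k_precision and the lists y_true and top2 are hypothetical (they stand in for label_tmp_test and n), and returning None when the label never appears in any top-k list is my own assumption for handling the zero-denominator case.

```python
def top_k_precision(y_true, top_k_labels, label):
    """Precision for `label`, counting a prediction whenever `label` is in a sample's top-k list."""
    tp = 0
    fp = 0
    for true_label, candidates in zip(y_true, top_k_labels):
        if label in candidates:
            if true_label == label:
                tp += 1  # label was proposed and is the true label
            else:
                fp += 1  # label was proposed but the true label differs
    # Assumption: report None when the label is never proposed (0/0 is undefined).
    return tp / (tp + fp) if (tp + fp) else None

# Toy data (hypothetical): true labels and top-2 candidates per sample.
y_true = ["a", "a", "b", "c"]
top2 = [["a", "b"], ["c", "b"], ["a", "b"], ["c", "b"]]

for lab in ["a", "b", "c"]:
    print(lab, top_k_precision(y_true, top2, lab))
```

Note that with top-k lists every sample proposes k labels, so false positives accumulate quickly and per-label precision tends to be much lower than with a single prediction per sample.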

If you need an overall precision, you usually average the per-label precisions. For obvious reasons, you need a lot of labels for this kind of measure to make sense.
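
Averaging the per-label values could look roughly like the following. It is a sketch under assumptions: macro_top_k_precision is a hypothetical helper, the toy arrays stand in for label_tmp_test and n, and labels that never show up in any top-k list are simply skipped (their precision is undefined), which is one possible convention rather than the only one.

```python
import numpy as np

def macro_top_k_precision(y_true, top_k_labels):
    """Return per-label top-k precision and its unweighted (macro) average."""
    y_true = np.asarray(y_true)
    top_k_labels = np.asarray(top_k_labels)
    per_label = {}
    # Only labels proposed at least once are considered, so the denominator is never zero.
    for label in np.unique(top_k_labels):
        proposed = (top_k_labels == label).any(axis=1)  # samples with label in their top-k
        tp = np.sum(proposed & (y_true == label))       # proposed and actually correct
        per_label[label] = tp / proposed.sum()
    macro = float(np.mean(list(per_label.values())))
    return per_label, macro

# Toy data (hypothetical): true labels and top-2 candidates per sample.
y_true = ["a", "a", "b", "c"]
top2 = [["a", "b"], ["c", "b"], ["a", "b"], ["c", "b"]]

per_label, macro = macro_top_k_precision(y_true, top2)
print(per_label)
print(macro)
```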

Source

2016-03-20 12:31:28 lejlot
