How do I preserve the order of dependencies?

I wrote code that opens the files in a directory, runs spaCy NLP on each one, and writes the resulting dependency parse information to a file in a new directory.

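import os
import spacy

# 'en' is the old shortcut name; newer spaCy releases load a package name such as 'en_core_web_sm'
nlp = spacy.load('en')

path1 = 'C:/Path/to/my/input'
path2 = '../output'
for file in os.listdir(path1):
    # os.listdir returns bare file names, so join them with the input directory
    with open(os.path.join(path1, file), encoding='utf-8') as text:
        txt = text.read()
    doc = nlp(txt)
    # Open the output file once per input file and let 'with' close it
    with open(os.path.join(path2, file), 'a+', encoding='utf-8') as f:
        for sent in doc.sents:
            for token in sent:
                f.write(file + '\t' + str(token.dep_) + '\t' + str(token.head) + '\t' + str(token.right_edge) + '\n')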

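The problem is that this does not preserve the order of the dependencies in the output file. I can't seem to find any reference to character position in the API documentation.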


2016-11-04 Shane

The character index is token.idx. The word index is token.i. I know this isn't especially intuitive.

Tokens also compare by position, so you can do:

for child in sent: 
    word1, word2 = sorted((child, child.head))

This gives you the two ends of each dependency arc, arranged in document order. However, I'm not sure what you're trying to do with the right edge, so I don't know whether this is what you want.
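As a rough sketch of how those attributes can be used to keep output rows in document order: the snippet below sorts each arc's two ends by word index and records both token.i and token.idx in every row. FakeToken, head_i, arc_rows and head_of are hypothetical names made up for this illustration so it runs without a language model; they are not part of spaCy. It sorts with an explicit key on i rather than relying on token comparison.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Token so the sketch runs without a model.
# It exposes only the attributes used below: text, i (word index),
# idx (character offset), dep_ (dependency label) and head_i (index of the head).
FakeToken = namedtuple("FakeToken", "text i idx dep_ head_i")


def arc_rows(tokens, head_of):
    """Yield one row per token, with each arc's two ends in document order."""
    for tok in tokens:
        head = head_of(tok)
        # Sort the two ends of the arc by word index, i.e. document order.
        first, second = sorted((tok, head), key=lambda t: t.i)
        yield (tok.i, tok.idx, tok.dep_, first.text, second.text)


# "She eats apples": "eats" is the root, so it is its own head (as in spaCy).
sent = [
    FakeToken("She", 0, 0, "nsubj", 1),
    FakeToken("eats", 1, 4, "ROOT", 1),
    FakeToken("apples", 2, 9, "dobj", 1),
]
rows = list(arc_rows(sent, head_of=lambda t: sent[t.head_i]))
for row in rows:
    print("\t".join(str(field) for field in row))
```

With a real parse, the same function should work by passing a spaCy sentence and head_of=lambda t: t.head, since spaCy tokens carry text, i, idx and dep_. Writing i or idx as the first column also lets you re-sort the file later if rows ever end up out of order.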


2016-11-04 20:07:48

Thanks, syllogism_! This worked. I ended up with the following: `for child in sent: head = child.head; head_pos = child.head.tag_; const = child; const_pos = …; …(… + '\t' + str(child.dep_) + '\t' + str(head) + '\t' + str(head_pos) + '\t' + str(const) + '\t' + str(const_pos) + '\n')` – Shane
