2016-11-17 11 views
2

How to match keywords in a list and list the matched sentences in Python

I combined three lists (file names, top words, and sentences) into listA using the zip function:

sentencelist = ['Iraqi txt 1 forces shelled areas outside of Mosul Monday. Hellow Time. Bus is coming to me. The offensive to retake the city from Islamic State militants. ', 
       ' That txt 2 strike came during an assault by Islamic State fighters. He says the evacuation of refugees from a squalid camp. The sky is blue.', 
       ' This txt 3 African migrant says leaving the camp is not an entirely happy experience. Workers in the house. Apple in the tree'] 


topwordlist = ['areas city from offensive', ' He sky camp blue',' leaving happy tree migrant'] 
sentencename_list=['001.txt', '002.txt', '003.txt'] 
#print topwordlist 


listA = list(zip(sentencename_list, topwordlist, sentencelist)) 
print(listA) 

listA has the following structure:

listA [(sentencename,topword,sentence), 
     (sentencename,topword,sentence), 
     (sentencename,topword,sentence)] 

Result:

 [("001.txt", "areas city from offensive", "Iraqi txt 1 forces shelled areas outside of Mosul Monday. Hellow Time. Bus is coming to me. The offensive to retake the city from Islamic State militants."), 
("002.txt","He sky camp blue"," That txt 2 strike came during an assault by Islamic State fighters. He says the evacuation of refugees from a squalid camp. The sky is blue."), 
("003.txt","leaving happy tree migrant","This txt 3 African migrant says leaving the camp is not an entirely happy experience. Workers in the house. Apple in the tree")] 

I want to get matchSentence, which contains only the sentences in which a word from topwordlist appears. For example, the following sentences are matchSentence values:

"Iraqi txt 1 forces shelled areas outside of Mosul Monday.", 
"The offensive to retake the city from Islamic State militants.", 
"He says the evacuation of refugees from a squalid camp." 
" The sky is blue." 
"This txt 3 African migrant says leaving the camp is not an entirely happy experience." 
"Apple in the tree." 

I want to add matchSentence to listA as a new element. For example, how can I get the following list of (sentencename, topword, matchSentence, sentence) tuples?

[("001.txt", "areas city from offensive", "Iraqi txt 1 forces shelled areas outside of Mosul Monday.", "Iraqi txt 1 forces shelled areas outside of Mosul Monday. Hellow Time. Bus is coming to me. The offensive to retake the city from Islamic State militants."), 
("001.txt", "areas city from offensive", "The offensive to retake the city from Islamic State militants.", "Iraqi txt 1 forces shelled areas outside of Mosul Monday. Hellow Time. Bus is coming to me. The offensive to retake the city from Islamic State militants."), 
("002.txt","..topword..",".matchSentence1...","..sentence.."), 
("002.txt","..topword..",".matchSentence2...","..sentence.."), 
("002.txt","..topword..",".matchSentence3...","..sentence.."), 
("003.txt","..topword..",".matchSentence...","..sentence.."), 
(.....)] 

Answers

0

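You could try doing the matching with a regular expression:

import re 
reg_sentence = re.compile(r'[^\.]+\.') 

With this, you can split a string into its individual sentences (as long as each one ends with a . character). Here any_string is one of the texts from sentencelist:

sentences_in_string = reg_sentence.findall(any_string) 

Then, for each sentence, check whether it contains any of the words in the matching topword string:

matched_sentences = [] 
for sentence in sentences_in_string: 
    matched = False 
    # split() with no argument ignores the leading space in some topword strings 
    for word in topword.split(): 
        if word in sentence:  # substring test, so 'He' also matches inside 'Hellow' 
            matched_sentences.append(sentence) 
            matched = True 
            break 

This is just an example, but I think you can get what you want if you adapt it to your code. I hope this helps.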
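To show how the pieces above might fit together, here is a minimal sketch that builds the (sentencename, topword, matchSentence, sentence) tuples asked for in the question. It is one possible adaptation, not the only way to do it, and it makes a few assumptions that are not in the original answer: the helper name match_sentences and the result name listB are made up for illustration, the sentence pattern is loosened to [^.]+\.? so a final sentence without a closing period is still picked up, and keywords are compared as whole, case-sensitive words instead of substrings. It is written for Python 3.

```python
import re

# Sample data from the question (each text is its own list item).
sentencename_list = ['001.txt', '002.txt', '003.txt']
topwordlist = ['areas city from offensive', ' He sky camp blue', ' leaving happy tree migrant']
sentencelist = [
    'Iraqi txt 1 forces shelled areas outside of Mosul Monday. Hellow Time. '
    'Bus is coming to me. The offensive to retake the city from Islamic State militants. ',
    ' That txt 2 strike came during an assault by Islamic State fighters. '
    'He says the evacuation of refugees from a squalid camp. The sky is blue.',
    ' This txt 3 African migrant says leaving the camp is not an entirely happy experience. '
    'Workers in the house. Apple in the tree',
]

# Assumption: a sentence is a run of non-period characters, optionally
# followed by one period (so a trailing sentence without '.' still counts).
SENTENCE_RE = re.compile(r'[^.]+\.?')


def match_sentences(names, topwords, texts):
    """Hypothetical helper: return (name, topword, matched_sentence, text) tuples."""
    rows = []
    for name, topword, text in zip(names, topwords, texts):
        keywords = set(topword.split())  # split() drops leading/extra spaces
        for raw in SENTENCE_RE.findall(text):
            sentence = raw.strip()
            if not sentence:
                continue  # skip whitespace-only leftovers
            # Assumption: whole-word, case-sensitive comparison.
            words = set(re.findall(r'\w+', sentence))
            if keywords & words:
                rows.append((name, topword, sentence, text))
    return rows


listB = match_sentences(sentencename_list, topwordlist, sentencelist)
for name, _, sentence, _ in listB:
    print(name, sentence)
```

Whole-word comparison is a deliberate choice here: with a plain substring test, the keyword He would also match Hellow. If case-insensitive matching is wanted, lowercasing both the keywords and the sentence words before comparing should be enough.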