Calculating the average number of words per sentence

I'm having a bit of trouble counting the number of words per sentence. In my case, I'm assuming that sentences only end with "!", "?", or ".". I have a list that looks like this:

["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]

For the example above, the calculation would be (1 + 3 + 5) / 3. I'm having a hard time achieving this, though! Any ideas?

Source

2017-02-09 natalien

words = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]

sentences = [[]]
ends = set(".?!")
for word in words:
    if word in ends:
        # a terminator closes the current sentence and starts a new one
        sentences.append([])
    else:
        sentences[-1].append(word)

if sentences[0]:
    # drop the empty sentence left behind by a trailing terminator
    if not sentences[-1]:
        sentences.pop()
    print("average sentence length:", sum(len(s) for s in sentences)/len(sentences))

Source

2017-02-09 18:21:14 inspectorG4dget

Short solution using the re.split() and sum() functions:

import re
s = "Hey ! How are you ? I would like a sandwich ."
parts = [len(l.split()) for l in re.split(r'[?!.]', s) if l.strip()]

print(sum(parts)/len(parts))

Output:

3.0

If the input can only be a list of words:

import re
s = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]
parts = [len(l.split()) for l in re.split(r'[?!.]', ' '.join(s)) if l.strip()]

print(sum(parts)/len(parts)) # 3.0

Source

2017-02-09 18:23:49 RomanPerekhrest

The second one worked really well! I like the use of regular expressions. I'm working on an NLP project. – natalien
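As a rough sketch building on the regex approach above, the join-and-split idea can be wrapped in a function so that the empty-input case is handled explicitly. The function name `average_words_per_sentence` and the decision to return 0.0 when no words are found are assumptions made for illustration; they are not part of the original answer.

```python
import re

def average_words_per_sentence(tokens):
    """Mean number of words per sentence for a list of tokens.

    Hypothetical helper: assumes sentences end only with ".", "?" or "!".
    """
    # Join the tokens into one string, then split on the terminators.
    chunks = re.split(r"[?!.]", " ".join(tokens))
    # Count the words in every non-empty chunk.
    counts = [len(chunk.split()) for chunk in chunks if chunk.strip()]
    # Assumed convention: return 0.0 instead of raising ZeroDivisionError.
    return sum(counts) / len(counts) if counts else 0.0

words = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]
print(average_words_per_sentence(words))  # 3.0
```

One caveat of this pattern: a token that itself contains a period, such as "3.14", would be split into two pieces, so it only suits input where terminators appear as separate tokens.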

Simple solution:

mylist = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]
terminals = set([".", "?", "!"]) # sets are efficient for "membership" tests
terminal_count = 0

for item in mylist:
    if item in terminals: # here is our membership test
        terminal_count += 1

# raises ZeroDivisionError if mylist contains no terminators
avg = (len(mylist) - terminal_count)/float(terminal_count)

This assumes you only care about the average per sentence, not the individual counts.

If you want to get a little fancy, you can replace the for loop with something like this:

terminal_count = sum(1 for item in mylist if item in terminals)

Source

2017-02-09 18:35:46 Fagan

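That's pretty clever. It would be slightly better to store the terminals in a `set` before the loop. Or, if you think that's overkill, you could at least write the condition as `if item in ".?!"`. – janos

@janos I could pull the terminals out into their own constant. For clarity, I would prefer a list of strings. – Fagan

Why a `list` and not a `set`? – janos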
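The counting answer above yields only the average. If the individual per-sentence counts are wanted as well, one possible sketch uses `itertools.groupby` to group consecutive non-terminator tokens. The names `TERMINALS` and `words_per_sentence` are hypothetical, and the terminator set is the same assumption the question makes.

```python
from itertools import groupby

# Assumed terminator set; a frozenset gives constant-time membership tests.
TERMINALS = frozenset(".?!")

def words_per_sentence(tokens):
    """Return the word count of each sentence as a list."""
    # groupby yields runs of consecutive tokens that share the same key:
    # True for terminators, False for ordinary words.
    return [
        len(list(run))
        for is_terminal, run in groupby(tokens, key=lambda t: t in TERMINALS)
        if not is_terminal
    ]

words = ["Hey", "!", "How", "are", "you", "?", "I", "would", "like", "a", "sandwich", "."]
counts = words_per_sentence(words)
print(counts)                     # [1, 3, 5]
print(sum(counts) / len(counts))  # 3.0
```

Because consecutive terminators fall into a single run, input such as "?", "!" in a row does not produce empty sentences. The final division still needs a guard if the token list might contain no words.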
