2017-12-31 71 views
-3

How can I remove a single leading space from a paragraph and capitalize the paragraph's first letter? Leading space and capitalization

Input:

this is a sample sentence. This is a sample second sentence. 

Output:

This is a sample sentence. This is a sample second sentence. 

My attempt so far:

import spacy, re 
nlp = spacy.load('en_core_web_sm') 
doc = nlp(unicode(open('2.txt').read().decode('utf8'))) 
tagged_sent = [(w.text, w.tag_) for w in doc] 
normalized_sent = [w.capitalize() if t in ["NN","NNS"] else w for (w,t) in tagged_sent] 
normalized_sent1 = normalized_sent[0].capitalize() 
string = re.sub(" (?=[\.,'!?:;])", "", ' '.join(normalized_sent1)) 
rtn = re.split('([.!?] *)', string) 
final = ''.join([i.capitalize() for i in rtn]) 
print final 

This capitalizes the first word of every sentence in the paragraph except the one at the very beginning of the paragraph:

Output: 
on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. 

Expected output: 
On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. 
+2

What have you tried so far? Please post your code. – James

+1

Define "paragraph". – Sweeper

+0

Would it be okay to use the 'nltk' library? –

Answers

1

If the only requirement is to remove the leading space and then capitalize the first letter, you can try something like this:

your_data = ' on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. you can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. when you create pictures, charts, or diagrams, they also coordinate with your current document look. '

# Drop a single leading space, if present.
if your_data.startswith(' '):
    your_data = your_data[1:]

words = your_data.split()
for j, word in enumerate(words):
    # Capitalize the very first word of the paragraph.
    if j == 0:
        words[j] = words[j].capitalize()
    # A word containing '.' ends a sentence, so capitalize the word after it.
    if '.' in word and j + 1 < len(words):
        words[j + 1] = words[j + 1].capitalize()

print(" ".join(words))

Output:

On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. 
2

You can use a regular expression together with str.capitalize():

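import re
s = " this is a sample sentence. This is a sample second sentence."
# Strip leading whitespace, split on a period followed by whitespace, then capitalize each sentence.
new_s = '. '.join(i.capitalize() for i in re.split(r'\.\s', re.sub(r'^\s+', '', s)))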

Output:

'This is a sample sentence. This is a sample second sentence.' 
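
One caveat: str.capitalize() also lowercases every character after the first, so a sentence such as "this is NASA data." would come back as "This is nasa data.". If that matters, a variant along the following lines might help. It is only a minimal sketch: the function name fix_paragraph is made up for illustration, and it assumes a sentence ends with '.', '!' or '?' followed by whitespace, so abbreviations like "e.g. the" would be capitalized as well.

```python
import re

def fix_paragraph(text):
    """Remove one leading space and uppercase the first letter of each sentence."""
    # Drop exactly one leading space, if there is one.
    if text.startswith(' '):
        text = text[1:]
    # Uppercase a lowercase letter at the start of the text, or one that follows
    # '.', '!' or '?' plus whitespace. All other characters are left untouched.
    return re.sub(r'(^|[.!?]\s+)([a-z])',
                  lambda m: m.group(1) + m.group(2).upper(),
                  text)

print(fix_paragraph(" this is a sample sentence. this is NASA data."))
```

Because only the matched letter is changed, existing capitals inside a sentence survive.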
+0

Thank you. However, my expected output is: This is a sample sentence. This is a sample second sentence. –

+0

@Programmer_nltk Please see my recent edit. – Ajax1234

1

A simple solution would be the following (though I recommend @Ajax's answer):

x = 'on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. ' 
print('. '.join(map(lambda s: s.strip().capitalize(), x.split('.')))) 

Output:

On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. 
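
If the text holds several paragraphs and only the first letter of each paragraph needs fixing, something like the sketch below may be enough. It assumes one paragraph per line, which the question does not actually state, and the helper names capitalize_first and fix_paragraphs are hypothetical.

```python
def capitalize_first(paragraph):
    """Strip one leading space and uppercase only the first character."""
    if paragraph.startswith(' '):
        paragraph = paragraph[1:]
    # Slicing keeps this safe for empty strings.
    return paragraph[:1].upper() + paragraph[1:]

def fix_paragraphs(text):
    # Assumption: paragraphs are separated by newline characters.
    return '\n'.join(capitalize_first(line) for line in text.split('\n'))

sample = " on the insert tab, the galleries include items. You can use these.\n second paragraph here."
print(fix_paragraphs(sample))
```

This leaves the rest of each paragraph exactly as it was, including any capitals inside sentences.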