Save print output to .txt

I have a script that extracts all email addresses from a .txt document and prints them. I want to save the addresses to list.txt and, if possible, remove duplicates, but the script fails with this error:

Traceback (most recent call last): 
    File "mail.py", line 44, in <module> 
    notepad.write(email.read()) 
AttributeError: 'str' object has no attribute 'read'

Script:

from optparse import OptionParser 
import os.path 
import re 

regex = re.compile(("([a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`" 
        "{|}~-]+)*(@|\sat\s)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(\.|" 
        "\sdot\s))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)")) 

def file_to_str(filename): 
    """Returns the contents of filename as a string.""" 
    with open(filename) as f: 
     return f.read().lower() # Case is lowered to prevent regex mismatches. 

def get_emails(s): 
    """Returns an iterator of matched emails found in string s.""" 
    # Removing lines that start with '//' because the regular expression 
    # mistakenly matches patterns like 'http://[email protected]' as '//[email protected]'. 
    return (email[0] for email in re.findall(regex, s) if not  email[0].startswith('//')) 

if __name__ == '__main__': 
    parser = OptionParser(usage="Usage: python %prog [FILE]...") 
    # No options added yet. Add them here if you ever need them. 
    options, args = parser.parse_args() 

    if not args: 
     parser.print_usage() 
     exit(1) 

    for arg in args: 
     if os.path.isfile(arg): 
      for email in get_emails(file_to_str(arg)): 
       #print email 
       notepad = open("list.txt","wb") 
       notepad.write(email.read()) 
       notepad.close() 

     else: 
      print '"{}" is not a file.'.format(arg) 
      parser.print_usage()

Source

2016-12-02 Jesse kraal

Try this: call notepad.write(email) without .read(). – neverwalkaloner

As @neverwalkaloner wrote, remove '.read()' from 'email.read()'. In addition, to remove duplicates, you can convert the return value of 'get_emails' to a set with 'for email in set(get_emails(file_to_str(arg))):'. – internetional
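A minimal sketch of the deduplication suggested in the comment above, written for Python 3 (the script in the question is Python 2). The helper name `unique_emails` and the sample addresses are hypothetical. `set()` works as suggested but does not keep the original order; `dict.fromkeys` is one way to drop duplicates while keeping first-seen order (guaranteed in Python 3.7+).

```python
def unique_emails(emails):
    """Return the addresses with duplicates removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order in Python 3.7+.
    return list(dict.fromkeys(emails))


# Made-up sample data standing in for the output of get_emails().
sample = ["a@example.com", "b@example.com", "a@example.com"]
print(unique_emails(sample))
```

If order does not matter, `set(sample)` is enough.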

When I remove .read(), list.txt shows only one email address, whereas print email shows a couple of hundred. When I refresh list.txt while the extraction is running, the address changes, but only one is ever shown.

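That's because you have the open() and close() inside the loop, i.e. the file is rewritten from scratch for each email, so you end up with only the last address in it. Change the loop to:

notepad = open("list.txt", "wb")  # open once, before the loop
for email in get_emails(file_to_str(arg)):
    #print email
    notepad.write(email + "\n")  # newline keeps one address per line
notepad.close()  # close once, after the loop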

Or better:

with open("list.txt", "wb") as notepad:  # the file is closed automatically
    for email in get_emails(file_to_str(arg)):
        #print email
        notepad.write(email + "\n")  # newline keeps one address per line
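For readers on Python 3, the same idea can be sketched as a small self-contained function. This is an illustration under stated assumptions, not the answer's original code: the name `save_emails`, the temporary file, and the sample addresses are all hypothetical. The file is opened once in text mode, and each address is written on its own line.

```python
import os
import tempfile


def save_emails(emails, path):
    """Write each address in emails to path, one per line; return the count."""
    count = 0
    # Open the file once, outside the loop, so earlier lines are not overwritten.
    with open(path, "w") as out:
        for email in emails:
            out.write(email + "\n")
            count += 1
    return count


# Usage with made-up addresses; a temporary file stands in for list.txt.
found = ["b@example.com", "a@example.com", "b@example.com"]
demo_path = os.path.join(tempfile.mkdtemp(), "list.txt")
written = save_emails(sorted(set(found)), demo_path)  # set() drops duplicates
with open(demo_path) as f:
    saved = f.read()
print(written, saved.splitlines())
```

Sorting the set is optional; it only makes the output order predictable.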

Source

2018-02-28 13:36:31 Armali
