ImportError with the requests async module

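I'm writing a simple script that:

loads a big list of URLs
gets the content of each URL, making concurrent HTTP requests with requests' async module
parses the content of each page with lxml in order to check whether a link is in the page
if the link is present on the page, saves some information about the page in a ZODB database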

When I test the script with 4 or 5 URLs it works well. I only get the following message when the script ends:

Exception KeyError: KeyError(45989520,) in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored

But when I try to check about 24000 URLs, it fails toward the end of the list (when there are about 400 URLs left to check) with the following error:

Traceback (most recent call last): 
    File "check.py", line 95, in <module> 
    File "/home/alex/code/.virtualenvs/linka/local/lib/python2.7/site-packages/requests/async.py", line 83, in map 
    File "/home/alex/code/.virtualenvs/linka/local/lib/python2.7/site-packages/gevent-1.0b2-py2.7-linux-x86_64.egg/gevent/greenlet.py", line 405, in joinall 
ImportError: No module named queue 
Exception KeyError: KeyError(45989520,) in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored

I tried both the version of gevent available on PyPI and the latest version (1.0b2), downloaded and installed from the gevent repository.

I cannot understand why this happens, or why it happens only when I check a large batch of URLs. Any hints?

Here is the entire script:

from requests import async, defaults 
from lxml import html 
from urlparse import urlsplit 
from gevent import monkey 
from BeautifulSoup import UnicodeDammit 
from ZODB.FileStorage import FileStorage 
from ZODB.DB import DB 
import transaction 
import persistent 
import random 

storage = FileStorage('Data.fs') 
db = DB(storage) 
connection = db.open() 
root = connection.root() 
monkey.patch_all() 
defaults.defaults['base_headers']['User-Agent'] = "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko/20100101 Firefox/11.0" 
defaults.defaults['max_retries'] = 10 


def save_data(source, target, anchor): 
    root[source] = persistent.mapping.PersistentMapping(dict(target=target, anchor=anchor)) 
    transaction.commit() 


def decode_html(html_string):
    converted = UnicodeDammit(html_string, isHTML=True)
    if not converted.unicode:
        raise UnicodeDecodeError(
            "Failed to detect encoding, tried [%s]",
            ', '.join(converted.triedEncodings))
    # print converted.originalEncoding
    return converted.unicode


def find_link(html_doc, url):
    decoded = decode_html(html_doc)
    doc = html.document_fromstring(decoded.encode('utf-8'))
    for element, attribute, link, pos in doc.iterlinks():
        if attribute == "href" and link.startswith('http'):
            netloc = urlsplit(link).netloc
            if "example.org" in netloc:
                return (url, link, element.text_content().strip())
    else:
        return False


def check(response):
    if response.status_code == 200:
        html_doc = response.content
        result = find_link(html_doc, response.url)
        if result:
            source, target, anchor = result
            # print "Source: %s" % source
            # print "Target: %s" % target
            # print "Anchor: %s" % anchor
            # print
            save_data(source, target, anchor)
    global todo
    todo = todo - 1
    print todo

def load_urls(fname):
    with open(fname) as fh:
        urls = set([url.strip() for url in fh.readlines()])
        urls = list(urls)
        random.shuffle(urls)
        return urls

if __name__ == "__main__":

    urls = load_urls('urls.txt')
    rs = []
    todo = len(urls)
    print "Ready to analyze %s pages" % len(urls)
    for url in urls:
        rs.append(async.get(url, hooks=dict(response=check), timeout=10.0))
    responses = async.map(rs, size=100)
    print "DONE."

Source

2012-04-22 raben

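Have you tried debugging to get more details about the state of the script when it fails? Is it always the same URL? (Catch the exception and log the URL.) Is it a memory problem? (Watch the memory usage while it runs.)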
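The "catch the exception and log the URL" suggestion could look roughly like the sketch below. It is only an illustration, not part of the original script: `log_failures` and `failed_urls` are hypothetical names, and the wrapper assumes the hook receives an object with a `url` attribute, as `check` does in the question. Note that it swallows the exception after logging it, and that it only sees exceptions raised inside the hook itself, so it may not catch an error raised inside gevent's `joinall`.

```python
import functools
import logging

log = logging.getLogger("check")
failed_urls = []  # hypothetical: URLs whose hook raised, kept for a later re-run


def log_failures(hook):
    """Wrap a response hook so an exception is logged together with the URL."""
    @functools.wraps(hook)
    def wrapper(response, *args, **kwargs):
        try:
            return hook(response, *args, **kwargs)
        except Exception:
            # getattr keeps the wrapper usable with objects that have no .url
            url = getattr(response, "url", "<unknown url>")
            log.exception("hook failed for %s", url)
            failed_urls.append(url)
            return None
    return wrapper
```

In the script from the question it would be used as `hooks=dict(response=log_failures(check))`. Inspecting `failed_urls` after the run shows whether the failures cluster on particular URLs.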

I'm a big n00b, but I can try anyway! I guess you could try changing your import list to this one:

from requests import async, defaults 
import requests 
from lxml import html 
from urlparse import urlsplit 
from gevent import monkey 
import gevent 
from BeautifulSoup import UnicodeDammit 
from ZODB.FileStorage import FileStorage 
from ZODB.DB import DB 
import transaction 
import persistent 
import random

Try this and tell me whether it works. I guess it could solve your problem :)

Source

2012-04-29 09:07:06

If my solution does not fix the problem, have a look at this link: http://www.daniweb.com/software-development/python/threads/251918/import-queue-dont-exist

Thanks, I'll try it. But why do you think this would solve my problem? – raben

I had the same problem and it worked for me. I still can't understand why... anyway, you are free to test it :-)

I'm not sure what the cause of your problem is, but why don't you have monkey.patch_all() at the top of your file?

Could you try putting

from gevent import monkey; monkey.patch_all()

at the top of your main program and see whether it fixes anything?
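One hedged way to check whether import order matters is to look at which modules are already loaded just before `monkey.patch_all()` runs. The sketch below rests on an assumption not confirmed in this thread: that the `KeyError` reported from `threading` at exit is related to `threading` having been imported before it was patched. The helper name `imported_too_early` is hypothetical, and it uses only the standard library.

```python
import sys


def imported_too_early(names=("threading",), modules=None):
    """Return the entries of `names` that are already in the module table.

    `modules` defaults to sys.modules; passing a dict makes the check testable.
    """
    modules = sys.modules if modules is None else modules
    return [name for name in names if name in modules]
```

Calling `imported_too_early()` on the line just before `monkey.patch_all()` in the script from the question would show whether one of the earlier imports has already pulled in `threading`. If it has, moving the patch call above those imports is the change this answer proposes.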

Source

2012-04-30 14:36:54

The monkey patching is done internally by the 'async' module, so I don't think the explicit monkey patch is needed at all. I followed your suggestion and got the same exception. – raben

Good day. I think it is an open Python bug: issue 1596321, http://bugs.python.org/issue1596321

Source

2012-05-05 08:08:16
