2016-08-08 4 views

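I have used Scrapy on craigslist before with some success, but now I am trying to scrape Steam for arbitrary usernames (xempy, for example). The Scrapy shell returns a blank array on the Steam website, where the username element is contained within: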

<a class="searchPersonaName" href="https://steamcommunity.com/id/zxZEmpy">xempy</a> 

The URL I am trying to scrape is:

https://steamcommunity.com/search/users/#filter=users&text=xempy

The command I am using to scrape the actual username from that URL is:

response.select('//*[@id="search_results"]/div[3]/div[3]/a/text()').extract()

I used Chrome to copy the XPath of the element I am interested in rather than typing it by hand. Whether I copy it, type it by hand, or use an absolute path, I get an empty array instead of the username "xempy".

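What am I doing wrong? I scraped craigslist successfully with the same process, but it does not seem to work on the Steam website, and I could not find any working example of a Scrapy script for Steam.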

+1

Run 'view(response)' from the shell and look at the actual source in the browser: right-click and choose "view source".

Answers


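If you look at the actual source in your browser (right-click and choose "view source"), you will see no sign of the results. The data is added dynamically via an ajax request to https://steamcommunity.com/search/SearchCommunityAjax.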
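One way to see why the first request cannot carry the search: everything after the # in the question's URL is a fragment, which the browser keeps for the page's JavaScript and does not send to the server. The snippet below is only an illustration using the standard library's urlsplit and parse_qs; it is not part of the original answer.

```python
from urllib.parse import parse_qs, urlsplit

# The search URL from the question.
search_url = "https://steamcommunity.com/search/users/#filter=users&text=xempy"
parts = urlsplit(search_url)

# Only the path and the (empty) query string are sent to the server.
print(parts.path)         # /search/users/
print(repr(parts.query))  # ''

# The search terms live in the fragment, which stays on the client side.
print(parts.fragment)     # filter=users&text=xempy
fragment_params = parse_qs(parts.fragment)
print(fragment_params)    # {'filter': ['users'], 'text': ['xempy']}
```

So a plain GET of that URL fetches the empty search page, and the page's script then passes the same values to the ajax endpoint as real query parameters.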

You need to mimic the ajax request. I am using requests here, but the steps will be the same with Scrapy. If we run this code:

import requests

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"}
params = {"text": "xempy", "filter": "users", "sessionid": "", "steamid_user": "false", "page": "1"}
ajax_url = "https://steamcommunity.com/search/SearchCommunityAjax"
with requests.Session() as s:
    # send the browser-like headers defined above with every request
    s.headers.update(headers)
    # the first request is only needed to obtain a session id
    r = s.get("https://steamcommunity.com/search/users/#filter=users&text=xempy")
    # need to update the session id, which we get from the previous get's headers
    params["sessionid"] = next(
        c.split("=", 1)[1] for c in r.headers["set-cookie"].split(";") if c.startswith("sessionid"))
    # need to update the session headers
    s.headers.update(r.headers)
    # and also the cookies from the previous request
    s.cookies.update(r.cookies)
    result = s.get(ajax_url, params=params).json()

You can see that we get some JSON back:

In [5]: with requests.Session() as s:
   ...:     s.headers.update(headers)
   ...:     r = s.get("https://steamcommunity.com/search/users/#filter=users&text=xempy")
   ...:     params["sessionid"] = next(
   ...:         c.split("=", 1)[1] for c in r.headers["set-cookie"].split(";") if c.startswith("sessionid"))
   ...:     s.headers.update(r.headers)
   ...:     s.cookies.update(r.cookies)
   ...:     result = s.get(ajax_url, params=params).json()
   ...:     print(result)
   ...:
{u'html': u'\t\t<div style="float: right; padding-bottom: 2px">\r\n\t\t\t\t\t\tShowing 1 - 11 of 11\t\t\t</div>\r\n\t<div style="clear: both"></div>\r\n\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="16183171" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/zxZEmpy"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/b9/b9c886a08cf17c4f1f31ea19148d8b3bbd748762_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/zxZEmpy">xempy</a><br />\r\n\t\t\t\t\t\t\t\t&nbsp;\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">zxZEmpy</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\tAlso known as: <span style="color: whitesmoke">trill</span>, <span style="color: whitesmoke">[TGIF] Mario Batali</span>, <span style="color: whitesmoke">[TGIF] Mario \xdfatali</span>, <span style="color: whitesmoke">Mario \xdfatali</span>, <span style="color: whitesmoke">[TGIF\'</span>, <span style="color: whitesmoke">[TGIF] Mario \u03b2atali</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="280326130" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/xempyjecar"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/89/8928b324ba9c12859283e8be3f11f19d9232033c_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/xempyjecar">Xempy -A-</a><br />\r\n\t\t\t\t\tIgor<br />\t\t\tSerbia&nbsp;<img style="margin-bottom:-2px" 
src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/rs.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">xempyjecar</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\tAlso known as: <span style="color: whitesmoke">Xempy -A- NEW SEASON HYPEE</span>, <span style="color: whitesmoke">Brekija</span>, <span style="color: whitesmoke">FAIRPLAY ORGANISATION</span>, <span style="color: whitesmoke">Xempy | csgoshit.com</span>, <span style="color: whitesmoke">Xempy | csgorage.com</span>, <span style="color: whitesmoke">\u2500\u2500\u2500\u2554\u2550\u2550\u2550\u2557</span>, <span style="color: whitesmoke">XempyTheCupcake</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="315139919" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/filipppp"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/ca/caa5747851b5255a2d76699d855bf20e709af3d1_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/filipppp">Xempy -A-</a><br />\r\n\t\t\t\t\tIgor<br />\t\t\tSerbia&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/rs.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">filipppp</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\tAlso known as: <span style="color: whitesmoke">Extreeemeeee</span>, <span style="color: 
whitesmoke">Ratatatatatata</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="258386073" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/lenyagoglov"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/71/71ee8d0519c74cea0352836b188c747b36224f8f_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/lenyagoglov">Xempys</a><br />\r\n\t\t\t\t\tTed<br />\t\t\tLuxembourg&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/lu.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">lenyagoglov</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="257927191" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/rostislavtseychuk85"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/86/8641de85a283f0d23d1cbeb35ee0c0d5ca87a83b_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/rostislavtseychuk85">Xempys</a><br />\r\n\t\t\t\t\tGabriel<br />\t\t\tLebanon&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/lb.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: 
whitesmoke">rostislavtseychuk85</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="252811169" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/mochulskayaa"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/76/76c10b0744403468aaf8090f56ca8ddd61338925_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/mochulskayaa">Xempys</a><br />\r\n\t\t\t\t\tRichard<br />\t\t\tGuatemala&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/gt.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">mochulskayaa</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="260028611" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/katerukhina"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/24/24241e97a6caf3bd932a01ea22afc6b3d758f1a1_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/katerukhina">Xempys</a><br />\r\n\t\t\t\t\tChristian<br />\t\t\tFiji&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/fj.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: 
whitesmoke">katerukhina</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="292454844" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/purdenkos"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/5c/5c7f9d1b71a68ab8599ae0fe8f2c4e0445348eaa_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/purdenkos">Xempys</a><br />\r\n\t\t\t\t\tPatrik<br />\t\t\tCote D\'ivoire (Ivory Coast)&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/ci.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">purdenkos</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="56000172" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/v2incent"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/ac/ac45a256e0a14712efff255db0105fedd80a4f0e_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/v2incent">Ext4ze ` ^0| \'Xempy^0\'</a><br />\r\n\t\t\t\t\tv2incent<br />\t\t\t&nbsp;\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">v2incent</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div 
class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="297670812" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/xempy"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/62/62ea583f7f838562c73cb70e3993e27acd583aef_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/xempy">xempsanity `\xb4</a><br />\r\n\t\t\t\t\tIgor<br />\t\t\tSerbia&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/rs.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom URL: steamcommunity.com/id/<span style="color: whitesmoke">xempy</span></div>\r\n\t\t\t\t\t\t\t\t\t\t<div>\r\n\t\t\t\t\tAlso known as: <span style="color: whitesmoke">XEMPYKiNGOFNOTHiNG</span>, <span style="color: whitesmoke">X3MPY</span>, <span style="color: whitesmoke">X3MPY * brother\'s on acc</span>\t\t\t\t</div>\r\n\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t\t\t\t\t<div class="search_row">\r\n\t<div class="search_result_friend">\r\n\t\t\t</div>\r\n\t<div class="mediumHolder_default" data-miniprofile="121633219" style="float:left;"><div class="avatarMedium"><a href="https://steamcommunity.com/id/Empyrk"><img src="https://steamcdn-a.akamaihd.net/steamcommunity/public/images/avatars/6b/6b87d7a04bf211a2665b828436ad34e549f2b193_medium.jpg"></a></div></div>\r\n\t<div class="searchPersonaInfo">\r\n\t\t<a class="searchPersonaName" href="https://steamcommunity.com/id/Empyrk">Empyrk</a><br />\r\n\t\t\t\t\tMatteo<br />\t\t\tToscana, Italy&nbsp;<img style="margin-bottom:-2px" src="https://steamcommunity-a.akamaihd.net/public/images/countryflags/it.gif" border="0" />\t\t\t</div>\r\n\t<div style="clear:left"></div>\r\n\r\n\t\t\t<div class="search_match_info">\r\n\t\t\t\t\t\t\t\t\t\t<div>Custom 
URL: steamcommunity.com/id/<span style="color: whitesmoke">Empyrk</span></div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\t</div>\r\n\t\t\t\t<div style="clear: both"></div>\r\n\t\t<div style="float: right; padding-bottom: 2px">\r\n\t\t\t\t\t\tShowing 1 - 11 of 11\t\t\t</div>\r\n\t<div style="clear: both"></div>\r\n\r\n\r\n', u'search_filter': u'users', u'search_text': u'xempy', u'success': 1, u'search_page': 1} 

To get the source, you need to access result["html"].
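Once you have that HTML string, the usernames are the text of the a elements with class "searchPersonaName", as in the question. Here is a minimal sketch of pulling them out with only the standard library. The PersonaNameParser class, the extract_persona_names helper and the shortened sample_html are my own illustrative names and data, cut down from the response above; with Scrapy you would more likely feed the string to a Selector and reuse an XPath.

```python
from html.parser import HTMLParser


class PersonaNameParser(HTMLParser):
    """Collect the text of every <a class="searchPersonaName"> element."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        # The avatar links are also <a> tags but carry no class attribute.
        if tag == "a" and dict(attrs).get("class") == "searchPersonaName":
            self._in_name = True
            self.names.append("")

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current name.
        if self._in_name:
            self.names[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_name = False


def extract_persona_names(html):
    """Hypothetical helper: return the display names found in the search HTML."""
    parser = PersonaNameParser()
    parser.feed(html)
    parser.close()
    return [name.strip() for name in parser.names]


# Shortened stand-in for result["html"], taken from the first two rows above.
sample_html = (
    '<div class="searchPersonaInfo">'
    '<a class="searchPersonaName" href="https://steamcommunity.com/id/zxZEmpy">xempy</a><br />'
    '</div>'
    '<div class="searchPersonaInfo">'
    '<a class="searchPersonaName" href="https://steamcommunity.com/id/xempyjecar">Xempy -A-</a><br />'
    '</div>'
)

names = extract_persona_names(sample_html)
print(names)  # ['xempy', 'Xempy -A-']
```

In the real case you would call extract_persona_names(result["html"]) instead of using the sample string.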

+0

Thank you! That worked! – isaacprograms
