2016-09-16 6 views
1

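Scraping information from a table

I would like to use Python to extract the odds information from the website http://www.footballlocks.com/nfl_odds.shtml.

I have tried doing this with BeautifulSoup.

The ideal result would be the odds information in dictionary or list form, because the values will be fed into a formula.

The HTML code for the odds information is as follows: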

<TABLE COLS="6" WIDTH="650" BORDER="0" CELLSPACING="5" CELLPADDING="2"> 

    <TR> 
    <TD WIDTH="19%"><span title="Date and Time of Game."><B>Date & Time</B></span></TD> 
    <TD WIDTH="21%"><span title="Team Spotting Points in a Bet Against the Point Spread."><B>Favorite</B></span></TD> 
    <TD WIDTH="14%"><span title="Short for Point Spread. Number of Points Subtracted from Final Score of Favorite to Determine Winner of a Point Spread Based Wager."><B>Spread</B></span></TD> 
    <TD WIDTH="21%"><span title="Team Receiving Points in a Bet With the Point Spread."><B>Underdog</B></span></TD> 
    <TD WIDTH="6%"><span title="Line for Betting Over or Under the Total number of Points Scored by Both Teams Combined. Synonymous With Over/Under."><B>Total</B></span></TD> 
    <TD WIDTH="19%"><span title="Money odds to Win the Game Outright, Without any Point Spread. 
Minus (-) is Amount Bettors Risk for Each $100 on the Favorite to Win the Game Outright. 
Plus (+) is Amount Bettors Win for Each $100 Risked on the Underdog to Win the Game Outright."><B>Money Odds</B></span></TD> 
    </TR> 






<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Detroit</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6</TD> 
    <TD>Tennessee</TD> 
    <TD>47</TD> 
    <TD>-$255 +$215</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Houston</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-2.5</TD> 
    <TD>Kansas City</TD> 
    <TD>43</TD> 
    <TD>-$140 +$120</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At New England</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>Miami</TD> 
    <TD>42</TD> 
    <TD>-$290 +$240</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>Baltimore</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>At Cleveland</TD> 
    <TD>42.5</TD> 
    <TD>-$300 +$250</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Pittsburgh</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-3.5</TD> 
    <TD>Cincinnati</TD> 
    <TD>48.5</TD> 
    <TD>-$180 +$160</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Washington</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-2.5</TD> 
    <TD>Dallas</TD> 
    <TD>45.5</TD> 
    <TD>-$145 +$125</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At NY Giants</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-4.5</TD> 
    <TD>New Orleans</TD> 
    <TD>53.5</TD> 
    <TD>-$225 +$185</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Carolina</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-13.5</TD> 
    <TD>San Francisco</TD> 
    <TD>45</TD> 
    <TD>-$900 +$600</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:05 ET</TD> 
    <TD>At Arizona</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-7</TD> 
    <TD>Tampa Bay</TD> 
    <TD>50</TD> 
    <TD>-$310 +$260</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:05 ET</TD> 
    <TD>Seattle</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>At Los Angeles</TD> 
    <TD>38</TD> 
    <TD>-$290 +$240</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:25 ET</TD> 
    <TD>At Denver</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>Indianapolis</TD> 
    <TD>46.5</TD> 
    <TD>-$280 +$240</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:25 ET</TD> 
    <TD>At Oakland</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-4.5</TD> 
    <TD>Atlanta</TD> 
    <TD>49</TD> 
    <TD>-$210 +$180</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:25 ET</TD> 
    <TD>At San Diego</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-3</TD> 
    <TD>Jacksonville</TD> 
    <TD>47</TD> 
    <TD>-$165 +$145</TD> 
    </TR> 


<TR> 
    <TD>9/18 8:30 ET</TD> 
    <TD>Green Bay</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-2.5</TD> 
    <TD>At Minnesota</TD> 
    <TD>43.5</TD> 
    <TD>-$140 +$120</TD> 
    </TR> 




</TABLE> 

My Python code so far:

from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3; in Python 2 this was urllib.urlopen

url = "http://www.footballlocks.com/nfl_odds.shtml"
html = urlopen(url)

soup = BeautifulSoup(html, 'html.parser')

# Print the text of every cell in every table row
for record in soup.find_all('tr'):
    for data in record.find_all('td'):
        print(data.text)

P.S. My background is in economics, and my programming experience is limited.

+1

What exactly is your question? – martineau

+0

How can I put only the odds-related information into a dictionary? For example, the row 9/18 1:00 ET \t At Carolina \t -13.5 \t San Francisco \t -$900 +$600 would become {'At Carolina': -900, 'San Francisco': 600} –

+1

You can edit your question and add the "this is my question" part from above. – JasonD

Answers

1

This is not nice HTML to parse, since there are no classes we can use, but the following puts every row into a list of dicts:

from pprint import pprint as pp

from bs4 import BeautifulSoup
import requests


url = "http://www.footballlocks.com/nfl_odds.shtml"

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Use the text of one of the headers to find the correct table
table = soup.find("span", text="Date & Time").find_previous("table")

data = []
# "tr + tr" matches every tr that follows another tr, which skips the header row
for row in table.select("tr + tr"):
    # cell order: 0 date, 1 favorite, 2 spread, 3 underdog, 4 total, 5 money odds
    tds = [td.text for td in row.find_all("td")]
    fav, under, odds = tds[1], tds[3], tds[-1]
    # split money odds into fav/under odds
    f_odds, u_odds = odds.split()

    data.append({fav: f_odds.replace(u"$", ""), under: u_odds.replace(u"$", "")})

pp(data)

Output:

[{u'At Detroit': u'-255', u'Tennessee': u'+215'}, 
{u'At Houston': u'-130', u'Kansas City': u'+110'}, 
{u'At New England': u'-290', u'Miami': u'+240'}, 
{u'At Cleveland': u'+225', u'Baltimore': u'-265'}, 
{u'At Pittsburgh': u'-175', u'Cincinnati': u'+155'}, 
{u'At Washington': u'-150', u'Dallas': u'+130'}, 
{u'At NY Giants': u'-215', u'New Orleans': u'+180'}, 
{u'At Carolina': u'-900', u'San Francisco': u'+600'}, 
{u'At Arizona': u'-330', u'Tampa Bay': u'+270'}, 
{u'At Los Angeles': u'+250', u'Seattle': u'-300'}, 
{u'At Denver': u'-275', u'Indianapolis': u'+235'}, 
{u'At Oakland': u'-210', u'Atlanta': u'+180'}, 
{u'At San Diego': u'-160', u'Jacksonville': u'+140'}, 
{u'At Minnesota': u'+115', u'Green Bay': u'-135'}] 
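
The values in this output are still strings. Since the question mentions feeding the odds into a formula, a minimal sketch of converting them to integers might look like the following. The names `to_numeric` and `sample` are hypothetical and used only for illustration; the two sample rows are copied from the output above.

```python
def to_numeric(rows):
    """Convert odds strings such as '-255' or '+215' to ints."""
    # int() accepts a leading '+' or '-' sign, so no extra parsing is needed
    return [{team: int(odds) for team, odds in row.items()} for row in rows]


# Sample rows in the same shape as the list of dicts built by the scraper
sample = [
    {"At Detroit": "-255", "Tennessee": "+215"},
    {"At Carolina": "-900", "San Francisco": "+600"},
]

numeric = to_numeric(sample)
print(numeric)
```

Applied to the scraped list, this would be called as `to_numeric(data)`, assuming every value in `data` has the sign-plus-digits form shown in the output.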
+0

Thank you! –

+0

You would probably be better off storing tuples or a nested structure, something like {"home": {"team": u'At Detroit', "spread": u'-255'}, "away": {"team": u'Tennessee', "spread": u'+215'}} –
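
One way to read this suggestion is to keep each game as a single record instead of a dict keyed by team name. The following is only a sketch under that assumption, using `collections.namedtuple` from the standard library. `Game`, `game_from_cells`, and the field names are hypothetical, and the cell list mirrors the six-column layout of the table in the question (date, favorite, spread, underdog, total, money odds).

```python
from collections import namedtuple

# Hypothetical record type: one entry per game
Game = namedtuple(
    "Game", ["favorite", "underdog", "spread", "favorite_odds", "underdog_odds"]
)


def game_from_cells(cells):
    """Build a Game from the six cell texts of one table row."""
    # Money odds look like '-$255 +$215': drop the '$' and split on whitespace
    fav_odds, dog_odds = cells[5].replace("$", "").split()
    return Game(
        favorite=cells[1].strip(),
        underdog=cells[3].strip(),
        # strip() also removes the non-breaking spaces that pad the spread cell
        spread=float(cells[2].strip()),
        favorite_odds=int(fav_odds),
        underdog_odds=int(dog_odds),
    )


# Cell texts of the first data row in the question's HTML
cells = ["9/18 1:00 ET", "At Detroit", "\xa0\xa0\xa0-6", "Tennessee", "47", "-$255 +$215"]
game = game_from_cells(cells)
print(game)
```

A record like this keeps favorite and underdog distinguishable by field name, which a dict keyed by team name does not.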
