BeautifulSoup: select the parent's content, not the children's content

When I call getText() on the result of text.find_all('div', class_="date_on_by") for the following HTML:

<div class="date_on_by"> 
<a sasource="qp_focused" href="/author/bill-maurer/articles">Bill Maurer</a> 
<span class="bullet">•</span> Yesterday, 9:33 AM 
<span class="bullet">•</span> 
<span class="comments">98&nbsp;Comments</span> 
</div>

it returns:

Bill Maurer • Yesterday, 9:33 AM • 98 Comments

However, what I really want is only:

Yesterday, 9:33 AM

How can I get only the text that is not inside any of the child elements?

Source

2016-06-01 Tom Dawn

I figured it out!

import re
dates = []
for date in text.find_all('div', class_="date_on_by"):
    dates.append(re.split(re.escape(date.find('span', class_="bullet").getText()), date.getText())[1].strip())

Source

2016-06-01 19:54:45

You can select the span by its class name and then use next_sibling.

In [9]: h = """<div class="date_on_by">
   ...: <a sasource="qp_focused" href="/author/bill-maurer/articles">Bill Maurer</a>
   ...: <span class="bullet">•</span> Yesterday, 9:33 AM
   ...: <span class="bullet">•</span>
   ...: <span class="comments">98 Comments</span>
   ...: </div>"""

In [10]: from bs4 import BeautifulSoup

In [11]: soup = BeautifulSoup(h)

In [12]: print(soup.select_one("div.date_on_by span.bullet").next_sibling.strip())
Yesterday, 9:33 AM

As a side note, if you only want the first element, you should use .find instead of find_all(..)[0].
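
If the position of the date relative to the bullet spans is not guaranteed, another option is to keep only the text nodes that are direct children of the div. The following is a minimal sketch, not part of the original answers: the helper name own_text is hypothetical, and the "html.parser" backend is an assumption chosen to avoid depending on an extra parser.

```python
from bs4 import BeautifulSoup, NavigableString

html = """<div class="date_on_by">
<a sasource="qp_focused" href="/author/bill-maurer/articles">Bill Maurer</a>
<span class="bullet">•</span> Yesterday, 9:33 AM
<span class="bullet">•</span>
<span class="comments">98&nbsp;Comments</span>
</div>"""

soup = BeautifulSoup(html, "html.parser")


def own_text(tag):
    """Join the text nodes that are direct children of `tag` (hypothetical helper).

    Child elements such as <a> and <span> are skipped, so their text is ignored.
    """
    parts = (
        child.strip()
        for child in tag.children
        if isinstance(child, NavigableString)
    )
    # Drop the whitespace-only nodes that sit between the child tags.
    return " ".join(part for part in parts if part)


dates = [own_text(div) for div in soup.find_all("div", class_="date_on_by")]
print(dates)  # ['Yesterday, 9:33 AM']
```

This avoids splitting on the bullet character, at the cost of also collecting any other loose text that happens to sit directly inside the div.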

Source

2016-06-01 22:03:55
