2016-11-05

How do I replace all words that end with a specific character?

I want to wrap every word that ends with : (a colon) in <b></b> tags.

<b>Source:</b> <a href=\'http://archive.ics.uci.edu/ml/datasets/Iris\'>UCI Machine Learning Repository</a><br>Creator: <br>R.A. Fisher<br>Donor: <br>Michael Marshall (MARSHALL%<u>PLU <b>\'@\'</b> io.arc.nasa.gov</u>)<br><b>Abstract:</b> Famous database; from Fisher, 1936<br><b>Data Set Information:</b> This is perhaps the best known database to be found in the pattern recognition literature. Fisher\'s paper is a classic in the field and is referenced frequently to this day. (See Duda &amp; Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.<br>Predicted attribute: class of iris plant.<br>This is an exceedingly simple domain.<br>This data differs from the data presented in Fishers article (identified by Steve Chadwick, <u>spchadwick <b>\'@\'</b> espeedaz.net</u>). The 35th sample should be: 4.9,3.1,1.5,0.2,"Iris-setosa" where the error is in the fourth feature. The 38th sample: 4.9,3.6,1.4,0.1,"Iris-setosa" where the errors are in the second and third features. <br><b>Attribute Information:</b><br> 1. sepal length in cm<br> 2. sepal width in cm<br> 3. petal length in cm<br> 4. petal width in cm<br> 5. class: <br>  -- Iris Setosa<br>  -- Iris Versicolour<br>  -- Iris Virginica 

I have HTML like the above.

What is the regex to do this in Python?

I tried the regex \b(\w+:)\b, but it does not work.

Comment: a regex question needs to show an attempt.

Answers


Try this regex:

<b>[A-Za-z ]{1,}\:</b> 

Collect all of the matching words into a list, then do whatever processing you need on them.
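A minimal sketch of that suggestion, assuming Python 3. The `html` string here is a shortened, made-up sample in the same shape as the question's HTML, not the full page. `{1,}` is written as `+`, the unneeded backslash before the colon is dropped, and a capture group is added so that `re.findall` returns only the label text:

```python
import re

# Shortened, hypothetical sample shaped like the HTML in the question.
html = (
    "<b>Source:</b> UCI<br>Creator: <br>R.A. Fisher<br>"
    "<b>Data Set Information:</b> some text"
)

# Letters and spaces, then a colon, all inside <b>...</b>.
pattern = re.compile(r"<b>([A-Za-z ]+:)</b>")

# findall returns the contents of the single capture group for each match.
labels = pattern.findall(html)
print(labels)  # ['Source:', 'Data Set Information:']
```

Note that this only finds labels that are already bold; "Creator:" is not in the list because it has no <b> tags around it yet.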

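import re

# Sample input: a bold label that ends with a colon.
text = "<b>bold string with colon:</b>"

# re.match only matches at the start of the string.
match_obj = re.match(r'<b>(.*):</b>', text)
if match_obj:
    # group() is the whole match, tags included; group(1) is the text before the colon.
    print(match_obj.group())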
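For the wrapping itself, here is one possible sketch with `re.sub`, again assuming Python 3 and a shortened, made-up `html` sample. The pattern in the question fails because the trailing `\b` after the colon only matches when a word character follows, and in this HTML the colon is followed by a space. The sketch replaces it with a lookahead for whitespace or end of string. This is an assumption about the input: it would miss a label written as `Label:<br>` with no space after the colon.

```python
import re

# Shortened, hypothetical sample shaped like the HTML in the question.
html = (
    "<b>Source:</b> <a href='http://archive.ics.uci.edu/ml/datasets/Iris'>UCI</a>"
    "<br>Creator: <br>R.A. Fisher<br>Donor: <br>Michael Marshall"
)

# A word, then a colon, then whitespace or end of string (not consumed).
# The lookahead skips "Source:" (followed by "</b>") and "http:" (followed by "//").
label = re.compile(r"\b(\w+:)(?=\s|$)")

# \1 re-inserts the matched word, colon included, between the new tags.
wrapped = label.sub(r"<b>\1</b>", html)
print(wrapped)
```

Regular expressions are fragile on arbitrary HTML, so for anything beyond a fixed, known snippet like this one an HTML parser is usually the safer choice.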