Why does this regex miss one space between 2 digits?

I'm trying to locate single spaces surrounded by digits in a string. I've constructed the following example:

library('stringr')
str1 <- "1805.6 1-1 1"
str_locate_all(str1, "\\s+")[[1]]
str_locate_all(str1, "[[:digit:]]\\s[[:digit:]]")[[1]]

which returns what I expected to see:

str_locate_all(str1, "\\s+")[[1]]
     start end
[1,]     7   7
[2,]    11  11

str_locate_all(str1, "[[:digit:]]\\s[[:digit:]]")[[1]]
     start end
[1,]     6   8
[2,]    10  12

Now do the same thing on a different string:

str2 <- "1805.6 1 1 1" 
str_locate_all(str2, "\\s+")[[1]] 
str_locate_all(str2, "[[:digit:]]\\s[[:digit:]]")[[1]]

But this seems to miss one of the spaces surrounded by digits (note that the second pattern returns only two entries):

str_locate_all(str2, "\\s+")[[1]]
     start end
[1,]     7   7
[2,]     9   9
[3,]    11  11

str_locate_all(str2, "[[:digit:]]\\s[[:digit:]]")[[1]]
     start end
[1,]     6   8
[2,]    10  12

So the question is: why does the second pattern not see the middle space and return a row with 8 10? I'm sure I'm just not looking at this the regex way.

Source

2016-05-24 Bryan Hanson

Matches are consumed. Try searching for overlapping regex matches: http://stackoverflow.com/search?q=%5Br%5D+overlapping+regex

This output is interesting, though: with 'str_locate_all(str2, "(?=(\\d\\s\\d))")[[1]]' the end comes before the start :)

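A digit is consumed once it has been matched. After the first match has consumed the digit that follows the space, that digit cannot also be the start of the next match, so the middle space is never found. In your example:

Note: x marks a matched digit, ^ marks the space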

1805.6 1 1 1
     x^x
      |
  First match

1805.6 1 1 1
        ^
        |
Once the regex engine moves forward, it cannot look backward (unless a lookbehind is used).
Here, the first digit of the pattern is tested against a space, which fails, so the engine gives up on this position and tries the next one.

1805.6 1 1 1
         x^x
         ||Matches digit
         |Matches space
         Matches digit
       (Second match)

This goes on until the end of the string is reached.
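To see that the "missing" match really is there and is only skipped because of consumption, every 3-character window can be tested on its own. This is just an illustrative sketch, not something stringr does internally; the names starts, windows and overlapping_starts are made up for the example.

```r
library(stringr)

str2 <- "1805.6 1 1 1"

# Every possible start position of a 3-character window
starts <- seq_len(nchar(str2) - 2)

# Cut out each window independently, so nothing is "consumed"
windows <- str_sub(str2, starts, starts + 2)

# Keep the start positions whose window is exactly digit-space-digit
overlapping_starts <- starts[str_detect(windows, "^\\d\\s\\d$")]
overlapping_starts
```

The window starting at position 8 is found here, which is the 8 10 row that the plain pattern skips.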

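You can use a lookahead for the second digit instead, so that it is checked but not consumed:

> str_locate_all(str2, "\\d\\s(?=\\d)")[[1]]
     start end
[1,]     6   7
[2,]     8   9
[3,]    10  11

Because lookaheads are zero-width, the end position reported is one less than the actual end position (the second digit is not part of the match).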
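If what you want is the position of the space itself, a lookbehind can be combined with the lookahead so that neither digit is consumed. A minimal sketch, assuming the regex engine behind stringr accepts a fixed-width lookbehind such as (?<=\\d); the names space_between_digits, loc1 and loc2 are only for illustration.

```r
library(stringr)

str1 <- "1805.6 1-1 1"
str2 <- "1805.6 1 1 1"

# (?<=\\d) requires a digit before the space, (?=\\d) a digit after it.
# Both are zero-width, so only the space itself is part of the match.
space_between_digits <- "(?<=\\d)\\s(?=\\d)"

loc1 <- str_locate_all(str1, space_between_digits)[[1]]
loc2 <- str_locate_all(str2, space_between_digits)[[1]]

loc1[, "start"]  # 7 11
loc2[, "start"]  # 7 9 11
```

Since each match is a single space, start and end are equal in every row, and the middle space of str2 at position 9 is now reported.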

Source

2016-05-24 18:46:35 rock321987

Thanks for this clear explanation. Related question: is there a functional difference between '[[:digit:]]' and '\\d'? Thanks.

@BryanHanson No, I don't think so. As far as I know, '[[:digit:]]' is mainly supported by DFA engines. – rock321987

I used it because it was on the cheat sheet I had. I'll go with the shorter form in the future.
