2016-12-28 5 views
2

Selecting the correct ReadP parse result

I am trying to parse RFC5322 email addresses. My parser works in the sense that one of its results is the correct one. But how do I select the "correct" result?

Given the string Foo Bar <foo@bar.com>, my parser should produce the value Address (Just "Foo Bar") "foo@bar.com".

Likewise, given the string foo@bar.com, my parser should produce the value Address Nothing "foo@bar.com".

The value that includes the name is preferred.

My parser looks like the following. I run it with readP_to_S rfc5322 "Foo Bar <foo@bar.com>".

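import Control.Applicative
import Data.Char
import qualified Data.Text as T
import Text.ParserCombinators.ReadP

-- Record type inferred from the printed results; the post does not show it.
data Address = Address
    { addressName  :: Maybe T.Text
    , addressEmail :: T.Text
    } deriving Show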

onlyEmail :: ReadP Address 
onlyEmail = do 
    skipSpaces 
    email <- many1 $ satisfy isAscii 
    skipSpaces 
    return $ Address Nothing (T.pack email) 

withName :: ReadP Address 
withName = do 
    skipSpaces 
    name <- many1 (satisfy isAscii) 
    skipSpaces 
    email <- between (char '<') (char '>') (many1 $ satisfy isAscii) 
    skipSpaces 
    return $ Address (Just $ T.pack name) (T.pack email) 

rfc5322 :: ReadP Address 
rfc5322 = withName <|> onlyEmail 

Running it produces the following results:

[ (Address {addressName = Nothing, addressEmail = "F"},"oo Bar <foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Fo"},"o Bar <foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo"},"Bar <foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo "},"Bar <foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo B"},"ar <foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Ba"},"r <foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar"},"<foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar "},"<foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <"},"foo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <f"},"oo@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <fo"},"o@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo"},"@bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@"},"bar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@b"},"ar.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@ba"},"r.com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@bar"},".com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@bar."},"com>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@bar.c"},"om>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@bar.co"},"m>")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@bar.com"},">")
, (Address {addressName = Just "Foo Bar", addressEmail = "foo@bar.com"},"")
, (Address {addressName = Just "Foo Bar ", addressEmail = "foo@bar.com"},"")
, (Address {addressName = Nothing, addressEmail = "Foo Bar <foo@bar.com>"},"")
] 

In this case, the result I actually want is the third from the end of the list. How do I express that preference?

Answer

3

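You don't express a preference. Your problem is that your parsers accept larger strings than they actually need to: many1 (satisfy isAscii) matches spaces, '<', '>' and '@' as well, so every prefix of the input counts as a valid "email".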
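
To illustrate that point, here is a minimal sketch that uses only Text.ParserCombinators.ReadP from base. The names stepwise and greedy are made up for this illustration. It contrasts many1 (satisfy p), which may stop after any non-empty prefix, with munch1 p, which always consumes the longest run of matching characters and therefore yields a single result.

```haskell
import Data.Char (isAlpha)
import Text.ParserCombinators.ReadP

-- many1 is nondeterministic: every non-empty prefix of the run is a result.
stepwise :: ReadP String
stepwise = many1 (satisfy isAlpha)

-- munch1 is deterministic: it takes the longest run and nothing shorter.
greedy :: ReadP String
greedy = munch1 isAlpha

main :: IO ()
main = do
    print (readP_to_S stepwise "Foo")  -- [("F","oo"),("Fo","o"),("Foo","")]
    print (readP_to_S greedy "Foo")    -- [("Foo","")]
```

Restricting the character class (no spaces or angle brackets inside an email) and using munch1 together remove the spurious partial results.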

For example, here is my solution:

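import Control.Bool                     -- (<&&>), (<||>); from the cond package
import Control.Applicative
import Data.Char
import qualified Data.Text as T
import Data.Text (Text)
import Text.ParserCombinators.ReadP

-- Positional type inferred from the printed result; the post does not show it.
data Address = Address (Maybe Text) Text deriving Show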

email :: ReadP Text 
email = do 
    l <- part 
    a <- char '@' 
    d <- part 
    return . T.pack $ l ++ a:d 
    where 
    part = munch1 (isAscii <&&> (/='@') <&&> (/='<') <&&> (/='>')) 

name :: ReadP Text 
name = T.pack <$> chainr1 part sep 
    where 
    part = munch1 (isAlpha <||> isDigit <||> (=='\'')) 
    sep = (\xs ys -> xs ++ ' ':ys) <$ munch1 (==' ') 

onlyEmail :: ReadP Address 
onlyEmail = Address Nothing <$> email 

withName :: ReadP Address 
withName = do 
    n <- name 
    skipSpaces 
    e <- between (char '<') (char '>') email 
    return $ Address (Just n) e 

address :: ReadP Address 
address = skipSpaces *> (withName <|> onlyEmail) 

main :: IO ()
main = print $ readP_to_S address "Foo Bar <foo@bar.com>"

This prints:

[(Address (Just "Foo Bar") "foo@bar.com","")]
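
If a preference between alternatives is still wanted once the sub-parsers are tight, ReadP offers the left-biased choice operator <++ and the eof parser. The following is only a sketch of that alternative, not the method used in the solution above. It assumes a simplified String-based Address type and a deliberately loose notion of an email part (anything except '@', '<', '>' and space). The names emailP and nameP are hypothetical.

```haskell
import Data.Char (isAlphaNum)
import Text.ParserCombinators.ReadP

-- Simplified stand-in for the Address type (String instead of Text).
data Address = Address (Maybe String) String deriving (Eq, Show)

-- Assumption: an email part is any run without '@', '<', '>' or space.
emailP :: ReadP String
emailP = do
    l <- part
    _ <- char '@'
    d <- part
    return (l ++ "@" ++ d)
  where
    part = munch1 (`notElem` "@<> ")

-- One or more alphanumeric words separated by spaces.
nameP :: ReadP String
nameP = unwords <$> sepBy1 (munch1 isAlphaNum) (munch1 (== ' '))

withName :: ReadP Address
withName = do
    n <- nameP
    skipSpaces
    e <- between (char '<') (char '>') emailP
    return (Address (Just n) e)

onlyEmail :: ReadP Address
onlyEmail = Address Nothing <$> emailP

-- (<++) tries onlyEmail only when withName yields no result at all;
-- eof rejects any parse that leaves input unconsumed.
address :: ReadP Address
address = skipSpaces *> (withName <++ onlyEmail) <* skipSpaces <* eof

main :: IO ()
main = do
    print (readP_to_S address "Foo Bar <foo@bar.com>")
    print (readP_to_S address "foo@bar.com")
```

With these definitions each input yields exactly one complete parse: the named form for the first string and Address Nothing for the second. The symmetric choice <|> would give the same answers here, since the two alternatives no longer overlap. The difference is that <++ states the priority explicitly.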