Split a string using multiple delimiters

I have some text (meaningful text or an arithmetic expression) and I want to split it into words. If I had a single delimiter, I would use:

std::stringstream stringStream(inputString); 
std::string word; 
while(std::getline(stringStream, word, delimiter)) 
{ 
    wordVector.push_back(word); 
}

How can I break the string into tokens when there are several delimiters?

Source

2011-10-01 Ypsilon IV

Boost.StringAlgorithm or Boost.Tokenizer would help.

Or you can get some ideas from this answer: http://stackoverflow.com/questions/4888879/elegant-ways-to-count-the-frequency-of-words-in-a-file – Nawaz

@K-ballo: According to the question, you should not use external libraries like Boost. – deepmax

Assuming one of the delimiters is a newline, the following reads each line and then splits it further on the other delimiters. For this example I chose the delimiters space, apostrophe, and semicolon.

std::stringstream stringStream(inputString);
std::string line;
while(std::getline(stringStream, line))
{
    std::size_t prev = 0, pos;
    while ((pos = line.find_first_of(" ';", prev)) != std::string::npos)
    {
        if (pos > prev)
            wordVector.push_back(line.substr(prev, pos-prev));
        prev = pos+1;
    }
    if (prev < line.length())
        wordVector.push_back(line.substr(prev, std::string::npos));
}

Source

2011-10-01 17:30:43 SoapBox

You're too fast for me :p If newline is not a delimiter, simply pick one of the "regular" delimiters for std::getline (and remove it from the inner loop) and it works.
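The variation described in the comment above could be sketched roughly as follows, assuming the delimiters are still space, apostrophe, and semicolon. The space is handed to std::getline and the other two stay in the inner loop. The function name splitWords is made up for this illustration and does not appear in the thread.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits on space, apostrophe and semicolon.
// Newlines are NOT treated as delimiters here and stay inside the words.
std::vector<std::string> splitWords(const std::string& input)
{
    std::vector<std::string> words;
    std::stringstream stream(input);
    std::string chunk;
    // The outer loop consumes the space delimiter ...
    while (std::getline(stream, chunk, ' '))
    {
        std::size_t prev = 0, pos;
        // ... so the inner loop only looks for the remaining delimiters.
        while ((pos = chunk.find_first_of("';", prev)) != std::string::npos)
        {
            if (pos > prev)
                words.push_back(chunk.substr(prev, pos - prev));
            prev = pos + 1;
        }
        if (prev < chunk.length())
            words.push_back(chunk.substr(prev));
    }
    return words;
}
```

Empty chunks, which appear when two spaces follow each other, produce no word because of the final length check.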

If you have Boost, you can use:

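#include <boost/algorithm/string.hpp>
#include <string>
#include <vector>

std::string inputString("One!Two,Three:Four");
std::string delimiters("|,:");
std::vector<std::string> parts;
// Splits on '|', ',' and ':'; '!' is not in the set, so "One!Two" stays one token
boost::split(parts, inputString, boost::is_any_of(delimiters));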

Source

2013-06-03 04:02:46 MattSmith

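In case you are interested in how to do it yourself rather than using Boost.

Assume the delimiter string may be very long, say of length M. Checking whether one character is a delimiter then costs O(M), and doing that in a loop over every character of the original string, say of length N, costs O(M*N).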
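For comparison, the straightforward approach whose cost is estimated above might look like the sketch below. It is only a baseline for illustration: the name tokenizeNaive is hypothetical, and the linear scan is done with std::string::find.

```cpp
#include <string>
#include <vector>

// Hypothetical baseline: every character of s is searched for in del,
// which costs O(M) per character and O(M * N) overall.
std::vector<std::string> tokenizeNaive(const std::string& s, const std::string& del)
{
    std::vector<std::string> res;
    std::string token;
    for (char c : s) {
        if (del.find(c) != std::string::npos) {  // linear scan over the delimiters
            if (!token.empty()) {
                res.push_back(token);
                token.clear();
            }
        } else {
            token += c;
        }
    }
    if (!token.empty())
        res.push_back(token);
    return res;
}
```

For a handful of delimiters this is usually fast enough; the difference only matters when the delimiter set is large.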

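I would use a dictionary (something like a map from character to bool), but here I use a simple boolean array that holds true at index = ASCII value for each delimiter.

Iterating over the string and checking whether a char is a delimiter is then O(1), which gives O(N) overall.

Here is my sample code: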

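#include <iostream>
#include <string>
#include <vector>

using namespace std;

const int dictSize = 256;

vector<string> tokenizeMyString(const string &s, const string &del)
{
    // Lookup table: dict[c] is true if the character with value c is a delimiter.
    // Not static, so delimiters from an earlier call do not leak into this one.
    bool dict[dictSize] = { false };

    vector<string> res;
    for (size_t i = 0; i < del.size(); ++i) {
        // Cast to unsigned char so the index is never negative
        dict[static_cast<unsigned char>(del[i])] = true;
    }

    string token("");
    for (auto &i : s) {
        if (dict[static_cast<unsigned char>(i)]) {
            // A delimiter ends the current token; empty tokens are skipped
            if (!token.empty()) {
                res.push_back(token);
                token.clear();
            }
        }
        else {
            token += i;
        }
    }
    // Last token, if the string does not end with a delimiter
    if (!token.empty()) {
        res.push_back(token);
    }
    return res;
}


int main()
{
    string delString = "MyDog:Odie, MyCat:Garfield MyNumber:1001001";
    // the delimiters are " " (space) and "," (comma)
    vector<string> res = tokenizeMyString(delString, " ,");

    for (auto &i : res) {
        cout << "token: " << i << endl;
    }
    return 0;
}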

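Note: tokenizeMyString returns the vector by value and builds it on the stack first, so we rely on the power of the compiler here: RVO, return value optimization :) Even where the copy is not elided, a C++11 compiler moves the vector out rather than copying it.

Source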

2016-12-22 14:02:41 Kohn1001

I don't know why nobody has pointed out the manual way, but here it is:

const std::string delims(";,:. \n\t"); 
inline bool isDelim(char c) {
    for (std::size_t i = 0; i < delims.size(); ++i)
        if (delims[i] == c)
            return true;
    return false;
}

and in the function:

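std::stringstream stringStream(inputString);
std::string word; char c;

while (stringStream) {
    word.clear();

    // Read word: stop at a delimiter or at end of input.
    // get(c) fails at end of input, so the loop cannot run past it.
    while (stringStream.get(c) && !isDelim(c))
        word.push_back(c);
    if (stringStream)
        stringStream.unget();

    // Leading delimiters or empty input would otherwise add an empty word
    if (!word.empty())
        wordVector.push_back(word);

    // Read delims
    while (stringStream.get(c) && isDelim(c));
    if (stringStream)
        stringStream.unget();
}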

This way you can do something useful with the delims if you want to.
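As one possible illustration of "doing something useful with the delims", the sketch below stores each run of delimiters next to the words instead of discarding it. The struct Pieces and the function splitKeepingDelims are hypothetical names, the delimiter set is passed as a parameter rather than kept in a global, and std::string::find stands in for isDelim.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type: the words plus the delimiter runs between them.
struct Pieces {
    std::vector<std::string> words;
    std::vector<std::string> delimRuns;
};

Pieces splitKeepingDelims(const std::string& input, const std::string& delims)
{
    Pieces out;
    std::stringstream stream(input);
    char c;
    while (stream) {
        std::string word, run;  // local to the loop, so no clear() is needed
        // Read a word: stop at a delimiter or at end of input
        while (stream.get(c) && delims.find(c) == std::string::npos)
            word.push_back(c);
        if (stream)
            stream.unget();
        if (!word.empty())
            out.words.push_back(word);
        // Read the run of delimiters that follows, instead of discarding it
        while (stream.get(c) && delims.find(c) != std::string::npos)
            run.push_back(c);
        if (stream)
            stream.unget();
        if (!run.empty())
            out.delimRuns.push_back(run);
    }
    return out;
}
```

For an arithmetic expression this keeps the operators available, for example by passing "+* " as the delimiter set and inspecting delimRuns afterwards.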

Source

2017-04-04 11:27:33 forumulator

You can move std::string word; and char c; inside the loop and avoid using clear()... variables should be as local and short-lived as possible. – Mohan
