Using Boost Tokenizer escaped_list_separator with different parameters



Answers

It seems like you're declaring your tokenizer type incorrectly.

typedef boost::tokenizer< boost::escaped_list_separator<char> > Tokenizer;
boost::escaped_list_separator<char> Separator( '\\', ' ', '\"' );
Tokenizer tok( s, Separator );

for( Tokenizer::iterator iter = tok.begin(); iter != tok.end(); ++iter )
{ cout << *iter << "\n"; }

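You want a tokenizer object of type boost::tokenizer< boost::escaped_list_separator< char > >, constructed with a boost::escaped_list_separator< char > separator object as its TokenizerFunc. The separator's arguments go to the separator's constructor, not into the tokenizer's template argument list.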

Question

Hello, I have been trying to get a tokenizer to work using the Boost library's tokenizer class. I found this tutorial in the Boost documentation:

http://www.boost.org/doc/libs/1_36_0/libs/tokenizer/escaped_list_separator.htm

The problem is that I can't pass the arguments to escaped_list_separator("","","");

If I modify the boost/tokenizer.hpp file it works, but that's not an ideal solution. I was wondering if there's anything I am missing to get different arguments into escaped_list_separator.

I want it to split on spaces, with " and ' as quote characters and with no escape character inside the quoted string.

This is used for an argument parsing system in an in-game console.


#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>

int main()
{
    using namespace std;
    using namespace boost;
    string s = "exec script1 \"script argument number one\"";
    string separator1("");      // don't let quoted arguments escape themselves
    string separator2(" ");     // split on spaces
    string separator3("\"\'");  // let it have quoted arguments
    tokenizer<escaped_list_separator<char>(separator1,separator2,separator3)> tok(s);
    for(tokenizer<escaped_list_separator<char>(separator1,separator2,separator3)>::iterator beg=tok.begin(); beg!=tok.end();++beg)
    {
        cout << *beg << "\n";
    }
}

The error from Visual Studio 2005 is: error C2974: 'boost::tokenizer' : invalid template argument for 'TokenizerFunc', type expected

EDIT: This question was answered by ferrucio and explained by peter. Thanks, everybody.




Boost tokenizer to treat quoted string as one token

Try this code; this way you can avoid using the Boost.Tokenizer and Boost.Spirit libraries.

#include <vector>
#include <string>
#include <iostream>

const char Separators[] = { ' ', '\t' }; // space and tab

bool Str_IsSeparator( const char Ch )
{
    for ( size_t i = 0; i != sizeof( Separators ); i++ )
    {
        if ( Separators[i] == Ch ) { return true; }
    }

    return false;
}

void SplitLine( size_t FromToken, size_t ToToken, const std::string& Str, std::vector<std::string>& Components /*, bool ShouldTrimSpaces*/ )
{
    size_t TokenNum = 0;
    size_t Offset   = FromToken - 1;

    const char* CStr  = Str.c_str();
    const char* CStrj = Str.c_str();

    while ( *CStr )
    {
        // bypass spaces & delimiting chars
        while ( *CStr && Str_IsSeparator( *CStr ) ) { CStr++; }

        if ( !*CStr ) { return; }

        bool InsideQuotes = ( *CStr == '\"' );

        if ( InsideQuotes )
        {
            for ( CStrj = ++CStr; *CStrj && *CStrj != '\"'; CStrj++ );
        }
        else
        {
            for ( CStrj = CStr; *CStrj && !Str_IsSeparator( *CStrj ); CStrj++ );
        }

        // extract token
        if ( CStr != CStrj )
        {
            TokenNum++;

            // store each token found
            if ( TokenNum >= FromToken )
            {
                Components[ TokenNum-Offset ].assign( CStr, CStrj );
                // if ( ShouldTrimSpaces ) { Str_TrimSpaces( &Components[ TokenNum-Offset ] ); }
                // proceed to next token
                if ( TokenNum >= ToToken ) { return; }
            }
            CStr = CStrj;

            // exclude last " from token, handle EOL
            if ( *CStr ) { CStr++; }
        }
    }
}

int main()
{
    std::string test = "1st 2nd \"3rd with some comment\" 4th";
    std::vector<std::string> Out;

    Out.resize(5);
    SplitLine(1, 4, test, Out);

    for(size_t j = 0 ; j != Out.size() ; j++) { std::cout << Out[j] << std::endl; }

    return 0;
}

It fills a preallocated std::vector of strings (token indices start at 1 rather than 0, so Out[0] stays empty, but that's easily fixable) and it's pretty simple.
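If you want the zero-based version, here is one possible sketch. It is not the code above, just a simplified variant under a few assumptions: it returns a fresh vector instead of filling a preallocated one, it drops the from/to token range, and the names IsSeparatorChar and SplitLineToVector are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true for the two separators used above (space and tab).
static bool IsSeparatorChar(char ch) { return ch == ' ' || ch == '\t'; }

// Splits str on spaces/tabs and treats "..." as a single token.
// Tokens are returned zero-based; empty tokens (including "") are dropped.
std::vector<std::string> SplitLineToVector(const std::string& str)
{
    std::vector<std::string> tokens;
    const std::size_t n = str.size();
    std::size_t i = 0;

    while (i < n)
    {
        // bypass spaces and tabs
        while (i < n && IsSeparatorChar(str[i])) { ++i; }
        if (i == n) { break; }

        std::size_t begin = 0;
        std::size_t end = 0;

        if (str[i] == '"')
        {
            begin = ++i;                              // first char after the opening quote
            while (i < n && str[i] != '"') { ++i; }
            end = i;
            if (i < n) { ++i; }                       // step over the closing quote
        }
        else
        {
            begin = i;
            while (i < n && !IsSeparatorChar(str[i])) { ++i; }
            end = i;
        }

        if (end > begin) { tokens.push_back(str.substr(begin, end - begin)); }
    }

    assert(i == n); // the whole input has been consumed
    return tokens;
}
```

With the test string from main() above, this should return four tokens, with the quoted comment at index 2.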




Just passing the whole string to the shell might suit your needs:

e.g.:

system("./foo some arguments");

This isn't the best solution though.

The better way seems to be to write a parser that finds each argument and passes them to an exec-style function.
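Once the arguments are parsed into strings, the exec-style functions expect a null-terminated array of char pointers. A rough sketch of that conversion, assuming C++11 and a hypothetical helper named MakeArgv (not from any library):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds a null-terminated argv-style array over args.
// The pointers refer into args, so args must outlive the returned vector
// and must not be modified while the array is in use.
std::vector<char*> MakeArgv(std::vector<std::string>& args)
{
    assert(!args.empty()); // argv[0] is expected to be the program name

    std::vector<char*> argv;
    argv.reserve(args.size() + 1);

    for (std::size_t i = 0; i != args.size(); ++i)
    {
        argv.push_back(&args[i][0]); // pointer to the string's own buffer
    }

    argv.push_back(nullptr); // exec-style functions need a terminating null
    return argv;
}
```

On a POSIX system you could then call something like execvp(argv[0], argv.data()) from <unistd.h>, usually in a child process after fork(). That part is platform-specific and left out of the sketch.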




You can use an escaped_list_separator from the tokenizer library. See this question for more details on how to apply it to your problem.



