Edinburgh Speech Tools 2.4-release
 
EST_TokenStream Class Reference

#include <include/EST_Token.h>

Public Member Functions

 ~EST_TokenStream ()
 will close file if appropriate for type
 
int open (const EST_String &filename)
 open an EST_TokenStream for a file.
 
int open (FILE *ofp, int close_when_finished)
 open an EST_TokenStream for an already opened file
 
int open (istream &newis)
 open an EST_TokenStream for an already open istream
 
int open_string (const EST_String &newbuffer)
 open an EST_TokenStream for a string rather than a file
 
void close (void)
 Close stream.
 
stream access functions
EST_TokenStream & get (EST_Token &t)
 get next token in stream
 
EST_Token & get ()
 get next token in stream
 
get the next token which must be the argument.
EST_Token & must_get (EST_String expected, bool *ok)
 
EST_Token & must_get (EST_String expected, bool &ok)
 
EST_Token & must_get (EST_String expected)
 
EST_Token get_upto (const EST_String &s)
 get up to s in stream as a single token.
 
EST_Token get_upto_eoln (void)
 get up to the end of the line as a single token.
 
EST_Token & peek (void)
 peek at next token
 
int fread (void *buff, int size, int nitems) EST_WARN_UNUSED_RESULT
 Read binary data (don't use peek() immediately beforehand)
 
stream initialization functions
void set_WhiteSpaceChars (const EST_String &ws)
 set which characters are to be treated as whitespace
 
void set_SingleCharSymbols (const EST_String &sc)
 set which characters are to be treated as single character symbols
 
void set_PunctuationSymbols (const EST_String &ps)
 set which characters are to be treated as (post) punctuation
 
void set_PrePunctuationSymbols (const EST_String &ps)
 set which characters are to be treated as (pre) punctuation
 
void set_quotes (char q, char e)
 set characters to be used as quotes and escape, and set quote mode
 
int quoted_mode (void)
 query quote mode
 

miscellaneous

int linenum (void) const
 returns line number of EST_TokenStream
 
int eof ()
 end of file
 
int eoln ()
 end of line
 
int filepos (void) const
 current file position in EST_TokenStream
 
int tell (void) const
 tell, synonym for filepos
 
int seek (int position)
 seek, reposition file pointer
 
int seek_end ()
 
int restart (void)
 Reset to start of file/string.
 
const EST_String pos_description ()
 A string describing current position, suitable for error messages.
 
const EST_String filename () const
 The originating filename (if there is one)
 
FILE * filedescriptor ()
 For those who need the actual file descriptor (if possible)
 
EST_TokenStream & operator>> (EST_Token &p)
 
EST_TokenStream & operator>> (EST_String &p)
 
ostream & operator<< (ostream &s, EST_TokenStream &p)
 

Detailed Description

A class that allows the reading of EST_Tokens from a file stream, pipe or string. It automatically tokenizes a file based on user-definable whitespace and punctuation.

Whitespace and punctuation are both user-definable, and single character symbols are also supported. Single character symbols are always treated as individual tokens, irrespective of their whitespace context. A quote mode can also be used to read quoted tokens.
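To make the roles of the character classes concrete, here is a minimal self-contained sketch in standard C++. It does not use the Speech Tools library and is not its implementation: `SketchToken` and `sketch_tokenize` are hypothetical names, and details such as how a token made up entirely of punctuation is treated are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a token; this is NOT the EST_Token class.
struct SketchToken {
    std::string prepunct;  // leading punctuation peeled off the name
    std::string name;      // the token text itself
    std::string punct;     // trailing punctuation peeled off the name
};

// Split `text` into tokens. Characters in `ws` separate tokens, characters in
// `single` always form a token on their own, and characters in `pre`/`post`
// are peeled off the front/back of every other token.
std::vector<SketchToken> sketch_tokenize(const std::string &text,
                                         const std::string &ws,
                                         const std::string &single,
                                         const std::string &pre,
                                         const std::string &post)
{
    std::vector<SketchToken> out;
    std::string cur;

    // Turn the characters collected so far into one token.
    auto flush = [&]() {
        if (cur.empty()) return;
        std::size_t b = 0, e = cur.size();
        while (b < e && pre.find(cur[b]) != std::string::npos) ++b;
        while (e > b && post.find(cur[e - 1]) != std::string::npos) --e;
        SketchToken t;
        t.prepunct = cur.substr(0, b);
        t.name = cur.substr(b, e - b);
        t.punct = cur.substr(e);
        out.push_back(t);
        cur.clear();
    };

    for (char c : text) {
        if (ws.find(c) != std::string::npos) {
            flush();                                   // whitespace ends a token
        } else if (single.find(c) != std::string::npos) {
            flush();                                   // symbol ends a token...
            out.push_back(SketchToken{"", std::string(1, c), ""});  // ...and is one
        } else {
            cur += c;
        }
    }
    flush();
    return out;
}
```

With `=` as a single character symbol, `(` as pre-punctuation and `,)` as post-punctuation, the input `(hello, world) a=b` splits into five tokens in this sketch: `hello`, `world`, `a`, `=` and `b`, with the brackets and comma stored alongside the names rather than in them.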

Whitespace, pre- and post-punctuation, single character symbols and quote mode must be set immediately after opening the stream.

There is no unget but peek provides look ahead of one token.
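One-token look-ahead of this kind is commonly built from a single buffered token. The following is a minimal sketch of that idea in standard C++, splitting on whitespace only; `PeekableWords` is a hypothetical class and says nothing about how EST_TokenStream is implemented internally.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical whitespace-only reader with a one-token look-ahead buffer.
class PeekableWords {
public:
    explicit PeekableWords(const std::string &text) : in_(text) {}

    // Return the next word without consuming it ("" at end of input).
    const std::string &peek() {
        if (!have_peeked_) {
            peeked_.clear();      // stays empty if the read below fails
            in_ >> peeked_;
            have_peeked_ = true;
        }
        return peeked_;
    }

    // Consume and return the next word, reusing the buffered one if present.
    std::string get() {
        peek();
        have_peeked_ = false;
        return peeked_;
    }

    // True once no further word is available.
    bool eof() { return peek().empty(); }

private:
    std::istringstream in_;
    std::string peeked_;
    bool have_peeked_ = false;
};
```

Repeated calls to `peek()` return the same word until `get()` consumes it. In this sketch trailing whitespace yields an empty word, so `eof()` becomes true once the last real word has been read.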

Note that there is an interesting issue about what to do with the last whitespace in the file: should it be ignored, or should it be attached to a token with a name string of length zero? In unquoted mode, eof() returns TRUE if the next token name is empty (the mythical last token). In quoted mode the last token must be returned, so eof will not be raised.

Author
Alan W Black (awb@cstr.ed.ac.uk): April 1996

Definition at line 235 of file EST_Token.h.

Constructor & Destructor Documentation

◆ EST_TokenStream()

EST_TokenStream::EST_TokenStream ( )

Definition at line 118 of file EST_Token.cc.

◆ ~EST_TokenStream()

EST_TokenStream::~EST_TokenStream ( )

will close file if appropriate for type

Definition at line 167 of file EST_Token.cc.

Member Function Documentation

◆ open() [1/3]

int EST_TokenStream::open ( const EST_String & filename)

open an EST_TokenStream for a file.

Definition at line 200 of file EST_Token.cc.

◆ open() [2/3]

int EST_TokenStream::open ( FILE * ofp,
int close_when_finished )

open an EST_TokenStream for an already opened file

Definition at line 218 of file EST_Token.cc.

◆ open() [3/3]

int EST_TokenStream::open ( istream & newis)

open an EST_TokenStream for an already open istream

Definition at line 238 of file EST_Token.cc.

◆ open_string()

int EST_TokenStream::open_string ( const EST_String & newbuffer)

open an EST_TokenStream for a string rather than a file

Definition at line 251 of file EST_Token.cc.

◆ close()

void EST_TokenStream::close ( void )

Close stream.

Definition at line 406 of file EST_Token.cc.

◆ get() [1/2]

EST_TokenStream & EST_TokenStream::get ( EST_Token & t)

get next token in stream

Definition at line 486 of file EST_Token.cc.

◆ get() [2/2]

EST_Token & EST_TokenStream::get ( void )

get next token in stream

Definition at line 710 of file EST_Token.cc.

◆ must_get() [1/3]

EST_Token & EST_TokenStream::must_get ( EST_String expected,
bool * ok )

Definition at line 561 of file EST_Token.cc.

◆ must_get() [2/3]

EST_Token & EST_TokenStream::must_get ( EST_String expected,
bool & ok )
inline

Definition at line 318 of file EST_Token.h.

◆ must_get() [3/3]

EST_Token & EST_TokenStream::must_get ( EST_String expected)
inline

Definition at line 320 of file EST_Token.h.

◆ get_upto()

EST_Token EST_TokenStream::get_upto ( const EST_String & s)

get up to s in stream as a single token.

Definition at line 492 of file EST_Token.cc.

◆ get_upto_eoln()

EST_Token EST_TokenStream::get_upto_eoln ( void )

get up to the end of the line as a single token.

Definition at line 516 of file EST_Token.cc.

◆ peek()

EST_Token & EST_TokenStream::peek ( void )

peek at next token

Definition at line 830 of file EST_Token.cc.

◆ fread()

int EST_TokenStream::fread ( void * buff,
int size,
int nitems )

Read binary data (don't use peek() immediately beforehand)

Definition at line 355 of file EST_Token.cc.

◆ set_WhiteSpaceChars()

void EST_TokenStream::set_WhiteSpaceChars ( const EST_String & ws)
inline

set which characters are to be treated as whitespace

Definition at line 335 of file EST_Token.h.

◆ set_SingleCharSymbols()

void EST_TokenStream::set_SingleCharSymbols ( const EST_String & sc)
inline

set which characters are to be treated as single character symbols

Definition at line 338 of file EST_Token.h.

◆ set_PunctuationSymbols()

void EST_TokenStream::set_PunctuationSymbols ( const EST_String & ps)
inline

set which characters are to be treated as (post) punctuation

Definition at line 341 of file EST_Token.h.

◆ set_PrePunctuationSymbols()

void EST_TokenStream::set_PrePunctuationSymbols ( const EST_String & ps)
inline

set which characters are to be treated as (pre) punctuation

Definition at line 344 of file EST_Token.h.

◆ set_quotes()

void EST_TokenStream::set_quotes ( char q,
char e )
inline

set characters to be used as quotes and escape, and set quote mode

Definition at line 347 of file EST_Token.h.
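As an illustration of what a quote character and an escape character typically mean when reading a quoted token, here is a small self-contained sketch in standard C++. The helper `read_quoted` is hypothetical and its exact rules (for example, that the escape character makes the next character literal) are assumptions, not a description of the library's behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: read one quoted token starting at text[pos], which must
// be the opening quote. `quote` delimits the token and `escape` makes the
// following character literal. On success, `out` receives the unescaped
// contents and `pos` is advanced past the closing quote.
bool read_quoted(const std::string &text, std::size_t &pos,
                 char quote, char escape, std::string &out)
{
    if (pos >= text.size() || text[pos] != quote) return false;
    std::string buf;
    for (std::size_t i = pos + 1; i < text.size(); ++i) {
        char c = text[i];
        if (c == escape && i + 1 < text.size()) {
            buf += text[++i];   // keep the escaped character verbatim
        } else if (c == quote) {
            out = buf;
            pos = i + 1;        // step past the closing quote
            return true;
        } else {
            buf += c;
        }
    }
    return false;               // input ended before the closing quote
}
```

Called with `"` as the quote and `\` as the escape, the sketch reads a quoted token that contains whitespace and escaped quotes as one unit and leaves `pos` at the first character after it.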

◆ quoted_mode()

int EST_TokenStream::quoted_mode ( void )
inline

query quote mode

Definition at line 349 of file EST_Token.h.

◆ linenum()

int EST_TokenStream::linenum ( void ) const
inline

returns line number of EST_TokenStream

Definition at line 354 of file EST_Token.h.

◆ eof()

int EST_TokenStream::eof ( )
inline

end of file

Definition at line 356 of file EST_Token.h.

◆ eoln()

int EST_TokenStream::eoln ( void )

end of line

Definition at line 818 of file EST_Token.cc.

◆ filepos()

int EST_TokenStream::filepos ( void ) const
inline

current file position in EST_TokenStream

Definition at line 361 of file EST_Token.h.

◆ tell()

int EST_TokenStream::tell ( void ) const
inline

tell, synonym for filepos

Definition at line 363 of file EST_Token.h.

◆ seek()

int EST_TokenStream::seek ( int position)

seek, reposition file pointer

Definition at line 305 of file EST_Token.cc.

◆ seek_end()

int EST_TokenStream::seek_end ( )

Definition at line 269 of file EST_Token.cc.

◆ restart()

int EST_TokenStream::restart ( void )

Reset to start of file/string.

Definition at line 437 of file EST_Token.cc.

◆ pos_description()

const EST_String EST_TokenStream::pos_description ( )

A string describing current position, suitable for error messages.

Definition at line 875 of file EST_Token.cc.

◆ filename()

const EST_String EST_TokenStream::filename ( ) const
inline

The originating filename (if there is one)

Definition at line 372 of file EST_Token.h.

◆ filedescriptor()

FILE * EST_TokenStream::filedescriptor ( )
inline

For those who need the actual file descriptor (if possible)

Definition at line 374 of file EST_Token.h.

◆ operator>>() [1/2]

EST_TokenStream & EST_TokenStream::operator>> ( EST_Token & p)

Definition at line 472 of file EST_Token.cc.

◆ operator>>() [2/2]

EST_TokenStream & EST_TokenStream::operator>> ( EST_String & p)

Definition at line 477 of file EST_Token.cc.

Friends And Related Symbol Documentation

◆ operator<<

ostream & operator<< ( ostream & s,
EST_TokenStream & p )
friend

Definition at line 177 of file EST_Token.cc.


The documentation for this class was generated from the following files: