Symbol Class Reference

This is the base class from which all generated non-terminal and token symbol classes derive.

#include "sample__Symbol.h"


Files Included

#include <iostream>

#include "universal.h"


Member Functions Documentation

Symbol ( ) 

Base symbol class constructor.

Default symbol status is:

- Symbol is not parsed (parsed() returns false)

- Start line no is 1 (getStartLineNo() returns 1)

- Line count for the symbol is 0 (getLineCount() returns 0)

- Start offset of symbol in stream is 0 (getStartOffset() returns 0)

- Current parse offset of symbol in stream is 0 (getCurrentOffset() returns 0)

- No known parent symbol (hasParent() returns false)

- Local pointers to input stream and parser are null.

- No rule reduced (getReducedRuleRank() returns invalid rank -1)

- No current rule being parsed (getCurrentRuleRank() returns invalid rank -1)

- No current symbol being parsed (getCurrentSymbolRank() returns invalid rank -1)

~Symbol ( ) 

Destructor.

bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent ) 

Common parse function of all symbols. This function is overridden and defined by the actual symbol classes and uses the parse function generated for the actual symbol types.

This function makes it possible to parse any symbol through its base interface, but it is not called by the generated parse code itself. It is only called by parsers on root symbols, to parse a whole grammar, and internally by the end-user parse interfaces.

pParent is a pointer to the parent symbol that attempts to parse this one.

rpReturnSymbol is a pointer to the symbol returned by the reduction code. If the user does not specify another return symbol, it is a pointer to the symbol object itself. According to the description of this function, the return symbol pointer will always be castable to a pointer to the actual expected symbol type.

See also:

bool parseRoot ( Parser & parser , Symbol * & pReturnSymbol , Symbol * pParent = NULL , bool bBreakOff = true ) 

bool parse ( std::istream & is , std::ostream & os , Symbol * pParent = NULL ) 

bool expand ( std::ostream & os ) const 

Expands the symbol to the passed output stream os. Expanding a symbol is the reverse operation of parsing it.

The symbol must have been successfully parsed beforehand.

Returns true on success.

bool backtrace ( std::ostream & os ) 

Backtraces the symbol to the passed output stream os.

The symbol must have been successfully parsed beforehand.

Returns true on success.

void outputReducedRule ( std::ostream & os ) 

Outputs to the passed stream os the reduced rule in verbatim text.

bool visit ( Visitor & visitor ) 

Visits this symbol with visitor.

The visitor design pattern is useful for scanning through the tree of parsed symbols in the same order as they were parsed.

The visitor pattern has a wide field of use. It can be used to implement any post-processing function on parsed data, e.g. to perform checks on a structured language that can only be done once the whole context of some data has been parsed, such as checking that a variable or function in use was declared. It could also be used for automatic code correction or to perform any other advanced substitution of a subset of data in a complex structured stream.

For a symbol to be visited, its visit() method must be called with the visitor object as argument. The process() function of the visitor is called there with a pointer to the symbol as argument. The right-hand symbols of the parsed rule are then visited in turn.
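
The traversal order described above can be sketched as follows. This is only an illustration under stated assumptions, not the LLOOP implementation: NodeSketch, VisitorSketch and NameCollector are hypothetical stand-ins for a parsed symbol, the Visitor interface and a user visitor, and the rule that a false return value from process() stops the traversal is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct NodeSketch;  // hypothetical stand-in for a parsed symbol

// Hypothetical stand-in for the Visitor interface.
struct VisitorSketch {
    virtual ~VisitorSketch() {}
    // Called once per visited symbol. In this sketch, returning false
    // stops the traversal (an assumption, not documented behavior).
    virtual bool process(NodeSketch* pNode) = 0;
};

struct NodeSketch {
    std::string name;
    std::vector<NodeSketch*> rights;  // right-hand symbols of the parsed rule

    explicit NodeSketch(const std::string& n) : name(n) {}

    // Process this symbol first, then visit its right-hand symbols in order.
    bool visit(VisitorSketch& visitor) {
        if (!visitor.process(this)) return false;
        for (std::size_t i = 0; i < rights.size(); ++i)
            if (!rights[i]->visit(visitor)) return false;
        return true;
    }
};

// Example visitor: records the symbol names in visiting order.
struct NameCollector : VisitorSketch {
    std::vector<std::string> names;
    bool process(NodeSketch* pNode) {
        names.push_back(pNode->name);
        return true;
    }
};
```

A visitor such as NameCollector sees a symbol before any of its right-hand symbols, which matches the order in which the symbols were parsed.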

See also:

virtual bool Visitor::process ( Symbol * & rpSymbol ) 

virtual bool Visitor::process ( const char * c_pszConstant , Symbol & parent , unsigned long uSymbolRank ) 

bool parsed ( ) const 

Tells whether the symbol was successfully parsed. If the symbol couldn't be parsed, this function returns false.

bool isToken ( ) const 

Tells whether the symbol is a token. Overridden by the generated symbols.

bool isNonTerminal ( ) const 

Tells whether the symbol is a non-terminal. Overridden by the generated symbols.

bool hasParent ( ) const 

Tells whether the symbol has a parent symbol.

Normally, every symbol except the root symbol has a parent symbol.

Symbol & parent ( ) 

Returns a reference to the parent symbol.

If the symbol is the root, or more generally if the symbol has no parent, it returns a reference to itself.

const Symbol & parent ( ) const 

Returns a const reference to the parent symbol.

If the symbol is the root, or more generally if the symbol has no parent, it returns a reference to itself.

Symbol * getParent ( ) 

Returns a pointer to the parent symbol.

If the symbol is the root, or more generally if the symbol has no parent, it returns a pointer to itself.
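
Because a symbol without a parent returns itself, climbing to the root needs an explicit hasParent() test. The following sketch shows one way to do it. SymbolSketch and findRoot are hypothetical names that only mirror the documented behavior of hasParent(), getParent() and setParent(); they are not part of the generated code.

```cpp
#include <cassert>

// Hypothetical stand-in mirroring hasParent()/getParent() as documented.
class SymbolSketch {
public:
    SymbolSketch() : m_pParent(0) {}
    bool hasParent() const { return m_pParent != 0; }
    // Returns the parent, or this symbol itself when there is none.
    SymbolSketch* getParent() { return m_pParent ? m_pParent : this; }
    void setParent(SymbolSketch* pParent) { m_pParent = pParent; }
private:
    SymbolSketch* m_pParent;
};

// Climbs the parent chain until a symbol without a parent is reached.
inline SymbolSketch* findRoot(SymbolSketch* pSymbol) {
    assert(pSymbol != 0);
    while (pSymbol->hasParent())
        pSymbol = pSymbol->getParent();
    return pSymbol;
}
```

Testing hasParent() rather than comparing against a null pointer is what keeps the loop from spinning on the root.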

void setParent ( Symbol * pParent ) 

Sets the parent for this symbol to pParent.

Not used in the generated code; provided for custom and advanced uses.

long getReducedRuleRank ( ) const 

Returns the rank of the rule that has been reduced for this symbol (successful parsing and reduction code execution).

The rank of the rule is the same as in the grammar. The rank always starts counting from 0 (0 = first rule, 1 = second rule...).

-1 is the value reserved for an invalid rank. This value is for example returned if no rule has been reduced for the symbol.

See also:

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

long getCurrentRuleRank ( ) const 

Returns the rank of the rule that the parse code is currently attempting to parse.

The rank of the rule is the same as in the grammar. The rank always starts counting from 0 (0 = first rule, 1 = second rule...). -1 is the value reserved for an invalid rank (no rule being parsed).

See also:

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

void setReducedRuleRank ( long lRuleRank ) 

Sets the rank of the rule reduced for this symbol.

The rank of the rule must match the rank of the rule in the grammar. The rank always starts counting from 0 (0 = first rule, 1 = second rule...). -1 is the value reserved for an invalid rank.

Not used in the generated code; provided for custom and advanced uses.

See also:

long getReducedRuleRank ( ) const 

long getCurrentSymbolRank ( ) const 

Returns the rank, within the rule, of the symbol that the parse code is currently attempting to parse.

The rank always starts counting from 0 (0 = first symbol, 1 = second symbol ...). -1 is the value reserved for an invalid rank (no symbol being parsed).

See also:

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

void eatTrailingWhiteSpaces ( ) 

This function is called by the parse code to eat trailing whitespaces in the stream once the root symbol has been successfully parsed (overall parsing was successful).

As for any other whitespace eaten in the input stream, trailing whitespaces are stored so that they can be restored when the root symbol is expanded.

Calls and relies on eatWhiteSpaces().

See also:

void eatWhiteSpaces ( ) 

bool parseRoot ( Parser & parser , Symbol * & pReturnSymbol , Symbol * pParent = NULL , bool bBreakOff = true ) 

void eatWhiteSpaces ( ) 

This function is called by the parse code to eat whitespaces in the stream until a start character for a constant, a token or a non-terminal is found.

Whitespaces are stored so that they can be restored when the root symbol is expanded.

Stored whitespaces are the following: space char (' '), tabulation char (\t), new line char (\n) and carriage return char (\r).

This function also counts the number of new lines encountered while parsing the symbol.

For all these reasons, the >> stream operator is not used.

See also:

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

char peekchar ( ) const 

Returns the peek char of the stream, i.e. the first non-whitespace char found in the stream. The peek char is determined in eatWhiteSpaces().

See also:

void eatWhiteSpaces ( ) 

char readchar ( ) 

Reads a char from the stream and returns it. The read char becomes the peek char.

Obsolete. Should not be used except for specific purposes.

See also:

void eatWhiteSpaces ( ) 

bool strincmpread ( register const char * pszString , register unsigned int len ) 

Reads characters from the stream and compares them at the same time with the string pszString, ignoring case.

Returns true if the same chars as those of the passed string could be read from the stream. Returns false otherwise.

Used by the generated parse code.

See also:

bool strncmpread ( register const char * pszString , register unsigned int len ) 

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

bool strncmpread ( register const char * pszString , register unsigned int len ) 

Reads characters from the stream and compares them at the same time with the string pszString, respecting case.

Returns true if the same chars as those of the passed string could be read from the stream. Returns false otherwise.

Used by the generated parse code.

See also:

bool strincmpread ( register const char * pszString , register unsigned int len ) 

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

void init ( Parser & parser , Symbol * pParent , long lInitialRuleRank ) 

Initializes a symbol for parsing.

Called by parseSymbol() before the parsing actually takes place.

Symbol status is set as follows:

- Symbol is not parsed (parsed() returns false)

- Start line no is set to the parent's current line number (pParent->getCurrentLineNo())

- Line count for the symbol is 0 (getLineCount() returns 0)

- Start offset of symbol in stream is set to the current offset in stream (tellg())

- Current parse offset will be set at the first eatWhiteSpaces() call

- Parent symbol is set to pParent (hasParent() then returns true)

- Local pointers to input stream and parser are set to point to parser.stream() and parser

- The known rank of the reduced rule is set to lInitialRuleRank (-1, invalid by default). This parameter is intended for a future implementation of backtracking, where an already parsed symbol could be re-parsed starting from the rule following the one that was parsed before.

- No current rule being parsed (getCurrentRuleRank() returns invalid rank -1)

- Current symbol being parsed will be set in the parse code.

See also:

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

void accept ( ) 

This is called when a rule could be reduced (successful parsing and reduction code execution).

Following this call, parsed() returns true and getReducedRuleRank() returns the rank of the rule reduced.

See also:

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

bool parsed ( ) const 

void reject ( ) 

void reject ( ) 

Notifies the parser of a syntax error at offset getStartOffset() and line number getStartLineNo(). Following this call, the line count is 0 and parsed() returns false.

Initially, this function was intended to be called whenever a symbol could not be parsed at all. However, the generated code only invokes it to handle token parse failures (see token parse code). Other parse failures (non-terminal, constant) are handled more accurately by other code.

In advanced uses, this function might be called in non-terminal reduction code to force a syntax error despite the successful parsing of the non-terminal.

See also:

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

bool parsed ( ) const 

void accept ( ) 

unsigned int getStartLineNo ( ) const 

std::streampos getStartOffset ( ) const 

void inhibitStoreWs ( bool bYes ) 

Tells whether or not to inhibit the storing of eaten whitespaces.

See also:

void eatWhiteSpaces ( ) 

void eatTrailingWhiteSpaces ( ) 

void disableWhitespaceEating ( bool bYes ) 

Tells whether whitespace eating shall be disabled. Whitespace chars become normal chars that must be parsed by a user rule.

bYes is optional and is true by default.

See also:

void eatWhiteSpaces ( ) 

void setParsed ( ) 

Sets the symbol as successfully parsed.

See also:

bool parsed ( ) const 

virtual bool parseSymbol ( Parser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) 

void incrementLineCount ( unsigned long uAmount ) 

Increments the line count.

This function is not used by the generated code. It is intended to be called by a custom whitespace eater.

uAmount is optional and its default value is 1.

See also:

unsigned int getLineCount ( ) const 

unsigned int getCurrentLineNo ( ) const 

Returns the number of the last line or the line currently being parsed for this symbol.

It is equivalent to

   getStartLineNo() + getLineCount()

See also:

unsigned int getStartLineNo ( ) const 

unsigned int getLineCount ( ) const 

unsigned int getStartLineNo ( ) const 

Returns the start line number of this symbol.

See also:

void setStartLineNo ( unsigned int uLineNo ) 

void setStartLineNo ( unsigned int uLineNo ) 

Sets the start line number of this symbol to uLineNo.

See also:

unsigned int getStartLineNo ( ) const 

unsigned int getLineCount ( ) const 

Returns the number of new lines encountered while parsing this symbol.

See also:

void setLineCount ( unsigned int uLineCount ) 

void setLineCount ( unsigned int uLineCount ) 

Sets the number of new lines encountered while parsing this symbol to uLineCount.

See also:

unsigned int getLineCount ( ) const 

std::streampos getStartOffset ( ) const 

Returns the start offset in parse stream of this symbol before eating leading whitespaces.

See also:

std::streampos getCurrentOffset ( ) const 

std::streampos getCurrentOffset ( ) const 

Returns the start offset in parse stream of this symbol after eating leading whitespaces.

See also:

std::streampos getStartOffset ( ) const 

SymbolInfo getPreviousSymbolInfo ( const Parser & parser ) const 

Returns the information about the symbol preceding this one in the rule of the parent symbol currently being parsed.

If there is no preceding symbol, an invalid SymbolInfo object is returned.

See also:

SymbolInfo getNextSymbolInfo ( const Parser & parser ) const 

SymbolInfo getNextSymbolInfo ( const Parser & parser ) const 

Returns the information about the symbol following this one in the rule of the parent symbol currently being parsed.

If there is no following symbol, an invalid SymbolInfo object is returned.

See also:

SymbolInfo getPreviousSymbolInfo ( const Parser & parser ) const 

unsigned long getNbRules ( ) const 

Returns the number of rules defined for this symbol.

See also:

virtual unsigned long getNbSymbolsInRule ( long lRuleRank ) const 

virtual unsigned long getSymbolDescription ( long lRuleRank , long lSymbolRank ) const 

unsigned long getNbSymbolsInRule ( long lRuleRank ) const 

Returns the number of symbols defined in rule of rank lRuleRank.

See also:

virtual unsigned long getNbRules ( ) const 

virtual unsigned long getSymbolDescription ( long lRuleRank , long lSymbolRank ) const 

unsigned long getSymbolDescription ( long lRuleRank , long lSymbolRank ) const 

Returns the code identifying and describing the symbol of rank lSymbolRank in rule of rank lRuleRank. This code is computed and determined at generation time.

See also:

virtual unsigned long getNbRules ( ) const 

virtual unsigned long getNbSymbolsInRule ( long lRuleRank ) const 

SymbolInfoCursor getFirstsInfoCursor ( Parser & parser ) 

Returns a symbol information cursor on the firsts of this symbol.

See also:

virtual const unsigned long * getFirstsDescriptions ( ) const 

const unsigned long * getFirstsDescriptions ( ) const 

Returns a pointer to the array of codes identifying and describing the firsts for this symbol.

These codes are computed and determined at generation time.

void error ( std::streampos errorOffset , unsigned int uErrorLineNo ) 

This function can be used from reduction code to notify a non-breaking error, but without providing any error message. The error occurred at line uErrorLineNo and at offset errorOffset in the parse stream.

Non-breaking errors are semantic errors and are user-defined and managed, in contrast with syntax errors, which are detected and managed automatically by the generated parse code.

Notifying semantic errors does not break off the parsing, even though they will eventually result in a parse failure (fail() called on parser object will return false). Therefore, all semantic errors can be collected in a single pass.

If the user considers that it makes no sense to continue parsing once a certain semantic error has been encountered, the fatal() method should be invoked instead of this one.
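
The difference between a non-breaking and a breaking notification can be illustrated with a small error sink. ErrorSinkSketch and all of its members are hypothetical and exist only for this illustration; the real bookkeeping is done by the Parser object, whose interface is not reproduced here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical error sink, not the LLOOP Parser class.
class ErrorSinkSketch {
public:
    ErrorSinkSketch() : m_bFatal(false) {}

    // Non-breaking: record the error and let the caller continue.
    void error(unsigned int uLineNo, const std::string& msg) {
        m_messages.push_back(format(uLineNo, msg));
    }

    // Breaking: record the error and request that processing stops.
    void fatal(unsigned int uLineNo, const std::string& msg) {
        m_messages.push_back(format(uLineNo, msg));
        m_bFatal = true;
    }

    bool failed() const { return !m_messages.empty(); }
    bool mustStop() const { return m_bFatal; }
    const std::vector<std::string>& messages() const { return m_messages; }

private:
    // Prefixes the message with its line number.
    static std::string format(unsigned int uLineNo, const std::string& msg) {
        std::ostringstream os;
        os << "line " << uLineNo << ": " << msg;
        return os.str();
    }

    std::vector<std::string> m_messages;
    bool m_bFatal;
};
```

With such a sink, several semantic errors can be reported in one pass, while a single fatal notification tells the caller to stop.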

See also:

void error ( std::ostream & os , unsigned long uErrorLineNo , const char * pszFormattedMsg , ... ) 

void fatal ( const char * pszFormattedMsg , ... ) 

bool Parser::failure ( std::streampos failureOffset , unsigned int uFailureLineNo , unsigned int uMismatchConstantIndex ) 

bool Parser::fail ( ) const 

void error ( std::ostream & os , unsigned long uErrorLineNo , const char * pszFormattedMsg , ... ) 

This function can be used from reduction code to notify a non-breaking error and output an error message given by pszFormattedMsg and ... to the output stream os. The error occurred at line uErrorLineNo in the parse stream.

If showLineNo() of the parser object returns true, the line number will prefix the output error message.

Non-breaking errors are semantic errors and are user-defined and managed, in contrast with syntax errors, which are detected and managed automatically by the generated parse code.

Notifying semantic errors does not break off the parsing, even though they will eventually result in a parse failure (fail() called on parser object will return false). Therefore, all semantic errors can be collected in a single pass.

If the user considers that it makes no sense to continue parsing once a certain semantic error has been encountered, the fatal() method should be invoked instead of this one.

Non-safe behavior:

Ultimately, universal::String shall implement a safe sprintf that also ensures the string can hold messages of arbitrary length. For now, the expanded message should not exceed 1024 bytes!
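
One common way to lift a fixed-size limit of this kind is a two-pass vsnprintf call: the first pass measures the message, the second one writes it into a buffer of exactly that size. The sketch below assumes a C++11 compiler (for std::vsnprintf and va_copy); formatMessage is a hypothetical helper and is unrelated to universal::String.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: printf-style formatting into a std::string of
// whatever length the expanded message needs.
inline std::string formatMessage(const char* pszFormat, ...) {
    va_list args;
    va_start(args, pszFormat);
    va_list argsCopy;
    va_copy(argsCopy, args);  // the argument list is consumed by each pass

    // First pass: ask for the required length (terminating '\0' excluded).
    int n = std::vsnprintf(NULL, 0, pszFormat, args);
    va_end(args);
    if (n < 0) {              // encoding error: give up with an empty string
        va_end(argsCopy);
        return std::string();
    }

    // Second pass: format into a buffer that is large enough.
    std::vector<char> buffer(static_cast<std::size_t>(n) + 1);
    std::vsnprintf(&buffer[0], buffer.size(), pszFormat, argsCopy);
    va_end(argsCopy);
    return std::string(&buffer[0], static_cast<std::size_t>(n));
}
```

The price is that the format string is processed twice, which is usually negligible for error messages.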

See also:

void error ( std::streampos errorOffset , unsigned int uErrorLineNo ) 

void fatal ( const char * pszFormattedMsg , ... ) 

bool Parser::failure ( std::streampos failureOffset , unsigned int uFailureLineNo , unsigned int uMismatchConstantIndex ) 

bool Parser::fail ( ) const 

bool Parser::showLineNo ( ) const 

void fatal ( const char * pszFormattedMsg , ... ) 

This function can be used from reduction code to notify a fatal breaking error for which an error message is provided in pszFormattedMsg and .... The error is assumed to have occurred at offset getCurrentOffset() and line getStartLineNo() in the parse stream.

The error message can be retrieved using either the getFailureMessage() or the outputFailureMessage() method of the parser object.

If showLineNo() of the parser object returns true, the line number will prefix the output error message.

Fatal breaking errors are semantic errors and are user-defined and managed, in contrast with syntax errors, which are detected and managed automatically by the generated parse code.

If the user considers that it is more meaningful and efficient not to stop the syntactical analysis of the stream, so as to get more information about other errors, the error() method should be used instead of this one.

Non-safe behavior:

Ultimately, universal::String shall implement a safe sprintf that also ensures the string can hold messages of arbitrary length. For now, the expanded message should not exceed 1024 bytes!

See also:

std::streampos getCurrentOffset ( ) const 

unsigned int getStartLineNo ( ) const 

void error ( std::ostream & os , unsigned long uErrorLineNo , const char * pszFormattedMsg , ... ) 

const universal::String & Parser::getFailureMessage ( ) const 

void Parser::outputFailureMessage ( std::ostream & os ) const 

bool Parser::showLineNo ( ) const 

Symbol & getRightByIndex ( unsigned long uIndex ) 

Returns a reference to the non-terminal or token symbol of index uIndex in the reduced rule. The term 'rank' is also often used instead of 'index' in this documentation.

If there is no known reduced rule or if the index is beyond the valid range, a reference to this symbol is returned.

Let us assume the following rule of the sample.gsp grammar:

  SampleNonTerminal ::= word ',' int ',' SampleNonTerminal

and suppose we have parsed an instance of the SampleNonTerminal symbol class. Here is how the different instances of the right symbols of the rule can be retrieved:

  SampleNonTerminal& sampleNonTerminal ... // A reference to the SampleNonTerminal, obtained in some way
  WordToken& word = dynamic_cast<WordToken&>(sampleNonTerminal.getRightByIndex(0));
  SignedIntegerToken& sint = dynamic_cast<SignedIntegerToken&>(sampleNonTerminal.getRightByIndex(1));
  SampleNonTerminal& sub_sampleNonTerminal = dynamic_cast<SampleNonTerminal&>(sampleNonTerminal.getRightByIndex(2));
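
Since an invalid index yields a reference to the symbol itself, a caller can detect that case by comparing addresses. The sketch below shows the idea with a hypothetical stand-in class; RightsSketch, addRight and hasRightAt are illustration names only and do not exist in the generated code.

```cpp
#include <vector>

// Hypothetical stand-in for a symbol that keeps the right-hand symbols
// of its reduced rule.
class RightsSketch {
public:
    void addRight(RightsSketch* pRight) { m_rights.push_back(pRight); }

    // Mirrors the documented fallback: an out-of-range index yields *this.
    RightsSketch& getRightByIndex(unsigned long uIndex) {
        if (uIndex < m_rights.size()) return *m_rights[uIndex];
        return *this;
    }

private:
    std::vector<RightsSketch*> m_rights;
};

// True only when uIndex designates an actual right-hand symbol.
inline bool hasRightAt(RightsSketch& symbol, unsigned long uIndex) {
    return &symbol.getRightByIndex(uIndex) != &symbol;
}
```

The comparison is reliable because a symbol object is never its own right-hand symbol, even in a recursive rule, where the nested occurrence is a distinct instance.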

Parser & parser ( ) 

Returns a reference to the parser object passed to the init() method.

See also:

void init ( Parser & parser , Symbol * pParent , long lInitialRuleRank = -1 ) 

void reset ( ) 

Resets the status of this symbol.

This is for example called by the matching factory object to reset a symbol that was already used but not 'consumed' due to an unsuccessful parsing.

The new symbol status is same as when the symbol was constructed.

See also:

Symbol ( ) 

void restore_ws ( unsigned long uSymbolRank , std::ostream & os ) const 

Called by expand() to restore the whitespaces that preceded the symbols in the original parsed input stream.

uSymbolRank gives the rank of the symbol for which to restore the leading whitespaces. os is the output stream to which to restore them.

See also:

virtual bool expand ( std::ostream & os ) const 

void eatWhiteSpaces ( ) 

bool parseRoot ( Parser & parser , Symbol * & pReturnSymbol , Symbol * pParent , bool bBreakOff ) 

This function attempts to carry out a raw parsing of this symbol using parser, in the sense that the input stream is not pre-processed beforehand. The symbol is parsed as the root symbol would be, i.e. the line numbering in the stream starts counting from zero and it is checked that the end of the stream is encountered if a stream break-off was expected.

pReturnSymbol is the symbol pointer returned by the parsing.

pParent gives the initial parent of the symbol to parse. By default it is null, i.e. the symbol is the root and has no parent.

bBreakOff tells whether a stream break-off is expected at the end of the parsing (true by default).

See also:

bool parse ( std::istream & is , std::ostream & os , Symbol * pParent = NULL ) 

bool parse ( std::istream & is , Symbol * pParent = NULL ) 

bool parse ( std::istream & is , std::ostream & os , Symbol * pParent ) 

This function attempts to parse this symbol from the input stream is. The input stream is pre-processed before parsing. Error messages are redirected to the output stream os.

pParent gives the initial parent of the symbol to parse. By default it is null, i.e. the symbol is the root and has no parent.

The symbol is parsed as the root symbol would be, i.e. the line numbering in the stream starts counting from zero and it is checked that the end of the stream is encountered if a stream break-off was expected.

This function is useful when users do not want to parse a stream starting from the root symbol of the grammar but from some other symbol, without having to create the parser object and output possible error messages themselves.

With this function, there is no way to get the symbol returned by the reduction code.

Basically, calling this method is equivalent to:

  ...Parser parser(is);
  ...parser.preprocess();
  ...parseRoot();

See also:

bool parseRoot ( Parser & parser , Symbol * & pReturnSymbol , Symbol * pParent = NULL , bool bBreakOff = true ) 

bool parse ( std::istream & is , Symbol * pParent ) 

This overloaded function is provided for convenience and behaves like the above function, except that error messages are always output to the standard error stream cerr.

See also:

bool parse ( std::istream & is , std::ostream & os , Symbol * pParent = NULL ) 


This file is part of the LLOOP LL Object-Oriented Parser Generator and Object Expander Generator. Copyright (c) 2005 Michel MEHL, France. All rights reserved. LLOOP is distributed by the company ERSA SaRL.


Copyright (c) 2005 Michel MEHL, Haguenau, France
LLOOP version 1.0