#include "gsp__Symbol.h"
This is the base class from which all generated non-terminal and token symbol classes derive.
Public Functions | |||
| Symbol | ( ) | ||
| ~Symbol | ( ) | ||
| backtrace | ( std::ostream & os ) = 0 | |
| cancelWhitespaceEating | ( ) | |
| disableWhitespaceEating | ( bool bYes = true ) | |
| error | ( std::ostream & os , unsigned long uErrorLineNo , const char * pszFormattedMsg , ... ) | |
| error | ( std::streampos errorOffset , unsigned int uErrorLineNo ) | |
| expand | ( std::ostream & os , bool bRestoreWS = true ) | |
| expand | ( bool bTrimFirstWhitespaces = true , bool bRestoreWS = true ) | |
| fail | ( ) const | |
| fatal | ( const char * pszFormattedMsg , ... ) | |
| getCurrentLineNo | ( ) const | |
| getCurrentOffset | ( ) const | |
| getEndOffset | ( ) const | |
| getFirstsInfoCursor | ( LLParser & parser ) | ||
| getLineCount | ( ) const | |
| getLineNo | ( ) const | |
| getNbRules | ( ) const | |
| getNbSymbolsInRule | ( long lRuleRank ) const | |
| getOffset | ( ) const | |
| getParent | ( ) | |
| getReducedRuleRank | ( ) const | |
| getRightByIndex | ( unsigned long uIndex ) | |
| getStartLineNo | ( ) const | |
| getStartOffset | ( ) const | |
| getSymbolName | ( ) const = 0 | |
| hasParent | ( ) const | |
| inhibitStoreWs | ( bool bYes ) | |
| isNonTerminal | ( ) const = 0 | |
| isToken | ( ) const = 0 | |
| isWhitespaceEatingDisabled | ( ) const | |
| outputReducedRule | ( std::ostream & os ) | |
| parent | ( ) | |
| parent | ( ) const | |
| parse | ( std::istream & is , std::ostream & os , Symbol * pParent = NULL , bool bExpectEOF = false , bool bDeactivatePreprocessing = false ) = 0 | |
| parse | ( std::istream & is , Symbol * pParent = NULL , bool bExpectEOF = false , bool bDeactivatePreprocessing = false ) = 0 | |
| parsed | ( ) const | |
| parser | ( ) | |
| reset | ( ) | |
| root | ( ) | |
| ruleid | ( ) const | |
| setLineCount | ( unsigned int uLineCount ) | |
| setParent | ( Symbol * pParent ) | |
| setStartLineNo | ( unsigned int uLineNo ) | |
| test | ( std::ostream & os , bool bVerbose = false ) = 0 | |
| visit | ( gsp::Visitor & visitor ) | |
| visit | ( gsp::Visitor & visitor , Symbol * & pReturnSymbol ) | |
| warning | ( std::ostream & os , unsigned long uErrorLineNo , const char * pszFormattedMsg , ... ) | |
Protected Functions | |||
| accept | ( ) | |
| eatTrailingWhiteSpaces | ( ) | |
| eatWhiteSpaces | ( ) | |
| fail | ( bool bFail ) | |
| getCurrentRuleRank | ( ) const | |
| getCurrentSymbolRank | ( ) const | |
| getFirstsDescriptions | ( ) const | |
| getNextSymbolInfo | ( const LLParser & parser ) const | ||
| getPreviousSymbolInfo | ( const LLParser & parser ) const | ||
| getSymbolDescription | ( long lRuleRank , long lSymbolRank ) const | |
| incrementLineCount | ( unsigned long uAmount = 1 ) | |
| init | ( LLParser & parser , Symbol * pParent , long lInitialRuleRank = -1 ) | |
| message | ( std::ostream & os , unsigned long uErrorLineNo , const universal::String & c_sPrefix , const char * pszFormattedMsg , va_list arg ) | |
| parseSymbol | ( LLParser & parser , Symbol * & rpReturnSymbol , Symbol * pParent = NULL ) = 0 | |
| peekchar | ( ) const | |
| readchar | ( ) | |
| reject | ( ) | |
| restore_ws | ( unsigned long uSymbolRank , std::ostream & os ) const | |
| setParsed | ( ) | |
| setReducedRuleRank | ( long lRuleRank ) | |
| strincmpread | ( register const char * pszString , register unsigned int len ) | |
| strncmpread | ( register const char * pszString , register unsigned int len ) | |
| test | ( std::ostream & os , unsigned long uNbTestCases , universal::String * pasTestCaseNames , universal::String * pasTestCaseInputs , universal::String * pasTestCaseOutputs , bool * pabResults , bool bDeactivatePreprocessing , bool bVerbose ) | |
Private Functions | |||
| store_ws | ( register unsigned char uWSType ) | |
Protected Variables | |||
| m_bLeadingWhitespacesEaten | ||
| m_currentOffset | ||
| m_endOffset | ||
| m_iterCounter | |||
| m_lCurrentRuleRank | ||
| m_lCurrentSymbolRank | ||
| m_lReducedRuleRank | ||
| m_offset | ||
| m_pParentSymbol | ||
| m_pParser | ||
| (* m_pcWsFlag) [ 20 ] | ||
| m_pis | ||
| m_puNbWs | ||
| m_startOffset | ||
| m_uCurrentLineCount | ||
| m_uLineCount | ||
| m_uLineNo | ||
| m_uStartLineNo | ||
Private Variables | |||
| m_bDisableWsEating | ||
| m_bFailure | ||
| m_bIsParsed | ||
| m_bStoreWs | ||
| m_peekc | ||
Private Static Variables | |||
| FIRSTS [ 1 ] | ||
| NB_FIRSTS | ||
Friends | |||
| SymbolInfoCursor | ||
| LLParser | ||
| ExpanderVisitor | ||
| BacktracerVisitor | ||
Base symbol class constructor.
Default symbol status is:
- Symbol is not parsed (parsed() returns false)
- Start line no is 1 (getLineNo() and getStartLineNo() return 1)
- Line count for the symbol is 0 (getLineCount() returns 0)
- Start offset of symbol in stream is 0 (getStartOffset() and getOffset() return 0)
- Current parse offset of symbol in stream is 0 (getCurrentOffset() returns 0)
- No known parent symbol (hasParent() returns false)
- Local pointers to input stream and parser are null.
- No rule reduced (getReducedRuleRank() returns invalid rank -1)
- No current rule being parsed (getCurrentRuleRank() returns invalid rank -1)
- No current symbol being parsed (getCurrentSymbolRank() returns invalid rank -1)
See also:
Destructor.
See also:
Common parse function of all symbols. This function is overridden and defined by the actual symbol classes and uses the parse function generated for the actual symbol types.
This function allows any symbol to be parsed through its base interface, but it is not called by the generated parse code itself. It is only called by parsers on root symbols to parse a whole grammar, and internally by the end-user parse interfaces.
pParent is a pointer to the parent symbol that attempts to parse this one.
rpReturnSymbol is a pointer to the symbol returned by the reduction code. If the user does not specify another return symbol, it is a pointer to the symbol object itself. By the contract of this function, the return symbol pointer can always be cast to a pointer to the actual expected symbol type.
See also:
Expands the symbol into the passed output stream os. Expanding a symbol is the reverse operation of parsing. Therefore, calling this function assumes that the symbol has been successfully parsed before.
bRestoreWS is optional and tells whether to restore whitespaces. This is the case by default.
The default expand operation performed by the base symbol class is given below. During generation, this method is overridden, either by the default generated code or by user-defined expand code. In all cases, the overriding methods shall ensure the following:
- An end-of-stream char ('\0') is appended to the result string when the symbol is a non-terminal or is the root symbol.
- If expanding was ok and it is the root, trailing whitespaces are additionally restored.
- Return true on success only.
Example:
ExpanderVisitor expander(os, bRestoreWS);
return visit(expander);
See also:
This function is provided for convenience and behaves like the other expand() function, except that:
- on failure, an empty string is returned;
- the first leading whitespaces are removed by default. This can be switched off by setting bTrimFirstWhitespaces to false.
bRestoreWS is also optional and tells whether to restore all whitespaces. This is the case by default. If it is set to false, the value of bTrimFirstWhitespaces is ignored.
This function executes the following code:
Example:
std::strstream sExp;
if (expand(sExp, bRestoreWS))
{
    sExp << ends;
    universal::String s = sExp.str();
    if (bTrimFirstWhitespaces)
        s.trimLeft();
    return s;
}
else
{
    return "";
}
See also:
Backtraces the symbol in the passed output stream os.
Of course, the symbol must have been previously successfully parsed.
Returns true on success.
See also:
Outputs to the passed stream os the reduced rule in verbatim text.
See also:
Visits this symbol with visitor.
This calls the function referred to below and ignores the returned replacing object pointer.
Returns true if the visit was accepted and successful.
See also:
Visits this symbol with visitor and returns a pointer to a replacing object (if any).
Returns true if the visit was accepted and successful.
The visitor design pattern is useful for scanning through the tree of parsed symbols in the same order as they were parsed.
The visitor pattern has a wide range of uses. It can implement any post-processing function on parsed data, e.g. checks on a structured language that can only be done once the whole context of some data has been parsed, such as checking that a used variable or function was declared. It can also be used for automatic code correction or for any other advanced substitution of a subset of the data in a complex structured stream.
For a symbol to be visited, its visit() method must be called with the visitor object as argument. visit() then calls the process() function of the visitor, passing it a pointer to the symbol itself.
The right-hand symbols of the parsed rule are then visited in turn.
See also:
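The traversal described above can be sketched as follows. This is a minimal, self-contained illustration and not the LLOOP implementation: Node, Visitor and NameCollector are hypothetical stand-ins for gsp::Symbol and gsp::Visitor, and the bool return convention of process() is an assumption. Only the idea of handing the symbol to process() and then visiting the right-hand symbols in parse order is taken from the text.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;  // hypothetical stand-in for gsp::Symbol

// Hypothetical stand-in for gsp::Visitor.
struct Visitor {
    virtual ~Visitor() = default;
    // Assumption: returning false rejects the visit and stops the traversal.
    virtual bool process(Node& node) = 0;
};

struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> rights;  // right-hand symbols of the reduced rule

    explicit Node(std::string n) : name(std::move(n)) {}

    // Appends a right-hand symbol and returns it so that subtrees can be chained.
    Node& add(std::string n) {
        rights.push_back(std::make_unique<Node>(std::move(n)));
        return *rights.back();
    }

    // The symbol passes itself to the visitor, then its right-hand
    // symbols are visited in the order in which they were parsed.
    bool visit(Visitor& visitor) {
        if (!visitor.process(*this)) return false;
        for (auto& right : rights)
            if (!right->visit(visitor)) return false;
        return true;
    }
};

// Example visitor: records the name of every visited symbol.
struct NameCollector : Visitor {
    std::vector<std::string> names;
    bool process(Node& node) override {
        names.push_back(node.name);
        return true;
    }
};

// Builds a small tree and returns the symbol names in visiting order.
inline std::vector<std::string> collectSampleNames() {
    Node root("SampleNonTerminal");
    root.add("word");
    root.add("int");
    root.add("SampleNonTerminal").add("word");
    NameCollector collector;
    root.visit(collector);
    return collector.names;
}
```

A post-processing check (for instance, verifying that every used name was declared) would be written as another class deriving from the visitor and keeping its own state between process() calls.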
Tells whether the symbol was successfully parsed. If the symbol couldn't be parsed, this function returns false.
See also:
Tells whether a failure was encountered during the parsing of the symbol while using any of the functions referred to below.
There is a failure when:
- Pre-processing failed. Normally, the only way to make pre-processing fail is to pass the constructor an invalid input source from which no chars can be read (e.g. an invalid file name).
- A syntax error was encountered and detected by the generated parse code
- A fatal (breaking) failure was raised in a non-terminal reduction code
- At least one (non-breaking) error was notified in a non-terminal reduction code
See also:
Tells whether the symbol is a token. Overridden by the generated symbols.
See also:
Tells whether the symbol is a non-terminal. Overridden by the generated symbols.
See also:
Tells whether the symbol has a parent symbol.
In the normal case, a symbol always has a parent symbol, with the exception of the root symbol.
See also:
Returns a reference to the parent symbol.
If the symbol is the root, or more generally if the symbol has no parent, it returns a reference to itself.
See also:
Returns a const reference to the parent symbol.
If the symbol is the root, or more generally if the symbol has no parent, it returns a reference to itself.
See also:
Returns a reference to the actual root symbol for the current parsing.
It is not necessarily an instance of the root symbol defined for the overall grammar as returned by the root() method of the parser.
Returns the root object as returned by the root() method of the parser object if the current parsing was initiated from the grammar root symbol using the run() method of a manually created parser object.
Otherwise, it is the arbitrary non-terminal from which the parsing was initiated using either parseRoot() or any other parseSymbol that was explicitly called on the root non-terminal.
See also:
Returns a pointer to the parent symbol.
If the symbol is the root, or more generally if the symbol has no parent, it returns a pointer to itself.
See also:
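The "returns itself when there is no parent" convention of hasParent(), parent() and getParent() can be sketched as below. Sym is a hypothetical stand-in for the symbol class. The topmostAncestor() helper, which walks the parent chain, is an assumption added for illustration; the source does not say how root() is actually resolved.

```cpp
#include <cassert>

// Hypothetical stand-in for the symbol class, reduced to its parent link.
class Sym {
public:
    explicit Sym(Sym* pParent = nullptr) : m_pParentSymbol(pParent) {}

    bool hasParent() const { return m_pParentSymbol != nullptr; }

    // A symbol without parent returns a reference to itself.
    Sym& parent() { return m_pParentSymbol ? *m_pParentSymbol : *this; }
    const Sym& parent() const { return m_pParentSymbol ? *m_pParentSymbol : *this; }

    // Pointer flavour of the same rule.
    Sym* getParent() { return m_pParentSymbol ? m_pParentSymbol : this; }

    // Assumption: the symbol from which parsing started is found by
    // following the parent links until a symbol without parent is reached.
    Sym& topmostAncestor() {
        Sym* p = this;
        while (p->m_pParentSymbol != nullptr) p = p->m_pParentSymbol;
        return *p;
    }

private:
    Sym* m_pParentSymbol;
};
```

Because parent() never returns a dangling or null reference, callers that need to tell the two cases apart should test hasParent() first.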
Sets the parent for this symbol to pParent.
Not used by the generated code; intended for custom and advanced uses.
See also:
Returns the rank of the rule that has been reduced for this symbol (successful parsing and reduction code execution).
The rank of the rule is the same as in the grammar. The rank always starts counting from 0 (0 = first rule, 1 = second rule...).
-1 is the value reserved for an invalid rank. This value is for example returned if no rule has been reduced for the symbol.
See also:
This function is provided for convenience. It returns exactly the same information as the function referred to below.
See also:
Returns the rank of the rule that the parse code is currently attempting to parse.
The rank of the rule is the same as in the grammar. The rank always starts counting from 0 (0 = first rule, 1 = second rule...). -1 is the value reserved for an invalid rank (no rule being parsed).
See also:
Sets the rank of the rule reduced for this symbol.
The rank of the rule must match the rank of the rule in the grammar. The rank always starts counting from 0 (0 = first rule, 1 = second rule...). -1 is the value reserved for an invalid rank.
Not used by the generated code; intended for custom and advanced uses.
See also:
Returns the rank of the symbol in the rule that the parse code is currently attempting to parse.
The rank always starts counting from 0 (0 = first symbol, 1 = second symbol ...). -1 is the value reserved for an invalid rank (no symbol being parsed).
See also:
This function is called by the parse code to eat trailing whitespaces in the stream once the root symbol has been successfully parsed (i.e. the overall parsing was successful).
As with any other whitespace eaten from the input stream, trailing whitespaces are stored so that they can be restored when the root symbol is expanded.
Calls and relies on eatWhiteSpaces().
See also:
This function is called by the parse code to eat whitespaces in the stream until it finds the start character of a constant, a token or a non-terminal.
Whitespaces are stored so that they can be restored when the root symbol is expanded.
Stored whitespaces are the following: space char (' '), tabulation char (\t), new line char (\n) and carriage return char (\r).
This function also counts the number of new lines encountered while parsing the symbol.
It is for all these reasons that the >> stream operator is not used.
See also:
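The following sketch illustrates why whitespace eating is done character by character rather than with the extraction operator: each whitespace has to be recorded and each new line counted. WhitespaceRun and eatWhiteSpacesSketch() are hypothetical names; the real class keeps this state in its member variables, and its storage format is not shown here.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result record for one whitespace-eating pass.
struct WhitespaceRun {
    std::string stored;          // whitespace chars, in the order they were eaten
    unsigned int lineCount = 0;  // number of '\n' chars encountered
    int peekChar = std::char_traits<char>::eof();  // first non-whitespace char, or EOF
};

// Eats ' ', '\t', '\n' and '\r' from the stream, remembering each of them
// and counting new lines. The first other char is left in the stream.
inline WhitespaceRun eatWhiteSpacesSketch(std::istream& is) {
    WhitespaceRun run;
    for (;;) {
        const int c = is.peek();
        if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            is.get();  // consume the whitespace
            run.stored.push_back(static_cast<char>(c));
            if (c == '\n') ++run.lineCount;
        } else {
            run.peekChar = c;  // not consumed: this is the peek char
            return run;
        }
    }
}
```

Keeping the eaten characters is what makes the later expansion reversible: writing run.stored back in front of the symbol reproduces the original layout.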
Cancels the whitespace eating performed for the current symbol in the current parsed rule.
The stream pointer is repositioned at the start offset of this symbol, i.e. before whitespace eating, and the line count is reset to 0.
Once this function has been called, the first chars that will be read again are those whitespaces.
See also:
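A minimal sketch of the cancel operation, assuming a seekable input stream: the offset is recorded before eating and the stream is sent back to it afterwards. WsEater is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper that remembers where whitespace eating started.
struct WsEater {
    std::istream& is;
    std::streampos startOffset;  // offset before any whitespace was eaten
    unsigned int lineCount = 0;

    explicit WsEater(std::istream& s) : is(s), startOffset(s.tellg()) {}

    static bool isWs(int c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Consumes leading whitespaces and counts the new lines among them.
    void eat() {
        for (int c = is.peek(); isWs(c); c = is.peek()) {
            is.get();
            if (c == '\n') ++lineCount;
        }
    }

    // Undoes eat(): the whitespaces will be the next chars read.
    void cancel() {
        is.clear();  // peek() may have set eofbit at end of stream
        is.seekg(startOffset);
        lineCount = 0;
    }
};
```

This only works on streams that support seeking, such as file and string streams.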
Returns the peek char of the stream, i.e. the first non-whitespace char found in the stream. The peek char is determined in eatWhiteSpaces().
See also:
Reads a char from the stream and returns it. The read char becomes the peek char.
Obsolete. Should not be used except for specific purposes.
See also:
Reads characters from the stream and, at the same time, compares them with the string pszString, ignoring case.
Returns true if the same chars as those of the passed string could be read from the stream, false otherwise.
Used by the generated parse code.
See also:
Reads characters from the stream and, at the same time, compares them with the string pszString, respecting case.
Returns true if the same chars as those of the passed string could be read from the stream, false otherwise.
Used by the generated parse code.
See also:
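The read-and-compare behavior of strincmpread() and strncmpread() can be sketched with a single hypothetical function, readAndCompare(), where a flag selects case-insensitive matching. Rewinding the stream on a mismatch is an assumption of this sketch; the source does not say where the real functions leave the stream pointer on failure.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads up to len chars from the stream, comparing each with text[i].
// Returns true only if all len chars matched.
inline bool readAndCompare(std::istream& is, const char* text,
                           std::size_t len, bool ignoreCase) {
    const std::streampos start = is.tellg();
    for (std::size_t i = 0; i < len; ++i) {
        const int c = is.get();
        bool same = false;
        if (c != std::char_traits<char>::eof()) {
            const unsigned char a = static_cast<unsigned char>(c);
            const unsigned char b = static_cast<unsigned char>(text[i]);
            same = ignoreCase ? std::tolower(a) == std::tolower(b) : a == b;
        }
        if (!same) {
            // Assumption: on mismatch the stream is restored to where it was.
            is.clear();
            is.seekg(start);
            return false;
        }
    }
    return true;
}
```

Comparing while reading avoids buffering the candidate text first, which suits generated code that tests one constant after another at the same stream position.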
Initializes a symbol for parsing.
Called by parseSymbol() before the parsing actually takes place.
Symbol status is set as follows:
- Symbol is not parsed (parsed() returns false)
- Start line no is set to the parent's current line number (pParent->getCurrentLineNo()). If the symbol has no parent, i.e. it is the root symbol, the initial value (normally 1 as set in the constructor) remains unchanged. The initial start line number of a root symbol can be forced to a given value by calling setStartLineNo() before initiating its parsing.
- Line count for the symbol is 0 (getLineCount() returns 0)
- Start offset of symbol in stream is set to the current offset in stream (tellg())
- Current parse offset will be set at the first eatWhiteSpaces() call
- Parent symbol is set to pParent (hasParent() returns true then)
- Local pointers to input stream and parser are set to parser.stream() and parser.
- The known rank of the reduced rule is set to lInitialRuleRank (-1, invalid by default). This parameter is foreseen for a future implementation of backtracking where an already parsed symbol could be re-parsed starting from the rule following the rule that was parsed before.
- No current rule being parsed (getCurrentRuleRank() returns invalid rank -1)
- Current symbol being parsed will be set in the parse code.
See also:
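The line-number part of the initialization listed above can be sketched as follows. LineInfo and initFromParent() are hypothetical names used only for illustration; the relation "current line = start line + line count" is the one documented for getCurrentLineNo().

```cpp
#include <cassert>

// Hypothetical holder for the line bookkeeping of one symbol.
struct LineInfo {
    unsigned int startLineNo = 1;  // default value set by the constructor
    unsigned int lineCount = 0;    // new lines seen while parsing the symbol

    unsigned int currentLineNo() const { return startLineNo + lineCount; }
};

// Mirrors the line-related steps of the initialization: the line count is
// reset, and the start line is taken from the parent's current line. Without
// a parent (root symbol) the previous start line is kept unchanged.
inline LineInfo initFromParent(const LineInfo* pParent, const LineInfo& previous) {
    LineInfo info = previous;
    info.lineCount = 0;
    if (pParent != nullptr) info.startLineNo = pParent->currentLineNo();
    return info;
}
```

With this rule a child symbol always starts on the line its parent has reached, so line numbers reported for nested symbols stay absolute within the input.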
This is called when a rule could be reduced (successful parsing and reduction code execution).
Following this call, parsed() returns true and getReducedRuleRank() returns the rank of the rule reduced.
The end offset is also set here.
See also:
Notifies a syntax error to the parser at offset getOffset() and line no getLineNo(). Following this call, line count is 0 and parsed() returns false.
Initially, this function was meant to be called whenever a symbol could not be parsed at all. However, the generated code only invokes it to manage token parse failures (see token parse code). Other parse failures (non-terminal, constant) are managed more accurately by other code.
In advanced uses, this function might be called in non-terminal reduction code to force a syntax error despite the successful parsing of the non-terminal.
See also:
Tells whether or not to inhibit the storing of the eaten whitespaces.
See also:
Tells whether whitespace eating is disabled.
See also:
Tells whether whitespace eating shall be disabled. When it is disabled, whitespace chars become normal chars that must be parsed by a user rule.
bYes is optional and is true by default.
See also:
Sets the symbol as successfully parsed.
See also:
Increments the line count.
This function is not used by the generated code. It is intended to be called by a custom whitespace eater.
uAmount is optional and its default value is 1.
See also:
Returns the number of the last line or the line currently being parsed for this symbol.
It is equivalent to
Example:
getStartLineNo() + getLineCount()
See also:
Returns the start line number of this symbol after whitespace eating.
See also:
Returns the start line number of this symbol before whitespace eating.
See also:
Sets the start line number of this symbol to uLineNo.
See also:
Returns the number of new lines encountered while parsing this symbol.
See also:
Sets the number of new lines encountered while parsing this symbol to uLineCount.
See also:
Returns the start offset of this symbol before whitespace eating.
See also:
Returns the end offset of this symbol before any trailing whitespace eating.
See also:
Returns the start offset of this symbol after whitespace eating.
See also:
Returns the current parse offset for this symbol before whitespace eating.
See also:
Returns the information about the symbol preceding this one in the rule of the parent symbol currently being parsed.
If there is no preceding symbol, an invalid SymbolInfo object is returned.
See also:
Returns the information about the symbol following this one in the rule of the parent symbol currently being parsed.
If there is no following symbol, an invalid SymbolInfo object is returned.
See also:
Returns the number of rules defined for this symbol.
See also:
Returns the number of symbols defined in the rule of rank lRuleRank.
See also:
Returns the code identifying and describing the symbol of rank lSymbolRank in the rule of rank lRuleRank. This code is computed and determined at generation time.
If no valid code could be found, e.g. because either or both of the parameters are invalid, the special value 0xFFFFFFFF is returned.
See also:
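The rank-based lookup with its 0xFFFFFFFF sentinel can be sketched as a table indexed by rule rank and symbol rank. RuleTable and its members are hypothetical; the real codes are generated, and returning 0 symbols for an invalid rule rank is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Special value returned when no valid code can be found.
constexpr std::uint32_t kInvalidDescription = 0xFFFFFFFFu;

// Hypothetical table: rules[ruleRank][symbolRank] holds a symbol code.
struct RuleTable {
    std::vector<std::vector<std::uint32_t>> rules;

    long nbRules() const { return static_cast<long>(rules.size()); }

    // Assumption: an invalid rule rank is reported as having no symbols.
    long nbSymbolsInRule(long lRuleRank) const {
        if (lRuleRank < 0 || lRuleRank >= nbRules()) return 0;
        return static_cast<long>(rules[static_cast<std::size_t>(lRuleRank)].size());
    }

    // Ranks start at 0; any rank out of range yields the sentinel.
    std::uint32_t description(long lRuleRank, long lSymbolRank) const {
        if (lSymbolRank < 0 || lSymbolRank >= nbSymbolsInRule(lRuleRank))
            return kInvalidDescription;
        return rules[static_cast<std::size_t>(lRuleRank)]
                    [static_cast<std::size_t>(lSymbolRank)];
    }
};
```

Callers should compare the result against the sentinel before interpreting it, in the same way that -1 is checked for rule and symbol ranks.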
Returns a symbol information cursor on the firsts of this symbol.
See also:
Returns a pointer to the array of codes identifying and describing the firsts for this symbol.
These codes are computed and determined at generation time.
See also:
This function can be used from reduction codes to notify an error at line uErrorLineNo and offset errorOffset without emitting any message.
See also the other error() function documentation.
See also:
This function can be used from reduction codes to emit an error message at line uErrorLineNo in the parse stream.
User-defined errors are semantic errors, in contrast with syntax errors which are detected and managed automatically by the generated parse code.
User-defined errors do not break the parsing, even though they will eventually result in a parse failure (fail() called on the parser object will return true).
If the user considers that it does not make sense to continue the parsing once a certain semantic error has been encountered, the fatal() method shall be invoked instead.
The message is formatted using the message() function referred to below using uErrorLineNo and pszFormattedMsg.
See also:
This function can be used to emit a warning message at line uErrorLineNo in the parse stream.
Warnings are not errors and never interrupt the parsing. They are used to draw the user's attention to information that may be important.
The message is formatted using the message() function referred to below using uErrorLineNo and pszFormattedMsg.
See also:
This function emits a formatted message given by pszFormattedMsg and ... to the passed output stream os.
The line number uErrorLineNo will prefix the output message, but only if the line number is not 0 and if showLineNo() of the parser object returns true.
Non-safe behavior:
Ultimately, universal::String shall implement a safe sprintf which will also ensure the string can hold messages of arbitrary length. For now, the expanded message must not exceed 1024 bytes!
See also:
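As an illustration of the prefixing rule and of the size limit mentioned above, here is a sketch that formats into a fixed 1024-byte buffer with std::vsnprintf, which truncates instead of overflowing. vmessageSketch() and errorSketch() are hypothetical, the "line N: " layout is an assumption, and the bool parameter stands in for the parser's showLineNo().

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <ostream>
#include <sstream>
#include <string>

// Formats the message into a bounded buffer and writes it to os.
// The line number is only shown when it is not 0 and showLineNo is true.
inline void vmessageSketch(std::ostream& os, unsigned long uErrorLineNo,
                           bool showLineNo, const std::string& prefix,
                           const char* pszFormattedMsg, va_list args) {
    char buf[1024];
    // vsnprintf never writes past the buffer; longer messages are truncated.
    std::vsnprintf(buf, sizeof buf, pszFormattedMsg, args);
    if (showLineNo && uErrorLineNo != 0)
        os << "line " << uErrorLineNo << ": ";  // hypothetical layout
    os << prefix << buf << '\n';
}

// Variadic front end, in the spirit of error() and warning().
inline void errorSketch(std::ostream& os, unsigned long uErrorLineNo,
                        const char* pszFormattedMsg, ...) {
    va_list args;
    va_start(args, pszFormattedMsg);
    vmessageSketch(os, uErrorLineNo, true, "error: ", pszFormattedMsg, args);
    va_end(args);
}
```

A reduction code could then report a semantic problem with a call such as errorSketch(os, 12, "undeclared variable '%s'", "x").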
This function can be used from reduction codes to notify a fatal breaking error for which an error message is provided in pszFormattedMsg and .... The error is assumed to have occurred at offset getCurrentOffset() and line getCurrentLineNo() in the parse stream.
The error message can be retrieved using either the getFailureMessage() or the outputFailureMessage() method of the parser object.
If showLineNo() of the parser object returns true, the line number will prefix the output error message.
Fatal breaking errors are semantic errors and are user-defined and managed, in contrast with syntax errors which are detected and managed automatically by the generated parse code.
If the user considers that it is meaningful and more efficient not to stop the syntactical analysis of the stream, so as to get information about other errors, the error() method shall be used instead of this one.
Non-safe behavior:
Ultimately, universal::String shall implement a safe sprintf which will also ensure the string can hold messages of arbitrary length. For now, the expanded message must not exceed 1024 bytes!
See also:
Returns a reference to the non-terminal or token symbol of index uIndex in the reduced rule. The term 'rank' instead of 'index' is also often used in this documentation.
If there is no known reduced rule or if the index is beyond the valid range, a reference to this symbol is returned.
Assume the following rule of the sample.gsp grammar:
Example:
SampleNonTerminal ::= word ',' int ',' SampleNonTerminal
and suppose we have parsed an instance of the SampleNonTerminal symbol class. Here is how the different instances of the right-hand symbols of the rule can be obtained:
Example:
SampleNonTerminal& sampleNonTerminal ... // We somehow got a reference to the SampleNonTerminal
WordToken& word = dynamic_cast<WordToken&>(sampleNonTerminal.getRightByIndex(0));
SignedIntegerToken& sint = dynamic_cast<SignedIntegerToken&>(sampleNonTerminal.getRightByIndex(1));
SampleNonTerminal& sub_sampleNonTerminal = dynamic_cast<SampleNonTerminal&>(sampleNonTerminal.getRightByIndex(2));
See also:
Returns a reference to the parser object passed to the init() method.
See also:
Resets the status of this symbol.
This is for example called by the matching factory object to reset a symbol that was already used but not 'consumed' due to an unsuccessful parsing.
The new symbol status is the same as when the symbol was constructed.
See also:
Called by eatWhiteSpaces() to keep track of the white spaces eaten during parsing.
Stores the whitespace uWSType appearing before the next symbol to parse. At most 80 whitespaces can be stored for a symbol.
See also:
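One way to fit 80 whitespaces in little memory is to pack them as 2-bit codes, four per byte, into 20 bytes. The sketch below shows that technique. It is only a guess prompted by the 20-element m_pcWsFlag member and the four whitespace kinds; the actual storage layout of the library is not documented here, and WsStore is a hypothetical class.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical store packing up to 80 whitespaces as 2-bit codes.
class WsStore {
public:
    static constexpr std::size_t kCapacity = 80;  // 20 bytes * 4 codes per byte

    // Maps a whitespace char to its 2-bit code, or -1 for any other char.
    static int codeOf(char c) {
        switch (c) {
            case ' ':  return 0;
            case '\t': return 1;
            case '\n': return 2;
            case '\r': return 3;
            default:   return -1;
        }
    }

    // Returns false if c is not a whitespace or if the store is full.
    bool store(char c) {
        const int code = codeOf(c);
        if (code < 0 || m_count >= kCapacity) return false;
        const std::size_t shift = (m_count % 4) * 2;
        m_bits[m_count / 4] |= static_cast<unsigned char>(code << shift);
        ++m_count;
        return true;
    }

    // Returns the i-th stored whitespace (i must be below size()).
    char at(std::size_t i) const {
        static const char kChars[4] = {' ', '\t', '\n', '\r'};
        const std::size_t shift = (i % 4) * 2;
        return kChars[(m_bits[i / 4] >> shift) & 0x3];
    }

    std::size_t size() const { return m_count; }

private:
    std::array<unsigned char, 20> m_bits{};  // zero-initialized
    std::size_t m_count = 0;
};
```

Restoring the whitespaces then amounts to writing at(0) through at(size() - 1) to the output stream in order.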
Called by expand() to restore the whitespaces that preceded the symbols in the original parsed input stream.
uSymbolRank gives the rank of the symbol for which to restore the leading whitespaces. os is the output stream to which to restore them.
See also:
Tests the symbol according to the test cases specified along with the symbol definition. This function is used internally by the generated code to actually run the tests.
All the input data required to run the test cases are passed in the following arguments:
- uNbTestCases gives the number of test cases.
- pasTestCaseNames is an array of strings giving the test case names.
- pasTestCaseInputs gives the strings used for the parse tests.
- pasTestCaseOutputs gives the expected strings resulting from the expanding of the parsed test strings.
- pabResults gives the results expected (pass/fail) for the respective parse tests.
Prints the result of the tests in readable form to the specified stream os. "[ PASS ]" means the test was successful whereas "[ FAIL ]" highlights a failure.
If bVerbose is false (default value), only the test name (with a limited text length) and the result are shown. If bVerbose is explicitly set to true, additional information is printed, such as the tested string and the expected result (fail or pass).
In all cases, if any test fails, the tested input is printed out with all the syntax error information given by the getFailureContext() method of the parser object.
Returns true if all the test cases were successful, false if any failure was encountered.
See also:
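The test loop described above can be sketched as follows. TestCase, ParseExpandFn and runTests() are hypothetical names, the parse-and-expand step is injected as a callback instead of calling the generated parse code, and the exact report layout apart from the "[ PASS ]" and "[ FAIL ]" markers is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// One test case: an input, its expected expansion and the expected outcome.
struct TestCase {
    std::string name;
    std::string input;
    std::string expectedOutput;  // expected expansion of the parsed input
    bool expectedToPass;         // whether parsing is expected to succeed
};

// Hypothetical callback: parses the input and, on success, fills in the
// expansion of the parsed symbol. Returns false if parsing failed.
using ParseExpandFn = std::function<bool(const std::string&, std::string&)>;

// Runs all test cases, prints one line per case and returns true only if
// every case behaved as expected.
inline bool runTests(std::ostream& os, const std::vector<TestCase>& cases,
                     const ParseExpandFn& parseAndExpand, bool bVerbose = false) {
    bool allOk = true;
    for (const TestCase& tc : cases) {
        std::string out;
        const bool parsed = parseAndExpand(tc.input, out);
        // A case succeeds if the outcome matches the expectation and, when
        // parsing worked, the expansion matches the expected output.
        const bool ok = (parsed == tc.expectedToPass) &&
                        (!parsed || out == tc.expectedOutput);
        os << (ok ? "[ PASS ] " : "[ FAIL ] ") << tc.name;
        if (bVerbose)
            os << " input=\"" << tc.input << "\" expected="
               << (tc.expectedToPass ? "pass" : "fail");
        os << '\n';
        allOk = allOk && ok;
    }
    return allOk;
}
```

Note that a case expected to fail counts as a pass when the parser does reject its input, which is how negative tests are expressed.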
Sets the failure flag.
Private.
See also:
This file is part of the LLOOP Reversible Object-Oriented Parser Generator. Copyright (c) 2005-2006 Michel MEHL, France. All rights reserved. LLOOP is distributed by the company ERSA SaRL.
| Copyright (c) 2005-2006 Michel MEHL, Haguenau, France |
| LLOOP version 1.1 |