fri.patterns.interpreter.parsergenerator.lexer
Class LexerImpl

java.lang.Object
  |
  +--fri.patterns.interpreter.parsergenerator.lexer.LexerImpl
All Implemented Interfaces:
Lexer, java.io.Serializable, StrategyFactoryMethod

public class LexerImpl
extends java.lang.Object
implements Lexer, StrategyFactoryMethod, java.io.Serializable

This Lexer must be created using LexerBuilder. It knows token and ignored terminals. For this Lexer to work, setTerminals() must be called at least once. When the Lexer is used standalone, the client must make that call; otherwise the Parser does it.

This Lexer can be reused, but once it has been built for one syntax it cannot be loaded with another.

Author:
(c) 2002, Fritz Ritzberger
See Also:
Serialized Form

Nested Class Summary
 
Nested classes inherited from class fri.patterns.interpreter.parsergenerator.Lexer
Lexer.TokenListener
 
Field Summary
protected  Strategy strategy
           
 
Constructor Summary
protected LexerImpl()
          Do-nothing constructor for subclasses (currently unused).
  LexerImpl(java.util.List ignoredSymbols, java.util.Map charConsumers)
          Creates a Lexer from token- and ignored symbols, and a map of character consumers (built by LexerBuilder).
 
Method Summary
 void addTokenListener(Lexer.TokenListener tokenListener)
          Implements Lexer.
 void clear()
          Implements Lexer: Does nothing as no states are stored.
protected  Token createToken(java.lang.String tokenIdentifier, ResultTree result, LexerSemantic lexerSemantic)
          Token factory method.
protected  Token createToken(java.lang.String tokenIdentifier, java.lang.String text, Token.Range range)
          Token factory method.
 void dump(java.io.PrintStream out)
          Outputs current and previous line, with line numbers.
 int getColumn()
          Returns the position within the current line, 0-n.
 int getLine()
          Returns the number of the current line, 1-n.
 java.lang.String getLineText()
          Returns the current line, as far as read.
 Token getNextToken(LexerSemantic lexerSemantic)
          This is an optional functionality of Lexer.
 Token getNextToken(java.util.Map expectedTokenSymbols)
          Implements Lexer: returns the next token from input, or EPSILON when no more input.
 int getOffset()
          Returns the offset read so far from input.
 boolean lex(LexerSemantic lexerSemantic)
          This is an optional functionality of Lexer.
protected  void loopResultTree(ResultTree result, LexerSemantic lexerSemantic)
          After top-down lexing this method is called to dispatch all results.
 Strategy newStrategy()
          Implements StrategyFactoryMethod.
 void removeTokenListener(Lexer.TokenListener tokenListener)
          Implements Lexer.
 void setCompeteForLongestInput(boolean competeForLongestInput)
          When false, the sort order (significance) of scan items without a fixed start character decides which token is returned.
 void setDebug(boolean debug)
          Implements Lexer: Set debug on to output information about scanned tokens.
 void setInput(java.lang.Object text)
          Implements Lexer: set the input to be scanned.
 void setTerminals(java.util.List terminals)
          Implements Lexer: called by the Parser to pass all token symbols (each enclosed in `backquotes`) and literals ("xyz").
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

strategy

protected Strategy strategy
Constructor Detail

LexerImpl

public LexerImpl(java.util.List ignoredSymbols,
                 java.util.Map charConsumers)
Creates a Lexer from token- and ignored symbols, and a map of character consumers (built by LexerBuilder).

Parameters:
ignoredSymbols - list of Strings containing ignored symbols to scan. Unlike tokens, these are NOT enclosed in `backquotes`.
charConsumers - map with key = nonterminal and value = Consumer.

LexerImpl

protected LexerImpl()
Do-nothing constructor for subclasses (currently unused).

Method Detail

addTokenListener

public void addTokenListener(Lexer.TokenListener tokenListener)
Implements Lexer. Adds the passed token listener to the listener list.

Specified by:
addTokenListener in interface Lexer
Parameters:
tokenListener - the Lexer.TokenListener implementation to install.

removeTokenListener

public void removeTokenListener(Lexer.TokenListener tokenListener)
Implements Lexer. Removes the passed token listener from the listener list.

Specified by:
removeTokenListener in interface Lexer
Parameters:
tokenListener - the Lexer.TokenListener implementation to remove.

newStrategy

public Strategy newStrategy()
Implements StrategyFactoryMethod. To be overridden to create a derived Strategy implementation.

Specified by:
newStrategy in interface StrategyFactoryMethod

setCompeteForLongestInput

public void setCompeteForLongestInput(boolean competeForLongestInput)
When false, the sort order (significance) of scan items without a fixed start character decides which token is returned. When true (default), the scan item (without a fixed start character) that scans the longest input wins.
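
The two modes can be illustrated with a minimal, self-contained sketch. It does not use the library's classes: ScanItem, CompeteSketch and choose() are hypothetical stand-ins, regular expressions replace the real character consumers, and the list order is assumed to represent significance (most significant first).

```java
/** Hypothetical stand-in for a scan item; not the library's class. */
class ScanItem {
    final String symbol;
    final java.util.regex.Pattern pattern;

    ScanItem(String symbol, String regex) {
        this.symbol = symbol;
        this.pattern = java.util.regex.Pattern.compile(regex);
    }

    /** Returns the number of characters matched at the start of input, or -1. */
    int scan(String input) {
        java.util.regex.Matcher matcher = pattern.matcher(input);
        return matcher.lookingAt() ? matcher.end() : -1;
    }
}

class CompeteSketch {
    /** The items are assumed to be sorted by significance, most significant first. */
    static String choose(java.util.List<ScanItem> items, String input, boolean competeForLongestInput) {
        String winner = null;
        int longest = -1;
        for (ScanItem item : items) {
            int length = item.scan(input);
            if (length <= 0) {
                continue; // no match, or an empty match
            }
            if (!competeForLongestInput) {
                return item.symbol; // first match in sort order wins
            }
            if (length > longest) { // on a tie the more significant item stays
                longest = length;
                winner = item.symbol;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        java.util.List<ScanItem> items = java.util.Arrays.asList(
            new ScanItem("`integer`", "[0-9]+"),
            new ScanItem("`float`", "[0-9]+(\\.[0-9]+)?"));
        System.out.println(choose(items, "3.14", true));  // `float`
        System.out.println(choose(items, "3.14", false)); // `integer`
    }
}
```

With longest-input competition the `float` item wins because it consumes all four characters; without it, the first matching item in sort order is returned even though it consumes only one character.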


setInput

public void setInput(java.lang.Object text)
              throws java.io.IOException
Implements Lexer: sets the input to be scanned. If text is an InputStream, no Reader will be used (characters will not be converted).

Specified by:
setInput in interface Lexer
Parameters:
text - text to scan, as String, StringBuffer, File, InputStream, Reader.
Throws:
java.io.IOException
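
The practical difference between passing an InputStream and passing a Reader can be sketched as follows. This is one reading of "characters will not be converted", namely that each byte is taken as one input unit. InputUnitsSketch and its methods are hypothetical helpers, and the UTF-8 charset in the decoded case is an assumption made for the example.

```java
class InputUnitsSketch {
    /** Raw case: every byte becomes one input unit, without charset decoding. */
    static int[] rawUnits(byte[] bytes) {
        java.io.ByteArrayInputStream in = new java.io.ByteArrayInputStream(bytes);
        int[] units = new int[bytes.length];
        for (int i = 0; i < units.length; i++) {
            units[i] = in.read(); // value in the range 0..255
        }
        return units;
    }

    /** Decoded case: what a Reader using UTF-8 would deliver (the charset is an assumption). */
    static int[] decodedUnits(byte[] bytes) {
        return new String(bytes, java.nio.charset.StandardCharsets.UTF_8).chars().toArray();
    }

    public static void main(String[] args) {
        byte[] utf8 = { (byte) 0xC3, (byte) 0xA4 }; // a-umlaut (U+00E4) encoded as UTF-8
        System.out.println(java.util.Arrays.toString(rawUnits(utf8)));     // [195, 164]
        System.out.println(java.util.Arrays.toString(decodedUnits(utf8))); // [228]
    }
}
```

If the syntax is written in terms of characters rather than bytes, wrapping the stream in a Reader with an explicit charset before calling setInput() is probably the safer choice.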

setTerminals

public void setTerminals(java.util.List terminals)
Implements Lexer: called by the Parser to pass all token symbols (each enclosed in `backquotes`) and literals ("xyz").

Specified by:
setTerminals in interface Lexer
Parameters:
terminals - List of String containing "literals" and `lexertokens`.
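
The two kinds of entries in the terminals list can be told apart by their enclosing characters. The sketch below is a hypothetical helper (TerminalKindSketch is not part of the library) that relies only on the quoting convention stated above.

```java
class TerminalKindSketch {
    /** True when the terminal has at least two characters and starts and ends with the quote. */
    static boolean isEnclosedBy(String terminal, char quote) {
        return terminal.length() >= 2
            && terminal.charAt(0) == quote
            && terminal.charAt(terminal.length() - 1) == quote;
    }

    /** Classifies a terminal as "lexertoken", "literal" or "unknown". */
    static String kindOf(String terminal) {
        if (isEnclosedBy(terminal, '`')) {
            return "lexertoken";
        }
        if (isEnclosedBy(terminal, '"')) {
            return "literal";
        }
        return "unknown";
    }

    /** Removes the enclosing quote characters; call only for classified terminals. */
    static String strip(String terminal) {
        return terminal.substring(1, terminal.length() - 1);
    }

    public static void main(String[] args) {
        java.util.List<String> terminals = java.util.Arrays.asList("\"while\"", "`identifier`");
        for (String terminal : terminals) {
            System.out.println(kindOf(terminal) + ": " + strip(terminal));
        }
        // prints "literal: while" and "lexertoken: identifier"
    }
}
```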

clear

public void clear()
Implements Lexer: Does nothing as no states are stored. This Lexer cannot be loaded with new syntaxes.

Specified by:
clear in interface Lexer

getNextToken

public Token getNextToken(LexerSemantic lexerSemantic)
                   throws java.io.IOException
This is an optional functionality of Lexer. It is NOT called by the Parser. It can be used for heuristic reading from an input (not knowing if there is more input after the token was read).

The passed LexerSemantic will receive every matched rule (top-down) together with its ResultTree. See lex() for details.

Parameters:
lexerSemantic - the LexerSemantic to be called with every evaluated Rule and its lexing ResultTree.
Returns:
a Token with a terminal symbol and its instance text, or a Token with null symbol for error.
Throws:
java.io.IOException

getNextToken

public Token getNextToken(java.util.Map expectedTokenSymbols)
                   throws java.io.IOException
Implements Lexer: returns the next token from input, or EPSILON when there is no more input. This is called by the Parser to get the next syntax token from input. When the returned token.symbol is null, no input could be recognized (ERROR).

Specified by:
getNextToken in interface Lexer
Parameters:
expectedTokenSymbols - contains the expected String token symbols (in keys), can be null when no Parser drives this Lexer.
Returns:
a Token with a terminal symbol and its instance text, or a Token with null symbol for error.
Throws:
java.io.IOException

createToken

protected Token createToken(java.lang.String tokenIdentifier,
                            ResultTree result,
                            LexerSemantic lexerSemantic)
Token factory method. Can be overridden to access the lexing ResultTree. Delegates to createToken(tokenIdentifier, text, range).


createToken

protected Token createToken(java.lang.String tokenIdentifier,
                            java.lang.String text,
                            Token.Range range)
Token factory method. Can be overridden to convert token.text to some Java object.
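
The override pattern might look like the following sketch. It uses hypothetical stand-ins (BaseLexerSketch, ConvertedToken, ConvertingLexerSketch) instead of LexerImpl and Token, and it leaves out the Token.Range parameter to keep the example self-contained. The `number` symbol is an invented example.

```java
/** Hypothetical stand-in for Token whose text may hold any Java object. */
class ConvertedToken {
    final String symbol;
    final Object text;

    ConvertedToken(String symbol, Object text) {
        this.symbol = symbol;
        this.text = text;
    }
}

/** Hypothetical stand-in for the lexer that owns the factory method. */
class BaseLexerSketch {
    protected ConvertedToken createToken(String tokenIdentifier, String text) {
        return new ConvertedToken(tokenIdentifier, text); // default: keep the text as a String
    }
}

/** Subclass that converts the text of `number` tokens to Integer. */
class ConvertingLexerSketch extends BaseLexerSketch {
    @Override
    protected ConvertedToken createToken(String tokenIdentifier, String text) {
        if ("`number`".equals(tokenIdentifier)) {
            return new ConvertedToken(tokenIdentifier, Integer.valueOf(text));
        }
        return super.createToken(tokenIdentifier, text);
    }

    public static void main(String[] args) {
        ConvertingLexerSketch lexer = new ConvertingLexerSketch();
        Object value = lexer.createToken("`number`", "42").text;
        System.out.println(value.getClass().getSimpleName() + " " + value); // Integer 42
    }
}
```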


lex

public boolean lex(LexerSemantic lexerSemantic)
            throws java.io.IOException
This is an optional functionality of Lexer. It is NOT called by the Parser. It can be used to run a standalone Lexer with a LexerSemantic, processing a ready-scanned syntax tree. Unlike a Parser Semantic, a LexerSemantic has no value stack available, and all input will have been read by the time the LexerSemantic is called with the built syntax tree.

The passed LexerSemantic will receive every matched rule (top-down) together with its ResultTree, which contains the range within the input. A ResultTree can be converted to text by calling resultTree.toString().

This method evaluates the input using end-of-input like a parser: it returns false if the input was syntactically incorrect, or if EOF was not received after all rules have been evaluated.

Note: This method does not call any TokenListener, as the LexerSemantic is expected to dispatch results!

Parameters:
lexerSemantic - the LexerSemantic to be called with every evaluated Rule and its lexing ResultTree.
Returns:
true when lexer succeeded (input was syntactically ok), else false.
Throws:
java.io.IOException

loopResultTree

protected void loopResultTree(ResultTree result,
                              LexerSemantic lexerSemantic)
After top-down lexing this method is called to dispatch all results. Can be overridden to change dispatch logic. This method calls itself recursively with all result tree children. Nonterminals starting with "_" are ignored by default, as this marks artificial rules.

Parameters:
result - lexer result, returns text on getText().
lexerSemantic - the LexerSemantic to be called with every evaluated Rule and its lexing ResultTree.
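
The recursive dispatch described above can be sketched with hypothetical stand-ins: TreeNode replaces ResultTree, RuleCallback replaces LexerSemantic, and DispatchSketch.loop() plays the role of loopResultTree(). The sketch assumes that children of an ignored "_" node are still visited; the description above does not say so explicitly.

```java
/** Hypothetical stand-in for ResultTree. */
class TreeNode {
    final String nonterminal;
    final String text;
    final java.util.List<TreeNode> children;

    TreeNode(String nonterminal, String text, TreeNode... children) {
        this.nonterminal = nonterminal;
        this.text = text;
        this.children = java.util.Arrays.asList(children);
    }
}

/** Hypothetical stand-in for LexerSemantic. */
interface RuleCallback {
    void ruleEvaluated(String nonterminal, String text);
}

class DispatchSketch {
    /** Top-down dispatch: the parent is reported before its children. */
    static void loop(TreeNode result, RuleCallback callback) {
        if (!result.nonterminal.startsWith("_")) { // "_" marks an artificial rule
            callback.ruleEvaluated(result.nonterminal, result.text);
        }
        for (TreeNode child : result.children) {
            loop(child, callback); // assumption: descend even below ignored nodes
        }
    }

    public static void main(String[] args) {
        TreeNode tree = new TreeNode("sum", "1+2",
            new TreeNode("_sum_1", "1+", new TreeNode("number", "1")),
            new TreeNode("number", "2"));
        loop(tree, (nonterminal, text) -> System.out.println(nonterminal + " = " + text));
        // prints "sum = 1+2", "number = 1", "number = 2"
    }
}
```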

setDebug

public void setDebug(boolean debug)
Implements Lexer: Set debug on to output information about scanned tokens.

Specified by:
setDebug in interface Lexer

getLineText

public java.lang.String getLineText()
Returns the current line, as far as read.


getLine

public int getLine()
Returns the number of the current line, 1-n.


getColumn

public int getColumn()
Returns the position within the current line, 0-n.


getOffset

public int getOffset()
Returns the offset read so far from input. This is an absolute offset, including newlines.
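
How the four position values relate can be shown with a small sketch. PositionSketch is a hypothetical class, not the library's implementation. It follows the conventions stated above (line counted from 1, column from 0, offset absolute and including newlines) and assumes that a line ends with a single '\n'; how the real Lexer treats "\r\n" is not stated here.

```java
class PositionSketch {
    private int line = 1;   // 1-n
    private int column = 0; // 0-n
    private int offset = 0; // absolute, newlines included
    private final StringBuilder lineText = new StringBuilder();

    /** Records one consumed character. */
    void consume(char c) {
        offset++;
        if (c == '\n') {
            line++;
            column = 0;
            lineText.setLength(0); // a new line starts empty
        } else {
            column++;
            lineText.append(c);
        }
    }

    int getLine() { return line; }
    int getColumn() { return column; }
    int getOffset() { return offset; }
    String getLineText() { return lineText.toString(); }

    public static void main(String[] args) {
        PositionSketch position = new PositionSketch();
        for (char c : "ab\ncd".toCharArray()) {
            position.consume(c);
        }
        System.out.println(position.getLine() + ":" + position.getColumn()
            + " offset=" + position.getOffset()
            + " text=" + position.getLineText()); // 2:2 offset=5 text=cd
    }
}
```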


dump

public void dump(java.io.PrintStream out)
Outputs current and previous line, with line numbers. Call this on ERROR.

Specified by:
dump in interface Lexer