fri.patterns.interpreter.parsergenerator.lexer
Class LexerBuilder

java.lang.Object
  |
  +--fri.patterns.interpreter.parsergenerator.lexer.LexerBuilder

public class LexerBuilder
extends java.lang.Object

Generates a Lexer from a Syntax. The Syntax can also contain parser rules. After the build, these are retrievable (without the removed lexer rules) by calling lexer.getParserSyntax().

The syntax rules must not contain '(', ')', '*', '+' or '?', but they may contain character set symbols like ".." (set definitions) and "-" (intersections, which remove the right-hand set from the left-hand one, as in chars - newline).
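
The two character set symbols can be pictured with plain java.util.BitSet operations. The following is only an illustrative sketch, not the way LexerBuilder implements them: the class CharSetSketch and its methods are hypothetical names, and the assumption that chars covers codes 0 to 127 and newline covers '\n' and '\r' is made up for the example.

```java
import java.util.BitSet;

// Hypothetical illustration of ".." and "-" on character sets; not part of the library.
final class CharSetSketch {

    /** Inclusive range, the counterpart of a set definition such as '0' .. '9'. */
    static BitSet range(char from, char to) {
        BitSet set = new BitSet();
        set.set(from, to + 1); // BitSet.set(fromInclusive, toExclusive)
        return set;
    }

    /** Members of left that are not in right, the counterpart of: left - right. */
    static BitSet minus(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone();
        result.andNot(right);
        return result;
    }

    public static void main(String[] args) {
        BitSet chars = range((char) 0, (char) 127); // assumed extent of "chars"
        BitSet newline = new BitSet();              // assumed members of "newline"
        newline.set('\n');
        newline.set('\r');

        BitSet charMinusNewline = minus(chars, newline);
        System.out.println(charMinusNewline.get('a'));  // true
        System.out.println(charMinusNewline.get('\n')); // false
    }
}
```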

The syntax may contain identifiers enclosed within `backquotes`. These mark predefined lexer rules defined in StandardLexerRules. That class contains default rules for numbers, identifiers, string definitions, character definitions and many others (e.g. for XML), which can be used to build lexers.
CAUTION: Lexer and parser rules share the same namespace, so you cannot define

                identifier ::= `identifier`;	// wrong!
        
Nevertheless, you need not care about the names silently imported from StandardLexerRules: they do not reduce the parser syntax namespace, only the top-level rules do.
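
A check for this kind of clash could look roughly like the sketch below. It is not the library's own validation: the class NamespaceCheckSketch, the method clashingNames and the representation of a rule as a list of strings (left-hand side first) are assumptions made for illustration. Only names that the syntax itself references in backquotes are treated as reserved, in line with the note that names imported indirectly do not count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; each rule is a list: left-hand side, then right-hand symbols.
final class NamespaceCheckSketch {

    /** Returns left-hand side names that collide with a backquoted name used in the syntax. */
    static List<String> clashingNames(List<List<String>> rules) {
        // Collect every name referenced as `name` on any right-hand side.
        Set<String> backquoted = new LinkedHashSet<>();
        for (List<String> rule : rules) {
            for (String symbol : rule.subList(1, rule.size())) {
                if (symbol.length() > 2 && symbol.startsWith("`") && symbol.endsWith("`")) {
                    backquoted.add(symbol.substring(1, symbol.length() - 1));
                }
            }
        }
        // A rule defined under one of those names would clash with it.
        List<String> clashes = new ArrayList<>();
        for (List<String> rule : rules) {
            String lhs = rule.get(0);
            if (backquoted.contains(lhs) && !clashes.contains(lhs)) {
                clashes.add(lhs);
            }
        }
        return clashes;
    }

    public static void main(String[] args) {
        List<List<String>> rules = Arrays.asList(
            Arrays.asList("identifier", "`identifier`"), // the wrong rule from the text
            Arrays.asList("name", "`identifier`"));      // fine: different left-hand side
        System.out.println(clashingNames(rules)); // [identifier]
    }
}
```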

The syntax may contain these (case-sensitive) nonterminals:

token
ignored

These are lexer-reserved identifiers that mark top-level lexer rules (tokens). When token is used, the builder does not try to recognize any rule as a lexer rule by itself, so the token rules must be modeled carefully. Be careful: comments can be skipped only by using ignored. You can, however, define ignored without token. The builder then still tries to recognize lexer rules.
When the token marker is not used, the builder tries to separate lexer rules from parser rules.
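
The role of the two markers can be sketched as a simple pass over the rules. This is a simplified illustration only and does not reproduce what SyntaxSeparation actually does: the class MarkerSketch, its fields and the list-of-strings rule representation are hypothetical, and marker rules are assumed to have exactly one right-hand symbol. Backquotes are stripped from the collected names, since the constructor expects ignored symbols without them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: collect symbols marked by "token" and "ignored" rules.
final class MarkerSketch {
    final List<String> tokenSymbols = new ArrayList<>();
    final List<String> ignoredSymbols = new ArrayList<>();
    final List<List<String>> remainingRules = new ArrayList<>();

    /** Each rule is a list: left-hand side first, then the right-hand symbols. */
    MarkerSketch(List<List<String>> rules) {
        for (List<String> rule : rules) {
            String lhs = rule.get(0);
            if (lhs.equals("token") && rule.size() == 2) {
                tokenSymbols.add(stripBackquotes(rule.get(1)));
            } else if (lhs.equals("ignored") && rule.size() == 2) {
                ignoredSymbols.add(stripBackquotes(rule.get(1)));
            } else {
                remainingRules.add(rule); // ordinary lexer or parser rule
            }
        }
    }

    static String stripBackquotes(String symbol) {
        boolean quoted = symbol.length() >= 2 && symbol.startsWith("`") && symbol.endsWith("`");
        return quoted ? symbol.substring(1, symbol.length() - 1) : symbol;
    }

    public static void main(String[] args) {
        MarkerSketch sketch = new MarkerSketch(Arrays.asList(
            Arrays.asList("token", "`identifier`"),
            Arrays.asList("ignored", "`spaces`"),
            Arrays.asList("ignored", "comment"),
            Arrays.asList("comment", "\"//\"", "char_minus_newline_list_opt")));
        System.out.println(sketch.tokenSymbols);   // [identifier]
        System.out.println(sketch.ignoredSymbols); // [spaces, comment]
    }
}
```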

Example:

                token ::= `identifier` ;	// using StandardLexerRules
                ignored ::= `spaces` ;
                ignored ::= `newline` ;
                ignored ::= comment ;
                comment ::= "//" char_minus_newline_list_opt ;
                char_minus_newline ::= chars - newline;
                char_minus_newline_list ::= char_minus_newline_list char_minus_newline;
                char_minus_newline_list ::= char_minus_newline ;
                char_minus_newline_list_opt ::= char_minus_newline_list;
                char_minus_newline_list_opt ::= ;  // nothing
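
To make the comment rule above concrete, here is a hand-written equivalent of what it accepts: the literal "//" followed by zero or more characters that are not a newline. It is a sketch for orientation, not code generated by LexerBuilder. The class CommentRuleSketch and the method matchComment are hypothetical, and treating newline as '\n' or '\r' is an assumption.

```java
// Hypothetical hand-written counterpart of the "comment" rule from the example syntax.
final class CommentRuleSketch {

    /**
     * Returns the index just past a "//" comment that starts at 'start',
     * or -1 if no comment starts there.
     */
    static int matchComment(String input, int start) {
        if (!input.startsWith("//", start)) {
            return -1;
        }
        int pos = start + 2;
        // char_minus_newline_list_opt: consume any run of non-newline characters, possibly empty
        while (pos < input.length() && input.charAt(pos) != '\n' && input.charAt(pos) != '\r') {
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(matchComment("// hi\nx", 0)); // 5
        System.out.println(matchComment("x // hi", 0));  // -1
    }
}
```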
        
Note that the builder input cannot be a text file. It must be wrapped in a Syntax. Use a syntax builder to convert text into a Syntax object.

Java code fragment:

                SyntaxSeparation separation = new SyntaxSeparation(new Syntax(myRules));
                LexerBuilder builder = new LexerBuilder(separation.getLexerSyntax(), separation.getIgnoredSymbols());
                Lexer lexer = builder.getLexer();
                // when using the lexer standalone (without Parser), you must put the token terminal symbols into it now:
                lexer.setTerminals(separation.getTokenSymbols());
        

Author:
(c) 2002, Fritz Ritzberger
See Also:
SyntaxSeparation, StandardLexerRules

Field Summary
protected  java.util.Map charConsumers
           
static boolean DEBUG
           
protected  java.util.List ignoredSymbols
           
 
Constructor Summary
LexerBuilder(Syntax lexerSyntax, java.util.List ignoredSymbols)
          Creates a LexerBuilder (from lexer rules) that provides a Lexer.
 
Method Summary
 Lexer getLexer()
          Returns the built Lexer.
 Lexer getLexer(java.lang.Object input)
          Returns the built Lexer, loaded with passed input (file, stream, string, ...).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

charConsumers

protected java.util.Map charConsumers

ignoredSymbols

protected java.util.List ignoredSymbols

DEBUG

public static boolean DEBUG
Constructor Detail

LexerBuilder

public LexerBuilder(Syntax lexerSyntax,
                    java.util.List ignoredSymbols)
             throws LexerException,
                    SyntaxException
Creates a LexerBuilder (from lexer rules) that provides a Lexer.

Parameters:
lexerSyntax - lexer rule (without token and ignored, use SyntaxSeparation for that)
ignoredSymbols - list of ignored symbols, NOT enclosed in backquotes!
Method Detail

getLexer

public Lexer getLexer()
Returns the built Lexer.


getLexer

public Lexer getLexer(java.lang.Object input)
               throws java.io.IOException
Returns the built Lexer, loaded with passed input (file, stream, string, ...).

Throws:
java.io.IOException