- A lexical analyzer scans the input stream and converts sequences of characters into tokens.
- The main job of the lexical analyzer (scanner) is to break an input stream into more usable elements (tokens).
- Lex is not a complete language, but rather a generator: it provides a new language feature that can be added to different programming languages, called host languages.
What is a Token in Lex?
A token is a classification of a group of characters; for example, the character sequence `123` can be classified as a number token, and `count` as an identifier token.
- Lex is a tool for writing lexical analyzers.
- Lex reads a specification file containing regular expressions and generates a C routine that performs lexical analysis: it matches character sequences that identify tokens.
A Lex program consists of 3 sections:
- The first section, declarations, includes declarations of variables and constants, and regular definitions.
- The second section contains translation rules, each consisting of a regular expression and the action to perform when it matches.
- The third section holds whatever program subroutines are needed by the actions.
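As an illustrative sketch of the three sections, here is a minimal Lex specification; the token names and print actions are assumptions for illustration, not taken from the source:

```lex
%{
/* Section 1: declarations - C code copied verbatim into the scanner */
#include <stdio.h>
%}
digit   [0-9]
%%
 /* Section 2: translation rules - regular expression, then action */
{digit}+      { printf("NUMBER: %s\n", yytext); }
[a-zA-Z]+     { printf("WORD: %s\n", yytext); }
[ \t\n]       { /* skip whitespace */ }
%%
 /* Section 3: subroutines needed by the actions */
int main(void) { yylex(); return 0; }
int yywrap(void)  { return 1; }
```

The `%%` lines separate the three sections; running this file through `lex` (or `flex`) produces a C routine, `yylex()`, that performs the scanning.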
Ambiguous Source Rules:
- Lex can handle ambiguous specifications. When more than one expression can match the current input, Lex chooses as follows:
- The longest match is preferred.
- Among rules that match the same number of characters, the rule given first is preferred.
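A small sketch showing both rules in action; the keyword `if` and the token names here are assumptions chosen for illustration:

```lex
%%
if        { printf("KEYWORD\n"); }
[a-z]+    { printf("IDENTIFIER\n"); }
%%
```

On the input `ifelse`, both rules can match at the start, but `[a-z]+` matches all 6 characters while `if` matches only 2, so the longest match wins and `IDENTIFIER` is reported. On the input `if` alone, both rules match exactly 2 characters, so the rule given first wins and `KEYWORD` is reported.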