Chapter 2: Differences between flex and flexc++

Although flexc++ tries to be as compatible with flex as possible, there are some differences. This chapter provides a quick overview for users already familiar with flex.

2.1: Format of the input file

In flex it is possible to provide initializing code in the definition section (see section 3.1) and as the first lines in the rules section.

Flexc++ does not support such code blocks. Since flexc++ generates a class with appropriate header files, there are other means to include code in your scanner. See also section 2.3 (Generated files) below.

Nor does flexc++ support a final `user code' section, in which additional code can be placed that is copied verbatim to the source file.

There are two reasons for dropping these code blocks. First, the format of the lexer file becomes simpler. Second, the alternatives to the code blocks are actually an improvement over the traditional code blocks.

With flex one would use code blocks before the rules to declare local variables that are used in some of the actions. With flexc++ it is possible to use data members of the scanner class for this. With flex the third section of the lexer file could be used to define helper functions. With flexc++ helper members may be defined in the scanner class.

Below we list the differences between flex and flexc++. We provide suggestions for flexc++ solutions to problems that were addressed by flex features that we no longer support.
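The use of data members and helper members can be illustrated by a minimal sketch. The class below is only a stand-in: a real Scanner class is derived from the base class generated by flexc++, which is omitted here so that the example compiles on its own. The names d_identifiers, handleWord, identifiers and isKeyword are hypothetical and were chosen for this illustration only.

```cpp
#include <cstddef>
#include <string>

// Stand-in for the Scanner class (assumption: the generated base class
// is left out to keep the sketch self-contained).
class Scanner
{
    // Data member: takes the place of a local variable that flex users
    // would declare in a code block before the rules.
    std::size_t d_identifiers = 0;

    public:
        // Hypothetical member that an action could call for each
        // matched word.
        void handleWord(std::string const &text)
        {
            if (not isKeyword(text))
                ++d_identifiers;
        }

        std::size_t identifiers() const
        {
            return d_identifiers;
        }

    private:
        // Helper member: takes the place of a helper function that flex
        // users would define in the user code section.
        static bool isKeyword(std::string const &text)
        {
            return text == "if" || text == "else" || text == "while";
        }
};
```

Since the counter is a data member its value is kept between calls, and the helper is only visible inside the class.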

Sections 2.1.1, 2.1.2 and 2.1.3 provide lists of items which are no longer supported in flexc++ and offer alternatives.

2.1.1: Definition section

2.1.2: Rules section

2.1.3: User code section

2.2: Patterns

Not all patterns that are supported by flex are supported by flexc++. Notably, flexc++ does not yet support certain flags in regular expressions, like the flag that makes the regular expression case insensitive, or the flag that allows whitespace in regular expressions.

Another small difference in the patterns is that the lookahead operator (`/') cannot be used in a named pattern defined in the definition section. This is the result of name expansions being handled as a parenthesized regular expression (a group). Since groups may occur any number of times in a regular expression but a lookahead operator only once, the lookahead operator is not accepted in a named pattern.

2.3: Generated files

Flexc++ generates more files than flex does. While flex generates only lex.yy.cc, flexc++ generates several header files and a source file: by default Scanner.h, Scanner.ih, Scannerbase.h, and lex.cc. Both Scannerbase.h and lex.cc are overwritten each time flexc++ is invoked.

Scanner.h and Scanner.ih are only generated the first time flexc++ is called. These files can thereafter be modified by the user (e.g., to add members to the Scanner class).

2.4: Comment

Flexc++ supports both traditional C comment and C++ style end-of-line comment. Flexc++ handles comment more flexibly than flex. Cf. section 3.3 for further details.

2.5: Members and macros

Since C++ supports namespaces, the yy-prefix for every member and macro is no longer needed. Most functions can now be used without the prefix. Also, because flexc++ generates a scanner class, member functions of the scanner class can often be used instead of macros. See the conversion table below.

flex                  flexc++                flexc++ alternative
yylex()               lex()
YYText()              matched()
YYLeng()              length()
ECHO                  echo()
yymore()              more()
yyless()              accept()               redo()
BEGIN startcondition  begin(startcondition)
YY_AT_BOL             n.a.
yy_set_bol(at_bol)    n.a.

The member functions in the flexc++ columns above are members of either Scanner or its base class. All of them can be used from within actions or by other member functions.

Flexc++ does not use or define macros. Macros defined by flex are not available in flexc++'s input files.
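The following sketch illustrates how code that would use the matched text and its length can call plain member functions instead of yy-prefixed names or macros. It does not use the files generated by flexc++: ScannerBaseMock is a hypothetical stand-in that only mimics matched() and length() from the table above, and setMatched and describe are names invented for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the generated base class. Only the two
// members used below are mimicked; their signatures are assumptions.
class ScannerBaseMock
{
    std::string d_matched;

    protected:
        // Invented for this sketch: a real scanner stores the matched
        // text itself once a rule has matched.
        void setMatched(std::string const &text)
        {
            d_matched = text;
        }

        std::string const &matched() const
        {
            return d_matched;
        }

        std::size_t length() const
        {
            return d_matched.length();
        }
};

class Scanner: public ScannerBaseMock
{
    public:
        // The body resembles what an action might do: no yy-prefix and
        // no macros, only member functions.
        std::string describe(std::string const &text)
        {
            setMatched(text);
            return matched() + " (" + std::to_string(length()) + ")";
        }
};
```

Because the functions are members, they are available in any member of Scanner, not only in the code that handles a match.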

2.6: Multiple input streams

Multiple input files can easily be handled by flexc++. See section 4.1 for details.