X-Git-Url: https://pd.if.org/git/?p=pccts;a=blobdiff_plain;f=antlr%2Fantlr1.txt;fp=antlr%2Fantlr1.txt;h=4545275e484a59baf44888abf7166da44b0f5c00;hp=0000000000000000000000000000000000000000;hb=780a935d52ff31d98a3f1083ab0f363a7aafb30d;hpb=ca5cea5c2f4e781582ae8f220c83018d17cb418d

diff --git a/antlr/antlr1.txt b/antlr/antlr1.txt
new file mode 100755
index 0000000..4545275
--- /dev/null
+++ b/antlr/antlr1.txt
@@ -0,0 +1,264 @@
+
+
+
+ANTLR(1)               PCCTS Manual Pages                ANTLR(1)
+
+
+
+NAME
+     antlr - ANother Tool for Language Recognition
+
+SYNTAX
+     antlr [options] grammar_files
+
+DESCRIPTION
+     Antlr converts an extended form of context-free grammar into
+     a set of C functions which directly implement an efficient
+     form of deterministic recursive-descent LL(k) parser.
+     Context-free grammars may be augmented with predicates to
+     allow semantics to influence parsing; this allows a form of
+     context-sensitive parsing.  Selective backtracking is also
+     available to handle non-LL(k) and even non-LALR(k)
+     constructs.  Antlr also produces a definition of a lexer which
+     can be automatically converted into C code for a DFA-based
+     lexer by dlg.  Hence, antlr serves a function much like that
+     of yacc; however, it is notably more flexible and is more
+     integrated with a lexer generator (antlr directly generates
+     dlg code, whereas yacc and lex are given independent
+     descriptions).  Unlike yacc, which accepts LALR(1) grammars,
+     antlr accepts LL(k) grammars in an extended BNF notation,
+     which eliminates the need for precedence rules.
+
+     Like yacc grammars, antlr grammars can use automatically
+     maintained symbol attribute values referenced as dollar
+     variables.  Further, because antlr generates top-down
+     parsers, arbitrary values may be inherited from parent rules
+     (passed like function parameters).  Antlr also has a
+     mechanism for creating and manipulating abstract-syntax-trees.
+
+     There are various other niceties in antlr, including the
+     ability to spread one grammar over multiple files or even
+     multiple grammars in a single file, the ability to generate
+     a version of the grammar with actions stripped out (for
+     documentation purposes), and lots more.
+
+OPTIONS
+     -ck n
+          Use up to n symbols of lookahead when using compressed
+          (linear approximation) lookahead.  This type of
+          lookahead is very cheap to compute and is attempted before
+          full LL(k) lookahead, which is of exponential
+          complexity in the worst case.  In general, the compressed
+          lookahead can be much deeper (e.g., -ck 10) than the full
+          lookahead (which usually must be less than 4).
+
+     -CC  Generate C++ output from both ANTLR and DLG.
+
+     -cr  Generate a cross-reference for all rules.  For each
+          rule, print a list of all other rules that reference
+          it.
+
+     -e1  Ambiguities/errors shown in low detail (default).
+
+     -e2  Ambiguities/errors shown in more detail.
+
+     -e3  Ambiguities/errors shown in excruciating detail.
+
+     -fe file
+          Rename err.c to file.
+
+     -fh file
+          Rename stdpccts.h header (turns on -gh) to file.
+
+     -fl file
+          Rename lexical output, parser.dlg, to file.
+
+     -fm file
+          Rename file with lexical mode definitions, mode.h, to
+          file.
+
+     -fr file
+          Rename file which remaps globally visible symbols,
+          remap.h, to file.
+
+     -ft file
+          Rename tokens.h to file.
+
+     -ga  Generate ANSI-compatible code (default case).  This has
+          not been rigorously tested to be ANSI X3J11 C compliant,
+          but it is close.  The normal output of antlr is
+          currently compilable under K&R C, ANSI C, and C++;
+          this option does nothing because antlr generates a
+          bunch of #ifdef's to do the right thing depending on
+          the language.
+
+     -gc  Indicates that antlr should generate no C code, i.e.,
+          only perform analysis on the grammar.
+
+     -gd  C code is inserted in each of the antlr generated
+          parsing functions to provide for user-defined handling of a
+          detailed parse trace.  The inserted code consists of
+          calls to the user-supplied macros or functions called
+          zzTRACEIN and zzTRACEOUT.  The only argument is a
+          char * pointing to a C-style string which is the grammar
+          rule recognized by the current parsing function.  If no
+          definition is given for the trace functions, upon rule
+          entry and exit, a message will be printed indicating
+          that a particular rule has been entered or exited.
+
+     -ge  Generate an error class for each non-terminal.
+
+     -gh  Generate stdpccts.h for non-ANTLR-generated files to
+          include.  This file contains all defines needed to
+          describe the type of parser generated by antlr (e.g.,
+          how much lookahead is used and whether or not trees are
+          constructed) and contains the header action specified
+          by the user.
+
+     -gk  Generate parsers that delay lookahead fetches until
+          needed.  Without this option, antlr generates parsers
+          which always have k tokens of lookahead available.
+
+     -gl  Generate line info about grammar actions in C parser of
+          the form # line "file" which makes error messages from
+          the C/C++ compiler make more sense as they will point
+          into the grammar file, not the resulting C file.
+          Debugging is easier as well, because you will step
+          through the grammar, not the C file.
+
+     -gs  Do not generate sets for token expression lists;
+          instead generate a ||-separated sequence of
+          LA(1)==token_number.  The default is to generate sets.
+
+     -gt  Generate code for Abstract-Syntax Trees.
+
+     -gx  Do not create the lexical analyzer files (dlg-related).
+          This option should be given when the user wishes to
+          provide a customized lexical analyzer.  It may also be
+          used in make scripts to cause only the parser to be
+          rebuilt when a change not affecting the lexical
+          structure is made to the input grammars.
+
+     -k n Set k of LL(k) to n; i.e., set tokens of look-ahead
+          (default==1).
+
+     -o dir
+          Directory where output files should go (default=".").
+          This is very nice for keeping the source directory
+          clear of ANTLR and DLG spawn.
+
+     -p   The complete grammar, collected from all input grammar
+          files and stripped of all comments and embedded
+          actions, is listed to stdout.  This is intended to aid
+          in viewing the entire grammar as a whole and to
+          eliminate the need to keep actions concisely stated so that
+          the grammar is easier to read.  Hence, it is preferable
+          to embed even complex actions directly in the grammar,
+          rather than to call them as subroutines, since the
+          subroutine call overhead will be saved.
+
+     -pa  This option is the same as -p except that the output is
+          annotated with the first sets determined from grammar
+          analysis.
+
+     -prc on
+          Turn on the computation and hoisting of predicate
+          context.
+
+     -prc off
+          Turn off the computation and hoisting of predicate
+          context.  This option makes 1.10 behave like the 1.06
+          release with option -pr on.  Context computation is off
+          by default.
+
+     -rl n
+          Limit the maximum number of tree nodes used by grammar
+          analysis to n.  Occasionally, antlr is unable to
+          analyze a grammar submitted by the user.  This rare
+          situation can only occur when the grammar is large and
+          the amount of lookahead is greater than one.  A
+          non-linear analysis algorithm is used by PCCTS to handle
+          the general case of LL(k) parsing.  The average
+          complexity of analysis, however, is near linear due to
+          some fancy footwork in the implementation which reduces
+          the number of calls to the full LL(k) algorithm.  An
+          error message will be displayed, if this limit is
+          reached, which indicates the grammar construct being
+          analyzed when antlr hit a non-linearity.  Use this
+          option if antlr seems to go out to lunch and your disk
+          starts thrashing; try n=10000 to start.  Once the
+          offending construct has been identified, try to remove
+          the ambiguity that antlr was trying to overcome with
+          large lookahead analysis.  The introduction of (...)?
+          backtracking blocks eliminates some of these problems:
+          antlr does not analyze alternatives that begin with
+          (...)? (it simply backtracks, if necessary, at run
+          time).
+
+     -w1  Set low warning level.  Do not warn if semantic
+          predicates and/or (...)? blocks are assumed to cover
+          ambiguous alternatives.
+
+     -w2  Ambiguous parsing decisions yield warnings even if
+          semantic predicates or (...)? blocks are used.  Warn if
+          predicate context is computed and semantic predicates
+          incompletely disambiguate alternative productions.
+
+     -    Read grammar from standard input and generate stdin.c
+          as the parser file.
+
+SPECIAL CONSIDERATIONS
+     Antlr works... we think.  There is no implicit guarantee of
+     anything.  We reserve no legal rights to the software known
+     as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS
+     is in the public domain.  An individual or company may do
+     whatever they wish with source code distributed with PCCTS
+     or the code generated by PCCTS, including the incorporation
+     of PCCTS, or its output, into commercial software.  We
+     encourage users to develop software with PCCTS.  However, we
+     do ask that credit is given to us for developing PCCTS.  By
+     "credit", we mean that if you incorporate our source code
+     into one of your programs (commercial product, research
+     project, or otherwise) that you acknowledge this fact somewhere
+     in the documentation, research report, etc...  If you like
+     PCCTS and have developed a nice tool with the output, please
+     mention that you developed it using PCCTS.  As long as these
+     guidelines are followed, we expect to continue enhancing
+     this system and expect to make other tools available as they
+     are completed.
+
+FILES
+     *.c  output C parser.
+
+     *.cpp
+          output C++ parser when C++ mode is used.
+
+     parser.dlg
+          output dlg lexical analyzer.
+
+     err.c
+          token string array, error sets and error support
+          routines.  Not used in C++ mode.
+
+     remap.h
+          file that redefines all globally visible parser
+          symbols.  The use of the #parser directive creates this
+          file.  Not used in C++ mode.
+
+     stdpccts.h
+          list of definitions needed by C files, not generated by
+          PCCTS, that reference PCCTS objects.  This is not
+          generated by default.  Not used in C++ mode.
+
+     tokens.h
+          output #defines for tokens used and function prototypes
+          for functions generated for rules.
+
+
+SEE ALSO
+     dlg(1), pccts(1)
+
+
+
+
+
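Illustration (not part of the patch): the DESCRIPTION section of the added
man page says that antlr emits a set of C functions implementing a
recursive-descent LL(k) parser, and that values can be inherited from parent
rules "like function parameters".  The sketch below is a hand-written
approximation of that idea for k=1.  It is not actual antlr output; the
grammar, the function names (la1, consume, term, expr, parse) and the
"scale" parameter are all assumptions made up for this example.

```c
#include <ctype.h>

/* Hypothetical hand-written sketch; NOT code generated by antlr.
 * Toy LL(1) grammar:
 *   expr : term ( '+' term )* ;
 *   term : DIGIT ;
 */

static const char *input;   /* remaining, not yet consumed input */

/* One symbol of lookahead (k = 1). */
static int la1(void) { return (unsigned char)*input; }

/* Advance past the current symbol unless already at end of input. */
static void consume(void) { if (*input) input++; }

/* term : DIGIT ;  returns the digit's value, or -1 on a syntax error. */
static int term(void)
{
    int v;
    if (!isdigit(la1())) return -1;
    v = la1() - '0';
    consume();
    return v;
}

/* expr : term ( '+' term )* ;
 * 'scale' plays the role of an inherited value: it is handed down from
 * the caller as an ordinary function parameter.  The return value is the
 * synthesized result (sum of the digits times scale), or -1 on error. */
static int expr(int scale)
{
    int sum = term();
    if (sum < 0) return -1;
    while (la1() == '+') {      /* the lookahead alone picks the branch */
        int rhs;
        consume();
        rhs = term();
        if (rhs < 0) return -1;
        sum += rhs;
    }
    return sum * scale;
}

/* Parse a whole string; trailing unparsed input counts as an error. */
static int parse(const char *text, int scale)
{
    int r;
    input = text;
    r = expr(scale);
    if (r < 0 || la1() != '\0') return -1;
    return r;
}
```

With these definitions, parse("1+2+3", 10) returns 60, while parse("1+", 10)
returns -1 because term() finds no digit after the final '+'.  Each rule
maps to one function and each decision is made by inspecting the lookahead
symbol, which is the general shape the man page describes.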