From b4b840c95c66d3d7023861a8a54a23653801ddef Mon Sep 17 00:00:00 2001 From: hselasky Date: Thu, 5 Mar 2015 09:55:35 +0000 Subject: [PATCH] MFC r227243, r233456, r248258, r248849 and r279297: Update to upstream version 2.10 The most notable new feature is support for definition files. The most notable new feature is support for processing multiple files in one invocation. There is also support for more make-friendly exit statuses. The most notable bug fix is #line directives now include the input file name. Obtained from: http://dotat.at/prog/unifdef git-svn-id: svn://svn.freebsd.org/base/stable/9@279643 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f --- usr.bin/find/find.1 | 1 - usr.bin/indent/indent.1 | 3 - usr.bin/jot/jot.1 | 1 - usr.bin/setchannel/setchannel.1 | 1 - usr.bin/tr/tr.1 | 2 - usr.bin/unifdef/unifdef.1 | 326 +++++++----- usr.bin/unifdef/unifdef.c | 902 ++++++++++++++++++++++---------- usr.bin/unifdef/unifdef.h | 52 ++ usr.bin/unifdef/unifdefall.sh | 39 +- usr.bin/vgrind/vgrindefs.5 | 1 - 10 files changed, 881 insertions(+), 447 deletions(-) create mode 100644 usr.bin/unifdef/unifdef.h diff --git a/usr.bin/find/find.1 b/usr.bin/find/find.1 index 908abb19f..006dfbed3 100644 --- a/usr.bin/find/find.1 +++ b/usr.bin/find/find.1 @@ -156,7 +156,6 @@ This option is equivalent to the deprecated primary. .El .Sh PRIMARIES -.Pp All primaries which take a numeric argument allow the number to be preceded by a plus sign .Pq Dq Li + diff --git a/usr.bin/indent/indent.1 b/usr.bin/indent/indent.1 index f83900209..fe8f1aeda 100644 --- a/usr.bin/indent/indent.1 +++ b/usr.bin/indent/indent.1 @@ -488,7 +488,6 @@ The utility fits as many words (separated by blanks, tabs, or newlines) on a line as possible. Blank lines break paragraphs. -.Pp .Ss Comment indentation If a comment is on a line with code it is started in the `comment column', which is set by the @@ -504,7 +503,6 @@ command line parameter. 
If the code on a line extends past the comment column, the comment starts further to the right, and the right margin may be automatically extended in extreme cases. -.Pp .Ss Preprocessor lines In general, .Nm @@ -519,7 +517,6 @@ is recognized and .Nm attempts to correctly compensate for the syntactic peculiarities introduced. -.Pp .Ss C syntax The .Nm diff --git a/usr.bin/jot/jot.1 b/usr.bin/jot/jot.1 index 7a826e3fc..b6d1d14df 100644 --- a/usr.bin/jot/jot.1 +++ b/usr.bin/jot/jot.1 @@ -242,7 +242,6 @@ specifying an integer format: .Bd -literal -offset indent $ jot -w %d 6 1 10 0.5 .Ed -.Pp .Sh EXIT STATUS .Ex -std .Sh EXAMPLES diff --git a/usr.bin/setchannel/setchannel.1 b/usr.bin/setchannel/setchannel.1 index 0bb76a188..5b2d4ab82 100644 --- a/usr.bin/setchannel/setchannel.1 +++ b/usr.bin/setchannel/setchannel.1 @@ -33,7 +33,6 @@ .Nd Hauppage PVR250/350 channel selector .Sh SYNOPSIS .Cd pvr250-setchannel [-a {on | off}] [-c | -r | -s | -t] [-g geom] [-m channel_set] [channel | freq] -.Pp .Sh DESCRIPTION .Nm provides support for selecting channels on Hauppauge WinTV cards, diff --git a/usr.bin/tr/tr.1 b/usr.bin/tr/tr.1 index 5c8709d10..4cfd88ce9 100644 --- a/usr.bin/tr/tr.1 +++ b/usr.bin/tr/tr.1 @@ -145,7 +145,6 @@ the octal sequence to the full 3 octal digits. .It \echaracter A backslash followed by certain special characters maps to special values. -.Pp .Bl -column "\ea" .It "\ea .It "\eb @@ -177,7 +176,6 @@ previous implementations. .It [:class:] Represents all characters belonging to the defined character class. Class names are: -.Pp .Bl -column "phonogram" .It "alnum .It "alpha diff --git a/usr.bin/unifdef/unifdef.1 b/usr.bin/unifdef/unifdef.1 index e68a5f6b2..7b6369446 100644 --- a/usr.bin/unifdef/unifdef.1 +++ b/usr.bin/unifdef/unifdef.1 @@ -1,6 +1,6 @@ .\" Copyright (c) 1985, 1991, 1993 .\" The Regents of the University of California. All rights reserved. -.\" Copyright (c) 2002 - 2010 Tony Finch . All rights reserved. 
+.\" Copyright (c) 2002 - 2013 Tony Finch . All rights reserved. .\" .\" This code is derived from software contributed to Berkeley by .\" Dave Yost. It was rewritten to support ANSI C by Tony Finch. @@ -31,23 +31,24 @@ .\" .\" $FreeBSD$ .\" -.Dd March 11, 2010 -.Dt UNIFDEF 1 -.Os +.Dd January 7, 2014 +.Dt UNIFDEF 1 PRM +.Os " " .Sh NAME .Nm unifdef , unifdefall .Nd remove preprocessor conditionals from code .Sh SYNOPSIS .Nm -.Op Fl bBcdeKknsStV +.Op Fl bBcdehKkmnsStV .Op Fl I Ns Ar path -.Op Fl D Ns Ar sym Ns Op = Ns Ar val -.Op Fl U Ns Ar sym -.Op Fl iD Ns Ar sym Ns Op = Ns Ar val -.Op Fl iU Ns Ar sym +.Op Fl [i]D Ns Ar sym Ns Op = Ns Ar val +.Op Fl [i]U Ns Ar sym .Ar ... +.Op Fl f Ar defile +.Op Fl x Bro Ar 012 Brc +.Op Fl M Ar backext .Op Fl o Ar outfile -.Op Ar infile +.Op Ar infile ... .Nm unifdefall .Op Fl I Ns Ar path .Ar ... @@ -66,14 +67,21 @@ while otherwise leaving the file alone. The .Nm utility acts on -.Ic #if , #ifdef , #ifndef , #elif , #else , +.Ic #if , #ifdef , #ifndef , +.Ic #elif , #else , and .Ic #endif -lines. -A directive is only processed -if the symbols specified on the command line are sufficient to allow -.Nm -to get a definite value for its control expression. +lines, +using macros specified in +.Fl D +and +.Fl U +command line options or in +.Fl f +definitions files. +A directive is processed +if the macro specifications are sufficient to provide +a definite value for its control expression. If the result is false, the directive and the following lines under its control are removed. If the result is true, @@ -83,7 +91,7 @@ An or .Ic #ifndef directive is passed through unchanged -if its controlling symbol is not specified on the command line. +if its controlling macro is not specified. Any .Ic #if or @@ -109,12 +117,14 @@ and .Ic #elif lines: integer constants, -integer values of symbols defined on the command line, +integer values of macros defined on the command line, the .Fn defined operator, the operators -.Ic \&! 
, < , > , <= , >= , == , != , && , || , +.Ic \&! , < , > , +.Ic <= , >= , == , != , +.Ic && , || , and parenthesized expressions. A kind of .Dq "short circuit" @@ -128,16 +138,42 @@ if either operand of .Ic || is definitely true then the result is true. .Pp -In most cases, the +When evaluating an expression, +.Nm +does not expand macros first. +The value of a macro must be a simple number, +not an expression. +A limited form of indirection is allowed, +where one macro's value is the name of another. +.Pp +In most cases, .Nm -utility does not distinguish between object-like macros -(without arguments) and function-like arguments (with arguments). -If a macro is not explicitly defined, or is defined with the +does not distinguish between object-like macros +(without arguments) and function-like macros (with arguments). +A function-like macro invocation can appear in +.Ic #if +and +.Ic #elif +control expressions. +If the macro is not explicitly defined, +or is defined with the .Fl D -flag on the command-line, its arguments are ignored. +flag on the command-line, +or with +.Ic #define +in a +.Fl f +definitions file, +its arguments are ignored. If a macro is explicitly undefined on the command line with the .Fl U -flag, it may not have any arguments since this leads to a syntax error. +flag, +or with +.Ic #undef +in a +.Fl f +definitions file, +it may not have any arguments since this leads to a syntax error. .Pp The .Nm @@ -158,30 +194,65 @@ It uses .Nm Fl s and .Nm cpp Fl dM -to get lists of all the controlling symbols +to get lists of all the controlling macros and their definitions (or lack thereof), then invokes .Nm with appropriate arguments to process the file. .Sh OPTIONS -.Pp .Bl -tag -width indent -compact .It Fl D Ns Ar sym Ns = Ns Ar val -Specify that a symbol is defined to a given value -which is used when evaluating -.Ic #if -and -.Ic #elif -control expressions. +Specify that a macro is defined to a given value. 
.Pp .It Fl D Ns Ar sym -Specify that a symbol is defined to the value 1. +Specify that a macro is defined to the value 1. .Pp .It Fl U Ns Ar sym -Specify that a symbol is undefined. -If the same symbol appears in more than one argument, +Specify that a macro is undefined. +.Pp +If the same macro appears in more than one argument, the last occurrence dominates. .Pp +.It Fl iD Ns Ar sym Ns Op = Ns Ar val +.It Fl iU Ns Ar sym +C strings, comments, +and line continuations +are ignored within +.Ic #ifdef +and +.Ic #ifndef +blocks +controlled by macros +specified with these options. +.Pp +.It Fl f Ar defile +The file +.Ar defile +contains +.Ic #define +and +.Ic #undef +preprocessor directives, +which have the same effect as the corresponding +.Fl D +and +.Fl U +command-line arguments. +You can have multiple +.Fl f +arguments and mix them with +.Fl D +and +.Fl U +arguments; +later options override earlier ones. +.Pp +Each directive must be on a single line. +Object-like macro definitions (without arguments) +are set to the given value. +Function-like macro definitions (with arguments) +are treated as if they are set to 1. +.Pp .It Fl b Replace removed lines with blank lines instead of deleting them. @@ -196,35 +267,39 @@ Mutually exclusive with the option. .Pp .It Fl c -If the -.Fl c -flag is specified, -then the operation of -.Nm -is complemented, -i.e., the lines that would have been removed or blanked +Complement, +i.e., lines that would have been removed or blanked are retained and vice versa. .Pp .It Fl d Turn on printing of debugging messages. .Pp .It Fl e -Because -.Nm -processes its input one line at a time, -it cannot remove preprocessor directives that span more than one line. -The most common example of this is a directive with a multi-line -comment hanging off its right hand end. By default, -if .Nm -has to process such a directive, -it will complain that the line is too obfuscated. 
+will report an error if it needs to remove +a preprocessor directive that spans more than one line, +for example, if it has a multi-line +comment hanging off its right hand end. The .Fl e -option changes the behaviour so that, -where possible, -such lines are left unprocessed instead of reporting an error. +flag makes it ignore the line instead. +.Pp +.It Fl h +Print help. +.Pp +.It Fl I Ns Ar path +Specifies to +.Nm unifdefall +an additional place to look for +.Ic #include +files. +This option is ignored by +.Nm +for compatibility with +.Xr cpp 1 +and to simplify the implementation of +.Nm unifdefall . .Pp .It Fl K Always treat the result of @@ -248,6 +323,15 @@ because they typically start and are used as a kind of comment to sketch out future or past development. It would be rude to strip them out, just as it would be for normal comments. .Pp +.It Fl m +Modify one or more input files in place. +.Pp +.It Fl M Ar backext +Modify input files in place, and keep backups of the original files by +appending the +.Ar backext +to the input filenames. +.Pp .It Fl n Add .Li #line @@ -258,96 +342,57 @@ line numbers in the input file. .It Fl o Ar outfile Write output to the file .Ar outfile -instead of the standard output. -If -.Ar outfile -is the same as the input file, -the output is written to a temporary file -which is renamed into place when -.Nm -completes successfully. +instead of the standard output when processing a single file. .Pp .It Fl s -Instead of processing the input file as usual, +Instead of processing an input file as usual, this option causes .Nm -to produce a list of symbols that appear in expressions -that -.Nm -understands. -It is useful in conjunction with the -.Fl dM -option of -.Xr cpp 1 -for creating -.Nm -command lines. +to produce a list of macros that are used in +preprocessor directive controlling expressions. .Pp .It Fl S Like the .Fl s -option, but the nesting depth of each symbol is also printed. 
+option, but the nesting depth of each macro is also printed. This is useful for working out the number of possible combinations -of interdependent defined/undefined symbols. +of interdependent defined/undefined macros. .Pp .It Fl t -Disables parsing for C comments +Disables parsing for C strings, comments, and line continuations, which is useful for plain text. -.Pp -.It Fl iD Ns Ar sym Ns Op = Ns Ar val -.It Fl iU Ns Ar sym -Ignore -.Ic #ifdef Ns s . -If your C code uses -.Ic #ifdef Ns s -to delimit non-C lines, -such as comments -or code which is under construction, -then you must tell -.Nm -which symbols are used for that purpose so that it will not try to parse -comments -and line continuations -inside those -.Ic #ifdef Ns s . -You can specify ignored symbols with -.Fl iD Ns Ar sym Ns Oo = Ns Ar val Oc -and -.Fl iU Ns Ar sym -similar to -.Fl D Ns Ar sym Ns Op = Ns Ar val +This is a blanket version of the +.Fl iD and -.Fl U Ns Ar sym -above. -.Pp -.It Fl I Ns Ar path -Specifies to -.Nm unifdefall -an additional place to look for -.Ic #include -files. -This option is ignored by -.Nm -for compatibility with -.Xr cpp 1 -and to simplify the implementation of -.Nm unifdefall . +.Fl iU +flags. .Pp .It Fl V Print version details. +.Pp +.It Fl x Bro Ar 012 Brc +Set exit status mode to zero, one, or two. +See the +.Sx EXIT STATUS +section below for details. .El .Pp The .Nm -utility copies its output to -.Em stdout -and will take its input from +utility takes its input from .Em stdin -if no +if there are no .Ar file -argument is given. +arguments. +You must use the +.Fl m +or +.Fl M +options if there are multiple input files. +You can specify input from stdin or output to stdout with +.Ql - . .Pp The .Nm @@ -356,10 +401,35 @@ utility works nicely with the option of .Xr diff 1 . .Sh EXIT STATUS -The +In normal usage the .Nm -utility exits 0 if the output is an exact copy of the input, -1 if not, and 2 if in trouble. 
+utility's exit status depends on the mode set using the +.Fl x +option. +.Pp +If the exit mode is zero (the default) then +.Nm +exits with status 0 if the output is an exact copy of the input, +or with status 1 if the output differs. +.Pp +If the exit mode is one, +.Nm +exits with status 1 if the output is unmodified +or 0 if it differs. +.Pp +If the exit mode is two, +.Nm +exits with status zero in both cases. +.Pp +In all exit modes, +.Nm +exits with status 2 if there is an error. +.Pp +The exit status is 0 if the +.Fl h +or +.Fl V +command line options are given. .Sh DIAGNOSTICS .Bl -item .It @@ -384,6 +454,9 @@ in comment. .Sh SEE ALSO .Xr cpp 1 , .Xr diff 1 +.Pp +The unifdef home page is +.Pa http://dotat.at/prog/unifdef .Sh HISTORY The .Nm @@ -401,13 +474,14 @@ rewrote it to support .Sh BUGS Expression evaluation is very limited. .Pp -Preprocessor control lines split across more than one physical line +Handling one line at a time means +preprocessor directives split across more than one physical line (because of comments or backslash-newline) cannot be handled in every situation. .Pp Trigraphs are not recognized. .Pp -There is no support for symbols with different definitions at +There is no support for macros with different definitions at different points in the source file. .Pp The text-mode and ignore functionality does not correspond to modern diff --git a/usr.bin/unifdef/unifdef.c b/usr.bin/unifdef/unifdef.c index 521b69842..fd378a14c 100644 --- a/usr.bin/unifdef/unifdef.c +++ b/usr.bin/unifdef/unifdef.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2002 - 2011 Tony Finch + * Copyright (c) 2002 - 2014 Tony Finch * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -43,21 +43,10 @@ * it possible to handle all "dodgy" directives correctly. 
*/ -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -const char copyright[] = - "@(#) $Version: unifdef-2.5.6.21f1388 $\n" +#include "unifdef.h" + +static const char copyright[] = + "@(#) $Version: unifdef-2.10 $\n" "@(#) $FreeBSD$\n" "@(#) $Author: Tony Finch (dot@dotat.at) $\n" "@(#) $URL: http://dotat.at/prog/unifdef $\n" @@ -93,6 +82,9 @@ static char const * const linetype_name[] = { "PLAIN", "EOF", "ERROR" }; +#define linetype_if2elif(lt) ((Linetype)(lt - LT_IF + LT_ELIF)) +#define linetype_2dodgy(lt) ((Linetype)(lt + LT_DODGY)) + /* state of #if processing */ typedef enum { IS_OUTSIDE, @@ -146,7 +138,7 @@ static char const * const linestate_name[] = { */ #define MAXDEPTH 64 /* maximum #if nesting */ #define MAXLINE 4096 /* maximum length of line */ -#define MAXSYMS 4096 /* maximum number of symbols */ +#define MAXSYMS 16384 /* maximum number of symbols */ /* * Sometimes when editing a keyword the replacement text is longer, so @@ -154,11 +146,6 @@ static char const * const linestate_name[] = { */ #define EDITSLOP 10 -/* - * For temporary filenames - */ -#define TEMPLATE "unifdef.XXXXXX" - /* * Globals. 
*/ @@ -167,6 +154,7 @@ static bool compblank; /* -B: compress blank lines */ static bool lnblank; /* -b: blank deleted lines */ static bool complement; /* -c: do the complement */ static bool debugging; /* -d: debugging reports */ +static bool inplace; /* -m: modify in place */ static bool iocccok; /* -e: fewer IOCCC errors */ static bool strictlogic; /* -K: keep ambiguous #ifs */ static bool killconsts; /* -k: eval constant #ifs */ @@ -183,14 +171,24 @@ static int nsyms; /* number of symbols */ static FILE *input; /* input file pointer */ static const char *filename; /* input file name */ static int linenum; /* current line number */ +static const char *linefile; /* file name for #line */ static FILE *output; /* output file pointer */ static const char *ofilename; /* output file name */ -static bool overwriting; /* output overwrites input */ -static char tempname[FILENAME_MAX]; /* used when overwriting */ +static const char *backext; /* backup extension */ +static char *tempname; /* avoid splatting input */ static char tline[MAXLINE+EDITSLOP];/* input buffer plus space */ static char *keyword; /* used for editing #elif's */ +/* + * When processing a file, the output's newline style will match the + * input's, and unifdef correctly handles CRLF or LF endings whatever + * the platform's native style. The stdio streams are opened in binary + * mode to accommodate platforms whose native newline style is CRLF. + * When the output isn't a processed input file (when it is error / + * debug / diagnostic messages) then unifdef uses native line endings. 
+ */ + static const char *newline; /* input file format */ static const char newline_unix[] = "\n"; static const char newline_crlf[] = "\r\n"; @@ -205,33 +203,47 @@ static int delcount; /* count of deleted lines */ static unsigned blankcount; /* count of blank lines */ static unsigned blankmax; /* maximum recent blankcount */ static bool constexpr; /* constant #if expression */ -static bool zerosyms = true; /* to format symdepth output */ +static bool zerosyms; /* to format symdepth output */ static bool firstsym; /* ditto */ +static int exitmode; /* exit status mode */ static int exitstat; /* program exit status */ -static void addsym(bool, bool, char *); -static void closeout(void); +static void addsym1(bool, bool, char *); +static void addsym2(bool, const char *, const char *); +static char *astrcat(const char *, const char *); +static void cleantemp(void); +static void closeio(void); static void debug(const char *, ...); +static void debugsym(const char *, int); +static bool defundef(void); +static void defundefile(const char *); static void done(void); static void error(const char *); -static int findsym(const char *); +static int findsym(const char **); static void flushline(bool); -static Linetype parseline(void); +static void hashline(void); +static void help(void); static Linetype ifeval(const char **); static void ignoreoff(void); static void ignoreon(void); +static void indirectsym(void); static void keywordedit(const char *); +static const char *matchsym(const char *, const char *); static void nest(void); +static Linetype parseline(void); static void process(void); +static void processinout(const char *, const char *); static const char *skipargs(const char *); static const char *skipcomment(const char *); +static const char *skiphash(void); +static const char *skipline(const char *); static const char *skipsym(const char *); static void state(Ifstate); -static int strlcmp(const char *, const char *, size_t); static void unnest(void); static void 
usage(void); static void version(void); +static const char *xstrdup(const char *, const char *); #define endsym(c) (!isalnum((unsigned char)c) && c != '_') @@ -243,7 +255,7 @@ main(int argc, char *argv[]) { int opt; - while ((opt = getopt(argc, argv, "i:D:U:I:o:bBcdeKklnsStV")) != -1) + while ((opt = getopt(argc, argv, "i:D:U:f:I:M:o:x:bBcdehKklmnsStV")) != -1) switch (opt) { case 'i': /* treat stuff controlled by these symbols as text */ /* @@ -253,17 +265,17 @@ main(int argc, char *argv[]) */ opt = *optarg++; if (opt == 'D') - addsym(true, true, optarg); + addsym1(true, true, optarg); else if (opt == 'U') - addsym(true, false, optarg); + addsym1(true, false, optarg); else usage(); break; case 'D': /* define a symbol */ - addsym(false, true, optarg); + addsym1(false, true, optarg); break; case 'U': /* undef a symbol */ - addsym(false, false, optarg); + addsym1(false, false, optarg); break; case 'I': /* no-op for compatibility with cpp */ break; @@ -283,12 +295,25 @@ main(int argc, char *argv[]) case 'e': /* fewer errors from dodgy lines */ iocccok = true; break; + case 'f': /* definitions file */ + defundefile(optarg); + break; + case 'h': + help(); + break; case 'K': /* keep ambiguous #ifs */ strictlogic = true; break; case 'k': /* process constant #ifs */ killconsts = true; break; + case 'm': /* modify in place */ + inplace = true; + break; + case 'M': /* modify in place and keep backup */ + inplace = true; + backext = optarg; + break; case 'n': /* add #line directive after deleted lines */ lnnum = true; break; @@ -304,8 +329,14 @@ main(int argc, char *argv[]) case 't': /* don't parse C comments */ text = true; break; - case 'V': /* print version */ + case 'V': version(); + break; + case 'x': + exitmode = atoi(optarg); + if(exitmode < 0 || exitmode > 2) + usage(); + break; default: usage(); } @@ -313,54 +344,98 @@ main(int argc, char *argv[]) argv += optind; if (compblank && lnblank) errx(2, "-B and -b are mutually exclusive"); - if (argc > 1) { - errx(2, "can 
only do one file"); - } else if (argc == 1 && strcmp(*argv, "-") != 0) { - filename = *argv; - input = fopen(filename, "rb"); - if (input == NULL) - err(2, "can't open %s", filename); - } else { - filename = "[stdin]"; - input = stdin; + if (symlist && (ofilename != NULL || inplace || argc > 1)) + errx(2, "-s only works with one input file"); + if (argc > 1 && ofilename != NULL) + errx(2, "-o cannot be used with multiple input files"); + if (argc > 1 && !inplace) + errx(2, "multiple input files require -m or -M"); + if (argc == 0) + argc = 1; + if (argc == 1 && !inplace && ofilename == NULL) + ofilename = "-"; + indirectsym(); + + atexit(cleantemp); + if (ofilename != NULL) + processinout(*argv, ofilename); + else while (argc-- > 0) { + processinout(*argv, *argv); + argv++; } - if (ofilename == NULL) { - ofilename = "[stdout]"; - output = stdout; + switch(exitmode) { + case(0): exit(exitstat); + case(1): exit(!exitstat); + case(2): exit(0); + default: abort(); /* bug */ + } +} + +/* + * File logistics. 
+ */ +static void +processinout(const char *ifn, const char *ofn) +{ + struct stat st; + + if (ifn == NULL || strcmp(ifn, "-") == 0) { + filename = "[stdin]"; + linefile = NULL; + input = fbinmode(stdin); } else { - struct stat ist, ost; - if (stat(ofilename, &ost) == 0 && - fstat(fileno(input), &ist) == 0) - overwriting = (ist.st_dev == ost.st_dev - && ist.st_ino == ost.st_ino); - if (overwriting) { - const char *dirsep; - int ofd; - - dirsep = strrchr(ofilename, '/'); - if (dirsep != NULL) - snprintf(tempname, sizeof(tempname), - "%.*s/" TEMPLATE, - (int)(dirsep - ofilename), ofilename); - else - snprintf(tempname, sizeof(tempname), - TEMPLATE); - ofd = mkstemp(tempname); - if (ofd != -1) - output = fdopen(ofd, "wb+"); - if (output == NULL) - err(2, "can't create temporary file"); - fchmod(ofd, ist.st_mode & (S_IRWXU|S_IRWXG|S_IRWXO)); - } else { - output = fopen(ofilename, "wb"); - if (output == NULL) - err(2, "can't open %s", ofilename); - } + filename = ifn; + linefile = ifn; + input = fopen(ifn, "rb"); + if (input == NULL) + err(2, "can't open %s", ifn); + } + if (strcmp(ofn, "-") == 0) { + output = fbinmode(stdout); + process(); + return; } + if (stat(ofn, &st) < 0) { + output = fopen(ofn, "wb"); + if (output == NULL) + err(2, "can't create %s", ofn); + process(); + return; + } + + tempname = astrcat(ofn, ".XXXXXX"); + output = mktempmode(tempname, st.st_mode); + if (output == NULL) + err(2, "can't create %s", tempname); + process(); - abort(); /* bug */ + + if (backext != NULL) { + char *backname = astrcat(ofn, backext); + if (rename(ofn, backname) < 0) + err(2, "can't rename \"%s\" to \"%s\"", ofn, backname); + free(backname); + } + if (replace(tempname, ofn) < 0) + err(2, "can't rename \"%s\" to \"%s\"", tempname, ofn); + free(tempname); + tempname = NULL; +} + +/* + * For cleaning up if there is an error. + */ +static void +cleantemp(void) +{ + if (tempname != NULL) + remove(tempname); } +/* + * Self-identification functions. 
+ */ + static void version(void) { @@ -375,14 +450,55 @@ version(void) } } +static void +synopsis(FILE *fp) +{ + fprintf(fp, + "usage: unifdef [-bBcdehKkmnsStV] [-x{012}] [-Mext] [-opath] \\\n" + " [-[i]Dsym[=val]] [-[i]Usym] [-fpath] ... [file] ...\n"); +} + static void usage(void) { - fprintf(stderr, "usage: unifdef [-bBcdeKknsStV] [-Ipath]" - " [-Dsym[=val]] [-Usym] [-iDsym[=val]] [-iUsym] ... [file]\n"); + synopsis(stderr); exit(2); } +static void +help(void) +{ + synopsis(stdout); + printf( + " -Dsym=val define preprocessor symbol with given value\n" + " -Dsym define preprocessor symbol with value 1\n" + " -Usym preprocessor symbol is undefined\n" + " -iDsym=val \\ ignore C strings and comments\n" + " -iDsym ) in sections controlled by these\n" + " -iUsym / preprocessor symbols\n" + " -fpath file containing #define and #undef directives\n" + " -b blank lines instead of deleting them\n" + " -B compress blank lines around deleted section\n" + " -c complement (invert) keep vs. delete\n" + " -d debugging mode\n" + " -e ignore multiline preprocessor directives\n" + " -h print help\n" + " -Ipath extra include file path (ignored)\n" + " -K disable && and || short-circuiting\n" + " -k process constant #if expressions\n" + " -Mext modify in place and keep backups\n" + " -m modify input files in place\n" + " -n add #line directives to output\n" + " -opath output file name\n" + " -S list #if control symbols with nesting\n" + " -s list #if control symbols\n" + " -t ignore C strings and comments\n" + " -V print version\n" + " -x{012} exit status mode\n" + ); + exit(0); +} + /* * A state transition function alters the global #if processing state * in a particular way. 
The table below is indexed by the current @@ -396,7 +512,7 @@ usage(void) * When we have processed a group that starts off with a known-false * #if/#elif sequence (which has therefore been deleted) followed by a * #elif that we don't understand and therefore must keep, we edit the - * latter into a #if to keep the nesting correct. We use strncpy() to + * latter into a #if to keep the nesting correct. We use memcpy() to * overwrite the 4 byte token "elif" with "if " without a '\0' byte. * * When we find a true #elif in a group, the following block will @@ -451,7 +567,7 @@ static void Idrop (void) { Fdrop(); ignoreon(); } static void Itrue (void) { Ftrue(); ignoreon(); } static void Ifalse(void) { Ffalse(); ignoreon(); } /* modify this line */ -static void Mpass (void) { strncpy(keyword, "if ", 4); Pelif(); } +static void Mpass (void) { memcpy(keyword, "if ", 4); Pelif(); } static void Mtrue (void) { keywordedit("else"); state(IS_TRUE_MIDDLE); } static void Melif (void) { keywordedit("endif"); state(IS_FALSE_TRAILER); } static void Melse (void) { keywordedit("endif"); state(IS_FALSE_ELSE); } @@ -547,9 +663,21 @@ state(Ifstate is) ifstate[depth] = is; } +/* + * The last state transition function. When this is called, + * lineval == LT_EOF, so the process() loop will terminate. + */ +static void +done(void) +{ + if (incomment) + error("EOF in comment"); + closeio(); +} + /* * Write a line to the output or not, according to command line options. - * If writing fails, closeout() will print the error and exit. + * If writing fails, closeio() will print the error and exit. */ static void flushline(bool keep) @@ -562,77 +690,75 @@ flushline(bool keep) delcount += 1; blankcount += 1; } else { - if (lnnum && delcount > 0 && - fprintf(output, "#line %d%s", linenum, newline) < 0) - closeout(); + if (lnnum && delcount > 0) + hashline(); if (fputs(tline, output) == EOF) - closeout(); + closeio(); delcount = 0; blankmax = blankcount = blankline ? 
blankcount + 1 : 0; } } else { if (lnblank && fputs(newline, output) == EOF) - closeout(); + closeio(); exitstat = 1; delcount += 1; blankcount = 0; } if (debugging && fflush(output) == EOF) - closeout(); + closeio(); } /* - * The driver for the state machine. + * Format of #line directives depends on whether we know the input filename. */ static void -process(void) +hashline(void) { - /* When compressing blank lines, act as if the file - is preceded by a large number of blank lines. */ - blankmax = blankcount = 1000; - for (;;) { - Linetype lineval = parseline(); - trans_table[ifstate[depth]][lineval](); - debug("process line %d %s -> %s depth %d", - linenum, linetype_name[lineval], - ifstate_name[ifstate[depth]], depth); - } + int e; + + if (linefile == NULL) + e = fprintf(output, "#line %d%s", linenum, newline); + else + e = fprintf(output, "#line %d \"%s\"%s", + linenum, linefile, newline); + if (e < 0) + closeio(); } /* * Flush the output and handle errors. */ static void -closeout(void) +closeio(void) { + /* Tidy up after findsym(). */ if (symdepth && !zerosyms) printf("\n"); - if (ferror(output) || fclose(output) == EOF) { - if (overwriting) { - warn("couldn't write to temporary file"); - unlink(tempname); - errx(2, "%s unchanged", ofilename); - } else { - err(2, "couldn't write to %s", ofilename); - } - } + if (output != NULL && (ferror(output) || fclose(output) == EOF)) + err(2, "%s: can't write to output", filename); + fclose(input); } /* - * Clean up and exit. + * The driver for the state machine. */ static void -done(void) +process(void) { - if (incomment) - error("EOF in comment"); - closeout(); - if (overwriting && rename(tempname, ofilename) == -1) { - warn("couldn't rename temporary file"); - unlink(tempname); - errx(2, "%s unchanged", ofilename); + Linetype lineval = LT_PLAIN; + /* When compressing blank lines, act as if the file + is preceded by a large number of blank lines. 
*/ + blankmax = blankcount = 1000; + zerosyms = true; + newline = NULL; + linenum = 0; + while (lineval != LT_EOF) { + lineval = parseline(); + trans_table[ifstate[depth]][lineval](); + debug("process line %d %s -> %s depth %d", + linenum, linetype_name[lineval], + ifstate_name[ifstate[depth]], depth); } - exit(exitstat); } /* @@ -645,105 +771,86 @@ parseline(void) { const char *cp; int cursym; - int kwlen; Linetype retval; Comment_state wascomment; - linenum++; - if (fgets(tline, MAXLINE, input) == NULL) { - if (ferror(input)) - error(strerror(errno)); - else - return (LT_EOF); - } + wascomment = incomment; + cp = skiphash(); + if (cp == NULL) + return (LT_EOF); if (newline == NULL) { if (strrchr(tline, '\n') == strrchr(tline, '\r') + 1) newline = newline_crlf; else newline = newline_unix; } - retval = LT_PLAIN; - wascomment = incomment; - cp = skipcomment(tline); - if (linestate == LS_START) { - if (*cp == '#') { - linestate = LS_HASH; - firstsym = true; - cp = skipcomment(cp + 1); - } else if (*cp != '\0') - linestate = LS_DIRTY; + if (*cp == '\0') { + retval = LT_PLAIN; + goto done; } - if (!incomment && linestate == LS_HASH) { - keyword = tline + (cp - tline); - cp = skipsym(cp); - kwlen = cp - keyword; + keyword = tline + (cp - tline); + if ((cp = matchsym("ifdef", keyword)) != NULL || + (cp = matchsym("ifndef", keyword)) != NULL) { + cp = skipcomment(cp); + if ((cursym = findsym(&cp)) < 0) + retval = LT_IF; + else { + retval = (keyword[2] == 'n') + ? LT_FALSE : LT_TRUE; + if (value[cursym] == NULL) + retval = (retval == LT_TRUE) + ? LT_FALSE : LT_TRUE; + if (ignore[cursym]) + retval = (retval == LT_TRUE) + ? 
LT_TRUEI : LT_FALSEI; + } + } else if ((cp = matchsym("if", keyword)) != NULL) + retval = ifeval(&cp); + else if ((cp = matchsym("elif", keyword)) != NULL) + retval = linetype_if2elif(ifeval(&cp)); + else if ((cp = matchsym("else", keyword)) != NULL) + retval = LT_ELSE; + else if ((cp = matchsym("endif", keyword)) != NULL) + retval = LT_ENDIF; + else { + cp = skipsym(keyword); /* no way can we deal with a continuation inside a keyword */ if (strncmp(cp, "\\\r\n", 3) == 0 || strncmp(cp, "\\\n", 2) == 0) Eioccc(); - if (strlcmp("ifdef", keyword, kwlen) == 0 || - strlcmp("ifndef", keyword, kwlen) == 0) { - cp = skipcomment(cp); - if ((cursym = findsym(cp)) < 0) - retval = LT_IF; - else { - retval = (keyword[2] == 'n') - ? LT_FALSE : LT_TRUE; - if (value[cursym] == NULL) - retval = (retval == LT_TRUE) - ? LT_FALSE : LT_TRUE; - if (ignore[cursym]) - retval = (retval == LT_TRUE) - ? LT_TRUEI : LT_FALSEI; - } - cp = skipsym(cp); - } else if (strlcmp("if", keyword, kwlen) == 0) - retval = ifeval(&cp); - else if (strlcmp("elif", keyword, kwlen) == 0) - retval = ifeval(&cp) - LT_IF + LT_ELIF; - else if (strlcmp("else", keyword, kwlen) == 0) - retval = LT_ELSE; - else if (strlcmp("endif", keyword, kwlen) == 0) - retval = LT_ENDIF; - else { - linestate = LS_DIRTY; - retval = LT_PLAIN; - } - cp = skipcomment(cp); - if (*cp != '\0') { + cp = skipline(cp); + retval = LT_PLAIN; + goto done; + } + cp = skipcomment(cp); + if (*cp != '\0') { + cp = skipline(cp); + if (retval == LT_TRUE || retval == LT_FALSE || + retval == LT_TRUEI || retval == LT_FALSEI) + retval = LT_IF; + if (retval == LT_ELTRUE || retval == LT_ELFALSE) + retval = LT_ELIF; + } + /* the following can happen if the last line of the file lacks a + newline or if there is too much whitespace in a directive */ + if (linestate == LS_HASH) { + long len = cp - tline; + if (fgets(tline + len, MAXLINE - len, input) == NULL) { + if (ferror(input)) + err(2, "can't read %s", filename); + /* append the missing newline at eof */ + 
strcpy(tline + len, newline); + cp += strlen(newline); + linestate = LS_START; + } else { linestate = LS_DIRTY; - if (retval == LT_TRUE || retval == LT_FALSE || - retval == LT_TRUEI || retval == LT_FALSEI) - retval = LT_IF; - if (retval == LT_ELTRUE || retval == LT_ELFALSE) - retval = LT_ELIF; - } - if (retval != LT_PLAIN && (wascomment || incomment)) { - retval += LT_DODGY; - if (incomment) - linestate = LS_DIRTY; - } - /* skipcomment normally changes the state, except - if the last line of the file lacks a newline, or - if there is too much whitespace in a directive */ - if (linestate == LS_HASH) { - size_t len = cp - tline; - if (fgets(tline + len, MAXLINE - len, input) == NULL) { - if (ferror(input)) - error(strerror(errno)); - /* append the missing newline at eof */ - strcpy(tline + len, newline); - cp += strlen(newline); - linestate = LS_START; - } else { - linestate = LS_DIRTY; - } } } - if (linestate == LS_DIRTY) { - while (*cp != '\0') - cp = skipcomment(cp + 1); + if (retval != LT_PLAIN && (wascomment || linestate != LS_START)) { + retval = linetype_2dodgy(retval); + linestate = LS_DIRTY; } +done: debug("parser line %d state %s comment %s line", linenum, comment_name[incomment], linestate_name[linestate]); return (retval); @@ -753,34 +860,34 @@ parseline(void) * These are the binary operators that are supported by the expression * evaluator. */ -static Linetype op_strict(int *p, int v, Linetype at, Linetype bt) { +static Linetype op_strict(long *p, long v, Linetype at, Linetype bt) { if(at == LT_IF || bt == LT_IF) return (LT_IF); return (*p = v, v ? 
LT_TRUE : LT_FALSE); } -static Linetype op_lt(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_lt(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a < b, at, bt); } -static Linetype op_gt(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_gt(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a > b, at, bt); } -static Linetype op_le(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_le(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a <= b, at, bt); } -static Linetype op_ge(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_ge(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a >= b, at, bt); } -static Linetype op_eq(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_eq(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a == b, at, bt); } -static Linetype op_ne(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_ne(long *p, Linetype at, long a, Linetype bt, long b) { return op_strict(p, a != b, at, bt); } -static Linetype op_or(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_or(long *p, Linetype at, long a, Linetype bt, long b) { if (!strictlogic && (at == LT_TRUE || bt == LT_TRUE)) return (*p = 1, LT_TRUE); return op_strict(p, a || b, at, bt); } -static Linetype op_and(int *p, Linetype at, int a, Linetype bt, int b) { +static Linetype op_and(long *p, Linetype at, long a, Linetype bt, long b) { if (!strictlogic && (at == LT_FALSE || bt == LT_FALSE)) return (*p = 0, LT_FALSE); return op_strict(p, a && b, at, bt); @@ -798,7 +905,7 @@ static Linetype op_and(int *p, Linetype at, int a, Linetype bt, int b) { */ struct ops; -typedef Linetype eval_fn(const struct ops *, int *, const char **); +typedef Linetype eval_fn(const struct ops *, long *, const char **); static eval_fn eval_table, eval_unary; @@ -809,13 +916,15 @@ 
static eval_fn eval_table, eval_unary; * element of the table. Innermost expressions have special non-table-driven * handling. */ -static const struct ops { +struct op { + const char *str; + Linetype (*fn)(long *, Linetype, long, Linetype, long); +}; +struct ops { eval_fn *inner; - struct op { - const char *str; - Linetype (*fn)(int *, Linetype, int, Linetype, int); - } op[5]; -} eval_ops[] = { + struct op op[5]; +}; +static const struct ops eval_ops[] = { { eval_table, { { "||", op_or } } }, { eval_table, { { "&&", op_and } } }, { eval_table, { { "==", op_eq }, @@ -826,13 +935,19 @@ static const struct ops { { ">", op_gt } } } }; +/* Current operator precedence level */ +static long prec(const struct ops *ops) +{ + return (ops - eval_ops); +} + /* * Function for evaluating the innermost parts of expressions, * viz. !expr (expr) number defined(symbol) symbol * We reset the constexpr flag in the last two cases. */ static Linetype -eval_unary(const struct ops *ops, int *valp, const char **cpp) +eval_unary(const struct ops *ops, long *valp, const char **cpp) { const char *cp; char *ep; @@ -842,7 +957,7 @@ eval_unary(const struct ops *ops, int *valp, const char **cpp) cp = skipcomment(*cpp); if (*cp == '!') { - debug("eval%d !", ops - eval_ops); + debug("eval%d !", prec(ops)); cp++; lt = eval_unary(ops, valp, &cp); if (lt == LT_ERROR) @@ -853,7 +968,7 @@ eval_unary(const struct ops *ops, int *valp, const char **cpp) } } else if (*cp == '(') { cp++; - debug("eval%d (", ops - eval_ops); + debug("eval%d (", prec(ops)); lt = eval_table(eval_ops, valp, &cp); if (lt == LT_ERROR) return (LT_ERROR); @@ -861,37 +976,38 @@ eval_unary(const struct ops *ops, int *valp, const char **cpp) if (*cp++ != ')') return (LT_ERROR); } else if (isdigit((unsigned char)*cp)) { - debug("eval%d number", ops - eval_ops); + debug("eval%d number", prec(ops)); *valp = strtol(cp, &ep, 0); if (ep == cp) return (LT_ERROR); lt = *valp ? 
LT_TRUE : LT_FALSE; - cp = skipsym(cp); - } else if (strncmp(cp, "defined", 7) == 0 && endsym(cp[7])) { + cp = ep; + } else if (matchsym("defined", cp) != NULL) { cp = skipcomment(cp+7); - debug("eval%d defined", ops - eval_ops); if (*cp == '(') { cp = skipcomment(cp+1); defparen = true; } else { defparen = false; } - sym = findsym(cp); + sym = findsym(&cp); + cp = skipcomment(cp); + if (defparen && *cp++ != ')') { + debug("eval%d defined missing ')'", prec(ops)); + return (LT_ERROR); + } if (sym < 0) { + debug("eval%d defined unknown", prec(ops)); lt = LT_IF; } else { + debug("eval%d defined %s", prec(ops), symname[sym]); *valp = (value[sym] != NULL); lt = *valp ? LT_TRUE : LT_FALSE; } - cp = skipsym(cp); - cp = skipcomment(cp); - if (defparen && *cp++ != ')') - return (LT_ERROR); constexpr = false; } else if (!endsym(*cp)) { - debug("eval%d symbol", ops - eval_ops); - sym = findsym(cp); - cp = skipsym(cp); + debug("eval%d symbol", prec(ops)); + sym = findsym(&cp); if (sym < 0) { lt = LT_IF; cp = skipargs(cp); @@ -907,12 +1023,12 @@ eval_unary(const struct ops *ops, int *valp, const char **cpp) } constexpr = false; } else { - debug("eval%d bad expr", ops - eval_ops); + debug("eval%d bad expr", prec(ops)); return (LT_ERROR); } *cpp = cp; - debug("eval%d = %d", ops - eval_ops, *valp); + debug("eval%d = %d", prec(ops), *valp); return (lt); } @@ -920,14 +1036,14 @@ eval_unary(const struct ops *ops, int *valp, const char **cpp) * Table-driven evaluation of binary operators. 
*/ static Linetype -eval_table(const struct ops *ops, int *valp, const char **cpp) +eval_table(const struct ops *ops, long *valp, const char **cpp) { const struct op *op; const char *cp; - int val; + long val; Linetype lt, rt; - debug("eval%d", ops - eval_ops); + debug("eval%d", prec(ops)); cp = *cpp; lt = ops->inner(ops+1, valp, &cp); if (lt == LT_ERROR) @@ -940,7 +1056,7 @@ eval_table(const struct ops *ops, int *valp, const char **cpp) if (op->str == NULL) break; cp += strlen(op->str); - debug("eval%d %s", ops - eval_ops, op->str); + debug("eval%d %s", prec(ops), op->str); rt = ops->inner(ops+1, &val, &cp); if (rt == LT_ERROR) return (LT_ERROR); @@ -948,8 +1064,8 @@ eval_table(const struct ops *ops, int *valp, const char **cpp) } *cpp = cp; - debug("eval%d = %d", ops - eval_ops, *valp); - debug("eval%d lt = %s", ops - eval_ops, linetype_name[lt]); + debug("eval%d = %d", prec(ops), *valp); + debug("eval%d lt = %s", prec(ops), linetype_name[lt]); return (lt); } @@ -961,8 +1077,8 @@ eval_table(const struct ops *ops, int *valp, const char **cpp) static Linetype ifeval(const char **cpp) { - int ret; - int val = 0; + Linetype ret; + long val = 0; debug("eval %s", *cpp); constexpr = killconsts ? false : true; @@ -971,6 +1087,49 @@ ifeval(const char **cpp) return (constexpr ? LT_IF : ret == LT_ERROR ? LT_IF : ret); } +/* + * Read a line and examine its initial part to determine if it is a + * preprocessor directive. Returns NULL on EOF, or a pointer to a + * preprocessor directive name, or a pointer to the zero byte at the + * end of the line. 
+ */ +static const char * +skiphash(void) +{ + const char *cp; + + linenum++; + if (fgets(tline, MAXLINE, input) == NULL) { + if (ferror(input)) + err(2, "can't read %s", filename); + else + return (NULL); + } + cp = skipcomment(tline); + if (linestate == LS_START && *cp == '#') { + linestate = LS_HASH; + return (skipcomment(cp + 1)); + } else if (*cp == '\0') { + return (cp); + } else { + return (skipline(cp)); + } +} + +/* + * Mark a line dirty and consume the rest of it, keeping track of the + * lexical state. + */ +static const char * +skipline(const char *cp) +{ + if (*cp != '\0') + linestate = LS_DIRTY; + while (*cp != '\0') + cp = skipcomment(cp + 1); + return (cp); +} + /* * Skip over comments, strings, and character literals and stop at the * next character position that is not whitespace. Between calls we keep @@ -1123,88 +1282,265 @@ skipsym(const char *cp) return (cp); } +/* + * Skip whitespace and take a copy of any following identifier. + */ +static const char * +getsym(const char **cpp) +{ + const char *cp = *cpp, *sym; + + cp = skipcomment(cp); + cp = skipsym(sym = cp); + if (cp == sym) + return NULL; + *cpp = cp; + return (xstrdup(sym, cp)); +} + +/* + * Check that s (a symbol) matches the start of t, and that the + * following character in t is not a symbol character. Returns a + * pointer to the following character in t if there is a match, + * otherwise NULL. + */ +static const char * +matchsym(const char *s, const char *t) +{ + while (*s != '\0' && *t != '\0') + if (*s != *t) + return (NULL); + else + ++s, ++t; + if (*s == '\0' && endsym(*t)) + return(t); + else + return(NULL); +} + /* * Look for the symbol in the symbol table. If it is found, we return * the symbol table index, else we return -1. 
*/ static int -findsym(const char *str) +findsym(const char **strp) { - const char *cp; + const char *str; int symind; - cp = skipsym(str); - if (cp == str) - return (-1); + str = *strp; + *strp = skipsym(str); if (symlist) { + if (*strp == str) + return (-1); if (symdepth && firstsym) printf("%s%3d", zerosyms ? "" : "\n", depth); firstsym = zerosyms = false; printf("%s%.*s%s", - symdepth ? " " : "", - (int)(cp-str), str, - symdepth ? "" : "\n"); + symdepth ? " " : "", + (int)(*strp-str), str, + symdepth ? "" : "\n"); /* we don't care about the value of the symbol */ return (0); } for (symind = 0; symind < nsyms; ++symind) { - if (strlcmp(symname[symind], str, cp-str) == 0) { - debug("findsym %s %s", symname[symind], - value[symind] ? value[symind] : ""); + if (matchsym(symname[symind], str) != NULL) { + debugsym("findsym", symind); return (symind); } } return (-1); } +/* + * Resolve indirect symbol values to their final definitions. + */ +static void +indirectsym(void) +{ + const char *cp; + int changed, sym, ind; + + do { + changed = 0; + for (sym = 0; sym < nsyms; ++sym) { + if (value[sym] == NULL) + continue; + cp = value[sym]; + ind = findsym(&cp); + if (ind == -1 || ind == sym || + *cp != '\0' || + value[ind] == NULL || + value[ind] == value[sym]) + continue; + debugsym("indir...", sym); + value[sym] = value[ind]; + debugsym("...ectsym", sym); + changed++; + } + } while (changed); +} + +/* + * Add a symbol to the symbol table, specified with the format sym=val + */ +static void +addsym1(bool ignorethis, bool definethis, char *symval) +{ + const char *sym, *val; + + sym = symval; + val = skipsym(sym); + if (definethis && *val == '=') { + symval[val - sym] = '\0'; + val = val + 1; + } else if (*val == '\0') { + val = definethis ? "1" : NULL; + } else { + usage(); + } + addsym2(ignorethis, sym, val); +} + /* * Add a symbol to the symbol table. 
*/ static void -addsym(bool ignorethis, bool definethis, char *sym) +addsym2(bool ignorethis, const char *sym, const char *val) { + const char *cp = sym; int symind; - char *val; - symind = findsym(sym); + symind = findsym(&cp); if (symind < 0) { if (nsyms >= MAXSYMS) errx(2, "too many symbols"); symind = nsyms++; } - symname[symind] = sym; ignore[symind] = ignorethis; - val = sym + (skipsym(sym) - sym); - if (definethis) { - if (*val == '=') { - value[symind] = val+1; - *val = '\0'; - } else if (*val == '\0') - value[symind] = "1"; - else - usage(); + symname[symind] = sym; + value[symind] = val; + debugsym("addsym", symind); +} + +static void +debugsym(const char *why, int symind) +{ + debug("%s %s%c%s", why, symname[symind], + value[symind] ? '=' : ' ', + value[symind] ? value[symind] : "undef"); +} + +/* + * Add symbols to the symbol table from a file containing + * #define and #undef preprocessor directives. + */ +static void +defundefile(const char *fn) +{ + filename = fn; + input = fopen(fn, "rb"); + if (input == NULL) + err(2, "can't open %s", fn); + linenum = 0; + while (defundef()) + ; + if (ferror(input)) + err(2, "can't read %s", filename); + else + fclose(input); + if (incomment) + error("EOF in comment"); +} + +/* + * Read and process one #define or #undef directive + */ +static bool +defundef(void) +{ + const char *cp, *kw, *sym, *val, *end; + + cp = skiphash(); + if (cp == NULL) + return (false); + if (*cp == '\0') + goto done; + /* strip trailing whitespace, and do a fairly rough check to + avoid unsupported multi-line preprocessor directives */ + end = cp + strlen(cp); + while (end > tline && strchr(" \t\n\r", end[-1]) != NULL) + --end; + if (end > tline && end[-1] == '\\') + Eioccc(); + + kw = cp; + if ((cp = matchsym("define", kw)) != NULL) { + sym = getsym(&cp); + if (sym == NULL) + error("missing macro name in #define"); + if (*cp == '(') { + val = "1"; + } else { + cp = skipcomment(cp); + val = (cp < end) ? 
xstrdup(cp, end) : ""; + } + debug("#define"); + addsym2(false, sym, val); + } else if ((cp = matchsym("undef", kw)) != NULL) { + sym = getsym(&cp); + if (sym == NULL) + error("missing macro name in #undef"); + cp = skipcomment(cp); + debug("#undef"); + addsym2(false, sym, NULL); } else { - if (*val != '\0') - usage(); - value[symind] = NULL; + error("unrecognized preprocessor directive"); } - debug("addsym %s=%s", symname[symind], - value[symind] ? value[symind] : "undef"); + skipline(cp); +done: + debug("parser line %d state %s comment %s line", linenum, + comment_name[incomment], linestate_name[linestate]); + return (true); } /* - * Compare s with n characters of t. - * The same as strncmp() except that it checks that s[n] == '\0'. + * Concatenate two strings into new memory, checking for failure. */ -static int -strlcmp(const char *s, const char *t, size_t n) +static char * +astrcat(const char *s1, const char *s2) { - while (n-- && *t != '\0') - if (*s != *t) - return ((unsigned char)*s - (unsigned char)*t); - else - ++s, ++t; - return ((unsigned char)*s); + char *s; + int len; + size_t size; + + len = snprintf(NULL, 0, "%s%s", s1, s2); + if (len < 0) + err(2, "snprintf"); + size = (size_t)len + 1; + s = (char *)malloc(size); + if (s == NULL) + err(2, "malloc"); + snprintf(s, size, "%s%s", s1, s2); + return (s); +} + +/* + * Duplicate a segment of a string, checking for failure. 
+ */ +static const char * +xstrdup(const char *start, const char *end) +{ + size_t n; + char *s; + + if (end < start) abort(); /* bug */ + n = (size_t)(end - start) + 1; + s = malloc(n); + if (s == NULL) + err(2, "malloc"); + snprintf(s, n, "%s", start); + return (s); } /* @@ -1230,6 +1566,6 @@ error(const char *msg) else warnx("%s: %d: %s (#if line %d depth %d)", filename, linenum, msg, stifline[depth], depth); - closeout(); + closeio(); errx(2, "output may be truncated"); } diff --git a/usr.bin/unifdef/unifdef.h b/usr.bin/unifdef/unifdef.h new file mode 100644 index 000000000..e2e0bd8f3 --- /dev/null +++ b/usr.bin/unifdef/unifdef.h @@ -0,0 +1,52 @@ +/* + * Copyright (c) 2012 - 2013 Tony Finch + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +/* portability stubs */ + +#define fbinmode(fp) (fp) + +#define replace(old,new) rename(old,new) + +static FILE * +mktempmode(char *tmp, int mode) +{ + int fd = mkstemp(tmp); + if (fd < 0) return (NULL); + fchmod(fd, mode & (S_IRWXU|S_IRWXG|S_IRWXO)); + return (fdopen(fd, "wb")); +} diff --git a/usr.bin/unifdef/unifdefall.sh b/usr.bin/unifdef/unifdefall.sh index c9a04cca7..6abf60b6e 100644 --- a/usr.bin/unifdef/unifdefall.sh +++ b/usr.bin/unifdef/unifdefall.sh @@ -2,7 +2,7 @@ # # unifdefall: remove all the #if's from a source file # -# Copyright (c) 2002 - 2010 Tony Finch +# Copyright (c) 2002 - 2013 Tony Finch # Copyright (c) 2009 - 2010 Jonathan Nieder # # Redistribution and use in source and binary forms, with or without @@ -42,38 +42,19 @@ case "$@" in shift esac -basename=$(basename "$0") -tmp=$(mktemp -d "${TMPDIR:-/tmp}/$basename.XXXXXXXXXX") || exit 2 +tmp=$(mktemp -d "${TMPDIR:-/tmp}/${0##*/}.XXXXXXXXXX") || exit 2 trap 'rm -r "$tmp" || exit 2' EXIT export LC_ALL=C -# list of all controlling macros -"$unifdef" $debug -s "$@" | sort | uniq >"$tmp/ctrl" +# list of all controlling macros; assume these are undefined +"$unifdef" $debug -s "$@" | sort -u | sed 's/^/#undef /' >"$tmp/undefs" # list of all macro definitions -cpp -dM "$@" | sort | sed 's/^#define //' >"$tmp/hashdefs" -# list of defined macro names -sed
's/[^A-Za-z0-9_].*$//' <"$tmp/hashdefs" >"$tmp/alldef" -# list of undefined and defined controlling macros -comm -23 "$tmp/ctrl" "$tmp/alldef" >"$tmp/undef" -comm -12 "$tmp/ctrl" "$tmp/alldef" >"$tmp/def" -# create a sed script that extracts the controlling macro definitions -# and converts them to unifdef command-line arguments -sed 's|.*|s/^&\\(([^)]*)\\)\\{0,1\\} /-D&=/p|' <"$tmp/def" >"$tmp/script" -# create the final unifdef command -{ echo "$unifdef" $debug -k '\' - # convert the controlling undefined macros to -U arguments - sed 's/.*/-U& \\/' <"$tmp/undef" - # convert the controlling defined macros to quoted -D arguments - sed -nf "$tmp/script" <"$tmp/hashdefs" | - sed "s/'/'\\\\''/g;s/.*/'&' \\\\/" - echo '"$@"' -} >"$tmp/cmd" +cc -E -dM "$@" | sort >"$tmp/defs" + case $debug in --d) for i in ctrl hashdefs alldef undef def script cmd - do echo ==== $i - cat "$tmp/$i" - done 1>&2 +-d) cat "$tmp/undefs" "$tmp/defs" 1>&2 esac -# run the command we just created -sh "$tmp/cmd" "$@" + +# order of -f arguments means definitions override undefs +"$unifdef" $debug -k -f "$tmp/undefs" -f "$tmp/defs" "$@" diff --git a/usr.bin/vgrind/vgrindefs.5 b/usr.bin/vgrind/vgrindefs.5 index 3986dca84..ede022a42 100644 --- a/usr.bin/vgrind/vgrindefs.5 +++ b/usr.bin/vgrind/vgrindefs.5 @@ -48,7 +48,6 @@ very similar to .Xr termcap 5 . .Sh FIELDS The following table names and describes each field. -.Pp .Bl -column Namexxx Tpexxx .It Sy "Name Type Description .It "ab str regular expression for the start of an alternate comment" -- 2.45.0
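
Note on matchsym(), which this patch introduces in place of strlcmp():
the new helper only reports a match when the keyword is followed by a
character that cannot continue an identifier, so "if" does not match the
start of "ifdef". The standalone sketch below reproduces that behaviour
for illustration. It is not part of the patch. The names sketch_endsym
and sketch_matchsym are hypothetical, and because endsym() is not shown
in these hunks, the stand-in here simply assumes that any character
other than a letter, digit or underscore ends a symbol.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Assumed stand-in for endsym(): true when c cannot be part of a
 * C identifier. The real definition is outside the hunks shown.
 */
int
sketch_endsym(int c)
{
	return (!isalnum((unsigned char)c) && c != '_');
}

/*
 * Same loop shape as matchsym() in the patch: s must match the start
 * of t, and the next character of t must end the symbol. Returns a
 * pointer just past the match in t, or NULL when there is no match.
 */
const char *
sketch_matchsym(const char *s, const char *t)
{
	while (*s != '\0' && *t != '\0') {
		if (*s != *t)
			return (NULL);
		++s, ++t;
	}
	if (*s == '\0' && sketch_endsym((unsigned char)*t))
		return (t);
	return (NULL);
}
```

With these definitions, sketch_matchsym("ifdef", "ifdef FOO") returns a
pointer to the space before FOO, while sketch_matchsym("if", "ifdef FOO")
returns NULL. Under that assumption the order in which parseline() tries
the keywords should not affect which one matches.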