- Action
- A series of awk statements attached to a rule.  If the rule's
pattern matches an input record, awk executes the
rule's action.  Actions are always enclosed in curly braces. 
(See Actions.)
 
- Amazing awk Assembler
- Henry Spencer at the University of Toronto wrote a retargetable assembler
completely as sed and awk scripts.  It is thousands
of lines long, including machine descriptions for several eight-bit
microcomputers.  It is a good example of a program that would have been
better written in another language. 
You can get it from ftp://ftp.freefriends.org/arnold/Awkstuff/aaa.tgz.
 
- Amazingly Workable Formatter (awf)
- Henry Spencer at the University of Toronto wrote a formatter that accepts
a large subset of the nroff -ms and nroff -man formatting
commands, using awk and sh. 
It is available over the Internet
from ftp://ftp.freefriends.org/arnold/Awkstuff/awf.tgz.
 
- Anchor
- The regexp metacharacters ^ and $, which force the match
to the beginning or end of the string, respectively.
 
- ANSI
- The American National Standards Institute.  This organization produces
many standards, among them the standards for the C and C++ programming
languages. 
These standards often become international standards as well. See also
"ISO."
 
- Array
- A grouping of multiple values under the same name. 
Most languages just provide sequential arrays. 
awk provides associative arrays.
 
- Assertion
- A statement in a program that a condition is true at this point in the program. 
Useful for reasoning about how a program is supposed to behave.
 
- Assignment
- An awk expression that changes the value of some awk variable or data object.  An object that you can assign to is called an
lvalue.  The assigned values are called rvalues. 
See Assignment Expressions.
 
- Associative Array
- Arrays in which the indices may be numbers or strings, not just
sequential integers in a fixed range.
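
As a rough illustration, the following shell snippet runs a short awk program that indexes an array by strings rather than by sequential integers. The sample words and the names count and tally are invented for this sketch.

```sh
# Tally words with an array whose indices are strings.
# The input words and the array name "count" are hypothetical.
tally=$(printf 'apple\npear\napple\n' |
  awk '{ count[$1]++ } END { print count["apple"], count["pear"] }')
echo "$tally"
```

The elements are looked up by name in the END rule because the order in which a for (index in array) loop visits indices is not specified.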
 
- awk Language
- The language in which awk programs are written.
 
- awk Program
- An awk program consists of a series of patterns and
actions, collectively known as rules.  For each input record
given to the program, the program's rules are all processed in turn.
awk programs may also contain function definitions.
 
- awk Script
- Another name for an awk program.
 
- Bash
- The GNU version of the standard shell
(the Bourne-Again SHell). 
See also "Bourne Shell."
 
- BBS
- See "Bulletin Board System."
 
- Bit
- Short for "Binary Digit." 
All values in computer memory ultimately reduce to binary digits: values
that are either zero or one. 
Groups of bits may be interpreted differently--as integers,
floating-point numbers, character data, addresses of other
memory objects, or other data. 
awk lets you work with floating-point numbers and strings.
gawk lets you manipulate bit values with the built-in
functions described in
Using gawk's Bit Manipulation Functions.
Computers are often defined by how many bits they use to represent integer
values.  Typical systems are 32-bit systems, but 64-bit systems are
becoming increasingly popular, and 16-bit systems are waning in
popularity.
 
 
- Boolean Expression
- Named after the English mathematician Boole. See also "Logical Expression."
 
- Bourne Shell
- The standard shell (/bin/sh) on Unix and Unix-like systems,
originally written by Stephen R. Bourne. 
Many shells (bash, ksh, pdksh, zsh) are
generally upwardly compatible with the Bourne shell.
 
- Built-in Function
- The awk language provides built-in functions that perform various
numerical, I/O-related, and string computations.  Examples are sqrt (for the square root of a number) and substr (for a
substring of a string).  gawk provides functions for timestamp management, bit manipulation,
and runtime string translation. 
(See Built-in Functions.)
 
- Built-in Variable
- ARGC, ARGV, CONVFMT, ENVIRON, FILENAME, FNR, FS, NF, NR, OFMT, OFS, ORS, RLENGTH, RSTART, RS,
and SUBSEP are the variables that have special meaning to awk. 
In addition, ARGIND, BINMODE, ERRNO, FIELDWIDTHS, IGNORECASE, LINT, PROCINFO, RT,
and TEXTDOMAIN are the variables that have special meaning to gawk. 
Changing some of them affects awk's running environment. 
(See Built-in Variables.)
 
- Braces
- See "Curly Braces."
 
- Bulletin Board System
- A computer system allowing users to log in and read and/or leave messages
for other users of the system, much like leaving paper notes on a bulletin
board.
 
- C
- The system programming language that most GNU software is written in.  The
awk programming language has C-like syntax, and this Web page
points out similarities between awk and C when appropriate.
In general, gawk attempts to be as similar to the 1990 version
of ISO C as makes sense.  Future versions of gawk may adopt features
from the newer 1999 standard, as appropriate.
 
 
- C++
- A popular object-oriented programming language derived from C.
 
- Character Set
- The set of numeric codes used by a computer system to represent the
characters (letters, numbers, punctuation, etc.) of a particular country
or place. The most common character set in use today is ASCII (American
Standard Code for Information Interchange).  Many European
countries use an extension of ASCII known as ISO-8859-1 (ISO Latin-1).
 
- CHEM
- A preprocessor for pic that reads descriptions of molecules
and produces pic input for drawing them. 
It was written in awk by Brian Kernighan and Jon Bentley, and is available from
http://cm.bell-labs.com/netlib/typesetting/chem.gz.
 
- Coprocess
- A subordinate program with which two-way communication is possible.
 
- Compiler
- A program that translates human-readable source code into
machine-executable object code.  The object code is then executed
directly by the computer. 
See also "Interpreter."
 
- Compound Statement
- A series of awk statements, enclosed in curly braces.  Compound
statements may be nested. 
(See Control Statements in Actions.)
 
- Concatenation
- Concatenating two strings means sticking them together, one after another,
producing a new string.  For example, the string foo concatenated with
the string bar gives the string foobar. 
(See String Concatenation.)
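
A minimal sketch of the foo and bar example, run through awk from the shell. The variable names s and joined are made up for illustration. In awk, concatenation has no operator symbol: the two strings are simply written next to each other.

```sh
# Concatenate two string constants by writing them side by side.
joined=$(awk 'BEGIN { s = "foo" "bar"; print s }')
echo "$joined"
```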
 
- Conditional Expression
- An expression using the ?: ternary operator, such as expr1 ? expr2 : expr3.  The expression
expr1 is evaluated; if the result is true, the value of the whole
expression is the value of expr2; otherwise the value is
expr3.  In either case, only one of expr2 and expr3
is evaluated. (See Conditional Expressions.)
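
The point that only one of expr2 and expr3 is evaluated can be checked with a small sketch. The helper function mark and the variables v and seen are hypothetical names invented here: mark records each call it receives, so the record shows which branch actually ran.

```sh
# mark() appends its argument to "seen", so "seen" lists every call made.
result=$(awk 'function mark(label) { seen = seen label; return label }
BEGIN {
    v = (2 > 1) ? mark("A") : mark("B")   # the condition is true
    print v, seen                         # only mark("A") was called
}')
echo "$result"
```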
 
- Comparison Expression
- A relation that is either true or false, such as (a < b). 
Comparison expressions are used in if, while, do,
and for statements, and in patterns to select which input records to process. 
(See Variable Typing and Comparison Expressions.)
 
- Curly Braces
- The characters { and }.  Curly braces are used in awk for delimiting actions, compound statements, and function
bodies.
 
- Dark Corner
- An area in the language where specifications often were (or still
are) not clear, leading to unexpected or undesirable behavior. 
Such areas are marked in this Web page with
"(d.c.)" in the text
and are indexed under the heading "dark corner."
 
- Data Driven
- A description of awk programs, where you specify the data you
are interested in processing, and what to do when that data is seen.
 
- Data Objects
- These are numbers and strings of characters.  Numbers are converted into
strings and vice versa, as needed. 
(See Conversion of Strings and Numbers.)
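
The automatic conversion can be seen in a short sketch. The variable names s, n, and t are illustrative only. Adding 1 forces a string to be treated as a number, and concatenating with the null string forces a number to be treated as a string.

```sh
# s is the string "34"; n is the number 35; t is the string "35".
converted=$(awk 'BEGIN {
    s = "3" "4"      # string built by concatenation
    n = s + 1        # arithmetic converts s to a number
    t = n ""         # concatenation converts n back to a string
    print n, length(t)
}')
echo "$converted"
```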
 
- Deadlock
- The situation in which two communicating processes are each waiting
for the other to perform an action.
 
- Double-Precision
- An internal representation of numbers that can have fractional parts. 
Double-precision numbers keep track of more digits than do single-precision
numbers, but operations on them are sometimes more expensive.  This is the way
awk stores numeric values.  It is the C type double.
 
- Dynamic Regular Expression
- A dynamic regular expression is a regular expression written as an
ordinary expression.  It could be a string constant, such as
"foo", but it may also be an expression whose value can vary. 
(See Using Dynamic Regexps.)
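
As a hedged sketch, the regexp below is held in a variable and applied with the ~ operator, so its value could differ from run to run. The variable name pat, its value, and the sample input are assumptions made for this example.

```sh
# pat is supplied from the command line; it is an ordinary string value
# that awk uses as a regexp when it appears on the right of ~.
matches=$(printf 'foo\nbar\nfio\n' |
  awk -v pat='^f.o$' '$0 ~ pat { n++ } END { print n + 0 }')
echo "$matches"
```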
 
- Environment
- A collection of strings, of the form name=val, that each
program has available to it. Users generally place values into the
environment in order to provide information to various programs. Typical
examples are the environment variables HOME and PATH.
 
- Empty String
- See "Null String."
 
- Epoch
- The date used as the "beginning of time" for timestamps. 
Time values in Unix systems are represented as seconds since the epoch,
with library functions available for converting these values into
standard date and time formats.
The epoch on Unix and POSIX systems is 1970-01-01 00:00:00 UTC. 
See also "GMT" and "UTC."
 
 
- Escape Sequences
- A special sequence of characters used for describing nonprinting
characters, such as \n for newline or \033 for the ASCII
ESC (Escape) character. (See Escape Sequences.)
 
- FDL
- See "Free Documentation License."
 
- Field
- When awk reads an input record, it splits the record into pieces
separated by whitespace (or by a separator regexp that you can
change by setting the built-in variable FS).  Such pieces are
called fields.  If the pieces are of fixed length, you can use the built-in
variable FIELDWIDTHS to describe their lengths. 
(See Specifying How Fields Are Separated,
and
Reading Fixed-Width Data.)
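
The two splitting modes can be sketched as follows, using invented sample records. The first command relies on the default whitespace splitting; the second sets FS to a colon.

```sh
# Default splitting: runs of spaces and tabs separate the fields.
by_space=$(printf 'one  two\tthree\n' | awk '{ print NF, $2 }')

# Splitting on a separator chosen by setting FS.
by_colon=$(printf 'root:x:0\n' | awk 'BEGIN { FS = ":" } { print NF, $3 }')

echo "$by_space"
echo "$by_colon"
```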
 
- Flag
- A variable whose truth value indicates the existence or nonexistence
of some condition.
 
- Floating-Point Number
- Often referred to in mathematical terms as a "rational" or real number,
this is just a number that can have a fractional part. 
See also "Double-Precision" and "Single-Precision."
 
- Format
- Format strings are used to control the appearance of output in the
strftime and sprintf functions, and are used in the printf statement as well.  Also, data conversions from numbers to strings
are controlled by the format string contained in the built-in variable CONVFMT. (See Format-Control Letters.)
 
- Free Documentation License
- This document describes the terms under which this Web page
is published and may be copied. (See GNU Free Documentation License.)
 
- Function
- A specialized group of statements used to encapsulate general
or program-specific tasks.  awk has a number of built-in
functions, and also allows you to define your own. 
(See Functions.)
 
- FSF
- See "Free Software Foundation."
 
- Free Software Foundation
- A nonprofit organization dedicated
to the production and distribution of freely distributable software. 
It was founded by Richard M. Stallman, the author of the original
Emacs editor.  GNU Emacs is the most widely used version of Emacs today.
 
- gawk
- The GNU implementation of awk.
 
- General Public License
- This document describes the terms under which gawk and its source
code may be distributed. (See GNU General Public License.)
 
- GMT
- "Greenwich Mean Time." 
This is the old term for UTC. 
It is the time of day used as the epoch for Unix and POSIX systems. 
See also "Epoch" and "UTC."
 
- GNU
- "GNU's not Unix".  An on-going project of the Free Software Foundation
to create a complete, freely distributable, POSIX-compliant computing
environment.
 
- GNU/Linux
- A variant of the GNU system using the Linux kernel, instead of the
Free Software Foundation's Hurd kernel. 
Linux is a stable, efficient, full-featured clone of Unix that has
been ported to a variety of architectures. 
It is most popular on PC-class systems, but runs well on a variety of
other systems too. 
The Linux kernel source code is available under the terms of the GNU General
Public License, which is perhaps its most important aspect.
 
- GPL
- See "General Public License."
 
- Hexadecimal
- Base 16 notation, where the digits are 0-9 and A-F, with A representing 10, B representing 11, and so on, up to F for 15. 
Hexadecimal numbers are written in C using a leading 0x,
to indicate their base.  Thus, 0x12 is 18 (1 times 16 plus 2).
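
The arithmetic can be checked with a small sketch that uses only the standard %d and %x conversions of awk's printf. It computes 1 times 16 plus 2, then prints decimal 18 back in hexadecimal digits.

```sh
# First value: 1*16 + 2 in decimal.  Second value: 18 shown in base 16.
hex=$(awk 'BEGIN { printf "%d %x\n", 1 * 16 + 2, 18 }')
echo "$hex"
```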
 
- I/O
- Abbreviation for "Input/Output," the act of moving data into and/or
out of a running program.
 
- Input Record
- A single chunk of data that is read in by awk.  Usually, an awk input
record consists of one line of text. 
(See How Input Is Split into Records.)
 
- Integer
- A whole number, i.e., a number that does not have a fractional part.
 
- Internationalization
- The process of writing or modifying a program so
that it can use multiple languages without requiring
further source code changes.
 
- Interpreter
- A program that reads human-readable source code directly, and uses
the instructions in it to process data and produce results. 
awk is typically (but not always) implemented as an interpreter. 
See also "Compiler."
 
- Interval Expression
- A component of a regular expression that lets you specify repeated matches of
some part of the regexp.  Interval expressions were not traditionally available
in awk programs.
 
- ISO
- The International Organization for Standardization. 
This organization produces international standards for many things, including
programming languages, such as C and C++. 
In the computer arena, important standards like those for C, C++, and POSIX
become both American national and ISO international standards simultaneously. 
This Web page refers to Standard C as "ISO C" throughout.
 
- Keyword
- In the awk language, a keyword is a word that has special
meaning.  Keywords are reserved and may not be used as variable names.
gawk's keywords are: BEGIN, END, if, else, while, do...while, for, for...in, break, continue, delete, next, nextfile, function, func,
and exit.
 
 
- Lesser General Public License
- This document describes the terms under which binary library archives
or shared objects,
and their source code may be distributed.
 
- Linux
- See "GNU/Linux."
 
- LGPL
- See "Lesser General Public License."
 
- Localization
- The process of providing the data necessary for an
internationalized program to work in a particular language.
 
- Logical Expression
- An expression using the operators for logic, AND, OR, and NOT, written
&&, ||, and ! in awk. Often called Boolean
expressions, after the mathematician who pioneered this kind of
mathematical logic.
 
- Lvalue
- An expression that can appear on the left side of an assignment
operator.  In most languages, lvalues can be variables or array
elements.  In awk, a field designator can also be used as an
lvalue.
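
A brief sketch of a field designator used as an lvalue, with a made-up input record. Assigning to $2 changes that field, and awk rebuilds the record from its fields before print writes it out.

```sh
# Assign to the second field, then print the rebuilt record.
changed=$(printf 'a b c\n' | awk '{ $2 = "X"; print }')
echo "$changed"
```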
 
- Matching
- The act of testing a string against a regular expression.  If the
regexp describes the contents of the string, it is said to match it.
 
- Metacharacters
- Characters used within a regexp that do not stand for themselves. 
Instead, they denote regular expression operations, such as repetition,
grouping, or alternation.
 
- Null String
- A string with no characters in it.  It is represented explicitly in
awk programs by placing two double quote characters next to
each other ("").  It can appear in input data by having two successive
occurrences of the field separator appear next to each other.
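
The second case can be sketched with a comma as the field separator. The sample record is invented. Two adjacent commas produce a second field that is the null string: its length is zero and it compares equal to "".

```sh
# Prints the field count, the length of field 2, and whether field 2 is "".
empty=$(printf 'a,,c\n' | awk -F, '{ print NF, length($2), ($2 == "") }')
echo "$empty"
```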
 
- Number
- A numeric-valued data object.  Modern awk implementations use
double-precision floating-point to represent numbers. 
Very old awk implementations use single-precision floating-point.
 
- Octal
- Base-eight notation, where the digits are 0-7. 
Octal numbers are written in C using a leading 0,
to indicate their base.  Thus, 013 is 11 (one times 8 plus 3).
 
- P1003.2
- See "POSIX."
 
- Pattern
- Patterns tell awk which input records are interesting to which
rules.  A pattern is an arbitrary conditional expression against which input is
tested.  If the condition is satisfied, the pattern is said to match
the input record.  A typical pattern might compare the input record against
a regular expression. (See Pattern Elements.)
 
 
- POSIX
- The name for a series of standards
that specify a Portable Operating System interface.  The "IX" denotes
the Unix heritage of these standards.  The main standard of interest for
awk users is
IEEE Standard for Information Technology, Standard 1003.2-1992,
Portable Operating System Interface (POSIX) Part 2: Shell and Utilities. 
Informally, this standard is often referred to as simply "P1003.2."
 
- Precedence
- The order in which operations are performed when operators are used
without explicit parentheses.
 
- Private
- Variables and/or functions that are meant for use exclusively by library
functions and not for the main awk program. Special care must be
taken when naming such variables and functions. 
(See Naming Library Function Global Variables.)
 
- Range (of input lines)
- A sequence of consecutive lines from the input file(s).  A pattern
can specify ranges of input lines for awk to process or it can
specify single lines. (See Pattern Elements.)
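
A minimal sketch of a range pattern, using four invented input lines. The two expressions separated by a comma mark where the range begins and ends. With no action given, awk prints each selected line.

```sh
# Select the consecutive lines from record 2 through record 3.
picked=$(printf 'one\ntwo\nthree\nfour\n' | awk 'NR == 2, NR == 3')
echo "$picked"
```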
 
- Recursion
- When a function calls itself, either directly or indirectly. 
If this isn't clear, refer to the entry for "recursion."
 
- Redirection
- Redirection means performing input from something other than the standard input
stream, or performing output to something other than the standard output stream.
You can redirect the output of the print and printf statements
to a file or a system command, using the >, >>, |, and |& operators.  You can redirect input to the getline statement using
the <, |, and |& operators. 
(See Redirecting Output of print and printf,
and Explicit Input with getline.)
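
The sketch below exercises the > output redirection and the < input redirection of getline, under a few assumptions: it uses a scratch file created with the mktemp utility, and the names tmp, out, line, and text are invented for the example. The gawk-only |& operator is left out so that the snippet should also run under other awk implementations.

```sh
# Write two records to a scratch file, then read them back with getline.
tmp=$(mktemp)
text=$(awk -v out="$tmp" 'BEGIN {
    print "b" > out          # output goes to the file named by out
    print "a" > out          # the file is still open, so this is appended
    close(out)               # close the file before reading it back
    while ((getline line < out) > 0)
        text = text line     # collect each record that was read
    print text
}')
rm -f "$tmp"
echo "$text"
```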
 
 
- Regexp
- Short for regular expression.  A regexp is a pattern that denotes a
set of strings, possibly an infinite set.  For example, the regexp
R.*xp matches any string starting with the letter R and ending with the letters xp.  In awk, regexps are
used in patterns and in conditional expressions.  Regexps may contain
escape sequences. (See Regular Expressions.)
 
- Regular Expression
- See "regexp."
 
- Regular Expression Constant
- A regular expression constant is a regular expression written within
slashes, such as /foo/.  This regular expression is chosen
when you write the awk program and cannot be changed during
its execution. (See How to Use Regular Expressions.)
 
- Rule
- A segment of an awk program that specifies how to process single
input records.  A rule consists of a pattern and an action.  awk reads an input record; then, for each rule, if the input record
satisfies the rule's pattern, awk executes the rule's action. 
Otherwise, the rule does nothing for that input record.
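
As a rough sketch, the program below has two ordinary rules plus an END rule that reports the counts. The input records and the counter names f and big are invented. Each record is tested against both patterns in turn, so the last record triggers both actions.

```sh
# Rule 1 counts records containing foo; rule 2 counts records whose
# second field exceeds 1.  A record may satisfy both, one, or neither.
counts=$(printf 'foo 1\nbar 2\nfoo 3\n' |
  awk '/foo/ { f++ } $2 > 1 { big++ } END { print f, big }')
echo "$counts"
```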
 
- Rvalue
- A value that can appear on the right side of an assignment operator. 
In awk, essentially every expression has a value. These values
are rvalues.
 
- Scalar
- A single value, be it a number or a string. 
Regular variables are scalars; arrays and functions are not.
 
- Search Path
- In gawk, a list of directories to search for awk program source files. 
In the shell, a list of directories to search for executable programs.
 
- Seed
- The initial value, or starting point, for a sequence of random numbers.
 
- sed
- See "Stream Editor."
 
- Shell
- The command interpreter for Unix and POSIX-compliant systems. 
The shell works both interactively, and as a programming language
for batch files, or shell scripts.
 
- Short-Circuit
- The nature of the awk logical operators && and ||. 
If the value of the entire expression is determinable from evaluating just
the lefthand side of these operators, the righthand side is not
evaluated. 
(See Boolean Expressions.)
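
One way to observe this behavior is to put a function with a visible effect on the righthand side. In this sketch the function bump and the counter calls are hypothetical names: bump increments calls each time it runs, so the final count shows how often the righthand side was evaluated.

```sh
# bump() counts its own invocations in the global variable "calls".
short=$(awk 'function bump() { calls++; return 1 }
BEGIN {
    a = (0 && bump())    # left side false: bump() is skipped
    b = (1 || bump())    # left side true: bump() is skipped
    c = (1 && bump())    # left side true: bump() has to run
    print a, b, c, calls
}')
echo "$short"
```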
 
- Side Effect
- A side effect occurs when an expression has an effect aside from merely
producing a value.  Assignment expressions, increment and decrement
expressions, and function calls have side effects. 
(See Assignment Expressions.)
 
- Single-Precision
- An internal representation of numbers that can have fractional parts. 
Single-precision numbers keep track of fewer digits than do double-precision
numbers, but operations on them are sometimes less expensive in terms of CPU time. 
This is the type used by some very old versions of awk to store
numeric values.  It is the C type float.
 
- Space
- The character generated by hitting the space bar on the keyboard.
 
- Special File
- A file name interpreted internally by gawk, instead of being handed
directly to the underlying operating system--for example, /dev/stderr. 
(See Special File Names in gawk.)
 
- Stream Editor
- A program that reads records from an input stream and processes them one
or more at a time.  This is in contrast with batch programs, which may
expect to read their input files in entirety before starting to do
anything, as well as with interactive programs which require input from the
user.
 
- String
- A datum consisting of a sequence of characters, such as I am a
string.  Constant strings are written with double quotes in the awk language and may contain escape sequences. 
(See Escape Sequences.)
 
- Tab
- The character generated by hitting the TAB key on the keyboard. 
It usually expands to up to eight spaces upon output.
 
- Text Domain
- A unique name that identifies an application. 
Used for grouping messages that are translated at runtime
into the local language.
 
- Timestamp
- A value in the "seconds since the epoch" format used by Unix
and POSIX systems.  Used for the gawk functions mktime, strftime, and systime. 
See also "Epoch" and "UTC."
 
- Unix
- A computer operating system originally developed in the early 1970's at
AT&T Bell Laboratories.  It initially became popular in universities around
the world and later moved into commercial environments as a software
development system and network server system. There are many commercial
versions of Unix, as well as several work-alike systems whose source code
is freely available (such as GNU/Linux, NetBSD, FreeBSD, and OpenBSD).
 
- UTC
- The accepted abbreviation for "Coordinated Universal Time." 
This is standard time in Greenwich, England, which is used as a
reference time for day and date calculations. 
See also "Epoch" and "GMT."
 
- Whitespace
- A sequence of space, TAB, or newline characters occurring inside an input
record or a string.