Read LPI Linux Certification in a Nutshell Online
Authors: Adam Haeder; Stephen Addison Schneiter; Bruno Gomes Pessanha; James Stanger
Tags: #Reference:Computers
grep
    grep [options] regex [files]
Search files or standard input for lines containing a match to the regular expression regex. By default, matching lines are displayed and nonmatching lines are not. When multiple files are specified, grep displays the filename as a prefix to the output lines (use the -h option to suppress filename prefixes).
-c
    Display only a count of matched lines, but not the lines themselves.
-h
    Display matched lines, but do not include filenames for multiple file input.
-i
    Ignore uppercase and lowercase distinctions, allowing abc to match both abc and ABC.
-n
    Display matched lines prefixed with their line numbers. When used with multiple files, both the filename and line number are prefixed.
-v
    Print all lines that do not match regex. This is an important and useful option. You’ll want to use regular expressions not only to select information but also to eliminate information. Using -v inverts the output this way.
-E
    Interpret regex as an extended regular expression. This makes grep behave as if it were egrep.
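
The options above can be tried on a couple of throwaway files. The following is a minimal sketch rather than anything from the original text: the file names, their contents, and the variable names are all made up for illustration.

```shell
# Hypothetical sample data in a temporary directory.
dir=$(mktemp -d)
printf 'Linux is free\nlinux kernel\nBSD is free too\n' > "$dir/file1"
printf 'LINUX rocks\nnothing here\n' > "$dir/file2"

# -c: print only the number of matching lines (case-sensitive here).
count=$(grep -c linux "$dir/file1")

# -i ignores case; -n prefixes each match with its line number.
numbered=$(grep -in linux "$dir/file1")

# -v: print only the lines that do NOT match.
inverted=$(grep -v -i linux "$dir/file1")

# -h: drop the filename prefix normally shown for multiple files.
no_names=$(grep -h -i linux "$dir/file1" "$dir/file2")

# -E: extended regular expressions, so | means alternation.
either=$(grep -E -c 'kernel|BSD' "$dir/file1")

printf '%s\n' "$count" "$numbered" "$inverted" "$no_names" "$either"
rm -rf "$dir"
```

Without -h, the fourth command would prefix every output line with the name of the file it came from.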
Since regular expressions can contain both metacharacters and literals, grep can be used with an entirely literal regex. For example, to find all lines in file1 that contain either Linux or linux, you could use grep like this:

    $ grep -i linux file1

In this example, the regex is simply linux. The uppercase L in Linux will match since the command-line option -i was specified. This is fine for literal expressions that are common. However, in situations in which regex includes regular expression metacharacters that are also shell special characters (such as $ or *), the regex must be quoted to prevent shell expansion and pass the metacharacters on to grep.

As a simplistic example of this, suppose you have files in your local directory named abc, abc1, and abc2. When combined with bash’s echo built-in command, the abc* wildcard expression lists all files that begin with abc, as follows:

    $ echo abc*
    abc abc1 abc2

Now, suppose that these files contain lines with the strings abc, abcc, abccc, and so on, and you wish to use grep to find them. You can use the shell wildcard expression abc* to expand to all the files that start with abc, as displayed with echo in the previous example, and you’d use an identical regular expression abc* to find all occurrences of lines containing abc, abcc, abccc, etc. (Strictly speaking, c* in a regular expression also allows zero occurrences of c, so a line containing only ab matches as well.) Without using quotes to prevent shell expansion, the command would be:

    $ grep abc* abc*

After shell expansion, this yields:

    grep abc abc1 abc2 abc abc1 abc2

This is not what you intended! grep would search for the literal expression abc in the files abc1, abc2, abc, abc1, and abc2, because abc appears as the first command argument. Instead, quote the regular expression with single or double quotes to protect it (the difference between single quotes and double quotes on the command line is subtle and is explained later in this section):

    $ grep 'abc*' abc*

or:

    $ grep "abc*" abc*

After expansion, both examples yield the same results:

    grep abc* abc abc1 abc2

Now this is what you’re after. The three files abc, abc1, and abc2 will be searched for the regular expression abc*. It is good to stay in the habit of quoting regular expressions on the command line to avoid these problems; they won’t be at all obvious, because the shell expansion is invisible to you unless you use the echo command.
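
The difference between the unquoted and quoted forms can be observed directly. This is a small sketch under assumed file contents (the text only names the files abc, abc1, and abc2; what they contain here is invented for the demonstration).

```shell
# Hypothetical contents for the three files named in the text.
dir=$(mktemp -d)
start=$PWD
cd "$dir" || exit 1
printf 'ab\nabc\nabcc\n' > abc
printf 'xyz\n' > abc1
printf 'abccc\n' > abc2

# Unquoted: the shell expands BOTH abc* words, so grep receives the
# pattern "abc" and the file list abc1 abc2 abc abc1 abc2.
unquoted=$(grep abc* abc*)

# Quoted: the regex abc* reaches grep intact; only the file glob expands.
quoted=$(grep 'abc*' abc*)

printf 'unquoted:\n%s\nquoted:\n%s\n' "$unquoted" "$quoted"
cd "$start" || exit 1
rm -rf "$dir"
```

In the unquoted run the line ab is never reported and abc2 is searched twice; in the quoted run each file is searched once and ab matches because c* permits zero occurrences of c.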

On the Exam
The use of grep and its options is common. You should be familiar with what each option does, as well as the concept of piping the results of other commands into grep for matching.
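
As a brief illustration of piping into grep, the sketch below uses printf as a stand-in for any command that writes lines to standard output; the sample lines are invented.

```shell
# printf stands in for an arbitrary command; grep reads its output on
# standard input because no file argument is given.
piped=$(printf 'root:x:0\ndaemon:x:1\nroot2:x:2\n' | grep '^root:')
printf '%s\n' "$piped"
```

Only the line that begins with root: passes the filter; root2 does not match because the pattern requires a colon immediately after root.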
sed
    sed [options] 'command1' [files]
    sed [options] -e 'command1' [-e 'command2'...] [files]
    sed [options] -f script [files]
The first form invokes sed with a one-line command1. The second form invokes sed with two (or more) commands. Note that in this case the -e parameter is required for each command specified. The commands are specified in quotes to prevent the shell from interpreting and expanding them. The last form instructs sed to take editing commands from file script (which does not need to be executable). In all cases, if files are not specified, input is taken from standard input. If multiple files are specified, the edited output of each successive file is concatenated.
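
The three invocation forms can be compared side by side. This is an illustrative sketch; the input file, the script file, and the substitutions in them are assumptions made up for the example.

```shell
# Hypothetical input and script files.
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/file1"
printf 's/one/1/\ns/two/2/\n' > "$dir/script"

# Form 1: a single quoted command.
form1=$(sed 's/one/1/' "$dir/file1")

# Form 2: several commands, each introduced by -e.
form2=$(sed -e 's/one/1/' -e 's/two/2/' "$dir/file1")

# Form 3: commands read from a script file (it need not be executable).
form3=$(sed -f "$dir/script" "$dir/file1")

# With no file argument, sed reads standard input.
from_stdin=$(printf 'one\n' | sed 's/one/1/')

printf '%s\n' "$form1" "$form2" "$form3" "$from_stdin"
rm -rf "$dir"
```

Forms 2 and 3 produce identical output here because the script file holds the same two commands that form 2 passes with -e.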

-e cmd
    The -e option specifies that the next argument (cmd) is a sed command (or a series of commands). When specifying only one string of commands, the -e is optional.
-f file
    file is a sed script.
Treat all substitutions as global.

The sed utility operates on text through the use of addresses and editing commands. The address is used to locate lines of text to be operated on, and editing commands modify text. During operation, each line (that is, text separated by newline characters) of input to sed is processed individually and without regard to adjacent lines. If multiple editing commands are to be used (through the use of a script file or multiple -e options), they are all applied in order to each line before moving on to the next line.

Addresses in sed locate lines of text to which commands will be applied. The addresses can be:

    A line number (note that sed counts lines continuously across multiple input files). The symbol $ can be used to indicate the last line of input. A range of line numbers can be given by separating the starting and ending lines with a comma (start,end). So, for example, the address for all input would be 1,$.

    A regular expression delimited by forward slashes (/regex/).

    A line number with an interval. The form is n~s, where n is the starting line number and s is the step, or interval, to apply. For example, to match every odd line in the input, the address specification would be 1~2 (start at line 1 and match every two lines thereafter). This feature is a GNU extension to sed.

If no address is given, commands are applied to all input lines by default. Any address may be followed by the ! character, applying commands to lines that do not match the address.
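
Each address form can be exercised with the d (delete) command. The sketch below is illustrative only: the run helper and the five sample lines are hypothetical, and the last case assumes GNU sed because the n~s form is a GNU extension.

```shell
# run EXPR: feed five sample lines to sed and join the result on one line.
# (Hypothetical helper, defined only for this demonstration.)
run() {
    printf 'a\n#b\nc\nd\ne\n' | sed "$1" | paste -s -d ' ' -
}

second_deleted=$(run '2d')     # a single line number
last_deleted=$(run '$d')       # $ addresses the last line of input
range_deleted=$(run '2,4d')    # a start,end range
regex_deleted=$(run '/^#/d')   # lines matching a regular expression
regex_kept=$(run '/^#/!d')     # ! negates: delete lines that do NOT match
odd_deleted=$(run '1~2d')      # GNU only: start at line 1, step by 2

printf '%s\n' "$second_deleted" "$last_deleted" "$range_deleted" \
    "$regex_deleted" "$regex_kept" "$odd_deleted"
```

The expressions are single-quoted so that the shell leaves $ and ! alone and passes them to sed unchanged.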

The sed command immediately follows the address specification if present. Commands generally consist of a single letter or symbol, unless they have arguments. Following are some basic sed editing commands to get you started.

d
    Delete lines.
s
    Make substitutions. This is a very popular sed command. The syntax is as follows:

        s/pattern/replacement/[flags]

    The following flags can be specified for the s command:

    g
        Replace all instances of pattern, not just the first.
    n
        Replace the nth instance of pattern; the default is 1.
    p
        Print the line if a successful substitution is done. Generally used with the -n command-line option.
    w file
        Print the line to file if a successful substitution is done.
y
    Translate characters. This command works in a fashion similar to the tr command, described earlier.
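
The effect of each s flag, and of the y command, is easiest to see on a short input. The following sketch uses invented sample strings and variable names; it is meant as a demonstration, not a complete reference.

```shell
line='a-b-c-d'

# No flag: only the first match on the line is replaced.
first=$(printf '%s\n' "$line" | sed 's/-/+/')

# g: replace every match on the line.
all=$(printf '%s\n' "$line" | sed 's/-/+/g')

# A number: replace only that occurrence (here, the second).
second=$(printf '%s\n' "$line" | sed 's/-/+/2')

# p with -n: print only the lines where a substitution happened.
printed=$(printf 'a-b\nplain\n' | sed -n 's/-/+/p')

# w file: write lines with a successful substitution to a file.
out=$(mktemp)
printf 'a-b\nplain\n' | sed -n "s/-/+/w $out"
written=$(cat "$out")
rm -f "$out"

# y: translate characters one-for-one, much like tr.
translated=$(printf 'aabbcc\n' | sed 'y/abc/xyz/')

printf '%s\n' "$first" "$all" "$second" "$printed" "$written" "$translated"
```

The w example uses double quotes so that the shell substitutes the temporary file name before sed sees the command.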

Delete lines 3 through 5 of file1:

    $ sed '3,5d' file1

Delete lines of file1 that contain a # at the beginning of the line:

    $ sed '/^#/d' file1

Translate characters:

    y/abc/xyz/

Every instance of a is translated to x, b to y, and c to z.

Write the @ symbol for all empty lines in file1 (that is, lines with only a newline character but nothing more):

    $ sed 's/^$/@/' file1

Remove all double quotation marks from all lines in file1:

    $ sed 's/"//g' file1

Using sed commands from external file sedcmds, replace the third and fourth double quotation marks with ( and ) on lines 1 through 10 in file1. Make no changes from line 11 to the end of the file. Script file sedcmds contains:

    1,10{
    s/"/(/3
    s/"/)/4
    }

The command is executed using the -f option:

    $ sed -f sedcmds file1

This example employs the positional flag for the s (substitute) command. The first of the two commands substitutes ( for the third double-quote character. The next command substitutes ) for the fourth double-quote character. Note, however, that the position count is interpreted independently for each subsequent command in the script. This is important because each command operates on the results of the commands preceding it. In this example, since the third double quote has been replaced with (, it is no longer counted as a double quote by the second command. Thus, the second command will operate on the fifth double quote character in the original file1. If the input line starts out with the following:

    """"""

after the first command, which operates on the third double quote, the result is this:

    ""("""

At this point, the numbering of the double-quote characters has changed, and the fourth double quote in the line is now the fifth character. Thus, after the second command executes, the output is as follows:

    ""(")"

As you can see, creating scripts with sed requires that the sequential nature of the command execution be kept in mind.
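
The sequence described above can be reproduced on a single line of six double quotes. This is a minimal sketch with made-up variable names; the final command shows one possible way to hit the original third and fourth quotes, offered as an assumption about intent rather than as the authors' recommendation.

```shell
line='""""""'

# First command alone: the third double quote becomes (.
step1=$(printf '%s\n' "$line" | sed 's/"/(/3')

# Both commands in script order: the second one counts quotes in the
# already-edited line, so it lands on the original fifth quote.
both=$(printf '%s\n' "$line" | sed -e 's/"/(/3' -e 's/"/)/4')

# One possible workaround: edit the later occurrence first, so the
# earlier positions are still unchanged when the second command runs.
reordered=$(printf '%s\n' "$line" | sed -e 's/"/)/4' -e 's/"/(/3')

printf '%s\n' "$step1" "$both" "$reordered"
```

Reordering works here because replacing the fourth quote does not shift the numbering of the quotes before it.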

If you find yourself making repetitive changes to many files on a regular basis, a sed script is probably warranted. Many more commands are available in sed than are listed here.