Archive-Name: editor-faq/sed
Posting-Frequency: irregular
Last-modified: 10 March 2003
Version: 015
URL: http://sed.sourceforge.net/sedfaq.html
Maintainer: Eric Pement (pemente@northpark.edu)

THE SED FAQ

Frequently Asked Questions about
sed, the stream editor

CONTENTS

1. GENERAL INFORMATION
1.1. Introduction - How this FAQ is organized
1.2. Latest version of the sed FAQ
1.3. FAQ revision information
1.4. How do I add a question/answer to the sed FAQ?
1.5. FAQ abbreviations
1.6. Credits and acknowledgements
1.7. Standard disclaimers

2. BASIC SED
2.1. What is sed?
2.2. What versions of sed are there, and where can I get them?

2.2.1. Free versions

2.2.1.1. Unix platforms
2.2.1.2. OS/2
2.2.1.3. Microsoft Windows (Win3x, Win9x, WinNT, Win2K)
2.2.1.4. MS-DOS
2.2.1.5. CP/M
2.2.1.6. Macintosh v8 or v9

2.2.2. Shareware and Commercial versions

2.2.2.1. Unix platforms
2.2.2.2. OS/2
2.2.2.3. Windows 95/98, Windows NT, Windows 2000
2.2.2.4. MS-DOS

2.3. Where can I learn to use sed?

2.3.1. Books
2.3.2. Mailing list
2.3.3. Tutorials, electronic text
2.3.4. General web and ftp sites

3. TECHNICAL
3.1. More detailed explanation of basic sed
3.1.1. Regular expressions on the left side of "s///"
3.1.2. Escape characters on the right side of "s///"
3.1.3. Substitution switches
3.2. Common one-line sed scripts. How do I . . . ?

- double/triple-space a file?
- convert DOS/Unix newlines?
- delete leading/trailing spaces?
- do substitutions on all/certain lines?
- delete consecutive blank lines?
- delete blank lines at the top/end of the file?

3.3. Addressing and address ranges
3.4. Address ranges in GNU sed and HHsed
3.5. Debugging sed scripts
3.6. Notes about s2p, the sed-to-perl translator
3.7. GNU/POSIX extensions to regular expressions

4. EXAMPLES
ONE-CHARACTER QUESTIONS
4.1. How do I insert a newline into the RHS of a substitution?
4.2. How do I represent control-codes or non-printable characters?
4.3. How do I convert files with toggle characters, like +this+,
to look like [i]this[/i]?

CHANGING STRINGS
4.10. How do I perform a case-insensitive search?
4.11. How do I match only the first occurrence of a pattern?
4.12. How do I parse a comma-delimited (CSV) data file?
4.13. How do I handle fixed-length, columnar data?
4.14. How do I commify a string of numbers?
4.15. How do I prevent regex expansion on substitutions?
4.16. How do I convert a string to all lowercase or capital letters?

CHANGING BLOCKS (consecutive lines)
4.20. How do I change only one section of a file?
4.21. How do I delete or change a block of text if the block contains
a certain regular expression?
4.22. How do I locate a paragraph of text if the paragraph contains a
certain regular expression?
4.23. How do I match a block of specific consecutive lines?
4.23.1. Try to use a "/range/, /expression/"
4.23.2. Try to use a "multi-line\nexpression"
4.23.3. Try to use a block of "literal strings"
4.24. How do I address all the lines between RE1 and RE2, excluding the lines themselves?
4.25. How do I join two lines if line #1 ends in a [certain string]?
4.26. How do I join two lines if line #2 begins in a [certain string]?
4.27. How do I change all paragraphs to long lines?

SHELL AND ENVIRONMENT
4.30. How do I read environment variables with sed ...
4.31.1. ... on Unix platforms?
4.31.2. ... on MS-DOS or 4DOS platforms?
4.32. How do I export or pass variables back into the environment ...
4.32.1. ... on Unix platforms?
4.32.2. ... on MS-DOS or 4DOS platforms?
4.33. How do I handle shell quoting in sed?

FILES, DIRECTORIES, AND PATHS
4.40. How do I read (insert/add) a file at the top of a textfile?
4.41. How do I make substitutions in every file in a directory, or
in a complete directory tree?
4.41.1. ... ssed solution
4.41.2. ... Unix solution
4.41.3. ... DOS solution
4.42. How do I replace "/some/UNIX/path" in a substitution?
4.43. How do I replace "C:\SOME\DOS\PATH" in a substitution?
4.44. How do I emulate file-includes, using sed?

5. WHY ISN'T THIS WORKING?
5.1. Why don't my variables like $var get expanded in my sed script?
5.2. I'm using 'p' to print, but I have duplicate lines sometimes.
5.3. Why does my DOS version of sed process a file part-way through
and then quit?
5.4. My RE isn't matching/deleting what I want it to. (Or, "Greedy vs.
stingy pattern matching")
5.5. What is CSDPMI*B.ZIP and why do I need it?
5.6. Where are the man pages for GNU sed?
5.7. How do I tell what version of sed I am using?
5.8. Does sed issue an exit code?
5.9. The 'r' command isn't inserting the file into the text.
5.10. Why can't I match or delete a newline using the \n escape
sequence? Why can't I match 2 or more lines using \n?
5.11. My script aborts with an error message, "event not found".

6. OTHER ISSUES
6.1. I have a problem that stumps me. Where can I get help?
6.2. How does sed compare with awk, perl, and other utilities?
6.3. When should I use sed?
6.4. When should I NOT use sed?
6.5. When should I ignore sed and use Awk or Perl instead?
6.6. Known limitations among sed versions
6.7. Known incompatibilities between sed versions

6.7.1. Issuing commands from the command line
6.7.2. Using comments (prefixed by the '#' sign)
6.7.3. Special syntax in REs
6.7.4. Word boundaries
6.7.5. Commands which operate differently

7. KNOWN BUGS AMONG SED VERSIONS
7.1. ssed v3.59
7.2. GNU sed v4.0 - v4.0.5
7.3. GNU sed v3.02.80
7.4. GNU sed v3.02
7.5. GNU sed v2.05
7.6. GNU sed v1.18
7.7. GNU sed v1.03
7.8. sed v1.6 (Briscoe)
7.9. sed v1.5 (Helman)
7.10. sedmod v1.0 (Chen)
7.11. HP-UX sed
7.12. SunOS sed v4.1
7.13. SunOS sed v5.6
7.14. Ultrix sed v4.3
7.15. Digital Unix sed


------------------------------

1. GENERAL INFORMATION

1.1. Introduction - How this FAQ is organized

This FAQ is organized to answer common (and some uncommon)
questions about sed, quickly. If you see a term or abbreviation in
the examples that seems unclear, see if the term is defined in
section 1.5. If not, send your comment to pemente[at]northpark.edu.

1.2. Latest version of the sed FAQ

The newest version of the sed FAQ is usually here:

http://sed.sourceforge.net/sedfaq.html (HTML version)
http://sed.sourceforge.net/sedfaq.txt (plain text)
http://www.student.northpark.edu/pemente/sed/sedfaq.html
http://www.student.northpark.edu/pemente/sed/sedfaq.txt
http://www.faqs.org/faqs/editor-faq/sed
ftp://rtfm.mit.edu/pub/faqs/editor-faq/sed

Another FAQ file on sed by a different author can be found here:

http://www.dreamwvr.com/sed-info/sed-faq.html

1.3. FAQ revision information

In the plaintext version, changes are shown by a vertical bar (|)
placed in column 78 of the affected lines. To remove the vertical
bars (use double quotes for MS-DOS):

    sed 's/ *|$//' sedfaq.txt > sedfaq2.txt
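
To see what this command does, here is a minimal sketch run against
a single made-up line rather than the whole FAQ. The sample text and
the variable names are hypothetical; only the sed expression comes
from the command above.

```shell
# Hypothetical sample: a revised line padded with spaces and ending
# in a change bar, as described for the plaintext version.
sample='This line was revised.          |'

# ' *|$' matches any run of spaces followed by a bar at end of line;
# in a basic RE the bar is a literal character.
stripped=$(printf '%s\n' "$sample" | sed 's/ *|$//')

printf '[%s]\n' "$stripped"
```

The brackets in the final printf only make it easy to see that the
trailing spaces were removed along with the bar.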

In the HTML version, vertical bars do not appear. New or altered
portions of the FAQ are indicated by printing in dark blue type.

In the text version, words needing emphasis may be surrounded by
the underscore '_' or the asterisk '*'. In the HTML version, these
are changed to italics and boldface, respectively.

1.4. How do I add a question/answer to the sed FAQ?

Word your question briefly and send it to pemente[at]northpark.edu,
indicating your proposed change. We'll post it on the sed-users
mailing list (see section 2.3.2) and discuss it there. If it's
good, your contribution will be added to the next edition.

1.5. FAQ abbreviations

files = one or more filenames, separated by whitespace
gsed  = GNU sed
ssed  = super-sed
RE    = Regular Expressions supported by sed
LHS   = the left-hand side ("find" part) of "s/find/repl/" command
RHS   = the right-hand side ("replace" part) of "s/find/repl/" cmd
nn+   = version _nn_ or higher (e.g., "15+" = version 1.5 and above)

files: "files" stands for one or more filenames entered on the
command line. The names may include any wildcards your shell
understands (such as "zork*" or "Aug[4-9].let"). Sed will
process each filename passed to it by the shell.

RE: For details on regular expressions, see section 3.1.1., below.
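
As a rough illustration of the "files" note above, the sketch below
lets the shell expand a wildcard before sed ever sees it. The
directory, the file names, and their contents are made up for the
example; mktemp is assumed to be available, as it is on most current
Unix systems.

```shell
# Hypothetical scratch directory with two files matching "zork*"
# and one file that does not match.
workdir=$(mktemp -d)
printf 'first file\n'  > "$workdir/zork1"
printf 'second file\n' > "$workdir/zork2"
printf 'not matched\n' > "$workdir/other"

# The shell replaces zork* with "zork1 zork2"; sed then processes
# each named file in turn and writes everything to stdout.
combined=$(cd "$workdir" && sed 's/file/FILE/' zork*)

printf '%s\n' "$combined"
rm -r "$workdir"
```

Because the expansion is done by the shell, sed itself never sees the
asterisk, and the file named "other" is not read at all.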

1.6. Credits and acknowledgements

Many of the ideas for this FAQ were taken from the Awk FAQ:
http://www.faqs.org/faqs/computer-lang/awk/faq/
ftp://rtfm.mit.edu/pub/usenet/comp.lang.awk/faq

and from the old Perl FAQ:
http://www.perl.com/doc/FAQs/FAQ/oldfaq-html/index.html

The following individuals have contributed significantly to this
document, and have provided input and wording suggestions for
questions, answers, and script examples. Credit goes to these
contributors (in alphabetical order by last name):

Al Aab, Yiorgos Adamopoulos, Paolo Bonzini, Walter Briscoe, Jim
Dennis, Carlos Duarte, Otavio Exel, Sven Guckes, Aurelio Jargas,
Mark Katz, Toby Kelsey, Eric Pement, Greg Pfeiffer, Ken Pizzini,
Niall Smart, Simon Taylor, Peter Tillier, Greg Ubben, Laurent
Vogel.

1.7. Standard disclaimers

While a serious attempt has been made to ensure the accuracy of the
information presented herein, the contributors and maintainers of
this document do not claim the absence of errors and make no
warranties on the information provided. If you notice any mistakes,
please let us know so we can fix them.

------------------------------

2. BASIC SED

2.1. What is sed?

"sed" stands for Stream EDitor. Sed is a non-interactive editor,
written by the late Lee E. McMahon in 1973 or 1974. A brief history
of sed's origins may be found in an early history of the Unix
tools, at <http://www.columbia.edu/~rh120/ch106.x09>.

Instead of altering a file interactively by moving the cursor on
the screen (as with a word processor), the user sends a script of
editing instructions to sed, plus the name of the file to edit (or
the text to be edited may come as output from a pipe). In this
sense, sed works like a filter -- deleting, inserting and changing
characters, words, and lines of text. Its range of activity goes
from small, simple changes to very complex ones.

Sed reads its input from stdin (Unix shorthand for "standard
input," normally the keyboard or a pipe) or from files (or both),
and sends the results to stdout ("standard output," normally the
console or screen). Most people use sed first for its substitution
features, as a find-and-replace tool. For example,

    sed 's/Glenn/Harold/g' oldfile >newfile

will replace every occurrence of "Glenn" in the file with the word
"Harold". The "find" portion is a regular expression ("RE"), which
can be a simple word or may contain special characters to allow
greater flexibility (for example, to prevent "Glenn" from also
matching "Glennon").
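
One way to keep "Glenn" from matching inside "Glennon" is to anchor
the RE at word boundaries. The sketch below assumes a sed that
understands the \< and \> word-boundary extensions (GNU sed does;
they are not part of every sed). The sample sentence and variable
names are invented for the example.

```shell
# Hypothetical input containing both the word and a longer word
# that merely starts with it.
names='Glenn met Glennon'

# Plain RE: also rewrites the "Glenn" inside "Glennon".
plain=$(printf '%s\n' "$names" | sed 's/Glenn/Harold/g')

# Word-bounded RE (assumes GNU-style \< and \>): only the whole
# word "Glenn" is replaced.
bounded=$(printf '%s\n' "$names" | sed 's/\<Glenn\>/Harold/g')

printf '%s\n%s\n' "$plain" "$bounded"
```

With a sed that lacks these extensions, the second command will not
behave as shown; in that case the RE has to spell out what may come
before and after the word.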

My very first use of sed was to add 8 spaces to the left side of a
file, so when I printed it, the printing wouldn't begin at the
absolute left edge of a piece of paper.

    sed 's/^/        /' myfile >newfile   # my first sed script
    sed 's/^/        /' myfile | lp       # my next sed script

Then I learned that sed could display only one paragraph of a file,
beginning at the phrase "and where it came" and ending at the
phrase "for all people". My script looked like this:

    sed -n '/and where it came/,/for all people/p' myfile

Sed's normal behavior is to print (i.e., display or show on screen)
the entire file, including the parts that haven't been altered,
unless you use the -n switch. The "-n" stands for "no output". This
switch is almost always used in conjunction with a 'p' command
somewhere, which says to print only the sections of the file that
have been specified. Together, the -n switch and the 'p' command
allow selected parts of a file to be printed (i.e., sent to the
console).
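
The effect of -n is easiest to see by running the same 'p' command
with and without it. This is a small sketch on three made-up input
lines; the variable names are only for illustration.

```shell
# With -n, nothing is printed except what 'p' asks for:
# only line 2 appears.
with_n=$(printf 'alpha\nbeta\ngamma\n' | sed -n '2p')

# Without -n, every line is printed once by default, and line 2
# is printed a second time by the 'p' command.
without_n=$(printf 'alpha\nbeta\ngamma\n' | sed '2p')

printf -- '-- with -n:\n%s\n-- without -n:\n%s\n' "$with_n" "$without_n"
```

The duplicated line in the second result is the usual sign that a
'p' command was used without the -n switch.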

Next, I found that sed could show me only (say) lines 12-18 of a
file and not show me the rest. This was very handy when I needed to
review only part of a long file and I didn't want to alter it.

    # the 'p' stands for print
    sed -n 12,18p myfile

Likewise, sed could show me everything else BUT those particular
lines, without physically changing the file on the disk:

    # the 'd' stands for delete
    sed 12,18d myfile
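
The two commands are complements of each other, which a short
sketch can make concrete. To keep the output small it uses a
six-line input and the range 3,4 instead of 12,18; the input and
variable names are made up.

```shell
# Print only lines 3-4 of a six-line input.
shown=$(printf '%s\n' 1 2 3 4 5 6 | sed -n '3,4p')

# Print everything except lines 3-4 of the same input.
hidden=$(printf '%s\n' 1 2 3 4 5 6 | sed '3,4d')

printf 'kept by 3,4p:\n%s\nkept by 3,4d:\n%s\n' "$shown" "$hidden"
```

In both cases the input itself is left untouched; sed only decides
which lines reach standard output.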

Sed could also double-space my single-spaced file when it came time
to print it:

    sed G myfile >newfile
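
As a quick check of what 'G' does, the sketch below runs it on two
made-up lines. The 'G' command appends a newline and the contents of
the hold space to each line; since nothing has been stored in the
hold space, the effect is one empty line after every input line.

```shell
# Two-line hypothetical input; each line gains an empty line after it.
doubled=$(printf 'one\ntwo\n' | sed G)

# Show the result with line numbers so the empty lines are visible.
printf '%s\n' "$doubled" | sed -n '='
printf '%s\n' "$doubled"
```

Note that the shell's $(...) strips trailing newlines, so the empty
line that sed adds after the last input line is not kept in the
variable.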

If you have many editing commands (for deleting, adding,
substituting, etc.) which might take up several lines, those
commands can be put into a separate file and all of the commands in
the file applied to the file being edited:

    # 'script.sed' is the file of commands
    # 'myfile' is the file being changed
    sed -f script.sed myfile
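
A minimal sketch of the script-file approach follows. The contents of
script.sed and myfile are invented for the example, and the files are
created in a scratch directory (mktemp is assumed to be available) so
that nothing in the current directory is touched.

```shell
workdir=$(mktemp -d)

# A hypothetical two-command script: one command per line,
# with '#' comment lines describing each one.
cat > "$workdir/script.sed" <<'EOF'
# delete empty lines
/^$/d
# replace every "Glenn" with "Harold"
s/Glenn/Harold/g
EOF

# A hypothetical file to be edited.
printf 'Glenn\n\nGlenn and Glenn\n' > "$workdir/myfile"

# Apply every command in script.sed to myfile.
result=$(sed -f "$workdir/script.sed" "$workdir/myfile")

printf '%s\n' "$result"
rm -r "$workdir"
```

Keeping the commands in a file also avoids most shell-quoting
problems, since the shell never interprets the script text.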

It is not our intention to convert this FAQ file into a full-blown
sed tutorial (for good tutorials, see section 2.3). Rather, we hope
this gives the complete novice a few ideas of how sed can be used.

2.2. What versions of sed are there, and where can I get them?

2.2.1. Free versions

Note: "Free" does not mean "public domain" nor does it necessarily
mean you will never be charged for it. All versions of sed in this
section except the CP/M versions are based on the GNU General
Public License and are "free software" by that standard (for
details, see http://www.gnu.org/philosophy/free-sw.html). This
means you can get the source code and develop it further.

At the URLs listed in this category, sed binaries or source code
can be downloaded and used without fees or license payments.

2.2.1.1. Unix platforms

ssed v3.60
ssed is the version recommended by the FAQ maintainers, since it
shares the same codebase with GNU sed, has the most options, and is
free software (you can get the source). Though earlier versions of
ssed were distributed, sites for these are not being listed.

http://sed.sourceforge.net/grabbag/ssed
http://freshmeat.net/project/sed/

GNU sed v4.0.5
This is the latest official version of GNU sed. It offers in-place
text replacement as an option switch.
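
In GNU sed the in-place switch is -i, optionally followed by a suffix
for a backup copy. The sketch below assumes a sed that supports
-i with an attached suffix (GNU sed v4.0 and later); the file name,
its contents, and the ".bak" suffix are chosen only for illustration,
and mktemp is assumed to be available.

```shell
workdir=$(mktemp -d)
printf 'Glenn\n' > "$workdir/names.txt"

# Edit names.txt itself instead of writing to stdout; the original
# contents are kept in names.txt.bak.
sed -i.bak 's/Glenn/Harold/' "$workdir/names.txt"

edited=$(cat "$workdir/names.txt")
backup=$(cat "$workdir/names.txt.bak")

printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
rm -r "$workdir"
```

Asking for a backup suffix is a cautious habit, since an in-place
edit otherwise leaves no copy of the original file.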

ftp://ftp.gnu.org/pub/gnu/sed/sed-4.0.5.tar.gz
http://freshmeat.net/project/sed

BSD multi-byte sed (Japanese)
Based on the latest version of GNU sed, which supports multi-byte
characters.

ftp://ftp1.freebsd.org/pub/FreeBSD/FreeBSD-stable/packages/Latest/ja-sed.tgz

GNU sed v3.02.80
An alpha test release which was the base for the development of
ssed and GNU sed v4.0.

ftp://alpha.gnu.org/pub/gnu/sed/sed-3.02.80.tar.gz

GNU sed v3.02a
Interim version with most features of GNU sed v3.02.80.

GNU sed v3.02
ftp://ftp.gnu.org/pub/gnu/sed/sed-3.02.tar.gz

Precompiled versions:

GNU sed v3.02-8
source code and binaries for Debian GNU/Linux

http://www.debian.org/Packages/stable/base/sed.html

For some time, the GNU project <http://www.gnu.org> used Eric S.
Raymond's version of sed (ESR sed v1.1), but eventually dropped it
because it had too many built-in limits. In 1991 Howard Helman
modified the GNU/ESR sed and produced a more flexible version, sed
v1.5, available at several sites (Helman's version permitted things
like \<...\> to delimit word boundaries, \xHH to enter hex codes and
\n to indicate newlines in the replace string). This version did
not catch on with the GNU project, and their version of sed has
moved in a similar but different direction.

sed v1.3, by Eric Steven Raymond (released 4 June 1998)
http://catb.org/~esr/sed-1.3.tar.gz

Eric Raymond <esr@snark.thyrsus.com> wrote one of the earliest
versions of sed. On his website <http://www.catb.org/~esr/>, which
also distributes many freeware utilities he has written or worked
on, he describes sed v1.1 this way:

"This is the fast, small sed originally distributed in the GNU
toolkit and still distributed with Minix. The GNU people ditched it
when they built their own sed around an enhanced regex package --
but it's still better for some uses (in particular, faster and less
memory-intensive)." (Version 1.3 fixes an unidentified bug and adds
the L command to hexdump the current pattern space.)

2.2.1.2. OS/2

GNU sed v3.02.80
http://www2s.biglobe.ne.jp/~vtgf3mpr/gnu/sed.htm

GNU sed v3.02
http://hobbes.nmsu.edu/pub/os2/util/file/sed-3_02-r2-bin.zip # binaries
http://hobbes.nmsu.edu/pub/os2/util/file/sed-3_02-r2.zip # source

2.2.1.3. Microsoft Windows (Win3x, Win9x, WinNT, Win2K)

GNU sed v4.0.5
32-bit binaries and docs. Precompiled versions not available (yet).

GNU sed v3.02.80
32-bit binaries and docs, using DJGPP compiler. For details on new
features, see Unix section, above.

http://www.student.northpark.edu/pemente/sed/sed3028a.zip # DOS binaries
ftp://alpha.gnu.org/pub/gnu/sed/sed-3.02.80.tar.gz # source
ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed3028b.zip # binaries
ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed3028d.zip # docs
ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed3028s.zip # source

GNU sed v2.05
32-bit binaries, no docs. Requires 80386 DX (SX will not run) and
must be run in a DOS window or in a full screen DOS session under
Microsoft Windows. Will not run in MS-DOS mode (outside Win/Win95).
We recommend using the latest version of GNU sed.
http://www.simtel.net/pub/win95/prog/gsed205b.zip
ftp://ftp.cdrom.com/pub/simtelnet/win95/prog/gsed205b.zip

GNU sed v1.03
modified by Frank Whaley.

This version was part of the "Virtually UN*X" toolset, hosted by
itribe.net; that website is now closed. Gsed v1.03 supported Win9x
long filenames, as well as hex, decimal, binary, and octal
character representations.

The Cygwin toolkit:
http://www.cygwin.com

Formerly known as "GNU-Win32 tools." According to their home page,
"The Cygwin tools are Win32 ports of the popular GNU development
tools for Windows NT, 95 and 98. They function through the use of
the Cygwin library which provides a UNIX-like API on top of the
Win32 API." The version of sed used is GNU sed v3.02.

Minimalist GNU for Windows (MinGW):
http://www.mingw.org
http://mingw.sourceforge.net

According to their home page, "MinGW ('Minimalist GNU for Windows')
refers to a set of runtime headers, used in building a compiler
system based on the GNU GCC and binutils projects. It compiles and
links code to be run on Win32 platforms ... MinGW uses Microsoft
runtime libraries, distributed with the Windows operating system."
The version of sed used is GNU sed v3.02.

sed v1.5 (a/k/a HHsed), by Howard Helman
Compiled with Mingw32 for 32-bit environments described above. This
version should support Win95 long filenames.
http://www.dbnet.ece.ntua.gr/~george/sed/OLD/sed15.exe
http://www.student.northpark.edu/pemente/sed/sed15exe.zip

2.2.1.4. MS-DOS

sed v1.6 (from HHsed), by Walter Briscoe

This is a forthcoming version, now in beta testing, but with many
new features. It corrects all the bugs in sed v1.5, and adds the
best features of sedmod v1.0 (below). It is available in 16-bit and
32-bit compiled versions for MS-DOS. Sorry, no URLs available yet.

sed v1.5 (a/k/a HHsed), by Howard Helman
uncompiled source code (Turbo C)
ftp://ftp.simtel.net/pub/simtelnet/msdos/txtutl/sed15.zip
ftp://ftp.cdrom.com/pub/simtelnet/msdos/txtutl/sed15.zip

DOS executable and documentation
ftp://ftp.simtel.net/pub/simtelnet/msdos/txtutl/sed15x.zip
ftp://ftp.cdrom.com/pub/simtelnet/msdos/txtutl/sed15x.zip

sedmod v1.0, by Hern Chen
http://www.ptug.org/sed/SEDMOD10.ZIP
http://www.student.northpark.edu/pemente/sed/sedmod10.zip
ftp://garbo.uwasa.fi/pc/unix/sedmod10.zip

GNU sed v3.02.80
See section 2.2.1.3 ("Microsoft Windows"), above.

GNU sed v2.05
Does not run under MS-DOS.

GNU sed v1.18
32-bit binaries and source, using DJGPP compiler. Requires 80386 SX
or better. Also requires 3 CWS*.EXE extenders on the path. See
section 5.5 ("What is CSDPMI*B.ZIP and why do I need it?"), below.
We recommend using a newer version of GNU sed.
http://www.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed118b.zip
ftp://ftp.cdrom.com/pub/simtelnet/gnu/djgpp/v2gnu/sed118b.zip
http://www.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/sed118s.zip
ftp://ftp.cdrom.com/pub/simtelnet/gnu/djgpp/v2gnu/sed118s.zip

GNU sed v1.06
16-bit binaries and source. Should run under any MS-DOS system.
http://www.simtel.net/pub/gnu/gnuish/sed106.zip
ftp://ftp.cdrom.com/pub/simtelnet/gnu/gnuish/sed106.zip

2.2.1.5. CP/M

ssed v2.2, by Chuck A. Forsberg

Written for CP/M, ssed (for "small/stupid stream editor") supports
only the a(ppend), c(hange), d(elete) and i(nsert) options, and
apparently doesn't support regular expressions. A -u switch will
"unsqueeze" compressed files and was used mainly in conjunction
with DIF.COM for source code maintenance. (file: ssed22.lbr)

change, by Michael M. Rubenstein

Rubenstein released a version of sed called CHANGE.COM (the
TTOOLS.LBR archive member CHANGE.CZM is a "crunched" file).
CHANGE.COM supports full RE's except grouping and backreferences,
and its only function is global substitution. (file: ttools.lbr)

2.2.1.6. Macintosh v8 or v9

Since sed is a command-line utility, it is not customary to think
of sed being used on a Mac. Nonetheless, the following instructions
from Aurelio Jargas describe the process for running sed on MacOS
version 8 or 9.

(1) Download and install the Apple DiskCopy application

ftp://ftp.apple.com/developer/Development_Kits

(2) Download and install Apple MPW

ftp://ftp.apple.com/developer/Tool_Chest/Core_Mac_OS_Tools/MPW_etc./

(3) Download and expand Matthias Neeracher's GNU sed for MPW. (They
seem to have misnumbered the sed filename.)

ftp://sunsite.cnlab-switch.ch/software/platform/macos/src/mpw_c/sed-2.03.sit.bin

(4) Enter the sed-3.02 directory and double-click the 'sed' file

(5) MPW Shell will open up. It will be a command window instead of
a command line, but sed should work as expected. For example:

    echo aa | sed 's/a/Z/g'<ENTER>

Note that ENTER is different from RETURN on an iMac. Apple *also*
has its own version of sed on MPW, called "StreamEdit", with a
syntax fairly similar to that of normal sed.
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
2.2.2. Shareware and Commercial versions

2.2.2.1. Unix platforms

[ Additional information needed. ]

2.2.2.2. OS/2

Hamilton Labs:
http://www.hamiltonlabs.com/cshell.htm

A sizable set of Unix/C shell utilities designed for OS/2. Price is
$350 in the US, $395 elsewhere, with FedEx shipping, unconditional
guarantee, unlimited support and free updates. A demo version of
the suite can be downloaded from this site, but a stand-alone copy
of sed is not available.

2.2.2.3. Windows 95/98, Windows NT, Windows 2000

Hamilton Labs:
http://www.hamiltonlabs.com/cshell.htm

A sizable set of Unix/C shell utilities designed for Win9x, WinNT,
and Win2K. Price is $350 in the US, $395 elsewhere, with FedEx
shipping, unconditional guarantee, unlimited support and free
updates. A demo version of the suite can be downloaded from this
site, but a stand-alone copy of sed is not available.

Interix:
http://www.interix.com

Interix (formerly known as OpenNT) is advertised as "a complete
UNIX system environment running natively on Microsoft Windows NT",
and is licensed and supported by Softway Systems. It offers over
200 Unix utilities, and supports Unix shells, sockets, networking,
and more. A single-user edition runs about $200. A free demo or
evaluation copy will run for 31 days and then quit; to continue
using it, you must purchase the commercial version.

MKS NuTCRACKER Professional
http://www.datafocus.com/products/nutc/

A different, yet related product line offered by MKS (Mortice Kern
Systems, below); the awkward spelling "NuTCRACKER" is intentional.
Various packages offer hundreds of Unix utilities for Win32
environments. Sed is not available as a separate product.

UnixDos:
http://www.unixdos.com

UnixDos is a suite of 82 Unix utilities ported over to the Windows
environments. There are 16-bit versions for Win3.x and 32-bit
versions for WinNT/Win95. It is distributed as uncrippled shareware
for the first 30 days. After the test period, the utilities will
not run and you must pay the registration fee of $50.

Their version of sed supports "\n" in the RHS of expressions, and
increases the length of input lines to 10,000 characters. By
special arrangement with the owners, persons who want a licensed
version of sed *only* (without the other utilities) may pay a
license fee of $10.

U/WIN:
http://www.research.att.com/sw/tools/uwin/

U/WIN is a suite of Unix utilities created for WinNT and Win95
systems. It is owned by AT&T, created by David Korn (author of the
Unix Korn shell), and is freely distributed only to educational
institutions, AT&T employees, or certain researchers; all others
must pay a fee after a 90-day evaluation period expires. U/WIN
operates best with the NTFS (WinNT file system) but will run in
degraded mode with the FAT file system and in further degraded mode
under Win95. A minimal installation takes about 25 to 30 megs of
disk space. Sed is not available as a separate file for download,
but comes with the suite.

2.2.2.4. MS-DOS

Mix C/Utilities Toolchest
http://www.mixsoftware.com/product/utility.htm

According to their web page, "The C/Utilities Toolchest adds over
40 powerful UNIX utilities to your MS-DOS operating system. The
result is an environment very similar to UNIX operating systems,
yet 100% compatible with MS-DOS programs and commands." The
toolchest costs $19.95, with source code available for an
additional fee. Mix C's version of sed is not available separately.

MKS (Mortice Kern Systems) Toolkit
http://www.mks.com

Sed comes bundled with the MKS Toolkit, which is distributed only
as commercial software; it is not available separately.

Thompson Automation Software
http://www.tasoft.com

The Thompson Toolkit contains over 100 familiar Unix utilities,
including a version of the Unix Korn shell. It runs under MS-DOS,
OS/2, Win3.x, Win9x, and WinNT. Sed is one of the utilities, though
Thompson is better known for its version of awk for DOS, TAWK. The
toolkit runs about $150; sed is not available separately.

2.3. Where can I learn to use sed?

2.3.1. Books

_Sed & Awk, 2d edition_, by Dale Dougherty & Arnold Robbins
(Sebastopol, Calif: O'Reilly and Associates, 1997)
ISBN 1-56592-225-5
http://www.oreilly.com/catalog/sed2/noframes.html

About 40 percent of this book is devoted to sed, and maybe 50
percent is devoted to awk. The other 10 percent covers regexes and
concepts common to both tools. If you prefer hard copy, this is
definitely the best single place to learn to use sed, including its
advanced features.

The first edition is also very useful. Several typos crept into the
first printing of the first edition (though if you follow the
tutorials closely, you'll recognize them right away). A list of
errors from the first printing of _sed & awk_ is available at
<http://www.cs.colostate.edu/~dzubera/sedawk.txt>, and errors in
the 2nd are at <http://www.cs.colostate.edu/~dzubera/sedawk2.txt>,
though most of these were corrected in later printings. The second
edition tells how POSIX standards have affected these tools and
covers the popular GNU versions of sed and awk. Price is about (US)
$30.00.

-----

_Mastering Regular Expressions, 2d ed.,_ by Jeffrey E. F. Friedl
(Sebastopol, Calif: O'Reilly and Associates, 2002)
ISBN 0-596-00289-0
http://regex.info
http://www.oreilly.com/catalog/regex2/
http://public.yahoo.com/~jfriedl/regex/ (for the first edition)

Knowing how to use "regular expressions" is essential to effective
use of most Unix tools. This book focuses on how regular
expressions can be best implemented in utilities such as perl, vi,
emacs, and awk, but also touches on sed. Friedl's home page
(above) gives links to other sites which help students learn to
master regular expressions. His site also gives a Perl script for
determining a syntactically valid e-mail address, using regexes:

http://public.yahoo.com/~jfriedl/regex/code.html

-----

_Awk und Sed_, by Helmut Herold.
(Bonn: Addison-Wesley, 1994; 288 pages)
2nd edition to be released in March 2003
ISBN 3-8273-2094-1
http://www.addison-wesley.de/main/main.asp?page=home/bookdetails&ProductID=37214

2.3.2. Mailing list

If you are interested in learning more about sed (its syntax, using
regular expressions, etc.) you are welcome to subscribe to a
sed-oriented mailing list. In fact, there are two mailing lists
about sed: one in English named "sed-users", moderated by Sven
Guckes; and one in Portuguese named "sed-BR" (for sed-Brazil),
moderated by Aurelio Marinho Jargas. The average volume of mail for
"sed-users" is about 35 messages a week; the average volume of mail
for "sed-BR" is about 15 messages a week.

sed-BR mailing list: http://br.groups.yahoo.com/group/sed-br/
sed-users mailing list: http://groups.yahoo.com/group/sed-users/

To subscribe to sed-users, send a blank message to:

sed-users-subscribe@yahoogroups.com

To unsubscribe from sed-users, send a blank message to:

sed-users-unsubscribe@yahoogroups.com

2.3.3. Tutorials, electronic text

The original users manual for sed, by Lee E. McMahon, from the
7th edition UNIX Manual (1978), with the classic "Kubla Khan"
example and tutorial, as formatted text:
http://sed.sourceforge.net/grabbag/tutorials/sed_mcmahon.txt

The source code to the preceding manual. Use "troff -ms sed" to
print this file properly:
http://plan9.bell-labs.com/7thEdMan/vol2/sed
http://cm.bell-labs.com/7thEdMan/vol2/sed

"Do It With Sed", by Carlos Duarte
http://www.dbnet.ece.ntua.gr/~george/sed/OLD/sedtut_1.html

"Sed: How to use sed, a special editor for modifying files
automatically", by Bruce Barnett and General Electric Company
http://www.grymoire.com/Unix/Sed.html

U-SEDIT2.ZIP, by Mike Arst (16 June 1990)
ftp://ftp.cs.umu.se/pub/pc/u-sedit2.zip
ftp://ftp.uni-stuttgart.de/pub/systems/msdos/util/unixlike/u-sedit2.zip
ftp://sunsite.icm.edu.pl/vol/wojsyl/garbo/pc/editor/u-sedit2.zip
ftp://ftp.sogang.ac.kr/pub/msdos/garbo_pc/editor/u-sedit2.zip

U-SEDIT3.ZIP, by Mike Arst (24 Jan. 1992)
http://www.student.northpark.edu/pemente/sed/u-sedit3.zip
CompuServe DTPFORUM, "PC DTP Utilities" library, file SEDDOC.ZIP

Another sed FAQ
http://www.dreamwvr.com/sed-info/sed-faq.html

sed-tutorial, by Felix von Leitner
http://www.math.fu-berlin.de/~leitner/sed/tutorial.html

"Manipulating text with sed," chapter 14 of the SCO OpenServer
"Operating System Users Guide"
http://ou800doc.caldera.com/SHL_automate/CTOC-Manipulating_text_with_sed.html

"Combining the Bourne-shell, sed and awk in the UNIX environment
for language analysis," by Lothar Schmitt and Kiel Christianson.
This basic tutorial on the Bourne shell, sed and awk downloads as a
71-page PostScript file (compressed to 290K with gzip). You may
need to navigate down from the root to get the file.
ftp://ftp.u-aizu.ac.jp/u-aizu/doc/Tech-Report/1997/97-2-007.tar.gz
available upon request from Lothar Schmitt <lothar@u-aizu.ac.jp>

2.3.4. General web and ftp sites

http://sed.sourceforge.net/grabbag              # Collected scripts
http://main.rtfiber.com.tw/~changyj/sed/        # Yao-Jen Chang
http://www.math.fu-berlin.de/~guckes/sed/       # Sven Guckes
http://www.math.fu-berlin.de/~leitner/sed/      # Felix von Leitner
http://www.dbnet.ece.ntua.gr/~george/sed/       # Yiorgos Adamopoulos
http://www.student.northpark.edu/pemente/sed/   # Eric Pement

http://spacsun.rice.edu/FAQ/sed.html
ftp://algos.inesc.pt/pub/users/cdua/scripts.tar.gz (sed and shell scripts)

"Handy One-Liners For Sed", compiled by Eric Pement. A large list
of 1-line sed commands which can be executed from the command line.
http://sed.sourceforge.net/sed1line.txt
http://www.student.northpark.edu/pemente/sed/sed1line.txt

"Handy One-Liners For Sed", translated to Portuguese
http://wmaker.lrv.ufsc.br/sed_ptBR.html

The Single UNIX Specification, Version 3 (technical man page)
http://www.opengroup.org/onlinepubs/007904975/utilities/sed.html

Getting started with sed
http://www.cs.hmc.edu/tech_docs/qref/sed.html

masm to gas converter
http://www.delorie.com/djgpp/faq/converting/asm2s-sed.html

mail2html.zip
http://www.crispen.org/src/#mail2html

sample uses of sed in batch files and scripts (Benny Pederson)
http://users.cybercity.dk/~bse26236/batutil/help/SED.HTM

dc.sed - the most complex and impressive sed script ever written.
This sed script by Greg Ubben emulates the Unix dc (desk
calculator), including base conversion, exponentiation, square
roots, and much more.
http://sed.sourceforge.net/grabbag/scripts/dc_overview.htm

If you should find other tutorials or scripts that should be added
to this document, please forward the URLs to the FAQ maintainer.

------------------------------

3. TECHNICAL

3.1. More detailed explanation of basic sed

Sed takes a script of editing commands and applies each command, in
order, to each line of input. After all the commands have been
applied to the first line of input, that line is output. A second
input line is taken for processing, and the cycle repeats. Sed
scripts can address a single line by line number or by matching a
/RE pattern/ on the line. An exclamation mark '!' after a regex
('/RE/!') or line number will select all lines that do NOT match
that address. Sed can also address a range of lines in the same
manner, using a comma to separate the 2 addresses.

     $d               # delete the last line of the file
     /[0-9]\{3\}/p    # print lines with 3 consecutive digits
     5!s/ham/cheese/  # except on line 5, replace 'ham' with 'cheese'
     /awk/!s/aaa/bb/  # unless 'awk' is found, replace 'aaa' with 'bb'
     17,/foo/d        # delete from line 17 through the next 'foo' line

Following an address or address range, sed accepts curly braces
'{...}' so several commands may be applied to that line or to the
lines matched by the address range. On the command line, semicolons
';' separate each instruction and must precede the closing brace.

     sed '/Owner:/{s/yours/mine/g;s/your/my/g;s/you/me/g;}' file

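The addressing forms above can be tried on a small sample. What
follows is only a minimal sketch, assuming a POSIX shell with printf
and any standard sed; the five input lines and the variable names
are made up for illustration and do not come from any real file.

```shell
# Five made-up sample lines (illustrative data only).
input=$(printf '%s\n' 'one ham' 'two ham' 'three awk ham' 'four ham' 'five ham')

# Line-number address with '!': substitute on every line except line 2.
not_line2=$(printf '%s\n' "$input" | sed '2!s/ham/cheese/')

# Regex address with '!': substitute unless the line contains 'awk'.
not_awk=$(printf '%s\n' "$input" | sed '/awk/!s/ham/cheese/')

# Range address: delete from line 2 through the next line matching /four/.
range=$(printf '%s\n' "$input" | sed '2,/four/d')

# Braces: apply two commands only to lines matching /awk/.
braces=$(printf '%s\n' "$input" | sed '/awk/{s/three/3/;s/ham/spam/;}')

printf '%s\n' "$not_line2" '----' "$not_awk" '----' "$range" '----' "$braces"
```

Each result is captured in a variable only so the four outputs can
be compared side by side; in everyday use the sed command would
simply be run against a file.
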
Range addresses operate differently depending on which version of
sed is used (see section 3.4, below). For further information on
using sed, consult the references in section 2.3, above.

3.1.1. Regular expressions on the left side of "s///"

All versions of sed support Basic Regular Expressions (BREs). For
the syntax of BREs, enter "man ed" at a Unix shell prompt. A
technical description of BREs from IEEE POSIX 1003.1-2001 and the
Single UNIX Specification Version 3 is available online at:
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap09.html#tag_09_03

Sed normally supports BREs plus '\n' to match a newline in the
pattern space, plus '\xREx' as equivalent to '/RE/', where 'x' is any
character other than a newline or another backslash.

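The '\xREx' form is most useful when the regex itself contains
slashes. The following is a rough sketch, assuming a sed that
accepts alternate address delimiters (not every version does); the
three path names and the variable names are invented for the
example.

```shell
# Three made-up path names (illustrative data only).
paths=$(printf '%s\n' '/usr/local/bin/sed' '/usr/bin/sed' '/opt/sed')

# Default delimiter: every slash inside the regex must be escaped.
with_slash=$(printf '%s\n' "$paths" | sed -n '/^\/usr\/local\//p')

# Alternate delimiter '%': the slashes need no escaping.
with_percent=$(printf '%s\n' "$paths" | sed -n '\%^/usr/local/%p')

printf '%s\n' "$with_slash" "$with_percent"
```

Both commands select the same line; the second is simply easier to
read.
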
Some versions of sed support supersets of BREs, or "extended
regular expressions", which offer additional metacharacters for
increased flexibility. For additional information on extended REs
in GNU sed, see sections 3.7 ("GNU/POSIX extensions to regular
expressions") and 6.7.3 ("Special syntax in REs"), below.

Though not required by BREs, some versions of sed support \t to
represent a TAB, \r for carriage return, \xHH for direct entry of
hex codes, and so forth. Other versions of sed do not.

ssed (super-sed) introduced many new features for LHS pattern
matching, too many to give here. The complete list is found in
section 6.7.3.H ("ssed"), below.

3.1.2. Escape characters on the right side of "s///"

The right-hand side (the replacement part) in "s/find/replace/" is
almost always a string literal, with no interpolation of these
metacharacters:

     . ^ $ [ ] { } ( ) ? + * |

Three things *are* interpolated: ampersand (&), backreferences, and
options for special seds. An ampersand on the RHS is replaced by
the entire expression matched on the LHS. There is _never_ any
reason to use grouping like this:

     s/\(some-complex-regex\)/one two \1 three/

since you can do this instead:

     s/some-complex-regex/one two & three/

To enter a literal ampersand on the RHS, type '\&'.

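A quick way to see the difference between '&' and '\&' is to run
both on a short string. This is a small sketch only, assuming a
POSIX shell and a standard sed; the sample strings and variable
names are made up.

```shell
# '&' on the RHS stands for the whole text matched by the LHS.
whole=$(echo 'order 66 shipped' | sed 's/[0-9][0-9]*/[&]/')

# '\&' on the RHS inserts a literal ampersand instead.
literal=$(echo 'Tom and Jerry' | sed 's/ and / \& /')

printf '%s\n' "$whole" "$literal"
```

The first command wraps the matched number in brackets; the second
replaces " and " with " & ".
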
Grouping and backreferences: All versions of sed support grouping
and backreferences on the LHS and backreferences only on the RHS.
Grouping allows a series of characters to be collected in a set,
indicating the boundaries of the set with \( and \). Then the set
can be designated to be repeated a certain number of times:

     \(like this\)* or \(like this\)\{5,7\}.

Groups can also be nested "\(like \(this\) is here\)" and may
contain any valid RE. Backreferences repeat the contents of a
particular group, using a backslash and a digit (1-9) for each
corresponding group. In other words, "/\(pom\)\1/" is another way
of writing "/pompom/". If groups are nested, backreference numbers
are counted by matching \( in strict left to right order. Thus, in
/..\(the \(word\)\) \("foo"\)../ the group \("foo"\) is matched by
the backreference \3. Backreferences can be used in the LHS, the
RHS, and in normal RE addressing (see section 3.3). Thus,

     /\(.\)\1\(.\)\2\(.\)\3/;       # matches "bookkeeper"
     /^\(.\)\(.\)\(.\)\3\2\1$/;     # finds 6-letter palindromes

Seds differ in how they treat invalid backreferences where no
corresponding group occurs. To insert a literal ampersand or
backslash into the RHS, prefix it with a backslash: \& or \\.

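The numbering of groups is easier to follow with concrete input.
The lines below are a minimal sketch, assuming a POSIX shell and a
standard sed; the sample words and the variable names are chosen
only for illustration.

```shell
# Two groups on the LHS, swapped on the RHS with \2 and \1.
swapped=$(echo 'hello world' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

# Backreferences inside an address: three doubled letters in a row.
doubled=$(printf '%s\n' 'bookkeeper' 'banana' | sed -n '/\(.\)\1\(.\)\2\(.\)\3/p')

# Nested groups are numbered by the position of each opening \( .
nested=$(echo 'the word "foo"' | sed 's/\(the \(word\)\) \("foo"\)/[\1][\2][\3]/')

printf '%s\n' "$swapped" "$doubled" "$nested"
```

In the last command, \1 is the outer group, \2 the group nested
inside it, and \3 the group that follows.
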
ssed, sed16, and sedmod permit additional options on the RHS. They
all support changing part of the replacement string to upper case
(\u or \U), lower case (\l or \L), or to end case conversion (\E).
Both sed16 and sedmod support awk-style word references ($1, $2,
$3, ...) and $0 to insert the entire line before conversion.

     echo ab ghi | sed16 "s/.*/$0 - \U$2/" # prints "ab ghi - GHI"

*Note:* This feature of sed16 and sedmod will break sed scripts which
put a dollar sign and digit into the RHS. Though this is an unlikely
combination, it's worth remembering if you use other people's scripts.

3.1.3. Substitution switches

Standard versions of sed support 4 main flags or switches which may
be added to the end of an "s///" command. They are:

     N      - Replace the Nth match of the pattern on the LHS, where
              N is an integer between 1 and 512. If N is omitted,
              the default is to replace the first match only.
     g      - Global replace of all matches to the pattern.
     p      - Print the results to stdout, even if -n switch is used.
     w file - Write the pattern space to 'file' if a replacement was
              done. If the file already exists when the script is
              executed, it is overwritten. During script execution,
              w appends to the file for each match.

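The first three flags can be compared on a single line of input.
This is a small sketch, assuming a POSIX shell and a standard sed;
the sample text and the variable names are made up for the example.

```shell
line='foo foo foo'

# No flag: only the first match is replaced.
first=$(echo "$line" | sed 's/foo/bar/')

# Numeric flag: only the 2nd match is replaced.
second=$(echo "$line" | sed 's/foo/bar/2')

# g flag: every match is replaced.
all=$(echo "$line" | sed 's/foo/bar/g')

# p flag with -n: print only the lines where a replacement was made.
changed=$(printf '%s\n' 'foo' 'baz' | sed -n 's/foo/bar/p')

printf '%s\n' "$first" "$second" "$all" "$changed"
```
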
GNU sed 3.02 and ssed also offer the /I switch for doing a
case-insensitive match. For example,

     echo ONE TWO | gsed "s/one/unos/I" # prints "unos TWO"

GNU sed 4.x and ssed add the /M switch, to simplify working with
multi-line patterns: when it is used, ^ and $ match at the beginning
and end of each line (BOL and EOL) within the pattern space.
\` and \' remain available to match the start and end of pattern
space, respectively.

ssed supports two more switches, /S and /X, when its Perl mode is
used. They are described in detail in section 6.7.3.H, below.

3.1.4. Command-line switches

All versions of sed support two switches, -e and -n. Though sed
usually separates multiple commands with semicolons (e.g., "H;d;"),
in many versions of sed certain commands cannot be followed by a
semicolon command separator. These include :labels, 't', and 'b'.
Such a command must occur last in its part of the script, with the
parts separated by -e option switches. For example:

     # The 'ta' means jump to label :a if last s/// returns true
     sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D' file

The -n switch turns off sed's default behavior of printing every
line. With -n, lines are printed only if explicitly told to. In
addition, for certain versions of sed, if an external script begins
with "#n" as its first two characters, the output is suppressed
(exactly as if -n had been entered on the command line). A list of
which versions appears in section 6.7.2., below.

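Both switches can be seen at work on a few lines of input. The
following is a minimal sketch, assuming a POSIX shell and a standard
sed; the sample lines and the variable names are invented for
illustration.

```shell
text=$(printf '%s\n' 'alpha' 'beta' 'gamma')

# -n suppresses automatic printing, so only the explicit 'p' prints.
second=$(printf '%s\n' "$text" | sed -n '2p')

# Several -e switches are joined into one script, in order.
both=$(printf '%s\n' "$text" | sed -e 's/alpha/ALPHA/' -e 's/gamma/GAMMA/')

# The label and the 't' command each end their own -e part:
# a line beginning with '=' is joined to the line before it.
joined=$(printf '%s\n' 'a' '=b' 'c' | sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D')

printf '%s\n' "$second" '----' "$both" '----' "$joined"
```
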
GNU sed 4.x and ssed support additional switches. -l (lowercase L),
followed by a number, lets you adjust the default line-wrap length
for the 'l' and 'L' commands (note that these implementations of sed
also support an argument to these commands, to tailor the length
separately for each occurrence of the command).

-i activates in-place editing (see section 4.41.1, below). -s
treats each file as a separate stream: sed by default joins all the
files, so $ represents the last line of the last file; 15 means the
15th line in the joined stream; and /abc/,/def/ might match across
files.

When -s is used, however, all addresses refer to single files. For
example, $ represents the last line of each input file; 15 means
the 15th line of each input file; and /abc/,/def/ will be "reset"
(in other words, sed will not execute the commands and start
looking for /abc/ again) if a file ends before /def/ has been
matched. Note that -i automatically activates this interpretation
of addresses.

3.2. Common one-line sed scripts

A separate document of over 70 handy "one-line" sed commands is
available at
http://sed.sourceforge.net/sed1line.txt

Here are several common sed commands for one-line use. MS-DOS users
should replace single quotes ('...') with double quotes ("...") in
these examples. A specific filename usually follows the script,
though the input may also come via piping or redirection.

     # Double space a file
     sed G file

     # Triple space a file
     sed 'G;G' file

     # Under UNIX: convert DOS newlines (CR/LF) to Unix format
     sed 's/.$//' file    # assumes that all lines end with CR/LF
     sed 's/^M$//' file   # in bash/tcsh, press Ctrl-V then Ctrl-M

     # Under DOS: convert Unix newlines (LF) to DOS format
     sed 's/$//' file     # method 1
     sed -n p file        # method 2

     # Delete leading whitespace (spaces/tabs) from front of each line
     # (this aligns all text flush left). '^t' represents a true tab
     # character. Under bash or tcsh, press Ctrl-V then Ctrl-I.
     sed 's/^[ ^t]*//' file

     # Delete trailing whitespace (spaces/tabs) from end of each line
     sed 's/[ ^t]*$//' file                 # see note on '^t', above

     # Delete BOTH leading and trailing whitespace from each line
     sed 's/^[ ^t]*//;s/[ ^t]*$//' file     # see note on '^t', above

     # Substitute "foo" with "bar" on each line
     sed 's/foo/bar/' file    # replaces only 1st instance in a line
     sed 's/foo/bar/4' file   # replaces only 4th instance in a line
     sed 's/foo/bar/g' file   # replaces ALL instances within a line

     # Substitute "foo" with "bar" ONLY for lines which contain "baz"
     sed '/baz/s/foo/bar/g' file

     # Delete all CONSECUTIVE blank lines from file except the first.
     # Each method also trims blank lines at the top or end of file,
     # as noted. (emulates "cat -s")
     sed '/./,/^$/!d' file    # this allows 0 blanks at top, 1 at EOF
     sed '/^$/N;/\n$/D' file  # this allows 1 blank at top, 0 at EOF

     # Delete all leading blank lines at top of file (only).
     sed '/./,$!d' file

     # Delete all trailing blank lines at end of file (only).
     sed -e :a -e '/^\n*$/{$d;N;};/\n$/ba' file

     # If a line ends with a backslash, join the next line to it.
     sed -e :a -e '/\\$/N; s/\\\n//; ta' file

     # If a line begins with an equal sign, append it to the previous
     # line (and replace the "=" with a single space).
     sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D' file

3.3. Addressing and address ranges

Sed commands may have an optional "address" or "address range"
prefix. If there is no address or address range given, then the
command is applied to all the lines of the input file or text
stream. Three commands cannot take an address prefix:

   - labels, used to branch or jump within the script
   - the close brace, '}', which ends the '{' "command"
   - the '#' comment character, also technically a "command"

An address can be a line number (such as 1, 5, 37, etc.), a regular
expression (written in the form /RE/ or \xREx where 'x' is any
character other than '\' and RE is the regular expression), or the
dollar sign ($), representing the last line of the file. An
exclamation mark (!) after an address or address range will apply
the command to every line EXCEPT the ones named by the address. A
null regex ("//") will be replaced by the last regex which was
used. Also, some seds do not support \xREx as regex delimiters.

     5d               # delete line 5 only
     5!d              # delete every line except line 5
     /RE/s/LHS/RHS/g  # substitute only if RE occurs on the line
     /^$/b label      # if the line is blank, branch to ':label'
     /./!b label      # ... another way to write the same command
     \%.%!b label     # ... yet another way to write this command
     $!N              # on all lines but the last, get the Next line

Note that an embedded newline can be represented in an address by
the symbol \n, but this syntax is needed only if the script puts 2
or more lines into the pattern space via the N, G, or other
commands. The \n symbol does *not* match the newline at an
end-of-line because when sed reads each line into the pattern space
for processing, it strips off the trailing newline, processes the
line, and adds a newline back when printing the line to standard
output. To match the end-of-line, use the '$' metacharacter, as
follows:

     /tape$/       # matches the word 'tape' at the end of a line
     /tape$deck/   # matches the word 'tape$deck' with a literal '$'
     /tape\ndeck/  # matches 'tape' and 'deck' with a newline between

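The three patterns above can be checked against short inputs. This
is only a sketch, assuming a POSIX shell and a sed that accepts \n
in an address; the sample lines and the variable names are made up.

```shell
# '$' at the end of the regex anchors to the end of the pattern space.
ends=$(printf '%s\n' 'video tape' 'tape deck' | sed -n '/tape$/p')

# '$' in the middle of a BRE is an ordinary character.
dollar=$(printf '%s\n' 'tape$deck' 'tape deck' | sed -n '/tape$deck/p')

# \n matches only after N has pulled a second line into pattern space.
joined=$(printf '%s\n' 'tape' 'deck' | sed -n '$!N;/tape\ndeck/p')

printf '%s\n' "$ends" '----' "$dollar" '----' "$joined"
```

Without the '$!N' in the last command, the pattern space would hold
only one line at a time and /tape\ndeck/ would never match.
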
The following sed commands usually accept *only* a single address.
|
|
 |
1bfe7ce |
All other commands (except labels, '}', and '#') accept both single
|
|
 |
1bfe7ce |
addresses and address ranges.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
=       print to stdout the line number of the current line
a       after printing the current line, append "text" to stdout
i       before printing the current line, insert "text" to stdout
q       quit after the current line is matched
r file  prints contents of "file" to stdout after line is matched
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Note that we said "usually." If you need to apply the '=', 'a',
|
|
 |
1bfe7ce |
'i', or 'r' commands to each and every line within an address
|
|
 |
1bfe7ce |
range, this behavior can be coerced by the use of braces. Thus,
|
|
 |
1bfe7ce |
"1,9=" is an invalid command, but "1,9{=;}" will print each line
|
|
 |
1bfe7ce |
number followed by its line for the first 9 lines (and then print
|
|
 |
1bfe7ce |
the rest of the file normally).
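As a small illustration of the brace form, the sketch below numbers only the first two lines of a made-up three-line input. The input and the variable name are assumptions for the example.

```shell
# '=' inside braces is applied to every line of the range 1,2;
# line 3 falls outside the range and is printed without a number.
numbered=$(printf '%s\n' a b c | sed '1,2{=;}')
printf '%s\n' "$numbered"
```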
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Address ranges occur in the form
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
<address1>,<address2> or <address1>,<address2>!
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
where the address can be a line number or a standard /regex/.
|
|
 |
1bfe7ce |
<address2> can also be a dollar sign, indicating the end of file.
|
|
 |
1bfe7ce |
Under GNU sed 3.02+, ssed, and sed15+, <address2> may also be a
|
|
 |
1bfe7ce |
notation of the form +num, indicating the next _num_ lines after
|
|
 |
1bfe7ce |
<address1> is matched.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Address ranges are:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(1) Inclusive. The range "/From here/,/eternity/" matches all the
|
|
 |
1bfe7ce |
lines containing "From here" up to and including the line
|
|
 |
1bfe7ce |
containing "eternity". It will not stop on the line just prior to
|
|
 |
1bfe7ce |
"eternity". (If you don't like this, see section 4.24.)
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(2) Plenary. They always match full lines, not just parts of lines.
|
|
 |
1bfe7ce |
In other words, a command to change or delete an address range will
|
|
 |
1bfe7ce |
change or delete whole lines; it won't stop in the middle of a
|
|
 |
1bfe7ce |
line.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(3) Multi-linear. Address ranges normally match 2 lines or more.
|
|
 |
1bfe7ce |
The second address will never match the same line the first address
|
|
 |
1bfe7ce |
did; therefore a valid address range always spans at least two
|
|
 |
1bfe7ce |
lines, with these exceptions which match only one line:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
- if the first address matches the last line of the file
|
|
 |
1bfe7ce |
- if using the syntax "/RE/,3" and /RE/ occurs only once in the
|
|
 |
1bfe7ce |
file at line 3 or below
|
|
 |
1bfe7ce |
- if using HHsed v1.5. See section 3.4.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(4) Minimalist. In address ranges with /regex/ as <address2>, the
|
|
 |
1bfe7ce |
range "/foo/,/bar/" will stop at the first "bar" it finds, provided
|
|
 |
1bfe7ce |
that "bar" occurs on a line below "foo". If the word "bar" occurs
|
|
 |
1bfe7ce |
on several lines below the word "foo", the range will match all the
|
|
 |
1bfe7ce |
lines from the first "foo" up to the first "bar". It will not
|
|
 |
1bfe7ce |
continue hopping ahead to find more "bar"s. In other words, address
|
|
 |
1bfe7ce |
ranges, unlike regular expressions, are not "greedy."
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(5) Repeating. An address range will try to match more than one
|
|
 |
1bfe7ce |
block of lines in a file. However, the blocks cannot nest. In
|
|
 |
1bfe7ce |
addition, a second match will not "take" the last line of the
|
|
 |
1bfe7ce |
previous block. For example, given the following text,
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
start
|
|
 |
1bfe7ce |
stop start
|
|
 |
1bfe7ce |
stop
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
the sed command '/start/,/stop/d' will only delete the first two
|
|
 |
1bfe7ce |
lines. It will not delete all 3 lines.
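The three-line text above can be fed to sed directly to confirm this. The variable name is made up for the sketch.

```shell
# Line 2 ("stop start") closes the first block, so its "start" cannot
# open a second block; line 3 is therefore left alone.
remaining=$(printf '%s\n' 'start' 'stop start' 'stop' | sed '/start/,/stop/d')
printf '%s\n' "$remaining"
```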
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(6) Relentless. If the address range finds a "start" match but
|
|
 |
1bfe7ce |
doesn't find a "stop", it will match every line from "start" to the
|
|
 |
1bfe7ce |
end of the file. Thus, beware of the following behaviors:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/RE1/,/RE2/ # If /RE2/ is not found, matches from /RE1/ to the
|
|
 |
1bfe7ce |
# end-of-file.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
20,/RE/ # If /RE/ is not found, matches from line 20 to the
|
|
 |
1bfe7ce |
# end-of-file.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/RE/,30 # If /RE/ occurs any time after line 30, each
|
|
 |
1bfe7ce |
# occurrence will be matched in sed15+, sedmod, and
|
|
 |
1bfe7ce |
# GNU sed v3.02+. GNU sed v2.05 and 1.18 will match
|
|
 |
1bfe7ce |
# from the 2nd occurrence of /RE/ to the end-of-file.
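Two of these cases are easy to reproduce. The sketch below uses invented sample input and assumes a sed that follows the POSIX rule for a numeric second address (as GNU sed 3.02 and later do).

```shell
# No line matches /zzz/, so the range runs from "b" to end-of-file.
open_ended=$(printf '%s\n' a b c d | sed -n '/b/,/zzz/p')

# /c/ first matches on line 3, which is already past line 2, so each
# occurrence of /c/ is matched as a one-line range.
past_limit=$(printf '%s\n' a b c d c | sed -n '/c/,2p')

printf '%s\n' "$open_ended" "---" "$past_limit"
```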
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
If these behaviors seem strange, remember that they occur because
|
|
 |
1bfe7ce |
sed does not look "ahead" in the file. Doing so would stop sed from
|
|
 |
1bfe7ce |
being a stream editor and have adverse effects on its efficiency.
|
|
 |
1bfe7ce |
If these behaviors are undesirable, they can be circumvented or
|
|
 |
1bfe7ce |
corrected by the use of nested testing within braces. The following
|
|
 |
1bfe7ce |
scripts work under GNU sed 3.02:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# Execute your_commands on range "/RE1/,/RE2/", but if /RE2/ is
|
|
 |
1bfe7ce |
# not found, do nothing.
|
|
 |
1bfe7ce |
/RE1/{:a;N;/RE2/!ba;your_commands;}
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# Execute your_commands on range "20,/RE/", but if /RE/ is not
|
|
 |
1bfe7ce |
# found, do nothing.
|
|
 |
1bfe7ce |
20{:a;N;/RE/!ba;your_commands;}
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
As a side note, once we've used N to "slurp" lines together to test
|
|
 |
1bfe7ce |
for the ending expression, the pattern space will have gathered
|
|
 |
1bfe7ce |
many lines (possibly thousands) together and concatenated them as a
|
|
 |
1bfe7ce |
single expression, with the \n sequence marking line breaks. The
|
|
 |
1bfe7ce |
REs *within* the pattern space may have to be modified (e.g., you
|
|
 |
1bfe7ce |
must write '/\nStart/' instead of '/^Start/' and '/[^\n]*/' instead
|
|
 |
1bfe7ce |
of '/.*/') and other standard sed commands will be unavailable or
|
|
 |
1bfe7ce |
difficult to use.
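Here is a sketch of the first script with a concrete command in place of your_commands. The markers BEGIN and END, the sample input, and the choice of 's/\n/ /g' are all assumptions made for illustration; GNU sed is assumed, as for the scripts above.

```shell
# Gather lines from BEGIN through END, then join them with spaces.
script='/BEGIN/{:a;N;/END/!ba;s/\n/ /g;}'

# END is present: the block is gathered and the command runs on it.
closed=$(printf '%s\n' x BEGIN y END z | sed "$script")

# END is absent: the loop runs out of input and the command never runs.
# (GNU sed prints the gathered lines unchanged when N hits end-of-file.)
unclosed=$(printf '%s\n' x BEGIN y | sed "$script")

printf '%s\n' "$closed" "---" "$unclosed"
```

Note that the substitution had to use \n, since by that point the whole block is one pattern space.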
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# Execute your_commands on range "/RE/,30", but if /RE/ occurs
|
|
 |
1bfe7ce |
# on line 31 or later, do not match it.
|
|
 |
1bfe7ce |
1,30{/RE/,$ your_commands;}
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
For related suggestions on using address ranges, see sections 4.2,
|
|
 |
1bfe7ce |
4.15, and 4.19 of this FAQ. Also, note the following section.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
3.4. Address ranges in GNU sed and HHsed
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(1) GNU sed 3.02+, ssed, and sed15+ all support address ranges like:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/regex/,+5
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
which match /regex/ plus the next 5 lines (or EOF, whichever comes
|
|
 |
1bfe7ce |
first).
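For instance, with a made-up six-line input and GNU sed assumed (the +num form is not available in every sed):

```shell
# Print the line matching /2/ plus the next 2 lines.
block=$(printf '%s\n' 1 2 3 4 5 6 | sed -n '/2/,+2p')
printf '%s\n' "$block"
```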
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(2) GNU sed v3.02.80 (and above) and ssed support address ranges of:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
0,/regex/
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
as a special case to permit matching /regex/ if it occurs on the
|
|
 |
1bfe7ce |
first line. This syntax permits a range expression that matches
|
|
 |
1bfe7ce |
every line from the top of the file to the first instance of
|
|
 |
1bfe7ce |
/regex/, even if /regex/ is on the first line.
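The difference from "1,/regex/" shows up only when the regex matches line 1, as in this sketch. The sample input and variable names are invented, and a recent GNU sed is assumed.

```shell
input=$(printf '%s\n' foo bar foo)

# 0,/foo/ may end on line 1, so only the first "foo" is changed.
from_zero=$(printf '%s\n' "$input" | sed '0,/foo/s/foo/X/')

# 1,/foo/ starts on line 1 and looks for its end from line 2 onward,
# so the range runs through line 3 and both "foo" lines are changed.
from_one=$(printf '%s\n' "$input" | sed '1,/foo/s/foo/X/')

printf '%s\n' "$from_zero" "---" "$from_one"
```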
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(3) HHsed (sed15) has an exceptional way of implementing
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/regex1/,/regex2/
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
If /regex1/ and /regex2/ both occur on the *same* line, HHsed will match
|
|
 |
1bfe7ce |
that single line. In other words, an address range block can
|
|
 |
1bfe7ce |
consist of just one line. HHsed will then look for the next
|
|
 |
1bfe7ce |
occurrence of /regex1/ to begin the block again.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Every other version of sed (including sed16) requires 2 lines to
|
|
 |
1bfe7ce |
match an address range, and thus /regex1/ and /regex2/ cannot
|
|
 |
1bfe7ce |
successfully match just one line. See also the comments at
|
|
 |
1bfe7ce |
section 7.9.4, below.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(4) BEGIN~STEP selection: ssed and GNU sed (v2.05 and above) offer
|
|
 |
1bfe7ce |
a form of addressing called "BEGIN~STEP selection". This is *not* a
|
|
 |
1bfe7ce |
range address, which selects an inclusive block of consecutive
|
|
 |
1bfe7ce |
lines from /start/ to /finish/, but it is related enough to belong here.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Given an expression of the form "M~N", where M and N are integers,
|
|
 |
1bfe7ce |
GNU sed and ssed will select every Nth line, beginning at line M.
|
|
 |
1bfe7ce |
(With gsed v2.05, M had to be less than N, but this restriction is
|
|
 |
1bfe7ce |
no longer necessary). Both M and N may equal 0 ("0~0" selects every
|
|
 |
1bfe7ce |
line). These examples illustrate the syntax:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed '1~3d' file # delete every 3rd line, starting with line 1
|
|
 |
1bfe7ce |
# deletes lines 1, 4, 7, 10, 13, 16, ...
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed '0~3d' file # deletes lines 3, 6, 9, 12, 15, 18, ...
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed -n '2~5p' file # print every 5th line, starting with line 2
|
|
 |
1bfe7ce |
# prints lines 2, 7, 12, 17, 22, 27, ...
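These can be tried on a ten-line input, as in the sketch below. GNU sed is assumed, since M~N is a GNU/ssed extension; the input and variable names are made up.

```shell
input=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# Every 5th line starting with line 2.
every_fifth=$(printf '%s\n' "$input" | sed -n '2~5p')

# Delete every 3rd line (3, 6, 9).
thirds_removed=$(printf '%s\n' "$input" | sed '0~3d')

printf '%s\n' "$every_fifth" "---" "$thirds_removed"
```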
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(5) Finally, GNU sed v2.05 has a bug in range addressing (see
|
|
 |
1bfe7ce |
section 7.5), which was fixed in later versions.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
3.5. Debugging sed scripts
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
The following two debuggers should make it easier to understand how
|
|
 |
1bfe7ce |
sed scripts operate. They can save hours of grief when trying to
|
|
 |
1bfe7ce |
determine the problems with a sed script.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(1) sd (sed debugger), by Brian Hiles
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
This debugger runs under a Unix shell, is powerful, and is easy to
|
|
 |
1bfe7ce |
use. sd has conditional breakpoints and spypoints of the pattern
|
|
 |
1bfe7ce |
space and hold space, on any scope defined by regex match and/or
|
|
 |
1bfe7ce |
script line number. It can be semi-automated, can save diagnostic
|
|
 |
1bfe7ce |
reports, and shows potential problems with a sed script before it
|
|
 |
1bfe7ce |
tries to execute it. The script is robust and requires the Unix
|
|
 |
1bfe7ce |
shell utilities plus the Bourne shell or Korn shell to execute.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
http://sed.sourceforge.net/grabbag/scripts/sd.ksh.txt (2003)
|
|
 |
1bfe7ce |
http://sed.sourceforge.net/grabbag/scripts/sd.sh.txt (1998)
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(2) sedsed, by Aurelio Jargas
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
This debugger requires Python to run it, and it uses your own
|
|
 |
1bfe7ce |
version of sed, whatever that may be. It displays the current input
|
|
 |
1bfe7ce |
line, the pattern space, and the hold space, before and after each
|
|
 |
1bfe7ce |
sed command is executed.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
http://sedsed.sourceforge.net
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
3.6. Notes about s2p, the sed-to-perl translator
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
s2p (sed to perl) is a Perl program to convert sed scripts into the
|
|
 |
1bfe7ce |
Perl programming language; it is included with many versions of
|
|
 |
1bfe7ce |
Perl. These problems have been found when using s2p:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(1) Doesn't recognize the semicolon properly after s/// commands.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
s/foo/bar/g;
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(2) Doesn't trim trailing whitespace after s/// commands. Even lone
|
|
 |
1bfe7ce |
trailing spaces, without comments, produce an error.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(3) Doesn't handle multiple commands within braces. E.g.,
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
1,4{=;G;}
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
will produce perl code that omits the "G" command. In fact, any
commands after the first one are missing from the perl output
script, which will also contain mismatched braces.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
3.7. GNU/POSIX extensions to regular expressions
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
GNU sed supports "character classes" in addition to regular
|
|
 |
1bfe7ce |
character sets, such as [0-9A-F]. Like regular character sets,
|
|
 |
1bfe7ce |
character classes represent any single character within a set.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
"Character classes are a new feature introduced in the POSIX
|
|
 |
1bfe7ce |
standard. A character class is a special notation for describing
|
|
 |
1bfe7ce |
lists of characters that have a specific attribute, but where the
|
|
 |
1bfe7ce |
actual characters themselves can vary from country to country
|
|
 |
1bfe7ce |
and/or from character set to character set. For example, the notion
|
|
 |
1bfe7ce |
of what is an alphabetic character differs in the USA and in
|
|
 |
1bfe7ce |
France." [quoted from the docs for GNU awk v3.1.0.]
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Though character classes don't generally conserve space on the
|
|
 |
1bfe7ce |
line, they help make scripts portable for international use. The
|
|
 |
1bfe7ce |
equivalent character sets _for U.S. users_ follow:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
[[:alnum:]]  - [A-Za-z0-9]        Alphanumeric characters
[[:alpha:]]  - [A-Za-z]           Alphabetic characters
[[:blank:]]  - [ \x09]            Space or tab characters only
[[:cntrl:]]  - [\x00-\x1F\x7F]    Control characters
[[:digit:]]  - [0-9]              Numeric characters
[[:graph:]]  - [!-~]              Printable and visible characters
[[:lower:]]  - [a-z]              Lower-case alphabetic characters
[[:print:]]  - [ -~]              Printable (non-Control) characters
[[:punct:]]  - [!-/:-@[-`{-~]     Punctuation characters
[[:space:]]  - [ \t\n\v\f\r]      All whitespace chars
[[:upper:]]  - [A-Z]              Upper-case alphabetic characters
[[:xdigit:]] - [0-9a-fA-F]        Hexadecimal digit characters
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Note that [[:graph:]] does not match the space " ", but [[:print:]]
|
|
 |
1bfe7ce |
does. Some character classes may (or may not) match characters in
|
|
 |
1bfe7ce |
the high ASCII range (ASCII 128-255 or 0x80-0xFF), depending on
|
|
 |
1bfe7ce |
which C library was used to compile sed. For non-English languages,
|
|
 |
1bfe7ce |
[[:alpha:]] and other classes may also match high ASCII characters.
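A brief sketch of character classes in use follows. The sample strings and variable names are made up; note that the class name goes inside an outer pair of brackets.

```shell
# Replace each digit with '#'.
digits=$(printf '%s\n' 'ab 12' | sed 's/[[:digit:]]/#/g')

# Squeeze each run of spaces and tabs into a single underscore.
squeezed=$(printf 'a \t b\n' | sed 's/[[:space:]][[:space:]]*/_/g')

printf '%s\n' "$digits" "$squeezed"
```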
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
------------------------------
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
4. EXAMPLES
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
ONE-CHARACTER QUESTIONS
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
4.1. How do I insert a newline into the RHS of a substitution?
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Several versions of sed permit '\n' to be typed directly into the
|
|
 |
1bfe7ce |
RHS, which is then converted to a newline on output: ssed,
|
|
 |
1bfe7ce |
gsed302a+, gsed103 (with the -x switch), sed15+, sedmod, and
|
|
 |
1bfe7ce |
UnixDOS sed. The _easiest_ solution is to use one of these
|
|
 |
1bfe7ce |
versions.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
For other versions of sed, try one of the following:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(a) If typing the sed script from a Bourne shell, use one backslash
|
|
 |
1bfe7ce |
"\" if the script uses 'single quotes' or two backslashes "\\" if
|
|
 |
1bfe7ce |
the script requires "double quotes". In the example below, note
|
|
 |
1bfe7ce |
that the leading '>' on the 2nd line is generated by the shell to
|
|
 |
1bfe7ce |
prompt the user for more input. The user types in slash,
|
|
 |
1bfe7ce |
single-quote, and then ENTER to terminate the command:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
[sh-prompt]$ echo twolines | sed 's/two/& new\
|
|
 |
1bfe7ce |
>/'
|
|
 |
1bfe7ce |
two new
|
|
 |
1bfe7ce |
lines
|
|
 |
1bfe7ce |
[sh-prompt]$
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(b) Use a script file with one backslash '\' in the script,
|
|
 |
1bfe7ce |
immediately followed by a newline. This will embed a newline into
|
|
 |
1bfe7ce |
the "replace" portion. Example:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed -f newline.sed files
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# newline.sed
|
|
 |
1bfe7ce |
s/twolines/two new\
|
|
 |
1bfe7ce |
lines/g
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Some versions of sed may not need the trailing backslash. If so,
|
|
 |
1bfe7ce |
remove it.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(c) Insert an unused character and pipe the output through tr:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
echo twolines | sed 's/two/& new=/' | tr "=" "\n" # produces
|
|
 |
1bfe7ce |
two new
|
|
 |
1bfe7ce |
lines
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(d) Use the "G" command:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
G appends a newline, plus the contents of the hold space to the end
|
|
 |
1bfe7ce |
of the pattern space. If the hold space is empty, a newline is
|
|
 |
1bfe7ce |
appended anyway. The newline is stored in the pattern space as "\n"
|
|
 |
1bfe7ce |
where it can be addressed by grouping "\(...\)" and moved in the
|
|
 |
1bfe7ce |
RHS. Thus, to change the "twolines" example used earlier, the
|
|
 |
1bfe7ce |
following script will work:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed '/twolines/{G;s/\(two\)\(lines\)\(\n\)/\1\3\2/;}'
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(e) Inserting full lines, not breaking lines up:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
If one is not *changing* lines but only inserting complete lines
|
|
 |
1bfe7ce |
before or after a pattern, the procedure is much easier. Use the
|
|
 |
1bfe7ce |
"i" (insert) or "a" (append) command, making the alterations by an
|
|
 |
1bfe7ce |
external script. To insert "This line is new" BEFORE each line
|
|
 |
1bfe7ce |
matching a regex:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/RE/i This line is new # HHsed, sedmod, gsed 3.02a
|
|
 |
1bfe7ce |
/RE/{x;s/$/This line is new/;G;} # other seds
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
The two examples above are intended as "one-line" commands entered
|
|
 |
1bfe7ce |
from the console. If using a sed script, "i\" immediately followed
|
|
 |
1bfe7ce |
by a literal newline will work on all versions of sed. Furthermore,
|
|
 |
1bfe7ce |
the command "s/$/This line is new/" will only work if the hold
|
|
 |
1bfe7ce |
space is already empty (which it is by default).
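The portable "i\" form can also be typed at a Bourne-type shell prompt if the script is single-quoted, as in this sketch. The two-line sample input and the variable name are invented for the example.

```shell
# "i\" followed by a literal newline, then the text to insert.
inserted=$(printf '%s\n' one two | sed '/two/i\
This line is new')
printf '%s\n' "$inserted"
```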
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
To append "This line is new" AFTER each line matching a regex:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/RE/a This line is new # HHsed, sedmod, gsed 3.02a
|
|
 |
1bfe7ce |
/RE/{G;s/$/This line is new/;} # other seds
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
To append 2 blank lines after each line matching a regex:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/RE/{G;G;} # assumes the hold space is empty
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
To replace each line matching a regex with 5 blank lines:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/RE/{s/.*//;G;G;G;G;} # assumes the hold space is empty
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(f) Use the "y///" command if possible:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
On some Unix versions of sed (not GNU sed!), though the s///
|
|
 |
1bfe7ce |
command won't accept '\n' in the RHS, the y/// command does. If
|
|
 |
1bfe7ce |
your Unix sed supports it, a newline after "aaa" can be inserted
|
|
 |
1bfe7ce |
this way (which is not portable to GNU sed or other seds):
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
s/aaa/&~/; y/~/\n/; # assuming no other '~' is on the line!
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
4.2. How do I represent control-codes or nonprintable characters?
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Several versions of sed support the notation \xHH, where "HH" are
|
|
 |
1bfe7ce |
two hex digits, 00-FF: ssed, GNU sed v3.02.80 and above, GNU sed
|
|
 |
1bfe7ce |
v1.03, sed16 and sed15 (HHsed). Try to use one of those versions.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Sed is not intended to process binary or object code, and files
|
|
 |
1bfe7ce |
which contain nulls (0x00) will usually generate errors in most
|
|
 |
1bfe7ce |
versions of sed. The latest versions of GNU sed and ssed are an
|
|
 |
1bfe7ce |
exception; they permit nulls in the input files and also in
|
|
 |
1bfe7ce |
regexes.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
On Unix platforms, the 'echo' command may allow insertion of octal
|
|
 |
1bfe7ce |
or hex values, e.g., `echo "\0nnn"` or `echo -e "\0nnn"`. The echo
|
|
 |
1bfe7ce |
command may also support syntax like '\\b' or '\\t' for backspace
|
|
 |
1bfe7ce |
or tab characters. Check the man pages to see what syntax your
|
|
 |
1bfe7ce |
version of echo supports. Some versions support the following:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# replace 0x1A (32 octal) with ASCII letters
|
|
 |
1bfe7ce |
sed 's/'`echo "\032"`'/Ctrl-Z/g'
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# note the 3 backslashes in the command below
|
|
 |
1bfe7ce |
sed "s/.`echo \\\b`//g"
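Where echo is unpredictable, the printf utility may be a steadier way to produce the control character, since POSIX defines octal escapes in its format string. The following is a sketch only; the variable names and the sample input are assumptions.

```shell
# Store a literal Ctrl-Z (octal 032) in a shell variable.
ctrl_z=$(printf '\032')

# Double quotes let the shell expand the variable inside the sed script.
cleaned=$(printf 'a\032b\n' | sed "s/$ctrl_z/Ctrl-Z/g")
printf '%s\n' "$cleaned"
```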
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
4.3. How do I convert files with toggle characters, like +this+, to
|
|
 |
1bfe7ce |
look like [i]this[/i]?
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Input files, especially message-oriented text files, often contain
|
|
 |
1bfe7ce |
toggle characters for emphasis, like ~this~, *this*, or =this=. Sed
|
|
 |
1bfe7ce |
can make the same input pattern produce alternating output each
|
|
 |
1bfe7ce |
time it is encountered. Typical needs might be to generate HTML
|
|
 |
1bfe7ce |
codes or print codes for boldface, italic, or underscore. This
|
|
 |
1bfe7ce |
script accommodates multiple occurrences of the toggle pattern on
|
|
 |
1bfe7ce |
the same line, as well as cases where the pattern starts on one
|
|
 |
1bfe7ce |
line and finishes several lines later, even at the end of the file:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# sed script to convert +this+ to [i]this[/i]
|
|
 |
1bfe7ce |
:a
|
|
 |
1bfe7ce |
/+/{ x; # If "+" is found, switch hold and pattern space
|
|
 |
1bfe7ce |
/^ON/{ # If "ON" is in the (former) hold space, then ..
|
|
 |
1bfe7ce |
s///; # .. delete it
|
|
 |
1bfe7ce |
x; # .. switch hold space and pattern space back
|
|
 |
1bfe7ce |
s|+|[/i]|; # .. turn the next "+" into "[/i]"
|
|
 |
1bfe7ce |
ba; # .. jump back to label :a and start over
|
|
 |
1bfe7ce |
}
|
|
 |
1bfe7ce |
s/^/ON/; # Else, "ON" was not in the hold space; create it
|
|
 |
1bfe7ce |
x; # Switch hold space and pattern space
|
|
 |
1bfe7ce |
s|+|[i]|; # Turn the first "+" into "[i]"
|
|
 |
1bfe7ce |
ba; # Branch to label :a to find another pattern
|
|
 |
1bfe7ce |
}
|
|
 |
1bfe7ce |
#---end of script---
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
This script uses the hold space to create a "flag" to indicate
|
|
 |
1bfe7ce |
whether the toggle is ON or not. We have added remarks to
|
|
 |
1bfe7ce |
illustrate the script logic, but in most versions of sed remarks
|
|
 |
1bfe7ce |
are not permitted after 'b'ranch commands or labels.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
If you are sure that the +toggle+ characters never cross line
|
|
 |
1bfe7ce |
boundaries (i.e., never begin on one line and end on another), this
|
|
 |
1bfe7ce |
script can be reduced to one line:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
s|+\([^+][^+]*\)+|[i]\1[/i]|g
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
If your toggle pattern contains regex metacharacters (such as '*'
|
|
 |
1bfe7ce |
or perhaps '+' or '?'), remember to quote them with backslashes.
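Both points can be seen in the sketch below. The sample sentences are invented, and the [b] tag used for the '*' toggle is a hypothetical choice for illustration.

```shell
# '+' is an ordinary character in a basic regex, so no backslash is needed.
italic=$(printf '%s\n' 'a +b+ c +d+ e' | sed 's|+\([^+][^+]*\)+|[i]\1[/i]|g')

# '*' is a metacharacter, so it is quoted with a backslash outside brackets.
bold=$(printf '%s\n' 'x *y* z' | sed 's|\*\([^*][^*]*\)\*|[b]\1[/b]|g')

printf '%s\n' "$italic" "$bold"
```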
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
CHANGING STRINGS
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
4.10. How do I perform a case-insensitive search?
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Several versions of sed support case-insensitive matching: ssed and
|
|
 |
1bfe7ce |
GNU sed v3.02+ (with I flag after s/// or /regex/); sedmod with the
|
|
 |
1bfe7ce |
-i switch; and sed16 (which supports both types of switches).
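With GNU sed, the I flag might be used as in this sketch. GNU sed is assumed; the sample input and variable names are made up.

```shell
# I after s/// makes the substitution ignore case.
replaced=$(printf '%s\n' 'Carlos CARLOS carlos' | sed 's/carlos/X/Ig')

# I after /regex/ makes the address ignore case.
selected=$(printf '%s\n' 'hello CARLOS' 'nobody' | sed -n '/carlos/Ip')

printf '%s\n' "$replaced" "$selected"
```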
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
With other versions of sed, case-insensitive searching is awkward,
|
|
 |
1bfe7ce |
so people may use awk or perl instead, since these programs have
|
|
 |
1bfe7ce |
options for case-insensitive searches. In gawk, use "BEGIN
|
|
 |
1bfe7ce |
{IGNORECASE=1}" and in perl, "/regex/i". For other seds, here are
|
|
 |
1bfe7ce |
three solutions:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Solution 1: convert everything to upper case and search normally
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# sed script, solution 1
|
|
 |
1bfe7ce |
h; # copy the original line to the hold space
|
|
 |
1bfe7ce |
# convert the pattern space to solid caps
|
|
 |
1bfe7ce |
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
|
|
 |
1bfe7ce |
# now we can search for the word "CARLOS"
|
|
 |
1bfe7ce |
/CARLOS/ {
|
|
 |
1bfe7ce |
# add or insert lines. Note: "s/.../.../" will not work
|
|
 |
1bfe7ce |
# here because we are searching a modified pattern
|
|
 |
1bfe7ce |
# space and are not printing the pattern space.
|
|
 |
1bfe7ce |
}
|
|
 |
1bfe7ce |
x; # get back the original pattern space
|
|
 |
1bfe7ce |
# the original pattern space will be printed
|
|
 |
1bfe7ce |
#---end of sed script---
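One way to fill in the block is sketched here: a case-insensitive "grep" that prints the original, unconverted line. The sample input and the variable name are assumptions; the script itself should run on any POSIX sed.

```shell
# Save the line, upper-case a working copy, and if it contains CARLOS,
# swap the original back in and print it. -n suppresses other output.
found=$(printf '%s\n' 'hello carlos' 'nobody' 'CaRlOs here' |
  sed -n 'h
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
/CARLOS/{
x
p
}')
printf '%s\n' "$found"
```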
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Solution 2: search for both cases
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Often, proper names will occur either in all lower-case ("unix"),
|
|
 |
1bfe7ce |
with an initial capital letter ("Unix") or occur in solid caps
|
|
 |
1bfe7ce |
("UNIX"). There may be no need to search for every possibility.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
/UNIX/b match
|
|
 |
1bfe7ce |
/[Uu]nix/b match
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Solution 3: search for all possible cases
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# If you must, search for any possible combination
|
|
 |
1bfe7ce |
/[Cc][Aa][Rr][Ll][Oo][Ss]/ { ... }
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Bear in mind that as the pattern length increases, this solution
becomes an order of magnitude slower than Solution 1, at least
with some implementations of sed.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
4.11. How do I match only the first occurrence of a pattern?
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(1) The general solution is to use GNU sed or ssed, with one of
|
|
 |
1bfe7ce |
these range expressions. The first script ("print only the first
|
|
 |
1bfe7ce |
match") works with any version of sed:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed -n '/RE/{p;q;}' file # print only the first match
|
|
 |
1bfe7ce |
sed '0,/RE/{//d;}' file # delete only the first match
|
|
 |
1bfe7ce |
sed '0,/RE/s//to_that/' file # change only the first match
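The three commands can be tried on a small input in which the pattern occurs twice. This is a sketch that assumes GNU sed (for the "0," address); the input and variable names are made up.

```shell
input=$(printf '%s\n' a foo b foo)

first_only=$(printf '%s\n' "$input" | sed -n '/foo/{p;q;}')     # print
first_deleted=$(printf '%s\n' "$input" | sed '0,/foo/{//d;}')   # delete
first_changed=$(printf '%s\n' "$input" | sed '0,/foo/s//bar/')  # change

printf '%s\n' "$first_only" "---" "$first_deleted" "---" "$first_changed"
```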
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(2) If you cannot use GNU sed and if you *know* the pattern will
|
|
 |
1bfe7ce |
not occur on the first line, this will work:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed '1,/RE/{//d;}' file # delete only the first match
|
|
 |
1bfe7ce |
sed '1,/RE/s//to_that/' file # change only the first match
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(3) If you cannot use GNU sed and the pattern *might* occur on the
|
|
 |
1bfe7ce |
first line, use one of the following commands (credit for short GNU
|
|
 |
1bfe7ce |
script goes to Donald Bruce Stewart):
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed '/RE/{x;/Y/!{s/^/Y/;h;d;};x;}' file # delete (one way)
|
|
 |
1bfe7ce |
sed -n -e '/RE/{:a' -e 'n;p;ba' -e '}' -e 'p' file # delete (another way)
sed -n '/RE/{:a;n;p;ba;};p' file # same script, GNU sed
|
|
 |
1bfe7ce |
sed -e '/RE/{s//to_that/;:a' -e '$!N;$!ba' -e '}' file # change
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Still another solution, using a flag in the hold space. This is
|
|
 |
1bfe7ce |
portable to all seds and works if the pattern is on the first line:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# sed script to change "foo" to "bar" only on the first occurrence
|
|
 |
1bfe7ce |
1{x;s/^/first/;x;}
|
|
 |
1bfe7ce |
/foo/{x;/first/{s///;x;s/foo/bar/;x;};x;}
|
|
 |
1bfe7ce |
#---end of script---
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
4.12. How do I parse a comma-delimited (CSV) data file?
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Comma-delimited data files can come in several forms, requiring
|
|
 |
1bfe7ce |
increasing levels of complexity in parsing and handling. They are
|
|
 |
1bfe7ce |
often referred to as CSV files (for "comma separated values") and
|
|
 |
1bfe7ce |
occasionally as SDF files (for "standard data format"). Note that
|
|
 |
1bfe7ce |
some vendors use "SDF" to refer to variable-length records with
|
|
 |
1bfe7ce |
comma-separated fields which are "double-quoted" if they contain
|
|
 |
1bfe7ce |
character values, while other vendors use "SDF" to designate
|
|
 |
1bfe7ce |
fixed-length records with fixed-length, nonquoted fields! (For help
|
|
 |
1bfe7ce |
with fixed-length fields, see question 4.23.)
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
The term "CSV" became a de-facto standard when Microsoft Excel used
|
|
 |
1bfe7ce |
it as an optional output file format.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Here are 4 different forms you may encounter in comma-delimited data:
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(a) No quotes, no internal commas
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
1001,John Smith,PO Box 123,Chicago,IL,60699
|
|
 |
1bfe7ce |
1002,Mary Jones,320 Main,Denver,CO,84100
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(b) Like (a), with quotes around each field
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
"1003","John Smith","PO Box 123","Chicago","IL","60699"
|
|
 |
1bfe7ce |
"1004","Mary Jones","320 Main","Denver","CO","84100"
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(c) Like (b), with embedded commas
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
"1005","Tom Hall, Jr.","61 Ash Ct.","Niles","OH","44446"
|
|
 |
1bfe7ce |
"1006","Bob Davis","429 Pine, Apt. 5","Boston","MA","02128"
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
(d) Like (c), with embedded commas and quotes
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
"1007","Sue "Red" Smith","19 Main","Troy","MI","48055"
|
|
 |
1bfe7ce |
"1008","Joe "Hey, guy!" Hall","POB 44","Reno","NV","89504"
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
In each example above, we have 6 fields and 5 commas which function
|
|
 |
1bfe7ce |
as field separators. Case (c) is a very typical form of these data
|
|
 |
1bfe7ce |
files, with double quotes used to enclose each field and to protect
|
|
 |
1bfe7ce |
internal commas (such as "Tom Hall, Jr.") from interpretation as
|
|
 |
1bfe7ce |
field separators. However, the data often include both embedded
quotation marks and embedded commas, as seen in case (d), above.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Case (d) is the closest to Microsoft CSV format. *However*, the
|
|
 |
1bfe7ce |
Microsoft CSV format allows embedded newlines within a
|
|
 |
1bfe7ce |
double-quoted field. If embedded newlines within fields are a
|
|
 |
1bfe7ce |
possibility for your data, you should consider using something
|
|
 |
1bfe7ce |
other than sed to work with the data file.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
Before handling a comma-delimited data file, make sure that you
|
|
 |
1bfe7ce |
fully understand its format and check the integrity of the data.
|
|
 |
1bfe7ce |
Does each line contain the same number of fields? Should certain
|
|
 |
1bfe7ce |
fields be composed only of numbers or of two-letter state
|
|
 |
1bfe7ce |
abbreviations in all caps? Sed (or awk or perl) should be used to
|
|
 |
1bfe7ce |
validate the integrity of the data file before you attempt to alter
|
|
 |
1bfe7ce |
it or extract particular fields from the file.
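As one possible integrity check for case (a), the sketch below prints every line that does not have exactly 6 fields (5 commas). The second sample record is deliberately short and is invented for the example, as is the variable name.

```shell
# A good line is: field, then exactly 5 repetitions of ",field".
# '!p' prints only the lines that do NOT fit that shape.
bad=$(printf '%s\n' \
  '1001,John Smith,PO Box 123,Chicago,IL,60699' \
  '1002,Mary Jones,320 Main,Denver,CO' |
  sed -n '/^[^,]*\(,[^,]*\)\{5\}$/!p')
printf '%s\n' "$bad"
```

If the command prints nothing, every line has the expected number of commas.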
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
After ensuring that each line has a valid number of fields, use sed
|
|
 |
1bfe7ce |
to locate and modify individual fields, using the \(...\) grouping
|
|
 |
1bfe7ce |
command where needed.
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
In case (a):
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed 's/^[^,]*,[^,]*,[^,]*,[^,]*,/.../'
        ^     ^     ^
        |     |     |_ 3rd field
        |     |_______ 2nd field
        |_____________ 1st field
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# Unix script to delete the second field for case (a)
|
|
 |
1bfe7ce |
sed 's/^\([^,]*\),[^,]*,/\1,,/' file
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# Unix script to change field 1 to 9999 for case (a)
|
|
 |
1bfe7ce |
sed 's/^[^,]*,/9999,/' file
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
In cases (b) and (c):
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
sed 's/^"[^"]*","[^"]*","[^"]*","[^"]*",/.../'
         1st--   2nd--   3rd--   4th--
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# Unix script to delete the second field for case (c)
|
|
 |
1bfe7ce |
sed 's/^\("[^"]*"\),"[^"]*",/\1,"",/' file
|
|
 |
1bfe7ce |
|
|
 |
1bfe7ce |
# Unix script to change field 1 to 9999 for case (c)
|
|
 |
1bfe7ce |
sed 's/^"[^"]*",/"9999",/' file

In case (d):

One way to parse such files is to replace the 3-character field
separator "," with an unused character like the tab or vertical
bar. (Technically, the field separator is only the comma while the
fields are surrounded by "double quotes", but the net _effect_ is
that fields are separated by quote-comma-quote, with quote
characters added to the beginning and end of each record.) Search
your datafile _first_ to make sure that your character appears
nowhere in it!

     sed -n '/|/p' file    # search for any instance of '|'
     # if it's not found, we can use the '|' to separate fields

Then replace the 3-character field separator and parse as before:

     # sed script to delete the second field for case (d)
     s/","/|/g;                  # global change of "," to bar
     s/^\([^|]*\)|[^|]*|/\1||/;  # delete 2nd field
     s/|/","/g;                  # global change of bar back to ","
     #---end of script---

     # sed script to change field 1 to 9999 for case (d)
     # Remember to accommodate leading and trailing quote marks
     s/","/|/g;
     s/^[^|]*|/"9999|/;
     s/|/","/g;
     #---end of script---

Note that this technique works only if _each_ and _every_ field is
surrounded with double quotes, including empty fields.
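
Here is a short worked run of the bar technique, using an invented
record whose second field contains a comma. Note the "*" after
[^|] in the middle command: the second field may be any length, so
the bracket expression must be allowed to repeat.

```shell
rec='"1001","Smith, John","IL"'     # hypothetical case (d) record

out=$(printf '%s\n' "$rec" |
      sed -e 's/","/|/g' \
          -e 's/^\([^|]*\)|[^|]*|/\1||/' \
          -e 's/|/","/g')

echo "$out"
```

The comma inside "Smith, John" is never mistaken for a separator,
because only the quote-comma-quote sequence is turned into a bar.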

The following solution is for more complex examples of (d), such
as: not all fields contain "double-quote" marks, or the presence of
embedded "double-quote" marks within fields, or extraneous
whitespace around field delimiters. (Thanks to Greg Ubben for this
script!)

     # sed script to convert case (d) to bar-delimited records
     s/^ *\(.*[^ ]\) *$/|\1|/;
     s/" *, */"|/g;
     : loop
     s/| *\([^",|][^,|]*\) *, */|\1|/g;
     s/| *, */||/g;
     t loop
     s/ *|/|/g;
     s/| */|/g;
     s/^|\(.*\)|$/\1/;
     #---end of script---

For example, it turns this (which is badly-formed but legal):

     first,"",unquoted ,""this" is, quoted " ,, sub "quote" inside, f", lone " empty:

into this:

     first|""|unquoted|""this" is, quoted "||sub "quote" inside|f"|lone " empty:

Note that the script preserves the "double-quote" marks, but
changes only the commas where they are used as field separators. I
have used the vertical bar "|" because it's easier to read, but you
may change this to another field separator if you wish.

If your CSV datafile is more complex, it would probably not be
worth the effort to write it in sed. For such a case, you should
use Perl with a dedicated CSV module (there are at least two recent
CSV parsers available from CPAN).

4.13. How do I handle fixed-length, columnar data?

Sed handles fixed-length fields via \(grouping\) and backreferences
(\1, \2, \3 ...). If we have 3 fields of 10, 25, and 9 characters
per field, our sed script might look like so:

     s/^\(.\{10\}\)\(.\{25\}\)\(.\{9\}\)/\3\2\1/;  # Change the fields
        ^^^^^^^^^^^~~~~~~~~~~~==========           # from 1,2,3 to 3,2,1
         field #1   field #2   field #3

This is a bit hard to read. By using GNU sed or ssed with the -r
switch active, it can look like this:

     s/^(.{10})(.{25})(.{9})/\3\2\1/;          # Using the -r switch

To delete a field in sed, use grouping and omit the backreference
from the field to be deleted. If the data is long or difficult to
work with, use ssed with the -R switch and the /x flag after an s///
command, to insert comments and remarks about the fields.
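
As a small sketch of deleting a field, assume a record with fields
of 3, 5, and 2 characters (widths chosen only to keep the example
short). The middle field is matched but not grouped, so it has no
backreference and vanishes from the output:

```shell
rec='AAAbbbbbCC'                # hypothetical 3+5+2 character record

# Group fields 1 and 3; match field 2 without grouping it.
out=$(printf '%s\n' "$rec" | sed 's/^\(.\{3\}\).\{5\}\(.\{2\}\)/\1\2/')

echo "$out"
```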

For records with many fields, use GNU awk with the FIELDWIDTHS
variable set in the top of the script. For example:

     awk 'BEGIN{FIELDWIDTHS = "10 25 9"}; {print $3 $2 $1}' file

This is much easier to read than a similar sed script, especially
if there are more than 5 or 6 fields to manipulate.
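
FIELDWIDTHS is a GNU awk extension. If only another awk is on hand,
the substr() function can stand in for it. The sketch below is an
assumption-laden illustration: it uses an invented record with
fields of 3, 5, and 2 characters rather than the 10, 25, and 9 used
above, and it reverses the field order:

```shell
rec='AAAbbbbbCC'                # hypothetical 3+5+2 character record

# substr(string, start, length); positions are counted from 1.
out=$(printf '%s\n' "$rec" |
      awk '{ print substr($0, 9, 2) substr($0, 4, 5) substr($0, 1, 3) }')

echo "$out"
```

The start positions must be worked out by hand, which is the price
paid for portability.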

4.14. How do I commify a string of numbers?

Use the simplest script necessary to accomplish your task. As
variations of the line increase, the sed script must become more
complex to handle additional conditions. Whole numbers are
simplest, followed by decimal formats, followed by embedded words.

Case 1: simple strings of whole numbers separated by spaces or
commas, with an optional negative sign. To convert this:

     4381, -1222333, and 70000: - 44555666 1234567890 words
     56890 -234567, and 89222 -999777 345888777666 chars

to this:

     4,381, -1,222,333, and 70,000: - 44,555,666 1,234,567,890 words
     56,890 -234,567, and 89,222 -999,777 345,888,777,666 chars

use one of these one-liners:

     sed ':a;s/\B[0-9]\{3\}\>/,&/;ta'                      # GNU sed
     sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'  # other seds
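
To see how the loop works, here is the "other seds" one-liner run
on a short sample. Each pass inserts one comma in front of the
rightmost group of three digits that still has a digit before it,
and "ta" repeats the substitution until no such group is left:

```shell
nums='4381, -1222333, and 70000'

out=$(printf '%s\n' "$nums" |
      sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta')

echo "$out"
```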

Case 2: strings of numbers which may have an embedded decimal
point, separated by spaces or commas, with an optional negative
sign. To change this:

     4381, -6555.1212 and 70000, 7.18281828 44906982.071902
     56890 -2345.7778 and 8.0000: -49000000 -1234567.89012

to this:

     4,381, -6,555.1212 and 70,000, 7.18281828 44,906,982.071902
     56,890 -2,345.7778 and 8.0000: -49,000,000 -1,234,567.89012

use the following command for GNU sed:

     sed ':a;s/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g;ta'

and for other versions of sed:

     sed -f case2.sed files

     # case2.sed
     s/^/ /;                 # add space to start of line
     :a
     s/\( [-0-9]\{1,\}\)\([0-9]\{3\}\)/\1,\2/g
     ta
     s/ //;                  # remove space from start of line
     #---end of script---
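
A quick trial of the case2.sed commands, given here with -e options
instead of a script file. The pattern only starts matching at a
space, so the digits after a decimal point (which follow a "." and
not a space) are left alone:

```shell
nums='-6555.1212 44906982.071902'

out=$(printf '%s\n' "$nums" |
      sed -e 's/^/ /' \
          -e :a \
          -e 's/\( [-0-9]\{1,\}\)\([0-9]\{3\}\)/\1,\2/g' \
          -e ta \
          -e 's/ //')

echo "$out"
```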

4.15. How do I prevent regex expansion on substitutions?

Sometimes you want to *match* regular expression metacharacters as
literals (e.g., you want to match "[0-9]" or "\n"), to be replaced
with something else. The ordinary way to prevent expanding
metacharacters is to prefix them with a backslash. Thus, if "\n"
matches a newline, "\\n" will match the two-character string of
'backslash' followed by 'n'.

But doing this repeatedly can become tedious if there are many
regexes. The following script reads a file of alternating lines (a
literal string to find, then its literal replacement) and writes
one s/find/replace/ command per pair, in which no character is
interpreted as a regex metacharacter:

     # filename: sub_quote.sed
     # author: Paolo Bonzini
     # sed script to add backslash to find/replace metacharacters
     N;                  # add even numbered line to pattern space
     s,[]/\\$*[],\\&,g;  # quote all of [, ], /, \, $, or *
     s,^,s/,;            # prepend "s/" to front of pattern space
     s,$,/,;             # append "/" to end of pattern space
     s,\n,/,;            # change "\n" to "/", making s/from/to/
     #---end of script---

Here's a sample of how sub_quote.sed might be used. This example
converts typical sed regexes to perl-style regexes. The input file
consists of 10 lines:

     [0-9]
     \d
     [^0-9]
     \D
     \+
     +
     \?
     ?
     \|
     |

Run the command "sed -f sub_quote.sed input", to transform the
input file (above) to 5 lines of output:

     s/\[0-9\]/\\d/
     s/\[^0-9\]/\\D/
     s/\\+/+/
     s/\\?/?/
     s/\\|/|/

The above file is itself a sed script, which can then be used to
modify other files.
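
The two-step use can be sketched as follows. The temporary
directory and the names pairs.txt and generated.sed are invented
for this illustration, and the sub_quote.sed commands are given
inline. Note the "g" flag on the quoting step: every metacharacter
on both lines must be escaped, not just the first one.

```shell
tmp=$(mktemp -d)                # scratch directory (hypothetical names below)

# One find/replace pair: the literal "[0-9]" becomes the literal "\d".
printf '%s\n' '[0-9]' '\d' > "$tmp/pairs.txt"

# Step 1: turn the pair into an s/find/replace/ command.
sed 'N; s,[]/\\$*[],\\&,g; s,^,s/,; s,$,/,; s,\n,/,' \
    "$tmp/pairs.txt" > "$tmp/generated.sed"
generated=$(cat "$tmp/generated.sed")

# Step 2: apply the generated script to some text.
out=$(printf '%s\n' 'a[0-9]b' | sed -f "$tmp/generated.sed")

echo "$generated"
echo "$out"
rm -rf "$tmp"
```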

4.16. How do I convert a string to all lowercase or capital letters?

The easiest method is to use a new version of GNU sed, ssed, sedmod
or sed16 and employ the \U, \L, or other switches on the right side
of an s/// command. For example, to convert any word which begins
with "reg" or "exp" into solid capital letters:

     sed -r "s/\<(reg|exp)[a-z]+/\U&/g"    # gsed4.+ or ssed
     sed "s/\

As you can see, sedmod and sed16 do not support alternation (|),
but they do support case conversion. If none of these versions of
sed are available to you, some sample scripts for this task are
available from the Seder's Grab Bag:

     http://sed.sourceforge.net/grabbag/scripts

Note that some case conversion scripts are listed under "Filename
manipulation" and others are under "Text formatting."
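
If the whole line is to be converted, the standard "y" command
offers a portable fallback that needs none of the extensions named
above. This is a different technique from \U and \L: it maps
characters one for one across the entire pattern space and cannot
by itself be limited to selected words.

```shell
# Map each lowercase letter to its uppercase counterpart.
up=$(printf '%s\n' 'regular expression' |
     sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

echo "$up"
```

Swap the two alphabets to convert to lowercase instead.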

CHANGING BLOCKS (consecutive lines)

4.20. How do I change only one section of a file?

You can match a range of lines by line number, by regexes (say, all
lines between the words "from" and "until"), or by a combination of
the two. For multiple substitutions on the same range, put the
command(s) between braces {...}. For example:

     # replace only between lines 1 and 20
     1,20 s/Johnson/White/g

     # replace everywhere EXCEPT between lines 1 and 20
     1,20 !s/Johnson/White/g

     # replace only between words "from" and "until". Note the
     # use of \<....\> as word boundary markers in GNU sed.
     /from/,/until/ { s/\<red\>/magenta/g; s/\<blue\>/cyan/g; }

     # replace only from the words "ENDNOTES:" to the end of file
     /ENDNOTES:/,$ { s/Schaff/Herzog/g; s/Kraft/Ebbing/g; }

For technical details on using address ranges, see section 3.3
("Addressing and Address ranges").
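
A short trial with invented input shows which lines a regex range
touches. Only the lines from the first "from" line through the
next "until" line are changed; the word boundary markers are left
out here so that the command does not depend on GNU sed:

```shell
out=$(printf '%s\n' 'red one' 'from here' 'red two' 'until here' 'red three' |
      sed '/from/,/until/ s/red/magenta/g')

echo "$out"
```

Lines 1 and 5 keep the word "red" because they fall outside the
range.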

4.21. How do I delete or change a block of text if the block contains
a certain regular expression?

The following deletes the block between 'start' and 'end'
inclusively, if and only if the block contains the string
'regex'. Written by Russell Davies, with additional comments:

     # sed script to delete a block if /regex/ matches inside it
     :t
     /start/,/end/ {    # For each line between these block markers..
        /end/!{         #   If we are not at the /end/ marker
           $!{          #     nor the last line of the file,
              N;        #     add the Next line to the pattern space
              bt
           }            #   and branch (loop back) to the :t label.
        }               # This line matches the /end/ marker.
        /regex/d;       # If /regex/ matches, delete the block.
     }                  # Otherwise, the block will be printed.
     #---end of script---

Note: When the script above reaches /regex/, the entire multi-line
block is in the pattern space. To replace items inside the block,
use "s///". To change the entire block, use the 'c' (change)
command:

     /regex/c\
     1: This will replace the entire block\
     2: with these two lines of text.
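
The block-deleting script can be tried on a small invented input
with two start/end blocks, only the first of which contains
"regex". The script is written to a temporary file without its
comments. This sketch was worked through with GNU sed's handling of
address ranges in mind; other seds may differ.

```shell
script=$(mktemp)                # temporary script file
cat > "$script" <<'EOF'
:t
/start/,/end/ {
  /end/!{
    $!{
      N
      bt
    }
  }
  /regex/d
}
EOF

out=$(printf '%s\n' keep1 start 'foo regex' end start bar end keep2 |
      sed -f "$script")

echo "$out"
rm -f "$script"
```

The first block disappears as a unit, while the second block and
the lines outside both blocks pass through unchanged.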

4.22. How do I locate a paragraph of text if the paragraph contains a
certain regular expression?

Assume that paragraphs are separated by blank lines. For regexes
that are single terms, use one of the following scripts: