
Regular expressions in SED

SED stands for Stream EDitor. It is a simple but powerful utility that parses text and transforms it. Lee McMahon of Bell Labs developed it in 1973 as a general-purpose, line-oriented editor, borrowing syntax and many useful features from the ed editor. Today it runs on all major operating systems. It has supported regular expressions from the beginning. SED accepts input from files, from pipes, and from standard input. To a beginner the syntax may look complicated and cryptic, but once you get used to it you can solve many complex tasks with a few lines of SED script.

Workflow

SED follows a simple cycle: read, execute, display. It reads one line from the input stream, whether that is a file, a pipe, or stdin, and stores it in an internal buffer called the pattern buffer. All SED commands then run sequentially on the pattern buffer. Commands apply to every line unless a line address restricts them. SED then writes the modified contents of the pattern buffer to the output stream. Once the line has been written, the pattern buffer is emptied and the cycle repeats with the next line.
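
The cycle can be seen in a minimal sketch, assuming a POSIX-style sed and printf are available. The three sample words and the variable name result are made up for illustration. Every line passes through the pattern buffer and is printed, but the address 2 restricts the substitution to the second line.

```shell
# Each input line is read into the pattern buffer, the command is applied
# (only on line 2, because of the address), and the buffer is printed.
result=$(printf 'alpha\nbeta\ngamma\n' | sed '2s/beta/BETA/')
printf '%s\n' "$result"
# Prints:
# alpha
# BETA
# gamma
```

Without the leading 2, the same command would be attempted on all three lines, although only the line containing beta would change.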

Syntax

SED can be run in two ways: inline or through a script file. To run a single SED command, we use the inline method. To run multiple SED commands, we typically use a script file. We can combine both forms and use each more than once. SED has three standard options we need to be familiar with. They control whether the pattern buffer is printed by default, whether the next argument is an editing command, and whether the next argument is a script file.

  -n - Suppress default printing of the pattern buffer; output appears
       only when a command such as p asks for it
  -e - Next argument is an edit command, used when we want to issue
       multiple commands without using a script file
  -f - Next argument is a script file that contains editing commands
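
The three options can be tried out with the sketch below, which assumes a POSIX-style sed plus the mktemp utility for creating a throwaway script file. The sample input, the substitutions, and the variable names are hypothetical and chosen only for illustration.

```shell
# -n turns off automatic printing, so only lines printed explicitly by p appear.
only_match=$(printf 'one\ntwo\nthree\n' | sed -n '/two/p')
printf '%s\n' "$only_match"      # two

# -e passes each editing command as its own argument; they run in the order given.
chained=$(printf 'one\ntwo\nthree\n' | sed -e 's/one/1/' -e 's/two/2/')
printf '%s\n' "$chained"         # 1, 2, three (one per line)

# -f reads the editing commands from a script file (a temporary file here).
script=$(mktemp)
printf 's/one/1/\ns/two/2/\n' > "$script"
from_file=$(printf 'one\ntwo\nthree\n' | sed -f "$script")
printf '%s\n' "$from_file"       # same result as the -e form

# Both forms can be combined; commands are applied in command-line order.
combined=$(printf 'one\ntwo\nthree\n' | sed -f "$script" -e 's/three/3/')
printf '%s\n' "$combined"        # 1, 2, 3 (one per line)

rm -f "$script"
```

A script file tends to pay off once the command list grows past what is comfortable to quote on one command line.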

Regular expressions

SED offers several ways to search for a string. We can match any single character using a dot (.). We can match text at the beginning of a line using ^. We can match text at the end of a line using $. We can match any one of a set of characters by enclosing them in []. We can also match an exclusive set, meaning any character except those enclosed, using [^]. This is all similar to AWK, which also uses regex syntax.

Character match:

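$ echo -e "cat\ncar\nfun\nden\nfan\nfoo" | sed -n '/f.n/p'

Will output fun and fan. Each word sits on its own line because SED matches whole lines, and the p command prints the matches that -n would otherwise suppress. foo does not match because it does not end in n.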

Match beginning of line:

$ echo -e "Who\nWhat\nThere\nTheir\nthese" | sed -n '/^The/p'

Will output There, Their

Match end of line:

$ echo -e "foo\nwhere\nDen\nfan\nboon\nboot" | sed -n '/n$/p'

Will output Den, fan, boon

Match set:

$ echo -e "Coo\nTall\nBall\nMall" | sed -n '/[BM]all/p'

Will output Ball, Mall

Exclusive set:

$ echo -e "Coo\nTall\nBall\nMall" | sed -n '/[^BM]all/p'

Will output Tall. Coo is not printed because it does not contain all.