Saturday 25 May 2013

Text Processing Tools in Linux

The following commands are often used when writing scripts to automate tasks.

The diff command :- compares the contents of two files and reports the differences.

diff file1.conf file2.conf > output.txt
cat output.txt
Options are -c (context format), -u (unified format) and -r (recurse into directories).

The patch command :- used to apply a patch file (such as the output of diff) to a file.

patch -b file1.conf < output.txt
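
As a rough sketch of how diff and patch work together, the round trip below creates two small files, records their difference in unified format and applies it. The file names old.conf, new.conf and change.patch are made up for illustration, and everything happens in a temporary directory.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical config files that differ in one line.
printf 'port=80\nhost=localhost\n' > old.conf
printf 'port=8080\nhost=localhost\n' > new.conf

# diff exits with status 1 when the files differ; that is not an error here.
diff -u old.conf new.conf > change.patch || true

# Apply the patch to old.conf; -b keeps the original as old.conf.orig.
patch -b old.conf < change.patch

# After patching, old.conf and new.conf should be identical.
cmp old.conf new.conf && echo "files match"
```

The unified format (-u) is used here because it is the form most commonly exchanged as patch files.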

The grep command :- displays the lines in a file that match a pattern.

grep root /etc/passwd
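ps aux | grep sshd
Options are -i (ignore case), -n (show line numbers), -r (search directories recursively), -c (count matching lines), -v (show non-matching lines) and -l (list only file names).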
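
The effect of a few of these options is easier to see on a small sample. The sketch below uses an invented four-line file, not a real /etc/passwd.

```shell
# Build a small sample file (contents are made up for illustration).
sample=$(mktemp)
printf 'root:x:0:0\nDaemon:x:1:1\nalice:x:1000:1000\nROOT-backup:x:0:0\n' > "$sample"

# -n prefixes each match with its line number; the match is case-sensitive.
numbered=$(grep -n root "$sample")      # 1:root:x:0:0

# -i ignores case, -c prints only the number of matching lines.
matches=$(grep -ic root "$sample")      # 2

# -v inverts the match, so this counts the lines that do NOT contain "root".
others=$(grep -vic root "$sample")      # 2

echo "$numbered"
echo "matching: $matches, non-matching: $others"
```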

The cut command :- used to cut fields or columns of text from a file and display them on standard output.

cut -f3 -d: /etc/passwd
/sbin/ip addr | grep 'inet' | cut -d ' ' -f6 | cut -d / -f1
Options are -d (field delimiter), -f (fields to select) and -c (character positions to select).
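
A small sketch of the three options on a single passwd-style line. The account "alice" is invented for the example.

```shell
# One passwd-style record (hypothetical user).
line='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'

# -d sets the delimiter, -f picks fields by number.
user=$(printf '%s\n' "$line" | cut -d: -f1)      # alice
ids=$(printf '%s\n' "$line" | cut -d: -f3,4)     # 1000:1000

# -c selects by character position instead of by field.
first3=$(printf '%s\n' "$line" | cut -c1-3)      # ali

echo "$user $ids $first3"
```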

The head command :- displays the first few lines of a file (10 by default).

head /etc/passwd
head -n 3 /etc/passwd
The option is -n (number of lines to show).

The tail command :- displays the last few lines of a file (10 by default).

tail -n 3 /etc/passwd
tail -f /etc/passwd
tail -f keeps the file open and prints new lines as they are appended, until Ctrl+C is pressed.
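
head and tail can also be chained to pull a range of lines out of the middle of a file. A minimal sketch, using seq to generate ten numbered lines in place of a real file:

```shell
# First and last three of ten lines.
first=$(seq 1 10 | head -n 3)    # 1 2 3
last=$(seq 1 10 | tail -n 3)     # 8 9 10

# Lines 4 to 6: keep the first six lines, then the last three of those.
middle=$(seq 1 10 | head -n 6 | tail -n 3)    # 4 5 6

printf '%s\n' "$middle"
```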

The wc command :- counts the number of lines (-l), words (-w), bytes (-c) and characters (-m) in a file.

wc -l file1.conf
ls /tmp | wc -l
Options are -l, -w, -c and -m.
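
To make the counts concrete, the sketch below runs wc on two short lines of made-up text supplied on standard input.

```shell
# Two lines, five words; printf adds the final newline.
text='one two three
four five'

lines=$(printf '%s\n' "$text" | wc -l)    # 2
words=$(printf '%s\n' "$text" | wc -w)    # 5
bytes=$(printf '%s\n' "$text" | wc -c)    # 24 (22 visible characters plus 2 newlines)

echo "lines=$lines words=$words bytes=$bytes"
```

Reading from standard input avoids the file name that wc otherwise prints after each count.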

The sort command :- used to sort text data.

grep bash /etc/passwd | cut -d: -f1 | sort
Options are -n (numeric sort), -k (key field to sort on) and -t (field separator).
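
The three options are typically combined when sorting delimited data by one column. A small sketch on invented name:number pairs:

```shell
# Hypothetical name:score records.
data='carol:30
alice:4
bob:100'

# Default sort compares whole lines as text.
by_name=$(printf '%s\n' "$data" | sort)

# -t: splits on colons, -k2,2n sorts on the second field only, numerically.
by_num=$(printf '%s\n' "$data" | sort -t: -k2,2n)

printf '%s\n' "$by_num"    # alice:4, carol:30, bob:100
```

Without -n the second field would be compared as text, which would place 100 before 30.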

The uniq command :- removes adjacent duplicate lines from a file, so the input is usually sorted first.

cut -d: -f7 /etc/passwd | sort | uniq -c
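Options are -c (prefix each line with its count), -u (show only lines that are not repeated) and -d (show only lines that are repeated).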
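
Because uniq only compares neighbouring lines, running it on unsorted input can leave duplicates behind. A short sketch with a made-up list of login shells:

```shell
# /bin/bash appears three times, but only the last two are adjacent.
shells='/bin/bash
/sbin/nologin
/bin/bash
/bin/bash'

# Without sorting, only the adjacent pair collapses: 3 lines remain.
unsorted_count=$(printf '%s\n' "$shells" | uniq | wc -l)

# After sorting, -d lists values that repeat and -u lists values that do not.
dups=$(printf '%s\n' "$shells" | sort | uniq -d)       # /bin/bash
singles=$(printf '%s\n' "$shells" | sort | uniq -u)    # /sbin/nologin

echo "repeated: $dups, single: $singles"
```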

The echo command :- outputs strings.

echo This is a test.
echo This is a test. > output.txt
cat output.txt

The cat command :- outputs or concatenates files.

cat file1.conf > output.txt
cat output.txt
cat file1.conf file2.conf | less
Options are -b, -n, -s, -v, -t, -e and -A.

The paste command :- joins multiple files horizontally, merging corresponding lines side by side.

paste file1.conf file2.conf
Options are -d (delimiter placed between columns) and -s (join all lines of each file into a single line).
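
A minimal sketch of both options, using two invented files (names and uids) in a temporary directory:

```shell
# Create two small hypothetical input files.
dir=$(mktemp -d)
printf 'alice\nbob\n' > "$dir/names"
printf '1000\n1001\n' > "$dir/uids"

# Join line N of each file, separated by a colon instead of the default tab.
side=$(paste -d: "$dir/names" "$dir/uids")    # alice:1000 and bob:1001

# -s joins all lines of one file into a single line.
serial=$(paste -s -d, "$dir/names")           # alice,bob

printf '%s\n' "$side" "$serial"
```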

The split command :- splits a file into smaller pieces by line count or by size.

split -l 500 myfile segment
split -b 40k myfile segment
Options are -l (lines per piece) and -b (bytes per piece).
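
The pieces are named by appending aa, ab, ac and so on to the given prefix, and concatenating them restores the original. A sketch using a generated 1200-line file in a temporary directory:

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate a 1200-line sample file.
seq 1 1200 > myfile

# Split into pieces of at most 500 lines: segmentaa, segmentab, segmentac.
split -l 500 myfile segment

# The last piece holds the remaining 200 lines.
wc -l < segmentac

# Concatenating the pieces in name order reproduces the original file.
cat segment* | cmp - myfile && echo "pieces reassemble correctly"
```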

The comm command :- compares two sorted files line by line for common and distinct lines.

comm file1.conf file2.conf
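
comm prints three columns: lines only in the first file, lines only in the second, and lines in both. The flags -1, -2 and -3 suppress the corresponding column. A small sketch with two invented, already sorted files:

```shell
# Work in a throwaway directory with two sorted sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'apple\nbanana\ncherry\n' > a
printf 'banana\ncherry\ndate\n' > b

# Suppress columns 1 and 2: only lines present in both files remain.
common=$(comm -12 a b)    # banana, cherry

# Suppress columns 2 and 3: lines found only in the first file.
only_a=$(comm -23 a b)    # apple

# Suppress columns 1 and 3: lines found only in the second file.
only_b=$(comm -13 a b)    # date

echo "only in a: $only_a, only in b: $only_b"
```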

The dirname command :- strips the last component from a path, that is, everything from the last slash ('/') onward, and returns the result.

dirname /etc/httpd/conf/httpd.conf
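
A few edge cases are worth knowing when dirname is used in scripts. The sketch below only manipulates strings; none of the paths need to exist.

```shell
# A full path: everything before the last slash is returned.
full=$(dirname /etc/httpd/conf/httpd.conf)    # /etc/httpd/conf

# A bare file name has no directory part, so the result is the current directory.
bare=$(dirname httpd.conf)                    # .

# A trailing slash is ignored before the last component is removed.
trailing=$(dirname /etc/)                     # /

echo "$full $bare $trailing"
```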

The fold command :- used for making a file with long lines more readable on a limited-width terminal.

fold -w 30 file.txt
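
A short sketch of what -w does, using a made-up 25-character line and a width of 10:

```shell
# One long line of 25 characters.
long='aaaaaaaaaabbbbbbbbbbccccc'

# Break the line every 10 characters.
folded=$(printf '%s\n' "$long" | fold -w 10)

printf '%s\n' "$folded"
# aaaaaaaaaa
# bbbbbbbbbb
# ccccc
```

fold breaks lines at the exact width by default; the -s option makes it break at spaces instead, which is usually preferable for prose.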

The sed command :- reads text input line by line and allows modification.

sed 's/word/wordtoreplace/' < file1.conf > output.txt
sed '/word/ d' filename > output.txt
sed 's/word/ /g' filename > output.txt
sed 's/firstword//g; s/secondword//g' yourfile > output.txt
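
The difference between substituting with and without the g flag, and deleting lines with d, can be seen on a small sample. The input text below is invented for illustration.

```shell
input='foo bar foo
delete me
keep foo'

# Without g, only the first match on each line is replaced.
first=$(printf '%s\n' "$input" | sed 's/foo/baz/')

# With g, every match on each line is replaced.
all=$(printf '%s\n' "$input" | sed 's/foo/baz/g')

# /pattern/d removes every line that matches the pattern.
deleted=$(printf '%s\n' "$input" | sed '/delete/d')

printf '%s\n' "$first"
# baz bar foo
# delete me
# keep baz
```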

The awk command :- used as a data extraction and reporting tool.

awk '/word/' filename > output.txt
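
Beyond printing matching lines, awk can split each line into fields and compute over them. A minimal sketch on three invented passwd-style records:

```shell
# Hypothetical passwd-style data.
records='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# -F: sets the field separator; print field 1 for lines ending in "bash".
bash_users=$(printf '%s\n' "$records" | awk -F: '/bash$/ { print $1 }')

# Accumulate field 3 across all lines and print the total at the end.
uid_sum=$(printf '%s\n' "$records" | awk -F: '{ sum += $3 } END { print sum }')

printf '%s\n' "$bash_users"    # root, alice
echo "$uid_sum"                # 1001
```

Single quotes around the awk program keep the shell from expanding $1 and $3 itself.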

The less command :- used to view the contents of a text file one screen at a time.

less /etc/passwd
