EXAMPLE 1

Let's generate some random text:

SIDENOTE: at the time, this text was not actually random for me. I was working with it, in case you were wondering why it is udev rules output.

SIDENOTE: grep -nir . * is a very useful bash command. It prints every non-empty line of every text file, recursing down through the subfolders, with the filename and line number in front.
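To see what that command prints, here is a minimal sketch on a throwaway folder. The file names and contents are made up for illustration, and the sort is only there to make the order predictable.

```shell
# Build a small temporary folder tree (hypothetical files).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf 'first line\n\nthird line\n' > "$demo_dir/a.txt"
printf 'Nested File\n' > "$demo_dir/sub/b.txt"

# -n: show line numbers, -i: ignore case, -r: recurse into folders.
# The pattern "." needs at least one character on the line,
# so the empty second line of a.txt is skipped.
listing=$(cd "$demo_dir" && grep -nir . * | LC_ALL=C sort)
printf '%s\n' "$listing"
# a.txt:1:first line
# a.txt:3:third line
# sub/b.txt:1:Nested File

rm -rf "$demo_dir"
```

Every output line has the shape filename:linenumber:content, which is what Example 3 works with.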

Sample output:

I want to select all of the SYMLINK+="stuff" entries.
So I could do this:

 

NOTE: grep -o is how you use grep to extract data. grep -o doesn't show you the full line containing the match; it shows you just the part that matched. So it's a good data extraction tool, provided you know the regular expression to extract the data with.

However, the output is this. Notice it selects more than just the SYMLINK entry: it runs past the closing quote mark of SYMLINK. It didn't stop at the correct double quote; instead it goes to the last double quote on the line. (So the question is: how do we make it stop at the first/next occurring quote mark?)

So we want it to find SYMLINK+=" then everything in between up to the next double quote, and then the closing double quote ". What the pattern actually says is:
* .  means any single character
* .*  means any number of characters (that are not a newline), and it grabs as many as it can
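A minimal sketch of the greedy behaviour described above. The udev-style line is a made-up stand-in for the real sample text, and it assumes the usual udev assignment form SYMLINK+="...".

```shell
# Hypothetical udev-style line with several quoted values.
line='KERNEL=="sda", SYMLINK+="disk/main", OWNER="root"'

# In grep's basic regex syntax the + is a literal plus sign.
# .* is greedy, so the match runs to the LAST double quote on the line.
greedy=$(printf '%s\n' "$line" | grep -o 'SYMLINK+=".*"')
printf '%s\n' "$greedy"
# SYMLINK+="disk/main", OWNER="root"
```

The match swallows the OWNER part as well, which is exactly the problem.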

The correct notation would be this:

 

There we go, now we have the correct output (we found SYMLINK+=" and everything up to the next double quotation mark):

So that says: find SYMLINK+=" then everything that is not a double quote, followed by a double quote ".
* [^"]  means any single character that is not a double quote
* [^"]*  means any number of characters that are not double quotes
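Here is the same idea as a small runnable sketch. The input line is hypothetical (not the original sample text) and assumes the udev form SYMLINK+="...".

```shell
# Hypothetical udev-style line with several quoted values.
line='KERNEL=="sda", SYMLINK+="disk/main", OWNER="root"'

# [^"]* can never step over a double quote,
# so the match has to end at the NEXT double quote.
exact=$(printf '%s\n' "$line" | grep -o 'SYMLINK+="[^"]*"')
printf '%s\n' "$exact"
# SYMLINK+="disk/main"
```

Because the character class excludes the closing delimiter itself, there is no way for the match to run past it.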

EXAMPLE 2

Read text1.txt first, since it states the goal in a slightly odd way.

NOTE: we don't need to specify the start of the line with ^ here. The answer will be the same, since .* already starts matching at the beginning of the line.

So that says: find everything from the start of the line, any characters, up to a semicolon.
The problem is that it will find everything up to the last semicolon on the line, not the first.

Output will be:
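A minimal sketch of that greedy result. The three input lines are a made-up stand-in for text1.txt (only the text up to the first semicolon of each line is taken from the expected output further down), and ^.*; is my assumption for the pattern being described.

```shell
# Hypothetical stand-in for text1.txt.
text='We have some text; it goes on; and on
Looks fun; but is it; maybe
Lets say we wanted the first semicolon; not the last; ok'

# .* is greedy: each match runs to the LAST semicolon on its line.
greedy=$(printf '%s\n' "$text" | grep -o '^.*;')
printf '%s\n' "$greedy"
# We have some text; it goes on;
# Looks fun; but is it;
# Lets say we wanted the first semicolon; not the last;
```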

To get the correct output do these:

Then we have the correct output (up to the first semicolon):
We have some text;
Looks fun;
Lets say we wanted the first semicolon;
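One way to get that output, sketched on a made-up stand-in for text1.txt (the text after each first semicolon is invented). The pattern is the same [^C]*C trick from Example 1, with a semicolon as C.

```shell
# Hypothetical stand-in for text1.txt.
text='We have some text; it goes on; and on
Looks fun; but is it; maybe
Lets say we wanted the first semicolon; not the last; ok'

# [^;]* cannot cross a semicolon, so the match stops at the FIRST one.
# The ^ matters here: without it, -o would also print the later
# semicolon-terminated pieces of each line.
first=$(printf '%s\n' "$text" | grep -o '^[^;]*;')
printf '%s\n' "$first"
# We have some text;
# Looks fun;
# Lets say we wanted the first semicolon;
```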

EXTRA:

What if we wanted everything between the first and second semicolon? Then we wouldn't use grep on its own. We could use sed (capture the text between the first and second semicolon and output only that, or select the text to the left and right of what we want and replace it with nothing). Or we could use grep in combination with cut -d';' -f2 or awk -F';' '{print $2}'. The cut command splits the text into columns at the semicolon delimiter and selects everything in the 2nd field. The awk command splits the text at the semicolon field separator and prints the 2nd column (which is the same thing as doing it with cut). There are many ways to do the same thing.
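A quick sketch of those three routes on one made-up line. The sed expression is just one possible way to write the capture, not the only one.

```shell
# Hypothetical input line with three semicolon-separated pieces.
line='We have some text; it goes on; and on'

# cut: split on ';' and keep field 2.
second_cut=$(printf '%s\n' "$line" | cut -d';' -f2)

# awk: same split, print column 2.
second_awk=$(printf '%s\n' "$line" | awk -F';' '{print $2}')

# sed: capture the text between the 1st and 2nd semicolon,
# and replace the whole line with just that capture.
second_sed=$(printf '%s\n' "$line" | sed 's/^[^;]*;\([^;]*\);.*$/\1/')

printf '[%s]\n' "$second_cut" "$second_awk" "$second_sed"
# [ it goes on]
# [ it goes on]
# [ it goes on]
```

All three keep the leading space, because the space is part of the field; the semicolons themselves are dropped.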

EXAMPLE 3 (added 2015-04-29)

We have some code here (generated by my favorite command, grep -nir . *, which simply lists all files recursively in the folder, showing their text content and line numbers).

Let's use the terminal's coloring ability and grep's --color option to selectively color the first part, “filename:linenumber:”, so that the part after it, which is the content of the file, is separated and easier on the eyes.

It looks like this:

[Screenshot: grepcolors]

The above will color all of the beginning parts of each line, that look like this: “widgets.py:1082:”.

The problem is that it will also color all of “mlab.py:2334: try:”, when we only want it to color the “mlab.py:2334:” part.

Building the solution: if I do this:
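Here is a small sketch of that overshoot. I am assuming the first attempt was a greedy pattern along the lines of ^.*: and I use -o instead of --color so the matched part (the part that would get colored) is visible as plain text.

```shell
# Two sample lines in filename:linenumber:content form.
lines='widgets.py:1094: from pylab import *
mlab.py:2334: try:'

# Greedy: each match runs to the LAST colon on its line.
greedy_prefix=$(printf '%s\n' "$lines" | grep -o '^.*:')
printf '%s\n' "$greedy_prefix"
# widgets.py:1094:
# mlab.py:2334: try:
```

The first line happens to look right only because its content has no colon in it.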

The above says: find a line that starts with anything except a colon, take any number of those characters (so basically select all non-colon characters) until you reach a colon, and show all of that. So basically it shows everything up to the first colon, including the first colon itself. It will find and color “NON-COLON:”.

The above will color up to the first colon, so it will color the “widgets.py:” part of the line “widgets.py:1094: from pylab import *” but not the full “widgets.py:1094:”. It misses the “1094:” part.
We also want the number afterwards (it can be anything, though) up to the next colon.
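A sketch of that halfway step, again with -o so the matched part shows up as plain text. The pattern ^[^:]*: is my reading of the description above.

```shell
# Two sample lines in filename:linenumber:content form.
lines='widgets.py:1094: from pylab import *
mlab.py:2334: try:'

# One "non-colons then a colon" group: stops at the FIRST colon.
first_colon=$(printf '%s\n' "$lines" | grep -o '^[^:]*:')
printf '%s\n' "$first_colon"
# widgets.py:
# mlab.py:
```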

The solution:

The above says: find a line that starts with anything except a colon, take any number of those characters until you reach a colon, and show all of that; then keep taking non-colon characters and show them up to the next colon. Meaning: show all of the beginning text that starts with non-colon characters and goes to a colon, then more non-colon text, and then one more colon. So it will find and color “NON-COLON:NON-COLON:”.

Since --color colors what grep finds and still returns the full line (as grep usually does), only the “filename:linenumber:” part gets colored and the content after it stays visible. Sidenote: to stop that behaviour and have grep show only what it has found, and not display the whole line, use the “-o” argument.
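Both behaviours side by side, as a minimal sketch on two sample lines. The pattern is the two-group one from this example; the -o run is only there to make the matched part visible without a color terminal.

```shell
# Two sample lines in filename:linenumber:content form.
lines='widgets.py:1094: from pylab import *
mlab.py:2334: try:'

# --color: the whole line is printed, and on a terminal
# only the matched prefix is highlighted.
printf '%s\n' "$lines" | grep --color '[^:]*:[^:]*:'

# -o: only the matched prefix is printed.
prefix=$(printf '%s\n' "$lines" | grep -o '[^:]*:[^:]*:')
printf '%s\n' "$prefix"
# widgets.py:1094:
# mlab.py:2334:
```

If the file content itself contains two or more colons, the pattern can match a second time further along the line; putting a ^ at the front of the pattern limits it to the prefix.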

What to learn from example 3:

(1) You can use [^C]*C, where C is any character, as many times as you want. Imagine C is a colon; here is an example of using it 2 times: grep --color "[^:]*:[^:]*:"

(2) You can use the grep -nir . * technique to look through all files recursively through all folders, and append the grep trick we just learned to color the “filename:linenumber:” part so that the content stands out (like in the screenshot above).
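To illustrate point (1) with a different character, here is a sketch where C is a comma and the group is used 3 times. The record is made up, and the -E form is just a shorthand I am adding for the same pattern.

```shell
# Hypothetical comma-separated record.
record='alice,42,admin,extra notes'

# The [^C]*C group written out three times, with C = comma.
three_fields=$(printf '%s\n' "$record" | grep -o '^[^,]*,[^,]*,[^,]*,')

# Same thing with extended regex: repeat the group {3} times.
three_fields_ere=$(printf '%s\n' "$record" | grep -oE '^([^,]*,){3}')

printf '%s\n' "$three_fields" "$three_fields_ere"
# alice,42,admin,
# alice,42,admin,
```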

 

 
