
Javascript Regular Expression greedy vs lazy

February 4th, 2012

We will explore the difference between greedy and lazy matching in regular expressions with the help of
an example.

Consider part of an HTML page that contains some words in bold. Here is an example:

<p> This is an example page </p>
<b> First Bold </b>
Something here
<b> Second Bold </b>
Something else here
<p> Finish </p>
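Now, we want to write a regular expression which will match <b> First Bold
</b> in the above example. To do this we use the following regular expression
/<b>.*<\/b>/
hoping that it will work. (Inside a JavaScript regular expression literal, the / of </b> has to be
escaped as \/.) When the markup sits on a single line, as it does in the script further down, this
regular expression instead matches the following part of the subject
<b> First Bold </b>Something here <b> Second Bold </b>
Note that in JavaScript . does not match a line break unless the s flag is set, so the match only
runs this far when no line break separates the two bold sections.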
Why? To understand this, we need to look at how the match proceeds. The
regular expression /<b>.*<\/b>/ first looks for <b> in the subject. Once <b> is found,
.* matches all the way to the end of the subject. In the process, it eats up the rest of the
subject, right up to the final </p>. When its stomach is full, the engine looks at the
regular expression to see what it has to match next. It is </b>, and there is nothing left in
the subject to match it against. So what it does is backtrack.

 

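The engine gives back the last character, the > of </p>, and tries to match the < of </b> against
it. The match fails. It then gives back p and tries to match the < of </b> against that. This also
fails. It keeps doing this until the < of </b> in the regular expression lines up with the < of
</p> in the subject. There < and / match, but b does not match p, so the engine keeps backtracking.
The first position at which the whole of </b> matches is the closing tag after Second Bold, and
that is where the match ends. Because .* gives back only as many characters as it has to, the
result runs from the first <b> to the last </b>.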
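The backtracking described above can be imitated in a few lines of plain JavaScript. This is only a sketch of the idea, not how a real regex engine is implemented: greedyMatch is a hypothetical helper, it only handles the fixed pattern "opening text, anything, closing text", and it only tries the first opening tag. console.log is used so that it runs outside a browser as well.

```javascript
// Subject on a single line, as in the script in this article.
const subject =
  "<p> This is an example page </p><b> First Bold </b>Something here " +
  "<b> Second Bold </b> Something else here<p> Finish </p>";

// Hypothetical helper: imitates how a greedy .* gives characters back.
function greedyMatch(text, open, close) {
  const start = text.indexOf(open);
  if (start === -1) return null;
  const afterOpen = start + open.length;
  // .* first takes everything up to the end of the text, then steps back
  // one character at a time until `close` matches at that position.
  for (let end = text.length; end >= afterOpen; end--) {
    if (text.startsWith(close, end)) {
      return text.slice(start, end + close.length);
    }
  }
  return null;
}

const simulated = greedyMatch(subject, "<b>", "</b>");
const real = subject.match(/<b>.*<\/b>/)[0];

console.log(simulated);          // <b> First Bold </b>Something here <b> Second Bold </b>
console.log(simulated === real); // true
```

Because the loop starts at the end of the subject and walks backwards, the first closing tag it meets is the last one in the text, which is exactly why the greedy match is longer than intended.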

<html>
<body>
<script type="text/javascript">
<!--
/*
********************************************************
Javascript Regular Expression Example ch4 Ex 01
Understanding Greedy Vs. Lazy Match
********************************************************
*/
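// Greedy pattern: .* takes as many characters as it can.
var pattern1 = /<b>(.*)<\/b>/;
// The subject is kept on one line, because . does not match a line break.
var string1 = "<p> This is an example page </p><b> First Bold </b>Something here " +
    "<b> Second Bold </b> Something else here<p> Finish </p>";
// match() returns an array: [0] is the whole match, [1] is the captured group.
var string2 = string1.match(pattern1);
document.write("string2[0] is : ", string2[0], "<br />");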
//-->
</script>
</body>
</html>

 

If we run this code, we get the following output:

string2[0] is : First Bold Something here Second Bold

Notice that, since the output is written to an HTML page, we do not see <b> and </b>.
The browser treats them as tags, so we see the words in bold instead. The idea should still
be clear: the regular expression /<b>(.*)<\/b>/ matches all the text from the first <b> all the
way to the last </b>.
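To see the tags themselves rather than bold text, the match can be escaped before it is written to the page. The following is a minimal sketch and not part of the original script: escapeHtml is a hypothetical helper name, and console.log stands in for document.write so that the snippet also runs outside a browser.

```javascript
// Hypothetical helper: replaces the characters that HTML treats as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const page =
  "<p> This is an example page </p><b> First Bold </b>Something here " +
  "<b> Second Bold </b> Something else here<p> Finish </p>";

const greedyResult = page.match(/<b>(.*)<\/b>/);
const visible = escapeHtml(greedyResult[0]);

console.log(visible);
// &lt;b&gt; First Bold &lt;/b&gt;Something here &lt;b&gt; Second Bold &lt;/b&gt;

// In the page, the escaped text could be written out like this:
// document.write("string2[0] is : ", visible, "<br />");
```

The & character is replaced first so that the entities produced by the later replacements are not escaped a second time.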

If we change the statement

var pattern1=/<b>(.*)<\/b>/;
to
var pattern1=/<b>(.*?)<\/b>/;

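the quantifier becomes lazy: .*? takes as few characters as possible, so the match ends at the
first </b>, and we get the following output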

string2[0] is : First Bold
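The two patterns can also be compared side by side. This is a small sketch that assumes the same single-line subject as the script above; the variable names are made up for illustration, and console.log is used instead of document.write so that the tags stay visible.

```javascript
const text =
  "<p> This is an example page </p><b> First Bold </b>Something here " +
  "<b> Second Bold </b> Something else here<p> Finish </p>";

const greedy = text.match(/<b>(.*)<\/b>/);  // .*  takes as much as it can
const lazy = text.match(/<b>(.*?)<\/b>/);   // .*? takes as little as it can

console.log(greedy[0]); // <b> First Bold </b>Something here <b> Second Bold </b>
console.log(lazy[0]);   // <b> First Bold </b>
console.log(lazy[1]);   // " First Bold " (the captured group, spaces included)

// With the g flag, the lazy pattern finds each bold section separately.
const allBold = Array.from(
  text.matchAll(/<b>(.*?)<\/b>/g),
  (m) => m[1].trim()
);
console.log(allBold); // [ 'First Bold', 'Second Bold' ]
```

Index 0 of each result is the whole match including the tags, while index 1 holds only what the parentheses captured.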
