Vyoms OneStopTesting.com - Testing EBooks, Tutorials, Articles, Jobs, Training Institutes etc.
OneStopGate.com - Gate EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopMBA.com - MBA EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopIAS.com - IAS EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopSAP.com - SAP EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopGRE.com - of GRE EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
Bookmark and Share Rss Feeds

regular expression enhancements in 11g | Articles | Recent Articles | News Article | Interesting Articles | Technology Articles | Articles On Education | Articles On Corporate | Company Articles | College Articles | Articles on Recession
Sponsored Ads
Hot Jobs
Fresher Jobs
Experienced Jobs
Government Jobs
Walkin Jobs
Placement Section
Company Profiles
Interview Questions
Placement Papers
Resources @ VYOMS
Companies In India
Consultants In India
Colleges In India
Exams In India
Latest Results
Notifications In India
Call Centers In India
Training Institutes In India
Job Communities In India
Courses In India
Jobs by Keyskills
Jobs by Functional Areas
Learn @ VYOMS
GATE Preparation
GRE Preparation
GMAT Preparation
IAS Preparation
SAP Preparation
Testing Preparation
MBA Preparation
News @ VYOMS
Freshers News
Job Articles
Latest News
India News Network
Interview Ebook
Get 30,000+ Interview Questions & Answers in an eBook.
Interview Success Kit - Get Success in Job Interviews
  • 30,000+ Interview Questions
  • Most Questions Answered
  • 5 FREE Bonuses
  • Free Upgrades

VYOMS TOP EMPLOYERS

Wipro Technologies
Tata Consultancy Services
Accenture
IBM
Satyam
Genpact
Cognizant Technologies

Home » Articles » regular expression enhancements in 11g

regular expression enhancements in 11g








Article Posted On Date : Wednesday, March 21, 2012


regular expression enhancements in 11g
Advertisements

Regular expressions for SQL and PL/SQL were introduced in 10g Release 1 (as described in this oracle-developer.net article) and enhanced in 10g Release 2 (this included support for Perl-style expressions). Until 11g, Oracle's regular expression support comprised four functions (REGEXP_LIKE, REGEXP_SUBSTR, REGEXP_INSTR and REGEXP_REPLACE), but with the new version, Oracle has added a fifth, REGEXP_COUNT. In addition, 11g includes some usability enhancements to two of the existing functions.

In this article, we will explore all of the 11g new features for regular expressions. Note that this not a primer in regular expressions and does not work through the metacharacters and their usage. It is assumed that readers are familiar with Oracle's original regular expression support in 10g. For readers who require some background reading, see the article referenced above.
setup

To keep the examples simple and concise, we will use a single string to test our regular expressions. We will select this string from a view, as follows.

SQL> CREATE VIEW v
  2  AS
  3     SELECT 'www.oracle-developer.net' AS string
  4     FROM   dual;

View created.


SQL> SELECT * FROM v;

STRING
------------------------
www.oracle-developer.net

1 row selected.

regexp_count

We will begin with a new function, REGEXP_COUNT. As its name suggests, this 11g function counts the number of occurrences of an expression within a string. The parameters to this function follow a similar pattern to some of the existing REGEXP_% functions (REGEXP_COUNT's parameters are string, expression, starting position and match parameter) and we will see examples of their usage below.

In the first example, we will search for the number of occurrences of "e" in this website's address, as follows.

SQL> SELECT REGEXP_COUNT(string, 'e') AS "e Count"
  2  FROM   v;

   e Count
----------
         5

1 row selected.

Using the third argument to REGEXP_COUNT, we can also specify at which point in the base string we wish to begin our search. In the following example, we will count the number of "e" occurrences starting at the 20th character.

SQL> SELECT REGEXP_COUNT(string, 'e', 20) AS "e Count From Position 20"
  2  FROM   v;

e Count From Position 20
------------------------
                       1

1 row selected.

Readers who are familiar with REGEXP_SUBSTR, REGEXP_INSTR and REGEXP_REPLACE will recognise the general pattern of arguments to REGEXP_COUNT. The fourth argument to REGEXP_COUNT is the "match parameter", which enables us to apply some additional control over our search. For example, we can tell Oracle to ignore case in our search, as the following example demonstrates.

SQL> SELECT REGEXP_COUNT(string, 'E')         AS "E Count"
  2  ,      REGEXP_COUNT(string, 'E', 1, 'i') AS "E Count Ignore Case"
  3  FROM   v;

   E Count E Count Ignore Case
---------- -------------------
         0                   5

1 row selected.

We can see that our first expression found zero capital "E" characters, but when we ignore case (match parameter = 'i'), Oracle finds five occurrences. Incidentally, as with all regular expressions, special characters must be "escaped". In the following example, we will search for the number of full-stops in our string, both with and without an escaping character.

SQL> SELECT REGEXP_COUNT(string, '.')  AS "Dot No Escape (Any Character)"
  2  ,      REGEXP_COUNT(string, '.') AS "Dot With Escape (Actual Dot)"
  3  FROM   v;

Dot No Escape (Any Character) Dot With Escape (Actual Dot)
----------------------------- ----------------------------
                           24                            2

1 row selected.

The "dot" character in a regular expression means "match any character", so every character in "www.oracle-developer.net" is a match for our first expression. When we escape this character in our second function call, we receive the correct answer of two occurrences.

In the examples so far, we have searched for single characters. REGEXP_COUNT is also useful for counting occurrences of multiple characters. In the following example, we will count the number of times "www" appears in this website's address. We will do this in three different ways.

SQL> SELECT REGEXP_COUNT(string, 'www')       AS "www Count Literal"
  2  ,      REGEXP_COUNT(string, 'w{3}')      AS "www Count Metacharacters"
  3  ,      REGEXP_COUNT(string, '(w)(w)(w)') AS "www Count Subexpressions"
  4  FROM   v;

www Count Literal www Count Metacharacters www Count Subexpressions
----------------- ------------------------ ------------------------
                1                        1                        1

1 row selected.

Of course, we knew the answer already, but it is useful to demonstrate how REGEXP_COUNT ignores subexpressions. In the third function call above, we treated each "w" in the search expression as an expression in its own right (we wrapped it in parentheses to create a "subexpression"). These subexpressions can be extremely useful. In 10g, they are primarily used in REGEXP_REPLACE calls, but in 11g they extend to the REGEXP_SUBSTR and REGEXP_INSTR functions, as we will see later in this article.
counting occurrences pre-11g

The REGEXP_COUNT function is a convenient new way of searching for the occurrences of strings or patterns, but it is possible to do this with pre-11g built-ins. If we are searching for a literal string, the following example will work in as many versions of Oracle as most readers have used.

SQL> SELECT LENGTH(string) - LENGTH(REPLACE(string, 'e')) AS "e Count Any Version"
  2  FROM   v;

e Count Any Version
-------------------
                  5

1 row selected.

We have correctly identified that the "e" character appears five times in our sample string. If we wish to count the occurrences of strings greater than one character in length, we need to amend the algorithm slightly to factor in the search string length. In the following example, we calculate the number of times "www" appears in our string using simple built-ins.

SQL> SELECT (LENGTH(string) - LENGTH(REPLACE(string, 'www'))) /
  2          LENGTH('www') AS "www Count Any Version"
  3  FROM   v;

www Count Any Version
---------------------
                    1
                   
1 row selected.

Prior to 11g, it is also possible to search for the number of times an expression (rather than a literal) occurs in a string. In all versions of 10g, we can use REGEXP_REPLACE instead of REPLACE in our algorithm. In the following example, we will search for the number of non-alpha characters in the website address.

SQL> SELECT LENGTH(string)
  2           - LENGTH(REGEXP_REPLACE(string, '[^a-z]')) AS "Non-Alpha Count 10g"
  3  FROM   v;

Non-Alpha Count 10g
-------------------
                  3

1 row selected.

Oracle has correctly counted the three non-alpha characters in our base string.
performance considerations for regexp_count
The new REGEXP_COUNT function is much more simple to use than the pre-11g alternatives. However, being a regular expression, it will be far more intensive in terms of the work it needs to do to calculate the occurrences of an expression or literal. In the following performance tests, we will compare the time it takes REGEXP_COUNT to find the "e" character in our string with the "any version" method across one million iterations.

Because of the simplicity of this test, we will start by setting the PL/SQL optimisation level to 0 (to prevent Oracle from "optimising out" our constant assignment in the loop).

SQL> ALTER SESSION SET PLSQL_OPTIMIZE_LEVEL = 0;

Session altered.

We will begin the comparison with the REGEXP_COUNT version of our code, as follows.

SQL> set timing on

SQL> DECLARE
  2     v_string    VARCHAR2(30) := 'www.oracle-developer.net';
  3     v_delimiter VARCHAR2(1)  := 'e';
  4     v_count     PLS_INTEGER;
  5  BEGIN
  6     FOR i IN 1 .. 1000000 LOOP
  7        v_count := REGEXP_COUNT(v_string, v_delimiter);
  8     END LOOP;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:07.89

One million executions of the REGEXP_COUNT function against our website string in PL/SQL took almost eight seconds (repeated executions of this example gives similar results). The "any version" alternative, using simple built-ins, is as follows.

SQL> DECLARE
  2     v_string    VARCHAR2(30) := 'www.oracle-developer.net';
  3     v_delimiter VARCHAR2(1)  := 'e';
  4     v_count     PLS_INTEGER;
  5  BEGIN
  6     FOR i IN 1 .. 1000000 LOOP
  7        v_count := (LENGTH(v_string) -
  8                    LENGTH(REPLACE(v_string, v_delimiter))) /
  9                    LENGTH(v_delimiter);
 10     END LOOP;
 11  END;
 12  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.85

This completes in less than one second. The flexibility and ease of the REGEXP_COUNT function comes at a price, therefore, and if our processing requirements are reasonably intensive, it might be better to use the pre-11g algorithm for literal searches. For most developers, however, the performance "hit" will probably be minor, depending on the type of application, queries and overall intensity of usage. If milliseconds are not hugely important (as in batch systems, for example), then REGEXP_COUNT's performance cost might be worth bearing for the simplicity it offers.
using regexp_count in "string-to-table" queries

To complete this section on REGEXP_COUNT, we will see a small example whereby counting string occurrences can be very useful. One of the FAQs on Oracle web forums is how to split a delimited string into its constituent elements. The following simple example highlights how we can use REGEXP_COUNT to help us to do this in SQL. We will imagine the website address as a delimited string and define the "." character as our delimiter.

SQL> SELECT string
  2  ,      LEVEL                                     AS element_no
  3  ,      REGEXP_SUBSTR(string, '[^.]+', 1, LEVEL) AS single_element
  4  FROM   v
  5  CONNECT BY LEVEL <= REGEXP_COUNT(string, '.') + 1;

STRING                   ELEMENT_NO SINGLE_ELEMENT
------------------------ ---------- ------------------------
www.oracle-developer.net          1 www
www.oracle-developer.net          2 oracle-developer
www.oracle-developer.net          3 net

3 rows selected.

There are many variations on this technique. For each string, we need to generate as many rows as there are constituent pieces. We do this by adding one to the count of delimiters and using a CONNECT BY "trick" to generate this number of rows. We can then use either REGEXP_SUBSTR (as in our example) or a combination of SUBSTR and INSTR to extract each element.
regexp_substr

As stated earlier, REGEXP_SUBSTR is one of the original regular expressions introduced in 10g. This is a useful function as it enables us to cut a literal or expression from within a string without needing to know its starting or ending position. In the following example, we will use REGEXP_SUBSTR to extract "oracle-developer" from the website's address (using an expression rather than literal for demonstration purposes).

SQL> SELECT REGEXP_SUBSTR(string, '.[a-z-]+.') AS url_middle_10g
  2  FROM   v;

URL_MIDDLE_10G
------------------
.oracle-developer.

1 row selected.

We can see that we almost have our result. We defined our expression as a string of lower-case alpha characters and a hyphen bounded by full-stops. Unfortunately, we also have our boundary characters (the full-stops). In 10g, we need to strip these using functions such as REPLACE, LTRIM, RTRIM or even SUBSTR. For example, we will remove the full-stops using REPLACE, as follows.

SQL> SELECT REPLACE(
  2            REGEXP_SUBSTR(string, '.[a-z-]+.'),
  3            '.') AS url_middle_cleaned_10g
  4  FROM   v;

URL_MIDDLE_CLEANED_10G
----------------------
oracle-developer

1 row selected.

We now have the string we actually wanted and this leads us nicely on to the 11g enhancement to REGEXP_SUBSTR. Oracle has added a new parameter to the function to enable us to extract the results of a subexpression, rather than an entire expression. In our example, the regular expression has three logical components or "subexpressions". First, we have a full-stop to the left of the string we want to extract. Second, we have the expression to find the string we actually want to extract. Third, we have the terminating full-stop.

We can use parentheses to define each of these subexpressions which turns our original expression .[a-z-]+. into (.)([a-z-]+)(.). Now that we have three subexpressions, we can ask REGEXP_SUBSTR to return any one of them, rather than the entire result. In the following example, therefore, we will request the results of the second subexpression only.

SQL> SELECT REGEXP_SUBSTR(
  2            string,
  3            '(.)([a-z-]+)(.)', --<-- expression with subexpressions
  4            1,                   --<-- starting position
  5            1,                   --<-- nth occurrence
  6            'i',                 --<-- match parameter (ignore case)
  7            2                    --<-- 11g: subexpression to return
  8            ) AS url_middle_11g
  9  FROM   v;

URL_MIDDLE_11G
----------------
oracle-developer

1 row selected.

Note that we can only specify one subexpression to return. However, this new feature is an extremely useful addition to REGEXP_SUBSTR and one which greatly simplifies its use.
regexp_instr

Oracle has added the same subexpression parameter to REGEXP_INSTR in 11g. Continuing with the example we used for REGEXP_SUBSTR above, we will use REGEXP_INSTR in its 10g form to determine the start and offset positions of the "oracle-developer" part of the sample website address. As before, we will use regular expressions rather than literals for demonstration purposes. We will begin by finding the start position of the search expression, as follows.

SQL> SELECT string
  2  ,      REGEXP_INSTR(string, '.[a-z-]+.') AS url_middle_start_10g
  3  FROM   v;

STRING                   URL_MIDDLE_START_10G
------------------------ --------------------
www.oracle-developer.net                    4

1 row selected.

The REGEXP_INSTR function returns the starting position of the entire expression, which in the case is the position of the full-stop. Using REGEXP_INSTR's offset parameter (described in the background reading), we will try to determine the position immediately following the end of the expression, as follows.

SQL> SELECT string
  2  ,      REGEXP_INSTR(
  3            string,
  4            '.[a-z-]+.',             --<-- expression
  5            1,                         --<-- starting position
  6            1,                         --<-- nth occurrence
  7            1                          --<-- offset indicator
  8            ) AS url_middle_offset_10g
  9  FROM   v;

STRING                   URL_MIDDLE_OFFSET_10G
------------------------ ---------------------
www.oracle-developer.net                    22

1 row selected.

We will use the SUBSTR function to summarise our results, as follows.

SQL> SELECT SUBSTR(string, 4, 1)  AS instr_start_10g
  2  ,      SUBSTR(string, 22, 1) AS instr_offset_10g
  3  FROM   v;

INSTR_START_10G INSTR_OFFSET_10G
--------------- ----------------
.               n

1 row selected.

These are clearly not what we wanted, although they are correct in the context of the 10g version of REGEXP_INSTR. Using the new 11g subexpression parameter, however, we can more accurately define our results, as follows.

SQL> SELECT string
  2  ,      REGEXP_INSTR(
  3            string,
  4            '(.)([a-z-]+)(.)',      --<-- expression with subexpressions
  5            1,                        --<-- starting position
  6            1,                        --<-- nth occurrence
  7            0,                        --<-- offset indicator
  8            'i',                      --<-- match parameter (ignore case)
  9            2                         --<-- 11g: subexpression to return
 10            ) AS url_middle_start_11g
 11  FROM   v;

STRING                   URL_MIDDLE_START_11G
------------------------ --------------------
www.oracle-developer.net                    5

1 row selected.

By specifying that we are only interested in the second subexpression, we have correctly identified the start of the "oracle-developer" string within the website address. We can do the same for the position immediately after this string, either by using the offset parameter with subexpression two or by finding the start of the third subexpression, as follows.

SQL> SELECT string
  2  ,      REGEXP_INSTR(
  3            string,
  4            '(.)([a-z-]+)(.)',      --<-- expression with subexpressions
  5            1,                        --<-- starting position
  6            1,                        --<-- nth occurrence
  7            1,                        --<-- offset indicator
  8            'i',                      --<-- match parameter (ignore case)
  9            2                         --<-- 11g: subexpression to return
 10            ) AS url_middle_offset_1_11g
 11  ,      REGEXP_INSTR(
 12            string,
 13            '(.)([a-z-]+)(.)',      --<-- expression with subexpressions
 14            1,                        --<-- starting position
 15            1,                        --<-- nth occurrence
 16            0,                        --<-- no offset indicator
 17            'i',                      --<-- match parameter (ignore case)
 18            3                         --<-- 11g: subexpression to return
 19            ) AS url_middle_offset_2_11g
 20  FROM   v;

STRING                   URL_MIDDLE_OFFSET_1_11G URL_MIDDLE_OFFSET_2_11G
------------------------ ----------------------- -----------------------
www.oracle-developer.net                      21                      21

1 row selected.

We can see that either method returns the same result. We can verify the results of the 11g version of REGEXP_INSTR as follows.

SQL> SELECT SUBSTR(string, 5, 1)  AS instr_start_11g
  2  ,      SUBSTR(string, 21, 1) AS instr_offset_11g
  3  FROM   v;

INSTR_START_11G INSTR_OFFSET_11G
--------------- ----------------
o               .

1 row selected.

As we can see, the subxpressions parameter enables us to define the results we want from the REGEXP_INSTR function with greater precision.






Sponsored Ads



Interview Questions
HR Interview Questions
Testing Interview Questions
SAP Interview Questions
Business Intelligence Interview Questions
Call Center Interview Questions

Databases

Clipper Interview Questions
DBA Interview Questions
Firebird Interview Questions
Hierarchical Interview Questions
Informix Interview Questions
Microsoft Access Interview Questions
MS SqlServer Interview Questions
MYSQL Interview Questions
Network Interview Questions
Object Relational Interview Questions
PL/SQL Interview Questions
PostgreSQL Interview Questions
Progress Interview Questions
Relational Interview Questions
SQL Interview Questions
SQL Server Interview Questions
Stored Procedures Interview Questions
Sybase Interview Questions
Teradata Interview Questions

Microsof Technologies

.Net Database Interview Questions
.Net Deployement Interview Questions
ADO.NET Interview Questions
ADO.NET 2.0 Interview Questions
Architecture Interview Questions
ASP Interview Questions
ASP.NET Interview Questions
ASP.NET 2.0 Interview Questions
C# Interview Questions
Csharp Interview Questions
DataGrid Interview Questions
DotNet Interview Questions
Microsoft Basics Interview Questions
Microsoft.NET Interview Questions
Microsoft.NET 2.0 Interview Questions
Share Point Interview Questions
Silverlight Interview Questions
VB.NET Interview Questions
VC++ Interview Questions
Visual Basic Interview Questions

Java / J2EE

Applet Interview Questions
Core Java Interview Questions
Eclipse Interview Questions
EJB Interview Questions
Hibernate Interview Questions
J2ME Interview Questions
J2SE Interview Questions
Java Interview Questions
Java Beans Interview Questions
Java Patterns Interview Questions
Java Security Interview Questions
Java Swing Interview Questions
JBOSS Interview Questions
JDBC Interview Questions
JMS Interview Questions
JSF Interview Questions
JSP Interview Questions
RMI Interview Questions
Servlet Interview Questions
Socket Programming Interview Questions
Springs Interview Questions
Struts Interview Questions
Web Sphere Interview Questions

Programming Languages

C Interview Questions
C++ Interview Questions
CGI Interview Questions
Delphi Interview Questions
Fortran Interview Questions
ILU Interview Questions
LISP Interview Questions
Pascal Interview Questions
Perl Interview Questions
PHP Interview Questions
Ruby Interview Questions
Signature Interview Questions
UML Interview Questions
VBA Interview Questions
Windows Interview Questions
Mainframe Interview Questions


Copyright © 2001-2024 Vyoms.com. All Rights Reserved. Home | About Us | Advertise With Vyoms.com | Jobs | Contact Us | Feedback | Link to Us | Privacy Policy | Terms & Conditions
Placement Papers | Get Your Free Website | IAS Preparation | C++ Interview Questions | C Interview Questions | Report a Bug | Romantic Shayari | CAT 2024

Fresher Jobs | Experienced Jobs | Government Jobs | Walkin Jobs | Company Profiles | Interview Questions | Placement Papers | Companies In India | Consultants In India | Colleges In India | Exams In India | Latest Results | Notifications In India | Call Centers In India | Training Institutes In India | Job Communities In India | Courses In India | Jobs by Keyskills | Jobs by Functional Areas

Testing Articles | Testing Books | Testing Certifications | Testing FAQs | Testing Downloads | Testing Interview Questions | Testing Jobs | Testing Training Institutes

Gate Articles | Gate Books | Gate Colleges | Gate Downloads | Gate Faqs | Gate Jobs | Gate News | Gate Sample Papers | Gate Training Institutes

MBA Articles | MBA Books | MBA Case Studies | MBA Business Schools | MBA Current Affairs | MBA Downloads | MBA Events | MBA Notifications | MBA FAQs | MBA Jobs
MBA Job Consultants | MBA News | MBA Results | MBA Courses | MBA Sample Papers | MBA Interview Questions | MBA Training Institutes

GRE Articles | GRE Books | GRE Colleges | GRE Downloads | GRE Events | GRE FAQs | GRE News | GRE Training Institutes | GRE Sample Papers

IAS Articles | IAS Books | IAS Current Affairs | IAS Downloads | IAS Events | IAS FAQs | IAS News | IAS Notifications | IAS UPSC Jobs | IAS Previous Question Papers
IAS Results | IAS Sample Papers | IAS Interview Questions | IAS Training Institutes | IAS Toppers Interview

SAP Articles | SAP Books | SAP Certifications | SAP Companies | SAP Study Materials | SAP Events | SAP FAQs | SAP Jobs | SAP Job Consultants
SAP Links | SAP News | SAP Sample Papers | SAP Interview Questions | SAP Training Institutes |


Copyright ©2001-2024 Vyoms.com, All Rights Reserved.
Disclaimer: VYOMS.com has taken all reasonable steps to ensure that information on this site is authentic. Applicants are advised to research bonafides of advertisers independently. VYOMS.com shall not have any responsibility in this regard.