Vyoms OneStopTesting.com - Testing EBooks, Tutorials, Articles, Jobs, Training Institutes etc.
OneStopGate.com - Gate EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopMBA.com - MBA EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopIAS.com - IAS EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopSAP.com - SAP EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
OneStopGRE.com - of GRE EBooks, Tutorials, Articles, FAQs, Jobs, Training Institutes etc.
Bookmark and Share Rss Feeds

Preprocess External Tables | Articles | Recent Articles | News Article | Interesting Articles | Technology Articles | Articles On Education | Articles On Corporate | Company Articles | College Articles | Articles on Recession
Sponsored Ads
Hot Jobs
Fresher Jobs
Experienced Jobs
Government Jobs
Walkin Jobs
Placement Section
Company Profiles
Interview Questions
Placement Papers
Resources @ VYOMS
Companies In India
Consultants In India
Colleges In India
Exams In India
Latest Results
Notifications In India
Call Centers In India
Training Institutes In India
Job Communities In India
Courses In India
Jobs by Keyskills
Jobs by Functional Areas
Learn @ VYOMS
GATE Preparation
GRE Preparation
GMAT Preparation
IAS Preparation
SAP Preparation
Testing Preparation
MBA Preparation
News @ VYOMS
Freshers News
Job Articles
Latest News
India News Network
Interview Ebook
Get 30,000+ Interview Questions & Answers in an eBook.
Interview Success Kit - Get Success in Job Interviews
  • 30,000+ Interview Questions
  • Most Questions Answered
  • 5 FREE Bonuses
  • Free Upgrades


Wipro Technologies
Tata Consultancy Services
Cognizant Technologies

Home » Articles » Preprocess External Tables

Preprocess External Tables

Article Posted On Date : Wednesday, March 21, 2012

Preprocess External Tables

Use inline preprocessing on external tables to simplify extract, transform, and load operations.

External tables, available since Oracle9I Database, enable you to view a text file as if it were data in a database table. Used in data warehouse environments, extract, transform, and load (ETL) processes can read an external table’s text file directly and subsequently load the data into summary tables. ETL eliminates the step of loading the text files into intermediate storage, saving significant space and time. But sometimes the text files being loaded into the database are compressed or not precisely in the format expected by the external tables. In such cases, the text files must be transformed into the appropriate format"for example, uncompressed"in a separate process before they can be read by external tables. Can this additional “out-of-line” task be eliminated"or at least be made inline with ETL processes"so that processing logic does not need to change?

In Oracle Database 11g Release 2, it is possible with inline preprocessing of input text files in external tables. This article shows you how to use this new feature to make your ETL processes handle any type of file without an additional out-of-line step.
External Tables

First let’s look at an external table setup. Consider a system that receives a file called indata1.txt to be loaded into the database by an ETL process. This file is stored in a directory (on a Windows-based system) named c:etl_dir.

To create the external table, you must first create a directory object; in this case, the table is named ETL_DIR. Logging in as SYS, you issue the following command:

create directory etl_dir as

The user who owns the ETL_DIR external table is ETLADMIN, and that user should have the privileges to read from and write on the directory. The following SQL grants those privileges to ETLADMIN:

grant read, write on directory etl_dir to etladmin;

Now you can create the external table on the indata1.txt text file:

create table indata1
    cust_id     number,
    cust_name    varchar2(20),
    credit_limit number(10)
organization external
    type oracle_loader
    default directory etl_dir
    access parameters
       records delimited by newline
       fields terminated by ","
location ('indata1.txt')

The indata1.txt text file contains the following rows of information:

1,John Smith,1000
2,Jack Smith,2000

With the external table created, you can now select from it as you would from a typical database table:

SQL> select * from indata1;

"""""""  """"""""""  """"""""""""
      1  John Smith          1000
      2  Jack Smith          2000

Although you can access the external table in many of the same ways you would a database table, you cannot update the external table.
Change in Process

Now suppose that with your external table and its text file in place, the input file for the external table is compressed to reduce the volume of data transmitted across the network. Although compression helps the network bandwidth utilization, it creates a challenge for the ETL process. The file must be uncompressed before its contents can be accessed by the external table.

Rather than uncompress the file in a separate process, you can use the preprocessor feature of Oracle Database 11g Release 2 with external tables to uncompress the file inline. And you will not need to change the ETL process.

To use the preprocessor feature, first you need to create a preprocessor program. The external table expects input in a text format but not necessarily in a file. The external table does not need to read a file; rather, it expects to get the file contents “fed” to it. So the preprocessor program must stream the input data directly to the external table"and not create another input file. The input to the preprocessor will be a compressed file, and the output will be the uncompressed contents.

The following is the code for your new preprocessor program, named preprocess.bat:

@echo off
BINunzip.exe -qc %1

The first line, @echo off, suppresses the output of the command in a Windows environment. The remaining code calls the unzip.exe utility located in the Oracle home. The utility needs an input file, which is the first (and only) parameter passed to it, shown as %1. The options q and c tell the utility to uncompress quietly (q) without producing extraneous output such as “Inflating” or “%age inflated” and match filenames case-insensitively (c), respectively.

Next you need to create the directory object where this preprocessor program is located. Logging in as SYS, issue

create directory execdir as 'c: ools';

And now grant EXECUTE permissions on the directory to the ETLADMIN user:

grant execute on directory execdir to etladmin;

Finally, create the new external table:

create table indata1
  cust_id        number,
  cust_name    varchar2(20),
  credit_limit number(10)
organization external
  type oracle_loader
  default directory etl_dir
  access parameters
    records delimited by newline
    preprocessor execdir:'preprocess.bat'
    fields terminated by ","
location ('indata1.txt')

Note that the only difference from the previous CREATE TABLE statement is the new line within the access parameters clause:

preprocessor execdir:'preprocess.bat'

It calls the preprocess.bat executable in the directory specified by EXECDIR before the external table accesses the indata1.txt file in the location specified by the ETL_DIR directory. Remember, indata1.txt is now a compressed file. So, in effect, the external table reads not the actual specified input file but rather the output of preprocess.bat, which is the uncompressed data from the indata1.txt file.

If you select from the external table now, the output will be similar to that of the earlier select * from indata1; query. The preprocessor passed the uncompressed contents of the indata1.txt (compressed) file on to the external table. There was no need to uncompress the file first"saving significant time and the intermediate space required and making it unnecessary to change the ETL process.

This inline preprocessing unzip example uses a script, but that is not always necessary. An executable can be used instead. For example, in Linux you can use /bin/gunzip. However, the utility can’t accept any parameters. So if you pass parameters (as in this article’s example), you must use a script.
Security Concerns

The EXECUTE privilege on a directory is a new feature introduced in Oracle Database 11g Release 2. It enables the DBA to grant EXECUTE permissions only for certain directories and only to certain users. Without WRITE privileges, users will not be able to update the executables inside a directory to insert malicious code, but users will be able to execute the “approved” code accessible in a single location. The DBA can put all the necessary preprocessor tools into a single directory and grant EXECUTE privileges there to the users who may need them. And, of course, the executables and data should be in different directories.

Preprocessing also requires some special precautions on the part of the DBA. Because the executables called by preprocessing programs will be executed under the privileges of the Oracle software owner and malicious executable code can cause a lot of damage, the DBA should be extremely careful in monitoring executables for potentially harmful code.

The directory containing the preprocessor executables needs to be accessible to the Oracle software owner for EXECUTE operations only, not for WRITE activity. Therefore, as an added precaution, the system administrator can remove WRITE access to that directory from all users, including the Oracle software owner. This significantly reduces the chance of damage by malicious code.
Other Uses

Compression is not the only use for inline preprocessing, although it certainly is the most widely used. You can, for example, use this preprocessing technique to show the output of a program as an external table. Consider, for instance, the dir command in Windows for listing the contents of a directory. How would you like to get the output as a table so that you can apply predicates?

Getting and using this output is quite simple with the preprocessor functionality. Remember, the preprocessor does not actually need a file but, rather, requires the output of the preprocessor program. You can write a preprocessor program to send the output of a dir command. The new pre-processor program, named preproc_dir.bat, has only the following two lines:

@echo off

You will also need a file for the external table. The contents of the file are irrelevant, so you can use any file that the Oracle software owner can read in a directory to which that owner has read access. For this example, the file is dirfile.txt, and although the contents of the file are immaterial, the file must exist, because the external table will access it. Listing 1 shows how to create the table.

Code Listing 1: Creating an external table for showing a directory listing

create table dir_tab
       mod_dt       date,
       mod_time     char(10),
       file_type    char(10),
       file_size    char(10),
       file_name    char(40)
organization external
    type oracle_loader
    default directory etl_dir
    access parameters
       records delimited by newline
       preprocessor execdir:'preproc_dir.bat'
       skip 5
       load when (mod_dt != blanks)
             mod_dt       position (01:10) DATE mask "mm/dd/yyyy",
             mod_time     position (11:20),
             file_type    position (21:29),
             file_size    position (30:38),
             file_name    position (39:80)
location ('dirfile.txt')
reject limit unlimited

-- select from this table

SQL> select * from dir_tab;

"""""""""    """"""""   """""""""  """""""""  """""""""
16-DEC-10   10:12 AM     <DIR>                .
16-DEC-10   10:12 AM     <DIR>                ..
22-MAY-10   09:57 PM     <DIR>                archive
22-MAY-10   10:27 PM                  2,048   hc_alap112.dat
05-DEC-10   07:07 PM                     36   indata1.txt
22-DEC-05   04:07 AM                 31,744   oradba.exe
16-DEC-10   09:58 AM                  1,123   oradim.log
28-SEP-10   12:41 PM                  1,536   PWDALAP112.ora
16-DEC-10   09:58 AM                  2,560   SPFILEALAP112.ORA

9 rows selected.

-- select a file not updated in last 1 year

SQL> select * from dir_tab where mod_dt < sysdate - 365;

"""""""""   """"""""   """""""""  """""""""  """""""""
22-DEC-05   04:07 AM               31,744  oradba.exe

Next Steps

READ more about processing commands
Oracle Database Utilities, chapter 14, “ The ORACLE_LOADER Access Driver”
“Using the Preprocessor Feature with External Tables in Oracle Database 11g Release 2”
Because the dir command displays output in a prespecified manner, the external table easily parses it by reading the fields located in specific positions. For example, positions 1 through 10 display the date, 11 through 20 display the time, and so on. The dir command produces some heading and preliminary information that the external table has to ignore, so there is a skip 5 clause in Listing 1 that skips the first five lines of the output. The last few lines of the output show how many files and directories are present and how much free space remains. This output must be skipped as well, so the external table displays records only when the date column has a value.

Listing 1 also shows the result of a query against the external table. Because the MOD_DT column is of the date datatype, you can also apply a WHERE condition to select a specified set of records.

External tables enable users to access data in text files, immediately eliminating the need to load input text files to intermediate tables for processing"saving both time and storage space. Now, with Oracle Database 11g Release 2, intermediate processing of any kind"such as decompression of compressed input files"is eliminated, further saving time and storage, not to mention a change to the existing code.

But the power of inline preprocessing is not limited to decompression alone. It can be applied to any type of prior processing required, as long as it produces an output that can be parsed by the external table. The directory listing is just one small demonstration of this rich capability of preprocessing in external tables. You can also use it to massage datafiles to fit a specific format, append or augment data from multiple sources dynamically"without creating any intermediate storage"or even send an e-mail when a specific text file is accessed by an external table. What you can do with inline preprocessing is limited only by your imagination. 

Sponsored Ads

Interview Questions
HR Interview Questions
Testing Interview Questions
SAP Interview Questions
Business Intelligence Interview Questions
Call Center Interview Questions


Clipper Interview Questions
DBA Interview Questions
Firebird Interview Questions
Hierarchical Interview Questions
Informix Interview Questions
Microsoft Access Interview Questions
MS SqlServer Interview Questions
MYSQL Interview Questions
Network Interview Questions
Object Relational Interview Questions
PL/SQL Interview Questions
PostgreSQL Interview Questions
Progress Interview Questions
Relational Interview Questions
SQL Interview Questions
SQL Server Interview Questions
Stored Procedures Interview Questions
Sybase Interview Questions
Teradata Interview Questions

Microsof Technologies

.Net Database Interview Questions
.Net Deployement Interview Questions
ADO.NET Interview Questions
ADO.NET 2.0 Interview Questions
Architecture Interview Questions
ASP Interview Questions
ASP.NET Interview Questions
ASP.NET 2.0 Interview Questions
C# Interview Questions
Csharp Interview Questions
DataGrid Interview Questions
DotNet Interview Questions
Microsoft Basics Interview Questions
Microsoft.NET Interview Questions
Microsoft.NET 2.0 Interview Questions
Share Point Interview Questions
Silverlight Interview Questions
VB.NET Interview Questions
VC++ Interview Questions
Visual Basic Interview Questions

Java / J2EE

Applet Interview Questions
Core Java Interview Questions
Eclipse Interview Questions
EJB Interview Questions
Hibernate Interview Questions
J2ME Interview Questions
J2SE Interview Questions
Java Interview Questions
Java Beans Interview Questions
Java Patterns Interview Questions
Java Security Interview Questions
Java Swing Interview Questions
JBOSS Interview Questions
JDBC Interview Questions
JMS Interview Questions
JSF Interview Questions
JSP Interview Questions
RMI Interview Questions
Servlet Interview Questions
Socket Programming Interview Questions
Springs Interview Questions
Struts Interview Questions
Web Sphere Interview Questions

Programming Languages

C Interview Questions
C++ Interview Questions
CGI Interview Questions
Delphi Interview Questions
Fortran Interview Questions
ILU Interview Questions
LISP Interview Questions
Pascal Interview Questions
Perl Interview Questions
PHP Interview Questions
Ruby Interview Questions
Signature Interview Questions
UML Interview Questions
VBA Interview Questions
Windows Interview Questions
Mainframe Interview Questions

Copyright © 2001-2021 Vyoms.com. All Rights Reserved. Home | About Us | Advertise With Vyoms.com | Jobs | Contact Us | Feedback | Link to Us | Privacy Policy | Terms & Conditions
Placement Papers | Get Your Free Website | IAS Preparation | C++ Interview Questions | C Interview Questions | Report a Bug | Romantic Shayari | CAT 2021

Fresher Jobs | Experienced Jobs | Government Jobs | Walkin Jobs | Company Profiles | Interview Questions | Placement Papers | Companies In India | Consultants In India | Colleges In India | Exams In India | Latest Results | Notifications In India | Call Centers In India | Training Institutes In India | Job Communities In India | Courses In India | Jobs by Keyskills | Jobs by Functional Areas

Testing Articles | Testing Books | Testing Certifications | Testing FAQs | Testing Downloads | Testing Interview Questions | Testing Jobs | Testing Training Institutes

Gate Articles | Gate Books | Gate Colleges | Gate Downloads | Gate Faqs | Gate Jobs | Gate News | Gate Sample Papers | Gate Training Institutes

MBA Articles | MBA Books | MBA Case Studies | MBA Business Schools | MBA Current Affairs | MBA Downloads | MBA Events | MBA Notifications | MBA FAQs | MBA Jobs
MBA Job Consultants | MBA News | MBA Results | MBA Courses | MBA Sample Papers | MBA Interview Questions | MBA Training Institutes

GRE Articles | GRE Books | GRE Colleges | GRE Downloads | GRE Events | GRE FAQs | GRE News | GRE Training Institutes | GRE Sample Papers

IAS Articles | IAS Books | IAS Current Affairs | IAS Downloads | IAS Events | IAS FAQs | IAS News | IAS Notifications | IAS UPSC Jobs | IAS Previous Question Papers
IAS Results | IAS Sample Papers | IAS Interview Questions | IAS Training Institutes | IAS Toppers Interview

SAP Articles | SAP Books | SAP Certifications | SAP Companies | SAP Study Materials | SAP Events | SAP FAQs | SAP Jobs | SAP Job Consultants
SAP Links | SAP News | SAP Sample Papers | SAP Interview Questions | SAP Training Institutes |

Copyright ©2001-2021 Vyoms.com, All Rights Reserved.
Disclaimer: VYOMS.com has taken all reasonable steps to ensure that information on this site is authentic. Applicants are advised to research bonafides of advertisers independently. VYOMS.com shall not have any responsibility in this regard.