
Running the System For DBA








Article Posted On Date : Friday, May 22, 2009



Each of the queries was run 21 times, and the median runtime was used as the representative value, as shown below.

EVENT WINNER_TIME          RNR_UP_TIME          LOSER_TIME
----- -------------------- -------------------- --------------------
1.    DIM = 00:00:06.049   REL = 00:00:09.023   HYB = 00:00:09.644
2.    DIM = 00:00:04.186   HYB = 00:00:07.961   REL = 00:00:08.092
3.    DIM = 00:00:03.415   HYB = 00:00:04.938   REL = 00:00:05.428
4.    DIM = 00:00:00.140   HYB = 00:00:00.190   REL = 00:00:06.990
5.    HYB = 00:00:00.131   DIM = 00:00:00.651   REL = 00:00:05.418
6.    DIM = 00:00:00.530   HYB = 00:00:01.392   REL = 00:00:05.478
7.    DIM = 00:00:00.520   HYB = 00:00:01.572   REL = 00:00:07.971
8.    DIM = 00:00:00.461   HYB = 00:00:00.731   REL = 00:00:01.882

Converting to a percentage scale, to make the values relative rather than absolute, and forcing the fastest schema to 100 percent by definition produces these percentages:

EVENT WINNER_OFFSET        RNR_UP_OFFSET        LOSER_OFFSET
----- -------------------- -------------------- --------------------
1.    DIM =    100%        REL =    149%        HYB =    159%
2.    DIM =    100%        HYB =    190%        REL =    193%
3.    DIM =    100%        HYB =    145%        REL =    159%
4.    DIM =    100%        HYB =    136%        REL =   4993%
5.    HYB =    100%        DIM =    497%        REL =   4136%
6.    DIM =    100%        HYB =    263%        REL =   1034%
7.    DIM =    100%        HYB =    302%        REL =   1533%
8.    DIM =    100%        HYB =    159%        REL =    408%
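
The two steps described above (take the median of repeated runs, then scale so that the fastest schema is 100 percent) can be sketched in Python. This is an illustrative sketch, not the code used in the original analysis: the function names `representative_runtime` and `percent_offsets` and the `sample_runs` values are hypothetical. The event 4 runtimes are taken from the timing table.

```python
from statistics import median


def representative_runtime(runs):
    """Return the median of repeated runtimes (in seconds) for one query on one schema."""
    return median(runs)


def percent_offsets(times):
    """Rank schemas fastest-first and scale so that the fastest is 100 percent.

    `times` maps a schema label to its representative runtime in seconds.
    """
    fastest = min(times.values())
    ranked = sorted(times.items(), key=lambda item: item[1])
    return [(schema, round(100 * t / fastest)) for schema, t in ranked]


# Hypothetical runtimes, for illustration only. The article used 21 runs per query;
# the median keeps one slow outlier (0.90) from distorting the representative value.
sample_runs = [0.15, 0.14, 0.13, 0.14, 0.90]
print(representative_runtime(sample_runs))

# Median runtimes for event 4, in seconds, from the timing table.
event4 = {"DIM": 0.140, "HYB": 0.190, "REL": 6.990}
print(percent_offsets(event4))  # [('DIM', 100), ('HYB', 136), ('REL', 4993)]
```

Running the same function over each row of the timing table should reproduce the corresponding row of the percentage table.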

Comparing Relational and Dimensional

Showing that the dimensional schema outperforms the relational schema when running dimensional queries functions as the control of the experiment and provides the baseline from which to consider the hybrid schema's performance. As you can see below, the dimensional schema consistently outperforms the relational schema, as expected.

EVENT WINNER_OFFSET        RNR_UP_OFFSET        LOSER_OFFSET
----- -------------------- -------------------- --------------------
1.    DIM =    100%        REL =    149%
2.    DIM =    100%                             REL =    193%
3.    DIM =    100%                             REL =    159%
4.    DIM =    100%                             REL =   4993%
5.                         DIM =    497%        REL =   4136%
6.    DIM =    100%                             REL =   1034%
7.    DIM =    100%                             REL =   1533%
8.    DIM =    100%                             REL =    408%

In the case of Query #4, the difference is nearly 50-fold! Query #4 is the most extreme case in which the only (nontime) restriction is on attributes of the topmost table. In the relational schema, this means that all the tables down the hierarchy must be joined to get to the numerical information, an expensive operation. In the dimensional schema, the join is a direct connection from one dimension right into the fact table, an efficient operation.

Hybrid vs. Dimensional

Whether the hybrid schema performs as well as the dimensional is the core question in this analysis. As you can see below, the hybrid schema works reasonably well, but the hybrid approach is not as fast as a purely dimensional one.

EVENT WINNER_OFFSET        RNR_UP_OFFSET        LOSER_OFFSET
----- -------------------- -------------------- --------------------
1.    DIM =    100%                             HYB =    159%
2.    DIM =    100%        HYB =    190%
3.    DIM =    100%        HYB =    145%
4.    DIM =    100%        HYB =    136%
5.    HYB =    100%        DIM =    497%
6.    DIM =    100%        HYB =    263%
7.    DIM =    100%        HYB =    302%
8.    DIM =    100%        HYB =    159%

Query #5 is a deviant case, but for all the other queries, the hybrid takes between 136 percent and 302 percent of the time required for the dimensional schema. This immediately shows that there are some limitations to the performance of the hybrid schema, but to understand why requires analysis of the query plans. A review of the plans captured during a system run indicates that there are three categories of behavior:

  • Queries whose plans are identical between dimensional and hybrid (queries #1, #3, #4, #8).
  • Queries whose plans differ between dimensional and hybrid (queries #2, #6, #7).
  • Perfect alignment of the query with the hybrid schema (query #5).

Identical plans. Here are the three plans for query #1:

Query #1, relational schema plan:
  SELECT STATEMENT (rows=195)
    SORT GROUP BY (rows=195)
      TABLE ACCESS FULL PREMIUM (rows=1377304)

Query #1, dimensional schema plan:
  SELECT STATEMENT (rows=300)
    SORT GROUP BY (rows=300)
      HASH JOIN (rows=1372568)
        TABLE ACCESS FULL TIME_DIM (rows=3600)
        TABLE ACCESS FULL PREMIUM_FACT (rows=1372568)

Query #1, hybrid schema plan:
  SELECT STATEMENT (rows=300)
    SORT GROUP BY (rows=300)
      HASH JOIN (rows=1360176)
        TABLE ACCESS FULL TIME_DIM (rows=3600)
        TABLE ACCESS FULL PREMIUM (rows=1360176)

Note that the relational plan is different from the dimensional plan, as would be expected. Note also that the dimensional and hybrid plans are identical. This shows the optimizer's ability to match the dimensional nature of the query to the dimensional constructs of the hybrid schema, which is the desired behavior. The pattern of the dimensional and hybrid plans being identical also holds for queries #3, #4, and #8.

The slower performance despite the identical plans leads to the conclusion that the hybrid schema is slower simply due to its sheer size. As previously discussed, the hybrid schema tends to require about twice the space of either of the other two schemas. This means fewer rows per block, more total reads for any given operation, and more bytes in motion than in the dimensional schema. It may very well be that having all these extra bytes in motion simply slows things down.
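
The effect of row width on a full table scan can be illustrated with rough arithmetic. The sketch below is not part of the original analysis: the 8 KB block size and the 100-byte and 200-byte row widths are assumed values chosen only to represent a row that is about twice as wide, and `blocks_for_full_scan` is a hypothetical helper. Only the row counts are taken from the query #1 plans.

```python
import math


def blocks_for_full_scan(row_count, row_bytes, block_bytes=8192):
    """Estimate the blocks read by a full table scan.

    Deliberately simplified: ignores block headers, free space, and row chaining.
    """
    rows_per_block = block_bytes // row_bytes
    return math.ceil(row_count / rows_per_block)


# Row counts come from the query #1 plans (PREMIUM_FACT and hybrid PREMIUM).
# The row widths and block size are ASSUMPTIONS, not measured values.
dim_blocks = blocks_for_full_scan(1372568, row_bytes=100)
hyb_blocks = blocks_for_full_scan(1360176, row_bytes=200)

print(dim_blocks, hyb_blocks, round(hyb_blocks / dim_blocks, 2))
```

Under these assumptions the hybrid scan reads roughly twice as many blocks for about the same number of rows, which is consistent with the "more bytes in motion" explanation even though the two plans have the same shape.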

Different plans. Now review the three plans for query #7:

Query #7, relational schema plan:
  SELECT STATEMENT (rows=6)
    SORT GROUP BY (rows=6)
      HASH JOIN (rows=77)
        TABLE ACCESS FULL COVERAGE (rows=800)
        TABLE ACCESS FULL PREMIUM (rows=13773)

Query #7, dimensional schema plan:
  SELECT STATEMENT (rows=1)
    SORT GROUP BY (rows=1)
      HASH JOIN (rows=1)
        TABLE ACCESS BY INDEX ROWID COVERAGE_DIM (rows=6)
          BITMAP CONVERSION TO ROWIDS (rows=)
            BITMAP AND (rows=)
              BITMAP INDEX SINGLE VALUE BX_COVERAGE_ACCD_LIMIT (rows=)
              BITMAP INDEX SINGLE VALUE BX_COVERAGE_DEDUCTIBLE (rows=)
              BITMAP INDEX SINGLE VALUE BX_COVERAGE_PERS_LIMIT (rows=)
        TABLE ACCESS BY INDEX ROWID PREMIUM_FACT (rows=48)
          BITMAP CONVERSION TO ROWIDS (rows=)
            BITMAP AND (rows=)
              BITMAP MERGE (rows=)
                BITMAP KEY ITERATION (rows=)
                  TABLE ACCESS FULL TIME_DIM (rows=12)
                  BITMAP INDEX RANGE SCAN BX_PREMIUM_TIME (rows=)
              BITMAP MERGE (rows=)
                BITMAP KEY ITERATION (rows=)
                  TABLE ACCESS BY INDEX ROWID COVERAGE_DIM (rows=6)
                    BITMAP CONVERSION TO ROWIDS (rows=)
                      BITMAP AND (rows=)
                        BITMAP INDEX SINGLE VALUE BX_COVERAGE_ACCD_LIMIT (rows=)
                        BITMAP INDEX SINGLE VALUE BX_COVERAGE_DEDUCTIBLE (rows=)
                        BITMAP INDEX SINGLE VALUE BX_COVERAGE_PERS_LIMIT (rows=)
                  BITMAP INDEX RANGE SCAN BX_PREMIUM_COVERAGE (rows=)

Query #7, hybrid schema plan:
  SELECT STATEMENT (rows=1)
    TEMP TABLE TRANSFORMATION (rows=)
      LOAD AS SELECT SYS_TEMP_0FD9D697C_1278CF0 (rows=)
        TABLE ACCESS BY INDEX ROWID COVERAGE (rows=6)
          INDEX FULL SCAN UX_COVERAGE_COVERAGE_KEY (rows=6)
      SORT GROUP BY (rows=1)
        HASH JOIN (rows=1)
          TABLE ACCESS FULL SYS_TEMP_0FD9D697C_1278CF0 (rows=6)
          TABLE ACCESS BY INDEX ROWID PREMIUM (rows=47)
            BITMAP CONVERSION TO ROWIDS (rows=)
              BITMAP AND (rows=)
                BITMAP MERGE (rows=)
                  BITMAP KEY ITERATION (rows=)
                    TABLE ACCESS FULL TIME_DIM (rows=12)
                    BITMAP INDEX RANGE SCAN BX_PREMIUM_TIME (rows=)
                BITMAP MERGE (rows=)
                  BITMAP KEY ITERATION (rows=)
                    TABLE ACCESS FULL SYS_TEMP_0FD9D697C_1278CF0 (rows=1)
                    BITMAP INDEX RANGE SCAN BX_PREMIUM_COVERAGE (rows=)

Again the relational plan matches neither of the other two, but this time, the dimensional and hybrid plans are not identical. This shows the undesirable optimizer behavior of not using a dimensional plan even though one can exist because the schema has all the necessary constructs. The pattern of not generating a dimensional plan on the hybrid schema also holds for queries #2 and #6.

The reasonable conclusion is that the availability of the relational constructs when the optimizer is doing dimensional queries causes the optimizer to generate a plan that is not as effective as the purely dimensional plan generated when only dimensional constructs are available in the schema.

It is important to note that all four cases that use a dimensional plan against a hybrid schema consistently outperform all three cases where something other than a purely dimensional plan is used against the hybrid schema. This reinforces the value of using a dimensional plan whenever possible.
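
This claim can be checked directly against the hybrid-versus-dimensional percentages. The short Python sketch below is an illustration only: the dictionaries restate the hybrid offsets from the comparison table, grouped by the plan categories listed earlier, and `groups_are_separated` is a hypothetical helper name.

```python
# Hybrid runtime as a percentage of the dimensional runtime, taken from the
# hybrid-versus-dimensional table. Query #5 is excluded as the special case.
identical_plan = {1: 159, 3: 145, 4: 136, 8: 159}
different_plan = {2: 190, 6: 263, 7: 302}


def groups_are_separated(faster, slower):
    """Return True when every value in `faster` is below every value in `slower`."""
    return max(faster.values()) < min(slower.values())


print(groups_are_separated(identical_plan, different_plan))  # True
```

The slowest identical-plan case (159 percent) is still faster than the fastest different-plan case (190 percent), so the two groups do not overlap.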

Perfect alignment. Query #5 presents a unique case in which the hybrid wins because the nature of the query happens to lend itself artificially well to the nature of the hybrid schema. Specifically, query #5 uses the VEHICLE, COVERAGE, and TIME tables. COVERAGE is a weak entity, and as such, it has all of the VEHICLE identifier in its primary key in the relational and hybrid schemas. In the dimensional schema, the VEHICLE attributes were taken out, so that the COVERAGE dimension would be a "pure" dimension, pure in the sense that it stands alone and any association it has with VEHICLE is through the fact table. Although this does make for a proper dimensional schema, it also separates the VEHICLE and COVERAGE tables dramatically more in the dimensional schema than in the relational and hybrid schemas.

When it comes time for the optimizer to use VEHICLE and COVERAGE in query #5, it has to bring them together from scratch in the dimensional schema, but in the hybrid and relational schemas, it can already find them together in the key of the COVERAGE table. That the hybrid schema has such constructs available when the optimizer needs them is one of its stated advantages, but on the other hand, this is not a general-case advantage. It exists only when the query and schema happen to align properly, as is the case in query #5.

Conclusion of Query Analysis

Generalizing our conclusions: if the hybrid schema aligns perfectly with the nature of the query, the hybrid schema can significantly outperform the dimensional schema, but this is not the general case (one of the eight). In most cases (four of the eight), the hybrid plan will be identical to the dimensional plan, but the hybrid schema will run slower, likely because of the number of bytes in motion. In some cases (three of the eight), the plan for the hybrid schema will be different, less than optimal, and the result is runtimes that are quite a bit longer than those of a dimensional schema.

 

Materialized View Aggregates

Aggregates are commonly used in dimensional modeling to increase performance, and materialized views are commonly used to create the aggregates. To demonstrate the effect of materialized view aggregates (MVAs) on the hybrid schema, two MVAs were added. Table 1 shows that four of the queries can undergo a query rewrite to use the MVA named in the right-most column.

Query  Account dim.  Policy dim.  Vehicle dim.  Coverage dim.  Time dim.  Aggregate utilized
-----  ------------  -----------  ------------  -------------  ---------  --------------------
1      Agg           Agg                                       Qry/Agg    Agg_acct_pol_time
2      Qry                                      Qry            Qry
3      Qry/Agg       Qry/Agg                                   Qry/Agg    Agg_acct_pol_time
4      Qry/Agg       Agg                                       Qry/Agg    Agg_acct_pol_time
5                                 Qry           Qry            Qry
6      Qry/Agg       Qry/Agg      Qry/Agg       Qry/Agg                   Agg_acct_pol_veh_cov
7                                               Qry            Qry
8                                 Qry                          Qry

Table 1. Materialized view aggregates added to optimize specific queries

"Qry" indicates that the query references the dimension in its WHERE clause, and "Agg" indicates that the MVA preserves the reference to the dimension. For rewrite to occur, no "Qry" may appear alone in a cell.
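
The rule in the preceding paragraph amounts to a subset test: a query can be rewritten against an aggregate only if every dimension the query restricts on is preserved by that aggregate. The Python sketch below illustrates that rule using the dimension sets read from Table 1. It is a simplification for illustration, not a description of how the Oracle optimizer decides on query rewrite (which takes other factors into account), and `usable_aggregate`, `AGGREGATES`, and `QUERY_DIMS` are hypothetical names.

```python
# Dimensions preserved by each materialized view aggregate ("Agg" in Table 1).
AGGREGATES = {
    "Agg_acct_pol_time": {"account", "policy", "time"},
    "Agg_acct_pol_veh_cov": {"account", "policy", "vehicle", "coverage"},
}

# Dimensions each query restricts on in its WHERE clause ("Qry" in Table 1).
QUERY_DIMS = {
    1: {"time"},
    2: {"account", "coverage", "time"},
    3: {"account", "policy", "time"},
    4: {"account", "time"},
    5: {"vehicle", "coverage", "time"},
    6: {"account", "policy", "vehicle", "coverage"},
    7: {"coverage", "time"},
    8: {"vehicle", "time"},
}


def usable_aggregate(query_dims, aggregates=AGGREGATES):
    """Return the first aggregate that preserves every dimension the query uses, else None."""
    for name, agg_dims in aggregates.items():
        if query_dims <= agg_dims:  # subset test: no "Qry" left alone in a cell
            return name
    return None


for number, dims in sorted(QUERY_DIMS.items()):
    print(number, usable_aggregate(dims))
```

With these sets, queries 1, 3, 4, and 6 find a usable aggregate and queries 2, 5, 7, and 8 do not, which matches the right-most column of Table 1.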

The overall performance effect of the MVAs is very positive, as would be expected. Also of interest is that the MVAs cause the hybrid schema to do quite well relative to the dimensional schema, as shown here:

EVENT WINNER_TIME          RNR_UP_TIME          LOSER_TIME
----- -------------------- -------------------- --------------------
1.    HYB = 00:00:00.941   DIM = 00:00:01.382   REL = 00:00:08.943
2.    DIM = 00:00:04.246   REL = 00:00:08.041   HYB = 00:00:08.121
3.    HYB = 00:00:00.942   DIM = 00:00:01.262   REL = 00:00:05.388
4.    HYB = 00:00:00.120   DIM = 00:00:00.180   REL = 00:00:07.381
5.    HYB = 00:00:00.290   DIM = 00:00:01.222   REL = 00:00:05.989
6.    HYB = 00:00:00.731   DIM = 00:00:00.912   REL = 00:00:05.437
7.    DIM = 00:00:00.691   HYB = 00:00:01.993   REL = 00:00:07.962
8.    DIM = 00:00:00.511   HYB = 00:00:00.801   REL = 00:00:02.063

EVENT WINNER_OFFSET        RNR_UP_OFFSET        LOSER_OFFSET
----- -------------------- -------------------- --------------------
1.    HYB =    100%        DIM =    147%        REL =    950%
2.    DIM =    100%        REL =    189%        HYB =    191%
3.    HYB =    100%        DIM =    134%        REL =    572%
4.    HYB =    100%        DIM =    150%        REL =   6151%
5.    HYB =    100%        DIM =    421%        REL =   2065%
6.    HYB =    100%        DIM =    125%        REL =    744%
7.    DIM =    100%        HYB =    288%        REL =   1152%
8.    DIM =    100%        HYB =    157%        REL =    404%

In fact, in 100 percent of the cases in which aggregates are used (four of the eight queries), the hybrid schema takes first place for performance. This shows not only that the use of materialized views allows both the dimensional and hybrid schemas to benefit but also that it consistently favors the hybrid schema. This is a very significant finding because it indicates that the use of MVAs on a hybrid schema achieves full dimensional performance while still keeping all relational relationships.

Future Research and Other Considerations

"Can" vs. "Should". As noted at the outset, the motivation for this research was my need to build a single system to meet both OLTP-like and DSS-like business requirements. However, if a project needs only one of the two types of behavior or has the money and time to fund and build two separate environments with the appropriate feeds between them, it may be better to avoid the hybrid design.

Human understanding. This analysis looked at implementation aspects of the hybrid design, but one of the main values of the dimensional design is that it is easy for those who are not database experts to understand. The hybrid design not only loses this ease of understanding but also produces the most complex design of the three discussed. This is a major drawback for systems that need to give power users direct access to the database.

Other physical optimization. No partitioning was used in the current analysis. Bitmap join indexes were added and analyzed, but the results showed little advantage (see code for details). These and other physical optimization techniques should be analyzed to determine if they benefit the hybrid design as much as they benefit the dimensional design.

Platform considerations. This system was run on the Windows XP operating system using Intel Pentium-class hardware. Three such platforms were tested. RAM varied from 256MB to 1GB. Other platforms may run faster or slower overall, but platform changes may also change the performance of the schemas relative to each other. This possibility is even more likely given the ability of the Oracle database software to detect the state of platform resources and adjust plans accordingly.

Conclusion

In general, the numbers show that the idea of combining relational and dimensional designs is feasible, if not without issues. But note that perfection is not an option. A purely relational design cannot achieve dimensional performance; a purely dimensional design cannot represent relations efficiently. Each of the three has its own limitations. Given that, the slight performance degradation that comes from the bigger row size and the increase in physical complexity are relatively small prices to pay when the benefit is the ability to have both full relational representation and much improved performance.





