A Exadata and the Oracle Communications Data Model

This appendix provides information you can use to bring the benefits of Exadata to your Oracle Communications Data Model warehouse. It contains the following topics:

Benefits of Running Oracle Communications Data Model on an Exadata Storage Server
Suggested Practices for Oracle Communications Data Model Warehouse in Exadata

Benefits of Running Oracle Communications Data Model on an Exadata Storage Server

An Exadata Storage Server offers the following capabilities that provide major benefits when you run Oracle Communications Data Model:

Exadata: A Preconfigured Balanced System
Exadata Smart Flash Cache
Exadata Storage Capacity, Performance, Bandwidth and IOPS
Exadata Smart Scan Processing and Storage Index
Other Exadata Optimization Techniques for Data Warehousing

Exadata: A Preconfigured Balanced System

The Database Machine is a pre-configured system ready to be turned on on day one, taking significant integration work, cost and time out of the database deployment process. The Exadata Storage Server (Exadata storage or Exadata cells) is used as the storage for the Oracle Database in the Database Machine and is used to grow existing Database Machine deployments. The benefit of a common infrastructure to deploy a database for any application, whether OLTP, data warehouse, a mix of the two, or as a platform for consolidation of several databases, creates tremendous opportunities for efficiencies in the datacenter. The Database Machine is truly a "cloud in box".

The new top-of-the-line Exadata Database Machine X2-8 combines the best of scale-up and scale-out architectures by delivering a grid architecture containing large SMP database servers. Historically, a 64-core SMP required a full rack of equipment by itself, and was difficult to scale out further.The hardware components of the Exadata Storage Server (also referred to as an Exadata cell) were carefully chosen to match the needs of high performance database processing. The Exadata software is optimized to take the best possible advantage of the hardware components and Oracle Database. Each Exadata cell delivers outstanding I/O performance and bandwidth to the database.

Demos:

For more information on Exadata, see the following demos.

Exadata and Database Machine Version 2 Series - 4 of 25: Exadata Process Introduction
Exadata and Database Machine Version 2 Series - 8 of 25: Exadata Cell Configuration
Exadata and Database Machine Version 2 Series - 22 of 25: Bulk Data Loading with Database Machine

To access the demos, sign into OTN and open the Oracle Learning Library in your browser by following the instructions in "Oracle Technology Network"; and, then, search for the demos by name.

Exadata Smart Flash Cache

Each Exadata cell comes with 384 GB of Exadata Smart Flash Cache which means in the Database Machine X2-8 and Full Rack X2-2 there is 5.3 TB of Flash - larger than most databases. This solid state storage delivers dramatic performance advantages with Exadata storage. Exadata Smart Flash Cache provides a ten-fold improvement in response time for reads over regular disk, a hundred-fold improvement in IOPS for reads over regular disk, and is a less expensive higher capacity alternative to memory. Overall it delivers a ten-fold increase performing a blended average of read and write operations.

The Exadata Smart Flash Cache manages active data from regular disks in the Exadata cell - but the cache is not managed in a simple Least Recently Used (LRU) fashion. The Exadata Storage Server Software in cooperation with the Oracle Database keeps track of data access patterns and knows what and how to cache data and avoid polluting the cache. This functionality is all managed automatically and does not require manual tuning. If there are specific tables or indexes that are known to be key to the performance of a database application they can optionally be identified and pinned in cache.

Exadata Storage Capacity, Performance, Bandwidth and IOPS

The Oracle Exadata Storage Servers comes with either twelve 600 GB 15,000 RPM High Performance SAS disks or twelve 2 TB 7,200 RPM High Capacity SAS disks.

The storage capacity of each model of the Database Machine is shown in the following table.

Table A-1 Database Machine Storage Capacity

	Database Machine X2-8 and X2-2 Full Rack	Database Machine X2-2 Half Rack	Database Machine X2-2 Quarter Rack
Exadata Smart Flash Cache	5.3 TB	2.6 TB	1.1 TB
Raw Disk Capacity: High Performance SAS	100 TB	50 TB	21 TB
Raw Disk Capacity: High Capacity SAS	336 TB	168 TB	72 TB
Useable Capacity: High Performance SAS	Up to 45 TB	Up to 22.5 TB	Up to 9.25 TB
Useable Capacity: High Capacity SAS	Up to 150 TB	Up to 75. TB	Up to 31.5 TB

Note:

When calculating raw disk capacity, 1 TB = 1 trillion bytes. Actual formatted capacity is less. Useable capacity available for databases is computed after mirroring (Oracle ASM normal redundancy) and leaving one empty disk to automatically handle disk failures.

The performance that each cell delivers is extremely high due to the Exadata Smart Flash Cache.

The Exadata software can simultaneously scan from Flash and disk to maximize bandwidth. The automated caching within Flash enables each Exadata cell to deliver up to 5.4 GB per second bandwidth and 125,000 IOPS when accessing uncompressed data. When data is stored in compressed format, the amount of user data capacity, the amount of data bandwidth and IOPS achievable, often increases up to ten times, or more which represents a significant improvement over traditional storage devices used with the Oracle Database.

The performance characteristics of each model of the Database Machine are depicted in the following table.

Table A-2 Database Machine I/O Performance

	Database Machine X2-8 and X2-2 Full Rack	Database Machine X2-2 Half Rack	Database Machine X2-2 Quarter Rack
Raw Disk Data Bandwidth: High Performance SAS (without data compression)	Up to 25 GB per second	Up to 12.5 GB per second	Up to 5.4 GB per second
Raw Disk Data Bandwidth: High Capacity SAS (without data compression)	Up to 14 GB per second	Up to 7.0 GB per second	Up to 3.0 GB per second
Raw Flash Data Bandwidth: High Performance SAS (without data compression)	Up to 75 GB per second	Up to 37.5 GB per second	Up to 16 GB per second
Raw Flash Data Bandwidth: High Capacity SAS (without data compression)	Up to 64 GB per second	Up to 32 GB per second	Up to 13.5 GB per second
Flash Cache IOPS	Up to 1,500,000	Up to 750,000	Up to 375,000
Disk IOPS: High Performance SAS	Up to 50,000	Up to 25,000	Up to 10,800
Disk IOPS: High Capacity SAS	Up to 25,000	Up to 12,500	Up to 5,400

Exadata Smart Scan Processing and Storage Index

As is often the case with the large queries, the predicate filters out most of the rows read. Yet all the blocks from the table must be read, transferred across the storage network and copied into memory. Many more rows are read into memory than required to complete the requested SQL operation which generates a large number of data transfers which consume bandwidth and impact application throughput and response time.

With Exadata storage, database operations are handled much more efficiently. Queries that perform table scans can be processed within Exadata storage with only the required subset of data returned to the database server. Row filtering, column filtering and some join processing (among other functions) are performed within the Exadata storage cells. When this takes place only the relevant and required data is returned to the database server.

The Oracle Database and Exadata server cooperatively execute various SQL statements. Moving SQL processing off the database server frees server CPU cycles and eliminates a massive amount of bandwidth consumption which is then available to better service other requests. SQL operations run faster, and more of them can run concurrently because of less contention for the I/O bandwidth.

The following topics provide detailed information on the various SQL operations that benefit from the use of Exadata:

Smart Scan Predicate Filtering
Smart Scan Column Filtering
Smart Scan Join Processing
Smart Scan Processing of Encrypted Tablespaces and Columns
Storage Indexing
Offload of Data Mining Model Scoring
Other Exadata Smart Scan Processing

Smart Scan Predicate Filtering

Exadata enables predicate filtering for table scans. Only the rows requested are returned to the database server rather than all rows in a table. For example, when the following SQL is issued only rows where the employees' hire date is after the specified date are sent from Exadata to the database instance.

SELECT * FROM employee_table WHERE hire_date > '1-Jan-2003'.

This ability to return only relevant rows to the server greatly improves database performance. This performance enhancement also applies as queries become more complicated, so the same benefits also apply to complex queries, including those with subqueries.

Smart Scan Column Filtering

Exadata provides column filtering, also called column projection, for table scans. Only the columns requested are returned to the database server rather than all columns in a table. For example, when the following SQL is issued, only the employee_name and employee_number columns are returned from Exadata to the database kernel.

SELECT employee_name, employee_number FROM employee_table.

For tables with many columns, or columns containing LOBs (Large Objects), the I/O bandwidth saved can be very large. Using both predicate and column filtering dramatically improves performance and reduces I/O bandwidth consumption. In addition, column filtering also applies to indexes, allowing for even faster query performance.

Smart Scan Join Processing

Exadata performs joins between large tables and small lookup tables, a very common scenario for data warehouses with star schemas. Joining large tables and small lookup tables is implemented using Bloom Filters, which are a very efficient probabilistic method to determine whether a row is a member of the desired result set.

Smart Scan Processing of Encrypted Tablespaces and Columns

Smart Scan offload processing of Encrypted Tablespaces (TSE) and Encrypted Columns (TDE) is supported in Exadata storage which increases performance when accessing the most confidential data in the enterprise.

Storage Indexing

Storage Indexes are a very powerful capability provided in Exadata storage that helps avoid I/O operations. The Exadata Storage Server Software creates and maintains a Storage Index (that is, metadata about the database objects) in the Exadata cell. The Storage Index keeps track of minimum and maximum values of columns for tables stored on that cell. When a query specifies a WHERE clause, but before any I/O is done, the Exadata software examines the Storage Index to determine if rows with the specified column value exist in the cell by comparing the column value to the minimum and maximum values maintained in the Storage Index. If the column value is outside the minimum and maximum range, scan I/O for that query is avoided. Many SQL Operations run dramatically faster because large numbers of I/O operations are automatically replaced by a few lookups. To minimize operational overhead, Storage Indexes are created and maintained transparently and automatically by the Exadata Storage Server Software.

Demos:

For an introduction to Exadata Storage Indexes, see the following demos.

Storage Index in Exadata
Exadata and Database Machine Version 2 Series - 5 of 25: Hierarchy of Exadata Storage Objects

To access the demos, sign into OTN and open the Oracle Learning Library in your browser by following the instructions in "Oracle Technology Network"; and, then, search for the demos by name.

Offload of Data Mining Model Scoring

Data Mining model scoring is offloaded to Exadata which makes the deployment of data warehouses on Database Machine an even better and more performant data analysis platform. All data mining scoring functions (for example, prediction_probability) are offloaded to Exadata for processing. Offloading data mining scoring functions not only speeds warehouse analysis, but reduce database server CPU consumption and the I/O load between the database server and Exadata storage.

Other Exadata Smart Scan Processing

Two other database operations that are offloaded to Exadata are incremental database backups and tablespace creation. The speed and efficiency of incremental database backups has been significantly enhanced with Exadata. The granularity of change tracking in the database is much finer when Exadata storage is used. Changes are tracked at the individual Oracle block level with Exadata rather than at the level of a large group of blocks which results in less I/O bandwidth being consumed for backups and faster running backups.

With Exadata the create file operation is also executed much more efficiently. For example, when issuing a CREATE TABLESPACE command, instead of operating synchronously with each block of the new tablespace being formatted in server memory and written to storage, an iDB command is sent to Exadata instructing it to create the tablespace and format the blocks. Host memory usage is reduced and I/O associated with the creation and formatting of the tablespace blocks is offloaded. The I/O bandwidth saved with these operations means more bandwidth is available for other business critical work.

Other Exadata Optimization Techniques for Data Warehousing

Beside the benefits of running an Oracle Communications Data Model warehouse on an Exadata Server discussed earlier in this appendix, Exadata offers the following optimization techniques for any data warehouse -- including an Oracle Communications Data Model warehouse:

Exadata Hybrid Columnar Compression

As discussed in "Types of Data Compression Available", Exadata provides a very advanced compression capability called Exadata Hybrid Columnar Compression (EHCC). EHCC enables the highest levels of data compression and provides enterprises with tremendous cost-savings and performance improvements due to reduced I/O. Average storage savings can range from 10x to 15x depending on how EHCC is used.

See also:
"Exadata Hybrid Columnar Compression"
I/O Resource Management With Exadata

In data warehousing, or mixed workload environments, you may want to ensure different users and tasks within a database are allocated the correct relative amount of I/O resources. For example you may want to allocate 70% of I/O resources to interactive users on the system and 30% of I/O resources to batch reporting jobs. This allocation is simple to enforce using the DBRM and I/O resource management capabilities of Exadata storage.

An Exadata administrator can create a resource plan that specifies how I/O requests are prioritized by putting the different types of work into service groupings called Consumer Groups. Consumer groups can be defined by many attributes including the username, client program name, function, or length of time the query has been running. After these consumer groups are defined, the user can set a hierarchy of which consumer group gets precedence in I/O resources and how much of the I/O resource is given to each consumer group. This hierarchy determining I/O resource prioritization can be applied simultaneously to both intra-database operations (that is, operations occurring within a database) and inter-database operations (that is, operations occurring among various databases).

In essence, Exadata I/O Resource Manager has solved one challenge traditional storage technology does not address: creating a shared grid storage environment with the ability to balance and prioritize the work of multiple databases and users sharing the storage subsystem. Exadata I/O resource management ensures user-defined Service Level Agreements (SLA) are met for multiple databases sharing Exadata storage so that each database or user gets the correct share of disk bandwidth to meet business objectives.
Quality of Service (QoS) Management with Exadata

Oracle Exadata QoS Management is an automated, policy-based product that monitors the workload requests for an entire system. It manages the resources that are shared across applications and adjusts the system configuration to keep the applications running at the performance levels needed by your business. It responds gracefully to changes in system configuration and demand, thus avoiding additional oscillations in the performance levels of your applications.

Suggested Practices for Oracle Communications Data Model Warehouse in Exadata

Exadata's features offer the opportunity to modify or tune the data model to increase performance and maximum usage of Exadata. Although, the implementation of the following practices depends on the specific customer application of the Oracle Communications Data Model, these practices should serve as a guideline for determining applicability of these features:

Index Rich Versus Index Scarce Design
Exadata Hybrid Columnar Compression

Index Rich Versus Index Scarce Design

Traditional design dictates a the use of indexes on data warehouse tables to aid in performance. In a traditional environment, typically the processing time within the database for performing an index lookup is significantly lower than a full table scan. Because of the lower processing time for index lookup, traditional design is an index rich design where tables have many indexes which are specifically setup based on known or predicted query patterns. In a traditional environment, the main downsides of index rich designs are: the space required for storing the index, the care and feeding of the indexes by the DBA staff, and the temptation to 'over-index' by adding indexes on nearly every column.

Exadata allows for the possibility of index scarce designs. In an index scarce design, indexes are removed or altered invisible to force full table scans on specific tables. Because of the features of Exadata, specifically smart scan and storage indexing, a table scan may return a result faster than walking an index. In addition to the possibility of faster return times, an index-scarce design provides storage savings if the index is removed.

A common scenario though is to first alter the index invisible, which still keeps the index structure and data intact, but accomplishes the equivalent of removing the indexes for processing time. In this scenario, a customer can understand the impact of index-scarce design without risking the need to rebuild indexes.

Also, if the data warehouse is expected to run on a non-Exadata environment, for example in a disaster recovery environment, removing the indexes has significant degradation in that non-Exadata environment.

Exadata Hybrid Columnar Compression

As discussed in "Types of Data Compression Available", Exadata provides a very advanced compression capability called Exadata Hybrid Columnar Compression (EHCC). EHCC enables the highest levels of data compression and provides enterprises with tremendous cost-savings and performance improvements due to reduced I/O. Average storage savings can range from 10x to 15x depending on how EHCC is used.

With the high compression rates resulting from Hybrid Columnar Compression (HCC), customers can achieve significant storage savings. Hybrid Columnar compression, though, is not applicable to all scenarios.

HCC can be applied at a table or partition level and is best applied at that level of granularity.

A typical scenario is to implement a time based information lifecycle management (ILM) using the various compression options including Advanced Compression, Hybrid Columnar Compression Query and Hybrid Columnar Compression Archive. The Oracle Communications Data Model implementation can consider date partitioning on large fact tables and can apply HCC-Q and HCC-A to data that is no longer being updated or no longer being read.

For example, a customer may have the following requirements for Customer Orders:

Orders can be updated and changed for the first 90 days.
After 90 days, orders cannot be changed, but are typically used in many reports.
After 3 years, the data is rarely, if ever, queried, but is required for retention reasons.

A design which used Advanced Compression for partitions containing data less than 90 days old, HCC-Q for partitions with data 90 days to 3 years, and HCC-A for partitions with data older than 3 years maximizes use of all three compression options, while minimizing the risk of performance degradation.

Demos:

For more information on Exadata Hybrid Columnar Compression, see the following demos.

Exadata and Database Machine Version 2 Series - 2 of 25: Introduction to Exadata Hybrid Columnar Compression
Exadata and Database Machine Version 2 Series - 18 of 25: Examining Exadata Hybrid Columnar Compression
Exadata and Database Machine Version 2 Series - 19 of 25: Index Elimination with Exadata
Hybrid Columnar Compression

To access the demos, sign into OTN and open the Oracle Learning Library in your browser by following the instructions in "Oracle Technology Network"; and, then, search for the demos by name.