What is the Datawarehouse
A datawarehouse is a collection of data in which the company's information is integrated and that is used as support for the managerial decision-making process. Although various organizations and individuals come to understand the approach of a Warehouse , experience has shown that there are many potential pitfalls.
Gathering the appropriate data elements from various
application sources in a centralized comprehensive environment, simplifies the
problem of access to information and, consequently, accelerates the process of
analysis, consultations and the shortest time of use of the information.
The applications for decision support based on one data
warehousing can make more practical and easy data mining for greater efficiency
of the business, which is not achieved when using only the data from
operational applications (which help in operation of the company in its daily
operations), in which the information is obtained by carrying out independent
and often complex processes.
A data warehouse is created by extracting data from one or
more operational application databases . The extracted data is transformed to
eliminate inconsistencies and summarize if necessary and then uploaded to the
data warehouse.
The process of transforming , creating the variant time
detail, summarizing and combining the data extracts, help to create the
environment for access to Institutional information. This new approach helps
individuals, at all levels of the company, to make their decision-making more
responsibly.
The innovation of Information Technology within a data
warehousing environment can enable any organization to make optimal use of data
as a key ingredient for making process more effective decisions. Organizations
have to take advantage of their information resources to create the information
of the business operation, but the technological strategies necessary for the
implementation of a complete datawarehouse architecture must be considered.
The most important characteristics of this data repository
are the following:
1. Theme
oriented . The data in the database is organized so that all data elements
relating to the same event or object in the real world are linked together.
2. Variant
in time . The changes produced in the data over time are recorded so that the
reports that can be generated reflect these variations.
3. Not
volatile . The information is not modified or eliminated, once a piece of data
is stored, it becomes read-only information, and is kept for future
consultations.
4. Integrated
. The database contains the data of all the operational systems of the
organization, and such data must be consistent.
A Datawarehouse is structured as follows:
1. OLAP
Datamart : They are based on the popular OLAP cubes, which are built by adding,
according to the requirements of each area or department, the dimensions and
necessary measures of each relational cube. The way of creating, operating and
maintaining OLAP cubes is very heterogeneous, depending on the final tool used.
2. Datamart
OLTP : They can be based on a simple extract from the datawarehouse, however,
the common thing is to introduce improvements in its performance (aggregations
and filtered are usually the most common operations) taking advantage of the
particular characteristics of each area of the company. The most common
structures in this sense are report tables, which come to be reduced feasible
tables (which add the appropriate dimensions), and materialized views, which
are built with the same structure as the previous ones, but with the aim of
exploiting the rewriting of queries (although it is only possible in some
advanced DBMS, such as Oracle).
A Datamart is a departmental database, specialized in
storing data for a specific business area. It is characterized by having the
optimal data structure to analyze the information in detail from all the
perspectives that affect the processes of said department. A cube can be fed
from the data of a data warehouse, or integrate by yourself a compendium of
different sources of information. Therefore, to create the datamart of a
functional area of the company, it is necessary to find the optimal structure
for the analysis of its information, a structure that can be mounted on an OLTP
database, such as the datawarehouse itself, or on a database OLAP. The
designation of one or the other will depend on the data, requirements and
specific characteristics of each department.
Datamarts that are equipped with these optimal analysis
structures have the following advantages:
1. Little
data volume
2. Faster
consultation
3. Simple
SQL and / or MDX queries
4. Direct
validation of information
5. Ease of
data historicization