A data warehouse is a relational database that is designed
for query and analysis rather than for transaction processing. It
usually contains historical data derived from transaction data, but it
can include data from other sources. It separates analysis workload from
transaction workload and enables an organization to consolidate data
from several sources.
In addition to a relational database, a data warehouse
environment includes an extraction, transportation, transformation, and
loading (ETL) solution, an online analytical processing (OLAP) engine,
client analysis tools, and other applications that manage the process of
gathering data and delivering it to business users.
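The ETL step described above can be illustrated with a minimal sketch. It is not a production pipeline: it assumes two in-memory SQLite databases standing in for the transaction system and the warehouse, and the table names (`orders`, `fact_orders`), column names, and sample rows are all hypothetical.

```python
import sqlite3

def extract(source):
    """Read raw order rows from the (hypothetical) transaction database."""
    return source.execute(
        "SELECT order_id, amount_cents, country FROM orders"
    ).fetchall()

def transform(rows):
    """Convert cents to currency units and normalize country codes."""
    return [(oid, cents / 100.0, country.strip().upper())
            for oid, cents, country in rows]

def load(warehouse, rows):
    """Append the cleaned rows to the warehouse table."""
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    warehouse.commit()

def run_etl():
    # Stand-in for the transaction system, with inconsistently formatted data.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (order_id INTEGER, amount_cents INTEGER, country TEXT)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 1250, " us"), (2, 800, "de "), (3, 4300, "US")],
    )

    # Stand-in for the warehouse, kept separate from the transaction workload.
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_orders (order_id INTEGER, amount REAL, country TEXT)"
    )

    load(warehouse, transform(extract(source)))

    # An analytical query runs against the warehouse, not the source system.
    return warehouse.execute(
        "SELECT country, SUM(amount) FROM fact_orders "
        "GROUP BY country ORDER BY country"
    ).fetchall()

print(run_etl())  # [('DE', 8.0), ('US', 55.5)]
```

Keeping the analytical query on the warehouse connection mirrors the separation of analysis workload from transaction workload described earlier.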
Development of a data warehouse includes building systems to extract data from operational systems and installing a warehouse database system that gives managers flexible access to the data.
The term data warehousing generally refers to the combination of many different databases across an entire enterprise. By contrast, a data mart typically serves a single department or subject area.
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data that is 3 months, 6 months, or 12 months old, or even older. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold only the most recent address of a customer, whereas a data warehouse can hold all addresses associated with that customer.
Non-volatile: Once data is in the data warehouse, it does not change. Historical data in a data warehouse should never be altered.
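The integrated, time-variant, and non-volatile properties above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping table `PRODUCT_KEYS`, the class `AddressHistory`, and all identifiers and sample values are hypothetical and do not reflect how any particular warehouse product stores data.

```python
from datetime import date

# Integrated: hypothetical mapping from (source system, local product id)
# to a single warehouse-wide product key.
PRODUCT_KEYS = {
    ("A", "SKU-001"): 1,
    ("B", "P1"): 1,       # same product, identified differently in source B
    ("A", "SKU-002"): 2,
}

def warehouse_product_key(source, local_id):
    """Return the single warehouse key for a source-specific product id."""
    return PRODUCT_KEYS[(source, local_id)]

class AddressHistory:
    """Append-only record of customer addresses.

    Time-variant: every address is kept with the date it became valid.
    Non-volatile: rows are only appended; there is no update or delete.
    """

    def __init__(self):
        self._rows = []  # tuples of (customer_id, address, valid_from)

    def record(self, customer_id, address, valid_from):
        self._rows.append((customer_id, address, valid_from))

    def all_addresses(self, customer_id):
        """All addresses ever recorded for a customer, oldest first."""
        rows = sorted(
            (r for r in self._rows if r[0] == customer_id),
            key=lambda r: r[2],
        )
        return [address for _, address, _ in rows]

    def address_as_of(self, customer_id, on_date):
        """The address that was valid on a given date, or None."""
        current = None
        for cid, address, valid_from in sorted(self._rows, key=lambda r: r[2]):
            if cid == customer_id and valid_from <= on_date:
                current = address
        return current

history = AddressHistory()
history.record(7, "1 Old Road", date(2020, 1, 1))
history.record(7, "9 New Street", date(2023, 6, 1))
print(history.all_addresses(7))                     # ['1 Old Road', '9 New Street']
print(history.address_as_of(7, date(2021, 1, 1)))   # 1 Old Road
```

A transaction system would typically overwrite the old address with the new one; the append-only history is what allows the warehouse to answer questions about any past date.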