A data warehouse is a large, central store of data drawn from sources across an organization. The information in this central store is used to make strategic and tactical business decisions.
Building a Data Warehouse
When it comes to building a data warehouse for your organization, the most commonly used methodologies are those proposed by Bill Inmon and Ralph Kimball, the two most respected leaders in data warehousing. Each defines a data warehouse differently:
- “A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision-making process.” (Inmon, 1995)
- “A copy of transaction data specifically structured for query and analysis” (Kimball, 2002)
Inmon sees the data warehouse as a single, giant repository for enterprise data stored in third normal form, whereas Kimball sees it as a collection of subject-area data marts stored as a dimensional model.
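To make the contrast concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The table names, columns, and sample rows are all hypothetical and chosen only for illustration; neither Inmon nor Kimball prescribes this exact layout. The first schema keeps the data in third normal form, and the second reshapes the same data into a small dimensional model with one fact table and one denormalized dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Third-normal-form sketch (hypothetical tables): each fact is stored once,
# and related attributes live in their own tables.
conn.executescript("""
CREATE TABLE region   (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT,
                       region_id INTEGER REFERENCES region(region_id));
CREATE TABLE orders   (order_id INTEGER PRIMARY KEY,
                       customer_id INTEGER REFERENCES customer(customer_id),
                       amount REAL);
INSERT INTO region   VALUES (1, 'North'), (2, 'South');
INSERT INTO customer VALUES (10, 'Acme', 1), (11, 'Globex', 2);
INSERT INTO orders   VALUES (100, 10, 250.0), (101, 10, 50.0), (102, 11, 75.0);
""")

# Dimensional-model sketch (hypothetical tables): a fact table of measurements
# joined to a dimension that carries the region name directly.
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT,
                           region_name TEXT);
CREATE TABLE fact_sales   (customer_key INTEGER REFERENCES dim_customer(customer_key),
                           amount REAL);
INSERT INTO dim_customer
    SELECT c.customer_id, c.name, r.region_name
    FROM customer c JOIN region r ON r.region_id = c.region_id;
INSERT INTO fact_sales SELECT customer_id, amount FROM orders;
""")


def sales_by_region_3nf(conn):
    """Total sales per region from the normalized tables (two joins)."""
    return conn.execute("""
        SELECT r.region_name, SUM(o.amount)
        FROM orders o
        JOIN customer c ON c.customer_id = o.customer_id
        JOIN region r ON r.region_id = c.region_id
        GROUP BY r.region_name ORDER BY r.region_name
    """).fetchall()


def sales_by_region_star(conn):
    """Total sales per region from the dimensional model (one join)."""
    return conn.execute("""
        SELECT d.region_name, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer d ON d.customer_key = f.customer_key
        GROUP BY d.region_name ORDER BY d.region_name
    """).fetchall()


print(sales_by_region_3nf(conn))
print(sales_by_region_star(conn))
```

Both queries return the same totals. The difference lies in the shape of the data: the normalized form avoids storing the region name more than once, while the dimensional form trades that redundancy for simpler analytical queries.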
So, which methodology should you choose and, most importantly, why does your company need a data warehouse solution?
The answers are, respectively, “It depends” and “To gain a competitive advantage.”
How do data warehousing projects come about?
Most organizations start with simple information systems (called applications) built to support a business need. Shortly after these applications are implemented, the company wants to measure their performance and, therefore, requires reports.
The reports get more complex as the system expands and new features and functions are added. Soon, the complexity of the reports begins to decrease the performance of the application.
The next step is to create a copy of the application’s database on a different server so as not to impact the performance of the application when running reports.
As the organization and database continue to grow, other parts of the business begin requesting data from the system.
To support these needs, comma-separated values (.csv) files are exported from the application database and sent throughout the organization via File Transfer Protocol (FTP) or email.
At this point, the senior management of the organization is likely to begin receiving conflicting information from the various departments using the data. Eventually, it becomes evident that each department uses the source data in a different way and therefore has its own version of the answers to essential business questions such as “How many customers do I have?” or “How many widgets did I produce and sell last year?”
Now the organization’s senior leadership sees the need for a data warehouse: a centralized repository of enterprise data on which both strategic and tactical decisions can be based to gain a competitive advantage.
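The conflicting answers described above are easy to reproduce. The following minimal Python sketch assumes a hypothetical four-row .csv export and three invented departmental rules for what counts as a customer; the column names, department names, and rules are illustrative assumptions, not taken from any real system.

```python
import csv
import io

# Hypothetical export from the application database.
EXPORT = """customer_id,status,last_order_year
1,active,2023
2,active,2021
3,closed,2023
4,active,2022
"""

rows = list(csv.DictReader(io.StringIO(EXPORT)))


def count_customers_sales(rows):
    # Assumed Sales rule: a customer is anyone who ordered in 2023.
    return sum(1 for r in rows if r["last_order_year"] == "2023")


def count_customers_finance(rows):
    # Assumed Finance rule: a customer is any account marked active.
    return sum(1 for r in rows if r["status"] == "active")


def count_customers_support(rows):
    # Assumed Support rule: every account ever created is a customer.
    return len(rows)


print("Sales:  ", count_customers_sales(rows))
print("Finance:", count_customers_finance(rows))
print("Support:", count_customers_support(rows))
```

Each department reads the same file and reports a different number. A data warehouse addresses this by holding one agreed definition of “customer” that every report draws on.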
Now that you have an understanding of what a data warehouse is and how projects come about, I’m going to explain the requirements for creating a data warehouse solution. To learn more about these requirements, read the blog post here.