A data warehouse is a repository that stores different forms of information from different sources.
Data warehouse architecture (DWA) is the way the data and the storage facility are organized.
A data warehouse usually stores dispositive data, that is, data prepared for analysis and decision-making rather than for day-to-day operations. The architecture should be able to extract valuable information from raw data using the most effective approach.
It then transforms that information into a simple structure using a dimensional model, which provides valuable insights for business decisions.
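As a rough illustration of a dimensional model, the sketch below builds a tiny star schema in an in-memory SQLite database: one fact table surrounded by two dimension tables. The table names, columns, and sample rows are hypothetical and chosen only for this example; a real warehouse would run on a dedicated engine.

```python
import sqlite3


def build_star_schema():
    """Create a tiny star schema (hypothetical tables) in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
        -- The fact table holds measures plus keys into each dimension.
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "books"), (2, "games")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(1, 2023), (2, 2024)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 1, 10.0), (1, 2, 15.0), (2, 2, 30.0)])
    return conn


def sales_by_category(conn):
    """Join the fact table to a dimension and aggregate the measure."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


if __name__ == "__main__":
    print(sales_by_category(build_star_schema()))
```

The same fact table could be joined to dim_date instead to view sales by year, which is the main appeal of keeping measures and descriptive attributes in separate tables.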
Types of Data Warehouse Architectures
There are different types of data warehouse architecture, from traditional designs to modern ones that are still evolving.
The five main architectures are Independent Data Marts, Centralized, Hub-and-Spoke, Federated, and Data Mart Bus.
You can also combine two or more of these architectures to make your data warehousing more effective.
However, the three main types of architecture to consider when designing an enterprise data warehouse are single-level, two-tier, and three-tier.
- Single-Level Architecture
A single-level data warehouse architecture focuses on creating a dense data set while reducing the volume of stored data.
This architecture may not be suitable for every business, especially companies that process large volumes of data from numerous streams and have complex data requirements.
The advantage of this architecture is that it removes duplicate data and eliminates redundancy.
- Two-Tier Architecture
With this architecture, the physical data sources are separated from the warehouse itself.
A two-tier architecture can be more efficient in terms of data organization and management, but its main disadvantage is that it does not scale well.
- Three-Tier Architecture
The three-tier architecture is the most common type of data warehouse architecture because it creates a well-organized data flow from raw information to valuable insights.
The lowest level usually consists of the database server, which creates an abstraction layer for data from different sources, such as transactional databases used for front-end applications.
The second level contains an online analytical processing (OLAP) server. The data at this level is rearranged into a form that makes it easy to analyze and query in multiple ways.
The third and top level of the architecture is the client level, which contains the tools and application programming interfaces (APIs) used to analyze, query, and report on high-level data.
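The three tiers can be sketched as three small functions, one per level. This is only a toy model under stated assumptions: an in-memory SQLite table stands in for the database server, a plain dictionary stands in for the OLAP server, and a single function stands in for the client tools. All names and sample data are hypothetical.

```python
import sqlite3
from collections import defaultdict


def bottom_tier():
    """Bottom tier: the database server holding data loaded from sources."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, year INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [("EU", 2023, 100.0), ("EU", 2024, 150.0), ("US", 2024, 200.0)],
    )
    return conn


def middle_tier(conn):
    """Middle tier: rearrange rows into an analysis-friendly structure.

    A dict keyed by (region, year) plays the role of an OLAP cube here.
    """
    cube = defaultdict(float)
    for region, year, amount in conn.execute(
        "SELECT region, year, amount FROM orders"
    ):
        cube[(region, year)] += amount
    return dict(cube)


def top_tier(cube, region):
    """Top tier: a client-side report that queries the cube."""
    return sum(total for (r, _year), total in cube.items() if r == region)


if __name__ == "__main__":
    cube = middle_tier(bottom_tier())
    print("EU total:", top_tier(cube, "EU"))
```

Each function only talks to the level directly below it, which is the property that keeps the data flow organized.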
Main Components of the Data Warehouse Architecture
Now that we’ve discussed the three main data warehouse architectures that businesses employ, let’s look into the main components of a data warehouse.
A data warehouse design has five major components:
- Data warehouse database
- Metadata
- Extraction, transformation and loading tools (ETL)
- Data warehouse access tools
- Data warehouse bus
1. Data warehouse database
Your database is the central component of a data warehousing architecture. The database is where you keep all the information about your business and refer to when you need to pull data for business decisions.
There are different kinds of databases, and you'll need to choose the one that best suits your business model and operating environment to store data in your warehouse.
Let's look at the popular types of databases you can choose from and the basic differences between them.
- Typical relational databases: These databases are row-oriented and are the kind used for everyday operations.
Examples of such databases are SAP, Oracle, Microsoft SQL Server, and IBM DB2.
- Analytics databases: As the name implies, these are designed specifically to store data for analytics and to manage analytical workloads effectively. Examples of these are Teradata and Greenplum.
- Data warehouse appliances: An appliance is not just a storage database. Some providers offer their clients a package that combines data management software with the hardware for data storage. Examples are IBM Netezza, Oracle Exadata, and SAP HANA.
- Cloud-based databases: A cloud-based database can be hosted and accessed in the cloud. There will be no need to order hardware from the vendor before you set up your data warehouse.
Examples are Google BigQuery, Microsoft Azure SQL, and Amazon Redshift.
2. Metadata
Metadata defines your data warehouse and provides a framework for data storage.
Metadata helps with the structuring of the warehouse, the storage, and even the use of the data warehouse.
Metadata can be categorized into Technical and Business Metadata.
Technical Metadata: This holds the information that developers and managers need when performing warehouse development and management tasks.
Business Metadata: This holds the information that helps users understand and interpret the data stored in the warehouse.
Metadata is important for both companies and their technical teams, because it helps them understand the data stored in the warehouse and convert it into useful information.
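To make the distinction concrete, here is a minimal sketch of a metadata catalog entry that carries both kinds of metadata for one column. The field names, the catalog contents, and the describe helper are all hypothetical; real warehouses typically keep this information in a dedicated catalog or data dictionary.

```python
from dataclasses import dataclass


@dataclass
class ColumnMetadata:
    # Technical metadata: what developers and administrators need.
    table: str
    column: str
    sql_type: str
    source_system: str
    # Business metadata: what helps users interpret the data.
    business_name: str
    description: str


# A hypothetical one-entry catalog.
CATALOG = [
    ColumnMetadata(
        table="fact_sales",
        column="amount",
        sql_type="REAL",
        source_system="orders_db",
        business_name="Sale amount",
        description="Value of one sale in euros, before tax",
    ),
]


def describe(table, column):
    """Return the business-facing description of a column, if cataloged."""
    for entry in CATALOG:
        if entry.table == table and entry.column == column:
            return f"{entry.business_name}: {entry.description}"
    return None


if __name__ == "__main__":
    print(describe("fact_sales", "amount"))
```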
3. Extraction, Transformation and Loading Tools (ETL)
In data warehousing, ETL tools are very important because they sit at the center of the data warehouse architecture.
ETL tools are used to extract data from different sources, clean and restructure it, and load it into a data warehouse.
The ETL tool you choose for your data warehouse architecture will influence the following:
- The time it takes for the data to be extracted from sources
- The method of data extraction
- The simplicity and nature of the transformations applied
- The rate of data validation and cleansing to improve end-product analysis
- The way information is distributed from the central repository to your BI applications
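The extract, transform, and load steps can be sketched in a few lines of Python. This is a minimal illustration, not how any particular ETL product works: the inline CSV text stands in for a source system, the validation rule (skip rows whose amount is not numeric) is an assumption, and the sales table name is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system, with messy values.
RAW_CSV = """customer,amount
 alice ,10.50
BOB,not-a-number
carol,7
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: validate amounts and normalize customer names."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # Assumed rule: drop rows that fail validation.
        clean.append((row["customer"].strip().lower(), amount))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def run_etl(text=RAW_CSV):
    """Run the three steps in order against an in-memory warehouse."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn


if __name__ == "__main__":
    conn = run_etl()
    print(conn.execute("SELECT customer, amount FROM sales").fetchall())
```

Keeping the three steps as separate functions makes it easy to swap the extraction method or tighten the validation rules without touching the rest of the pipeline.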
4. Data Warehouse Access Tools
A data warehouse uses a database or a group of databases as its basis. Generally, enterprise users cannot work directly with databases, so they have to use support tools.
Some of the support tools are the following:
Query and Reporting Tools: These tools help users to create enterprise reports (usually in the form of spreadsheets, calculations, or interactive visual representations) for their analysis.
Application development tools: These tools help users to create customized reports and present them in formats intended for specific reporting purposes.
Data mining tools: These help to automate the process of identifying patterns and relationships in large amounts of data, using advanced statistical modeling methods.
Online Analytical Processing tools: OLAP tools help users to build a multidimensional view of the data warehouse and support the analysis of enterprise data from different perspectives.
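Analyzing the same data "from different perspectives" can be illustrated with a small roll-up function that aggregates a measure along whichever dimensions the user picks. This is a plain-Python sketch of the idea, not the API of any OLAP product; the sample facts and the roll_up name are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions and one measure.
FACTS = [
    {"region": "EU", "year": 2023, "product": "books", "amount": 100.0},
    {"region": "EU", "year": 2024, "product": "games", "amount": 150.0},
    {"region": "US", "year": 2024, "product": "books", "amount": 200.0},
]


def roll_up(facts, *dimensions):
    """Sum the amount measure, grouped by the chosen dimensions."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["amount"]
    return dict(totals)


if __name__ == "__main__":
    print(roll_up(FACTS, "region"))          # perspective: by region
    print(roll_up(FACTS, "year"))            # perspective: by year
    print(roll_up(FACTS, "region", "year"))  # perspective: region and year
```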
5. Data Warehouse Bus
This defines how data flows within a data warehousing architecture. It usually contains a data mart.
A data mart is an access layer through which data is delivered to users. It groups the data prepared for a particular user group.
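One simple way to picture a data mart is as a view over the warehouse that exposes only the rows a given user group needs. The sketch below assumes an in-memory SQLite database and hypothetical table, view, and department names; production data marts are often separate physical stores rather than views.

```python
import sqlite3


def build_warehouse_with_marts():
    """Create a central table and one view (data mart) per user group."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE warehouse_sales (department TEXT, item TEXT, amount REAL);
        INSERT INTO warehouse_sales VALUES
            ('marketing', 'ad campaign', 500.0),
            ('finance', 'audit', 300.0),
            ('marketing', 'trade fair', 200.0);
        -- Each mart exposes only the slice meant for one group.
        CREATE VIEW mart_marketing AS
            SELECT item, amount FROM warehouse_sales
            WHERE department = 'marketing';
        CREATE VIEW mart_finance AS
            SELECT item, amount FROM warehouse_sales
            WHERE department = 'finance';
    """)
    return conn


if __name__ == "__main__":
    conn = build_warehouse_with_marts()
    print(conn.execute("SELECT item, amount FROM mart_marketing").fetchall())
```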