Monday, April 20, 2009

Data Model Analysis Styles

Data Models
- Conceptual: The highest-level beginning place
- Logical: The most important model for developing a good physical design
- Physical: The most detailed model that’s the source of the DDL to generate database objects
- Reverse engineered: A Physical model that shows the As-Is of a deployed database

Top-Down Modeling
A top-down approach to the modeling of a system will begin with a Conceptual model and work down through a Logical model to a Physical model for implementation. This process documents details regarding the conceptual entities that you begin with, breaking out attributes and relationships as you progress toward a Physical model.

Bottom-up Modeling
Sometimes you have to work with a bottom-up approach, working backward from an existing system, in order to develop a Logical model (and sometimes further back to a Conceptual model). This type of approach is reverse-engineering analysis.

Levels of detail
At each of these stages of analysis, you can choose the level of detail you display in the model you develop.

- Entity level: This is a high level of detail used for scope and planning or a simplified review at the beginning of a project. It shows only entities and attributes.
- Key-based: This is an intermediate level of detail, which includes entities and primary key attributes (usually natural keys if possible).
- Fully attributed: This is the most detailed level of analysis. All attributes are listed, including all foreign keys migrating from the parent table.

It’s important to use the appropriate type of model and level of detail and to tailor the model to the needs of your particular situation. You’ll consider the scope of projects in the next chapter.

The Right Model
Data models go through similar maturing and iteration processes as a full project. At the height of a project you’ll be moving rapidly between different models, from Conceptual to Logical to Physical and back to Logical, as you find out that there are exceptions to the business rules that the clients didn’t mention earlier.

Project Type
The type of project will help you narrow down the analysis level of the models. Enterprise-level projects, for instance, almost never need Physical models, but transactional OLTP projects could use all three analysis levels.

Model Deliverables for Different Project Types
Enterprise
Conceptual

Transactional
Conceptual, Logical, Physical

Dimensional or data warehouse
Conceptual, Physical

Model Goal
Your task helps you decide analysis level as well. If you want a high-level overview of the data system, or to play with new ideas or ways of organizing things, then you’ll need to build a Conceptual model. On the other hand, if you want to analyze business rules and analyze data elements, then you’ll need to build a Logical model. If you already know everything there is to know about the rules of the data elements and need to design tables, then you’ll want to build a Physical model.

Model Deliverables Expected by Model Goal
Abstract

Conceptual

Analysis
Conceptual, Logical

Physical design
Physical

Customer Need
Now you need to determine who will be your customer in order to decide what level of definition you need. If your customer is someone who wants a general rundown of the data involved in the project, then you want an entity-level model. If you’re working with someone who is concentrating on integration opportunities or key design choices, then you need a key-based view of the model. If the customer is someone who needs to see all the gory details, then you want a fully attributed model.

Model Level of Detail Deliverable Expectation by Customer Need
Overview

Entity

Focused identifier/uniqueness
Key based

All the details
Fully attributed

Choose one option from these three ways to narrow the definition of your product, and you’ll see what model you need to build. The following maps the combinations of ideas that focus the product definition of a model.

Deliverable Needs Combined to Focus on What Model Should Be Built
(Project Type, Model Goal, Customer View Need, Analysis Level, Level of Definition)
Enterprise
Abstract
Overview
Conceptual
Entity level

Transactional
Abstract
Overview
Conceptual
Entity level

Transactional
Abstract
Overview
Conceptual
Fully attributed

Transactional
Analysis
Overview
Logical
Entity level

Transactional
Data element analysis
Focused identifier/uniqueness
Logical
Key based

Transactional
Data element analysis
All the details
Logical
Fully attributed

Transactional
Physical design
Focused identifier/uniqueness
Physical
Key based

Transactional
Physical design
All the details
Physical
Fully attributed

Data warehouse/enterprise reporting
Abstract
Overview
Conceptual
Entity level

Data warehouse/enterprise reporting
Data element analysis
All the details
Logical
Fully attributed

Data warehouse/enterprise reporting
Physical design: normalized or dimensional
Focused identifier/uniqueness
Physical
Key based

Data warehouse/enterprise reporting
Physical design: normalized or dimensional
All the details
Physical
Fully attributed

Note: Remember that if the model doesn’t communicate what you’re trying to get across to your audience, then it’s useless.
These are still just high-level definitions of what’s on a model and the details that show up. You have an almost unlimited set of choices, especially if you’re using one of the modeling software packages. Making these choices just helps you narrow down what kind of model you’re building and therefore what your checklist of information is. You still need to be aware of the level of understanding (rather than what they want to know) of your model customer, the needs of the project, and any methodology or external standard.

Model Tips
The following is a list of things you should keep in mind when modeling:
- Picking the level of analysis and detail of a model depends on many factors.
- Staying within scope is a challenge.
- Being wrong on a model can be a good thing. It stimulates discussion.
- There is no “one right” model. The models serve different purposes. Not all of them are schematics for databases.
- Sometimes the “correct” solution isn’t the best solution for the enterprise.
- Models are products and have customers. Be customer oriented.
- Stay objective. Feedback for designs or analysis is meant to enhance, not criticize.
- Finding a way to keep the team in sync with each other at every model iteration requires a creative publish and deploy mechanism.
- Every model you build helps you build the next one.
- You’ll need a second pair of eyes to look for your mistakes. You can be too close to see them.
The primary goal of your work is to create a model that your audience can understand. The “perfectly designed” model may not be consumable by your customers. Also, work with your audience to encourage them to think of better ways of doing things; sometimes resistance isn’t because of a failure of design but because of the comfort level of your customers.

Building a Conceptual Model
Defining the Objectives – 20 – 80 rule
What are you trying to accomplish in building a Conceptual model? In this example, you’re going to draw a picture that portrays the data scope of a business process. You’re also going to refine and clarify all the individual descriptions of a process until it’s distilled into a single truth (or as close as you can get to it). To do that, you need to do the following:
- You should become informed enough to explain and defend the discoveries you make.
- You should learn the vocabulary of the process and its surroundings as the business sees it.
- You should identify the concepts or data elements, and determine the rules that govern them.
- You should validate the scope.
- You should build a Conceptual ER model.

Defining the Scope
Defining the Approach
- Top-Down Approach
- Bottom-Up Approach

Documenting the Process

Building the Conceptual Model

Building a Logical Model

Understanding OLAP Database Basics
The main difference between a data mart and a data warehouse is this: a data mart is intended to store a single snapshot of the business data, and the data warehouse will contain more than just a single point-in-time snapshot. A data mart is usually intended to be exposed to query tools or decision support applications and therefore usually follows a Dimensional model design. A data warehouse may have a Third Normal Form (3NF) design or a Dimensional model design, depending on its purpose within the enterprise. If a data warehouse has a 3NF design, it’s likely to feed dependent data marts and therefore is infrequently accessed. If the data warehouse has a Dimensional model design, it may be supporting an OLAP query tool or application directly. It’s important to note that the terms data mart and data warehouse generally describe a database’s use rather than imply a Physical model design.

Data marts generally have the following characteristics:
- They have a Dimensional model design.
- They’re accessed by OLAP (business intelligence) tools or an OLAP application.
- They contain a single point-in-time snapshot of data.
- They’re periodically refreshed from a source system (usually a data warehouse).
- They don’t have other dependent OLAP databases.

Data warehouses generally have the following characteristics:
- They have either a Dimensional model design or a normalized design.
They’re infrequently accessed by any process that isn’t an extraction, transformation, and loading (ETL) process, unless they have a Dimensional model design.
- They contain a time variant and historical view of the data.
- They’re periodically refreshed from a source system (usually source OLTP systems).
- They have dependent data marts.

Dimensional Modeling
It’s a methodology for modeling data that starts from a set of base measurement events and constructs a table called the fact table, generally with one record for each discrete measurement. This fact table is then surrounded by a set of dimension tables, describing precisely what’s known in the context of each measurement record. Because of the characteristic structure of a Dimensional model, it’s often called a star schema. Dimensional models are the logical foundation of all OLAP systems.
The key to a Dimensional model is in determining the relevant context (or face). A Dimensional model that’s designed to find trends in sales information will likely treat each sale as a fact. Any supporting information important to that analysis, such as the country of sale or the month of the year, will be a dimension.

No comments:

Post a Comment