MMH Data Warehousing, Corporate Portal & e-business Applications.


Mimno, Myers & Holum

COLUMN: TRY CONDUCTING SHORT USER INTERVIEWS

FlashPoint Column - January 25, 2002
by Pieter Mimno
Independent Consultant

SUMMARY: Top-down development methodologies often require weeks or months of up-front interviews with end users to define functional requirements for a data warehouse. Because of this extensive up-front effort, project managers often cannot achieve a rapid return on investment (ROI) for their data warehousing efforts. In contrast, bottom-up development techniques have demonstrated the ability to deliver significant functionality in 90 days or less. Among other techniques, bottom-up development methodologies achieve rapid ROI at low risk by conducting short interviews with end users and eliminating the need to design enterprise-level data models.

TOP-DOWN VERSUS BOTTOM-UP DEVELOPMENT
Top-down development methodologies, which are derived from Information Engineering techniques, typically require lengthy interviews with end users, detailed studies of functional requirements across multiple business areas, and time-consuming enterprise-level data modeling. Due to the extensive amount of up-front effort required, top-down development often requires 8 to 10 months to produce tangible business benefits. This is often unacceptable to business managers who are looking for a much more rapid solution to their business problems. A common business requirement is to obtain tangible ROI from the first data warehousing project in 90 days or less.
Bottom-up development methodologies, which are derived from Rapid Application Development (RAD) techniques, meet this requirement by cutting out major chunks of the up-front design effort. RAD techniques applied to data warehousing focus on development of individual data marts, one business unit at a time. Global requirements are defined through short interviews with individual business units, lasting no more than one day per interview.
Additional time is saved by avoiding the requirement to design an enterprise-level data model for the data warehouse. The use of logical data modeling and metadata integration techniques ensures that all components of the data warehousing application remain integrated, stable, and extensible, without requiring up-front development of an enterprise data model. Bottom-up RAD techniques have demonstrated over many years that it is possible to build large, complex applications without first developing an enterprise data model.

END-USER INTERVIEWS
As part of the requirements specification process for any data warehousing project, interviews are conducted with end users from multiple business units, e.g., sales, marketing, finance, HR, etc. Interviews represent a horizontal business discovery process that specifies functional requirements for multiple data marts.
The top-down implementation process typically requires multiple, lengthy interviews with individuals and groups of individuals in each business unit that is to be supported by the data warehousing application. These interviews may go on for months and result in the generation of stacks of requirement specifications on paper. The client is frequently frustrated by this process because there is no concrete deliverable that can provide a solution for the business problem.
In contrast, the bottom-up methodology limits user interviews to no more than one day per business unit. Interviews are conducted with personnel from multiple business units. Attendees at each interview consist of 8 to 10 individuals from a business area that has expressed an interest in a data mart. Short interviews, which are characteristic of bottom-up development, represent a major difference from top-down development.
The deliverable from each interview is a short, concise requirement specification for the business unit, with a minimum amount of paper. The requirement specification consists of answers to a structured set of questions and a top-level dimensional data model representing the data sources, source-to-target mappings, target database, and reports required for a specific business unit. The top-level data models produced as a result of interviews with multiple business units are then synthesized to identify common data sources, facts, dimensions, attributes, transformations, aggregates, etc.

USERS DON'T KNOW WHAT THEY WANT IN A DATA MART
The bottom-up interview process is based on an important characteristic of data warehousing: users of data marts cannot specify the functionality they need in a data mart until they actually use it, i.e., users don't know what they want until they see it. No matter how long the interview process lasts, users cannot provide a detailed set of functional requirements for the data mart to be implemented. Typically, once the data mart is in operation, users generate a flood of change requests, based on their growing understanding of how the data mart can be used to manage their business unit better.
The process of defining requirements for a data warehousing application is very different from specifying functional requirements for an OLTP application. In an OLTP application, it is possible to interview users at length and specify the functional requirements for the application in the form of a "Victorian novel". The users are then induced to sign off on the specifications, and months later, the application is delivered to them.
In data warehousing applications, business users often have little or no experience with the functionality that can be delivered with a data warehouse. They can provide examples of reports that are generated by their existing decision support systems, but they cannot define, in any detail, the functionality required in a proposed data mart. IT personnel are lucky if they can obtain 50% of the functional requirements for a data mart through interviews with end users. The other 50% of required functionality will be initiated by end users through change requests as a result of actual use of the data mart.

QUESTIONS ASKED DURING THE INTERVIEW PROCESS
The same set of structured questions is asked at each interview with personnel from a business unit. Questions that are asked during the first portion of the user interview identify the responsibilities and challenges facing the business unit:

  • What is the business function performed by the business unit?
  • What do individuals do when they come to work every morning?
  • What functions do they perform individually?
  • What business decisions do they have to make?
  • What source data do they use to make these decisions?
  • What tools do they use to aid in the decision-making process?
  • What are the deficiencies in their current decision-support tools?
  • How could they perform their duties better?
  • How many different types of users need to be supported, e.g., general-purpose users, power users, financial analysts, executives, etc.?
  • Are there local business rules for the computation of metrics for profit, sales, depletion, etc.?
  • How consistent are these rules? Does each business analyst have a different definition of profit? Can these different rules be specified in an equation? Can the entities used as terms of these equations be defined and maintained centrally (i.e., can local business rules be defined in local metadata that are 100% dependent on entities defined in central metadata)?
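The last question above can be made concrete with a small sketch. It assumes, purely for illustration, that central metadata is a set of entity names and that each local business rule is an arithmetic expression over those names. The entity names, `CENTRAL_ENTITIES`, `LOCAL_RULES`, and `undefined_terms` are all hypothetical and are not part of the methodology itself.

```python
import ast

# Hypothetical central metadata: entity names maintained centrally.
CENTRAL_ENTITIES = {"gross_sales", "returns", "cost_of_goods", "freight"}

# Hypothetical local metadata: three different local definitions of profit.
LOCAL_RULES = {
    "profit_finance": "gross_sales - returns - cost_of_goods - freight",
    "profit_sales": "gross_sales - cost_of_goods",
    "profit_regional": "gross_sales - cost_of_goods - local_rebates",
}

def undefined_terms(rule: str, central: set[str]) -> set[str]:
    """Return the names used in a rule that are not defined centrally."""
    tree = ast.parse(rule, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return names - central

# Report which local rules depend entirely on central entities.
for rule_name, equation in LOCAL_RULES.items():
    missing = undefined_terms(equation, CENTRAL_ENTITIES)
    status = "ok" if not missing else f"undefined: {sorted(missing)}"
    print(f"{rule_name}: {status}")
```

In this sketch, a rule that uses a term missing from the central set (here `local_rebates`) is flagged, which is one way to surface inconsistent local definitions during the interview follow-up.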
The objective of the second set of questions in the user interview process is to specify a top-level dimensional data model that summarizes the functional requirements for a data mart for the business unit. The dimensional model identifies data sources (i.e., source databases and tables) that must be extracted, a representative data model for the data mart (i.e., facts, dimensions, attributes, aggregates), and representative reports to be generated by the data mart. Users are asked to bring in reports they are currently generating. An initial version of the data model for the data mart's target database is then specified, one that can be used to generate those reports. Finally, a source-to-target mapping is specified that can be implemented by an extraction/transformation/load (ETL) tool or process to populate the target database.
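As a rough illustration of what a source-to-target mapping records, the following sketch renames and transforms columns from a source row into a target row. The column names, sample rows, `MAPPING`, and `apply_mapping` are assumptions made for this example; a real ETL tool would express the same idea in its own notation.

```python
# Hypothetical source rows, as they might be extracted from an order table.
source_rows = [
    {"ORD_DT": "2002-01-25", "PROD_CD": " a100 ", "AMT": "19.50"},
    {"ORD_DT": "2002-01-26", "PROD_CD": "b200", "AMT": "5.00"},
]

# Source-to-target mapping: source column -> (target column, transformation).
MAPPING = {
    "ORD_DT": ("order_date", str),
    "PROD_CD": ("product_code", lambda value: value.strip().upper()),
    "AMT": ("sales_amount", float),
}

def apply_mapping(row, mapping):
    """Build one target row by renaming and transforming each mapped column."""
    return {
        target: transform(row[source])
        for source, (target, transform) in mapping.items()
    }

target_rows = [apply_mapping(row, MAPPING) for row in source_rows]
for row in target_rows:
    print(row)
```

Keeping the mapping as data rather than code means it can be written down during the interview and reviewed by the business unit before any ETL work begins.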
The data in the target data model is organized by:
  • Facts: numeric, continuously valued measurements
  • Dimensions: primary business variables, e.g., time, product, customer, store, region, sales rep., etc.
  • Attributes: descriptors of dimensions, e.g., attributes of time are day, week, month, fiscal quarter, year, etc.
  • Aggregates: summaries of Facts by day, month, quarter, or year, for different combinations of dimensions
It is assumed that a Business-Intelligence (BI) tool will be used to slice the numerical facts in the Fact Table across any dimension or attribute (attributes typically become column or row headers in reports). The BI tool can be used to drill up or down, which deletes or adds column headers, or to "drill through" to detailed transaction data stored in an external database, such as a Central Data Warehouse.
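The relationship between facts, dimensions, attributes, and aggregates can be sketched in a few lines. The example below is a minimal in-memory model, not a BI tool: the tables, keys, values, and the `aggregate` helper are all hypothetical. It shows one numeric fact being summarized by an attribute of one dimension, then drilled down by adding an attribute of a second dimension.

```python
from collections import defaultdict

# Product dimension: key -> attributes (hypothetical values).
product_dim = {
    1: {"product": "Widget", "category": "Hardware"},
    2: {"product": "Gadget", "category": "Hardware"},
    3: {"product": "Manual", "category": "Books"},
}

# Time dimension: key -> attributes of time.
time_dim = {
    20020125: {"month": "2002-01", "quarter": "2002-Q1"},
    20020215: {"month": "2002-02", "quarter": "2002-Q1"},
}

# Fact table: foreign keys into each dimension plus a numeric fact.
sales_facts = [
    {"date_key": 20020125, "product_key": 1, "sales": 100.0},
    {"date_key": 20020125, "product_key": 3, "sales": 40.0},
    {"date_key": 20020215, "product_key": 2, "sales": 60.0},
    {"date_key": 20020215, "product_key": 3, "sales": 25.0},
]

def by_category(fact):
    """Look up the product category attribute for a fact row."""
    return product_dim[fact["product_key"]]["category"]

def by_month(fact):
    """Look up the month attribute for a fact row."""
    return time_dim[fact["date_key"]]["month"]

def aggregate(facts, *attribute_getters):
    """Sum the sales fact, grouped by the chosen dimension attributes."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(get(fact) for get in attribute_getters)
        totals[key] += fact["sales"]
    return dict(totals)

# Aggregate by one attribute, then drill down by adding a second one.
sales_by_category = aggregate(sales_facts, by_category)
sales_by_category_month = aggregate(sales_facts, by_category, by_month)
print(sales_by_category)
print(sales_by_category_month)
```

Each attribute passed to `aggregate` plays the role of a row or column header in a report; removing one corresponds to drilling up.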

SYNTHESIS OF DATA MODELS
The result of each interview is a top-level data model specifying data sources, facts, dimensions, attributes, aggregates, and source-to-target mappings. Following each interview, the data model is documented on a large flip chart and hung on the wall. The data model is a concise, compact specification of the functional requirements for a data mart for the business unit. It is at the appropriate level of detail to support the implementation process.
After interviews with all business units have been completed, the project team studies the top-level data models that were generated as a result of each interview. The team synthesizes the data models by looking for common data sources, facts, conformed dimensions, mappings, etc. across the data models. Conformed dimensions are dimensions, such as time, product, employee, store, etc., that have identical attributes across multiple data marts. Identification of conformed dimensions can save significant development effort because for each conformed dimension, only one dimension table in the entire application has to be populated by the ETL tool or process. A goal is for 60 to 70 percent of dimensions to be conformed.
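The search for conformed dimensions can be illustrated with a short sketch. It assumes each top-level model has been transcribed from the flip chart into a dictionary of dimension names and attribute sets; the mart names, attributes, and `conformed_dimensions` function are hypothetical. How the conformed share is measured is also an assumption here (conformed dimensions divided by all distinct dimensions), since the text does not define the denominator.

```python
# Hypothetical top-level models: data mart -> {dimension: set of attributes}.
mart_models = {
    "sales": {
        "time": {"day", "month", "quarter", "year"},
        "product": {"sku", "brand", "category"},
        "store": {"store_id", "region"},
    },
    "marketing": {
        "time": {"day", "month", "quarter", "year"},
        "product": {"sku", "brand", "category"},
        "campaign": {"campaign_id", "channel"},
    },
    "finance": {
        "time": {"day", "month", "quarter", "year"},
        "product": {"sku", "category"},  # attributes differ, so not conformed
        "account": {"account_id", "type"},
    },
}

def conformed_dimensions(models):
    """Return dimensions used by 2+ marts with identical attributes in each."""
    seen = {}
    for dims in models.values():
        for name, attrs in dims.items():
            seen.setdefault(name, []).append(frozenset(attrs))
    return {
        name
        for name, versions in seen.items()
        if len(versions) > 1 and len(set(versions)) == 1
    }

all_dims = {name for dims in mart_models.values() for name in dims}
conformed = conformed_dimensions(mart_models)
share = len(conformed) / len(all_dims)  # assumed definition of the share
print(sorted(conformed), f"{share:.0%}")
```

In this made-up data, the product dimension fails the test because finance omits an attribute, which is the kind of mismatch the synthesis step is meant to expose and reconcile.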
Following synthesis of the data models, a workshop is convened to specify the long-term data warehousing architecture, and a second workshop is held to define the project plan for the first data mart to be implemented as a subset of the long-term architecture. After all components of the data warehousing architecture for the first data mart have been selected and installed, implementation of the first data mart proceeds within a 90-day time box.

Note: This series of articles describes steps in a bottom-up methodology that I have found to be successful for the implementation of data warehousing applications. The primary goals of the methodology are to reduce the up-front effort required to specify the functionality of a data warehousing application and deliver data marts in 90 days or less, at low cost and low development risk. The overall methodology is summarized in a previous FlashPoint article by Pieter Mimno entitled "Fast Payoff for Data Warehouse Investment".
For further information about the issues discussed in this report, please contact Pieter Mimno, Independent Consultant, at pmimno@mimno.com, or visit his Web site at www.mimno.com. Mr. Mimno specializes in the selection of system components and support for all phases of development for data warehousing, corporate portals, and eBusiness-Intelligence applications.

Reprinted with permission from The Data Warehousing Institute. 
Copyright 2002. The Data Warehousing Institute.
