Author: macleanwalker
-
Data Transformation – Procedural & Non Procedural Solutions
This paper looks at a somewhat awkward data transformation, and at solutions written in SQL and in a procedural language. It describes some techniques which can be used to develop the solution in both languages. It also compares the solutions in terms of ease of development, performance and cost of maintenance. Are we building transformations…
-
Basic Data Quality Checks
This article looks at basic data quality audits that can be done within a database. Examples are given using Oracle syntax; however, the techniques can also be applied to other databases. Introduction The following article discusses some of the data quality issues that can be addressed by manual scripts on a copy of the…
-
Detecting Changed Data
Introduction When loading data warehouses, it is usually possible to decrease the load time very significantly by processing only changes since the last load, rather than completely refreshing all the data every time. This article describes one approach for detecting changes, which has been used successfully in a number of data warehouse projects. Background There…
-
Job scheduling – fixed and relative timing
This article looks at the relative merits of two types of processing schedule, one based on fixed times (similar to ‘cron’ on a Unix system) and one based on relative times (similar to ‘at’ on a Unix system). Introduction Much of data warehousing is dependent on running regular jobs to collect data and load…
-
Auditing Data Cleaning Updates
How to track what has been updated by data cleaning processes. The Problem A common problem when building a data warehouse is to track and audit which records have had data quality updates or cleaning applied to them. A simple method is to add a bitmap that reflects which data quality updates have been applied.…
-
Fly Fishing File System
San Francisco USENIX Conference 1992 – Contest The conference contest, “Name That File System”, was a great success, with several hundred (yes, you read that right) entries. The rules read as follows: In the beginning, there was the file system, and it was good enough for the disk technology of the time. Then disks got…
-
Sequent – Values & Principles
Sequent was a hardware vendor that pioneered high-performance symmetric multiprocessing (SMP) open systems between 1983 and 1999 when it was acquired by IBM. I was lucky enough to work for Sequent between 1992 and 1995. In its heyday it operated by a set of values and principles that created a great work environment because individuals who worked at the company…
-
Template Solutions for Data Warehouses and Data Marts
A presentation given in 1998 to the Data Warehousing & Business Intelligence Conference on the value of using template solutions for data warehouses and data marts. It describes the types of reporting system and discusses the templates then available for them. Download Template Solutions for Data Warehouses and Data Marts Now
-
KeySum – Using Checksum Keys
In 1997 we were working on a project for Swisscom Mobile and needed to innovate a way of using a checksum key on a database. Here is the solution we designed: Introduction Keysum is a new and interesting technique (not a product) in the generation of keys within a database. It has particular application within…
-
A Technical Architecture for the Data Warehouse
In April 1995 I left Sequent Computer Systems to set up Data Management & Warehousing. Between the completion of the Perot/Europcar project and leaving I had worked with Ralph Kimball, who was training Sequent to develop a Data Warehousing practice and promoting his Red Brick technology (currently owned by IBM). This paper outlines the first two…