Magazine Article | December 1, 2001

Hard-Core Data Analysis

Source: Field Technologies Magazine

Much as multiple constituents come together to produce a durable alloy such as steel, multiple storage, supply chain, and other enterprise technologies are necessary ingredients for forging a strong enterprise.

Integrated Solutions, December 2001

Steel has been part of some of the greatest achievements in history. It formed the "iron horse" and rails that helped carve a nation out of the frontier. Steel is the backbone of bridges, the skeleton of skyscrapers, the framework for automobiles. But just as intriguing as the finished product is the steel-making process itself.

National Steel Corp. (Mishawaka, IN) is familiar with that process. The $3 billion company is one of the largest producers of carbon flat-rolled steel products in the United States, with annual shipments of approximately 6 million tons. To the outside observer, National Steel is driven by elements and compounds joined together by fiery furnaces. But, drilling down to the core of the business reveals a much different driving force. Simply put, the company is driven by its data.

Manual Data Migration Becomes Overwhelming Endeavor
The massive volume of data that National Steel generates is a result of the custom orders it handles. Almost no two jobs are identical. For example, two similar construction customers may both need hundreds of tons of steel. Because of building requirements, one construction company might need its steel to have a higher concentration of carbon while the other may require a different kind of heat treatment. This order information adds up to more than 90 GB of data that National Steel generates each year. Previously, the company tried to manually migrate aged data from its production databases to a nonproduction DB2 database. Then, as the business grew, National Steel added an OFS (order fulfillment system) to its repertoire, which incorporates order entry, inventory control, and order status modules. With these new applications, writing and maintaining custom archiving software in-house became an overwhelming burden. "It got to the point that we were trying to keep more than 90 GB of data online, which would slow down our backups," recalls Bill Matts, IT director for National Steel. "We decided that we either had to create some kind of automated data archiving solution in-house or purchase an out-of-the-box solution." As part of its research, National Steel estimated what it would cost to create the solution itself and then checked out various HSM (hierarchical storage management) vendors to see how they stacked up. The company found that Princeton Softech's (Princeton, NJ) Archive for DB2 could be plugged into National Steel's DB2 platform with minimal programming. "By choosing storage management software over a customized archiving solution, we saved more than 80% of the projected costs - literally hundreds of thousands of dollars," says Matts.

Put Old Data To Good Use
The software was completely installed and integrated with the OFS in less than 120 hours. After the initial implementation, National Steel was able to archive 700 million rows of data, which equated to 65% of the data in its OFS. Now, on a monthly basis, the HSM software scans the company's dozens of relational databases, extracts about 20 million rows of data from hundreds of tables, and migrates the data to its nonproduction DB2 database. Not only does this enable National Steel's OFS modules to run more efficiently, it also cuts down on the time it takes for the IT group to perform backups of the OFS. In fact, migrating old and outdated data initially cut down backup time by 65% (equivalent to the 65% data migration). Additionally, this will enable National Steel to move less important data to cheaper tape media in the near future rather than store it online in the DB2 database. With the archive software, National Steel can also restore archived data to the OFS. "If customers need detailed information about a past job order because of an audit or a claim, we have to be able to respond quickly to their requests," says Matts. "With the archiving solution, this information can be accessed no matter what archival stage (hierarchy) it is in at the time."
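
The archive-and-restore cycle described above can be illustrated with a minimal sketch. It is not Princeton Softech's product or National Steel's actual setup: it uses Python's built-in sqlite3 module as a stand-in for DB2, and the table names (orders, orders_archive), the columns, and the cutoff date are all hypothetical.

```python
import sqlite3

def build_demo():
    """Create an in-memory database with a production table and an archive table.
    Table and column names are illustrative assumptions, not National Steel's schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, completed_on TEXT)")
    conn.execute("CREATE TABLE orders_archive (order_id INTEGER PRIMARY KEY, customer TEXT, completed_on TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "A", "1998-03-01"), (2, "B", "1999-07-15"), (3, "C", "2001-10-30")],
    )
    conn.commit()
    return conn

def archive_old_rows(conn, cutoff_date):
    """Move orders completed before cutoff_date into the archive table.
    Returns the number of rows moved."""
    with conn:  # single transaction: the copy and the delete succeed or fail together
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE completed_on < ?",
            (cutoff_date,),
        )
        cur = conn.execute("DELETE FROM orders WHERE completed_on < ?", (cutoff_date,))
    return cur.rowcount

def restore_order(conn, order_id):
    """Bring one archived order back into the production table (e.g., for an audit)."""
    with conn:
        conn.execute(
            "INSERT INTO orders SELECT * FROM orders_archive WHERE order_id = ?",
            (order_id,),
        )
        cur = conn.execute("DELETE FROM orders_archive WHERE order_id = ?", (order_id,))
    return cur.rowcount

if __name__ == "__main__":
    conn = build_demo()
    print("archived:", archive_old_rows(conn, "2000-01-01"))
    print("restored:", restore_order(conn, 1))
```

Wrapping the copy and the delete in one transaction is the key design choice in this sketch: a row is never lost or duplicated if the job fails partway through.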

Raw Data + Data Mining = Business Intelligence
National Steel developed some data marts from the OFS database with the aid of Ascential Software's (Westboro, MA) DataStage XE ETL (extract, transform, and load) software. The software extracts data from disparate systems, such as DB2 and Oracle, reformats it into a common format, and prepares the data for further analysis. What companies often find, however, is that their data, like the impurities that float to the top of a batch of molten steel, is "dirty." Three problems cause dirty data: format violations, referential integrity violations, and matching violations. The first problem usually means that letters were entered into fields that were formatted for numbers. The second refers to incomplete record keeping, such as when the "sales" table refers to customers not listed in the "customers" table. The third occurs when the same data is entered differently, such as when one table lists a state as "Texas" and a second table lists "TX." Using DataStage, National Steel can run a data-cleansing test to check for and correct any of these scenarios. After this, the data is ready to be analyzed. This is where query, analysis, and reporting software from MicroStrategy (Washington, D.C.) comes into play. MicroStrategy's Agent first queries the database to find answers to such questions as "How many orders do not have enough inventory to be completed on time?" or "Is there inventory available to apply to a different order or customer?" Once the queries are completed, the Web reporting software publishes the responses to the National Steel intranet for sales managers and sales representatives to view. "The analysis tools help us not only juggle steel among customers, they also help us look at orders that have already been satisfied and notice any trends in that data," says Matts. Some of the trends could be increases in orders of a particular type of steel, decreases in a particular kind of process, or an increase/decrease in sales to a particular vertical market.
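
The three kinds of dirty data named above can each be detected with a simple check. The sketch below is a plain Python illustration, not DataStage's interface or National Steel's actual cleansing rules; the sample records, field names, and the small state lookup table are all made up for the example.

```python
import re

# Hypothetical sample records; field names are assumptions for illustration.
CUSTOMERS = [{"customer_id": 1}, {"customer_id": 2}]
SALES = [
    {"customer_id": 1, "tons": "250", "state": "Texas"},
    {"customer_id": 2, "tons": "2O0", "state": "TX"},    # letter O typed in a numeric field
    {"customer_id": 9, "tons": "75.5", "state": "IN"},   # customer 9 is not in CUSTOMERS
]

# Illustrative subset of full state names mapped to postal abbreviations.
STATE_ABBREVIATIONS = {"texas": "TX", "indiana": "IN", "michigan": "MI"}

NUMERIC = re.compile(r"\d+(\.\d+)?")

def format_violations(rows, field):
    """Format violation: rows whose `field` should be numeric but is not."""
    return [r for r in rows if not NUMERIC.fullmatch(str(r[field]).strip())]

def referential_violations(sales, customers):
    """Referential integrity violation: sales rows that point to an unknown customer."""
    known = {c["customer_id"] for c in customers}
    return [s for s in sales if s["customer_id"] not in known]

def normalize_state(value):
    """Matching violation fix: map 'Texas' and 'TX' to the same canonical value."""
    v = value.strip()
    return STATE_ABBREVIATIONS.get(v.lower(), v.upper())

if __name__ == "__main__":
    print("format:", format_violations(SALES, "tons"))
    print("referential:", referential_violations(SALES, CUSTOMERS))
    print("states:", sorted({normalize_state(s["state"]) for s in SALES}))
```

A real cleansing job would also decide what to do with each flagged row (correct it, reject it, or route it for review); this sketch only covers detection and one normalization rule.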

In the future, National Steel plans to implement more modules on its OFS, and it plans to extend its HSM model to its inventory tracking and financials software. The process will most likely follow the same model that the company uses for its OFS, whereby data is moved from the active applications to a nonproduction database and eventually moved offline to a tape backup library. The data will then, like the raw materials used to make steel, be put through the fire in the hope that the resulting business intelligence will help the manufacturer build on its solid foundation.
