Smart Data Categorization

Data Categorization is the process of grouping similar items under a common label so that information can be acted upon in aggregate. Categorization is essential to effective spend analysis, price negotiations, standardization, value analysis, and other supply chain management efforts. Traditional methods for categorizing data typically rely either on significant manual effort by skilled clinicians or on computer programs that require continual programming and manual configuration of classification rules. Both approaches are costly, time consuming, and inefficient.

The Data Categorization Service of the Data Genomics solution utilizes a revolutionary Open Source technology to identify and group similar products into formal taxonomies such as the United Nations Standard Products and Services Code® (UNSPSC®), Global Product Classification (GPC), eCl@ss®, or your own proprietary classification scheme.

The Data Categorization Service automatically learns your classification rules from previously categorized data, called training sets, effectively leveraging your existing classification investments. The Data Categorization Service also analyzes product data elements from the normalization process to suggest classification assignments when no training set exists. In both cases, with or without training sets, the need to manually develop classification rules is eliminated. Because the technology driving the Data Categorization Service is Open Source, your cost to achieve high-quality classification is greatly reduced. With properly categorized data, you can perform meaningful spend analysis and compare costs among multiple vendors.
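To make the idea of learning from a training set concrete, here is a minimal Python sketch. It is not the Data Categorization Service's actual algorithm, which is not described here; it assumes a simple token-frequency scorer, and the function names (`tokenize`, `train`, `classify`) and the category labels are hypothetical placeholders rather than real UNSPSC entries.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(training_set):
    """Count how often each token appears under each category label."""
    counts = defaultdict(Counter)
    for description, category in training_set:
        counts[category].update(tokenize(description))
    return counts


def classify(counts, description):
    """Return the category whose learned tokens best match the description.

    Returns None when no token of the description was seen in training.
    """
    tokens = tokenize(description)
    best_category, best_score = None, 0.0
    for category, token_counts in counts.items():
        total = sum(token_counts.values())
        # Score: summed relative frequency of each token within the category.
        score = sum(token_counts[t] / total for t in tokens)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training set: (product description, category label) pairs.
training_set = [
    ("nitrile exam glove large", "Medical gloves"),
    ("latex surgical glove sterile", "Medical gloves"),
    ("syringe 10 ml luer lock", "Syringes"),
    ("insulin syringe 1 ml", "Syringes"),
]

model = train(training_set)
print(classify(model, "glove nitrile medium"))
print(classify(model, "syringe 5 ml"))
```

No classification rule is written by hand in this sketch: the mapping from tokens to categories comes entirely from the previously categorized records, which is the property the paragraph above describes.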

Data Normalization

The Data Normalization Service of the Data Genomics solution preprocesses data, transforming individual product records into a specific structured format in preparation for the Categorization process. Attempting Categorization without well-defined, structured input significantly slows down the process and increases the likelihood of an incorrect assignment. The Data Normalization Service is a transitory processor that takes the enriched product descriptions and tokenizes them, separating each description into an ordered and weighted input stream. Because the input stream arrives in this structured format, the Data Categorization Service of the Data Genomics solution is able to make faster and more accurate classification assignments.
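As an illustration of what an ordered and weighted input stream might look like, the following Python sketch tokenizes a product description and attaches a weight to each token. The weighting scheme (earlier tokens weigh more) and the function name `normalize` are assumptions made for this example; the actual weighting used by the Data Normalization Service is not specified here.

```python
import re
from typing import List, Tuple


def normalize(description: str) -> List[Tuple[str, float]]:
    """Turn a product description into an ordered, weighted token stream.

    Assumed scheme: tokens keep their original order, and the weight
    decreases linearly from 1.0 for the first token.
    """
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    n = len(tokens)
    return [(token, (n - i) / n) for i, token in enumerate(tokens)]


# Hypothetical product record.
for token, weight in normalize("Glove, Exam, Nitrile"):
    print(f"{token}\t{weight:.2f}")
```

A downstream classifier could then multiply each token's contribution by its weight, so that the leading terms of a description influence the assignment more than trailing qualifiers.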

Copyright © 2009-2010 International Technology Group, Inc. All Rights Reserved.