Traditionally, organizations use data tactically – to manage operations. For a competitive edge, strong organizations use data strategically – to expand the business, to improve profitability, to reduce costs, and to market more effectively. Data mining (DM) creates information assets that an organization can leverage to achieve these strategic objectives.
In this article, we address some of the key questions executives have about data mining. These include:
- What is data mining?
- What can it do for my organization?
- How can my organization get started?
Business Definition of Data Mining
Data mining is a new component in an enterprise's decision support system (DSS) architecture. It complements and interlocks with other DSS capabilities such as query and reporting, on-line analytical processing (OLAP), data visualization, and traditional statistical analysis. These other DSS technologies are generally retrospective. They provide reports, tables, and graphs of what happened in the past. A user who knows what she's looking for can answer specific questions like: "How many new accounts were opened in the Midwest region last quarter," "Which stores had the largest change in revenues compared to the same month last year," or "Did we meet our goal of a ten-percent increase in holiday sales? "
We define data mining as "the data-driven discovery and modeling of hidden patterns in large volumes of data." Data mining differs from the retrospective technologies above because it produces models – models that capture and represent the hidden patterns in the data. With it, a user can discover patterns and build models automatically, without knowing exactly what she's looking for. The models are both descriptive and prospective. They address why things happened and what is likely to happen next. A user can pose "what-if" questions to a data-mining model that can not be queried directly from the database or warehouse. Examples include: "What is the expected lifetime value of every customer account," "Which customers are likely to open a money market account," or "Will this customer cancel our service if we introduce fees?"
The information technologies associated with DM are neural networks, genetic algorithms, fuzzy logic, and rule induction. It is outside the scope of this article to elaborate on all of these technologies. Instead, we will focus on business needs and how data mining solutions for these needs can translate into dollars.
Mapping Business Needs to Solutions and Profits
What can data mining do for your organization? In the introduction, we described some strategic opportunities for an organization to use data for advantage: business expansion, profitability, cost reduction, and sales and marketing. Let's consider these opportunities very concretely through several examples where companies successfully applied DM.
Expanding your business: Keystone Financial of Williamsport, PA, wanted to expand their customer base and attract new accounts through a LoanCheck offer. To initiate a loan, a recipient just had to go to a Keystone branch and cash the LoanCheck. Keystone introduced the $ 5000 LoanCheck by mailing a promotion to existing customers.
The Keystone database tracks about 300 characteristics for each customer. These characteristics include whether the person had already opened loans in the past two years, the number of active credit cards, the balance levels on those cards, and finally whether or not they responded to the $ 5000 LoanCheck offer. Keystone used data mining to sift through the 300 customer characteristics, find the most significant ones, and build a model of response to the LoanCheck offer. Then, they applied the model to a list of 400,000 prospects obtained from a credit bureau.
By selectively mailing to the best-rated prospects determined by the DM model, Keystone generated $ 1.6M in additional net income from 12,000 new customers.
Reducing costs: Empire Blue Cross / Blue Shield is New York State's largest health insurer. To compete with other healthcare companies, Empire must provide quality service and minimize costs. Attacking costs in the form of fraud and abuse is a cornerstone of Empire's strategy, and it requires considerable investigative skill as well as sophisticated information technology.
The litter includes a data mining application that profiles each physician in the Empire network based on patient claim records in their database. From the profile, the application detects minor deviations in physician behavior relative to her / his peer group. These deviations are reported to fraud investigators as a "suspicion index." A physician who performs a high number of procedures per visit, charges 40% more per patient, or sees many patients on the weekend would be flagged immediately from the suspicion index score.
What has this DM effort returned to Empire? In the first three years, they realized fraud-and-abuse savings of $ 29M, $ 36M, and $ 39M respectively.
Improving sales efficiency and profitability: Pharmaceutical sales representatives have a broad assortment of tools for promoting products to doctors. These tools include clinical literature, product samples, dinner meetings, teleconferences, golf outings, and more. Knowing which promotions will be most effective with which doctors is extremely valuable since wrong decisions can cost the company hundreds of dollars for the sales call and even more in lost revenue.
The reps for a large pharmaceutical company collectively make tens of thousands of sales calls. One drug maker linked six months of promotional activity with corresponding sales figures in a database, which they then used to build a predictive model for each doctor. The data-mining models revealed, for instance, that among six different promotional alternatives, only two had a significant impact on the prescribing behavior of physicians. Using all the knowledge embedded in the data-mining models, the promotional mix for each doctor was customized to maximizeize ROI.
Although this new program was rolled out just recently, early responses indicate that the drug maker will exceed the $ 1.4M sales increase originally planned. Given that this increase is generated with no new promotional spending, profits are expected to increase by a similar amount.
Looking back at this set of examples, we must ask, "Why was data mining necessary?" For Keystone, response to the loan offer did not exist in the new credit bureau database of 400,000 potential customers. The model predicted the response given the other available customer characteristics. For Empire, the suspicion index quantified the differences between physician practices and peer (model) behavior. Appropriate physicist behavior was a multi-variable aggregate produced by data mining – once again, not available in the database. For the drug maker, the promotion and sales databases contained the historical record of activity. An automated data mining method was necessary to model each doctor and determine the best combination of promotions to increase future sales.
In each case presented above, data mining yielded significant benefits to the business. Some were top-line results that increased revenues or expanded the customer base. Others were bottom-line improvements resulting from cost-savings and enhanced productivity. The natural next question is, "How can my organization get started and begin to realize the competitive advantages of DM?"
In our experience, pilot projects are the most successful vehicles for introducing data mining. A pilot project is a short, well-planned effort to bring DM into an organization. Good pilot projects focus on one very specific business need, and they involve business users up front and through the project. The duration of a typical pilot project is one to three months, and it generally requires 4 to 10 people part-time.
The role of the executive in such pilot projects is two-pronged. At the outside, the executive participates in setting the strategic goals and objectives for the project. During the project and prior to roll out, the executive takes part by supervising the measurement and evaluation of results. Lack of executive sponsorship and failure to involve business users are two primary reasons DM initiatives stall or fall short.
In reading this article, sometimes you've developed a vision and want to proceed – to address a pressing business problem by sponsoring a data mining pilot project. Twisting the old adage, we say "just because you should not mean you can." Be aware that a capacity assessment needs to be an integral component of a DM pilot project. The assessment takes a critical look at data and data access, personnel and their skills, equipment, and software. Organizations typically underestimate the impact of data mining (and information technology in general) on their people, their processes, and their corporate culture. The pilot project provides a reliably high-reward, low-cost, and low-risk opportunity to quantify the potential impact of DM.
Another stumbling block for an organization is deciding to defer any data mining activity until a data warehouse is built. Our experience indicates that, oftentimes, DM could and should come first. The purpose of the data warehouse is to provide users the opportunity to study customer and market behavior both retrospectively and prospectively. A data mining pilot project can provide important insight into the fields and aggregates that need to be designed into the warehouse to make it really valuable. Further, the cost savings or revenue generation provided by DM can provide bootstrap funding for a data warehouse or related initiatives.
Recapping, in this article we addressed the key questions executives have about data mining – what it is, what the benefits are, and how to get started. Armed with this knowledge, begin with a pilot project. From there, you can continue building the data mining capacity in your organization; to expand your business, improve profitability, reduce costs, and market your products more effectively.
Copyright Discovery Corps, Inc., 2011