Picks, Pans, & Dynamite Data-Mining Algorithms

The data-mining algorithms used on the Web fall into several general categories.

Neural networks work something like your brain. When patterns are presented to you, your brain eventually figures out that certain patterns are associated with other desired outcomes. This can be applied to targeting, estimation, prediction, and knowledge management. Neural networks must be trained, sometimes taking hours of CPU time. They don't adapt to new patterns until trained again, and they need to be carefully tuned by a human.

Collaborative filters organize profile data by person, then use this logic: People who have done things you have done are good predictors for what you will do. In a sense, they are a restricted type of neural network, with the input data in a regular form. This restriction gives collaborative filters three great advantages: They adapt rapidly to new behavior patterns. They can predict for thousands of data points simultaneously. And they don't need to be tuned. This makes collaborative filters ideal for realtime personalization applications.

Bayesian networks build a directed graph of conditional probabilities. As a visitor provides more information about himself or herself, a Bayesian network adjusts the probabilities of each possible end result. This allows a Web system to accelerate the visitor's experience by bringing the most likely things to the visitor's attention as soon as possible. Bayesian networks are most appropriate to help satisfy short-term visitor goals, such as answering customer support questions, diagnosing problems, or selecting an appliance. However, training a Bayesian network is often extremely slow. --DG