
Data Mining on the Web

Web Techniques: Sidebar


Picks, Pans, & Dynamite Data-Mining Algorithms

The data-mining algorithms used on the Web fall into several general categories.

Neural networks work something like your brain. When you are shown patterns repeatedly, your brain eventually learns that certain patterns are associated with particular outcomes. The same idea can be applied to targeting, estimation, prediction, and knowledge management. Neural networks must be trained, which can take hours of CPU time. They don't adapt to new patterns until they are trained again, and they need to be carefully tuned by a human.
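To make the train-then-predict cycle concrete, here is a minimal sketch of a single perceptron, the simplest building block of a neural network, learning the logical AND pattern. Everything in it is an assumption chosen for illustration: the names `train_perceptron`, `predict`, and `AND_SAMPLES`, the toy data, and the `epochs` and `lr` settings, which stand in for the knobs a human would tune. A production network would have many units and far more data.

```python
# Hypothetical toy example: one perceptron learning logical AND.
# Each sample is (inputs, desired outcome).
AND_SAMPLES = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]


def predict(weights, bias, inputs):
    """Fire (return 1) when the weighted sum of the inputs is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Adjust weights a little each time a prediction is wrong.

    `epochs` and `lr` are tuning knobs picked by hand for this toy data.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                weights = [w + lr * error * x for w, x in zip(weights, inputs)]
                bias += lr * error
    return weights, bias


weights, bias = train_perceptron(AND_SAMPLES)
for inputs, target in AND_SAMPLES:
    print(inputs, "->", predict(weights, bias, inputs), "(wanted", target, ")")
```

Once training finishes, the weights are frozen: the model keeps giving the same answers until `train_perceptron` is run again on new data, which is the retraining cost described above.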

Collaborative filters organize profile data by person, then use this logic: People who have done the things you have done are good predictors of what you will do. In a sense, they are a restricted type of neural network, with the input data in a regular form. This restriction gives collaborative filters three great advantages: They adapt rapidly to new behavior patterns. They can predict for thousands of data points simultaneously. And they don't need to be tuned. This makes collaborative filters ideal for real-time personalization applications.
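The "people like you" logic can be sketched in a few lines. The example below is one simple variant (user-based filtering with Jaccard overlap as the similarity measure), not necessarily what any particular product uses. The profile data and the names `jaccard` and `recommend` are hypothetical.

```python
from collections import defaultdict

# Hypothetical profiles: each person maps to the set of topics they have viewed.
profiles = {
    "ann": {"perl", "cgi", "apache"},
    "bob": {"perl", "cgi", "mysql"},
    "cat": {"perl", "apache", "mysql", "php"},
    "dan": {"java", "servlets"},
}


def jaccard(a, b):
    """Overlap between two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(profiles, person, top_n=3):
    """Score items the person has not seen by the similarity of those who have."""
    mine = profiles[person]
    scores = defaultdict(float)
    for other, theirs in profiles.items():
        if other == person:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0:
            continue  # people with nothing in common contribute nothing
        for item in theirs - mine:
            scores[item] += similarity
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend(profiles, "ann"))  # -> ['mysql', 'php']
```

There is no training step in this sketch: adding an item to someone's profile changes the very next call to `recommend`, which illustrates why this family of methods suits real-time personalization.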

Bayesian networks build a directed graph of conditional probabilities. As a visitor provides more information about himself or herself, a Bayesian network updates the probability of each possible end result. This lets a Web system speed up the visit by bringing the most likely results to the visitor's attention as soon as possible. Bayesian networks are best suited to satisfying short-term visitor goals, such as answering customer support questions, diagnosing problems, or selecting an appliance. However, training a Bayesian network is often extremely slow. --DG
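As a rough illustration of how each answer shifts the probabilities, the sketch below uses the simplest possible network shape: one "problem" node with an arrow to each observation, so the observations are assumed independent given the problem. All probabilities, problem names, and the `update` function are invented for this example. In a real system the numbers would be learned from data, which is the slow training step.

```python
# Hypothetical support scenario: what is wrong with the visitor's connection?
# Prior belief before the visitor has answered anything.
PRIOR = {"bad_cable": 0.2, "wrong_password": 0.5, "isp_outage": 0.3}

# P(observation is true | problem). Invented numbers for illustration only.
LIKELIHOOD = {
    "modem_light_off": {"bad_cable": 0.9, "wrong_password": 0.05, "isp_outage": 0.3},
    "login_rejected": {"bad_cable": 0.1, "wrong_password": 0.95, "isp_outage": 0.1},
}


def update(belief, observation, value):
    """Apply Bayes' rule for one yes/no answer and renormalize."""
    unnormalized = {}
    for problem, probability in belief.items():
        p_true = LIKELIHOOD[observation][problem]
        unnormalized[problem] = probability * (p_true if value else 1.0 - p_true)
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("answers are impossible under every known problem")
    return {problem: p / total for problem, p in unnormalized.items()}


belief = dict(PRIOR)
print("start:", max(belief, key=belief.get))

belief = update(belief, "modem_light_off", True)
belief = update(belief, "login_rejected", False)
print("after two answers:", max(belief, key=belief.get))
for problem, probability in sorted(belief.items(), key=lambda kv: -kv[1]):
    print(f"  {problem}: {probability:.3f}")
```

Before any answers the most likely problem is `wrong_password`. After the two answers, `bad_cable` moves to the top, so a support page could show the cable troubleshooting steps first.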

