Some of the More Common Techniques for Data Mining Analysis

Data mining is the process of discovering relationships in large data sets. It is an area of computer science that has received a large amount of commercial interest. In this article, I'll detail a few of the most common techniques of data mining analysis.

Association rule discovery: Association rule discovery techniques are used to extract associations from data sets. Historically, the strategy was developed on supermarket purchase data. An association rule is a rule of the form X -> Y. An instance of this may be "If a buyer purchases milk, this implies ( -> ) that the buyer will also purchase bread". An association rule has associated with it support and confidence values. The support is the proportion of all entries (or transactions in this example) that have all of the items. For example, the proportion of all purchases in which both milk and bread were purchased. The confidence is the proportion of the transactions that satisfy the left side of the rule that also satisfy the right side of the rule. For instance, in this situation, the confidence would be the proportion of purchases that included milk which also included bread. Association discovery techniques will extract all association rules from a data set that meet a minimum support and confidence stipulated by the user.

Cluster analysis: Cluster analysis is the process of taking several numeric fields and assigning clusters to their values. These clusters represent groups of points which are close to one another. For instance, if you watch a documentary on space, you will see that galaxies contain a large number of stars and planets. There are many galaxies in space; however, the stars and planets all occur in clusters that are the galaxies. That is, the stars and planets are not at random locations in space but are clumped together in groups that are galaxies. A cluster analysis technique is used to find these types of groups. If a cluster analysis strategy was applied to the stars in space, it could find that each galaxy is a cluster and assign a unique cluster identification to each star in a specified galaxy. This cluster identification then becomes another field in the data set and may be employed in further data mining analysis. For instance, you may use a cluster id field to form association rules to some other fields in the data set.

Decision trees: Decision trees are used to form a tree of decisions in a data set to help forecast a value associated with that data. For example, if you were looking at a data set that was employed to predict whether a potential loan applicant would be a credit risk, a tree of decisions would be formed based on factors in the data set. The tree may contain decisions like whether the applicant had defaulted on a loan before, the age of the applicant, whether the applicant was employed or not, the applicant's earnings and the total repayments on the loan. You could then follow this tree of decisions to say, for example, if an applicant has never defaulted on a loan before, the applicant is employed, their earnings are in the top 15 percent for the country and the loan amount is relatively low, then there's an extremely low risk of default.

These are some of the more common methods for data mining analysis amongst a large group of data mining methods that are frequently applied to analyzing large data sets. These strategies have proved valuable to gather helpful information and relationships from data sets that may otherwise be too large to analyse well.

The author owns a number of websites that provide financial loan calculators including this refinancing calculator, this amortization calculator and this boat loan calculator.
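
As a small illustration of the support and confidence values described under association rule discovery, the following Python sketch computes both for the rule milk -> bread. The five shopping baskets and the function names `support` and `confidence` are made up for this example and are not taken from any particular data mining library.

```python
# Hypothetical shopping baskets; each set is one transaction.
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread", "jam"},
    {"milk", "bread", "butter"},
]


def support(transactions, items):
    """Proportion of all transactions that contain every item in `items`."""
    items = set(items)
    matching = sum(1 for t in transactions if items <= t)
    return matching / len(transactions)


def confidence(transactions, left, right):
    """Of the transactions containing `left`, the proportion that also contain `right`."""
    left, right = set(left), set(right)
    left_count = sum(1 for t in transactions if left <= t)
    if left_count == 0:
        # The left side never occurs, so the rule has no evidence either way.
        return 0.0
    both_count = sum(1 for t in transactions if (left | right) <= t)
    return both_count / left_count


# Rule: milk -> bread
rule_support = support(transactions, {"milk", "bread"})
rule_confidence = confidence(transactions, {"milk"}, {"bread"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

With these made-up baskets, milk and bread appear together in 3 of the 5 transactions, and 3 of the 4 transactions containing milk also contain bread. A rule discovery technique would keep this rule only if both values met the minimums chosen by the user.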