Similarity-based clustering
Similarity-based learning methods have great potential as an intuitive and flexible toolbox for mining, visualization, and inspection of large data sets. They combine simple and human-understandable principles, such as distance-based classification, prototypes, or Hebbian learning, with a large variety of different, problem-adapted design choices, such as a data-optimum topology, similarity measure, or learning mode. In medicine, biology, and medical bioinformatics, more and more data arise from clinical measurements such as EEG or fMRI studies for monitoring brain activity, mass spectrometry data for the detection of proteins, peptides, and composites, or microarray profiles for the analysis of gene expressions. Typically, data are high-dimensional, noisy, and very hard to inspect using classic (e.g., symbolic or linear) methods. At the same time, new technologies ranging from the possibility of a very high resolution of spectra to high-throughput screening for microarray data are rapidly developing and carry the promise of an efficient, cheap, and automatic gathering of tons of high-quality data with large information potential. Thus, there is a need for appropriate machine learning methods which help to automatically extract and interpret the relevant parts of this information and which, eventually, help to enable understanding of biological systems, reliable diagnosis of faults, and therapy of diseases such as cancer based on this information. Moreover, these application scenarios pose fundamental and qualitatively new challenges to the learning systems because of the specifics of the data and learning tasks. Since these characteristics are particularly pronounced within the medical domain, but not limited to it and of principled interest, this research topic opens the way toward important new directions of algorithmic design and accompanying theory.