The idea of machine learning is to mimic the learning process of human beings, i.e., gaining knowledge through experience. Machine learning algorithms allow machines to generalize rules from empirical data, and, based on the learned rules, make predictions for future data. The Machine Learning Toolkit (MLT) provides various machine learning algorithms in LabVIEW. It is a powerful tool for problems such as visualization of high-dimensional data, pattern recognition, function regression and cluster identification.
The Machine Learning Toolkit includes the following features.
2.1 Unsupervised Learning Algorithms
Unsupervised learning refers to the problem of revealing hidden structure in unlabeled data. Since the data are unlabeled, no error signal is fed back to the learner. This distinguishes unsupervised learning from supervised learning.
Clustering is one of the main approaches to unsupervised learning. Clustering assigns class memberships to a set of objects so that similar objects fall into the same class and dissimilar objects fall into different classes. Each class often represents a meaningful pattern in the problem at hand, so clustering is useful for identifying distinct patterns in data. For example, in image processing, clustering can be used to divide a digital image into distinct regions for border detection or object recognition.
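The assignment of similar objects to the same class can be illustrated with a minimal k-means sketch. It is written in Python rather than LabVIEW, and it is not the MLT's own implementation: the function name k_means, the choice of the first k points as initial centroids, and the sample data are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def k_means(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups.

    Returns (labels, centroids). Hypothetical helper, not an MLT function.
    """
    # Assumption: use the first k points as initial centroids so the
    # result is deterministic. Real implementations usually randomise this.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids


# Two visually separate groups of 2-D points (illustrative data).
points = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1),
          (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied to the function: the grouping emerges only from distances between points, which is what makes the procedure unsupervised.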
Supervised learning refers to the generalization of the relationship (function) between input data and their corresponding outputs (labels). The relationship is learned from a training set of examples, each of which is a pair consisting of an input and its desired output. During training, the error between the actual and the desired outputs is repeatedly fed back into the system to tune the system parameters according to a learning rule. After training, the performance of the learned relationship should be evaluated on a test set of examples that is separate from the training set.
Supervised learning is useful for pattern recognition, function regression, etc. One example application is the recognition of handwritten digits. A supervised classifier can be trained on a collection of handwritten digits, each labeled with the true digit it represents. Once validated on a separate test set, the trained classifier can be used for fast and accurate recognition of new handwritten digits.
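The training procedure described above (error fed back through a learning rule, followed by evaluation on a separate test set) can be sketched with a single perceptron. This is an illustrative Python example under stated assumptions, not the MLT's implementation: the perceptron learning rule, the function names, the learning rate, and the tiny data sets are all chosen here for demonstration.

```python
def predict(weights, bias, x):
    """Return class 1 if the weighted sum is positive, otherwise class 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(examples, learning_rate=0.1, epochs=10):
    """Train on (input, desired_output) pairs with the perceptron rule."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, desired in examples:
            # Error signal: desired output minus actual output.
            error = desired - predict(weights, bias, x)
            # Learning rule: move the parameters in proportion to the error.
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Illustrative, linearly separable data: (input, label) pairs.
train_set = [((2.0, 1.0), 1), ((3.0, 2.0), 1),
             ((-1.0, -2.0), 0), ((-2.0, -1.0), 0)]
# The test set is kept separate from the training set.
test_set = [((1.5, 2.5), 1), ((-3.0, -0.5), 0)]

weights, bias = train_perceptron(train_set)
correct = sum(predict(weights, bias, x) == label for x, label in test_set)
accuracy = correct / len(test_set)
print(f"test accuracy: {accuracy:.2f}")
```

Accuracy is computed only on examples the classifier never saw during training, which is the point of holding out a test set.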
Dimension reduction refers to the process of reducing the number of dimensions of the data. The projection of the data set into the reduced space should preserve certain important characteristics of the data. In some cases, data analysis such as clustering can be done more easily and accurately in the reduced space than in the original space. One prime application of dimension reduction is face recognition, where face images represented by a large number of pixels are projected to a more manageable low-dimensional feature space before classification.
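As one concrete instance of projecting data into a reduced space, the sketch below computes the first principal component of a small data set and projects each point onto it. Principal component analysis is used here only as a familiar example; the text above does not say which methods the MLT uses. The function name, the power-iteration approach, and the data are assumptions for illustration.

```python
from math import sqrt


def first_principal_component(data, iterations=100):
    """Return (direction, projected) for the first principal component.

    Hypothetical helper: uses power iteration on the covariance matrix.
    """
    n, d = len(data), len(data[0])
    # Centre the data so the component describes variance, not the offset.
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1)
            for j in range(d)] for i in range(d)]
    # Power iteration: repeated multiplication converges to the
    # eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            # The start vector is orthogonal to all variance (or the data
            # are constant); a different start vector would be needed.
            raise ValueError("power iteration failed for this start vector")
        v = [x / norm for x in w]
    # Project each centred point onto the component: d values become 1.
    projected = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projected


# Illustrative 2-D data lying along the line y = x.
data = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
direction, projected = first_principal_component(data)
print(direction)
print(projected)
```

Each 2-D point is replaced by a single coordinate along the direction of greatest variance, so the ordering and spacing of the points survive the reduction.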
The MLT provides validation and visualization utilities for monitoring the quality of learning. The utilities fall into three categories: cluster validity indices, evaluation of classification, and visualization of learned results. The functions in each category are listed below.