This article provides an introductory tutorial for students and enthusiasts interested in applying pattern recognition in their work. It contains brief theory, mathematical details and some simple demos of pattern recognition techniques. It addresses a few pattern recognition problems, ranging from basic classification problems such as linearly separable datasets to complex data structures. We will look at different classifiers and their relative performance.
Pattern recognition is a field of engineering, sometimes classified as a sub-field of machine learning, whose goal is to replicate human recognition and classification skills using computer algorithms. Humans perform pattern recognition with great ease. Colour coding is one example: we can separate a set of coloured balls into different colour groups with no effort. Another fascinating example is our ability to recognize people we have seen only a few times or a long time ago. In engineering fields such as image processing and target identification, the task of the pattern recognition engineer is to identify different sections of an image based on a certain property, for example, to classify a land image into urban and rural areas. Recently, efforts to automate these processes have increased, especially due to the availability of computing power and advances in machine learning. Pattern recognition algorithms are usually known as classifiers. Two popular algorithms are the k-Nearest Neighbour classifier and k-Means clustering.
Pattern Recognition Process
The process consists of three major steps after data acquisition: preprocessing, feature extraction and classification. Datasets for pattern recognition can come from a wide range of sources, such as satellite sensor data, ground-based sensor data and medical images. Once the dataset is acquired, it is preprocessed so that it is suitable for the subsequent steps. The next step is feature extraction, in which the dataset is converted into a set of feature vectors that are meant to represent the original data. These features are used in the classification step to segregate the data points into different classes based on the problem.
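The three steps can be sketched as a chain of small functions. This is only a toy illustration, not a prescribed implementation: the function names, the mean-removal preprocessing, the single "energy" feature, the threshold of 1.0 and the class names "quiet" and "active" are all assumptions made for the example.

```python
def preprocess(signal):
    """Remove the mean so that a constant offset does not affect the features."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def extract_features(signal):
    """Reduce the signal to a one-element feature vector: its average energy."""
    energy = sum(x * x for x in signal) / len(signal)
    return [energy]


def classify(features, threshold=1.0):
    """Assign a class by comparing the energy feature with a fixed threshold."""
    return "active" if features[0] > threshold else "quiet"


# Two made-up 1-D signals: a nearly constant one and a strongly oscillating one.
signals = [
    [5.0, 5.1, 4.9, 5.0],
    [2.0, -2.0, 2.0, -2.0],
]

# Preprocessing, feature extraction and classification applied in sequence.
labels = [classify(extract_features(preprocess(s))) for s in signals]
print(labels)
```

In a real application each function would be replaced by a method suited to the data, but the order of the steps stays the same.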
One of the most common preprocessing steps in the field of pattern recognition is normalization to zero mean and unit variance, especially for 1-D datasets. In the field of remote sensing, the most common preprocessing step required is re-gridding, which basically means assigning a spatio-temporally uniform grid to raw data. In many image processing applications, it is desirable to have a uniform spatial grid for the pattern recognition process. However, satellite datasets usually have a non-uniform grid; this problem can be rectified by re-sampling the spatial data onto a uniform grid by either interpolation or averaging. Another common method is spatial interpolation, as most acquired datasets contain many missing data points. The problem of missing data points is well known in statistics and can be overcome using a range of techniques, from simple averaging to advanced spectral analysis methods.
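Two of these preprocessing steps can be sketched for a 1-D series as follows. This is a minimal sketch under stated assumptions: missing points are marked with None, gaps are filled by linear interpolation between the nearest valid neighbours (one simple choice among the many techniques mentioned above), and the helper names fill_gaps and zscore are hypothetical.

```python
import math


def fill_gaps(values):
    """Fill None entries by linear interpolation between valid neighbours.

    Gaps at either end are filled with the nearest valid value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series contains no valid data points")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def zscore(values):
    """Normalize a series to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]  # constant series: nothing to scale
    return [(v - mean) / std for v in values]


series = [1.0, None, 3.0, None, None, 6.0]  # made-up data with gaps
filled = fill_gaps(series)
normalized = zscore(filled)
print(filled)
print(normalized)
```

Filling the gaps before normalizing keeps the missing points from distorting the mean and variance estimates.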
The main goal of feature extraction is to reduce the data dimensionality and properly represent the original data in feature space. Features useful for the classification process can be simple, like RGB values in colour images, or complex, like energies from the Fourier Transform or Wavelet Transform of a time series. The feature extraction process usually consists of three steps: 1) feature construction, in which features are constructed from linear or non-linear combinations of raw features; 2) feature selection, which is done using techniques such as relevancy ranking of individual features; and 3) feature reduction, which is used to reduce the number of features, especially when too many features are selected compared to the number of feature vectors. None of these three steps is mandatory in the feature extraction process.
The classifier evaluates the features presented and makes the final decision.
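The three steps can be illustrated with a small sketch. Everything specific here is an assumption made for the example: the raw features a and b, the constructed features "sum" (linear) and "product" (non-linear), the made-up two-class data, and the use of a two-class Fisher-style score as the relevancy measure.

```python
def construct(sample):
    """Feature construction: add linear and non-linear combinations of raw features."""
    a, b = sample
    return {"a": a, "b": b, "sum": a + b, "product": a * b}


def fisher_score(values, labels):
    """Relevancy of one feature: squared gap between the two class means
    divided by the sum of the class variances (two classes assumed)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    first, second = groups.values()
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    var_1 = sum((v - mean_1) ** 2 for v in first) / len(first)
    var_2 = sum((v - mean_2) ** 2 for v in second) / len(second)
    return (mean_1 - mean_2) ** 2 / (var_1 + var_2 + 1e-12)


def rank_features(samples, labels):
    """Feature selection: rank the constructed features by relevancy, best first."""
    table = [construct(s) for s in samples]
    scores = {
        name: fisher_score([row[name] for row in table], labels)
        for name in table[0]
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up data: class 1 when a and b share a sign, class 0 otherwise.
samples = [(1.0, 1.1), (-0.9, -1.0), (1.1, 0.9),
           (1.0, -1.0), (-1.1, 0.9), (0.9, -1.1)]
labels = [1, 1, 1, 0, 0, 0]

ranking = rank_features(samples, labels)
selected = ranking[:1]  # feature reduction: keep only the top-ranked feature
print(ranking, selected)
```

In this toy dataset neither raw feature separates the classes well on its own, while the constructed product does, which is why it ranks first.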
For instance, let us consider a ring-like image with a cluster of points in the centre.
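A dataset of this kind can be sketched as follows, together with a simple k-Nearest Neighbour classifier. The ring radius, cluster radius, number of points, class names and the choice of k = 3 are all assumptions made for illustration. Note that a straight line cannot separate a central cluster from the ring that surrounds it, so this dataset is not linearly separable.

```python
import math
from collections import Counter


def make_ring_dataset(n_ring=12, ring_radius=3.0, n_centre=6, centre_radius=0.3):
    """Build labelled 2-D points: an outer ring and a small cluster near the origin."""
    points, labels = [], []
    for i in range(n_ring):
        angle = 2 * math.pi * i / n_ring
        points.append((ring_radius * math.cos(angle), ring_radius * math.sin(angle)))
        labels.append("ring")
    for i in range(n_centre):
        angle = 2 * math.pi * i / n_centre
        points.append((centre_radius * math.cos(angle), centre_radius * math.sin(angle)))
        labels.append("centre")
    return points, labels


def knn_predict(query, points, labels, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    def distance(index):
        px, py = points[index]
        return math.hypot(query[0] - px, query[1] - py)

    nearest = sorted(range(len(points)), key=distance)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


points, labels = make_ring_dataset()
near_centre = knn_predict((0.1, 0.1), points, labels)
near_ring = knn_predict((2.9, 0.3), points, labels)
print(near_centre, near_ring)
```

Because the vote depends only on nearby points, the k-Nearest Neighbour rule can follow the circular boundary between the two classes.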