Information Processing and Novelty Detection

 

 

Information is the result of collecting, processing, and organizing data in a way that adds to the knowledge of the receiver. What counts as information therefore depends on the context in which the data is taken.

 

Information as a concept bears a diversity of meanings and is not the same as data. The concept of information is closely related to notions of constraint, communication, control, form, instruction, knowledge, meaning, stimulus, pattern, perception, and representation.

 

Information has a well-defined meaning in physics. One example is the phenomenon of quantum entanglement. Here measurements on entangled particles are correlated regardless of their separation, seemingly without reference to the speed of light. Information itself, however, cannot travel faster than light, even if it is transmitted indirectly: the correlations alone cannot be used to send a message. One suggested consequence is that any attempt to physically observe a particle that has an "entangled" relationship to another is slowed down, even though the particles are connected only by the information they carry.

 

Although data is used as a synonym for information in everyday language, this is not the case in the exact sciences, which draw a clear distinction between the two. Data is a set of measurements that may be disorganized; once the data is organized, it becomes information. Data may relate to reality, but also to fiction. In the former case it consists of propositions, e.g. classes of measurements or observations of a variable. Such propositions may comprise numbers, images, etc.

 

Novelty detection is the identification of new or unknown data. It is closely related to change detection. However, change detection is a more general way of determining when a discrete change occurs in a given sequence of data points. Change detection is often used in data mining, statistics, and dynamic programming.
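
One common way to detect a discrete change in a sequence is a cumulative sum (CUSUM) test. The following is a minimal sketch, not a method taken from this text: the function name cusum_change_point and the parameters baseline_mean, drift and threshold are illustrative assumptions, and the baseline mean is assumed to be known in advance.

```python
def cusum_change_point(values, baseline_mean, drift=0.5, threshold=5.0):
    """Return the first index where a one-sided CUSUM statistic exceeds
    the threshold, or None if no upward change is detected.

    drift and threshold are illustrative tuning values (assumptions).
    """
    s = 0.0
    for i, x in enumerate(values):
        # Accumulate deviations above the baseline; never drop below zero.
        s = max(0.0, s + (x - baseline_mean - drift))
        if s > threshold:
            return i
    return None


# Synthetic sequence: the level jumps from 0 to 2 at index 20.
series = [0.0] * 20 + [2.0] * 20
print(cusum_change_point(series, baseline_mean=0.0))  # → 23
```

The change is reported a few points after it occurs, because the statistic needs several samples above the baseline before it crosses the threshold. A lower threshold reacts faster but raises more false alarms on noisy data.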

 

Finding a needle in a haystack? There are many ways and means. One may be the central limit theorems (a set of weak-convergence results in probability theory). They all express the fact that a sum of many independent and identically distributed random variables tends to be distributed according to a particular "attractor distribution". The most famous central limit theorem states that if the variables have a finite variance, then their sum will be approximately normally distributed. This is demonstrated in a draft paper by Elena Wildner.
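
The effect of the central limit theorem can be checked numerically. The sketch below is an illustration only and is not taken from the draft paper mentioned above. It sums uniform random numbers, standardizes the sums, and measures how many fall within one and three standard deviations. The helper names, the choice of the uniform distribution, and the sample sizes are assumptions.

```python
import math
import random


def standardized_sums(n_terms, n_trials, rng):
    """Sum n_terms uniform(0, 1) draws, n_trials times, and standardize
    each sum using the known mean (1/2) and variance (1/12) of one draw."""
    mu = 0.5
    sigma = math.sqrt(1.0 / 12.0)
    results = []
    for _ in range(n_trials):
        total = sum(rng.random() for _ in range(n_terms))
        results.append((total - n_terms * mu) / (sigma * math.sqrt(n_terms)))
    return results


def fraction_within(zs, k):
    """Fraction of standardized values with absolute value at most k."""
    return sum(1 for z in zs if abs(z) <= k) / len(zs)


rng = random.Random(0)  # fixed seed so the run is repeatable
zs = standardized_sums(n_terms=30, n_trials=5000, rng=rng)

# For a normal distribution these fractions are about 0.68 and 0.997.
print(fraction_within(zs, 1.0))
print(fraction_within(zs, 3.0))
```

If the sums are approximately normal, a value more than about three standard deviations from the mean is rare, which is why such a threshold is a common, if crude, starting point for flagging a data point as novel.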

 

Other data mining methods include applications of neural networks, fuzzy logic, etc. These are often used in combination with feature extraction algorithms such as principal component analysis, various clustering methods, wavelet transforms, the Hilbert-Huang transform, etc.
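
As an illustration of feature extraction, the direction of largest variance in a data set (the first principal component) can be estimated with power iteration on the covariance matrix. This is a minimal sketch under stated assumptions: it finds only the leading component rather than performing a full principal component analysis, and the function name and the synthetic data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the leading eigenvector of the sample covariance matrix
    of rows (a list of equal-length numeric tuples) by power iteration."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1)
            for b in range(dim)] for a in range(dim)]
    v = [1.0] * dim  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Synthetic points scattered closely around the line y = 2x.
points = [(float(x), 2.0 * x + noise)
          for x, noise in zip(range(10), [0.1, -0.1] * 5)]
pc = first_principal_component(points)
print(pc[1] / pc[0])  # slope of the dominant direction, close to 2
```

Projecting each point onto this direction reduces the two measured variables to a single feature that retains most of the variance, which can then be passed to a clustering or novelty detection step.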