Research at Aware


Since 2005, I have been working at a great company inventing, developing, implementing and directing research in core algorithms for biometric technologies. The common thread in all of our product development is quality. I mean this from both a strategic company perspective and, more importantly and fundamentally, an epistemological perspective with respect to image analysis. Matchability as a defining criterion for quality is only one of many possible criteria. Quality is ultimately undefinable - it is a value judgement that emerges from the relationship between the observer and the object being observed. Consequently, context and application are critical to defining and ultimately measuring quality as it pertains to a given aspect of biometric pattern analysis. This concept and perspective are pervasive in our approach to the work we do. Among the many projects I have been involved in are: the development of facial analysis software; facial, fingerprint and iris quality analysis; facial, fingerprint and iris matching; multi-modal fusion; hand segmentation; entity resolution; text analytics; and many more. Please visit the Aware website for a detailed description of some of the work we do.










Biometric Perturbations for Classifier Enhancement


Classifiers often require significant training time and resources. They are also at the mercy of the quality of the input pattern. In many real-time applications, there can be multiple potential input variations, sampled from an unknown but constant distribution. Choosing an optimal input sample for classification can drastically increase overall classifier performance. To this end, I have developed, in collaboration with Terrance Boult (described here), a novel technique for improving classifier performance by learning a classifier's behavior on perturbed training data, and using that information to select the best perturbed test pattern for optimal classification. The technique is demonstrated for face recognition classifiers, but may be generally applicable. Previous studies have shown the critical importance of eye localization in face recognition algorithms. A support vector machine is used to learn 'good' and 'bad' wavelet transforms of similarity score distributions from an analysis of the gallery using perturbed input eye locations. In production, input face images with a high likelihood of having been incorrectly matched are then reprocessed using perturbed eye coordinate inputs, and the best results used to "correct" the initial classification. Preliminary results were encouraging, and suggest a more general approach in which input perturbations are used to increase classifier performance.
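
The selection step can be sketched as follows. This is a minimal illustration, not the actual system: `match_score` here is a hypothetical stand-in for a real face matcher, and in the real technique an SVM trained on wavelet-transformed similarity score distributions decides which probes warrant reprocessing.

```python
import numpy as np

rng = np.random.default_rng(0)

def match_score(image, eye_coords):
    # Stand-in for a real face matcher: a toy score that peaks when the
    # (hypothetical) true eye locations are supplied.
    true_eyes = np.array([30.0, 70.0])
    return float(np.exp(-np.sum((np.asarray(eye_coords) - true_eyes) ** 2) / 50.0))

def best_perturbed_match(image, initial_eyes, n_perturbations=50, sigma=3.0):
    """Re-score a probe under jittered eye coordinates and keep the best result."""
    best = (match_score(image, initial_eyes), tuple(initial_eyes))
    for _ in range(n_perturbations):
        jittered = np.asarray(initial_eyes) + rng.normal(0.0, sigma, size=2)
        score = match_score(image, jittered)
        if score > best[0]:
            best = (score, tuple(jittered))
    return best
```

Because the initial estimate is kept as a candidate, the reprocessed score can never be worse than the original match.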







Eye Localization


Face-based biometrics have been growing in importance as researchers continue to seek more advanced algorithms to improve face system performance, especially under difficult conditions. Most face recognition systems rely on accurate detection of facial features either as input to the classifier directly, or more commonly for the purpose of normalizing images. The unique nature of the human face makes eye localization a critical component of any face recognition system. However, very little empirical analysis of the effect of eye localization on face recognition accuracy has been done. Some of the work I did at the VAST Lab with Terrance Boult sought to address this.

While many people would expect eye localization to have an impact on recognition accuracy, we quantified that impact and showed it to be significant. Even with ideal data, eye localization error has a significant effect on the recognition accuracy of several different face recognition algorithms. The response of various face recognition algorithms to eye perturbations was found to be similar, despite significant differences in algorithm design, suggesting that our observations are relevant to many face analysis systems.

Interestingly, we found a significant difference between the effect of eye separation errors and the effect of eye location errors on recognition accuracy. The effect of eye localization appears to be more dramatic when errors are made in eye separation distance versus errors made in identifying the actual locations of eyes. Our results suggest that reliance on eye separation for scaling and normalization may be precarious, especially in instances where face images are degraded or taken in more realistic environments. Since much of later face processing relies on proper scaling, our results indicate the need for additional "backup" methods of scaling, which may either complement the use of eye separation or supersede it.

Our results also indicate that blurring/noise from a real imaging system may in fact help to make algorithms less sensitive to errors in eye location. The working hypothesis is that blurring seems to mitigate the effect of eye localization, making algorithms more stable with respect to eye perturbations: blurring may impact eye localization, but image smoothing mitigates the impact of such errors. Experimental results are available in our paper The Eyes Have It.

 








CAPTCHAs


In 1950, Alan Turing proposed a test for validating the existence of an artificially intelligent machine. Today, we are still far from achieving anything even remotely resembling such a machine. That fact enables the development of security features for the web that may be used to prevent machines from accessing information that we want only humans to access. CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart". As part of this effort, in collaboration with Henry Baird, I developed a novel CAPTCHA called ScatterType which promises to be resistant to automated character-segmentation attacks. Each ScatterType challenge is a pseudorandomly synthesized image of a word rendered in a machine-print typeface, whose characters are fragmented using horizontal and vertical cuts, then scattered in random directions by vertical and horizontal displacements. We are in the process of exploring the parameter space of pseudorandomly chosen typefaces, cut positions, and displacements, and have tentatively identified a regime in which the words remain legible to human readers but resist automatic segmentation into characters, and thus are well defended against one of the most effective attacks on reading CAPTCHAs. The human legibility of ScatterType seems to depend on Gestalt perception abilities, which are assisted by style-consistent similarities among primitive features of character fragments. As in the BaffleText CAPTCHA, English-like but unspellable words are used to defend against dictionary-based attacks. Unlike the PessimalPrint and BaffleText CAPTCHAs, among many others, no image degradations or occlusions other than fragmentation are employed. The tunability of the approach may also lead to detection of automated attacks as an added layer in web security.
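
The fragmentation step can be illustrated on a binary glyph image. This is only a sketch of the cut-and-scatter idea, not the production ScatterType renderer; the cut counts and displacement ranges here are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def scatter_type(glyph, n_cuts=2, max_shift=3):
    """Fragment a binary glyph image with horizontal and vertical cuts,
    then displace each fragment by a small pseudorandom offset."""
    h, w = glyph.shape
    # Pseudorandom cut positions along each axis (0 and h/w close the grid).
    row_cuts = [0] + sorted(rng.integers(1, h, size=n_cuts).tolist()) + [h]
    col_cuts = [0] + sorted(rng.integers(1, w, size=n_cuts).tolist()) + [w]
    pad = max_shift  # padding keeps displaced fragments inside the canvas
    out = np.zeros((h + 2 * pad, w + 2 * pad), dtype=glyph.dtype)
    for r0, r1 in zip(row_cuts, row_cuts[1:]):
        for c0, c1 in zip(col_cuts, col_cuts[1:]):
            dr, dc = rng.integers(-max_shift, max_shift + 1, size=2)
            out[r0 + pad + dr : r1 + pad + dr,
                c0 + pad + dc : c1 + pad + dc] |= glyph[r0:r1, c0:c1]
    return out
```

Tuning `n_cuts` and `max_shift` is a crude analogue of the legibility-versus-segmentability regime discussed above.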







The Collective Learning Genetic Algorithm (CLGA)


The Collective Learning Genetic Algorithm (CLGA) not only works, but demonstrates a new paradigm for evolutionary computation that may offer a more information-theoretic perspective on genetic computation. The mechanism for recombination is new, of course, but the idea behind it embodies a long-standing intuition in the EC community: that evolutionary computations should be adaptive, and should somehow learn from the fitness landscape they search.

The CLGA employs genotypic learning to do intelligent recombination based on a cooperative exchange of knowledge between interacting chromosomes. A CLGA consists of a population of adaptive learning agents each of which consists of two components: an adaptive learning automaton and a chromosome string representing the best solution the agent has found so far. Each agent in the population observes a unique set of features in the chromosomes of the other agents it interacts with in order to explicitly estimate the average observed fitnesses of k-order permutations of bits (where k is some small number) in the population, and to use that information to guide recombination. Agents collaborate and exchange knowledge instead of symbols, which they use to modify their own chromosome strings in a Lamarckian fashion, essentially replacing the random crossover operator with a consensus of information based on the permutations of observed symbols that yield the best string fitness. As in standard genetic algorithms, the stages of evolution are still controlled by a global algorithm, but most of the control in the CLGA is distributed among agents that are individually responsible for recombination and selection of surviving offspring.
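
A highly simplified sketch of the core idea follows: agents accumulate average-fitness statistics for k-order (here pairwise, k=2) bit patterns observed in other agents' chromosomes, and recombination becomes a consensus over those statistics rather than random crossover, with a Lamarckian accept-if-not-worse update. The toy OneMax landscape, population sizes, and bookkeeping are all invented for illustration; the real CLGA's learning automata and selection mechanics are far richer.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)

def fitness(bits):
    return int(bits.sum())  # OneMax: a simple test landscape

class Agent:
    def __init__(self, n_bits, k=2):
        self.chrom = rng.integers(0, 2, size=n_bits)
        self.pairs = list(combinations(range(n_bits), k))
        # Running fitness statistics for each of the 4 patterns at each pair.
        self.sums = {p: np.zeros(4) for p in self.pairs}
        self.counts = {p: np.zeros(4) for p in self.pairs}

    def observe(self, other):
        f = fitness(other.chrom)
        for p in self.pairs:
            idx = 2 * other.chrom[p[0]] + other.chrom[p[1]]
            self.sums[p][idx] += f
            self.counts[p][idx] += 1

    def recombine(self):
        # Replace random crossover with a consensus of observed statistics:
        # at each pair, adopt the pattern with the best average fitness.
        trial = self.chrom.copy()
        for p in self.pairs:
            seen = self.counts[p] > 0
            if seen.any():
                avg = np.where(seen, self.sums[p] / np.maximum(self.counts[p], 1), -np.inf)
                best = int(avg.argmax())
                trial[p[0]], trial[p[1]] = best >> 1, best & 1
        if fitness(trial) >= fitness(self.chrom):  # Lamarckian update
            self.chrom = trial

def clga(n_agents=12, n_bits=10, generations=25):
    agents = [Agent(n_bits) for _ in range(n_agents)]
    best_history = []
    for _ in range(generations):
        for a in agents:
            for b in agents:
                if a is not b:
                    a.observe(b)
            a.recombine()
        best_history.append(max(fitness(a.chrom) for a in agents))
    return best_history
```

Note that, as in the paragraph above, there is no random mutation: all variation comes from the accumulated pattern statistics, and each agent's fitness is monotone non-decreasing because of the Lamarckian accept rule.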

Note that no random mutation is used. The earliest incarnation of the CLGA did use random mutation [Riopka and Bock, 2000], but later experiments showed that random mutation actually degrades performance, interfering with the actual mechanism of CLGA recombination. There is strong evidence to suggest that the success of the CLGA is not due to a capacity to do linkage learning (as was first thought), but due to the CLGA's high resistance to convergence and its ability to modify its recombinative behavior based on the consistency of the information in its environment, specifically, the observed fitness landscape. Contrary to other self-adaptive algorithms that seek to explicitly alter algorithm behavior as evolution proceeds, the CLGA seems to adapt its behavior naturally, as a consequence of the mechanism used to do recombination. See my paper Adapting to Complexity During Search in Combinatorial Landscapes on this topic. Click here to see a screen capture of the algorithm in progress.

 

What Inspired the CLGA?

Although I refer to the CLGA as a genetic algorithm, it could also be called a cultural algorithm, because the primary mechanism of the CLGA was actually inspired by human behavior. In a course on cognition, I developed, with a colleague, a human simulation of a genetic algorithm to accomplish several things: 1. to obtain experience designing and performing a pilot experiment for human-factors experiments, 2. to design an interactive method of teaching students about genetic algorithms, and lastly (quite ambitiously) 3. to determine how the types of information exchange and the type of data exchanged between chromosomes (in this case humans) affected the efficiency of the genetic algorithm. Obviously, the small number of trials we could run (involving about 10 populations of 10 students - paid handsomely with pizza and the promise of honourable mention in future papers) limited the meaningfulness of the results. Nevertheless, a *very* interesting phenomenon occurred during the human simulations. Conversations recorded during the experiments showed a significant flow of information between the "human chromosomes", which encouraged them to collaborate in order to make their hypothesis testing more efficient! Combine this idea with collective learning systems (a paradigm originally designed with the emulation of human behavior in mind) and voila...the CLGA was born. One and a half years of painstaking programming later, I finally had a prototype CLGA that did some very fascinating things! An even more amazing fact is that the same machine learning paradigm used in the CLGA is also capable of performing a variety of image analysis and pattern recognition tasks from a relatively new and fresh perspective.

 

Why Should Anyone Be Interested in the CLGA?

For one thing, the concept behind the CLGA is unique in that it relies on adaptive learning automata that learn from one another to accumulate information about the fitness landscape. That information is then used to guide the search strategy, enabling the CLGA to essentially adapt to problem complexity as the CLGA itself perceives it.

Second, the CLGA is highly resistant to convergence because the algorithm does not rely explicitly on schemata to modify its strings (only on the knowledge acquired about them), and hence the diversity of the population schemata does not directly affect the exploratory capabilities of the CLGA.

Third, the agent-oriented approach to genetic computation has the potential for a vast number of enhancements to the intelligence of search algorithms in general. The CLGA's capacity to simultaneously search for good and bad solutions, yet still work cooperatively is only one very interesting example.

Finally, the CLGA really does behave in interesting new ways, providing fertile ground for future researchers to explore and understand. If you are interested in finding out more about it, or would like some code to test, please don't hesitate to contact me.







Hubble Telemetry Analysis (HTA) Project


For the last several years, the Hubble Space Telescope has been in the process of transitioning from its current real-time ground system to a more modern, client-server based system called the Control Center System (CCS). The primary goal of the CCS development is to automate routine functions and provide for the monitoring of ground and space systems by the CCS. A next-generation component of the monitoring system is a computer subsystem that can quickly and accurately detect and/or predict potentially anomalous satellite conditions. This project was undertaken (P.I. Peter Bock) to determine the feasibility of statistical learning methods, specifically collective learning, for the automated detection/prediction of HST anomalies. It is meant to complement the SYM_DetectFaults subsystem, a rule-based approach that is currently being developed at Lockheed Martin for the same function.

The approach presented here uses a collective learning paradigm (in a modified ALISA engine) to essentially generate multi-dimensional non-parametric pdfs based on past telemetry data, and to use that information to detect anomalous sensor input. The approach is especially powerful because it is adaptive, can be easily modified to handle any number of input sensors (whether from the HST or any other multi-sensor system), and its sensitivity can be easily adjusted to control Type II errors. It is also immune to the brittleness of a typical rule-based system.
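
A toy sketch of the underlying idea: per-sensor empirical (histogram) pdfs learned from nominal telemetry, with a tunable flagging threshold. The actual system builds multi-dimensional pdfs with the ALISA collective-learning engine, which this per-sensor simplification does not capture; bin counts and smoothing here are assumptions.

```python
import numpy as np

class TelemetryMonitor:
    """Non-parametric anomaly detector: per-sensor histograms learned from
    nominal telemetry serve as empirical pdfs; readings that fall in
    low-density bins are flagged as anomalous."""

    def __init__(self, n_bins=32):
        self.n_bins = n_bins
        self.edges = []
        self.pdfs = []

    def train(self, data):                        # data: (samples, sensors)
        for col in data.T:
            hist, edges = np.histogram(col, bins=self.n_bins)
            pdf = (hist + 1) / (hist.sum() + self.n_bins)  # Laplace smoothing
            self.edges.append(edges)
            self.pdfs.append(pdf)

    def anomaly_scores(self, sample):
        scores = []
        for x, edges, pdf in zip(sample, self.edges, self.pdfs):
            i = np.clip(np.searchsorted(edges, x) - 1, 0, self.n_bins - 1)
            scores.append(-np.log(pdf[i]))        # high score = unusual reading
        return np.array(scores)

    def flag(self, sample, threshold):
        # The threshold tunes the Type II (missed-anomaly) vs. false-alarm trade-off.
        return self.anomaly_scores(sample) > threshold
```

Because the pdfs are learned from data rather than hand-written rules, retraining on new telemetry adapts the detector without the brittleness noted above.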

The figure below (click on the image to obtain a higher resolution image [225K]) depicts a set of HST orbits and graphical output of the HTA program. The output is a "test" result after training on approximately five days of archived telemetry data (187 electrical sub-system related sensors) resulting in the detection of a (known) anomaly in the electrical sub-system of the HST. Red areas show anomalous regions, and the lighter the region, the more "normal" it is, as compared to the data on which it was trained.













Automatic Metaphase Finder


The Automatic Metaphase Finder (AMF) (described here) was an instrument manufactured by Perceptive Scientific Instruments, Inc. capable of searching a microscope slide and generating a list of the stage coordinates for all metaphases within a 20 mm x 40 mm search area in approximately ten minutes. It could process a cassette load of sixty slides in approximately ten hours, ranking the metaphases according to six categories of quality. A robot arm was used to load and unload slides from a 60-slide cassette onto an automated microscope stage. Approximately 1.5 sq. mm of slide area could be searched per second using automatic focus to keep the image sharp during the search. Metaphases were classified based on individual chromosome and metaphase characteristics, using primarily Bayes classifiers. However, an adaptive resonance neural network was later developed to update classifiers on-the-fly using feedback from operators during metaphase relocation. The instrument continues to be used by several large cytogenetic labs in Japan, where it was distributed by Nikon, and has been used at the Nichols Institute in California as well as M.D. Anderson (briefly) in Houston. Click on the image below to obtain a higher resolution labeled version of the AMF microscope.
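
The classifier family can be sketched as a diagonal-covariance Gaussian Bayes model over metaphase feature vectors. The feature values and class labels below are invented for illustration only; the AMF's actual six-category quality ranking used much richer chromosome and metaphase characteristics.

```python
import numpy as np

class GaussianBayesRanker:
    """Minimal Gaussian Bayes classifier for ranking detected metaphases
    into quality categories from simple feature vectors."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.means = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.vars = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        self.priors = np.array([(y == c).mean() for c in self.classes])
        return self

    def predict(self, X):
        # Log-likelihood under a diagonal Gaussian model per class,
        # plus log-prior, maximized over classes (Bayes decision rule).
        ll = -0.5 * (((X[:, None, :] - self.means) ** 2) / self.vars
                     + np.log(2 * np.pi * self.vars)).sum(axis=2)
        return self.classes[np.argmax(ll + np.log(self.priors), axis=1)]
```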












Fourier Domain Chromosome Eigenanalysis


Eigenanalysis forms the basis of mathematical techniques such as principal component analysis, correspondence analysis, factor analysis, the Karhunen-Loève transform and singular value decomposition (SVD), all of which transform a set of vectors into mutually orthogonal vectors. The technique was first introduced to image processing by Andrews and Patterson [1976], and has since been applied to a number of different applications including electron microscopy, face recognition and chromosome analysis.

The objective of eigenanalysis is to reduce a large number of original, highly correlated variables to a small number of transformed, independent variables. For an image of a chromosome, this kind of approach is feasible, because most of the crucial banding information is along the length of the chromosome arms. As a result, eigenanalysis can either be done on an individual chromosome (single chromosome eigenanalysis) where the vectors of the data matrix are the columns of a single chromosome image, or on a set of chromosomes (homolog chromosome eigenanalysis) where the vectors of the data matrix are whole chromosomes (column concatenated).

The first step in applying Fourier domain eigenanalysis is to take the Fourier transform of the data matrix in the dimension corresponding to the length of the chromosome. To achieve proper band alignment, eigenanalysis in the Fourier domain must be performed in two distinct phases. First, eigenanalysis is performed on the magnitude portion of the Fourier expansion, and the magnitude data matrix is reconstructed using only the k most significant eigenvectors (where k is determined by the desired reconstruction accuracy). Second, eigenanalysis is performed on the product of the reconstructed magnitude matrix and the phase portion of the original Fourier expansion, and the result is similarly reconstructed using only the k most significant eigenvectors. The result is then divided by the reconstructed magnitude matrix to obtain the final reconstructed phase matrix. Inverse Fourier transforming gives the desired reconstructed chromosome image(s). Thus, the spectrum is reconstituted using only significant phase components, and then inverse transformed to yield an image in which the bands are in registration from one chromosome to the next. Since positional information regarding the sinusoidal components of an image is encoded in the phase spectrum, synchronizing the phase spectra of the images in the set consolidates their banding patterns spatially while leaving amplitude information unchanged.
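
The two-phase procedure can be sketched as follows, using SVD for the eigenanalysis and treating each chromosome as a column of axial intensity profiles. This is a simplified sketch, with a guard against division by zero in the magnitude matrix that is my own assumption rather than part of the published method.

```python
import numpy as np

def rank_k(M, k):
    """Reconstruct a (real or complex) matrix from its k largest singular triplets."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def fourier_eigen_reconstruct(chromos, k):
    """chromos: (length, n_chromosomes) matrix of axial intensity profiles."""
    F = np.fft.fft(chromos, axis=0)            # transform along the arm axis
    mag = np.abs(F)
    phase = np.exp(1j * np.angle(F))
    mag_k = rank_k(mag, k)                     # phase 1: magnitude eigenanalysis
    prod_k = rank_k(mag_k * phase, k)          # phase 2: magnitude-weighted phase
    phase_k = prod_k / np.where(mag_k == 0, 1, mag_k)
    return np.fft.ifft(mag_k * phase_k, axis=0).real
```

As a sanity check, a set of identical profiles is exactly rank one, so a k=1 reconstruction should recover it to numerical precision.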

The thumbnail images below show three examples of how eigenanalysis can be used: to generate prototype chromosomes [45K], to generate a "clean" karyotype from multiple karyotypes of the same person [31K], and for classification [48K]. Much of this work was done in collaboration with Ken Castleman (a very smart, great guy and co-founder of Perceptive Systems Inc., later Perceptive Scientific Instruments, Inc.).












A Control Architecture for Autonomous Robots


Goal-driven behavior is an important characteristic of an intelligent system. It influences the types of control mechanisms and knowledge representation we adopt as well as ensures global system functionality. Traditionally, symbolic reasoning techniques have been successfully used to implement goal-driven planning and problem solving. However, real world situations do not always allow the time to spend on symbolic inference. On the other hand, reactive behaviors representing independent competence modules can both sense the world and control system actuators in order to respond quickly to the current world situation. However, such an approach does not scale well. As the number of modules implementing behaviors increases, their possible interactions become significant, making it harder to ensure that modules do not pursue conflicting goals. In this robot design, we implemented a control architecture that integrated higher level reasoning processes with reactive behaviors. Micro-behaviors were programmed via an interactive joystick control that enabled the robot to learn "skills" (e.g. turning, reversing, etc.) while situated in its environment, alleviating difficulties generally associated with inaccurate simulations. The robot, named CYCLOPS, was a LEGO mini-robot based on the 6.270 MIT kit, consisting of a MC68HC11 microprocessor with 16K on-board RAM (!), and a variety of sensors, including infra-red reflectance sensors, modulated infra-red beacons, a scanning phototransistor vision system, a battery power sensor, beam splitter encoders and numerous tactile sensors. A complete description of its design and construction is given in A Constraint-Based Control Architecture for Acting and Reasoning in Autonomous Robots. The robot was built and presented for a Mobile Robotics Seminar at the University of Rochester.












Active Contours (Snakes)


Active contours is a segmentation algorithm based on a model first proposed by [Kass et al., 1987] which uses a set of connected points (called a snake) that move so as to minimize some formulation of an energy functional. The curve formed by the connected points delineates the active contour. Features in the image (e.g. graylevel, gradient, etc.) contribute to the energy of the snake, as do constraints on the continuity and curvature of the contour itself. In this way, the snake contour can react to the image and move in a continuous manner, ensuring boundary continuity and smoothness as it locates the desired object boundary. Furthermore, the iterative nature of the algorithm allows higher level processes to actively adjust the weights of the energy functional to affect the dynamic behaviour of the active contour.

Many different implementations of active contours have been described in the literature. The algorithm I developed for PSII implements a greedy algorithm based on [Williams and Shah, "A Fast Algorithm for Active Contours", CVGIP:Image Understanding, 55(1):14-26, 1992]. Its main advantages are computational efficiency and algorithmic simplicity. Its main disadvantage is the standard characteristic of any greedy algorithm - the extremely local nature of its decision criteria. Nevertheless, the algorithm was relatively simple to implement and apparently results in performance comparable to other more global approaches (see the Williams and Shah article).

The general idea behind the algorithm is as follows. The algorithm begins with some starting contour. The energy of the first point on the contour is calculated and compared with the energies calculated for points in its immediate neighborhood. The new location of the contour point corresponds to the location of the point with the lowest energy. If that location is different from the current location of the contour point, the point is considered "adjusted". Performing this operation on all points of the contour once constitutes one iteration of the snake algorithm. This is repeated until an iteration causes no more points to be adjusted. At that point, the snake is said to have converged. I give a more thorough description of this implementation in [Castleman, Riopka and Wu, 1996] (see publications).
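
The iteration just described can be sketched as a minimal greedy loop, assuming a precomputed image-energy map and only continuity plus image terms with fixed weights. The full Williams and Shah formulation adds a curvature term and adaptive weighting, which this sketch omits.

```python
import numpy as np

def greedy_snake(energy_image, contour, max_iter=100):
    """Greedy active contour: each point moves to the lowest-energy pixel in
    its 3x3 neighbourhood; iterate until a full pass adjusts no point."""
    h, w = energy_image.shape
    pts = [tuple(p) for p in contour]          # closed contour of (row, col) points
    alpha, beta = 1.0, 1.0                     # continuity / image weights (assumed)
    for _ in range(max_iter):
        moved = False
        # Mean inter-point distance, recomputed each iteration (resists shrinkage).
        mean_d = np.mean([np.hypot(a[0] - b[0], a[1] - b[1])
                          for a, b in zip(pts, pts[1:] + pts[:1])])
        for i, (r, c) in enumerate(pts):
            prev = pts[i - 1]
            best, best_e = (r, c), None
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < h and 0 <= nc < w):
                        continue
                    cont = (np.hypot(nr - prev[0], nc - prev[1]) - mean_d) ** 2
                    e = alpha * cont + beta * energy_image[nr, nc]
                    if best_e is None or e < best_e:
                        best, best_e = (nr, nc), e
            if best != (r, c):
                pts[i] = best
                moved = True
        if not moved:
            return pts, True                   # converged: no point was adjusted
    return pts, False                          # iteration cap: guards against cycles
```

The `max_iter` cap is the crude analogue of the loop monitoring described below: without it, point adjustments can cycle indefinitely with some small period.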

In the image above (click on the image to obtain a higher resolution image [40K]), a Hough transform is used to obtain good initial guesses for the starting contour, and then the snake algorithm takes over. What was interesting about this algorithm was that the termination criteria had to include monitoring the algorithm for loops in the contour adjustment, since occasionally, the point adjustment would never stop, but simply cycle indefinitely with some (usually small) period. For this algorithm, loops were relatively easy to identify because I knew where to look. How would an artificial intelligence recognize loops in its own processing?....











Particle Tracking


One of my earliest projects involved the development of particle tracking software for sedimentation experiments, to determine if the degree of particle agglomeration could be used as a measure of particle surface tension. This program was implemented on one of the first PDP-11/73 image processing systems to come out of the commercialization of the JPL image processing hardware. The entire program was written in WATFIV (Waterloo Fortran IV)....yes that means IF THEN - GO TO statements and ...does anyone remember...OVERLAYS? I must have been one of the last people in the world to have had to bother with overlays....I know...for some of you it brings back bad memories. I won't continue.

The system on which this was implemented was meant to replace an analogue image analysis system (OMNICON) originally developed by Bausch and Lomb which we used extensively in the lab for cell counting. You can imagine the fun I had with a digital image analysis system after working with an analogue one...to have access to the pixels of an image!...what a rush...The system also had an ALU for hardware convolving...I was in heaven...and of course, I've never looked back. No matter how much work I do in machine learning, image analysis will forever be my calling.

ANYWAY...the images below show the particle tracker at work. The algorithm used the a priori knowledge that particles descend downward to limit the search to a cone. Images were digitized from negatives obtained by taking long exposure photographs of strobed particles in a specially designed sedimentation chamber containing liquids of varying surface tension. Sedimentation rates were computed, particle sizes measured and, in conjunction with interpolated particle/volume coefficient tables, used to estimate particle agglomeration. Yes...and all of this for a measly master's degree in mechanical engineering!....and I won't even mention the shear modulus experiments...sheesh.
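
The cone constraint can be sketched as a simple candidate filter between frames. The parameter values here are illustrative assumptions, not the original ones.

```python
import numpy as np

def cone_candidates(prev_pos, detections, max_drop=15.0, half_angle_deg=20.0):
    """Restrict the search for a particle's next position to a downward cone,
    using the a priori knowledge that particles sediment downward."""
    px, py = prev_pos
    tan_half = np.tan(np.radians(half_angle_deg))
    matches = []
    for (x, y) in detections:
        dy = y - py                   # image y grows downward
        dx = abs(x - px)
        if 0 < dy <= max_drop and dx <= dy * tan_half:
            matches.append((x, y))
    return matches
```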







