Machine Learning: Data Clustering, Active Information Retrieval, Neural Networks and Applications Data Clustering.

The difference between a “sea of data” and “information systems” lies in the ability to have an overall picture of the data as required — for example, for information-based strategic planning such as individualized medicine based on genetic properties, and for information extraction. Data mining can be viewed as an underlying process for reaching these goals, and clustering of individual pieces of data into groups of related ones is a possible technique for achieving it. Data clustering is currently an active field of research with numerous applications in engineering, business, and medicine.

Most existing clustering algorithms either assume a known cluster’s shape and learn the descriptive parameters (e.g., center and variance), or they allow for flexible shapes but assume a non-overlapping cluster arrangement. In biological based systems, however, cluster shapes may be irregular or non-convex, and real-world sites or genes may naturally belong to more than one category. Indeed, clustering is considered a very tough problem in this field. We introduced a novel algorithm for clustering irregularly shaped data, including very close and partially overlapping clusters, which are difficult to distinguish by other state-of-the-art clustering methods. Our algorithm learns the individual “shapes” adaptively and defines appropriate local metrics using a new notion of “geometric” neurons. It thus constitutes a middle way between parametric and non-parametric approaches. The ability to recognize overlapping clusters and to provide information on the clusters and their relationships makes our algorithm attractive in current biological information systems. We have tested this algorithm, with excellent results.

The identification of data points defining the clusters’ boundaries, and distinguishing them from internal data points of the clusters and from outliers, is crucial in data-mining applications. With Vladimir Vapnik, David Horn, and Ben-Hur, we developed a fast method for clustering irregular shapes with rich complex boundaries that provides this information. This kernel-based algorithm generalizes the heart of Vapnik’s novel Support Vector Machines so as to apply for the unsupervised clustering of unlabeled data, rather than only divide labeled data points.

Active and Interactive Information Systems.

The access to useful information at relevant times may be the deciding factor in successful decision making. Traditional information retrieval (IR) systems try to estimate the information beneficial to the user from the initial query. This technique is not satisfactory when the initial query is not sufficiently specific , as is frequently the case. First, the user has to know what information is available to be retrieved by the system in order to come up with an optimal query. Second, the user needs to be aware of the terms used in the collection of documents with their synonyms and similarity. Third, the user simply has to be aware of what information is important, which is difficult, especially when learning a new domain or when dealing with unpredictable environmental changes, e.g. in a disaster management situation. There is a high level of dissatisfaction with current IR systems which rely on the user’s initial query. If the query is not focused the result will not be either, forcing the user to continue making-asking queries in a long and tedious search session.

We introduced a new IR methodology, Active Information Retrieval (AIR), which is based on the notion that queries are best rendered by a process of dialog between the user and the system. A useful analogy for user-system queries would be to the patient-doctor-patient interaction. The doctor is not limited to merely replying to questions but is expected to actively engage the patient by formulating pertinent questions (we term these “reverse queries”). This process is more expedient in arriving at a diagnosis and treatment options. Based on this analogy, we proposed an information retrieval framework that assumes the responsibility of steering the users to the information, thus increasing efficiency and satisfaction. The active information retrieval paradigm incorporates the main results of both statistical experimental design and active machine learning (e.g. M. Jordan). The nature of a search is probabilistic, and we provide an analytical approach, based on mutual information, for choosing the reverse queries in order to maximize information about user’s request and minimize expected search time between the initial query and user satisfaction. We prove optimality of our approach for search in information systems. Applications are suggested to medical databases with different needs for patients and relatives, doctors, and researchers.

Neural Compiler for combining symbolic knowledge with tuning.

Using the theoretical understanding of analog computation sketched above, I have conducted practical research in the design of adaptive and analog technology. With Das Gupta and Sontag we calculated the complexity of learning in a basic analog neural architecture. This was the first result on the computational limitations of analog neural architectures. With Giles we suggested the first general method, based on control-theoretical concepts, for constructing minimal analog neural networks for given functions. The practicality of the Turing equivalence theorem is strikingly demonstrated in a compiler that I designed under a 1995-1997 grant from the Ministry of Arts and Sciences. This compiler transfers a high-level parallel language into a neural structure, which can later be tuned by neural learning algorithms. Some subsequent neural compilers have been suggested, including with Costa and Neto.

Applications of Machine and Neural Learning.

I have experimented with several applications both in my research and industrial consulting. A partial list includes:

  • Identifying brain signals of Event Related Potentials. This work provides means to filter ERP signals from their 100-500% background noise, and to classify them online without averaging. Analytical study shows the convergence of the adaptive process to optimal matched filters.
  • Predicting the need to convert a laparoscopic cholecystectomy into an open cholecystectomy in acute cholecystitis.
  • Predicting protein structure.
  • Upgrading automation for nuclear fuel in-core management by revising the rules adaptively. This was based on the compiler NIL that I had designed earlier.
  • Aligning radars in a sensor registration process of a multi-radar system to avoid misinterpretation of moving objects.

 

Bibliography
A. Ben-Hur, D. Horn, H.T. Siegelmann and V. Vapnik, “Support vector clustering,” Journal of Machine Learning Research 2:125-137, 2001

B. Das Gupta, H.T. Siegelmann and E. Sontag, “On the Complexity of Training Neural Networks with Continuous Activation Functions,” IEEE Transactions on Neural Networks, 6(6), 1995: 1490-1504

S. Eldar, H. T. Siegelmann, D. Buzaglo, I. Matter, A. Cohen, E. Sabo, J. Abrahamson, “Conversion of Laparoscopic Cholecystectomy to open cholecystectomy in acute cholecystitis: Artificial neural networks improve the prediction of conversion,” World Journal of Surgery. 2002 Jan 26(1): 79-85

O. Frieder and H.T. Siegelmann, ” Document Allocation: A Genetic Algorithm Approach,” IEEE Transactions on Knowledge and Data Engineering, 9(4), 1997: 640-642

T. Jaakkola and H. Siegelmann. “Active information retrieval.” Advances in Neural Information processing systems 14, 2001

H. Karniely and H.T. Siegelmann, “Sensor Registration Using Neural Networks,” IEEE on Aerospace and Electronic Systems, 36(1), 2000: 85-98

D. Lange, H.T. Siegelmann, H. Pratt, and G.F. Inbar, “Overcoming Selective Ensemble Averaging: Unsupervised Identification of Event Related Brain Potentials.” IEEE Transactions on Biomedical Engineering, 47(6), June 2000: 822-826

Loureiro, O. and Siegelmann, H., “Introducing an Active Cluster-Based Information Retrieval Paradigm,” Journal of the American Society for Information Science and Technology, vl 56, n. 10, August 2005, pp. 1024-1030

H. Lipson and H.T. Siegelmann, “Geometric Neurons for Clustering,” Neural Computation 12(10), August 2000

J. P. Neto, H. T. Siegelmann, and J. F. Costa. “Symbolic processing in neural networks,” Journal of the Brazilian Computer Society, 8(3), July 2003

E. Nissan, H.T. Siegelmann, A. Galperin, and S. Kimhi, “Upgrading Automation for Nuclear Fuel In-Core Management: From the Symbolic Generation of Configurations, to the Neural Adaptation of Heuristics,” Engineering with Computers, 13(1), 1997: 1-19

H.T. Siegelmann, “On NIL: The Software Constructor of Neural Networks,” Parallel Processing Letters, 6(4), 1996: 575-582

H.T. Siegelmann and O. Frieder, “The Allocation of Documents in Multiprocessor Information Retrieval Systems: An Application of Genetic Algorithms,” Proceedings of the IEEE Conference on Systems, Man, and Cybernetics, Charlottesville, Virginia, October 1991

H.T. Siegelmann and C.L. Giles, “The Complexity of Language Recognition by Neural Networks,” Journal of Neurocomputing; Special Issue “Recurrent Networks for Sequence Processing,” Editors: M. Gori, M. Mozer, A.H. Tsoi, W. Watrous, 15(3-4), 1997: 327-345

H.T. Siegelmann, E. Nissan, and A. Galperin, “A Novel Neural/Symbolic Hybrid Approach to Heuristically optimized Fuel Allocation and Automated Revision of heuristics in Nuclear Engineering,” Advanced in Engineering Software, 28(9), 1997: 581-592

S. Sivan, O. Filo and H.T. Siegelman, “Application of Expert Networks for Predicting Proteins Secondary Structure,” Biomolecular Engineering, 2007, to appear