Đăng ký Đăng nhập

Tài liệu Generic inference

.PDF
471
557
73

Mô tả:

GENERIC INFERENCE www.it-ebooks.info GENERIC INFERENCE A Unifying Theory for Automated Reasoning Marc Pouly Cork Constraint Computation Centre University College Cork, Ireland Interdisciplinary Centre for Security, Reliability and Trust University of Luxembourg Jürg Kohlas Department of Informatics University of Fribourg, Switzerland WILEY A JOHN WILEY & SONS, INC., PUBLICATION www.it-ebooks.info Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved Published by John Wiley & Sons, Inc., Hoboken, New Jersey Published simultaneously in Canada No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission. Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com. Library of Congress Cataloging-in-Publication Data: Pouly, Marc, 1980-, author. Generic Inference : A Unifying Theory for Automated Reasoning / Marc Pouly, Jürg Kohlas. p. cm Includes bibliographical references and index. ISBN 978-0-470-52701-6 (hardback) 1. Valuation theory. 2. Algorithms. 3. Algebra, Abstract. I. Kohlas, Jürg, 1939-, author. II. Title. QA162.P65 2011 519.54—dc22 2010042336 Printed in Singapore. oBook ISBN: 9781118010877 ePDF ISBN: 9781118010846 ePub ISBN: 9781118010860 10 9 8 7 6 5 4 3 2 1 www.it-ebooks.info To Marita and Maria www.it-ebooks.info CONTENTS List of Instances and Applications xiii List of Figures and Tables xvii Acknowledgments xxi Introduction xxiii PART I 1 LOCAL COMPUTATION Valuation Algebras 1.1 1.2 1.3 3 Operations and Axioms First Examples Conclusion Appendix: Generalizations of the Valuation Algebra Framework A.l Ordered Sets and Lattices A. 1.1 Partitions and Partition Lattices A.2 Valuation Algebras on General Lattices A.3 Valuation Algebras with Partial Projection Problem Sets and Exercises 4 6 17 17 17 19 20 24 26 VII www.it-ebooks.info viii 2 CONTENTS 29 2.1 2.2 2.3 2.4 3 Inference Problems 30 32 33 44 44 Graphs, Trees and Hypergraphs Knowledgebases and their Representation The Inference Problem Conclusion Problem Sets and Exercises Computing Single Queries 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11 3.12 47 Valuation Algebras with Variable Elimination 49 Fusion and Bucket Elimination 50 3.2.1 The Fusion Algorithm 50 3.2.2 Join Trees 55 3.2.3 The Bucket Elimination Algorithm 57 3.2.4 First Complexity Considerations 61 3.2.5 Some Generalizing Complexity Comments 66 3.2.6 Limitations of Fusion and Bucket Elimination 68 Valuation Algebras with Neutral Elements 69 3.3.1 Stable Valuation Algebras 70 Valuation Algebras with Null Elements 73 Local Computation as Message-Passing Scheme 75 3.5.1 The Complexity of Fusion as Message-Passing Scheme 77 Covering Join Trees 78 Join Tree Construction 82 3.7.1 Join Tree Construction by Triangulation 83 The Collect Algorithm 87 3.8.1 The Complexity of the Collect Algorithm 90 3.8.2 Limitations of the Collect Algorithm 91 Adjoining an Identity Element 91 The Generalized Collect Algorithm 92 3.10.1 Discussion of the Generalized Collect Algorithm 96 3.10.2 The Complexity of the Generalized Collect Algorithm 97 An Application: The Fast Fourier Transform 98 Conclusion 100 Appendix : Proof of the Generalized Collect Algorithm Problem Sets and Exercises 101 103 4 109 Computing Multiple Queries www.it-ebooks.info CONTENTS 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 IX The Shenoy-Shafer Architecture 4.1.1 Collect & Distribute Phase 4.1.2 The Binary Shenoy-Shafer Architecture 4.1.3 Performance Gains due to the Identity Element 4.1.4 Complexity of the Shenoy-Shafer Architecture 4.1.5 Discussion of the Shenoy-Shafer Architecture 4.1.6 The Super-Cluster Architecture Valuation Algebras with Inverse Elements 4.2.1 Idempotent Valuation Algebras The Lauritzen-Spiegelhalter Architecture 4.3.1 Complexity of the Lauritzen-Spiegelhalter Architecture The HUGIN Architecture 4.4.1 Complexity of the HUGIN Architecture The Idempotent Architecture 4.5.1 Complexity of the Idempotent Architecture Answering Uncovered Queries 4.6.1 The Complexity of Answering Uncovered Queries Scaling and Normalization Local Computation with Scaling 4.8.1 The Scaled Shenoy-Shafer Architecture 4.8.2 The Scaled Lauritzen-Spiegelhalter Architecture 4.8.3 The Scaled HUGIN Architecture Conclusion 111 114 116 117 118 120 120 121 122 123 127 128 131 131 134 134 141 143 147 147 150 151 152 Appendix: Valuation Algebras with Division D. 1 Properties for the Introduction of Division D.l.l Separative Valuation Algebras D.I.2 Regular Valuation Algebras D.1.3 Idempotent Valuation Algebras D.2 Proofs of Division-Based Architectures D.2.1 Proof of the Lauritzen-Spiegelhalter Architecture D.2.2 Proof of the HUGIN Architecture D.3 Proof for Scaling in Valuation Algebras Problem Sets and Exercises 153 154 155 164 167 167 167 168 169 173 PART II GENERIC CONSTRUCTIONS 5 Semiring Valuation Algebras 181 5.1 182 Semirings www.it-ebooks.info X CONTENTS 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 5.1.1 Multidimensional Semirings 5.1.2 Semiring Matrices Semirings and Order Semiring Valuation Algebras Examples of Semiring Valuation Algebras Properties of Semiring Valuation Algebras 5.5.1 Semiring Valuation Algebras with Neutral Elements 5.5.2 Stable Semiring Valuation Algebras 5.5.3 Semiring Valuation Algebras with Null Elements Some Computational Aspects Set-Based Semiring Valuation Algebras Properties of Set-Based Semiring Valuation Algebras 5.8.1 Neutral and Stable Set-Based Semiring Valuations 5.8.2 Null Set-Based Semiring Valuations Conclusion 185 185 186 190 192 195 196 196 197 198 200 203 203 204 204 Appendix: Semiring Valuation Algebras with Division E. 1 Separative Semiring Valuation Algebras E.2 Regular Semiring Valuation Algebras E.3 Cancellative Semiring Valuation Algebras E.4 Idempotent Semiring Valuation Algebras E.5 Scalable Semiring Valuation Algebras Problem Sets and Exercises 205 205 209 212 213 215 217 6 Valuation Algebras for Path Problems 219 6.1 6.2 6.3 6.4 6.5 6.6 220 223 232 237 245 246 249 253 256 256 262 262 6.7 6.8 6.9 6.10 Some Path Problem Examples The Algebraic Path Problem Quasi-Regular Semirings Quasi-Regular Valuation Algebras Properties of Quasi-Regular Valuation Algebras Kleene Algebras 6.6.1 Matrices over Kleene Algebras Kleene Valuation Algebras Properties of Kleene Valuation Algebras Further Path Problems Conclusion Problem Sets and Exercises www.it-ebooks.info CONTENTS 7 XI Language and Information 265 7.1 266 266 269 271 274 275 279 280 281 286 289 290 7.2 7.3 7.4 Propositional Logic 7.1.1 Language and Semantics 7.1.2 Propositional Information 7.1.3 Some Computational Aspects Linear Equations 7.2.1 Equations and Solution Spaces 7.2.2 Algebra of Affine Spaces Information in Context 7.3.1 Contexts: Models and Language 7.3.2 Tuple Model Structures Conclusion Problem Sets and Exercises PART III 8 APPLICATIONS Dynamic Programming 293 8.1 8.2 294 297 297 299 303 304 304 307 311 317 318 8.3 8.4 8.5 9 Solutions and Solution Extensions Computing Solutions 8.2.1 Computing all Solutions with Distribute 8.2.2 Computing some Solutions without Distribute 8.2.3 Computing all Solutions without Distribute Optimization and Constraint Problems 8.3.1 Totally Ordered Semirings 8.3.2 Optimization Problems Computing Solutions of Optimization Problems Conclusion Problem Sets and Exercises Sparse Matrix Techniques 321 9.1 322 323 325 328 335 338 339 343 9.2 Systems of Linear Equations 9.1.1 Gaussian Variable Elimination 9.1.2 Fill-ins and Local Computation 9.1.3 Regular Systems 9.1.4 LDR-Decomposition Symmetric, Positive Definite Matrices 9.2.1 Valuation Algebra of Symmetric Systems 9.2.2 Solving Symmetric Systems www.it-ebooks.info XII CONTENTS 9.3 9.4 10 9.2.3 Symmetrie Gaussian Elimination 9.2.4 Symmetrie Deeompositions and Elimination Sequenees 9.2.5 An Application: The Least Squares Method Semiring Fixpoint Equation Systems 9.3.1 Computing Quasi-Inverse Matrices 9.3.2 Local Computation with Quasi-Regular Valuations 9.3.3 Local Computation with Kleene Valuations Conclusion Problem Sets and Exercises 348 354 357 364 365 366 371 382 382 Gaussian Information 385 10.1 10.2 10.3 Gaussian Systems and Potentials Generalized Gaussian Potentials Gaussian Information and Gaussian Potentials 10.3.1 Assumption-Based Reasoning 10.3.2 General Gaussian Information 10.3.3 Combination of Gaussian Information 10.3.4 Projection of Gaussian Information Valuation Algebra of Gaussian Potentials An Application: Gaussian Dynamic Systems An Application: Gaussian Bayesian Networks Conclusion 386 395 400 400 403 407 408 414 417 423 428 Valuation Algebra Properties of Hints Gaussian Densities Problem Sets and Exercises 428 428 433 435 10.4 10.5 10.6 10.7 Appendix: J.l J.2 References 437 Index 447 www.it-ebooks.info List of Instances and Applications 1.1 Indicator Functions, Boolean Functions and Crisp Constraints 7 1.2 Relational Algebra 9 1.3 Arithmetic Potentials 10 1.4 Set Potentials 12 1.5 Density Functions 13 1.6 Gaussian Densities and Gaussian Potentials 15 A.7 Set Potentials on Partition Lattices 24 A.8 Quotients of Density Functions 26 2.1 Вayesian Networks 33 2.2 Query Answering in Relational Databases 35 2.3 Dempster-Shafer Theory of Evidence 36 2.4 Satisfiability of Constraints and Propositional Logic 38 2.5 Hadamard Transform 41 xiii www.it-ebooks.info XIV LIST OF INSTANCES AND APPLICATIONS 2.6 Discrete Fourier and Cosine Transforms 42 B.4 Filtering, Prediction and Smoothing in hidden Markov chains 45 4.4 Probability Potentials 146 4.5 Probability Density Functions 146 D.9 Scaling of Belief Functions 170 5.1 Weighted Constraints, Spohn Potentials and GAI Preferences 192 5.2 Possibility Potentials, Probabilistic Constraints and Fuzzy Sets 193 5.3 Set-based Constraints and Assumption-based Reasoning 194 5.3 Possibility Measures 202 5.3 Disbelief Functions 202 E.15 Quasi-Spohn Potentials 212 E. 18 Bottleneck Constraints 215 6.1 Connectivity Path Problem 220 6.2 Shortest and Longest Distance Problem 221 6.3 Maximum Capacity Problem 221 6.4 Maximum Reliability Problem 222 6.5 Regular Languages 223 6.6 Path Counting Problem 257 6.7 Markov Chains 257 6.8 Numeric Partial Differentiation 259 6.9 Matrix Multiplication 261 F. 1 Path Listing Problems 263 F.2 Testing whether Graphs are Bipartite 263 F.3 Identifying Cut Nodes in Graphs 263 F.6 Network Compilation 264 F.6 Symbolic Partial Differentiation 264 7.1 Satisfiability in Prepositional Logic 272 7.2 Theorem Proving in Prepositional Logic 272 www.it-ebooks.info LIST OF INSTANCES AND APPLICATIONS XV 7.3 Propositional Logic 283 7.4 Linear Equations Systems and Affine Spaces 283 7.5 Linear Inequality Systems and Convex Polyhedra 284 7.6 Predicate Logic 284 G. 1 Consequence Finding in Propositional Logic 290 8.1 Classical Optimization 307 8.2 Satisfiability of Constraints and Propositional Logic 308 8.3 Maximum Satisfiability of Constraints and Propositional Logic 308 8.4 Most and Least Probable Explanations 308 8.5 Bayesian and Maximum Likelihood Decoding 309 8.6 Linear Decoding and Gallager-Tanner-Wiberg Algorithm 310 9.1 Least Squares Method 357 9.2 Smoothing and Filtering in Linear Dynamic System 361 10.1 Gaussian Time-Discrete Dynamic Systems 395 10.2 Kaiman Filter: Filtering Problem 418 10.3 Gaussian Bayesian Networks 423 www.it-ebooks.info List of Figures I.l The graph associated with the inference problem of Exercise 1.1. A.l A tree of diagnoses for stubborn cars. 21 2.1 Undirected graphs. 30 2.2 Directed graphs. 31 2.3 A labeled graph. 31 2.4 Hypergraph, primal graph and dual graph. 32 2.5 Possibilities of knowledgebase representation. 33 2.6 Bayesian network of a medical example. 35 2.7 A digital circuit of a binary adder. 39 2.8 The result table of a full adder circuit. 39 2.9 Connecting two 1-bit-adders produces a 2-bit-adder. 40 2.10 Input and output of a discrete Fourier transform. 42 2.11 Function board of a discrete Fourier transform. 44 xxvi xvii www.it-ebooks.info XVÜi LIST OF FIGURES 2.12 A hidden Markov chain. 46 3.1 Finalization of the graphical fusion process. 56 3.2 Extending a labeled tree to a join tree. 56 3.3 The bucket-tree of Example 3.5. 59 3.4 The join tree obtained from the bucket-tree of Figure 3.3. 60 3.5 The join tree for Example 1.2. 61 3.6 Different elimination sequences produce different join trees. 63 3.7 A primal graph and its induced graph. 64 3.8 A covering join tree of an inference problem. 76 3.9 The fusion algorithm as message-passing scheme. 77 3.10 A covering join tree of Instance 2.1. 79 3.11 A covering join tree of Instance 2.1. 81 3.12 A graph with two possible triangulations. 83 3.13 The triangulated primal graph of Instance 2.1. 86 3.14 Join graph and derived join tree. 86 3.15 A complete run of the collect algorithm. 89 3.16 Join tree of a discrete Fourier transform. 99 4.1 Changing the root of a join tree. 110 4.2 Mailboxes in the Shenoy-Shafer architecture. 111 4.3 Join tree to illustrate the Shenoy-Shafer architecture. 113 4.4 Using non-binary join trees creates redundant combinations. 117 4.5 Larges treewidth due to non-binary join trees. 117 4.6 The performance gain due to the identity element 1. 118 4.7 The performance gain due to the identity element 2. 119 4.8 Illustration of the super-cluster approach. 121 4.9 Separator in the HUGIN architecture. 129 4.10 The path between node 1 and node 6 in the join tree of Figure 4.3. 140 4.11 A graphical summary of local computation architectures. www.it-ebooks.info 154 LIST OF FIGURES XIX D. 1 Embedding a separative valuation algebra into a union of groups. 157 D.2 Decomposing a regular valuation algebra into a union of groups. 165 5.1 Semiring valuation algebras with neutral and null elements. 199 E. 1 Embedding a separative semigroup into a union of groups. 207 E.2 Decomposing a regular semigroup into a union of groups. 210 E.3 Embedding a cancellative semigroup into a group. 213 E.4 Semiring valuation algebras and division-related properties. 216 6.1 The connectivity path problem. 221 6.2 The shortest distance problem. 222 6.3 The maximum capacity problem. 222 6.4 The maximum reliability problem. 223 6.5 Determining the language of an automaton. 224 6.6 Path sets in cycled graphs may be infinite. 225 6.7 The adjacency matrix of a directed, weighted graph. 226 6.8 An interpretation of equation (6.32). 250 6.9 The path counting problem. 257 6.10 Interpreting computations in Markov chains as path problems. 259 6.11 The computational graph of a function. 260 8.1 A binary, unreliable, memory less communication channel. 310 8.2 The join tree that belongs to the node factors of Example 8.4. 314 8.3 A serial connection of unreliable, memoryless channels. 318 8.4 A parallel connection of unreliable, memoryless channels. 319 9.1 Zero-pattern of a linear equation system. 325 9.2 The join tree for Figure 9.1. 326 9.3 The join tree induced by the variable elimination in Example 9.1. 334 9.4 Recursive buildup of lower triangular matrices. 350 9.5 The join tree of Example 9.3. 353 9.6 The graph representing the non-zero pattern of Example 9.4. 355 www.it-ebooks.info XX LIST OF FIGURES 9.7 The join tree of the triangulated graph in Figure 9.6. 9.8 The join tree covering the equations of the linear dynamic system. 362 9.9 A path problem with values from a partially ordered semiring. 379 10.1 A causal model for the wholesale price of a car. 388 10.2 The graph of the time-discrete dynamic system. 396 10.3 A covering join tree for the system (10.6). 397 10.4 Influence among variables in the Kaiman filter model. 419 10.5 A covering join tree for the Kaiman filter model. 419 www.it-ebooks.info 355 Acknowledgments The authors would like to thank our collaborators from the Department of Informatics at the University of Fribourg (Switzerland), the Cork Constraint Computation Centre at the University College Cork (Ireland) and the Interdisciplinary Centre for Security, Reliability and Trust at the University of Luxembourg for the many interesting discussions that helped to improve the content and presentation of this book. In particular, we are grateful for the indispensable support of Christian Eichenberger during our journey through the Gaussian world and to Radu Marinescu and Nie Wilson for providing expert knowledge regarding inference, search methods and semiring systems. Jutta Langel and Jacek Jonczy both corrected parts of this book and provided valuable feedback. Special thank also goes to Cassie Craig from John Wiley & Sons, Inc. for the editing support and to the Swiss National Science Foundation, which financed the research stay of Marc Pouly at the Cork Constraint Computation Centre. Marc Pouly & Jiirg Kohlas XXI www.it-ebooks.info Introduction Generic Algorithms Abstract mathematical structures usually have a large number of models. Algorithms based on operations and laws of such structures therefore have an identical form: they are generic. This means that for each particular instance of the structure, only the basic operations have to be adapted, whereas the overall algorithm remains the same. The simplest and typical problem is the one of sorting. It applies to totally ordered structures with an order relation <. Any sorting algorithm can be formulated in terms of the order relation < and it can be adapted to any particular model of an order relation by specifying this relation; for example for integers, real numbers or for a lexicographical order relation of alphanumeric strings. In addition, many programming languages allow for generic programming, i.e. in their syntax they provide the means to formulate generic algorithms and to specialize them to particular instances. In this book, an abstract algebraic structure called valuation algebra is proposed. It provides the foundation to formulate an abstract inference problem, to construct a graphical data structure and a generic inference algorithm to solve the inference problem. Usually, one arrives at such general structures from concrete problems and algorithms by lifting them to their most abstract form. This also holds for the present case. In 1988, Lauritzen and Spiegelhalter (Lauritzen & Spiegelhalter, 1988) proposed an algorithm for solving the inference problem for Bayesian networks based XXIII www.it-ebooks.info XXÏV INTRODUCTION on a paradigm called local computation. Soon after, Shenoy and Shafer (Shafer & Shenoy, 1988; Shenoy & Shafer, 1990) noted that the same algorithm could also be applied to solve inference problems with belief functions. They proposed a small but sufficient system of axioms for an algebraic framework that makes possible the application of the generic inference algorithm. The algebraic theory of valuation algebras was developed in (Kohlas, 2003), and it was shown that they permit, under certain varying additional conditions, four different generic inference architectures. In (Pouly, 2008) a computer implementation of these generic architectures together with a number of concrete instances is described, see also (Pouly, 2010). In the meantime, many different models of valuation algebras from very distant fields of Mathematics and Computer Science were identified. They range from probabilistic and statistical systems, different uncertainty formalisms to constraint systems, passing by various systems of logic, relational databases and systems of linear equations over fields and semirings. Many of these instances are presented in this book. Their computational interest can always be expressed in terms of the general inference problem, which can be solved by the same generic algorithm or inference architectures. This is first and foremost of great practical interest. Instead of developing and programming inference algorithms for each instance separately, it is possible to use a single generic system. Of course, in the past, individual solution algorithms and computer programs have been proposed and written mostly independent from each other, using the same basic concepts but often naming them quite differently. This adds to a certain Babylonian confusion, especially for students who should get an overall view of Computer Science. Further, it also represents a dissipation of valuable human resources in research and problem solving. Second, the process of abstraction is of great theoretical importance. It provides a unifying view of seemingly different fields of Computer Science and allows to elaborate both similarities and differences between problem classes. In fact, valuation algebras allow for a very appealing general view of information processing: information comes in pieces, usually from different sources. Each piece of information concerns a particular domain and answers, possibly only partially, specific questions. Pieces of information can be aggregated or combined, which is one of the basic operations of a valuation algebra. So, the general inference problem essentially consists of combining all available information. But usually one is only interested in some particular facet or aspect of the total information. Therefore, the parts of the total information, which are relevant to precise questions, must be extracted. Extraction or projection of information is the second basic operation of a valuation algebra. In relational databases for example, combination corresponds to the join operation, whereas information extraction corresponds to the usual operation of projection in the relational algebra. So, valuation algebras show that this scheme of relational algebra is much more general. The form and nature of information elements may be very different from model to model. www.it-ebooks.info
- Xem thêm -

Tài liệu liên quan