Selected Draft Papers and Publications


2020

Analysing installation scenarios of Debian packages

Benedikt Becker, Nicolas Jeannerod, Claude Marché, Yann Régis-Gianas, Mihaela Sighireanu, Ralf Treinen. Accepted for publication at the 26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), Dublin, Ireland, April 25-30, 2020.

The Debian distribution includes 28,814 maintainer scripts, almost all of which are written in Posix shell. These scripts are executed with root privileges at installation, update, and removal of a package, which makes them critical for system maintenance. While Debian policy provides guidance for package maintainers producing the scripts, few tools exist to check a script's compliance with it. We report on the application of a formal verification approach based on symbolic execution to find violations of some non-trivial properties required by Debian policy in maintainer scripts. We present our methodology and give an overview of our toolchain. We obtained promising results: our toolchain is effective in analysing a large set of Debian maintainer scripts, and it pointed out over 150 policy violations that led to reports on the Debian Bug Tracking System.

[HAL entry, including PDF of the preprint]


2019

Logics for Unordered Trees with Data Constraints

Adrien Boiret, Vincent Hugot, Joachim Niehren, Ralf Treinen. In Journal of Computer and System Sciences, Volume 104, September 2019, Pages 149-164.

We study monadic second-order logics with counting constraints (CMso) for unordered data trees. Our objective is to enhance this logic with data constraints for comparing string data values. Comparisons between data values at arbitrary positions of a data tree quickly lead to undecidability. Therefore, we restrict ourselves to comparing sibling data values of unordered trees. But even in this case CMso remains undecidable when allowing for data comparisons that can check the equality of string factors. However, for more restricted data constraints that can only check the equality of string prefixes, it becomes decidable. This decidability result is obtained by reduction to WSkS. Furthermore, we exhibit a restricted class of constraints which can be used in transitions of tree automata, resulting in a model with tractable complexity, which can be extended with structural equality tests between siblings. This efficient restriction is relevant to applications such as checking well-formedness properties of file system trees.
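
For a rough illustration of the two kinds of constraints involved (hypothetical notation, not the exact syntax of the paper): a counting constraint bounds the number of children of a node in a given state or with a given label, while a data constraint compares the data values of siblings.

    #_a(x) <= 2                       (node x has at most two children labelled a)
    prefix-eq(data(y), data(z))       (a prefix test between the data values of the siblings y and z)

With prefix tests the logic remains decidable via the reduction to WSkS mentioned above; replacing them by factor (substring) equality tests already makes CMso undecidable.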

Publication note: This supersedes an earlier version.

[HAL entry, including PDF of the preprint]


2018

Morbig: A Static Parser for POSIX Shell

Yann Régis-Gianas, Nicolas Jeannerod, Ralf Treinen. In 11th ACM SIGPLAN International Conference on Software Language Engineering (SLE), November 5-6, 2018, Boston, USA.

The POSIX shell language defies conventional wisdom of compiler construction on several levels: The shell language was not designed for static parsing, but with an intertwining of syntactic analysis and execution by expansion in mind. Token recognition cannot be specified by regular expressions, lexical analysis depends on the parsing context and the evaluation context, and the shell grammar given in the specification is ambiguous. Besides, the unorthodox design choices of the shell language fit badly in the usual specification languages used to describe other programming languages. This makes the standard usage of Lex and Yacc as a pipeline inadequate for the implementation of a parser for POSIX shell. The existing implementations of shell parsers are complex and use low-level character-level parsing code which is difficult to relate to the POSIX specification. We find it hard to trust such parsers, especially when using them for writing automatic verification tools for shell scripts. This paper offers an overview of the technical difficulties related to the syntactic analysis of the POSIX shell language. It also describes how we have resolved these difficulties using advanced parsing techniques (namely speculative parsing, parser state introspection, context-dependent lexical analysis and longest-prefix parsing) while keeping the implementation at a sufficiently high level of abstraction so that experts can check that the POSIX standard is respected. The resulting tool, called Morbig, is an open-source static parser for a well-defined and realistic subset of the POSIX shell language.
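
As a toy illustration of the context dependence mentioned above (a hypothetical sketch, not code from Morbig): in POSIX shell a word such as "for" is a reserved word only when it occurs in command position, so the same string has to be classified differently depending on the state of the parser.

    (* Hypothetical OCaml sketch, not Morbig code: the classification of a word
       depends on the parsing context, which is one reason why shell token
       recognition cannot be specified by regular expressions alone. *)
    type context = CommandPosition | ArgumentPosition
    type token = ReservedWord of string | Word of string

    let reserved =
      ["if"; "then"; "else"; "fi"; "for"; "do"; "done"; "while"; "case"; "esac"]

    let classify (ctx : context) (w : string) : token =
      match ctx with
      | CommandPosition when List.mem w reserved -> ReservedWord w
      | _ -> Word w

    let () =
      (* In "for f in a b; do echo for; done", the first "for" opens a loop
         while the second one is an ordinary argument of echo. *)
      assert (classify CommandPosition "for" = ReservedWord "for");
      assert (classify ArgumentPosition "for" = Word "for")

The situations Morbig actually has to handle are of course much richer than this toy example, which is what makes a trustworthy static parser non-trivial.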

[Download] [HAL entry] [Github]

Deciding the First-Order Theory of an Algebra of Feature Trees with Updates

Nicolas Jeannerod, Ralf Treinen. In 9th International Joint Conference on Automated Reasoning, part of FLoC 2018, July 14-17, 2018, Oxford, UK.

We investigate a logic of an algebra of trees including the update operation, which expresses that a tree is obtained from an input tree by replacing a particular direct subtree of the input tree, while leaving the rest unchanged. This operation improves on the expressivity of existing logics of tree algebras, in our case feature trees, which allow for an unbounded number of children of a node. We show that the first-order theory of this algebra is decidable via a weak quantifier elimination procedure which is allowed to swap existential quantifiers for universal quantifiers. This study is motivated by the logical modeling of transformations on UNIX file system trees expressed in a simple programming language.
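
For intuition, the update operation can be pictured as a ternary relation on feature trees (schematic notation; the paper's syntax may differ):

    z = x[f := y]     (z coincides with x everywhere, except that its direct subtree at feature f is y)

First-order formulas over this algebra can then describe the effect of elementary file system operations, which is the motivation stated above.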

[Download of the extended version] [HAL entry]


2017

Automata for unordered trees

Adrien Boiret, Vincent Hugot, Joachim Niehren, Ralf Treinen. In Information and Computation, volume 253, 2017, pages 304-335.

We present a framework for defining automata for unordered data trees that is parametrized by the way in which multisets of children nodes are described. Presburger tree automata and alternating Presburger tree automata are particular instances. We establish the usual equivalence in expressiveness of tree automata and MSO for the automata defined in our framework. We then investigate subclasses of automata for unordered trees for which testing language equivalence is in P-time. For this we start from automata in our framework that describe multisets of children by finite automata, and propose two approaches to how this can be done deterministically. We show that a restriction to confluent horizontal evaluation leads to polynomial-time emptiness and universality, but still suffers from coNP-completeness of the emptiness of binary intersections. Finally, efficient algorithms can be obtained by imposing an order of horizontal evaluation globally for all automata in the class. Depending on the choice of the order, we obtain different classes of automata, each of which has the same expressiveness as Counting MSO.
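
As a schematic example (illustrative notation), a transition of a Presburger tree automaton constrains, by a Presburger formula, the multiset of states assigned to the children of a node:

    a(#q1 >= 1  /\  #q2 = 0)  ->  q

meaning that a node labelled a is accepted in state q whenever at least one of its children is in state q1 and none of them is in state q2; the framework above varies exactly how such multiset descriptions are expressed.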

Publication note: This supersedes an earlier version.

[Download (preprint)] [HAL entry]

A Formally Verified Interpreter for a Shell-like Programming Language

Nicolas Jeannerod, Ralf Treinen. In 9th Working Conference on Verified Software: Theories, Tools, and Experiments (VSTTE 2017), part of CAV 2017, July 22-23, 2017, Heidelberg, Germany.

The shell language is widely used for various system administration tasks on UNIX machines, for instance as part of the installation process of software packages in FOSS distributions. Our mid-term goal is to analyze these scripts as part of an ongoing effort to use formal methods for the quality assurance of software distributions, to prove their correctness, or to pinpoint bugs. However, the syntax and semantics of POSIX shell are particularly treacherous. We propose a new language called CoLiS which, on the one hand, has well-defined static semantics and avoids some of the pitfalls of the shell, and, on the other hand, is close enough to the shell to be the target of an automated translation of the scripts in our corpus. The language has been designed so that it will be possible to compile automatically a large number of shell scripts into the CoLiS language. We formally define its syntax and semantics in Why3, define an interpreter for the language in the WhyML programming language, and present an automated proof in the Why3 proof environment of soundness and completeness of our interpreter with respect to the formal semantics.
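
Informally, the two verified properties relate the interpreter to the formal semantics roughly as follows (a paraphrase, not the exact Why3 statements):

    soundness:     if the interpreter, run on program p in state s, returns state s', then (p, s) evaluates to s' in the formal semantics;
    completeness:  if (p, s) evaluates to s' in the formal semantics, then the interpreter, run on p in state s, terminates and returns s'.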

[Download (preprint)] [HAL entry]


2015

Mining Component Repositories for Installability Issues

Pietro Abate, Roberto Di Cosmo, Louis Gesbert, Fabrice Le Fessant, Ralf Treinen, Stefano Zacchiroli. In 12th Working Conference on Mining Software Repositories (MSR 2015), May 16-17, 2015, Florence, Italy.

Component repositories play an increasingly relevant role in software life-cycle management, from software distribution to end-user, to deployment and upgrade management. Software components shipped via such repositories are equipped with rich metadata that describe their relationship (e.g., dependencies and conflicts) with other components.

In this practice paper we show how to use a tool, distcheck, that uses component metadata to identify all the components in a repository that cannot be installed (e.g., due to unsatisfiable dependencies), and that provides detailed information to help developers understand the cause of the problem and fix it in the repository.

We report on detailed analyses of several repositories: the Debian distribution, the Opam package collection, and Drupal modules. In each case, distcheck is able to efficiently identify non-installable components and provide valuable explanations of the issues. Our experience provides solid ground for generalizing the use of distcheck to other component repositories.
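
The underlying installability check can be pictured as a propositional encoding (a standard simplification of what tools in this line of work do; the papers describe the actual algorithms): a package p is installable in a repository if the following constraints are satisfiable together with x_p = true, where x_q means "package q is part of the installation":

    x_q -> (x_r1 \/ ... \/ x_rk)      for every dependency of a package q on the alternatives r1, ..., rk
    not (x_q /\ x_r)                  for every declared conflict between packages q and r

Packages for which no such assignment exists are the non-installable ones in the sense used above.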


2014

Automated Synthesis and Deployment of Cloud Applications

Roberto Di Cosmo, Michaël Lienhardt, Ralf Treinen, Stefano Zacchiroli, Jakub Zwolakowski, Antoine Eiche, Alexis Agahi. In 29th IEEE/ACM International Conference on Automated Software Engineering (ASE 2014), Västerås, Sweden, September 2014.

Complex networked applications are assembled by connecting software components distributed across multiple machines. Building and deploying such systems is a challenging problem which requires a significant amount of expertise: the system architect must ensure that all component dependencies are satisfied, avoid conflicting components, and add the right amount of component replicas to account for quality of service and fault-tolerance. In a cloud environment, one also needs to minimize the virtual resources provisioned upfront, to reduce the cost of operation. Once the full architecture is designed, it is necessary to correctly orchestrate the deployment phase, to ensure all components are started and connected in the right order.

We present a toolchain that automates the assembly and deployment of such complex distributed applications. Given as input a high-level specification of the desired system, the set of available components together with their requirements, and the maximal amount of virtual resources to be committed, it synthesizes the full architecture of the system, placing components in an optimal manner using the minimal number of available machines, and automatically deploys the complete system in a cloud environment.

[ACM Digital Library]

2013

Formal Aspects of Free and Open Source Software Components - A Short Survey

Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. In Elena Giachino, Reiner Hähnle, Frank S. de Boer and Marcello M. Bonsangue (eds.), Revised Lectures of the 11th International Symposium on Formal Methods for Components and Objects, Bertinoro, Italy, September 24-28, 2012, Springer LNCS 7866, pp 216-239, 2013.

Free and Open Source Software (Foss) distributions are popular solutions to deploy and maintain software on server, desktop, and mobile computing equipment. The typical deployment method in the Foss setting relies on software distributions as vendors, packages as independently deployable components, and package managers as upgrade tools.

We review research results from the past decade that apply formal methods to the study of inter-component relationships in the Foss context. We discuss how those results are being used to attack both issues faced by users, such as dealing with upgrade failures on target machines, and issues important to distributions, such as quality assurance processes for repositories containing tens of thousands of rapidly evolving software packages.

Optimal Provisioning in the Cloud

Roberto Di Cosmo, Michaël Lienhardt, Ralf Treinen, Stefano Zacchiroli, Jakub Zwolakowski. Draft, available on HAL

Complex distributed systems are classically assembled by deploying several existing software components to multiple servers. Building such systems is a challenging problem that requires a significant amount of problem solving, as one must i) ensure that all inter-component dependencies are satisfied; ii) ensure that no conflicting components are deployed on the same machine; and iii) take into account replication and distribution to account for quality of service, or possible failure of some services. We propose a tool, Zephyrus, that automates to a great extent the assembly of complex distributed systems. Given i) a high-level specification of the desired system architecture, ii) the set of available components and their requirements, and iii) the current state of the system, Zephyrus is able to generate a formal representation of the desired system, to place the components in an optimal manner on the available machines, and to interconnect them as needed.
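
Schematically, the synthesis task solved by Zephyrus can be seen as a constrained optimization problem (a simplified rendering, not Zephyrus' actual input language):

    minimize    the number (or total cost) of machines used
    subject to  every component required by the specification is placed on some machine,
                the dependencies of every placed component are satisfied,
                no two conflicting components share a machine,
                and per-machine resource capacities are respected.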

[PDF]


2012

Learning from the Future of Component Repositories

Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. 15th International ACM SIGSOFT Symposium on Component Based Software Engineering (CBSE-2012), Bertinoro, Italy, June 26-28, 2012.

An important aspect of the quality assurance of large component repositories is the logical coherence of component metadata. We argue that it is possible to identify certain classes of such problems by checking relevant properties of the possible future repositories into which the current repository may evolve. In order to make a complete analysis of all possible futures effective, however, one needs a way to construct a finite set of representatives of this infinite set of potential futures. We define a class of properties for which this can be done.

We illustrate the practical usefulness of the approach with two quality assurance applications: (i) establishing the amount of `forced upgrades' induced by introducing new versions of existing components in a repository, and (ii) identifying outdated components that need to be upgraded in order to ever be installable in the future. For both applications we provide experience reports obtained on the Debian distribution.

[ACM Digital Library]

Why Do Software Packages Conflict?

Cyrille Artho, Roberto Di Cosmo, Kuniyasu Suzaki, Ralf Treinen, Stefano Zacchiroli. 9th Working Conference on Mining Software Repositories (MSR 2012), June 2-3, 2012, Zurich, Switzerland.

Inter-package conflicts require the presence of two or more packages in a particular configuration, and thus tend to be harder to detect and localize than conventional (intra-package) defects. Hundreds of such inter-package conflicts go undetected by the normal testing and distribution process until they are later reported by a user. The reason for this is that current meta-data is not fine-grained and accurate enough to cover all common types of conflicts. A case study of inter-package conflicts in Debian has shown that with more detailed package meta-data, at least one third of all package conflicts could be prevented relatively easily, while another one third could be found by targeted testing of packages that share common resources or characteristics. This paper reports the case study and proposes ideas to detect inter-package conflicts in the future.

Reducing Equational Theories for the Decision of Static Equivalence

Steve Kremer, Antoine Mercier, Ralf Treinen. Journal of Automated Reasoning, volume 48, number 2, pages 197-217, 2012.

Static equivalence is a well established notion of indistinguishability of sequences of terms which is useful in the symbolic analysis of cryptographic protocols. Static equivalence modulo equational theories allows for a more accurate representation of cryptographic primitives by modelling properties of operators by equational axioms. We develop a method that allows us in some cases to simplify the task of deciding static equivalence in a multi-sorted setting, by removing a symbol from the term signature and reducing the problem to several simpler equational theories. We illustrate our technique on the example of bilinear pairings.
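
A standard textbook example of the notion (not specific to this paper): the frames

    nu k. { enc(0, k) / x }     and     nu k. { enc(1, k) / x }

are statically equivalent when enc is a symmetric encryption whose key k stays secret, since no test built from x distinguishes them; if the equational theory contains dec(enc(y, z), z) = y and the key k is additionally published, the test dec(x, k) = 0 separates the two frames.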

[Postscript of the conference version (ASIAN 2009)]

Publication note: This supersedes an earlier version.


2011

MPM: a modular package manager

Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. 14th International ACM SIGSOFT Symposium on Component Based Software Engineering (CBSE-2011), June 21-23, 2011, Boulder, Colorado, USA.

Software distributions in the FOSS world rely on so-called package managers for the installation and removal of packages on target machines. State-of-the-art package managers are monolithic in architecture, and each of them is hard-wired to an ad-hoc dependency solver implementing customized heuristics. In this paper we propose a modular architecture allowing for pluggable dependency solvers and backends. We argue that this is the path that leads to the next generation of package managers that will deliver better results, accept more expressive input languages, and can easily be adapted to new platforms. We present a working prototype---called MPM---which has been implemented following the design advocated in this paper.

[ACM Digital Library]

Multiple Congruence Relations, First-Order Theories on Terms, and the Frames of the Applied Pi-Calculus

Florent Jacquemard, Étienne Lozes, Ralf Treinen, Jules Villard. Theory of Security and Applications (TOSCA'11), affiliated with ETAPS 2011, March 31 and April 1, 2011, Saarbrücken, Germany.

We investigate the problem of deciding first-order theories of finite trees with several distinguished congruence relations, each of them given by some equational axioms. We give an automata-based solution for the case where the different equational axiom systems are linear and variable-disjoint (this includes the case where all axioms are ground), and where the logic does not permit expressing tree relations x=f(y,z). We show that the problem is undecidable when these restrictions are relaxed. As motivation and application, we show how to translate the model-checking problem of AπL, a spatial equational logic for the applied pi-calculus, to the validity of first-order formulas in term algebras with multiple congruence relations.


2009

Reducing Equational Theories for the Decision of Static Equivalence

Steve Kremer, Antoine Mercier, Ralf Treinen. 13th Annual Asian Computing Science Conference, December 14-16, 2009, Seoul, Korea.

Static equivalence is a well established notion of indistinguishability of sequences of terms which is useful in the symbolic analysis of cryptographic protocols. Static equivalence modulo equational theories allows a more accurate representation of cryptographic primitives by modelling properties of operators by equational axioms. We develop a method that allows us in some cases to simplify the task of deciding static equivalence in a multi-sorted setting, by removing a symbol from the term signature and reducing the problem to several simpler equational theories. We illustrate our technique on the example of bilinear pairings.

[Postscript]

Publication note: This supersedes an earlier version.

Expressing Advanced User Preferences in Component Installation

Ralf Treinen, Stefano Zacchiroli. International Workshop on Open Component Ecosystems (IWOCE 2009), August 24, 2009, Amsterdam, The Netherlands.

State of the art component-based software collections - such as FOSS distributions - are made of up to tens of thousands of components, with complex inter-dependencies and conflicts. Given a particular installation of such a system, each request to alter the set of installed components has potentially (too) many satisfying answers.

We present an architecture that allows expressing advanced user preferences about package selection in FOSS distributions. The architecture is composed of a distribution-independent format for describing available and installed packages called CUDF (Common Upgradeability Description Format), and a foundational language called MooML to specify optimization criteria. We present the syntax and semantics of CUDF and MooML, and discuss the partial evaluation mechanism of MooML, which allows package dependency solvers to gain efficiency.

[ACM Digital Library]

2008

Proving Group Protocols Secure Against Eavesdroppers

Steve Kremer, Antoine Mercier, Ralf Treinen. The 4th International Joint Conference on Automated Reasoning (IJCAR 2008), August 10-15, 2008, Sydney, Australia. Springer LNCS 5195, pages 116-131.

Security protocols are small programs designed to ensure properties such as secrecy of messages or authentication of parties in a hostile environment. In this paper we investigate automated verification of a particular type of security protocols, called group protocols, in the presence of an eavesdropper, i.e., a passive attacker. The specificity of group protocols is that the number of participants is not bounded.

Our approach consists in representing an infinite set of messages exchanged during an unbounded number of sessions, one session for each possible number of participants, as well as the infinite set of associated secrets. We use so-called visibly tree automata with memory and structural constraints (introduced recently by Comon-Lundh et al.) to represent over-approximations of these two sets. We identify restrictions on the specification of protocols which allow us to reduce the attacker capabilities guaranteeing that the above mentioned class of automata is closed under the application of the remaining attacker rules. The class of protocols respecting these restrictions is large enough to cover several existing protocols, such as the GDH family, GKE, and others.

A Spatial Equational Logic for the Applied pi-calculus

Jules Villard, Étienne Lozes, Ralf Treinen. Research Report LSV-08-10, Laboratoire Spécification et Vérification, ENS Cachan, France, March 2008. 44 pages.

Spatial logics have been proposed to reason locally and modularly on algebraic models of distributed systems. In this paper we investigate a spatial equational logic (ApiL) whose models are processes of the applied pi-calculus, an extension of the pi-calculus allowing term manipulation modulo a predefined equational theory, and wherein communications are recorded as active substitutions in a frame. Our logic allows us to reason locally either on frames or on processes, thanks to static and dynamic spatial operators. We study the logical equivalences induced by various relevant fragments of ApiL, and show in particular that the whole logic induces a coarser equivalence than structural congruence. We give characteristic formulae for this new equivalence as well as for static equivalence on frames. Going further into the exploration of ApiL's expressivity, we also show that it can eliminate standard term quantification, and that the model-checking problem for the adjunct-free fragment of ApiL can be reduced to satisfiability of a purely first-order logic of a term algebra.

[Postscript file]


2007

Intruder deduction for the equational theory of Abelian groups with distributive encryption

Pascal Lafourcade, Denis Lugiez, Ralf Treinen.

Information and Computation 205(4), pages 581-623 (2007)

Cryptographic protocols are small programs which involve a high level of concurrency and which are difficult to analyze by hand. The most successful methods to verify such protocols are based on rewriting techniques and automated deduction in order to implement or mimic the process calculus describing the execution of a protocol. We are interested in the intruder deduction problem, that is, vulnerability to passive attacks in the presence of equational theories which model the protocol specification and properties of the cryptographic operators.

In the present paper we consider the case where the encryption distributes over the operator of an Abelian group or over an exclusive-or operator. We prove decidability of the intruder deduction problem in both cases. We obtain a PTIME decision procedure in a restricted case, the so-called binary case.

These decision procedures are based on a careful analysis of the proof system modeling the deductive power of the intruder, taking into account the algebraic properties of the equational theories under consideration. The analysis of the deduction rules interacting with the equational theory relies on the manipulation of Z-modules in the general case, and on results from prefix rewriting in the binary case.
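
Concretely, the distributivity referred to above is the equational axiom (writing + for the Abelian group operation, respectively for exclusive or):

    enc(x + y, z) = enc(x, z) + enc(y, z)

taken together with the usual axioms of Abelian groups (respectively of exclusive or).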

[Postscript file]

Publication note: This paper extends an earlier conference version.


2006

Managing the Complexity of Large Free and Open Source Package-Based Software Distributions

Fabio Mancinelli, Jaap Boender, Roberto Di Cosmo, Jérôme Vouillon, Berke Durak, Xavier Leroy, Ralf Treinen.

Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering (ASE 2006), Tokyo, Japan, September 18-22, 2006, IEEE Computer Society, pages 199-208.

The widespread adoption of Free and Open Source Software (FOSS) in many strategic contexts of the information technology society has drawn attention to the issue of how to handle the complexity of assembling and managing a huge number of (packaged) components in a consistent and effective way. FOSS distributions (and in particular GNU/Linux-based ones) have always provided tools for managing the tasks of installing, removing and upgrading the (packaged) components they were made of. While these tools provide a (not always effective) way to handle these tasks on the client side, there is still a lack of tools that could help the distribution editors to maintain, on the server side, large and high-quality distributions. In this paper we present our research whose main goal is to fill this gap: we show our approach, the tools we have developed, and their application with experimental results. Our contribution provides an effective and automatic way to support distribution editors in handling those issues that were, until now, mostly addressed using ad-hoc tools and manual techniques.

[PDF file]

Symbolic Protocol Analysis in Presence of a Homomorphism Operator and Exclusive Or

Stéphanie Delaune, Pascal Lafourcade, Denis Lugiez, Ralf Treinen.

Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP 2006), July 9-16, 2006, San Servolo, Venice, Italy, edited by Michele Bugliesi, Bart Preneel, Vladimiro Sassone and Ingo Wegener, Springer LNCS 4052, pages 132-143.

Security of a cryptographic protocol for a bounded number of sessions is usually expressed as a symbolic trace reachability problem. We show that symbolic trace reachability for well-defined protocols is decidable in presence of the exclusive or theory in combination with the homomorphism axiom. These theories allow us to model basic properties of important cryptographic operators.

This trace reachability problem can be expressed as a system of symbolic deducibility constraints for a certain inference system describing the capabilities of the attacker. One main step of our proof consists in reducing deducibility constraints to constraints for deducibility in one step of the inference system. This constraint system, in turn, can be expressed as a system of quadratic equations of a particular form over Z/2Z[X], the ring of polynomials in one indeterminate over the finite field Z/2Z. We show that satisfiability of such systems is decidable.


2005

Résolution Symbolique de Contraintes

Ralf Treinen.

Habilitation Thesis, Université Paris-Sud, defended November 17, 2005 (in French).

[Postscript File]


2004

Easy Intruder Deduction

Hubert Comon-Lundh and Ralf Treinen.

Verification: Theory and Practice - Essays Dedicated to Zohar Manna on the Occasion of His 64th Birthday, edited by Nachum Dershowitz, Springer LNCS 2772, pages 225-242, 2004.

We investigate extensions of the Dolev-Yao model by some algebraic properties of cryptographic primitives. We provide sufficient conditions under which the intruder deduction problem is decidable (resp. decidable in polynomial time). We apply this result to the equational theory of homomorphism, and show that in this case the intruder deduction problem is linear, provided that the messages are in normal form.

Publication note: This supersedes an earlier version.

[Postscript file].


2002

The First-Order Theory of Subtyping Constraints

Zhendong Su, Alexander Aiken, Joachim Niehren, Tim Priesnitz, Ralf Treinen.

Proceedings of the 29th Annual Symposium on Principles of Programming Languages (POPL), Portland, January 16-18, 2002, pages 203-216.

We investigate the first-order theory of subtyping constraints. We show that the first-order theory of non-structural subtyping is undecidable, and we show that in the case where all constructors are either unary or nullary, the first-order theory is decidable for both structural and non-structural subtyping. The decidability results are shown by reduction to a decision problem on tree automata. This work is a step towards resolving long-standing open problems of the decidability of entailment for non-structural subtyping.

[ACM Digital Library]


First-Order Theories of Concrete Domains

Ralf Treinen.

Lecture Notes for the DEA Sémantique, Preuves et Langages (preparatory year for prospective PhD students).

These lecture notes address the problem of deciding the first-order theories of elementary data structures: natural, rational and real numbers, strings, and trees. This subject is hence part of the vast field of Automated Deduction. Our viewpoint, however, is in some sense complementary to the traditional problem setting.

The traditional problem in Automated Deduction could be phrased as follows: Under a given definition of a proof system (for instance, a Gentzen-style proof in first-order logic, or an equational proof in equational logic), design algorithms that take as input a set of axioms and a formula, and that construct a proof of the formula from the axioms. Proofs are to be searched for arbitrary axioms, even if algorithms can be specialized for certain classes of axioms.

Our approach will be different. Instead of the problem whether a formula is a consequence of a set of axioms, we will now always work in one fixed semantical structure. We may now ask whether a formula is valid in the structure under consideration, a question which corresponds to a formulas-as-theorems point of view. In a fixed semantical structure there is, however, another way to look at a formula: A formula with, say, n free variables now denotes the set of n-tuples of values that are solutions of the formula. This is the perspective of formulas as constraints. The problem is now, given a formula, to simplify the formula (that is, to find a formula that is equivalent in the structure under consideration and that is simpler in a sense to be made precise). The basic operation to be performed on constraints is then to decide whether they have a solution or not.
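
A tiny example of this change of perspective: over the natural numbers, the formula x + y = 2 with free variables x and y denotes its set of solutions {(0,2), (1,1), (2,0)}; simplification then means replacing a formula by a simpler one that is equivalent in the structure, for instance

    ∃y (x + y = 2)     simplifies to     x <= 2

and deciding satisfiability of the simplified constraint is immediate.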

[Postscript File (version 13/1/2002)]


2001

The First-Order Theory of Ordering Constraints over Feature Trees

Joachim Niehren, Martin Müller and Ralf Treinen.

Discrete Mathematics and Theoretical Computer Science, volume 4, number 2, September 2001, pages 193-234.

The system FT<= of ordering constraints over feature trees has been introduced as an extension of the system FT of equality constraints over feature trees. We investigate the first-order theory of FT<= and its fragments in detail, both over finite trees and over possibly infinite trees. We prove that the first-order theory of FT<= is undecidable, in contrast to the first-order theory of FT which is well-known to be decidable. We show that the entailment problem of FT<= with existential quantification is PSPACE-complete. So far, this problem has been shown decidable, coNP-hard in the case of finite trees, PSPACE-hard in the case of arbitrary trees, and solvable in cubic time when restricted to quantifier-free entailment judgments. To show PSPACE-completeness, we show that the entailment problem of FT<= with existential quantification is equivalent to the inclusion problem of non-deterministic finite automata.

Publication note: This supersedes an earlier version.

[DMTCS link]


Dominance Constraints: Algorithms and Complexity

Alexander Koller, Joachim Niehren and Ralf Treinen.

Logical Aspects of Computational Linguistics (LACL'98), Grenoble, France, December 14-16, 1998, Springer LNAI 2014, 2001, pages 106-125.

Dominance constraints for finite tree structures are widely used in several areas of computational linguistics including syntax, semantics, and discourse. In this paper, we investigate algorithmic and complexity questions for dominance constraints and their first-order theory. We present two NP algorithms for solving dominance constraints, which have been implemented in the concurrent constraint programming language Oz. The main result of this paper is that the satisfiability problem of dominance constraints is NP-complete. Despite this intractability result, the more sophisticated of our algorithms performs well in an application to scope underspecification. We also show that the existential fragment of the first-order theory of dominance constraints is NP-complete and that the full first-order theory has non-elementary complexity.

Postscript File (short version)


Undecidability of the Emptiness Problem of Reduction Automata with Component-Wise Tests

Ralf Treinen.

Draft Paper, June 27, 2001.

We show that the refinement of reduction automata to component-wise tests has an undecidable emptiness problem. This undecidability result even holds for automata that perform one single equality test, and where this test is performed at the root of the tree.

[Postscript File]


Grid Structures and Undecidable Constraint Theories

Franck Seynhaeve, Sophie Tison, Marc Tommasi and Ralf Treinen.

Theoretical Computer Science, volume 258, issues 1-2, May 2001, pages 453-490.

Publication note: This supersedes an earlier version.

We prove three new undecidability results for computational mechanisms over finite trees: There is a linear, ultra-shallow, noetherian and strongly confluent rewrite system R such that the E*A*-fragment of the first-order theory of one-step-rewriting by R is undecidable; the emptiness problem for tree automata with equality tests between cousins is undecidable; and the E*A*-fragment of the first-order theory of set constraints with the union operator is undecidable. The common feature of these three computational mechanisms is that they allow us to describe the set of first-order terms that represent grids. We extend our representation of grids by terms to a representation of linear two-dimensional patterns by linear terms, which allows us to transfer classical techniques on the grid to terms and thus to obtain our undecidability results.

[ Postscript File]


Constraints in Computational Logic

Edited by Hubert Comon, Claude Marché, and Ralf Treinen. Springer LNCS 2002, 2001.

CCL (Construction of Computational Logics and later Constraints in Computational Logic) is the name of an ESPRIT working group which met regularly from 1992 to 1999. It united research teams from Germany, France, Spain and Israel, and was managed by the company COSYTEC.

In its last years, the main emphasis of the working group was on constraints: techniques to solve and combine them, and applications ranging from industrial applications to logic programming and automated deduction. At the end of the working group, in fall 1999, we organized a summer school, intending to summarize the main advances achieved in the field during the last 7 years. The present book contains the (revised) lecture notes of this school. It contains six chapters, each written by members of the working group and covering the various aspects of constraints in computational logic. We intend it to be read by non-specialists, though prior knowledge of first-order logic and programming is probably necessary.

[LNCS 2002 page at Springer]


2000

On Rewrite Constraints and Context Unification

Joachim Niehren, Sophie Tison, and Ralf Treinen.

Information Processing Letters, volume 74, issues 1-2, pages 35-40, 2000.

We show that stratified context unification, which is one of the most expressive fragments of context unification known to be decidable, is equivalent to the satisfiability problem of slightly generalized rewriting constraints.

[ Postscript File (draft version)]


Predicate Logic and Tree Automata with Tests

Ralf Treinen.

Foundations of Software Science and Computation Structures (FOSSACS 2000), pages 329-343, March 27-30, 2000, Berlin, Germany. Springer LNCS 1784.

We investigate the question whether the well-known correspondence between tree automata and the weak second order logic of two successor functions (WS2S) can be extended to tree automata with tests. Our first insight is that there is no generalization of tree automata with tests that has a decidable emptiness problem and that is equivalent to the full class of formulae in some extension of WS2S, at least not when we are asking for a conservative extension of the classical correspondence between WS2S and tree automata to tree automata with tests.

As a consequence we can extend the correspondence between tree automata and WS2S to automata with tests only when we admit a restriction of the class of formulae. We present a logic, called WS2Sy, and a restriction of the class of formulae, called uniform, that is equivalent to tree automata with tests.

[Postscript File] [ Slides] [Draft of an Extended and Corrected Report]


1999

Constraints and Constraint Solving: An Introduction

Jean-Pierre Jouannaud and Ralf Treinen.

International Summer School on Constraints in Computational Logics, Gif-sur-Yvette, France, September 5-8, 1999.

The central idea of constraints is to compute with descriptions of data instead of computing with data items. Generally speaking, a constraint-based computation mechanism (in a broad sense, this includes for instance deduction calculi and grammar formalisms) can be seen as a two-tiered architecture, consisting of

  1. A language of data descriptions, and means to compute with data descriptions. Predicate logic is, for reasons that will be explained in this lecture, the natural choice as a framework for expressing data descriptions, that is, constraints.
  2. A computation formalism which operates on constraints by a well-defined set of operations. The choice of this set of operations typically is a compromise between what is desirable for the computation mechanism, and what is feasible for the constraint system.
In this introductory lecture we will try to explain basic concepts for constraint-based formalisms in an informal manner. In-depth treatment is delegated to the lectures dedicated to particular subjects and to the literature.

This lecture is organised as follows: We start with a small tour of constraint-based formalisms comprising constraint logic programming, constraints in automated deduction, constraint satisfaction problems, constrained grammars in computational linguistics, and constraint-based program analysis. In this first section, we will employ a somewhat naive view of constraint systems since we will assume that we have complete and efficient means to manipulate descriptions of data (that is, constraints). This allows us to cleanly separate calculus and constraints, which can be considered as an ideal situation.

In the following sections we will see that, in reality, things are not that simple. In Section 2 we will investigate a particular family of constraint systems, so-called feature constraints, in some depth. It will turn out that complete procedures to compute with constraints may be too costly and that we may have to refine our calculus to accommodate incomplete constraint handling. Section 3 will present such a refinement for constraint logic programming. In Section 4, finally, we will address the question of how the basic algorithms to compute with constraints can be implemented in the same programming calculus as the one that uses these algorithms.

Postscript File (updated version)


1998

How to Win a Game with Features.

Rolf Backofen and Ralf Treinen.

Information and Computation 142(1), pages 76-101, April 1998.

We employ the model-theoretic method of Ehrenfeucht-Fraïssé games to prove the completeness of the theory CFT which has been introduced in [Smolka&Treinen 94] for describing rational trees in a language of selector functions. The comparison to other techniques used in this field shows that Ehrenfeucht-Fraïssé games lead to simpler proofs.

Postscript File and BibTeX Entry.

Publication note: This supersedes an earlier version.


The First-Order Theory of Linear One-Step Rewriting is Undecidable

Ralf Treinen.

Theoretical Computer Science 208 (1-2), November 1998, pages 149-177.

The theory of one-step rewriting for a given rewrite system R and signature Sigma is the first-order theory of the following structure: its universe consists of all Sigma-ground terms, and its only predicate is the relation `x rewrites to y in one step by R'. The structure contains no function symbols and no equality. We show that there is no algorithm deciding the E*A*-fragment of this theory for an arbitrary linear and non-erasing term-rewriting system.

With the same technique we prove that the theory of encompassment plus one-step rewriting by the rule f(x) -> g(x), the theory of encompassment plus a WS2S-definable binary predicate, and the modal theory of one-step rewriting are undecidable.
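
For illustration, a typical sentence of the E*A*-fragment over this structure is

    ∃x ∀y ¬(x → y)

which states that some ground term is a normal form with respect to R; the result above says that there is no algorithm deciding this fragment for an arbitrary linear and non-erasing system.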

Postscript File of the tech report version, BibTeX Entry.

If you wish to obtain a (paper or electronic) copy of the final journal version please send me an email.

Publication note: This supersedes an earlier version.


An On-line Problem Database

Nachum Dershowitz and Ralf Treinen.

Proceedings of the 9th International Conference on Rewriting Techniques and Applications (RTA'98), LNCS, March/April 1998, edited by Tobias Nipkow.

This is an update of the RTA List of Open Problems. The most recent version of the problem list can be found on the RTA LooP home page.

Postscript File.


1997

Feature Trees over Arbitrary Structures

Ralf Treinen.

Chapter 7 of Specifying Syntactic Structures. Edited by Patrick Blackburn and Maarten de Rijke. CSLI Publications and FoLLI, Studies in Logic, Language and Information, pages 185-211, 1997.

This paper presents a family of first order feature tree theories, indexed by the theory of the feature labels used to build the trees. A given feature label theory, which is required to carry an appropriate notion of sets, is conservatively extended to a theory of feature trees with the predicates x[t]y (feature t leads from the root of tree x to the tree y), where we have to require t to be a ground term, and a definedness predicate (feature t is defined at the root of tree x). In the latter case, t might be a variable. Together with the notion of sets provided by the feature label theory, this yields a first-class status of arities.

We present a quantifier elimination procedure to reduce any sentence of the feature tree theory to an equivalent sentence of the feature label theory. Hence, if the feature label theory is decidable, the feature tree theory is too.

If the feature label theory is the theory of infinitely many constants and finite sets over infinitely many constants, we obtain an extension of the feature theory CFT, giving first-class status to arities. As another application, we obtain decidability of the theory of feature trees, where the feature labels are words, and where the language includes the successor function on words, lexical comparison of words, and first-class status of arities.

Postscript File and BibTeX Entry.


The First-Order Theory of Lexicographic Path Orderings is Undecidable

Hubert Comon and Ralf Treinen.

Theoretical Computer Science 176(1-2), April 1997, pages 67-87.

We show, under some assumption on the signature, that the E*A* fragment of the theory of a lexicographic path ordering is undecidable, both in the partial and in the total precedence cases. Our result implies in particular that the simplification rule of ordered completion is undecidable.

Postscript File, BibTeX Entry, Slides of CAAP'97.


1995

Constraint Deduction in an Interval-based Temporal Logic.

Jana Koehler and Ralf Treinen.

Executable Modal and Temporal Logics, Michael Fisher and Richard Owens, eds., Springer Lecture Notes in Artificial Intelligence, vol. 897, 1995, pages 103-117.

We describe reasoning methods for the interval-based modal temporal logic LLP which employs the modal operators sometimes, always, next, and chop. We propose a constraint deduction approach and compare it with a sequent calculus, developed as the basic machinery for the deductive planning system PHI which uses LLP as underlying formalism.

Postscript File and BibTeX Entry.


1994

Records for Logic Programming.

Gert Smolka and Ralf Treinen.

Journal of Logic Programming 18(3), April 1994, pp. 229-258.

CFT is a new constraint system providing records as logical data structure for constraint (logic) programming. It can be seen as a generalization of the rational tree system employed in Prolog II, where finer-grained constraints are used, and where subtrees are identified by keywords rather than by position.

CFT is defined by a first-order structure consisting of so-called feature trees. Feature trees generalize the ordinary trees corresponding to first-order terms by having their edges labeled with field names called features. The mathematical semantics given by the feature tree structure is complemented with a logical semantics given by five axiom schemes, which we conjecture to comprise a complete axiomatization of the feature tree structure.

We present a decision method for CFT, which decides entailment and disentailment between possibly existentially quantified constraints. Since CFT satisfies the independence property, our decision method can also be employed for checking the satisfiability of conjunctions of positive and negative constraints. This includes quantified negative constraints such as $\forall y\forall z(x\neq f(y,z))$.

The paper also presents an idealized abstract machine processing negative and positive constraints incrementally. We argue that an optimized version of the machine can decide satisfiability and entailment in quasi-linear time.
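
To illustrate the flavour of CFT constraints (schematic notation), a record-like value can be described by a conjunction such as

    person(x)  ∧  x{name, age}  ∧  x[name]y  ∧  x[age]z

stating that the root of x carries the label person, that x has exactly the features name and age, and that the corresponding subtrees are y and z; entailment then asks, for instance, whether such a constraint entails ∃w (x[age]w).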

Postscript File and BibTeX Entry.

Publication note: This supersedes an earlier version.


Ordering Constraints on Trees.

Hubert Comon and Ralf Treinen.

Colloquium on Trees in Algebra and Programming (CAAP '94), Edinburgh, Scotland, April 11-13, 1994, Springer LNCS 787, pages 1-14.

We survey recent results about ordering constraints on trees and discuss their applications. Our main interest lies in the family of recursive path orderings which enjoy the properties of being total, well-founded and compatible with the tree constructors. The paper includes some new results, in particular the undecidability of the theory of lexicographic path orderings in case of a non-unary signature.

Postscript File and BibTeX Entry.


1993

Feature Constraints with First Class Features.

Ralf Treinen.

1993 Conference on Mathematical Foundations of Computer Science, LNCS 711, Gdansk, Poland, Sep. 1993, pages 734-743.

Feature Constraint Systems have been proposed as a logical data structure for constraint (logic) programming. They provide a record-like view of trees by identifying subtrees by keyword rather than by position. Their atomic constraints are finer grained than in the constructor-based approach. The recently proposed system CFT [Smolka&Treinen94] in fact generalizes the rational tree system of Prolog II.

We propose a new feature constraint system EF which extends CFT by considering features as first class values. As a consequence, EF contains constraints like x[v]w where v is a variable ranging over features, while CFT restricts v to be a fixed feature symbol.

We show that the satisfiability of conjunctions of atomic EF-constraints is NP-complete. Satisfiability of quantifier-free EF-constraints is shown to be decidable, while the $\exists^*\forall^*\exists^*$ fragment of the first order theory is undecidable.

Postscript File and BibTeX Entry.


Equational and Membership Constraints for Infinite Trees.

Joachim Niehren, Andreas Podelski and Ralf Treinen.

Extended version of the paper in the Proceedings of the 5th International Conference on Rewriting Techniques and Applications (RTA'93), Springer LNCS 690, June 1993, pages 106-120.

We present a new constraint system with equational and membership constraints over infinite trees. It provides for complete and correct satisfiability and entailment tests and is therefore suitable for the use in concurrent constraint programming systems which are based on cyclic data structures.

Our set defining devices are greatest fixpoint solutions of regular systems of equations with a deterministic form of union. As the main technical particularity of the algorithms we present a novel memorization technique. We believe that both satisfiability and entailment tests can be implemented in an efficient and incremental manner.

Postscript File and BibTeX Entry (of the Proceedings paper).


1992

A New Method for Undecidability Proofs of First Order Theories.

Ralf Treinen.

Journal of Symbolic Computation 14(5), pages 437-457, November 1992.

We claim that the reduction of Post's Correspondence Problem to the decision problem of a theory provides a useful tool for proving undecidability of first order theories given by some interpretation. The goal of this paper is to define a framework for such reduction proofs. The method proposed is illustrated by proving the undecidability of the theory of a term algebra modulo the axioms of associativity and commutativity and of the theory of a partial lexicographic path ordering.
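
Recall the problem used as the source of the reductions: an instance of Post's Correspondence Problem is a finite list of pairs of words (u_1, v_1), ..., (u_n, v_n) over some alphabet, and the question is whether there is a non-empty sequence of indices i_1, ..., i_k such that

    u_{i_1} u_{i_2} ... u_{i_k}  =  v_{i_1} v_{i_2} ... v_{i_k}

Since this question is undecidable, any theory in which the existence of such a sequence can be uniformly expressed by a formula is undecidable as well.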

Postscript File and BibTeX Entry.

Publication note: This supersedes an earlier version.


1991

First Order Data Types and First Order Logic.

Ralf Treinen.

Extended version of the paper in Theoretical Aspects of Computer Software 1991, LNCS 526.

This paper concerns the relation between parameterized first order data types and first order logic. Augmenting first order logic by data type definitions yields in general a strictly stronger logic than first order logic. We consider some properties of the new logic for fixed data type definitions. While our new logic always fulfills the downward Skolem-Löwenheim property, compactness is fulfilled if and only if, for the given data type definition, the new logic has the same expressive power as first order logic. We show that this last property is undecidable.

Postscript File and BibTeX Entry (of the Proceedings paper).


Ralf Treinen July 18, 2014.