Selected Draft Papers and Publications


2020

Analysing installation scenarios of Debian packages

Benedikt Becker, Nicolas Jeannerod, Claude Marché, Yann Régis-Gianas, Mihaela Sighireanu, Ralf Treinen. Accepted for publication at the 26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), Dublin, Ireland, April 25-30, 2020.

The Debian distribution includes 28,814 maintainer scripts, almost all of which are written in Posix shell. These scripts are executed with root privileges at installation, update, and removal of a package, which makes them critical for system maintenance. While Debian policy provides guidance for package maintainers producing the scripts, few tools exist to check a script's compliance with it. We report on the application of a formal verification approach based on symbolic execution to find violations of some non-trivial properties required by Debian policy in maintainer scripts. We present our methodology and give an overview of our toolchain. We obtained promising results: our toolchain is effective in analysing a large set of Debian maintainer scripts, and it pointed out over 150 policy violations that led to reports on the Debian Bug Tracking System.

[HAL entry, including PDF of the preprint]


2019

Logics for Unordered Trees with Data Constraints

Adrien Boiret, Vincent Hugot, Joachim Niehren, Ralf Treinen. In Journal of Computer and System Sciences, Volume 104, September 2019, Pages 149-164.

We study monadic second-order logics with counting constraints (CMso) for unordered data trees. Our objective is to enhance this logic with data constraints for comparing string data values. Comparisons between data values at arbitrary positions of a data tree quickly lead to undecidability. Therefore, we restrict ourselves to comparing sibling data values of unordered trees. But even in this case CMso remains undecidable when allowing for data comparisons that can check the equality of string factors. However, for more restricted data constraints that can only check the equality of string prefixes, it becomes decidable. This decidability result is obtained by reduction to WSkS. Furthermore, we exhibit a restricted class of constraints which can be used in transitions of tree automata, resulting in a model with tractable complexity, which can be extended with structural equality tests between siblings. This efficient restriction is relevant to applications such as checking well-formedness properties of file system trees.
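
For a rough illustration of the two kinds of constraints involved (hypothetical notation, not the exact syntax of the paper): a counting constraint bounds the number of children of a node in a given state or with a given label, while a data constraint compares the data values of siblings.

    #_a(x) <= 2                       (node x has at most two children labelled a)
    prefix-eq(data(y), data(z))       (a prefix test between the data values of the siblings y and z)

With prefix tests the logic remains decidable via the reduction to WSkS mentioned above; replacing them by factor (substring) equality tests already makes CMso undecidable.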

Publication note: This supersedes an earlier version.

[HAL entry, including PDF of the preprint]


2018

Morbig: A Static Parser for POSIX Shell

Yann Régis-Gianas, Nicolas Jeannerod, Ralf Treinen. In 11th ACM SIGPLAN International Conference on Software Language Engineering (SLE), November 5-6, 2018, Boston, USA.

The POSIX shell language defies conventional wisdom of compiler construction on several levels: The shell language was not designed for static parsing, but with an intertwining of syntactic analysis and execution by expansion in mind. Token recognition cannot be specified by regular expressions, lexical analysis depends on the parsing context and the evaluation context, and the shell grammar given in the specification is ambiguous. Besides, the unorthodox design choices of the shell language fit badly in the usual specification languages used to describe other programming languages. This makes the standard usage of Lex and Yacc as a pipeline inadequate for the implementation of a parser for POSIX shell. The existing implementations of shell parsers are complex and use low-level character-level parsing code which is difficult to relate to the POSIX specification. We find it hard to trust such parsers, especially when using them for writing automatic verification tools for shell scripts. This paper offers an overview of the technical difficulties related to the syntactic analysis of the POSIX shell language. It also describes how we have resolved these difficulties using advanced parsing techniques (namely speculative parsing, parser state introspection, context-dependent lexical analysis and longest-prefix parsing) while keeping the implementation at a sufficiently high level of abstraction so that experts can check that the POSIX standard is respected. The resulting tool, called Morbig, is an open-source static parser for a well-defined and realistic subset of the POSIX shell language.
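
As a toy illustration of the context dependence mentioned above (a hypothetical sketch, not code from Morbig): in POSIX shell a word such as "for" is a reserved word only when it occurs in command position, so the same string has to be classified differently depending on the state of the parser.

    (* Hypothetical OCaml sketch, not Morbig code: the classification of a word
       depends on the parsing context, which is one reason why shell token
       recognition cannot be specified by regular expressions alone. *)
    type context = CommandPosition | ArgumentPosition
    type token = ReservedWord of string | Word of string

    let reserved =
      ["if"; "then"; "else"; "fi"; "for"; "do"; "done"; "while"; "case"; "esac"]

    let classify (ctx : context) (w : string) : token =
      match ctx with
      | CommandPosition when List.mem w reserved -> ReservedWord w
      | _ -> Word w

    let () =
      (* In "for f in a b; do echo for; done", the first "for" opens a loop
         while the second one is an ordinary argument of echo. *)
      assert (classify CommandPosition "for" = ReservedWord "for");
      assert (classify ArgumentPosition "for" = Word "for")

The situations Morbig actually has to handle are of course much richer than this toy example, which is what makes a trustworthy static parser non-trivial.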

[Download] [HAL entry] [Github]

Deciding the First-Order Theory of an Algebra of Feature Trees with Updates

Nicolas Jeannerod, Ralf Treinen. In 9th International Joint Conference on Automated Reasoning, part of FLoC 2018, July 14-17, 2018, Oxford, UK.

We investigate a logic of an algebra of trees including the update operation, which expresses that a tree is obtained from an input tree by replacing a particular direct subtree of the input tree, while leaving the rest unchanged. This operation improves on the expressivity of existing logics of tree algebras, in our case feature trees, which allow for an unbounded number of children of a node. We show that the first-order theory of this algebra is decidable via a weak quantifier elimination procedure which is allowed to swap existential quantifiers for universal quantifiers. This study is motivated by the logical modeling of transformations on UNIX file system trees expressed in a simple programming language.
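
For intuition, the update operation can be pictured as a ternary relation on feature trees (schematic notation; the paper's syntax may differ):

    z = x[f := y]     (z coincides with x everywhere, except that its direct subtree at feature f is y)

First-order formulas over this algebra can then describe the effect of elementary file system operations, which is the motivation stated above.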

[Download of the extended version] [HAL entry]


2017

Automata for unordered trees

Adrien Boiret, Vincent Hugot, Joachim Niehren, Ralf Treinen. In Information and Computation, volume 253, 2017, pages 304-335.

We present a framework for defining automata for unordered data trees that is parametrized by the way in which multisets of children nodes are described. Presburger tree automata and alternating Presburger tree automata are particular instances. We establish the usual equivalence in expressiveness of tree automata and MSO for the automata defined in our framework. We then investigate subclasses of automata for unordered trees for which testing language equivalence is in P-time. For this we start from automata in our framework that describe multisets of children by finite automata, and propose two approaches to how this can be done deterministically. We show that a restriction to confluent horizontal evaluation leads to polynomial-time emptiness and universality, but still suffers from coNP-completeness of the emptiness of binary intersections. Finally, efficient algorithms can be obtained by imposing an order of horizontal evaluation globally for all automata in the class. Depending on the choice of the order, we obtain different classes of automata, each of which has the same expressiveness as Counting MSO.
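
As a schematic example (illustrative notation), a transition of a Presburger tree automaton constrains, by a Presburger formula, the multiset of states assigned to the children of a node:

    a(#q1 >= 1  /\  #q2 = 0)  ->  q

meaning that a node labelled a is accepted in state q whenever at least one of its children is in state q1 and none of them is in state q2; the framework above varies exactly how such multiset descriptions are expressed.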

Publication note: This supersedes an earlier version.

[Download (preprint)] [HAL entry]

A Formally Verified Interpreter for a Shell-like Programming Language

Nicolas Jeannerod, Ralf Treinen. In 9th Working Conference on Verified Software: Theories, Tools, and Experiments (VSTTE 2017), part of CAV 2017, July 22-23, 2017, Heidelberg, Germany.

The shell language is widely used for various system administration tasks on UNIX machines, for instance as part of the installation process of software packages in FOSS distributions. Our mid-term goal is to analyze these scripts as part of an ongoing effort to use formal methods for the quality assurance of software distributions, to prove their correctness, or to pinpoint bugs. However, the syntax and semantics of POSIX shell are particularly treacherous. We propose a new language called CoLiS which, on the one hand, has well-defined static semantics and avoids some of the pitfalls of the shell, and, on the other hand, is close enough to the shell to be the target of an automated translation of the scripts in our corpus. The language has been designed so that it will be possible to compile automatically a large number of shell scripts into the CoLiS language. We formally define its syntax and semantics in Why3, define an interpreter for the language in the WhyML programming language, and present an automated proof in the Why3 proof environment of soundness and completeness of our interpreter with respect to the formal semantics.
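
Informally, the two verified properties relate the interpreter to the formal semantics roughly as follows (a paraphrase, not the exact Why3 statements):

    soundness:     if the interpreter, run on program p in state s, returns state s', then (p, s) evaluates to s' in the formal semantics;
    completeness:  if (p, s) evaluates to s' in the formal semantics, then the interpreter, run on p in state s, terminates and returns s'.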

[Download (preprint)] [HAL entry]


2015

Mining Component Repositories for Installability Issues

Pietro Abate, Roberto Di Cosmo, Louis Gesbert, Fabrice Le Fessant, Ralf Treinen, Stefano Zacchiroli. In 12th Working Conference on Mining Software Repositories (MSR 2015), May 16-17, 2015, Florence, Italy.

Component repositories play an increasingly relevant role in software life-cycle management, from software distribution to end-user, to deployment and upgrade management. Software components shipped via such repositories are equipped with rich metadata that describe their relationship (e.g., dependencies and conflicts) with other components.

In this practice paper we show how to use a tool, distcheck, that uses component metadata to identify all the components in a repository that cannot be installed (e.g., due to unsatisfiable dependencies), and that provides detailed information to help developers understand the cause of the problem and fix it in the repository.

We report on detailed analyses of several repositories: the Debian distribution, the Opam package collection, and Drupal modules. In each case, distcheck is able to efficiently identify non-installable components and provide valuable explanations of the issues. Our experience provides solid ground for generalizing the use of distcheck to other component repositories.
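
The underlying installability check can be pictured as a propositional encoding (a standard simplification of what tools in this line of work do; the papers describe the actual algorithms): a package p is installable in a repository if the following constraints are satisfiable together with x_p = true, where x_q means "package q is part of the installation":

    x_q -> (x_r1 \/ ... \/ x_rk)      for every dependency of a package q on the alternatives r1, ..., rk
    not (x_q /\ x_r)                  for every declared conflict between packages q and r

Packages for which no such assignment exists are the non-installable ones in the sense used above.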


2014

Automated Synthesis and Deployment of Cloud Applications

Roberto Di Cosmo, Michaël Lienhardt, Ralf Treinen, Stefano Zacchiroli, Jakub Zwolakowski, Antoine Eiche, Alexis Agahi. In 29th IEEE/ACM International Conference on Automated Software Engineering (ASE 2014), Västerås, Sweden, September 2014.

Complex networked applications are assembled by connecting software components distributed across multiple machines. Building and deploying such systems is a challenging problem which requires a significant amount of expertise: the system architect must ensure that all component dependencies are satisfied, avoid conflicting components, and add the right amount of component replicas to account for quality of service and fault-tolerance. In a cloud environment, one also needs to minimize the virtual resources provisioned upfront, to reduce the cost of operation. Once the full architecture is designed, it is necessary to correctly orchestrate the deployment phase, to ensure all components are started and connected in the right order.

We present a toolchain that automates the assembly and deployment of such complex distributed applications. Given as input a high-level specification of the desired system, the set of available components together with their requirements, and the maximal amount of virtual resources to be committed, it synthesizes the full architecture of the system, placing components in an optimal manner using the minimal number of available machines, and automatically deploys the complete system in a cloud environment.

[ACM Digital Library]

2013

Formal Aspects of Free and Open Source Software Components - A Short Survey

Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. In Elena Giachino, Reiner Hähnle, Frank S. de Boer and Marcello M. Bonsangue (eds.), Revised Lectures of the 11th International Symposium on Formal Methods for Components and Objects, Bertinoro, Italy, September 24-28, 2012, Springer LNCS 7866, pp 216-239, 2013.

Free and Open Source Software (Foss) distributions are popular solutions to deploy and maintain software on server, desktop, and mobile computing equipment. The typical deployment method in the Foss setting relies on software distributions as vendors, packages as independently deployable components, and package managers as upgrade tools.

We review research results from the past decade that apply formal methods to the study of inter-component relationships in the Foss context. We discuss how those results are being used to attack both issues faced by users, such as dealing with upgrade failures on target machines, and issues important to distributions, such as quality assurance processes for repositories containing tens of thousands of rapidly evolving software packages.

Optimal Provisioning in the Cloud

Roberto Di Cosmo, Michaël Lienhardt, Ralf Treinen, Stefano Zacchiroli, Jakub Zwolakowski. Draft, available on HAL

Complex distributed systems are classically assembled by deploying several existing software components to multiple servers. Building such systems is a challenging problem that requires a significant amount of problem solving, as one must i) ensure that all inter-component dependencies are satisfied; ii) ensure that no conflicting components are deployed on the same machine; and iii) take into account replication and distribution to account for quality of service, or possible failure of some services. We propose a tool, Zephyrus, that automates to a great extent the assembly of complex distributed systems. Given i) a high-level specification of the desired system architecture, ii) the set of available components and their requirements, and iii) the current state of the system, Zephyrus is able to generate a formal representation of the desired system, to place the components in an optimal manner on the available machines, and to interconnect them as needed.
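
Schematically, the synthesis task solved by Zephyrus can be seen as a constrained optimization problem (a simplified rendering, not Zephyrus' actual input language):

    minimize    the number (or total cost) of machines used
    subject to  every component required by the specification is placed on some machine,
                the dependencies of every placed component are satisfied,
                no two conflicting components share a machine,
                and per-machine resource capacities are respected.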

[PDF]


2012

Learning from the Future of Component Repositories

Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. 15th International ACM SIGSOFT Symposium on Component Based Software Engineering (CBSE-2012), Bertinoro, Italy, June 26-28, 2012.

An important aspect of the quality assurance of large component repositories is the logical coherence of component metadata. We argue that it is possible to identify certain classes of such problems by checking relevant properties of the possible future repositories into which the current repository may evolve. In order to make a complete analysis of all possible futures effective, however, one needs a way to construct a finite set of representatives of this infinite set of potential futures. We define a class of properties for which this can be done.

We illustrate the practical usefulness of the approach with two quality assurance applications: (i) establishing the amount of `forced upgrades' induced by introducing new versions of existing components in a repository, and (ii) identifying outdated components that need to be upgraded in order to ever be installable in the future. For both applications we provide experience reports obtained on the Debian distribution.

[ACM Digital Library]

Why Do Software Packages Conflict?

Cyrille Artho, Roberto Di Cosmo, Kuniyasu Suzaki, Ralf Treinen, Stefano Zacchiroli. 9th Working Conference on Mining Software Repositories (MSR 2012), June 2-3, 2012, Zurich, Switzerland.

Inter-package conflicts require the presence of two or more packages in a particular configuration, and thus tend to be harder to detect and localize than conventional (intra-package) defects. Hundreds of such inter-package conflicts go undetected by the normal testing and distribution process until they are later reported by a user. The reason for this is that current meta-data is not fine-grained and accurate enough to cover all common types of conflicts. A case study of inter-package conflicts in Debian has shown that with more detailed package meta-data, at least one third of all package conflicts could be prevented relatively easily, while another one third could be found by targeted testing of packages that share common resources or characteristics. This paper reports the case study and proposes ideas to detect inter-package conflicts in the future.

Reducing Equational Theories for the Decision of Static Equivalence

Steve Kremer, Antoine Mercier, Ralf Treinen. Journal of Automated Reasoning, volume 48, number 2, pages 197-217, 2012.

Static equivalence is a well established notion of indistinguishability of sequences of terms which is useful in the symbolic analysis of cryptographic protocols. Static equivalence modulo equational theories allows for a more accurate representation of cryptographic primitives by modelling properties of operators by equational axioms. We develop a method that allows us in some cases to simplify the task of deciding static equivalence in a multi-sorted setting, by removing a symbol from the term signature and reducing the problem to several simpler equational theories. We illustrate our technique on the example of bilinear pairings.
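
A standard textbook example of the notion (not specific to this paper): the frames

    nu k. { enc(0, k) / x }     and     nu k. { enc(1, k) / x }

are statically equivalent when enc is a symmetric encryption whose key k stays secret, since no test built from x distinguishes them; if the equational theory contains dec(enc(y, z), z) = y and the key k is additionally published, the test dec(x, k) = 0 separates the two frames.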

[Postscript of the conference version (ASIAN 2009)]

Publication note: This supersedes an earlier version.


2011

MPM: a modular package manager

Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. 14th International ACM SIGSOFT Symposium on Component Based Software Engineering (CBSE-2011), June 21-23, 2011, Boulder, Colorado, USA.

Software distributions in the FOSS world rely on so-called package managers for the installation and removal of packages on target machines. State-of-the-art package managers are monolithic in architecture, and each of them is hard-wired to an ad-hoc dependency solver implementing customized heuristics. In this paper we propose a modular architecture allowing for pluggable dependency solvers and backends. We argue that this is the path that leads to the next generation of package managers that will deliver better results, accept more expressive input languages, and can easily be adapted to new platforms. We present a working prototype---called MPM---which has been implemented following the design advocated in this paper.

[ACM Digital Library]

Multiple Congruence Relations, First-Order Theories on Terms, and the Frames of the Applied Pi-Calculus

Florent Jacquemard, Étienne Lozes, Ralf Treinen, Jules Villard. Theory of Security and Applications (TOSCA'11), affiliated with ETAPS 2011, March 31 and April 1, 2011, Saarbrücken, Germany.

We investigate the problem of deciding first-order theories of finite trees with several distinguished congruence relations, each of them given by some equational axioms. We give an automata-based solution for the case where the different equational axiom systems are linear and variable-disjoint (this includes the case where all axioms are ground), and where the logic does not permit expressing tree relations x=f(y,z). We show that the problem is undecidable when these restrictions are relaxed. As motivation and application, we show how to translate the model-checking problem of AπL, a spatial equational logic for the applied pi-calculus, to the validity of first-order formulas in term algebras with multiple congruence relations.


2009

Reducing Equational Theories for the Decision of Static Equivalence

Steve Kremer, Antoine Mercier, Ralf Treinen. 13th Annual Asian Computing Science Conference, December 14-16, 2009, Seoul, Korea.

Static equivalence is a well established notion of indistinguishability of sequences of terms which is useful in the symbolic analysis of cryptographic protocols. Static equivalence modulo equational theories allows a more accurate representation of cryptographic primitives by modelling properties of operators by equational axioms. We develop a method that allows us in some cases to simplify the task of deciding static equivalence in a multi-sorted setting, by removing a symbol from the term signature and reducing the problem to several simpler equational theories. We illustrate our technique on the example of bilinear pairings.

[Postscript]

Publication note: This supersedes an earlier version.

Expressing Advanced User Preferences in Component Installation

Ralf Treinen, Stefano Zacchiroli. International Workshop on Open Component Ecosystems (IWOCE 2009), August 24, 2009, Amsterdam, The Netherlands.

State of the art component-based software collections - such as FOSS distributions - are made of up to tens of thousands of components, with complex inter-dependencies and conflicts. Given a particular installation of such a system, each request to alter the set of installed components has potentially (too) many satisfying answers.

We present an architecture that allows expressing advanced user preferences about package selection in FOSS distributions. The architecture is composed of a distribution-independent format for describing available and installed packages called CUDF (Common Upgradeability Description Format), and a foundational language called MooML to specify optimization criteria. We present the syntax and semantics of CUDF and MooML, and discuss the partial evaluation mechanism of MooML, which allows package dependency solvers to gain efficiency.

[ACM Digital Library]

2008

Proving Group Protocols Secure Against Eavesdroppers

Steve Kremer, Antoine Mercier, Ralf Treinen. The 4th International Joint Conference on Automated Reasoning (IJCAR 2008), August 10-15, 2008, Sydney, Australia. Springer LNCS 5195, pages 116-131.

Security protocols are small programs designed to ensure properties such as secrecy of messages or authentication of parties in a hostile environment. In this paper we investigate automated verification of a particular type of security protocols, called group protocols, in the presence of an eavesdropper, i.e., a passive attacker. The specificity of group protocols is that the number of participants is not bounded.

Our approach consists in representing an infinite set of messages exchanged during an unbounded number of sessions, one session for each possible number of participants, as well as the infinite set of associated secrets. We use so-called visibly tree automata with memory and structural constraints (introduced recently by Comon-Lundh et al.) to represent over-approximations of these two sets. We identify restrictions on the specification of protocols which allow us to reduce the attacker capabilities guaranteeing that the above mentioned class of automata is closed under the application of the remaining attacker rules. The class of protocols respecting these restrictions is large enough to cover several existing protocols, such as the GDH family, GKE, and others.

A Spatial Equational Logic for the Applied pi-calculus

Jules Villard, Étienne Lozes, Ralf Treinen. Research Report LSV-08-10, Laboratoire Spécification et Vérification, ENS Cachan, France, March 2008. 44 pages.

Spatial logics have been proposed to reason locally and modularly on algebraic models of distributed systems. In this paper we investigate a spatial equational logic (ApiL) whose models are processes of the applied pi-calculus, an extension of the pi-calculus allowing term manipulation modulo a predefined equational theory, and wherein communications are recorded as active substitutions in a frame. Our logic allows us to reason locally either on frames or on processes, thanks to static and dynamic spatial operators. We study the logical equivalences induced by various relevant fragments of ApiL, and show in particular that the whole logic induces a coarser equivalence than structural congruence. We give characteristic formulae for this new equivalence as well as for static equivalence on frames. Going further into the exploration of ApiL's expressivity, we also show that it can eliminate standard term quantification, and that the model-checking problem for the adjunct-free fragment of ApiL can be reduced to satisfiability of a purely first-order logic of a term algebra.

[Postscript file]


2007

Intruder deduction for the equational theory of Abelian groups with distributive encryption

Pascal Lafourcade, Denis Lugiez, Ralf Treinen.

Information and Computation 205(4), pages 581-623 (2007)

Cryptographic protocols are small programs which involve a high level of concurrency and which are difficult to analyze by hand. The most successful methods to verify such protocols are based on rewriting techniques and automated deduction in order to implement or mimic the process calculus describing the execution of a protocol. We are interested in the intruder deduction problem, that is, vulnerability to passive attacks in the presence of equational theories which model the protocol specification and properties of the cryptographic operators.

In the present paper we consider the case where the encryption distributes over the operator of an Abelian group or over an exclusive-or operator. We prove decidability of the intruder deduction problem in both cases. We obtain a PTIME decision procedure in a restricted case, the so-called binary case.

These decision procedures are based on a careful analysis of the proof system modeling the deductive power of the intruder, taking into account the algebraic properties of the equational theories under consideration. The analysis of the deduction rules interacting with the equational theory relies on the manipulation of Z-modules in the general case, and on results from prefix rewriting in the binary case.
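
Concretely, the distributivity referred to above is the equational axiom (writing + for the Abelian group operation, respectively for exclusive or):

    enc(x + y, z) = enc(x, z) + enc(y, z)

taken together with the usual axioms of Abelian groups (respectively of exclusive or).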

[Postscript file]

Publication note: This paper extends an earlier conference version.


2006

Managing the Complexity of Large Free and Open Source Package-Based Software Distributions

Fabio Mancinelli, Jaap Boender, Roberto Di Cosmo, Jérôme Vouillon, Berke Durak, Xavier Leroy, Ralf Treinen.

Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering (ASE 2006), Tokyo, Japan, September 18-22, 2006, IEEE Computer Society, pages 199-208.

The widespread adoption of Free and Open Source Software (FOSS) in many strategic contexts of the information technology society has drawn attention to the issue of how to handle the complexity of assembling and managing a huge number of (packaged) components in a consistent and effective way. FOSS distributions (and in particular GNU/Linux-based ones) have always provided tools for managing the tasks of installing, removing and upgrading the (packaged) components they were made of. While these tools provide a (not always effective) way to handle these tasks on the client side, there is still a lack of tools that could help the distribution editors to maintain, on the server side, large and high-quality distributions. In this paper we present our research whose main goal is to fill this gap: we show our approach, the tools we have developed, and their application with experimental results. Our contribution provides an effective and automatic way to support distribution editors in handling those issues that were, until now, mostly addressed using ad-hoc tools and manual techniques.

[PDF file]

Symbolic Protocol Analysis in Presence of a Homomorphism Operator and Exclusive Or

Stéphanie Delaune, Pascal Lafourcade, Denis Lugiez, Ralf Treinen.

Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP 2006), July 9-16, 2006, San Servolo, Venice, Italy, edited by Michele Bugliesi, Bart Preneel, Vladimiro Sassone and Ingo Wegener, Springer LNCS 4052, pages 132-143.

Security of a cryptographic protocol for a bounded number of sessions is usually expressed as a symbolic trace reachability problem. We show that symbolic trace reachability for well-defined protocols is decidable in presence of the exclusive or theory in combination with the homomorphism axiom. These theories allow us to model basic properties of important cryptographic operators.

This trace reachability problem can be expressed as a system of symbolic deducibility constraints for a certain inference system describing the capabilities of the attacker. One main step of our proof consists in reducing deducibility constraints to constraints for deducibility in one step of the inference system. This constraint system, in turn, can be expressed as a system of quadratic equations of a particular form over Z/2Z[X], the ring of polynomials in one indeterminate over the finite field Z/2Z. We show that satisfiability of such systems is decidable.


2005

Résolution Symbolique de Contraintes

Ralf Treinen.

Habilitation Thesis, Université Paris-Sud, defended November 17, 2005 (in French).

[Postscript File]


2004

Easy Intruder Deduction

Hubert Comon-Lundh and Ralf Treinen.

Verification: Theory and Practice - Essays Dedicated to Zohar Manna on the Occasion of His 64th Birthday, edited by Nachum Dershowitz, Springer LNCS 2772, pages 225-242, 2004.

We investigate extensions of the Dolev-Yao model by some algebraic properties of cryptographic primitives. We provide sufficient conditions under which the intruder deduction problem is decidable (resp. decidable in polynomial time). We apply this result to the equational theory of homomorphism, and show that in this case the intruder deduction problem is linear, provided that the messages are in normal form.

Publication note: This supersedes an earlier version.

[Postscript file].


2002

The First-Order Theory of Subtyping Constraints

Zhendong Su, Alexander Aiken, Joachim Niehren, Tim Priesnitz, Ralf Treinen.

Proceedings of the 29th Annual Symposium on Principles of Programming Languages (POPL), Portland, January 16-18, 2002, pages 203-216.

We investigate the first-order theory of subtyping constraints. We show that the first-order theory of non-structural subtyping is undecidable, and we show that in the case where all constructors are either unary or nullary, the first-order theory is decidable for both structural and non-structural subtyping. The decidability results are shown by reduction to a decision problem on tree automata. This work is a step towards resolving long-standing open problems of the decidability of entailment for non-structural subtyping.

[ACM Digital Library]


First-Order Theories of Concrete Domains

Ralf Treinen.

Lecture Notes for the DEA Sémantique, Preuves et Langages (preparatory year for prospective PhD students).

These lecture notes address the problem of deciding the first-order theories of elementary data structures: natural, rational and real numbers, strings, and trees. This subject is hence part of the vast field of Automated Deduction. Our viewpoint, however, is in some sense complementary to the traditional problem setting.

The traditional problem in Automated Deduction could be phrased as follows: Under a given definition of a proof system (for instance, a Gentzen-style proof in first-order logic, or an equational proof in equational logic), design algorithms that take as input a set of axioms and a formula, and that construct a proof of the formula from the axioms. Proofs are to be searched for arbitrary axioms, even if algorithms can be specialized for certain classes of axioms.

Our approach will be different. Instead of the problem whether a formula is a consequence of a set of axioms, we will now always work in one fixed semantical structure. We may now ask whether a formula is valid in the structure under consideration, a question which corresponds to a formulas-as-theorems point of view. In a fixed semantical structure there is, however, another way to look at a formula: A formula with, say, n free variables now denotes the set of n-tuples of values that are solutions of the formula. This is the perspective of formulas as constraints. The problem is now, given a formula, to simplify the formula (that is, to find a formula that is equivalent in the structure under consideration and that is simpler in a sense to be made precise). The basic operation to be performed on constraints is then to decide whether they have a solution or not.
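
A tiny example of this change of perspective: over the natural numbers, the formula x + y = 2 with free variables x and y denotes its set of solutions {(0,2), (1,1), (2,0)}; simplification then means replacing a formula by a simpler one that is equivalent in the structure, for instance

    ∃y (x + y = 2)     simplifies to     x <= 2

and deciding satisfiability of the simplified constraint is immediate.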

[Postscript File (version 13/1/2002)]


2001

The First-Order Theory of Ordering Constraints over Feature Trees

Joachim Niehren, Martin Müller and Ralf Treinen.

Discrete Mathematics and Theoretical Computer Science, volume 4, number 2, September 2001, pages 193-234.

The system FT<= of ordering constraints over feature trees has been introduced as an extension of the system FT of equality constraints over feature trees. We investigate the first-order theory of FT<= and its fragments in detail, both over finite trees and over possibly infinite trees. We prove that the first-order theory of FT<= is undecidable, in contrast to the first-order theory of FT which is well-known to be decidable. We show that the entailment problem of FT<= with existential quantification is PSPACE-complete. So far, this problem has been shown decidable, coNP-hard in the case of finite trees, PSPACE-hard in the case of arbitrary trees, and solvable in cubic time when restricted to quantifier-free entailment judgments. To show PSPACE-completeness, we show that the entailment problem of FT<= with existential quantification is equivalent to the inclusion problem of non-deterministic finite automata.

Publication note: This supersedes an earlier version.

[DMTCS link]


Dominance Constraints: Algorithms and Complexity

Alexander Koller, Joachim Niehren and Ralf Treinen.

Logical Aspects of Computational Linguistics (LACL'98), Grenoble, France, December 14-16, 1998, Springer LNAI 2014, 2001, pages 106-125.

Dominance constraints for finite tree structures are widely used in several areas of computational linguistics including syntax, semantics, and discourse. In this paper, we investigate algorithmic and complexity questions for dominance constraints and their first-order theory. We present two NP algorithms for solving dominance constraints, which have been implemented in the concurrent constraint programming language Oz. The main result of this paper is that the satisfiability problem of dominance constraints is NP-complete. Despite this intractability result, the more sophisticated of our algorithms performs well in an application to scope underspecification. We also show that the existential fragment of the first-order theory of dominance constraints is NP-complete and that the full first-order theory has non-elementary complexity.

Postscript File (short version)


Undecidability of the Emptiness Problem of Reduction Automata with Component-Wise Tests

Ralf Treinen.

Draft Paper, June 27, 2001.

We show that the refinement of reduction automata to component-wise tests has an undecidable emptiness problem. This undecidability result even holds for automata that perform one single equality test, and where this test is performed at the root of the tree.

[Postscript File]


Grid Structures and Undecidable Constraint Theories

Franck Seynhaeve, Sophie Tison, Marc Tommasi and Ralf Treinen.

Theoretical Computer Science, volume 258, issues 1-2, May 2001, pages 453-490.

Publication note: This supersedes an earlier version.

We prove three new undecidability results for computational mechanisms over finite trees: There is a linear, ultra-shallow, noetherian and strongly confluent rewrite system R such that the E*A*-fragment of the first-order theory of one-step-rewriting by R is undecidable; the emptiness problem for tree automata with equality tests between cousins is undecidable; and the E*A*-fragment of the first-order theory of set constraints with the union operator is undecidable. The common feature of these three computational mechanisms is that they allow us to describe the set of first-order terms that represent grids. We extend our representation of grids by terms to a representation of linear two-dimensional patterns by linear terms, which allows us to transfer classical techniques on the grid to terms and thus to obtain our undecidability results.

[ Postscript File]


Constraints in Computational Logic

Edited by Hubert Comon, Claude Marché, and Ralf Treinen. Springer LNCS 2002, 2001.

CCL (Construction of Computational Logics and later Constraints in Computational Logic) is the name of an ESPRIT working group which met regularly from 1992 to 1999. It united research teams from Germany, France, Spain and Israel, and was managed by the company COSYTEC.

In its last years, the main emphasis of the working group was on constraints: techniques to solve and combine them, and applications ranging from industrial applications to logic programming and automated deduction. At the end of the working group, in fall 1999, we organized a summer school, intending to summarize the main advances achieved in the field during the last 7 years. The present book contains the (revised) lecture notes of this school. It contains six chapters, each written by members of the working group and covering the various aspects of constraints in computational logic. We intend it to be read by non-specialists, though prior knowledge of first-order logic and programming is probably necessary.

[LNCS 2002 page at Springer]


2000

On Rewrite Constraints and Context Unification

Joachim Niehren, Sophie Tison, and Ralf Treinen.

Information Processing Letters, volume 74, issues 1-2, pages 35-40, 2000.

We show that stratified context unification, which is one of the most expressive fragments of context unification known to be decidable, is equivalent to the satisfiability problem of slightly generalized rewriting constraints.

[ Postscript File (draft version)]


Predicate Logic and Tree Automata with Tests

Ralf Treinen.

Foundations of Software Science and Computation Structures (FOSSACS 2000), pages 329-343, March 27-30, 2000, Berlin, Germany. Springer LNCS 1784.

We investigate the question whether the well-known correspondence between tree automata and the weak second order logic of two successor functions (WS2S) can be extended to tree automata with tests. Our first insight is that there is no generalization of tree automata with tests that has a decidable emptiness problem and that is equivalent to the full class of formulae in some extension of WS2S, at least not when we are asking for a conservative extension of the classical correspondence between WS2S and tree automata to tree automata with tests.

As a consequence we can extend the correspondence between tree automata and WS2S to automata with tests only when we admit a restriction of the class of formulae. We present a logic, called WS2Sy, and a restriction of the class of formulae, called uniform, that is equivalent to tree automata with tests.

[Postscript File] [ Slides] [Draft of an Extended and Corrected Report]


1999

Constraints and Constraint Solving: An Introduction

Jean-Pierre Jouannaud and Ralf Treinen.

International Summer School on Constraints in Computational Logics, Gif-sur-Yvette, France, September 5-8, 1999.

The central idea of constraints is to compute with descriptions of data instead of computing with data items. Generally speaking, a constraint-based computation mechanism (in a broad sense, this includes for instance deduction calculi and grammar formalisms) can be seen as a two-tiered architecture, consisting of

  1. A language of data descriptions, and means to compute with data descriptions. Predicate logic is, for reasons that will be explained in this lecture, the natural choice as a framework for expressing data descriptions, that is, constraints.
  2. A computation formalism which operates on constraints by a well-defined set of operations. The choice of this set of operations typically is a compromise between what is desirable for the computation mechanism, and what is feasible for the constraint system.
In this introductory lecture we will try to explain basic concepts for constraint-based formalisms in an informal manner. In-depth treatment is delegated to the lectures dedicated to particular subjects and to the literature.

This lecture is organised as follows: We start with a small tour of constraint-based formalisms comprising constraint logic programming, constraints in automated deduction, constraint satisfaction problems, constrained grammars in computational linguistics, and constraint-based program analysis. In this first section, we will employ a somewhat naive view of constraint systems since we will assume that we have complete and efficient means to manipulate descriptions of data (that is, constraints). This allows us to cleanly separate calculus and constraints, which can be considered as an ideal situation.

In the following sections we will see that, in reality, things are not that simple. In Section 2 we will investigate a particular family of constraint systems, so-called feature constraints, in some depth. It will turn out that complete procedures to compute with constraints may be too costly and that we may have to refine our calculus to accommodate incomplete constraint handling. Section 3 will present such a refinement for constraint logic programming. In Section 4, finally, we will address the question of how the basic algorithms to compute with constraints can be implemented in the same programming calculus as the one that uses these algorithms.

Postscript File (updated version)


1998

How to Win a Game with Features.

Rolf Backofen and Ralf Treinen.

Information and Computation 142(1), pages 76-101, April 1998.

We employ the model-theoretic method of Ehrenfeucht-Fraïssé games to prove the completeness of the theory CFT which has been introduced in [Smolka&Treinen 94] for describing rational trees in a language of selector functions. The comparison to other techniques used in this field shows that Ehrenfeucht-Fraïssé games lead to simpler proofs.

Postscript File and BibTeX Entry.

Publication note: This supersedes an earlier version.


The First-Order Theory of Linear One-Step Rewriting is Undecidable

Ralf Treinen.

Theoretical Computer Science 208 (1-2), November 1998, pages 149-177.

The theory of one-step rewriting for a given rewrite system R and signature Sigma is the first-order theory of the following structure: its universe consists of all Sigma-ground terms, and its only predicate is the relation `x rewrites to y in one step by R'. The structure contains no function symbols and no equality. We show that there is no algorithm deciding the E*A*-fragment of this theory for an arbitrary linear and non-erasing term-rewriting system.

With the same technique we prove that the theory of encompassment plus one-step rewriting by the rule f(x) -> g(x), the theory of encompassment plus a WS2S-definable binary predicate, and the modal theory of one-step rewriting are undecidable.
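
For illustration, a typical sentence of the E*A*-fragment over this structure is

    ∃x ∀y ¬(x → y)

which states that some ground term is a normal form with respect to R; the result above says that there is no algorithm deciding this fragment for an arbitrary linear and non-erasing system.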

Postscript File of the tech report version, BibTeX Entry.

If you wish to obtain a (paper or electronic) copy of the final journal version please send me an email.

Publication note: This supersedes an earlier version.


An On-line Problem Database

Nachum Dershowitz and Ralf Treinen.

Proceedings of the 9th International Conference on Rewriting Techniques and Applications (RTA'98), LNCS, March/April 1998, edited by Tobias Nipkow.

This is an update of the RTA List of Open Problems. The most recent version of the problem list can be found on the RTA LooP home page.

Postscript File.


1997

Feature Trees over Arbitrary Structures

Ralf Treinen.

Chapter 7 of Specifying Syntactic Structures. Edited by Patrick Blackburn and Maarten de Rijke. CSLI Publications and FoLLI, Studies in Logic, Language and Information, pages 185-211, 1997.

This paper presents a family of first order feature tree theories, indexed by the theory of the feature labels used to build the trees. A given feature label theory, which is required to carry an appropriate notion of sets, is conservatively extended to a theory of feature trees with the predicates x[t]y (feature t leads from the root of tree x to the tree y), where we have to require t to be a ground term, and a definedness predicate (feature t is defined at the root of tree x). In the latter case, t might be a variable. Together with the notion of sets provided by the feature label theory, this yields a first-class status of arities.

We present a quantifier elimination procedure to reduce any sentence of the feature tree theory to an equivalent sentence of the feature label theory. Hence, if the feature label theory is decidable, the feature tree theory is too.

If the feature label theory is the theory of infinitely many constants and finite sets over infinitely many constants, we obtain an extension of the feature theory CFT, giving first-class status to arities. As another application, we obtain decidability of the theory of feature trees, where the feature labels are words, and where the language includes the successor function on words, lexical comparison of words, and first-class status of arities.

Postscript File and BibTeX Entry.


The First-Order Theory of Lexicographic Path Orderings is Undecidable

Hubert Comon and Ralf Treinen.

Theoretical Computer Science 176(1-2), April 1997, pages 67-87.

We show, under some assumption on the signature, that the E*A* fragment of the theory of a lexicographic path ordering is undecidable, both in the partial and in the total precedence cases. Our result implies in particular that the simplification rule of ordered completion is undecidable.

Postscript File, BibTeX Entry, Slides of CAAP'97.


1995

Constraint Deduction in an Interval-based Temporal Logic.

Jana Koehler and Ralf Treinen.

Executable Modal and Temporal Logics, Michael Fisher and Richard Owens, eds., Springer Lecture Notes in Artificial Intelligence, vol. 897, 1995, pages 103-117.

We describe reasoning methods for the interval-based modal temporal logic LLP which employs the modal operators sometimes, always, next, and chop. We propose a constraint deduction approach and compare it with a sequent calculus, developed as the basic machinery for the deductive planning system PHI which uses LLP as underlying formalism.

Postscript File and BibTeX Entry.


1994

Records for Logic Programming.

Gert Smolka and Ralf Treinen.

Journal of Logic Programming 18(3), April 1994, pp. 229-258.

CFT is a new constraint system providing records as logical data structure for constraint (logic) programming. It can be seen as a generalization of the rational tree system employed in Prolog II, where finer-grained constraints are used, and where subtrees are identified by keywords rather than by position.

CFT is defined by a first-order structure consisting of so-called feature trees. Feature trees generalize the ordinary trees corresponding to first-order terms by having their edges labeled with field names called features. The mathematical semantics given by the feature tree structure is complemented with a logical semantics given by five axiom schemes, which we conjecture to comprise a complete axiomatization of the feature tree structure.

We present a decision method for CFT, which decides entailment and disentailment between possibly existentially quantified constraints. Since CFT satisfies the independence property, our decision method can also be employed for checking the satisfiability of conjunctions of positive and negative constraints. This includes quantified negative constraints such as $\forall y\forall z(x\neq f(y,z))$.

The paper also presents an idealized abstract machine processing negative and positive constraints incrementally. We argue that an optimized version of the machine can decide satisfiability and entailment in quasi-linear time.
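
To illustrate the flavour of CFT constraints (schematic notation), a record-like value can be described by a conjunction such as

    person(x)  ∧  x{name, age}  ∧  x[name]y  ∧  x[age]z

stating that the root of x carries the label person, that x has exactly the features name and age, and that the corresponding subtrees are y and z; entailment then asks, for instance, whether such a constraint entails ∃w (x[age]w).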

Postscript File and BibTeX Entry.

Publication note: This supersedes an earlier version.


Ordering Constraints on Trees.

Hubert Comon and Ralf Treinen.

Colloquium on Trees in Algebra and Programming (CAAP '94), Edinburgh, Scotland, April 11-13, 1994, Springer LNCS 787, pages 1-14.

We survey recent results about ordering constraints on trees and discuss their applications. Our main interest lies in the family of recursive path orderings which enjoy the properties of being total, well-founded and compatible with the tree constructors. The paper includes some new results, in particular the undecidability of the theory of lexicographic path orderings in case of a non-unary signature.

Postscript File and BibTeX Entry.


1993

Feature Constraints with First Class Features.

Ralf Treinen.

1993 Conference on Mathematical Foundations of Computer Science, LNCS 711, Gdansk, Poland, Sep. 1993, pages 734-743.

Feature Constraint Systems have been proposed as a logical data structure for constraint (logic) programming. They provide a record-like view of trees by identifying subtrees by keyword rather than by position. Their atomic constraints are finer grained than in the constructor-based approach. The recently proposed system CFT [Smolka&Treinen94] in fact generalizes the rational tree system of Prolog II.

We propose a new feature constraint system EF which extends CFT by considering features as first class values. As a consequence, EF contains constraints like x[v]w where v is a variable ranging over features, while CFT restricts v to be a fixed feature symbol.

We show that the satisfiability of conjunctions of atomic EF-constraints is NP-complete. Satisfiability of quantifier-free EF-constraints is shown to be decidable, while the $\exists^*\forall^*\exists^*$ fragment of the first order theory is undecidable.

Postscript File and BibTeX Entry.


Equational and Membership Constraints for Infinite Trees.

Joachim Niehren, Andreas Podelski and Ralf Treinen.

Extended version of the paper in the Proceedings of the 5th International Conference on Rewriting Techniques and Applications (RTA'93), Springer LNCS 690, June 1993, pages 106-120.

We present a new constraint system with equational and membership constraints over infinite trees. It provides for complete and correct satisfiability and entailment tests and is therefore suitable for the use in concurrent constraint programming systems which are based on cyclic data structures.

Our set defining devices are greatest fixpoint solutions of regular systems of equations with a deterministic form of union. As the main technical particularity of the algorithms we present a novel memorization technique. We believe that both satisfiability and entailment tests can be implemented in an efficient and incremental manner.

Postscript File and BibTeX Entry (of the Proceedings paper).


1992

A New Method for Undecidability Proofs of First Order Theories.

Ralf Treinen.

Journal of Symbolic Computation 14(5), pages 437-457, November 1992.

We claim that the reduction of Post's Correspondence Problem to the decision problem of a theory provides a useful tool for proving undecidability of first order theories given by some interpretation. The goal of this paper is to define a framework for such reduction proofs. The method proposed is illustrated by proving the undecidability of the theory of a term algebra modulo the axioms of associativity and commutativity and of the theory of a partial lexicographic path ordering.
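
Recall the problem used as the source of the reductions: an instance of Post's Correspondence Problem is a finite list of pairs of words (u_1, v_1), ..., (u_n, v_n) over some alphabet, and the question is whether there is a non-empty sequence of indices i_1, ..., i_k such that

    u_{i_1} u_{i_2} ... u_{i_k}  =  v_{i_1} v_{i_2} ... v_{i_k}

Since this question is undecidable, any theory in which the existence of such a sequence can be uniformly expressed by a formula is undecidable as well.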

Postscript File and BibTeX Entry.

Publication note: This supersedes an earlier version.


1991

First Order Data Types and First Order Logic.

Ralf Treinen.

Extended version of the paper in Theoretical Aspects of Computer Software 1991, LNCS 526.

This paper concerns the relation between parameterized first order data types and first order logic. Augmenting first order logic by data type definitions yields in general a strictly stronger logic than first order logic. We consider some properties of the new logic for fixed data type definitions. While our new logic always fulfills the downward Skolem-Löwenheim property, compactness is fulfilled if and only if, for the given data type definition, the new logic has the same expressive power as first order logic. We show that this last property is undecidable.

Postscript File and BibTeX Entry (of the Proceedings paper).


Ralf Treinen July 18, 2014.