Publications

Journal articles

On software component co-installability
(with Roberto Di Cosmo). In ACM Trans. Softw. Eng. Methodol. 22(4): 34 (2013).

Modern software systems are built by composing components drawn from large repositories. A fundamental challenge for the maintainability and the scalability of such software systems is the ability to quickly identify the components that can or cannot be installed together: this is the co-installability problem, which is related to boolean satisability and is known to be algorithmically hard. This paper develops a novel theoretical framework that allows us to associate to each concrete component repository a much smaller one with a simpler structure with equivalent co-installability properties. This repository can be displayed in a way that provides a concise view of the co-installability issues in the original repository, or used as a basis for various algorithms related to co-installability.

From bytecode to JavaScript: the Js_of_ocaml compiler
(with Vincent Balat). In Software: Practice and Experience, 2013.

We present the design and implementation of a compiler from OCaml bytecode to JavaScript. The compiler first translate the bytecode into an SSA intermediate representation on which optimizations are performed, before generating JavaScript. We believe that taking bytecode as input instead of a high-level language is a sensible choice. Virtual machines provide a very stable API. Such a compiler is thus easy to maintain. It is also convenient to use: it can just be added to an existing installation of the development tools. Already compiled libraries can be used directly, with no need to reinstall anything. Finally, some virtual machines are the target of several languages. A bytecode to JavaScript compiler would make it possible to retarget all these languages to Web browsers at once.

A Solution to the POPLMARK Challenge Based on de Bruijn Indices
In J. Autom. Reasoning, 49(3): 327-362 (2012)

The PoplMark challenge proposes a set of benchmarks intended to assess the usability of proof assistants in the context of research on programming languages. It is based on the metatheory of System Fsub. We present a solution to the challenge using de Bruijn indices, developed with the Coq proof assistant.

Regular Expression Types for XML
(with Haruo Hosoya, and Benjamin Pierce). In ACM Transactions on Programming Languages and Systems, 27(1):46–90, January 2005. A preliminary version appeared in the proceedings of the International Conference on Functional Programming (ICFP), 2000.

We propose regular expression types as a foundation for statically typed XML processing languages. Regular expression types capture and generalize regular expression notations such as repetition, alternation, etc., that are commonly found in schema languages for XML, and support a powerful notion of subtyping. We argue that the flexibility provided by this form of subtyping is useful in supporting smooth evolution of XML-based systems.

Objective ML: An effective object-oriented extension to ML
(with Didier Rémy). Theory and Practive of Object Systems, 4(1):27–50, 1998. A preliminary version appeared in the proceedings of the 24th ACM Conference on Principles of Programming Languages, 1997.

Objective ML is a small practical extension to ML with objects and top level classes. It is fully compatible with ML; its type system is based on ML polymorphism, record types with polymorphic access, and a better treatment of type abbreviations. Objective ML allows for most features of object-oriented languages including multiple inheritance, methods returning self and binary methods as well as parametric classes. This demonstrates that objects can be added to strongly typed languages based on ML polymorphism.

Conference papers

Easing Software Component Repository Evolution
(with Roberto Di Cosmo and Mehdi Dogguy) ICSE 2014.

Modern software systems are built by composing components drawn from large repositories. Maintaining and evolving these software collections is a complex task, and a strict qualication process needs to be enforced to ensure that the modications accepted into the reference repository do not disrupt its useability. We studied in depth the Debian software repository, one of the largest and most complex existing ones, which uses several separate repositories to stage the acceptance of new components, and we developed comigrate, an extremely efficient tool that is able to identify the largest sets of components that can migrate to the reference repository without violating its quality constraints. This tool outperforms signicantly existing tools, and provides detailed information that is crucial to understand the reasons why some components cannot migrate. Extensive validation on the Debian distribution has been performed. The core architecture of the tool is quite general, and can be easily adapted to other software repositories.

Broken Sets in Software Repository Evolution
(with Roberto Di Cosmo) ICSE 2013.

Modern software systems are built by composing components drawn from large repositories. Software systems built with components from a release of a repository should be seamlessly upgradeable using components from the next release. Unfortunately, users are often confronted with sets of components that were installed together, but cannot be upgraded together to the latest version from the new repository. Identifying these broken sets can be of great help for a quality assurance team, that could examine and fix these issues well before they reach the end user. Building on previous work on component co-installability, we show that it is possible to find these broken sets for any two releases of a component repository, computing extremely efficiently a concise representation of these upgrade issues, together with informative graphical explanations. A tool implementing the algorithm presented in this paper is available as free software, and is able to process the evolution between two major releases of the Debian GNU/Linux distribution in just a few seconds. These results make it possible to integrate seamlessly this analysis in a repository development process.

On software component co-installability
(with Roberto Di Cosmo). ESEC/FSE 2011.

Modern software systems are built by composing components drawn from large repositories. A fundamental challenge for the maintainability and the scalability of such software systems is the ability to quickly identify the components that can or cannot be installed together: this is the co-installability problem, which is related to boolean satisability and is known to be algorithmically hard. This paper develops a novel theoretical framework that allows us to associate to each concrete component repository a much smaller one with a simpler structure with equivalent co-installability properties. This repository can be displayed in a way that provides a concise view of the co-installability issues in the original repository, or used as a basis for various algorithms related to co-installability.

Experience Report: Ocsigen, a Web Programming Framework
(with Vincent Balat and Boris Yakobowski). ICFP 2009

The evolution of Web sites towards very dynamic applications makes it necessary to reconsider current Web programming technologies. We believe that Web development would benefit greatly from more abstract paradigms and that a more semantical approach would result in huge gains in expressiveness. In particular, functional programming provides a really elegant solution to some important Web interaction problems, but few frameworks take advantage of it. The Ocsigen project is an attempt to provide global solutions to these needs. We present our experience in designing this general framework for Web programming, written in Objective Caml. It provides a full featured Web server and a framework for programming Web applications, with the aim of improving expressiveness and safety. This is done by taking advantage of functional programming and static typing as much as possible.

A Very Modal Model of a Modern, Major, General Type System
(with Andrew W. Appel, Paul-André Melliès, and Christopher D. Richards). POPL 2007

We present a clean and powerful model of recursive and impredicatively quantified types with mutable references. We interpret in this model all of the type constructors needed for typed intermediate languages and typed assembly languages used for object-oriented and functional languages. We establish in this purely semantic fashion a soundness proof of the typing systems underlying these TILs and TALs—ensuring that every well-typed program is safe. The technique is generic, and applies to any small-step semantics including λ-calculus, labeled transition systems, and von Neumann machines. It is also simple, and reduces mainly to defining a Kripke semantics of the Gödel-Löb logic of provability. We have machine-checked proofs in Coq.

Managing the Complexity of Large Free and Open Source Package-Based Software Distributions
(with Fabio Mancinelli, Jaap Boender, Roberto Di Cosmo, Berke Durak, Xavier Leroy, and Ralf Treinen). ASE 2006

The widespread adoption of Free and Open Source Software (FOSS) has lead to a freer and more agile marketplace where there is a higher number of components that can be used to build systems in many original and often unforeseen ways. One of the most prominent examples of complex systems built with FOSS components are GNU/Linux-based distributions. In this paper we present some tools that aim at helping distribution editors with maintaining the huge package bases associated with these distributions, and improving their quality, by detecting errors and inconsistencies in an effective, fast and automatic way.

Polymorphic Regular Tree Types and Patterns
Proceedings of the 33th ACM Conference on Principles of Programming Languages, Charleston, USA, January 2006.

We propose a type system based on regular tree grammars, where algebraic datatypes are interpreted in a structural way. Thus, the same constructors can be reused for different types and a flexible subtyping relation can be defined between types, corresponding to the inclusion of their semantics. For instance, one can define a type for lists and a subtype of this type corresponding to lists of even length. Patterns are simply types annotated with binders. This provides a generalization of algebraic patterns with the ability of matching arbitrarily deep in a value. We put the emphasize on simplicity and try to keep the specification and the algorithms as straightforward as possible. Our main contribution, compared to languages such as XDuce and CDuce, is that we are able to deal with both polymorphism and function types.

Recursive polymorphic types and parametricity in an operational framework
(with Paul-André Melliès) In Proceedings of the 20th IEEE Symposium on Logic in Computer Science, pages 82–91, Chicago, IL, USA, June 2005.

We construct a realizability model of recursive polymorphic types, starting from an untyped language of terms and contexts. An orthogonality relation e⊥π indicates when a term e and a context π may be safely combined in the language. Types are interpreted as sets of terms closed by biorthogonality. Our main result states that recursive types are approximated by converging sequences of interval types. Our proof is based on a “type-directed” approximation technique, which departs from the “language-directed” approximation technique developed by MacQueen, Plotkin and Sethi in the ideal model. We thus keep the language elementary (a call-by-name λ-calculus) and unstratified (no typecase, no reduction labels). We also include a short account of parametricity, based on an orthogonality relation between quadruples of terms and contexts.

Subtyping Union Types
In J. Marcinkowski and A. Tarlecki, editors, 18th International Workshop CSL 2004, volume 3210 of Lecture Notes in Computer Science, page 415–429. Springer Verlag, September 2004. A longer unpublished version, with full proofs and additional material is also available.

Subtyping can be fairly complex for union types, due to interactions with other types, such as function types. Furthermore, these interactions turn out to depend on the calculus considered: for instance, a call-by-value calculus and a call-by-name calculus will have different possible subtyping rules. In order to abstract ourselves away from this dependence, we consider a fairly large class of calculi. We define types in a semantic fashion, as sets of terms. Then, a type can be a subtype of another type if its denotation is included in the denotation of the other type. We first consider a simple type system with union, function, pair and constant types. Using inference rules, we specify a subtyping relation which is both sound and complete with respect to the class of calculi.

Semantic Types: A fresh look at the ideal model for types
(with Paul-André Melliès). In Proceedings of the 31th ACM Conference on Principles of Programming Languages, pages 52–63, Venezia, Italia, January 2004.

We present a generalization of the ideal model for recursive polymorphic types. Types are defined as sets of terms instead of sets of elements of a semantic domain. Our proof of the existence of types as a solution of a fixpoint does not rely on metric spaces properties, but instead uses the fact that the identity is the limit of projection terms. This establishes a connection with the work of Pitts on relational properties of domains. We suggest that the right generalization of ideals are closed sets of terms defined by orthogonality with respect to a set of contexts.

Combining subsumption and binary methods: An object calculus with views
In Proceedings of the 28th ACM Conference on Principles of Programming Languages, pages 290–303, 2001

We present an object-oriented calculus which allows arbitrary hiding of methods in prototypes, even in the presence of binary methods and friend functions. This combination of features permits complete control of the interface a class exposes to the remainder of a program (which is of key importance for program readibility, security and ease of maintenance), while still allowing complex interactions with other classes belonging to a same module or software component.

Regular Expression Types for XML
(with Haruo Hosoya, and Benjamin Pierce). In Proceedings of the International Conference on Functional Programming (ICFP), pages 11–22, 2000

Objective ML: A simple object-oriented extension of ML
(with Didier Rémy). In Proceedings of the 24th ACM Conference on Principles of Programming Languages, pages 40–53, Paris, France, January 1997.

Thesis report

Conception et réalisation d'une extension du langage ML avec des objets
October 2000.

Other material

Macaque: interrogation sûre et flexible de base de données depuis OCaml
JFLA 2012

Macaque is an OCaml library providing an interface to SQL databases. Modular, statically typed queries can be expressed. A Camlp4 syntax extension brings a concrete syntax using comprehensions. Subtle properties of SQL values, such as nullability, are encoded into the OCaml type system using phantom types

Lwt: a cooperative thread library
ML 2008

We present a cooperative thread library for Objective Caml. The library is entirely written in Objective Caml and does not rely on any external C function. Programs involving threads are written in a monadic style. This makes it possible to write threaded code almost as regular ML code, even though it has a different semantics. Cooperative threads are especially well suited for concurrent network applications, where threads perform little computation and spend most of their time waiting for input or output, at which time other threads can run. This library has been successfully used in the Unison file synchronizer and the Ocsigen Web server

Polymorphism and XDuce-style patterns
Informal proceedings of the PLAN-X'06 workshop, January 2006.

We present an extension of XDuce, a programming language dedicated to the processing of XML documents, with polymorphism and abstract types, two crucial features for programming in the large. We show that this extension makes it possible to deal with first class functions and eases the interoperability with other languages. A key mechanism of XDuce is its powerful pattern matching construction and we mainly focus on this construction and its interaction with abstract types. Additionally, we present a novel type inference algorithm for XDuce patterns, which works directly on the syntax of patterns.

Subtyping Union Types (draft)
May 2004.

Subtyping can be fairly complex for union types, due to interactions with other types, such as function types. Furthermore, these interactions turn out to depend on the calculus considered: for instance, a call-by-value calculus and a call-by-name calculus will have different possible subtyping rules. In order to abstract ourselves away from this dependence, we consider a fairly large class of calculi. We define types in a semantic fashion, as sets of terms. Then, a type can be a subtype of another type if its denotation is included in the denotation of the other type. We first consider a simple type system with union, function, pair and constant types. Using inference rules, we specify a subtyping relation which is both sound and complete with respect to the class of calculi. We then extend this result to a richer type system with ML-style polymorphism and type constructors. We expect this framework to allow the study of subtyping relations that only hold for some calculi by restricting the class considered, and to allow the study of subtyping relations for richer type systems by enriching the class.

What's in Unison? A Formal Specification and Reference Implementation of a File Synchronizer
(with Benjamin C. Pierce) Technical Report MS-CIS-03-36, University of Pennsylvania, February 2004.

A file synchronizer is a tool that reconciles disconnected modifications to a replicated directory structure. Trustworthy synchronizers are difficult to build, since they must deal correctly with both the semantic complexities of file systems and the unpredictable failure modes arising from distributed operation. On the other hand, synchronizers are often packaged as stand-alone, user-level utilities, whose intended behavior is relatively easy to isolate from the other functions of the system. This combination of subtlety and isolability makes file synchronizers attractive candidates for precise mathematical specification. We present here a detailed specification of a particular file synchronizer called Unison.

The reality of virtual types for free!
(with Didier Rémy). Unpublished note available electronically, October 1998.

We show, mostly through detailed examples, that programming patterns known to involve the notion of virtual types can be implemented in a real object-oriented language—Ocaml—in a direct way by taking advantage of parametric classes and a flexible treatment of object types via the use of row variables.

Using modules as classes
Informal proceedings of the FOOL'5 workshop, January 1998.

We show how classes can be combined with a ML-like module system, so that the same constructions be used for both classes and modules. For that, we are brought to slightly extend the module system with implicit type quantification of module types.