Master project description
TITLE: Semantic subtyping in OCaml
This project is proposed in collaboration with the LRI .

The goal of this internship is to overcome several limitations due to the use of polymorphic variants in OCaml and to set the theoretical bases for studying the usage of semantic subtyping in OCaml. In particular, in this internship we will address the integration of intersection types in OCaml. The motivation of such an integration is twofold: to specify in a more refined way the behavior of polymorphic functions defined by pattern matching and to endow the OCaml language with bounded polymorphism. In this internship the student will thus be required to establish sound formal foundations of an extension of the OCaml type system and then to implement them in a modified version of the OCaml compiler.
More in details, let's consider this simple function definition in OCaml
let foo = function
 | `One as x -> x
 | `Two as x -> x
after which OCaml returns the message
val foo : [< `One | `Two ] -> [> `One | `Two ] = <fun>
This message states that the function accepts any argument with variants `One and `Two and returns a result having at least variants `One and `Two. A first problem is that if we force the type of foo to be [< `One ] -> [> `One ]:
let foo : [< `One ] -> [> `One ] = function
 | `One as x -> x
 | `Two as x -> x
OCaml returns:
let foo : [< `One ] -> [> `One ] = function
   | `One as x -> x;;
   | `Two as x -> x
      ---
Warning 11: this match case is unused.
val foo : [< `One ] -> [> `One | `Two ] = <fun>
Thus OCaml realizes that the second case of the match expression is not used when the argument is of type [< `One ]. Nevertheless this case is taken into account in the return type, since the type deduced for foo is [< `One ] -> [> `One | `Two ]. Apart from the fact that the type declaration of foo [< `One ] -> [> `One ] though sound is not directly compatible with such a deduction since [< `One ] -> [> `One ] is not a super-type of [< `One ] -> [> `One | `Two ] (this is due to the instanciation of a hidden type variable), the main problem is that such a deduction makes it impossible to use foo wherever a function of type [< `One ] -> [> `One ] is expected. There exist some situations in which we would like to use foo with such a type, an this would always be "safe". In order to avoid code duplication, the idea proposed for this internship is to use an intersection type (denoted in what follows by the ampersand symbol &)
let foo : ([< `One ] -> [> `One ]) 
        & ([< `One | `Two ] -> [> `One | `Two ]) = function
   | `One as x -> x
   | `Two as x -> x
which declares (and instructs the type checker to verify) that foo has both the type [< `One ] -> [> `One ] and the type [< `One | `Two ] -> [> `One | `Two ], and that, hence, foo can be used wherever these types are expected. It is important to notice that here we aim at defining intersection types in a way that is far more general than the current version used in OCaml where intersection has quite a limited use.

The use of polymorphic variants make foo lose its polymorphic nature. Since foo is the identity function, we would like to type it by the following type

   foo : (type a < [`One | `Two ]) 'a -> 'a 
accoding to which foo has all types obtained by instatiating 'a by a subtype of [`One | `Two ]. This kind of quantification is called "bounded quantification", since the type variable can be instantiated only by the subtypes of a given type (the bound). Once more such a behaviour can be obtained by resorting to intersection types, since the above bounded quantification can be considered as syntactic sugar for
   foo : ('a & [`One | `Two ]) -> ('a & [`One | `Two ])
Thus the extension to intersection types make it possible to model two mechanisms (the loss of polymorphism due to the use of polymorphic variants and the possibilitity to refin the type of a function defined by pattern matching) in principle rather different. And this all can be obtained by the type system itself (as such a linguistic level and not at a meta-level as the bounded quantification defined by François Pottier in his PhD dissertation). In conclusion this internship aims at studying the integration of intersection types in OCaml and possimbly implement them in a modified version of the OCaml compiler. This work will pave the way, in the long term, to study the integration of semantic subtyping (union types included) in languages of the ML family.

Optional: According to the advancement state of the project, the intern will also explore using intersection types to generalize or complete polymorphism of record types based on row variables. Thus the project will study the best way to integrate the two kinds of polymorphism

Required background
Acquaintance with OCaml and type theory is mandatory.
References
  • G. Castagna and Z. Xu: Set-theoretic Foundation of Parametric Polymorphism and Subtyping. In ICFP '11: 16th ACM-SIGPLAN International Conference on Functional Programming, pag. 94-106, September, 2011.
  • G. Castagna, K. Nguyễn, Z. Xu, H. Im, S. Lenglet, and L. Padovani: Polymorphic Functions with Set-Theoretic Types. Part 1: Syntax, Semantics, and Evaluation. In POPL '14, 41th ACM Symposium on Principles of Programming Languages, January, 2014.
  • G. Castagna, K. Nguyễn, Z. Xu, and P. Abate: Polymorphic Functions with Set-Theoretic Types. Part 2: Local Type Inference and Type Reconstruction, November, 2013
  • A. Frisch, G. Castagna, and V. Benzaken: Semantic Subtyping: dealing set-theoretically with function, union, intersection, and negation types. Journal of the ACM, vol. 55, n. 4, pag. 1―64, 2008.
  • G. Castagna and A. Frisch: A gentle introduction to semantic subtyping. In Proceedings of PPDP '05, the 7th ACM SIGPLAN International Symposium on Principles and Practice of Declarative Programming, pages 198-208, ACM Press (full version) and ICALP '05, 32nd International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science n. 3580, pages 30-34, Springer (summary), July, 2005. Joint ICALP-PPDP keynote talk.
Contacts
Giuseppe Castagna : PPS Laboratory, Université Paris 7.
Kim Nguyễn : LRI .