[Go to contents] [Go to menu]

Type-safe Unmarshalling for Objective Caml

This page is outdated. See work in progress.

[version française]

Example

# let v = (1.2, 1);;
v : ( float * int ) = (1.2 , 1)
# let str = Marshal.to_string v [] ;;
str : string = "..."
# SafeUnmarshal.from_string [^ ( float * int ) ^] str 0 ;;
- : ( float * int ) = (1.2, 1)
# SafeUnmarshal.from_string [^ ( int * float ) ^] str 0 ;;
Exception: SafeUnmarshal.Fail

Here is a more complete example.

If you are in a hurry :
./make_source_tree.sh ocaml-ty && cd ocaml-ty && ./configure && make world && make -C otherlibs/safe_unmarshaling top

Principle

This type-safe unmarshalling does not need any modification of the OCaml runtime, and in particular uses the current format of marshalled values. The unmarshalling function is correctly typed thanks to an extra argument that represents the expected type of the value to be read. The dynamic compatibility check of the value against the type is performed by going through the whole value, interpreting cycles and sharing as polymorphic type constraints.

This imposed to add the following.

The checking algorithm is then implemented by a library named  SafeUnmarshal. For more information on how the algorithm works and its properties, see the references.

How can I test?

Beware: this code has an alpha status... and its correctness has been proved only with paper and pencil. For new bugs, write to: gregoire.henry AT pps.jussieu.fr

The code is available as two patches (1) and then (2) for the 3.09 branch of the Objective Caml compiler. It is necessary to bootstrap the compiler before and after applying the second patch: make core && make bootstrap -- or, alternatively, to copy these already bootstrapped versions of ocamlc and  ocamllex in the  ocaml/boot/ directory (before or after applying patches ... but before compiling!).

To get an already patched version of the INRIA compiler that is directly compilable, choose among the following:

Complete Examples

Loading.

$ ocaml safeUnmarshal.cma
         Objective Caml version 3.09.2+ty1

# open SafeUnmarshal;;
# from_string;;
- : 'a tyrepr -> string -> int -> 'a

A test function.

# let test ty obj =
  try
     let _ = from_string ty (Marshal.to_string obj []) 0 in true
  with Fail -> false;;
- : 'a tyrepr -> 'b -> bool = <fun>
# test [^ int ^] 3;;
- : bool = true
# test [^ int ^] 3.4;;
- : bool = false
# test [^ int ^] false;;
- : bool = true

Sum types.

# type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree;;
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
# let t = Node (Node (Leaf, 3, Leaf), 5, Leaf);;
val t : int tree = Node (Node (Leaf, 3, Leaf), 5, Leaf)
# test [^ int tree ^] t;;
- : bool = true
# test [^ bool tree ^] t;;
- : bool = false

Record types.

# type point = { x: int; y: int };;
type point = { x : int; y : int; }
# type point3D = { a: int; b: int; c:int };;
type point3D = { a : int; b : int; c : int; }
# let p = { x=4; y=0 };;
val p : point = {x = 4; y = 0}
# test [^ point ^] p;;
- : bool = true
# test [^ point3D ^] p;;
- : bool = false

Abstract types.

# module M : sig type t val v : t end = struct
    type t = (int * float) list
    let v = [(1,1.2);(4,3.)]
  end;;
module M : sig type t val v : t end
# test [^ M.t ^] M.v;;
- : bool = true
# test [^ M.t ^] p;;
- : bool = false

Sharing and cycles

# type ('node, 'leaf) btree =
   | Node of ('node, 'leaf) btree * 'node * ('node, 'leaf) btree
   | Leaf of 'leaf ;;
type ('a, 'b) btree = Leaf of 'b | Node of ('a, 'b) btree * 'a * ('a, 'b) btree
# let l = Leaf 3;;
val l : ('a, int) btree = Leaf 3
# let t = Node (l, 5.2, l), Node (l, 'a', l);;
val t : (float, int) btree * (char, int) btree =
(Node (Leaf 3, 5.2, Leaf 3), Node (Leaf 3, 'a', Leaf 3))
# test [^ (float, int) btree * (char, int) btree ^] t;;
- : bool = true
# let rec t2 = Node ( t2, "E", t2 );;
val t2 : (string, 'a) btree = ...
# test [^ (string, 'a) btree ^] t2;;
- : bool = true
# test [^ (string, int) btree * (string, float) btree ^] (t2, t2);;
- : bool = true

Ref type.

# let r = ref [];;
val r : '_a list ref = {contents = []}
# test [^ float list ref ^] r;;
- : bool = true
# test [^ int list ref ^] r;;
- : bool = true
# test [^ int list ref * float list ref ^] (r,r);;
- : bool = false

Strange conversions

# copy [^ int ^] true;;
- : int = 1
# copy [^ bool ^] 0;;
- : bool = false
# copy [^ bool ^] 2;;
Exception: SafeUnmarshal.Fail

# copy [^ point ^] (1,2);;
- : point = {x = 1; y = 2}
# copy [^ point3D ^] (1,2);;
Exception: SafeUnmarshal.Fail

# copy [^ M.t ^] [];;
- : M.t = <abstr>
# copy [^ M.t ^] [(1,2.3)];;
- : M.t = <abstr>
# copy [^ M.t ^] [1;2;3];;
Exception: SafeUnmarshal.Fail

# copy [^ string tree ^] t2;;
- : string tree = ...
# copy [^ string tree ^] l;;
Exception: SafeUnmarshal.Fail.

Features and Limitations

Checkable types include all primitive types (excepted format4, lazy_t, exn and ->) as well as all definable sum and record types.

Abstract types can be checked when they are implemented in OCaml, but not when they correspond to "Custom" blocks.

When compiling a project that contains an interface file without implementation, it is necessary to either create an implementation (for instance, a symbolic link to the .mli file) or use the -noty compiler option for compiling the .ml files that depend on this interface.

Support for polymorphic variants is still incomplete.

Variance information are not taken into account.

The checking algorithm cannot check closures: it is therefore impossible to safely unmarshall functions or objects with this mechanism.

There is no patterm-matching matching on values of type  'a tyrepr.

Clearly, the syntax of values of type  'a tyrepr could be extended with an anti-quotation mechanism. In a future version.

References

This code comes from the Master internship that I did in 2005 under the supervision of Michel Mauny and Emmanuel Chailloux. The report is available (in French) with my colleagues' on the MPRI web site. Since then, a paper -- probably more readable, but in French -- has been published at JFLA'2006. See also the corresponding slides.