Type-safe Unmarshalling for Objective Caml
This page is outdated. See work in progress.
[version française]
Example
# let v =
(1.2,
1);;
v :
( float *
int )
= (1.2
, 1)
# let str =
Marshal.to_string
v []
;;
str : string
= "..."
# SafeUnmarshal.from_string
[^ ( float *
int ) ^]
str 0
;;
- :
( float *
int )
= (1.2, 1)
# SafeUnmarshal.from_string
[^ ( int *
float ) ^]
str 0
;;
Exception: SafeUnmarshal.Fail
Here is a more complete example.
If you are in a hurry :
./make_source_tree.sh ocaml-ty
&& cd ocaml-ty
&& ./configure
&& make world
&& make -C otherlibs/safe_unmarshaling top
Principle
This type-safe unmarshalling does not need any modification of the
OCaml runtime, and in particular uses the current format of marshalled
values. The unmarshalling function is correctly typed thanks to an
extra argument that represents the expected type of the value to be
read. The dynamic compatibility check of the value against the type is
performed by going through the whole value, interpreting cycles and
sharing as polymorphic type constraints.
This imposed to add the following.
- To the language,
- a syntax construct for representing static types as
runtime values:
- a predefined parameterized type
'a tyrepr
such that the value
[^
τ ^]
has type
τ tyrepr
;
- To the standard library:
- a module Ty for matching and
explore values of type 'a
tyrepr.
The checking algorithm is then implemented by a library
named
SafeUnmarshal.
For more information on how the algorithm works and its properties,
see the references.
How can I test?
Beware: this code has an alpha
status... and its correctness has been proved only with
paper and pencil. For new bugs, write to:
gregoire.henry AT pps.jussieu.fr
The code is available as two patches (1)
and then (2) for the 3.09 branch of the
Objective Caml compiler. It is necessary to bootstrap the
compiler before
and after applying the
second patch:
make core && make bootstrap --
or, alternatively, to copy these already bootstrapped versions of
ocamlc and
ocamllex in the
ocaml/boot/ directory (before or after applying patches ...
but before compiling!).
To get an already patched version of the INRIA compiler that is
directly compilable, choose among the following:
- use this small script :
make_source_tree.sh
dest_dir, that fetches the CVS version of OCaml, from
which the patches have been generated, and then patches it
and puts the bootstrapped version at the right place!
- download this big -- but complete --
tgz archive, and
follow the instructions contained in
the README.
Complete Examples
Loading.
$ ocaml safeUnmarshal.cma
Objective Caml version 3.09.2+ty1
# open SafeUnmarshal;;
# from_string;;
- : 'a tyrepr -> string -> int -> 'a
A test function.
# let test ty obj =
try
let _ = from_string ty (Marshal.to_string obj []) 0 in true
with Fail -> false;;
- : 'a tyrepr -> 'b -> bool = <fun>
# test [^ int ^] 3;;
- : bool = true
# test [^ int ^] 3.4;;
- : bool = false
# test [^ int ^] false;;
- : bool = true
Sum types.
# type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree;;
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
# let t = Node (Node (Leaf, 3, Leaf), 5, Leaf);;
val t : int tree = Node (Node (Leaf, 3, Leaf), 5, Leaf)
# test [^ int tree ^] t;;
- : bool = true
# test [^ bool tree ^] t;;
- : bool = false
Record types.
# type point = { x: int; y: int };;
type point = { x : int; y : int; }
# type point3D = { a: int; b: int; c:int };;
type point3D = { a : int; b : int; c : int; }
# let p = { x=4; y=0 };;
val p : point = {x = 4; y = 0}
# test [^ point ^] p;;
- : bool = true
# test [^ point3D ^] p;;
- : bool = false
Abstract types.
# module M : sig type t val v : t end = struct
type t = (int * float) list
let v = [(1,1.2);(4,3.)]
end;;
module M : sig type t val v : t end
# test [^ M.t ^] M.v;;
- : bool = true
# test [^ M.t ^] p;;
- : bool = false
Sharing and cycles
# type ('node, 'leaf) btree =
| Node of ('node, 'leaf) btree * 'node * ('node, 'leaf) btree
| Leaf of 'leaf ;;
type ('a, 'b) btree = Leaf of 'b | Node of ('a, 'b) btree * 'a * ('a, 'b) btree
# let l = Leaf 3;;
val l : ('a, int) btree = Leaf 3
# let t = Node (l, 5.2, l), Node (l, 'a', l);;
val t : (float, int) btree * (char, int) btree =
(Node (Leaf 3, 5.2, Leaf 3), Node (Leaf 3, 'a', Leaf 3))
# test [^ (float, int) btree * (char, int) btree ^] t;;
- : bool = true
# let rec t2 = Node ( t2, "E", t2 );;
val t2 : (string, 'a) btree = ...
# test [^ (string, 'a) btree ^] t2;;
- : bool = true
# test [^ (string, int) btree * (string, float) btree ^] (t2, t2);;
- : bool = true
Ref type.
# let r = ref [];;
val r : '_a list ref = {contents = []}
# test [^ float list ref ^] r;;
- : bool = true
# test [^ int list ref ^] r;;
- : bool = true
# test [^ int list ref * float list ref ^] (r,r);;
- : bool = false
Strange conversions
# copy [^ int ^] true;;
- : int = 1
# copy [^ bool ^] 0;;
- : bool = false
# copy [^ bool ^] 2;;
Exception: SafeUnmarshal.Fail
# copy [^ point ^] (1,2);;
- : point = {x = 1; y = 2}
# copy [^ point3D ^] (1,2);;
Exception: SafeUnmarshal.Fail
# copy [^ M.t ^] [];;
- : M.t = <abstr>
# copy [^ M.t ^] [(1,2.3)];;
- : M.t = <abstr>
# copy [^ M.t ^] [1;2;3];;
Exception: SafeUnmarshal.Fail
# copy [^ string tree ^] t2;;
- : string tree = ...
# copy [^ string tree ^] l;;
Exception: SafeUnmarshal.Fail.
Features and Limitations
Checkable types include all primitive types (excepted
format4,
lazy_t,
exn and
->)
as well as all definable sum and record types.
Abstract types can be checked when they are implemented in OCaml,
but not when they correspond
to "Custom" blocks.
When compiling a project that
contains an interface file without implementation,
it is necessary to either create an implementation (for instance,
a symbolic link to the .mli file) or use the -noty
compiler option for compiling the .ml files that depend
on this interface.
Support for polymorphic variants is still incomplete.
Variance information are not taken into account.
The checking algorithm cannot check closures: it is therefore
impossible to safely unmarshall functions or objects with this
mechanism.
There is no patterm-matching matching on values of type
'a tyrepr.
Clearly, the syntax of values of type
'a tyrepr
could be extended with an anti-quotation mechanism. In a future version.
References
This code comes from the Master internship that I did in 2005 under
the supervision of Michel Mauny
and Emmanuel
Chailloux. The report is available (in French) with my colleagues'
on the MPRI
web site. Since then, a paper -- probably more readable, but in French -- has been published at JFLA'2006. See also the corresponding slides.