Astral Plane characters in Erlang JSON/RFC4627 implementation
November 16th, 2007 tonyg
Sam Ruby examines support for astral-plane characters in various JSON implementations. His post prompted me to check my Erlang implementation of RFC 4627. Astral-plane characters encoded directly as UTF-8, UTF-16, or UTF-32 worked properly, but the RFC 4627-mandated surrogate-pair “\uXXXX” escapes broke. A few minutes' hacking later:
Eshell V5.5.5 (abort with ^G)
1> {ok, Utf8Encoded, []} =
    rfc4627:decode("\"\\u007a\\u6c34\\ud834\\udd1e\"").
{ok,<<122,230,176,180,240,157,132,158>>,[]}
2> xmerl_ucs:from_utf8(Utf8Encoded).
[122,27700,119070]
3> rfc4627:encode(Utf8Encoded).
[34,122,230,176,180,240,157,132,158,34]
4>
Much better.
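For anyone wondering what the fix has to do, here is a minimal sketch of the arithmetic involved. It is not the code in the rfc4627 module; the module name surrogate_sketch and both function names are made up for illustration. It assumes the two “\uXXXX” escapes have already been parsed into integers, combines the high and low surrogates into one code point, and then emits the four-byte UTF-8 form.

```erlang
%% Hypothetical illustration only; not taken from the rfc4627 module.
-module(surrogate_sketch).
-export([combine/2, utf8_astral/1]).

%% Combine a UTF-16 surrogate pair into a single code point.
%% High must lie in D800..DBFF and Low in DC00..DFFF; anything
%% else fails with function_clause, which is fine for a sketch.
combine(High, Low) when High >= 16#D800, High =< 16#DBFF,
                        Low >= 16#DC00, Low =< 16#DFFF ->
    16#10000 + ((High - 16#D800) bsl 10) + (Low - 16#DC00).

%% Encode an astral-plane code point (10000..10FFFF) as a list of
%% four UTF-8 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
utf8_astral(CP) when CP >= 16#10000, CP =< 16#10FFFF ->
    [16#F0 bor (CP bsr 18),
     16#80 bor ((CP bsr 12) band 16#3F),
     16#80 bor ((CP bsr 6) band 16#3F),
     16#80 bor (CP band 16#3F)].
```

With this sketch, surrogate_sketch:combine(16#D834, 16#DD1E) gives 119070, and surrogate_sketch:utf8_astral(119070) gives [240,157,132,158], which match the last code point and the last four bytes in the session above.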
You can get the updated code using Mercurial:
hg clone http://hg.opensource.lshift.net/erlang-rfc4627/