Avi Bryant <avi_at_beta4.com> wrote:
Then shouldn't this pass? It doesn't:
testLineEndingsDoNotMatter
|text cr crlf|
text :=
'<foo>
bar
baz
</foo>'.
cr := XMLDOMParser parseDocumentFrom: text readStream.
crlf := XMLDOMParser parseDocumentFrom: text withInternetLineEndings readStream.
self assert:
(cr elements first contents first string) =
(crlf elements first contents first string).
Bug 1: cr is "<foo>&#13;bar&#13;baz&#13;</foo>". The XML specification is
completely clear and unambiguous about this, with no room for
weaselling: line ends MUST be mapped to &#10; (line feed), *NOT* to
some other character that takes your fancy. The result should be
"<foo>&#10;bar&#10;baz&#10;</foo>".
Bug 2: crlf is "<foo>&#13;&#10;bar&#13;&#10;baz&#13;&#10;</foo>". The fact that
this is not "<foo>&#10;bar&#10;baz&#10;</foo>" violates conformance to XML
(whether XML 1.0 1st ed, XML 1.0 2nd ed, or XML 1.1 draft) so badly
that it really is NOT funny.
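For reference, the rule in XML 1.0 (section 2.11, End-of-Line Handling) is
that the parser hands the application a single line feed for every CR LF
pair and for every CR that is not followed by LF. A minimal sketch of that
step in Squeak, assuming only String>>copyReplaceAll:with: and using names
(normalize, sample) made up for illustration, not taken from XMLDOMParser
or from my parser:

```smalltalk
"Sketch of XML 1.0 section 2.11 end-of-line handling.
 All names here are illustrative; none belong to XMLDOMParser."
| cr lf crlf normalize sample |
cr := String with: Character cr.
lf := String with: Character lf.
crlf := String with: Character cr with: Character lf.

"Replace CR LF pairs first, so the CR of a pair is not turned into
 a second LF; then replace any CR that is still left on its own."
normalize := [:aString |
	(aString copyReplaceAll: crlf with: lf)
		copyReplaceAll: cr with: lf].

"One string that mixes all three line-ending conventions."
sample := '<foo>', cr, 'bar', crlf, 'baz', lf, '</foo>'.
normalize value: sample.
```

Whatever convention the input used, the string that comes back contains
only line feeds, which is why the cr and crlf documents in the test case
ought to compare equal.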
My own XML parser for Squeak (a) gets line endings right, and (b) makes
string nodes *be* Strings instead of funny things you have to send
#string to. This test case convinces me that I was right to stick
with my own parser and ignore XMLDOMParser completely.