What I'd like to do: String xml = "(some XML here)";… - Chronicles of a Hereditary Geek
Darth Paradox

[May. 28th, 2008|12:34 pm]
Darth Paradox
What I'd like to do:
    String xml = "(some XML here)";
    Document doc = Document.parseXml(xml);

What I have to do instead:
    String xml = "(some XML here)";
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();

    Reader reader = new StringReader(xml);
    InputSource in = new InputSource(reader);

    Document doc = builder.parse(in);

Not counting the line declaring the XML string, that's five times as many lines of code as what I'm conceptually trying to do. And this isn't the first time I've had to write this code, either.

I don't really have a choice about using Java for this project, but I'd really like the language - and all its common libraries - to just get the hell out of my way and let me write what I mean.

From: mcmartin
2008-05-28 08:22 pm (UTC)
They were designed in the expectation that what you meant was "I want to use this specific named XML-parsing library", much as one would want to use a specific SQL database backend. The idiom is identical to the JDBC stuff, where it's a lot less obtrusive.

Just throw a wrapper class into DarthUtil and be done with it; that's basically what I did for Blorple.
From: darthparadox
2008-05-28 09:35 pm (UTC)
I don't keep a DarthUtil around. This is a completely separate codebase from the one where I last had to use it, and from the one before that; code reuse is a little hard in that case. Were this a personal project, I probably would have done that by now...

I guess I can see how the Java approach allows a lot of power over exactly how the parsing, string-reading, etc. gets done. But it seems like that power comes at the expense of the 90% case where I just want to use the default/basic/whatever choices for everything.

A reasonable compromise might be:
    String xml = "(some XML here)";
    DocumentParser parser = new DocumentParser();
    Document doc = parser.parse(xml);

If you want to use a specific library, or twiddle the options on the parser, you have an object that you can subclass or manipulate, but the default case that the vast majority of people likely want is also the easy one.
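As a rough sketch of what that compromise could look like on top of the existing JAXP classes: DocumentParser is the made-up name from above, not a real library class, and turning the checked exceptions into unchecked ones is just one assumption about what "the default case" should do.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical wrapper class; not part of JAXP.
final class DocumentParser {
    private final DocumentBuilder builder;

    // Default case: use whatever parser the platform provides.
    DocumentParser() {
        this(DocumentBuilderFactory.newInstance());
    }

    // Escape hatch: pass in a factory you have configured yourself.
    DocumentParser(DocumentBuilderFactory factory) {
        try {
            this.builder = factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    // Parses XML held in a String; wraps the checked exceptions.
    Document parse(String xml) {
        try {
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

With something like that in place, the calling code shrinks to the two lines above, and the second constructor is still there for anyone who wants to twiddle the factory first.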

I recall Perl's motto of "make easy things easy and hard things possible". Making simple things complex should be avoided, especially if you're doing it to make the complex things only slightly simpler.

From: reskusic
2008-05-28 09:52 pm (UTC)
I don't think it would cause too much eyestrain to combine the last three lines into:
    Document doc = builder.parse(new StringBufferInputStream(xml));

Lines 2 and 3 could be pretty easily combined, but it forces you to admit that you're throwing away objects that require substantial setup overhead. Wrapping parser construction in a Singleton class would probably be worth, in runtime performance, the eight or so lines of code that it would cost you.
From: mcmartin
2008-05-29 01:10 am (UTC)
That can burn you if the XML document in the String purports to not be in UTF-8 format, though. I found that XML files are noticeably easier to deal with as Files or byte[]s, and I suspect the Reader/InputSource dance is to deal with that side of things if you've got a String to parse.

Happily, the files I needed to parse were guaranteed to be UTF-8 by the standard.
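A small sketch of the difference, assuming the parser bundled with the JDK; EncodingDemo and its method names are made up for illustration. Given raw bytes, the parser decodes them using the encoding named in the XML declaration. Given a Reader, the characters are already decoded, so the declaration is ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names are illustrative only.
final class EncodingDemo {
    // Parses the source and returns the text inside the root element.
    static String rootText(InputSource source) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(source).getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }

    // Byte input: the parser decodes using the encoding the XML declares.
    static String fromBytes(byte[] xmlBytes) {
        return rootText(new InputSource(new ByteArrayInputStream(xmlBytes)));
    }

    // Character input: already decoded, so the declared encoding is ignored.
    static String fromString(String xml) {
        return rootText(new InputSource(new StringReader(xml)));
    }
}
```

The trouble starts when a String gets turned back into bytes by something that does not know which encoding the document claims, which is the risk with the StringBufferInputStream shortcut.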
From: mcmartin
2008-05-29 01:00 am (UTC)
I really just meant to build a Singleton class that does the default work and that lets you just call parse(InputStream) and parse(File), statically if necessary, not a literal DarthUtil.
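Roughly along these lines, as a sketch only: XmlUtil is a placeholder name, and sharing one builder behind synchronized methods is an assumption made here because JAXP does not promise that a DocumentBuilder is safe to use from several threads at once.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical utility class; the name is a placeholder.
final class XmlUtil {
    // Built once, so the factory lookup and setup cost is paid a single time.
    private static final DocumentBuilder BUILDER = newBuilder();

    private XmlUtil() {}

    private static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    // synchronized: the shared builder is used by one thread at a time.
    static synchronized Document parse(InputStream in) {
        try {
            return BUILDER.parse(in);
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    static synchronized Document parse(File file) {
        try {
            return BUILDER.parse(file);
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```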

I recall Perl's motto of "make easy things easy and hard things possible".

The trick remains, of course, in what counts as a "hard thing". Perl's idea of this always struck me as madness.

I'm not claiming this wasn't an error on JAXP's part - but it's an error of assuming that XML processing is a task more like setting up a database instead of like setting up a file. There's no reason to have a Grand Unified API to parsers. For the final document representation, yes, but for the parsing engine? DocumentBuilderFactory is dead weight in a way that DatabaseConnection isn't - there's no reason why any given implementation can't just return a Document.
From: dacut
2008-05-28 10:09 pm (UTC)
Clearly, you need to add an AbstractDocumentBuilderFactoryBuilderFlyweightFactoryAdapter class and access it via AspectJ or creative reflection hacks.