Using XML4J 1.x with a network stream

How to use XML4J 1.x with a network stream

The other files that make up this example are HTTPInputStream and a sanitized sample of code which uses it.

I apologize if the code samples are not completely lucid. I quickly sanitized some production code rather than creating a clean example from scratch.

In a nutshell, I do this:

        /** The reader to get stuff from the socket */
        HTTPInputStream socketReader = null;
        /** The XML parser instance reading the request */
        Parser  xmlparser = null;
        /** Source stream into xml parser */
        Source  xmlsource = null;
        /** The socket which we are connected to */
        Socket  socket;


        socketReader = new HTTPInputStream(socket.getInputStream());

        /* Read the HTTP headers.
         * If you are using the real Java servlet interface, then
         * the web server will have parsed the headers.  I wrote my
         * own server from the ground up, so I do it here.
         */
        HTTPUtil.readHTTPRequest(this, socketReader);

	/* Create and set up the parser.   Note the functions we override
	 */
	xmlparser = new Parser(null,
                                new RequestErrorListener(),
                                new RequestStreamProducer());
	// I don't need comments
	xmlparser.setKeepComment(false);
	// I do not use a DTD, this prevents DTD checks
	xmlparser.setElementFactory(new TXDocument() {
                        public boolean isCheckValidity() {
                                return false;
                        }
	});
	// The TagHandler is crucial
	xmlparser.setTagHandler(new PUPTagHandler());
	// System.out.println("Start xml parse");
	contents = xmlparser.readStream(socketReader);
	xmlparser = null;       // discard, it cannot be reused

	//*** contents is the root of the DOM Tree. It is of type
	//*** Document which implements the DOM Document interface.

	// Print the DOM tree for debugging
	DOMHelpers.dumpDoc(contents);

	// act on it ....

	// socket is still open so we can write back to it.
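
The header-reading step is not shown in the snippet. A minimal sketch of the idea follows, assuming plain HTTP/1.x framing; HeaderReader and its methods are hypothetical names and are not the HTTPUtil.readHTTPRequest used above. It reads one byte at a time so that nothing past the blank line is consumed and the parser sees the XML body from its first byte.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reads the request line and headers, nothing more. */
class HeaderReader {
    /** Reads up to the next LF, dropping CR bytes; returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
            }
            if (c != '\r') {
                line.write(c);
            }
        }
        // End of stream: hand back a partial last line if there is one.
        return line.size() == 0
                ? null
                : new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Collects lines until the blank line that ends the header section. */
    static List<String> readHeaders(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String request = "POST /x HTTP/1.0\r\nContent-Type: text/xml\r\n\r\n<req/>";
        InputStream in = new ByteArrayInputStream(
                request.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(readHeaders(in));
        System.out.println("body bytes left: " + in.available());
    }
}
```

A BufferedReader would be simpler, but it reads ahead and may swallow the start of the body, which is why this sketch avoids it.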

The actual code samples show the RequestErrorListener, RequestStreamProducer, and RequestTagHandler classes (the tag handler appears as PUPTagHandler in the snippet above) that are needed to make this work.

HTTPInputStream is also important, because it lets you signal EOF to the parser when the end of the stream has not actually been reached. It works in conjunction with the tag handler's handleEndTag routine: you must call socketReader.signalEOF() on whatever tag marks the final boundary of your input.
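
The idea can be sketched with nothing but java.io. EofSignalStream below is a hypothetical stand-in written for illustration, not the real HTTPInputStream; only the signalEOF() name is taken from the text above. The no-op close() is an assumption of mine, made so that a parser closing its input cannot close the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical stand-in for HTTPInputStream: passes reads through to the
 * wrapped stream until signalEOF() is called, then reports end of stream
 * even though the underlying socket is still open.
 */
class EofSignalStream extends FilterInputStream {
    private volatile boolean eofSignalled = false;

    EofSignalStream(InputStream in) {
        super(in);
    }

    /** After this call every read reports end of stream. */
    void signalEOF() {
        eofSignalled = true;
    }

    @Override
    public int read() throws IOException {
        return eofSignalled ? -1 : super.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        return eofSignalled ? -1 : super.read(buf, off, len);
    }

    @Override
    public int available() throws IOException {
        return eofSignalled ? 0 : super.available();
    }

    /** Assumption: leave the wrapped stream open so the socket can be written to. */
    @Override
    public void close() {
        eofSignalled = true;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "<a/>TRAILING".getBytes(StandardCharsets.US_ASCII);
        EofSignalStream s = new EofSignalStream(new ByteArrayInputStream(data));
        byte[] buf = new byte[4];
        int n = s.read(buf, 0, 4);   // the document itself
        s.signalEOF();               // what the end-tag handler would do
        System.out.println(n + " bytes read, next read() = " + s.read());
    }
}
```

In the real setup the signalEOF() call would sit in handleEndTag, guarded by a check for the closing tag of the outermost element.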

Other thoughts

This example uses the DOM interface. The SAX interface might simplify a lot of things.
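
As a rough illustration of the SAX route, here is a sketch that uses the standard JAXP SAX API rather than XML4J 1.x, so it shows the shape of the approach and not XML4J's own behavior. SaxRequestReader and readRequest are hypothetical names. The handler flips a flag when the outermost element closes, and the stream wrapper then reports end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: read one XML request from a stream via SAX callbacks. */
class SaxRequestReader {
    /** Parses one document and returns element names in start-tag order. */
    static List<String> readRequest(InputStream in) throws Exception {
        final boolean[] rootClosed = {false};

        // Report end of stream once the root element has closed, and ignore
        // close() so the underlying socket stays open for the reply.
        InputStream guarded = new FilterInputStream(in) {
            @Override public int read() throws IOException {
                return rootClosed[0] ? -1 : super.read();
            }
            @Override public int read(byte[] b, int off, int len) throws IOException {
                return rootClosed[0] ? -1 : super.read(b, off, len);
            }
            @Override public void close() { }
        };

        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                depth++;
                names.add(qName);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (--depth == 0) {
                    rootClosed[0] = true;   // same role as signalEOF()
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(guarded, handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        byte[] doc = "<request><op>ping</op></request>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readRequest(new ByteArrayInputStream(doc)));
    }
}
```

This only helps with a parser that fires callbacks while it reads. A parser that buffers ahead may also have consumed bytes beyond the closing tag before the flag is set, so treat this as a starting point only.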

Other people have commented:

Marc A. Saegesser writes:

Thanks for the info. I appreciate it. I posted a message on IBM's XML discussion group and one of the XML developers acknowledged that the 1.1.* versions need to read the entire document before any of the SAX callbacks are called. He also said that version 2.0 of XML4J would fix this. SAX callbacks will occur as the document is being read. The DOM interface will still require reading the entire document first.

I switched to Aelfred in the interim and I think I'll probably stick with it, since I'm working on an applet and Aelfred is *much* smaller than XML4J even after IBM compressed the JAR file.

Tony Aiuto