Tuesday, August 4, 2009

Pretty Printing Groovy's StreamingMarkupBuilder

Unfortunately, Groovy's StreamingMarkupBuilder will not output indented XML due to its streaming nature. However, never fear, JTidy is here to save the day!

JTidy is a project hosted on SourceForge, and it is a pretty nice HTML/XML pretty printer. It also seems to check your HTML for errors, which is cool! Examples are lacking, however, so I thought I would share my findings with this example:


import groovy.xml.StreamingMarkupBuilder
import org.w3c.tidy.Tidy

public class JTidyExample {

    // Runs a single-line XML string through JTidy and returns the result.
    def tidyMeUp(String singleLine) {
        StringWriter writer = new StringWriter()
        Tidy tidy = new Tidy()
        // identity { } is a synonym for with { }: calls inside go to tidy
        tidy.identity {
            setEscapeCdata(false) // leave CDATA untouched
            setIndentCdata(true)  // indent the CDATA
            setXmlTags(true)      // working with XML, not HTML
            parse(new StringReader(singleLine), writer)
        }
        writer.toString()
    }

    // Builds the sample document with StreamingMarkupBuilder (all on one line).
    def createMarkUp() {
        String cData = "<![CDATA[hello]]>"
        StreamingMarkupBuilder xml = new StreamingMarkupBuilder()
        def person1 = {
            person(id: 1) {
                firstName("John")
                lastName("Doe")
                data_labels {
                    mkp.yieldUnescaped(cData)
                }
            }
        }
        def personList = {
            people {
                out << person1
            }
        }
        xml.bind(personList).toString()
    }

    def static main(def args) {
        def example = new JTidyExample()
        def singleLine = example.createMarkUp()
        println "Before: \n ${singleLine}"
        println "After: \n ${example.tidyMeUp(singleLine)}"
    }

}



The following is output:


Before:

<people><person id='1'><firstName>John</firstName><lastName>Doe</lastName><data_labels><![CDATA[hello]]></data_labels></person></people>

Tidy (vers 26-Sep-2004) Parsing "InputStream"
no warnings or errors were found

After:

<people>
  <person id='1'>
    <firstName>John</firstName>
    <lastName>Doe</lastName>
    <data_labels>
      <![CDATA[hello]]>
    </data_labels>
  </person>
</people>



Here we can see that our XML is now nicely indented. Want some more??? Check out Scott Davis' Groovy Recipes, it has a pretty sweet XML section!

UPDATE: Check out Paul King's comments, where he suggests using groovy.xml.XmlUtil.serialize(xml.bind(personList)). Sweet!
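If you would rather not pull in a third-party jar at all, the JDK's own identity Transformer can also do the indenting. This is only a minimal sketch and not something from the original example: indentXml is a made-up helper name, and the indent-amount key is implementation-specific, so a different TransformerFactory may ignore it. CDATA handling is not covered here.

```groovy
import groovy.xml.StreamingMarkupBuilder
import javax.xml.transform.OutputKeys
import javax.xml.transform.TransformerFactory
import javax.xml.transform.stream.StreamResult
import javax.xml.transform.stream.StreamSource

// Hypothetical helper: re-serializes XML through the JDK identity transform.
String indentXml(String singleLine) {
    def transformer = TransformerFactory.newInstance().newTransformer()
    transformer.setOutputProperty(OutputKeys.INDENT, 'yes')
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, 'yes')
    // Assumption: this namespaced key is honoured by the JDK's bundled
    // transformer; other implementations are allowed to ignore it.
    transformer.setOutputProperty('{http://xml.apache.org/xslt}indent-amount', '2')
    def writer = new StringWriter()
    transformer.transform(new StreamSource(new StringReader(singleLine)),
                          new StreamResult(writer))
    writer.toString()
}

// Same kind of single-line markup as in the example above, minus the CDATA.
def singleLine = new StreamingMarkupBuilder().bind {
    people {
        person(id: 1) {
            firstName('John')
            lastName('Doe')
        }
    }
}.toString()

def pretty = indentXml(singleLine)
println pretty
```

The exact whitespace you get back depends on the JDK version, so treat the result as "good enough for humans" rather than a stable format.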

6 comments:

  1. Nice example. You can also use XmlUtil.serialize as shown below:

    import groovy.xml.*

    String cData = "<![CDATA[hello]]>"
    def xml = new StreamingMarkupBuilder()
    def personList = {
      people {
        person(id: 1) {
          firstName("John")
          lastName("Doe")
          data_labels {
            mkp.yieldUnescaped(cData)
          }
        }
      }
    }
    println XmlUtil.serialize(xml.bind(personList))

  2. Or this slight variation:

    import groovy.xml.*

    def personList = {
      people {
        person(id: 1) {
          firstName("John")
          lastName("Doe")
          data_labels {
            unescaped << "<![CDATA[hello]]>"
          }
        }
      }
    }
    def xml = new StreamingMarkupBuilder()
    println XmlUtil.serialize(xml.bind(personList))

  3. Actually, XmlUtil.serialize just inserts a line feed and carriage return, so I'm getting XML with no indentation. I don't see a way to make this really pretty.

    Here is some example code:

    def xmlBuilder = new StreamingMarkupBuilder().bind() {
      mkp.xmlDeclaration(version:'1.0')
      ROOT(stage:xmlExportHelper.getStageString()) {
        LANGUAGE(publicationEntity.getLanguage().getName())
        ENTRY() {
          ...

    return XmlUtil.serialize(xmlBuilder.toString())



    Using Groovy 1.7.5

  4. Hey,

    http://groovyconsole.appspot.com/script/314001

    Seems to work, this is using 1.8 though.

    I have tried using 1.7.4 and 1.7.5 and it also seems to work. If you are having issues with it, you should submit a JIRA issue with a failing test or try the user list, but everything seems to be working fine!

    Good luck!

  5. Same indentation issue for me as well. I thought Groovy would simplify my life, but it does not seem to reduce the need for ever more third-party dependencies in Java. I have a lot of trouble each time I need to add a dependency, because I am using Maven and often have to spend a long time Googling to find a working set of Maven coordinates for each new package. And why are more and more links on forums and web pages NON-CLICKABLE? This is a real pain. Firefox issue?
