Friday 3 October 2014

Make structured XML from flat source with XSLT 1

Issue

Structured XML had to be generated from a flat XML source, and only XSLT 1 could be used for the transformation. Every paragraph element had to be nested within the subsection element that precedes it, and every subparagraph within the paragraph element that precedes it. In addition, textual content had to be wrapped in a <text/> element.

Resolution

Source:

<sectiontitle>sample content text</sectiontitle>
<subsection>sample content text</subsection>
<paragraph>sample content text</paragraph>
<paragraph>sample content text</paragraph>
<subparagraph>sample content text</subparagraph>
<subparagraph>sample content text</subparagraph>
<subsection>sample content text</subsection>

Required Output:

<sectiontitle><text>sample content text</text></sectiontitle>
<subsection>
    <text>sample content text</text>
    <paragraph><text>sample content text</text></paragraph>
    <paragraph>
        <text>sample content text</text>
        <subparagraph><text>sample content text</text></subparagraph>
        <subparagraph><text>sample content text</text></subparagraph>
    </paragraph>
</subsection>
<subsection><text>sample content text</text></subsection>


XSLT

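<!-- Identity template: copies any node that no other template matches -->
<xsl:template match="node()|@*">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
</xsl:template>

<!-- Wrap the section title in <text>, as the required output shows;
     the identity template alone would copy it without the wrapper -->
<xsl:template match="sectiontitle">
    <sectiontitle>
        <text>
            <xsl:value-of select="." />
        </text>
    </sectiontitle>
</xsl:template>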

<!-- Wrap the subsection text, then pull in every following paragraph
     whose nearest preceding subsection is this one -->
<xsl:template match="subsection">
    <subsection>
        <text>
            <xsl:value-of select="." />
        </text>
        <xsl:apply-templates
            select="following-sibling::paragraph
            [generate-id(preceding-sibling::subsection[1])
            = generate-id(current())]" mode="nest" />
    </subsection>
</xsl:template>

<!-- Same idea one level down: pull in every following subparagraph
     whose nearest preceding paragraph is this one -->
<xsl:template match="paragraph" mode="nest">
    <paragraph>
        <text>
            <xsl:value-of select="." />
        </text>
        <xsl:apply-templates
            select="following-sibling::subparagraph
            [generate-id(preceding-sibling::paragraph[1])
            = generate-id(current())]" mode="nest" />
    </paragraph>
</xsl:template>

<xsl:template match="subparagraph" mode="nest">
    <xsl:copy>
        <text>
            <xsl:apply-templates />
        </text>
    </xsl:copy>
</xsl:template>
 
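<!-- In the default mode paragraphs and subparagraphs produce no output;
     they are only emitted through the mode="nest" templates above -->
<xsl:template match="paragraph"/>

<xsl:template match="subparagraph"/>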
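
The templates above are fragments and need an xsl:stylesheet root before a processor will run them. The following is a minimal sketch of a complete stylesheet, not the method used above: it swaps the following-sibling predicates for xsl:key lookups, a standard XSLT 1 alternative that indexes each child by the generate-id() of its nearest preceding parent-level sibling. The key names and the <section> wrapper element are assumptions made for illustration; the source shown above has no root element. It also wraps sectiontitle in <text>, as the required output shows.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes" />

    <!-- Assumption: the flat elements are children of a <section> wrapper;
         stripping its whitespace-only text nodes keeps the output tidy -->
    <xsl:strip-space elements="section" />

    <!-- Index each paragraph by the id of its nearest preceding subsection -->
    <xsl:key name="paragraphs-by-subsection" match="paragraph"
        use="generate-id(preceding-sibling::subsection[1])" />

    <!-- Index each subparagraph by the id of its nearest preceding paragraph -->
    <xsl:key name="subparagraphs-by-paragraph" match="subparagraph"
        use="generate-id(preceding-sibling::paragraph[1])" />

    <!-- Identity template: copies any node that no other template matches -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*" />
        </xsl:copy>
    </xsl:template>

    <xsl:template match="sectiontitle">
        <sectiontitle>
            <text><xsl:value-of select="." /></text>
        </sectiontitle>
    </xsl:template>

    <!-- generate-id() with no argument returns the id of the current node -->
    <xsl:template match="subsection">
        <subsection>
            <text><xsl:value-of select="." /></text>
            <xsl:apply-templates
                select="key('paragraphs-by-subsection', generate-id())"
                mode="nest" />
        </subsection>
    </xsl:template>

    <xsl:template match="paragraph" mode="nest">
        <paragraph>
            <text><xsl:value-of select="." /></text>
            <xsl:apply-templates
                select="key('subparagraphs-by-paragraph', generate-id())"
                mode="nest" />
        </paragraph>
    </xsl:template>

    <xsl:template match="subparagraph" mode="nest">
        <subparagraph>
            <text><xsl:value-of select="." /></text>
        </subparagraph>
    </xsl:template>

    <!-- Suppressed in the default mode; emitted only via mode="nest" -->
    <xsl:template match="paragraph" />
    <xsl:template match="subparagraph" />

</xsl:stylesheet>
```

With the predicate form, every subsection and paragraph re-tests all of its following siblings; the keyed form lets the processor build the lookup once, which may matter on long flat sources. For a source the size of the sample either form should behave the same.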

Points to note:

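  • Replace xsl:value-of with xsl:apply-templates if the content can contain anything other than a text node. xsl:value-of keeps only the string value, so any child elements would be flattened to plain text.
  • The XML source must be consistent. The templates rely on document order: each paragraph must follow its subsection and each subparagraph must follow its paragraph. A subparagraph is nested in the nearest preceding paragraph even if that paragraph belongs to an earlier subsection, and a paragraph or subparagraph with no preceding parent-level sibling is dropped from the output.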
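
To illustrate the first point, here is a minimal sketch that isolates the <text> wrapping step and uses xsl:apply-templates so that inline markup survives. It deliberately leaves out the nesting logic, and the <emphasis> element mentioned in the comments is a hypothetical example of inline content, not something present in the source above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Identity template: copies inline markup such as a hypothetical
         <emphasis> element, together with its attributes and text -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*" />
        </xsl:copy>
    </xsl:template>

    <!-- Wrap the children of each block element in <text>.
         apply-templates with no select processes child nodes only,
         so text and child elements are both passed on to the identity
         template instead of being reduced to a string -->
    <xsl:template match="sectiontitle|subsection|paragraph|subparagraph">
        <xsl:copy>
            <text>
                <xsl:apply-templates />
            </text>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Given <paragraph>sample <emphasis>content</emphasis> text</paragraph>, this produces <paragraph><text>sample <emphasis>content</emphasis> text</text></paragraph>, whereas xsl:value-of would emit only the string "sample content text" inside <text>. In the nesting templates, the same swap is made in the same position: xsl:apply-templates inside <text> in place of xsl:value-of.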
