vtd-xml's People

Contributors

jzhang2004

vtd-xml's Issues

Regression in xpath evaluation for namespaced elements

In vtd-xml 2.11 the following test passes: the XPath orders/order/items/item finds the first 12 items and ignores those with the namespace prefix ns. As of 2.12 it returns all the <item> elements, including the <ns:item> elements, which is incorrect.
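
For comparison, here is a minimal sketch of the behavior the report expects, using the JDK's built-in XPath 1.0 engine rather than VTD-XML. The class name, the namespace URI and the three-item sample document are all made up for illustration; the original test document is not shown in the report.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses the JDK's XPath engine, not VTD-XML.
class UnprefixedXPathDemo {
    // Made-up sample: two plain <item> elements and one <ns:item>.
    static final String SAMPLE =
        "<orders xmlns:ns=\"urn:example:ns\"><order><items>"
        + "<item/><item/><ns:item/>"
        + "</items></order></orders>";

    // Counts the nodes selected by the XPath on a namespace-aware DOM.
    static int countMatches(String xml, String xpath) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // An unprefixed name test only matches elements in no namespace.
        System.out.println(countMatches(SAMPLE, "orders/order/items/item"));
        // Matching on the local name alone also picks up <ns:item>.
        System.out.println(countMatches(SAMPLE, "orders/order/items/*[local-name()='item']"));
    }
}
```

With namespace awareness on, the first expression selects only the two unprefixed items, while the local-name() variant selects all three. The 2.12 behavior described above looks like the second case.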

VTDGen parse CDATA error

Parsing fails for a CDATA section that contains JSON with a list inside a list.

E.g.:

<?xml version="1.0" encoding="UTF-8" ?>
<Test>
<Description>
<![CDATA["Test":["First" : ["Element1", "Element2"]]
]]>
</Description></Test>

VTDGen.parse() throws
com.ximpleware.ParseException: Error in CDATA: Invalid termination sequence
The problem occurs when a "\n" follows the pair of "]" characters.

E.g.:

<?xml version="1.0" encoding="UTF-8" ?>
<Test>
<Description>
<![CDATA[Test]]
]]>
</Description></Test>

This example works fine:

<?xml version="1.0" encoding="UTF-8" ?>
<Test>
<Description>
<![CDATA[Test]]]]>
</Description>
</Test>
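
As a cross-check that the failing input is well-formed XML (only the exact sequence "]]>" ends a CDATA section), the sketch below feeds the second example to the JDK's built-in DOM parser. The class and method names are hypothetical, and this says nothing about VTDGen internals; it only shows what a conforming parser is expected to accept.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses the JDK's DOM parser, not VTD-XML.
class CdataBracketDemo {
    // The second failing example from the report: "]]" then a newline, then "]]>".
    static final String FAILING_INPUT =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n"
        + "<Test>\n<Description>\n<![CDATA[Test]]\n]]>\n</Description></Test>";

    // Returns the text content of the first <Description> element.
    static String descriptionText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("Description").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The "]]" and the newline are ordinary CDATA content here.
        System.out.println(descriptionText(FAILING_INPUT).contains("Test]]\n"));
    }
}
```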

Consecutive spaces in XML text value are collapsed into a single one

The two spaces in the value of the a element ("x  y") are squashed into a single space.

@Test
public void test_multiple_spaces() throws ParseException, XPathParseException, NavException, XPathEvalException {
    VTDGen vtdGen = new VTDGen();
    // notice the 2 spaces
    String expectedValue = "x  y";
    vtdGen.setDoc(("<a>" + expectedValue + "</a>").getBytes());
    vtdGen.parse(true);  // set namespace awareness to true
    VTDNav vn = vtdGen.getNav();
    AutoPilot ap = new AutoPilot(vn);
    ap.selectXPath("//a");
    String actualValue = null;
    int result = -1;
    result = ap.evalXPath();
    if (result != -1) {
        int t = vn.getText();
        if (t != -1)
            actualValue =  vn.toNormalizedString(t);
    }
    Assert.assertEquals(expectedValue, actualValue);
}

The code for reading the value using XPath is taken from the code samples page.
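
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the test reads the value with vn.toNormalizedString(t), and whitespace normalization collapses runs of whitespace by design. If the library version in use offers a non-normalizing accessor such as VTDNav.toString(int), that may be the call to try instead. The plain-Java sketch below only illustrates what normalization does to the test value; the class and method names are hypothetical and it does not call VTD-XML.

```java
// Hypothetical helper that mimics DOM-style whitespace normalization:
// runs of XML whitespace become one space, and the ends are trimmed.
class WhitespaceNormalizationDemo {
    static String normalize(String raw) {
        return raw.replaceAll("[ \\t\\r\\n]+", " ")  // collapse runs
                  .replaceAll("^ | $", "");          // trim both ends
    }

    public static void main(String[] args) {
        String raw = "x  y"; // two spaces, as in the test above
        System.out.println("[" + raw + "]");
        System.out.println("[" + normalize(raw) + "]");
    }
}
```

If that is what happens, the assertion compares a raw expected value with a normalized actual value, so it would fail even when the parser preserved both spaces.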

VTD-XML shadows namespace from sibling element and adds unexpected ns

I have an issue when trying to extract an element via getElementFragmentNs.

Here is a sample test:

 @Test
    public void shouldNotShadowNamespaceAndAddSiblingNamespaces() throws Exception {
        byte[] bytes = ("<ns2:Response xmlns=\"urn://message\" xmlns:ns2=\"urn://ns2\">\n" +
                "    <ns2:Data Id=\"SIGNED_BY_CONSUMER\">\n" +
                "        <Content>\n" +
                "            <tns:Response\n" +
                "                    xmlns:tns=\"urn://tns\"\n" +
                "                    xmlns=\"urn://shadow\">\n" +
                "                <tns:test/>\n" +
                "            </tns:Response>\n" +
                "        </Content>\n" +
                "        <AttachmentHeaderList>\n" +
                "            <AttachmentHeader/>\n" +
                "        </AttachmentHeaderList>\n" +
                "    </ns2:Data>\n" +
                "</ns2:Response>").getBytes("UTF-8");

        VTDGen vg = new VTDGen();
        vg.setDoc(bytes);
        vg.parse(true);  // set namespace awareness to true

        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        ap.selectElement("AttachmentHeader");
        ap.iterate();

        ElementFragmentNs efn = vn.getElementFragmentNs();
        byte[] result = efn.toBytes();
        assertThat(new String(result, "UTF-8"), is("<AttachmentHeader xmlns=\"urn://message\"/>"));
    }

The actual result is:

<AttachmentHeader xmlns:tns="urn://tns" xmlns="urn://shadow" xmlns:ns2="urn://ns2"/>

But I expect:

<AttachmentHeader xmlns="urn://message"/>

Why does it shadow the default namespace with the default namespace from a sibling subelement? And why does it add unnecessary namespaces from it?
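
To make the expectation concrete, the sketch below resolves the namespaces that are in scope for AttachmentHeader using the JDK's namespace-aware DOM, not VTD-XML. Declarations on tns:Response apply only to that element and its descendants, so they should not reach a sibling subtree. The class name is hypothetical and the document is a condensed copy of the one in the test.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses the JDK's DOM parser, not VTD-XML.
class InScopeNamespaceDemo {
    // Condensed version of the document from the test above.
    static final String SAMPLE =
        "<ns2:Response xmlns=\"urn://message\" xmlns:ns2=\"urn://ns2\">"
        + "<ns2:Data>"
        + "<Content><tns:Response xmlns:tns=\"urn://tns\" xmlns=\"urn://shadow\">"
        + "<tns:test/></tns:Response></Content>"
        + "<AttachmentHeaderList><AttachmentHeader/></AttachmentHeaderList>"
        + "</ns2:Data></ns2:Response>";

    // Parses SAMPLE and returns the AttachmentHeader element.
    static Element attachmentHeader() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            return (Element) doc.getElementsByTagNameNS("*", "AttachmentHeader").item(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element header = attachmentHeader();
        // Default namespace inherited from the root element.
        System.out.println(header.getNamespaceURI());
        // The tns prefix is declared only inside the sibling <Content> subtree.
        System.out.println(header.lookupNamespaceURI("tns"));
    }
}
```

Here the element resolves to urn://message and the tns prefix is not bound at all, which matches the expectation in the report that neither urn://shadow nor xmlns:tns belongs on the extracted fragment.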

No TOKEN_END_TAGS

No end tags appear in the navigation or anywhere else, which makes some tasks really difficult. The code contains bad hacks that work around this, and cutting a complete XML fragment out of your children is hacky as well. Why are the end tags defined as a constant if they never exist?

Some words like &#55357;&#56452;&#55357;&#56388;&#55357;&#56416; can't be parsed

Hi,
I have XML like this:

<last-name>&#55357;&#56452;&#55357;&#56388;&#55357;&#56416;糖</last-name>

When I parse the XML, I get this error:

com.ximpleware.extended.EntityExceptionHuge: Errors in entity reference: Invalid XML char.
	at com.ximpleware.extended.VTDGenHuge.entityIdentifier(VTDGenHuge.java:980)
	at com.ximpleware.extended.VTDGenHuge.getCharAfterSe(VTDGenHuge.java:1119)
	at com.ximpleware.extended.VTDGenHuge.parse(VTDGenHuge.java:1445)
	at com.ximpleware.extended.VTDGenHuge.parseFile(VTDGenHuge.java:1293)
	at com.baozun.sql.maker.MemberSqlFactory.main(MemberSqlFactory.java:364)

What can I do to fix it? Can someone help me?
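
A note on the likely cause: 55357 and 56452 are the decimal values of UTF-16 surrogate code units (0xD83D and 0xDC84). XML 1.0 character references must name a legal character, and the surrogate range is excluded, so a conforming parser is entitled to reject them. The intended character would be written as a single reference, &#128132;. One workaround, sketched below under the assumption that the input can be pre-processed as text before parsing, is to merge each high/low pair into one reference. The class and method names are hypothetical and not part of VTD-XML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing helper, not part of VTD-XML.
class SurrogateRefFixer {
    // Decimal character references only; hex references are left untouched.
    private static final Pattern DEC_REF = Pattern.compile("&#(\\d{1,7});");

    // Rewrites "&#high;&#low;" surrogate pairs as one code point reference.
    static String mergeSurrogateRefs(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        Matcher m = DEC_REF.matcher(xml);
        Matcher next = DEC_REF.matcher(xml);
        int pos = 0;
        while (m.find(pos)) {
            out.append(xml, pos, m.start());
            int first = Integer.parseInt(m.group(1));
            if (first >= 0xD800 && first <= 0xDBFF) {
                // Look for a low surrogate reference directly after this one.
                next.region(m.end(), xml.length());
                if (next.lookingAt()) {
                    int second = Integer.parseInt(next.group(1));
                    if (second >= 0xDC00 && second <= 0xDFFF) {
                        int cp = Character.toCodePoint((char) first, (char) second);
                        out.append("&#").append(cp).append(';');
                        pos = next.end();
                        continue;
                    }
                }
            }
            // Not part of a pair: copy the reference unchanged.
            out.append(m.group());
            pos = m.end();
        }
        out.append(xml, pos, xml.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mergeSurrogateRefs("<last-name>&#55357;&#56452;</last-name>"));
    }
}
```

An unpaired surrogate reference is left as it is, so such input would still be rejected by the parser; fixing the producer of the XML is the more robust option where that is possible.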

Regression in XMLMemMappedBuffer

Hi there,

This code works as expected in 2.11:

VTDGenHuge vgh = new VTDGenHuge();
if (vgh.parseFile("c:/xml/text1.xml", true, VTDGenHuge.MEM_MAPPED)) {
     VTDNavHuge vnh = vgh.getNav();
     vnh.toElement(VTDNavHuge.FC);
     long[] la = vnh.getElementFragment();
     vnh.getXML().writeToFileOutputStream(new FileOutputStream("c:/xml/text2.xml"), la[0], la[1]);
}

But since 2.12, it raises this error:

java.nio.channels.ClosedChannelException
	at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
	at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:588)
	at com.ximpleware.extended.XMLMemMappedBuffer.writeToFileOutputStream(XMLMemMappedBuffer.java:104)

The line https://github.com/jzhang2004/vtd-xml/blob/master/ximple-dev/com/ximpleware/extended/VTDGenHuge.java#L927 closes the FileChannel at the end of vgh.getNav();

Have a nice day!
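
The failure mode can be reproduced with the JDK alone, which may help when checking a fix. In the sketch below (hypothetical class name, temporary file, no VTD-XML involved), a memory mapping stays readable after its FileChannel is closed, but transferTo on the closed channel throws ClosedChannelException, the same exception as in the stack trace above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; it reproduces the pattern with plain NIO.
class ClosedChannelDemo {
    static String run() {
        try {
            Path file = Files.createTempFile("vtd-demo", ".xml");
            file.toFile().deleteOnExit();
            Files.write(file, "<a><b/></a>".getBytes(StandardCharsets.UTF_8));

            FileChannel fc = FileChannel.open(file, StandardOpenOption.READ);
            long size = fc.size();
            MappedByteBuffer mapped = fc.map(FileChannel.MapMode.READ_ONLY, 0, size);
            fc.close(); // closing the channel does not invalidate the mapping

            byte first = mapped.get(0); // reading the mapped bytes still works
            WritableByteChannel sink = Channels.newChannel(new ByteArrayOutputStream());
            try {
                fc.transferTo(0, size, sink); // needs an open channel
                return "transferred";
            } catch (ClosedChannelException e) {
                return "closed; first mapped byte = " + (char) first;
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

This suggests two possible directions for a fix, neither confirmed against the library source: keep the channel open for as long as the buffer object is in use, or write the fragment from the mapped buffers instead of calling transferTo on the channel.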
