go-xmlpath / xmlpath

Strict subset of the XPath specification for the Go language.

Home Page: http://gopkg.in/xmlpath.v2

License: Other

Go 100.00%

xmlpath's Introduction

Installation and usage

See gopkg.in/xmlpath.v2 for documentation and usage details.

xmlpath's Issues

Convenience Functions

Do you feel there'd be any value in adding the following 2 convenience functions? I find myself rewriting them over and over with literally no variation (and others seem to as well, e.g. issue #11), so I think it would be really useful to pull them into the package itself.

This is not tested code, just meant for discussion (function names are inspired by the standard regexp library - I'm not married to them):

import (
    "errors"
    "strings"

    "gopkg.in/xmlpath.v2"
)

// FindAllString returns the trimmed string values of all matching nodes.
func FindAllString(xml, xpath string) ([]string, error) {
    path, err := xmlpath.Compile(xpath)
    if err != nil {
        return nil, err
    }

    root, err := xmlpath.Parse(strings.NewReader(xml))
    if err != nil {
        return nil, err
    }

    ss := []string{}

    iter := path.Iter(root)
    for iter.Next() {
        s := iter.Node().String()
        ss = append(ss, strings.TrimSpace(s))
    }

    return ss, nil
}

// FindString returns the string value of the first matching node.
// Use it when only one matching node is expected.
func FindString(xml, xpath string) (string, error) {
    ss, err := FindAllString(xml, xpath)
    if err != nil {
        return "", err
    }
    // Guard against indexing an empty result.
    if len(ss) == 0 {
        return "", errors.New("no node matches " + xpath)
    }

    return ss[0], nil
}

How do you set the Charset Reader?

I'm trying to set the Decoder.CharsetReader for .Parse because I'm getting:

xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil

My code:

package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
  "bytes"
  "strings"
  "os"
  "gopkg.in/xmlpath.v2"
)

type BarkClient struct {
  Protocol string
  User string
  Password string
  Subdomain string
  Host string
  Port string
}

type Monit struct {

}

func CreateClient(protocol, user, password, subdomain, host, port string) BarkClient {
  client := &BarkClient{
    Protocol: protocol,
    User: user,
    Password: password,
    Subdomain: subdomain,
    Host: host,
    Port: port,
  }

  return *client
}

func toUtf8(iso8859_1_buf []byte) string {
    buf := make([]rune, len(iso8859_1_buf))
    for i, b := range iso8859_1_buf {
        buf[i] = rune(b)
    }
    return string(buf)
}

func main() {
  var buffer bytes.Buffer

  client := CreateClient("http", "admin", "monit", "", "localhost", "2812",)

  array := [9]string{"http://", client.User, ":", client.Password, "@", client.Host, ":", client.Port, "/_status?format=xml"}

  for _, elem := range array {
    buffer.WriteString(elem)
  }

  url := buffer.String()

  response, err := http.Get(url)
  if err != nil {
    fmt.Printf("%s", err)
    os.Exit(1)
  } else {
    defer response.Body.Close()
    contents, err := ioutil.ReadAll(response.Body)
    if err != nil {
      fmt.Printf("%s", err)
      os.Exit(1)
    }

    utf_string_xml := toUtf8(contents)
    node, err := xmlpath.Parse(strings.NewReader(utf_string_xml))
    if err != nil {
      fmt.Println(err)
    }

    fmt.Println(node)
  }
}

Output:

~/g/G/s/g/k/gobark ❯❯❯ go install && $GOPATH/bin/gobark
xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil
<nil>

How do I move forward with parsing different character sets?
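One possible way forward, sketched with the standard library only: encoding/xml lets you set Decoder.CharsetReader yourself, and a decoder configured this way could then be handed to xmlpath.ParseDecoder (the function used in another issue on this page) instead of calling xmlpath.Parse. The names latin1Reader and decodeTitle are hypothetical, and the sketch only understands ISO-8859-1; charset.NewReaderLabel from golang.org/x/net/html/charset has the same signature and covers more encodings.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// latin1Reader is a hypothetical CharsetReader that only understands
// ISO-8859-1. Every Latin-1 byte maps to the Unicode code point with the
// same value, so the conversion is a plain byte-to-rune widening.
func latin1Reader(label string, input io.Reader) (io.Reader, error) {
	if !strings.EqualFold(label, "ISO-8859-1") {
		return nil, fmt.Errorf("unsupported charset %q", label)
	}
	raw, err := io.ReadAll(input)
	if err != nil {
		return nil, err
	}
	var out bytes.Buffer
	for _, b := range raw {
		out.WriteRune(rune(b)) // re-encode each byte as UTF-8
	}
	return &out, nil
}

// decodeTitle parses a Latin-1 document and returns the text of the
// <title> child of the root element (an assumed layout for this sketch).
func decodeTitle(doc []byte) (string, error) {
	dec := xml.NewDecoder(bytes.NewReader(doc))
	dec.CharsetReader = latin1Reader
	var v struct {
		Title string `xml:"title"`
	}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	return v.Title, nil
}

func main() {
	// \xe9 is "é" encoded as a single ISO-8859-1 byte.
	doc := []byte("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><monit><title>caf\xe9</title></monit>")
	title, err := decodeTitle(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(title)
}
```

Converting the bytes to a string beforehand, as toUtf8 does above, is not enough on its own: the XML declaration still names ISO-8859-1, so the decoder still asks for a CharsetReader.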

please tag and version this project

Hello,

Can you please tag and version this project?

I am the Debian Maintainer for xmlpath and versioning would help Debian keep up with development.

2022/01/12 17:05:34 xml: unsupported version "1.1"; only version 1.0 is supported

How should I deal with this error?
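The message comes from encoding/xml, which rejects any declared version other than 1.0. A rough workaround, assuming the document does not actually rely on XML 1.1 features, is to rewrite the declaration before parsing. The helper downgradeXMLVersion below is hypothetical and only handles the double-quoted form version="1.1"; the cleaned bytes could then be passed to xmlpath.Parse via bytes.NewReader.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// downgradeXMLVersion is a hypothetical helper: it rewrites version="1.1"
// to version="1.0" inside the XML declaration only, leaving the rest of the
// document untouched.
func downgradeXMLVersion(doc []byte) []byte {
	end := bytes.Index(doc, []byte("?>"))
	if end < 0 || !bytes.HasPrefix(bytes.TrimSpace(doc), []byte("<?xml")) {
		return doc // no declaration to rewrite
	}
	decl := bytes.Replace(doc[:end], []byte(`version="1.1"`), []byte(`version="1.0"`), 1)
	return append(decl, doc[end:]...)
}

// rootName parses the downgraded document and returns the name of its
// root element, which is enough to show that the decoder accepts it.
func rootName(doc []byte) (string, error) {
	dec := xml.NewDecoder(bytes.NewReader(downgradeXMLVersion(doc)))
	for {
		tok, err := dec.Token()
		if err != nil {
			return "", err
		}
		if start, ok := tok.(xml.StartElement); ok {
			return start.Name.Local, nil
		}
	}
}

func main() {
	doc := []byte(`<?xml version="1.1" encoding="UTF-8"?><status>ok</status>`)
	name, err := rootName(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```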

Project scope?

Will this project provide functions to set content, set attributes, etc.? Thanks!

Non-strict parser

Hi,

I am trying to parse some invalid HTML, but the xmlpath library is very strict about it.
_, err = xmlpath.ParseHTML(resp.Body)
returns an error of:
XML syntax error on line 12: expected attribute name in element

Is it possible to add a special method, or an argument to the existing method, e.g.
xmlpath.ParseHTML(r io.Reader, strict bool) ..., so that in non-strict mode it would just ignore invalid nodes? Thanks.

No Length Information Available for Iter

I'm using xmlpath for writing tests which verify that a generated XML file contains all required information. Unfortunately, there is currently no way to get the number of matched elements contained in an Iter. I would like to be able to write a test like this (using testify in this case):

path := xmlpath.MustCompile("//path/to/array")
nodes, _ := xmlpath.Parse(bytes.NewReader(xmlAsBytes))
iter := path.Iter(nodes)

require.Equal(3, iter.Length, "Expected 3 matches but got %d", iter.Length)
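Until something like a length field exists, one workaround is to drain the iterator and count. The sketch below is an assumption-laden illustration: nodeIter, countMatches and fakeIter are hypothetical names, and the interface relies only on the Next() bool method that the snippets on this page already call on the iterator. The fake iterator exists solely so the example runs without the library.

```go
package main

import "fmt"

// nodeIter captures the one iterator method this sketch relies on.
type nodeIter interface {
	Next() bool
}

// countMatches drains the iterator and reports how many nodes it yielded.
func countMatches(it nodeIter) int {
	n := 0
	for it.Next() {
		n++
	}
	return n
}

// fakeIter stands in for an xmlpath iterator so the sketch runs on its own;
// it yields `remaining` matches and then stops.
type fakeIter struct{ remaining int }

func (f *fakeIter) Next() bool {
	if f.remaining == 0 {
		return false
	}
	f.remaining--
	return true
}

func main() {
	fmt.Println(countMatches(&fakeIter{remaining: 3}))
}
```

In a test this would be called as countMatches(path.Iter(nodes)). Counting consumes the iterator, so a fresh one is needed to inspect the nodes afterwards.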

Infinite recursion with `%#v` formatter

Passing an xmlpath.Node to any of the fmt.*f functions results in infinite recursion and an eventual stack overflow. This happens because Node.nodes contains non-pointer values that recursively reference the nodes within it.

non-xmlpath reproduce
xmlpath reproduce

While the fmt package is the one at fault for not handling recursion, xmlpath might want to implement a GoString method to avoid the stack overflow for users.

I also reported it as Go issue 8241, mostly to get the fmt documentation updated.
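A minimal sketch of the GoString idea, using a hypothetical node type rather than the real xmlpath.Node: every node holds the shared slice of all nodes by value, which is the kind of layout the issue describes. Implementing fmt.GoStringer gives %#v a bounded representation, so fmt never walks the self-referential slice.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a self-referential value type: every
// node holds the shared slice of all nodes by value.
type node struct {
	name string
	all  []node
}

// GoString gives %#v a bounded representation instead of letting fmt
// descend into the shared slice again and again.
func (n node) GoString() string {
	return fmt.Sprintf("node{name:%q, all:<%d nodes>}", n.name, len(n.all))
}

// describe formats a node with %#v, which goes through GoString.
func describe(n node) string {
	return fmt.Sprintf("%#v", n)
}

func main() {
	all := make([]node, 2)
	all[0] = node{name: "a", all: all}
	all[1] = node{name: "b", all: all}
	fmt.Println(describe(all[0]))
}
```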

root.String() returns an "empty" string

package main

import (
    "bytes"
    "encoding/xml"
    "fmt"
    "strings"
    "gopkg.in/xmlpath.v2"
    "golang.org/x/net/html/charset"
)

func main() {
    r := bytes.NewBuffer([]byte(d))
    decoder := xml.NewDecoder(r)
    decoder.CharsetReader = charset.NewReaderLabel
    root, _ := xmlpath.ParseDecoder(decoder)

    fmt.Printf("% x\n", root.Bytes())

    p, _ := xmlpath.Compile("list_boxes/boxes/*")
    iter := p.Iter(root)
    for iter.Next() {
        elem := iter.Node()
        s := strings.TrimSpace(elem.String())
        fmt.Printf("node: [%s]\n", s)
    }
}

const d = `
<?xml version="1.0" encoding="windows-1251" ?>
<list_boxes>
  <boxes>
    <box Name="1" Caption="Àäìèíèñòðàòîð" >
      <Flags/>
    </box>
    <box Name="2" Caption="Îïåðàòîð">
      <Flags/>
    </box>
    </boxes>
</list_boxes>`

Basic tutorial

Would it be possible to have a basic tutorial which shows how to iterate over nodes which match a given XPath? I didn't find anything in the tests that detail that. Thanks.

Both Path.Iter and Path.String seem to ignore the context

I am working with HTML files (parsed with ParseHTML), and both Path.Iter() and Path.String() seem to ignore the context node I provide. Instead, they look for the element as if the context were the whole document.

Support multiple attribute conditions

It would be super nice to be able to use logical operators in the path expression

Example:

<test attr1="a" attr2="b">
     <a>aaaa</a>
     <b>bbb</b>
</test>

Xpath:

/test[@attr1='a' and @attr2='b']/a

This results in a panic currently.

Active maintainer needed

There are numerous valid PRs that do not seem to have received any attention for a long time. If the original maintainers are too busy with other projects, may I suggest inviting a new maintainer to the project?

How to get matched node names

First, I am a newbie in Go, but I am trying to pass the XML test, which is one of the most difficult ones for a new language (who wants to use XML anyway :-) ).

Is there a way to print just the names of the nodes that have been matched?
In the iterator below, I would like to get only the node name for each Node, but when I stringify it I get the complete node content.

capability_path := xmlpath.MustCompile("//Capability/Request/*")

root, err := xmlpath.Parse(reader)
if err != nil {
    log.Fatal(err)
}
iter := capability_path.Iter(root)
for iter.Next() {
    elem := iter.Node()
    fmt.Println("Found:[", elem.String(), "]")
}

Thanks in advance

ok deleted

Equals operator index out of range

When the library tries to compare the value of a node with a string that matches the first few characters, it breaks. So it tries to match "1" to "10" and experiences an index out of range error due to an off-by-one error in node text length calculation.

https://gist.github.com/KevBurnsJr/bf71fe52cdc5cc706609

XPath in iterated sub-nodes

Hello,

I'm facing an issue here, and I don't know whether this is standard XPath behavior or a bug in the library.

Consider this XML string:

const xmlList = `
<list>
	<element>
		<title>foo</title>
	</element>
	<element>
		<title>bar</title>
	</element>
</list>
`

and this basic function:

func main() {
	elementPath := xmlpath.MustCompile(`//element`)
	titlePath := xmlpath.MustCompile(`//title`) // => same as /list/element/title
	root, err := xmlpath.Parse(strings.NewReader(xmlList))
	if err != nil {
		panic(err)
	}

	elementsIter := elementPath.Iter(root)
	for elementsIter.Next() {
		element := elementsIter.Node()
		fmt.Println("[DEBUG]", element.String())
		if title, ok := titlePath.String(element); ok {
			fmt.Println("[title]", title)
		}
		fmt.Println("--")
	}
}

It is meant to iterate through the <element> nodes of the XML and, for each of them, extract its title with a sub-XPath.

The output is (I removed newlines and tabs for clarity's sake):

[DEBUG] foo
[title] foo
--
[DEBUG] bar
[title] foo
--

i.e., elements are iterated correctly, but titles are extracted from the root node (the same result is obtained if I use titlePath := xmlpath.MustCompile("/list/element/title")), even though I clearly specified that parsing should start from the element node.

Does anybody know what is going on here?

Thank you

LGPLv3 and static linking

I would love to use this library in the closed source application that the company I work for is building, but since LGPL forces me to either open source our code base or link dynamically with a library that is already present on the user's machine, I don't think it is possible to use your work.

So what is the reasoning behind licensing it as LGPLv3? Or do you simply only want people who build open source to use this library?

http://stackoverflow.com/questions/376306/is-empty-string-valid-xml

Should I try to create a PR?
