xmlpath's People

Contributors

niemeyer

xmlpath's Issues

Convenience Functions

Do you feel there'd be any value in adding the following two convenience functions? I find myself rewriting them over and over with literally no variation (and others seem to as well, e.g. Issue #11), so I got to thinking it would be really useful to pull them into the package itself.

This is not tested code, just meant for discussion (function names are inspired by the standard regexp library - I'm not married to them):

import (
    "errors"
    "strings"

    "gopkg.in/xmlpath.v2"
)

// FindAllString returns the trimmed string value of every node matching xpath.
func FindAllString(xml, xpath string) ([]string, error) {
    path, err := xmlpath.Compile(xpath)
    if err != nil {
        return nil, err
    }

    root, err := xmlpath.Parse(strings.NewReader(xml))
    if err != nil {
        return nil, err
    }

    ss := []string{}

    iter := path.Iter(root)
    for iter.Next() {
        s := iter.Node().String()
        ss = append(ss, strings.TrimSpace(s))
    }

    return ss, nil
}

// FindString returns the first matching node's string value.
// Use when there's only one expected matching node.
func FindString(xml, xpath string) (string, error) {
    ss, err := FindAllString(xml, xpath)
    if err != nil {
        return "", err
    }
    // Guard against indexing an empty slice when nothing matched.
    if len(ss) == 0 {
        return "", errors.New("no matching node")
    }

    return ss[0], nil
}

How do you set the Charset Reader?

I'm trying to set the Decoder.CharsetReader for .Parse because I'm getting:

xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil

My code:

package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
  "bytes"
  "strings"
  "os"
  "gopkg.in/xmlpath.v2"
)

type BarkClient struct {
  Protocol string
  User string
  Password string
  Subdomain string
  Host string
  Port string
}

type Monit struct {

}

func CreateClient(protocol, user, password, subdomain, host, port string) BarkClient {
  client := &BarkClient{
    Protocol: protocol,
    User: user,
    Password: password,
    Subdomain: subdomain,
    Host: host,
    Port: port,
  }

  return *client
}

func toUtf8(iso8859_1_buf []byte) string {
    buf := make([]rune, len(iso8859_1_buf))
    for i, b := range iso8859_1_buf {
        buf[i] = rune(b)
    }
    return string(buf)
}

func main() {
  var buffer bytes.Buffer

  client := CreateClient("http", "admin", "monit", "", "localhost", "2812",)

  array := [9]string{"http://", client.User, ":", client.Password, "@", client.Host, ":", client.Port, "/_status?format=xml"}

  for _, elem := range array {
    buffer.WriteString(elem)
  }

  url := buffer.String()

  response, err := http.Get(url)
  if err != nil {
    fmt.Printf("%s", err)
    os.Exit(1)
  } else {
    defer response.Body.Close()
    contents, err := ioutil.ReadAll(response.Body)
    if err != nil {
      fmt.Printf("%s", err)
      os.Exit(1)
    }

    utf_string_xml := toUtf8(contents)
    node, err := xmlpath.Parse(strings.NewReader(utf_string_xml))
    if err != nil {
      fmt.Println(err)
    }

    fmt.Println(node)
  }
}

Output:

~/g/G/s/g/k/gobark ❯❯❯ go install && $GOPATH/bin/gobark
xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil
<nil>

How do I move forward with parsing different character sets?

please tag and version this project

Hello,

Can you please tag and version this project?

I am the Debian Maintainer for xmlpath and versioning would help Debian keep up with development.

Project scope?

Will this project provide functions to set content, set attributes, etc.? Thanks!

Non-strict parser

Hi,

I am trying to parse some invalid HTML. The xmlpath library, however, is being very strict about it.
_, err = xmlpath.ParseHTML(resp.Body)
returns an error of:
XML syntax error on line 12: expected attribute name in element

Is it possible to add a special method, or an argument to the existing method:
xmlpath.ParseHTML(r io.Reader, strict bool) ..., so that in non-strict mode it would just ignore invalid nodes? Thanks.

No Length Information Available for Iter

I'm using xmlpath for writing tests which verify that a generated XML file contains all required information. Unfortunately, there is currently no way to get the number of matched elements contained in an Iter. I would like to be able to write a test like this (using testify in this case):

path := xmlpath.MustCompile("//path/to/array")
nodes, _ := xmlpath.Parse(bytes.NewReader(xmlAsBytes))
iter := path.Iter(nodes)

require.Equal(t, 3, iter.Length, "Expected 3 matches but got %d", iter.Length)

Infinite recursion with `%#v` formatter

Giving any of the fmt.*f functions an xmlpath.Node with the `%#v` verb will result in infinite recursion and an eventual stack overflow. This happens because Node.nodes contains non-pointer values that recursively reference the nodes slice they live in.

non-xmlpath reproduce
xmlpath reproduce

While the fmt package is the one at fault for not handling recursion, xmlpath might want to implement a GoString method to avoid the stack overflow for users.

I also reported it as a Go issue (issue 8241), mostly to get the fmt documentation updated.
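
A minimal sketch of the GoString idea, assuming a simplified stand-in type: node below is hypothetical and is not xmlpath's real Node. It mimics the shape described in this issue, where a slice of struct values contains elements that refer back to the same slice. fmt calls GoString for `%#v` when the operand implements it, so returning a short summary keeps fmt from walking the self-referential slice.

```go
package main

import "fmt"

// node is a hypothetical stand-in: a struct value stored in a slice that
// it also references, similar to the layout described in this issue.
type node struct {
	name  string
	nodes []node
}

// GoString is used by fmt for the %#v verb. It prints a summary instead
// of descending into the self-referential nodes slice.
func (n node) GoString() string {
	return fmt.Sprintf("node{name:%q, nodes:%d}", n.name, len(n.nodes))
}

// describe formats a node with %#v, which dispatches to GoString.
func describe(n node) string {
	return fmt.Sprintf("%#v", n)
}

func main() {
	// Every element points back at the slice that holds it.
	nodes := make([]node, 2)
	nodes[0] = node{name: "root", nodes: nodes}
	nodes[1] = node{name: "child", nodes: nodes}
	fmt.Println(describe(nodes[0]))
}
```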

root.String() returns "empty" string

package main

import (
    "bytes"
    "encoding/xml"
    "fmt"
    "log"
    "strings"

    "golang.org/x/net/html/charset"
    "gopkg.in/xmlpath.v2"
)

func main() {
    r := bytes.NewBuffer([]byte(d))
    decoder := xml.NewDecoder(r)
    decoder.CharsetReader = charset.NewReaderLabel
    // Report parse and compile errors instead of discarding them.
    root, err := xmlpath.ParseDecoder(decoder)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("% x\n", root.Bytes())

    p, err := xmlpath.Compile("list_boxes/boxes/*")
    if err != nil {
        log.Fatal(err)
    }
    iter := p.Iter(root)
    for iter.Next() {
        elem := iter.Node()
        s := strings.TrimSpace(elem.String())
        fmt.Printf("node: [%s]\n", s)
    }
}

const d = `
<?xml version="1.0" encoding="windows-1251" ?>
<list_boxes>
  <boxes>
    <box Name="1" Caption="Àäìèíèñòðàòîð" >
      <Flags/>
    </box>
    <box Name="2" Caption="Îïåðàòîð">
      <Flags/>
    </box>
    </boxes>
</list_boxes>`

Basic tutorial

Would it be possible to have a basic tutorial that shows how to iterate over the nodes matching a given XPath? I didn't find anything in the tests that details that. Thanks.

Support multiple attribute conditions

It would be super nice to be able to use logical operators in the path expression.

Example:

<test attr1="a" attr2="b">
     <a>aaaa</a>
     <b>bbb</b>
</test>

Xpath:

/test[@attr1='a' and @attr2='b']/a

This results in a panic currently.

Active maintainer needed

There are numerous very valid PRs that don't seem to have received any attention for a long time. If the original maintainers are too busy with other projects, may I suggest inviting a new maintainer to the project?

How to get matched node names

First, I am a newbie in Go, but I am trying to pass the XML test, which is one of the most difficult ones for a new language (who wants to use XML anyway :-) ).

Is there a way to print just the names of the nodes that have been matched?
In the iterator below, I would like to get only the node name for each Node, but when I stringify it I get the complete node content.

capability_path := xmlpath.MustCompile("//Capability/Request/*")

root, err := xmlpath.Parse(reader)
if err != nil {
    log.Fatal(err)
}
iter := capability_path.Iter(root)
for iter.Next() {
    elem := iter.Node()
    fmt.Println("Found:[", elem.String(), "]")
}

Thanks in advance

XPath in iterated sub-nodes

Hello,

I'm facing an issue here, and I don't know if this is standard XPath behavior or a bug in the lib...

Consider this XML string:

const xmlList = `
<list>
	<element>
		<title>foo</title>
	</element>
	<element>
		<title>bar</title>
	</element>
</list>
`

and this basic function:

func main() {
	elementPath := xmlpath.MustCompile(`//element`)
	titlePath := xmlpath.MustCompile(`//title`) // => same as /list/element/title
	root, err := xmlpath.Parse(strings.NewReader(xmlList))
	if err != nil {
		panic(err)
	}

	elementsIter := elementPath.Iter(root)
	for elementsIter.Next() {
		element := elementsIter.Node()
		fmt.Println("[DEBUG]", element.String())
		if title, ok := titlePath.String(element); ok {
			fmt.Println("[title]", title)
		}
		fmt.Println("--")
	}
}

It is meant to iterate through the <element> nodes of the XML and, for each of them, extract its title with a sub-XPath.

The output is (I removed newlines and tabs, for clarity's sake):

[DEBUG] foo
[title] foo
--
[DEBUG] bar
[title] foo
--

i.e., elements are iterated correctly, but titles are extracted from the root node (the same result is obtained if I use titlePath := xmlpath.MustCompile("/list/element/title")), even though I clearly asked to evaluate the path from the element node.

Does anybody know what is going on here?

Thank you

LGPLv3 and static linking

I would love to use this library in the closed source application that the company I work for is building, but since LGPL forces me to either open source our code base or link dynamically with a library that is already present on the user's machine, I don't think it is possible to use your work.

So what is the reasoning behind licensing it as LGPLv3? Or do you simply only want people who build open source to use this library?

Iterate through nodes

When I try to iterate through the root node obtained from ParseHTML, I cannot use any of the unexported properties such as the nodes array or the kind property. How can I find all of the text nodes?

I am trying to iterate through the hierarchy of nodes and filter out the text to be used later for searching.
