rreverser / serde-xml-rs

xml-rs based deserializer for Serde (compatible with 1.0+)

Home Page: https://crates.io/crates/serde-xml-rs

License: MIT License

Rust 100.00%
serde deserialization xml rust parsing

serde-xml-rs's Introduction

serde-xml-rs

xml-rs based deserializer for Serde (compatible with 1.0)

Example usage

use serde::{Deserialize, Serialize};
use serde_xml_rs::{from_str, to_string};

#[derive(Debug, Serialize, Deserialize, PartialEq)]
struct Item {
    name: String,
    source: String,
}

fn main() {
    let src = r#"<Item><name>Banana</name><source>Store</source></Item>"#;
    let should_be = Item {
        name: "Banana".to_string(),
        source: "Store".to_string(),
    };

    let item: Item = from_str(src).unwrap();
    assert_eq!(item, should_be);

    let reserialized_item = to_string(&item).unwrap();
    assert_eq!(src, reserialized_item);
}

serde-xml-rs's People

Contributors

01d55, agentydragon, axfaure, dushistov, duxet, efx, farodin91, ignatenkobrain, jgalenson, michael-f-bryan, oeb25, oli-obk, ortham, punkstarman, rreverser, rustysec, sd2k, tobz1000

serde-xml-rs's Issues

Option not working with rename macro

The following code fails with:
thread 'it_works' panicked at 'called `Result::unwrap()` on an `Err` value: Expected token XmlEvent::StartElement { name, attributes, .. }, found EndElement(Item)', /checkout/src/libcore/result.rs:859

#[macro_use] extern crate serde_derive;
extern crate serde_xml_rs;

use serde_xml_rs::deserialize;

#[derive(PartialEq, Debug, Serialize, Deserialize)]
struct Item {
    pub name: String,
    pub source: String,
}

#[derive(PartialEq, Debug, Serialize, Deserialize)]
struct Project {
    pub name: String,

    #[serde(rename = "Item")]
    pub items: Option<Item>,
}

#[test]
fn it_works() {
    let s = r##"
        <Project name="my_project">
            <Item name="hello" source="world.rs"/>
        </Project>
    "##;
    let project: Project = deserialize(s.as_bytes()).unwrap();
    println!("{:#?}", project);
}

If I remove #[serde(rename = "Item")] and rename Project.items to Project.item the test runs without a failure.

Fails to parse numbers

The deserialize function can't currently deserialize numbers from the XML input.

#[derive(Debug, Deserialize, PartialEq)]
struct StringAndU32 {
  pub path: String,
  pub num: u32,
}

#[test]
fn string_and_u32() {
  let s = r##"
    <string_and_u32 path="/dev/null" num="2020"/>
  "##;
  let result: StringAndU32 = deserialize(s.as_bytes()).unwrap();
  assert_eq!(result,
    StringAndU32{
      path: "/dev/null".to_string(),
      num: 2020,
    }
  );
}

serialize duplicates field name/type

#[derive(Default, Debug, Clone, PartialEq, Serialize)]
pub struct Message<'a, T> {
	#[serde(rename = "ARTSCommonHeader")]
	pub arts_common_header: ArtsCommonHeader<'a>,
	pub content: T,
}

#[derive(Default, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct ArtsCommonHeader<'a> {
	#[serde(rename = "MessageType")]
	pub message_type: &'a str,
}

#[derive(Default, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct SystemTransaction<'a> {
	#[serde(rename = "DeviceTime")]
	pub device_time: &'a str,
	#[serde(rename = "ActionCode")]
	pub action_code: &'a str,
}

let s = Message {
	arts_common_header: ArtsCommonHeader {
		message_type: "Request"
	},
	content: SystemTransaction {
		action_code: "Update",
		device_time: &time.to_string(),
	}
};

"<Message><ARTSCommonHeader><ArtsCommonHeader><MessageType>Request</MessageType></ArtsCommonHeader></ARTSCommonHeader><content><SystemTransaction><DeviceTime>2017-10-11 12:53:00</DeviceTime><ActionCode>Update</ActionCode></SystemTransaction></content></Message>"

It serializes <ARTSCommonHeader><ArtsCommonHeader>...</ArtsCommonHeader></ARTSCommonHeader>. This shouldn't happen!

Btw, what I want to get is:
"<Message><ARTSCommonHeader MessageType="Request"/><SystemTransaction ActionCode="Update"><DeviceTime>2017-10-11T12:53:00</DeviceTime></SystemTransaction></Message>"

How to ignore attributes/children when deserializing?

How can I parse an XML document and ignore the attributes and child elements that I don't need in Rust?
E.g.:

<?xml version="1.0" encoding="UTF-8"?>
<k foo="asd">
    <pan h="0" v="0" f="90"/>
    <view fisheye="0" limitview="range" fov="75" hlookatmin="-180" hlookatmax="180" vlookatmin="67" vlookatmax="-90" fovmin="40" fovmax="140" hlookat="-20" vlookat="0"/>
    <autorotate enabled="true" waittime="20" speed="1"/>
    <image type="CUBE" multires="false">
        <left url="0_307_2560_50.jpg"/>
        <front url="4_307_2560_85.jpg"/>
        <right url="5_307_2560_50.jpg"/>
        <back url="1_307_2560_50.jpg"/>
        <up url="2_307_2560_85.jpg"/>
        <down url="3_307_2560_40.jpg"/>

        <mobile>
            <left url="0_307_1024_50.jpg"/>
            <front url="4_307_1024_50.jpg"/>
            <right url="5_307_1024_50.jpg"/>
            <back url="1_307_1024_50.jpg"/>
            <up url="2_307_1024_50.jpg"/>
            <down url="3_307_1024_20.jpg"/>
        </mobile>
        <tablet>
            <left url="0_307_1024_50.jpg"/>
            <front url="4_307_1024_50.jpg"/>
            <right url="5_307_1024_50.jpg"/>
            <back url="1_307_1024_50.jpg"/>
            <up url="2_307_1024_50.jpg"/>
            <down url="3_307_1024_30.jpg"/>
        </tablet>
    </image>
</k>

I only care about the <image> so I wrote:

#[derive(Debug, Deserialize)]
struct K {
	#[serde(rename = "$value")]
	items: Vec<KItems>,
}

#[derive(Debug, Deserialize)]
#[allow(non_camel_case_types)]
enum KItems {
	pan(), //{ h: u32, v: u32, f: u32 },
	view,
	autorotate,
	image(Image),
}

#[derive(Debug, Deserialize)]
struct Image {
	#[serde(rename = "type")]
	typ: String,
	multires: bool,
	right: Side,
	left: Side,
	up: Side,
	down: Side,
	front: Side,
	back: Side,
}

#[derive(Debug, Deserialize)]
struct Side {
	url: String
}

But when I try to parse the above doc, the program allocates more and more GB of RAM, even surpassing 18 GB and doesn't return so I have to kill it.

  1. This should never happen. Why does it happen?
  2. How can I ignore all the attributes/child items I don't care about?

I tried pan, pan(), but neither seems to work...

Thanks!

Can't deserialize f32

In trying to deserialize:

#[derive(Debug, Deserialize)]
struct NoisePower {
    value: f32,
    units: String
}

#[derive(Debug, Deserialize)]
struct ReceiverReport {
    #[serde(rename = "NoisePower")]
    noise_power: NoisePower
}

#[test]
fn receiver_noise_report_a() {
    let xml = r##"
        <!--<?xml version=”1.0” standalone=”yes”?>-->
        <ReceiverReport>
            <NoisePower value="-152.3" units="dBm"/>
        </ReceiverReport>
        "##;
    let report: ReceiverReport = deserialize(xml.as_bytes()).unwrap();
    println!("{:#?}", report);
}

I get:

'tests::receiver_noise_report_a' panicked at 'called Result::unwrap() on an Err value: invalid type: string "-152.3", expected f32', /checkout/src/libcore/result.rs:906:4

How do I ensure that NoisePower::value is deserialized as f32?
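The error message shows that the attribute text reaches serde as a string rather than as a number. As a minimal sketch of a workaround, assuming you are willing to convert the text yourself, the standard library's `str::parse` does the conversion. The helper name `parse_f32_attr` is hypothetical and not part of serde-xml-rs; it could be called from a custom function wired in through serde's `deserialize_with` field attribute.

```rust
use std::num::ParseFloatError;

/// Converts the raw text of an XML attribute (e.g. `value="-152.3"`)
/// into an `f32`. Hypothetical helper, not part of serde-xml-rs.
fn parse_f32_attr(raw: &str) -> Result<f32, ParseFloatError> {
    // Trim first: attribute and element text often carries whitespace.
    raw.trim().parse::<f32>()
}
```

Calling `parse_f32_attr("-152.3")` yields the float, while non-numeric text such as `"dBm"` yields an error instead of a panic.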

I am unable to map some parts of vk.xml to serde

From the vulkan spec:

    <types comment="Vulkan type definitions">
        <type name="vk_platform" category="include">#include "vk_platform.h"</type>

            <comment>WSI extensions</comment>
        <type category="include">#include "<name>vulkan.h</name>"</type>
        <type category="include">#include &lt;<name>X11/Xlib.h</name>&gt;</type>
        <type category="include">#include &lt;<name>X11/extensions/Xrandr.h</name>&gt;</type>
        <type category="include">#include &lt;<name>android/native_window.h</name>&gt;</type>
        <type category="include">#include &lt;<name>mir_toolkit/client_types.h</name>&gt;</type>
        <type category="include">#include &lt;<name>wayland-client.h</name>&gt;</type>
        <type category="include">#include &lt;<name>windows.h</name>&gt;</type>
        <type category="include">#include &lt;<name>xcb/xcb.h</name>&gt;</type>

        <type requires="X11/Xlib.h" name="Display"/>
        <type requires="X11/Xlib.h" name="VisualID"/>
        <type requires="X11/Xlib.h" name="Window"/>
        <type requires="X11/extensions/Xrandr.h" name="RROutput"/>
        <type requires="android/native_window.h" name="ANativeWindow"/>
        <type requires="mir_toolkit/client_types.h" name="MirConnection"/>
        <type requires="mir_toolkit/client_types.h" name="MirSurface"/>
        <type requires="wayland-client.h" name="wl_display"/>
        <type requires="wayland-client.h" name="wl_surface"/>
        <type requires="windows.h" name="HINSTANCE"/>
        <type requires="windows.h" name="HWND"/>
        <type requires="windows.h" name="HANDLE"/>
        <type requires="windows.h" name="SECURITY_ATTRIBUTES"/>
        <type requires="windows.h" name="DWORD"/>
        <type requires="windows.h" name="LPCWSTR"/>
        <type requires="xcb/xcb.h" name="xcb_connection_t"/>
        <type requires="xcb/xcb.h" name="xcb_visualid_t"/>
        <type requires="xcb/xcb.h" name="xcb_window_t"/>

I have no idea how this would map to serde. I am starting to believe that it cannot be expressed at all.

I am really confused, any help is greatly appreciated.

Return error on composite key.

// TODO: Is it possible to ensure our key is never a composite type?

IIUC this can be done by creating a new serializer that is restricted to "key safe" types only, instead of returning the parent serializer. That serializer then returns an error for anything that can't be used as a key.
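The idea can be sketched without serde at all. The `Value` and `KeyError` types and the `key_to_string` function below are hypothetical stand-ins for illustration, not the crate's actual types; a real implementation would be a separate `serde::Serializer` whose compound methods return an error.

```rust
/// Hypothetical value model used only for this sketch.
#[derive(Debug)]
enum Value {
    Str(String),
    Int(i64),
    Bool(bool),
    Seq(Vec<Value>),
    Map(Vec<(Value, Value)>),
}

#[derive(Debug, PartialEq)]
enum KeyError {
    /// The key was a sequence or a map, which cannot name an XML element.
    CompositeKey,
}

/// Accepts only "key safe" scalar values and rejects composite ones.
fn key_to_string(key: &Value) -> Result<String, KeyError> {
    match key {
        Value::Str(s) => Ok(s.clone()),
        Value::Int(i) => Ok(i.to_string()),
        Value::Bool(b) => Ok(b.to_string()),
        Value::Seq(_) | Value::Map(_) => Err(KeyError::CompositeKey),
    }
}
```

The point of the design is that the failure becomes a returned error at the key position rather than malformed output later on.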

Parsing wrapped primitives

I have an element which can take either of the following forms:

<plaintext>some text</plaintext>
<plaintext/>

It can also be missing entirely.

What is the proper way to handle this scenario with serde-xml-rs?

(Continued from serde-deprecated/xml#35)

failed with a BOM file

If the file starts with a BOM, deserialize fails.

The error reported is:

Error { pos: 1:1, kind: Syntax("Unexpected characters outside the root element: \u{feff}") }
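Until the parser skips the mark itself, a possible workaround is to drop a leading UTF-8 BOM before handing the input to the deserializer. This is a minimal sketch; the function names `strip_bom` and `strip_bom_bytes` are hypothetical helpers, not part of serde-xml-rs.

```rust
/// Removes a leading U+FEFF (byte order mark) from already-decoded text.
fn strip_bom(input: &str) -> &str {
    input.strip_prefix('\u{feff}').unwrap_or(input)
}

/// Removes the UTF-8 encoding of the BOM (EF BB BF) from raw bytes.
fn strip_bom_bytes(input: &[u8]) -> &[u8] {
    const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];
    if input.starts_with(&UTF8_BOM) {
        &input[UTF8_BOM.len()..]
    } else {
        input
    }
}
```

Both helpers return the input unchanged when no BOM is present, so they are safe to apply unconditionally.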

Advanced #[serde(rename)] impl

This needs detailed discussion. My vision is a #[serde(rename = "$xml:x")] syntax:

  • $xml:tag:value or $xml:value for a tag value
  • $xml:ns:name for a namespace value
  • $xml:attr:name for an attribute value

Also something like $path (may be hard to implement).

Most XML documents look like this:

<Reply>
    <Container>
        <Item>1</Item>
        <Item>2</Item>
    </Container>
</Reply>

But usually we don't need the container element; we just need to collect the items and skip Reply/Container:

struct X {
    #[serde(rename = "$xml:path:Reply/Container/Item")]
    items: Vec<i32>
}

Finally, maybe implement something like XPath? (a goal for v2.0)
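To make the proposal concrete, here is a rough sketch of how such rename strings might be parsed. Everything in it is hypothetical: the `XmlRename` enum, the `parse_rename` function and the `$xml:` syntax itself are taken from the proposal above and do not exist in serde-xml-rs. The `$xml:value` shorthand is left out for brevity.

```rust
/// Hypothetical parsed form of a `$xml:...` rename directive.
#[derive(Debug, PartialEq)]
enum XmlRename<'a> {
    Tag(&'a str),
    Namespace(&'a str),
    Attr(&'a str),
    /// Element path such as Reply/Container/Item, split into segments.
    Path(Vec<&'a str>),
}

/// Returns `None` for plain names that do not use the `$xml:` prefix
/// or that use an unknown directive kind.
fn parse_rename(spec: &str) -> Option<XmlRename<'_>> {
    let rest = spec.strip_prefix("$xml:")?;
    let (kind, arg) = rest.split_once(':')?;
    match kind {
        "tag" => Some(XmlRename::Tag(arg)),
        "ns" => Some(XmlRename::Namespace(arg)),
        "attr" => Some(XmlRename::Attr(arg)),
        "path" => Some(XmlRename::Path(arg.split('/').collect())),
        _ => None,
    }
}
```

A plain rename such as "Item" falls through to `None`, so existing behaviour would be unaffected by the new prefix.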

Affect

Namespace

Is there any way to get the namespace during deserialization?
And to add namespace prefixes (and their definitions) during serialization?

Deserialize with xml header

I'm trying to deserialize xml that starts with:

<?xml version=”1.0” standalone=”yes”?>

but it fails. Is there a way to ignore the header?
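One possible cause, offered as an assumption rather than a diagnosis: if the declaration in the real input uses typographic quotation marks (U+201C/U+201D) instead of ASCII quotes, it is not well-formed XML, and any conforming parser will reject it. A minimal sketch of normalizing such quotes before parsing follows; `normalize_quotes` is a hypothetical helper, and note that it also rewrites typographic quotes inside text content, which may not be what you want.

```rust
/// Replaces typographic quotation marks with their ASCII equivalents.
/// Caution: this touches the whole document, including text nodes.
fn normalize_quotes(xml: &str) -> String {
    xml.chars()
        .map(|c| match c {
            '\u{201C}' | '\u{201D}' => '"',
            '\u{2018}' | '\u{2019}' => '\'',
            other => other,
        })
        .collect()
}
```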

Unable to handle structures with varying keys

I'm having trouble deserializing input like this:

<parent attr="value">
  <required_field>some other value</required_field>
  <kind_a>kind a</kind_a>
  <kind_b>32</kind_b>
</parent>

into a structure like this:

#[serde(untagged)]
enum AnEnum {
  KindA(String),
  KindB(i32),
}

struct MyStruct {
  attr: String,
  required_field: String,
  kinds: Vec<AnEnum>,
}

I can get it to sort of work if I use #[serde(rename="$value")] for kinds, but this fails because then all fields go there. Am I stuck with having to implement deserialization myself, or is there something I can do to make this simpler? The XML is fixed and the Rust side is flexible here, so if structure changes are required, that's fine. Also, there is (luckily) at most one kinds field per structure, but the presence of required_field makes it complicated to go down that route (AFAICT).

Update readme.md

Todo

  • serde version 1.0
  • code example should work with current syntax
  • add serializer

Deserializing fields with values

Which Rust structure should I use for the following situation?

<MyStruct>
  <Foo a="351">
    <deDE>Flugzeug</deDE>
    <enUS>Airplane</enUS>
  </Foo>
  <Foo a="342" b="bar">Triangle</Foo>
</MyStruct>

I could not find a suitable example describing a field that has both attributes and a text value, like <a b="c">k</a>.

Thanks!

Deserializing lists

The following XML can be successfully deserialized into the structs below.

<root>
  <someTag>
  </someTag>
  <serviceList>
    <service>
      ...
    </service>
    <service>
      ...
    </service>
  </serviceList>
</root>

struct Root {
    some_tag: SomeTag,
    service_list: ServiceList,
}
struct ServiceList {
    service: Vec<Service>,
}
struct Service {
}

However, is it possible to avoid creating the ServiceList struct and just make the service_list member a Vec instead?

Serializing should use the XML library instead of printing strings

When I initially wrote the Serializer I was just using write!() to write the XML as a string to an internal io::Writer.

Ideally we should be using the xml-rs library instead, so that we can guarantee the generated XML is syntactically correct; we would also get things like pretty-printing for free.
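To illustrate why hand-written string output is fragile, here is a minimal sketch of the escaping that a write!()-based serializer has to get right by itself. The functions `escape_xml` and `write_element` are hypothetical and only show the kind of work a proper XML writer already does for you.

```rust
/// Escapes the five characters that XML predefines entities for.
fn escape_xml(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            other => out.push(other),
        }
    }
    out
}

/// Writes `<name>text</name>` with the text escaped.
/// The element name is not validated here, which is one more
/// thing a real XML writer would check.
fn write_element(name: &str, text: &str) -> String {
    format!("<{0}>{1}</{0}>", name, escape_xml(text))
}
```

Forgetting the escape step in even one code path produces output that no XML parser accepts, which is the risk the issue describes.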

Infinite Loop when parsing Vec

This results in an infinite loop:

let tickets: Vec<Ticket> = deserialize(xml.as_bytes()).unwrap();

This happens when it is used on an XML document that has multiple root elements.

How to serialize struct field as xml attribute?

How to serialize a struct field as xml attribute?

What I want to get is:
"<Message><ARTSCommonHeader MessageType="Request"/><SystemTransaction ActionCode="Update"><DeviceTime>2017-10-11T12:53:00</DeviceTime></SystemTransaction></Message>"

But if I use normal struct fields for MessageType and ActionCode, they are not serialized as attributes but as tags (<MessageType>Request</MessageType> and <ActionCode>Update</ActionCode>).

Is there a #[annotation] I should use for the struct fields?

serialize doesn't work

The following code:

#[macro_use]
extern crate serde_derive;
extern crate serde_xml_rs;
//extern crate serde;

use serde_xml_rs::*;

#[derive(Debug, Deserialize, Serialize)]
struct Item {
    pub name: String,
    pub source: String,
}

#[derive(Debug, Deserialize, Serialize)]
struct Project {
    pub name: String,

    #[serde(rename = "Item", default)]
    pub items: Vec<Item>,
}

fn get_project() -> Project {
    let s = r##"




"##;
    let project: Project = deserialize(s.as_bytes()).unwrap();
    project
}

fn main() {
    let p: Project = get_project();
    println!("\n\ndeserialized\n{:#?}", p);

    let mut buffer = Vec::new();

    serialize(&p, &mut buffer);
    println!("\nserialized buffer:\n{:?}", buffer);
    let serialized = String::from_utf8(buffer).unwrap();
    println!("\nserialized:\n{}", serialized);
}

produces output:

deserialized
Project {
    name: "my_project",
    items: [
        Item {
            name: "hello",
            source: "world.rs"
        },
        Item {
            name: "goodbye",
            source: "world.rs"
        }
    ]
}

serialized buffer:
[60, 80, 114, 111, 106, 101, 99, 116, 62, 60, 110, 97, 109, 101, 62, 109, 121, 95, 112, 114, 111, 106, 101, 99, 116, 60, 47, 110, 97, 109, 101, 62, 60, 73, 116, 101, 109, 62]

serialized:
<Project><name>my_project</name><Item>
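Decoding the byte buffer printed above shows exactly what the serializer wrote before it stopped. The sketch below uses only the standard library; `reported_buffer` simply reproduces the bytes from the output above, and both function names are hypothetical.

```rust
/// The bytes printed under "serialized buffer" in the report above.
fn reported_buffer() -> Vec<u8> {
    vec![
        60, 80, 114, 111, 106, 101, 99, 116, 62, 60, 110, 97, 109, 101, 62,
        109, 121, 95, 112, 114, 111, 106, 101, 99, 116, 60, 47, 110, 97, 109,
        101, 62, 60, 73, 116, 101, 109, 62,
    ]
}

/// Decodes the buffer as UTF-8, replacing any invalid sequences.
fn decode_buffer(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

The decoded text ends right after the first opening Item tag, so the document is truncated: no item contents and no closing tags were written.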

Add streaming deserialization

I merged #32 without realizing the implications.

Now we can parse

<item name="hello" source="world.rs" />
<item name="hello" source="world.rs" />

as a sequence, even though it is malformed XML.

On the other hand, serde-json has a way to deserialize objects from a stream. I think we should forbid this deserialization from working in the regular case, but add an additional way to get an iterator that produces multiple values from an input like the above.

Serialization Support

From serde-deprecated/xml#35 it sounds like this might soon be the de facto crate for using xml with serde. Are there any plans for implementing Serialization?

If so, is there anything I can do to help? I'm wanting to serialize to and from XML in one of my projects and it'd be nice to use serde instead of having to implement it myself with some sort of ToXML trait.

Unable to deserialize newtype structs

I just encountered an issue when deserializing a newtype struct. You usually use a newtype struct (e.g. struct Foo(Bar)) to put something inside a tag, however serde_xml_rs seems to want to treat it as a map.

Recycling bits from the tests/test.rs file:

#[derive(Debug, PartialEq, Deserialize)]
struct NewType(Item);

#[test]
fn newtypes_are_just_wrappers_around_the_inner_type() {
    let src = "<NewType>
                <Item>
                    <name>foo</name>
                    <source>/path/to/foo</source>
                </Item>
            </NewType>";

    let should_be = NewType(Item {
        name: String::from("foo"),
        source: String::from("/path/to/foo"),
    });

    let got: NewType = from_str(src).unwrap();
    assert_eq!(got, should_be);
}

When run (cargo test --tests), this errors with:

---- newtypes_are_just_wrappers_around_the_inner_type stdout ----
	DEBUG - Peeked StartElement(NewType, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
DEBUG - Fetched StartElement(NewType, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
thread 'newtypes_are_just_wrappers_around_the_inner_type' panicked at 'called `Result::unwrap()` on an `Err` value: Error(Custom("invalid type: map, expected tuple struct NewType"), State { next_error: None, backtrace: None })', /checkout/src/libcore/result.rs:906:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.

failing tests from serde-xml

test test_option ... FAILED
test test_parse_attributes ... FAILED
test test_parse_bool ... FAILED
test test_parse_complexstruct ... FAILED
test test_parse_enum ... FAILED
test test_parse_f64 ... FAILED
test test_parse_hierarchies ... FAILED
test test_parse_i64 ... FAILED
test test_parse_string ... FAILED
test test_parse_u64 ... FAILED
test test_forwarded_namespace ... FAILED
test test_parse_unit ... FAILED
test test_doctype ... FAILED
test futile2 ... FAILED
test test_nicolai86 ... FAILED

`Expected token XmlEvent::Characters(s), found StartElement` when trying to deserialize text and node children together

Hi!

I'm trying to capture some data from an xml fragment that includes both a child node, and a text node, but the parser seems to be having trouble. Looks like capturing $value is breaking when there's any other node present.

The use case is virtually identical to #58 but I'm seeing a different backtrace.

extern crate serde;                                                            
#[macro_use]                                                                   
extern crate serde_derive;                                                     
extern crate serde_xml_rs;                                                     
                                                                               
#[derive(Debug, Deserialize, PartialEq, Default)]                              
pub struct Description {                                                                                            
    #[serde(rename = "ShortName", default)]                                    
    pub short_name: String,                                                    
    #[serde(rename = "$value")]                                                
    pub text: String,                                                          
}                                                                              
                                                                               
#[cfg(test)]                                                                   
mod tests {                                                                    
    use serde_xml_rs::deserialize;                                             
    use super::Description;                                                    
                                                                               
    #[test]                                                                    
    fn test_text_plus_node() {                                                 
        let xml = r#"                                                          
        <Description xml:lang="en">                                            
            <ShortName>Excelsior Desk Chair</ShortName>                          
            Leather Reclining Desk Chair with Padded Arms                        
        </Description>                                                           
        "#;                                                                       
                                                                                  
        let desc: Description = deserialize(xml.as_bytes()).unwrap();             
        assert_eq!("Leather Reclining Desk Chair with Padded Arms", desc.text);
        assert_eq!("Excelsior Desk Chair", desc.short_name);                      
    }                                                                             
} 

Running the test:

$ cargo test
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running target/debug/deps/xml_test-85159a57334a1842

running 1 test
test tests::test_text_plus_node ... FAILED

failures:

---- tests::test_text_plus_node stdout ----
	thread 'tests::test_text_plus_node' panicked at 'called `Result::unwrap()` on an `Err` value: Expected token XmlEvent::Characters(s), found StartElement(ShortName, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})', /checkout/src/libcore/result.rs:916:5

Wondering if there's another way we should be marking this up as a struct, or if this is just a bug in the deserialization.

Unable to handle structure with different array values

Hello,
I modified the example slightly. Adding OtherItem to the Project structure gives the following error:
thread 'it_works' panicked at 'called Result::unwrap() on an Err value: duplicate field Item', /checkout/src/libcore/result.rs:859

Modified example:

#[macro_use] extern crate serde_derive;
extern crate serde_xml_rs;

use serde_xml_rs::deserialize;

#[derive(Debug, Deserialize)]
struct Item {
    pub name: String,
    pub source: String,
}

#[derive(Debug, Deserialize)]
struct OtherItem {
    pub name: String,
}

#[derive(Debug, Deserialize)]
struct Project {
    pub name: String,

    #[serde(rename = "Item", default)]
    pub items: Vec<Item>,
    #[serde(rename = "OtherItem", default)]
    pub other_items: Vec<OtherItem>,
}

#[test]
fn it_works() {
    let s = r##"
        <Project name="my_project">
            <Item name="hello" source="world.rs" />
            <OtherItem name="hello" source="world.rs" />
            <Item name="hello" source="world.rs" />
        </Project>
    "##;
    let project: Project = deserialize(s.as_bytes()).unwrap();
    println!("{:#?}", project);
}

If I change the order of the items, it works:

        <Project name="my_project">
            <Item name="hello" source="world.rs" />
            <Item name="hello" source="world.rs" />
            <OtherItem name="hello" source="world.rs" />
        </Project>

Is this a bug or is this how it is intended to work?

Add convenience function for parsing common forms of booleans

Hi all,

I have the following (minimized) program:

#[macro_use]
extern crate serde_derive;
extern crate serde;
extern crate serde_xml_rs as serde_xml;


#[derive(Deserialize, Debug)]
pub struct ListBucketResult {
    #[serde(rename = "IsTruncated")]
    pub is_truncated: bool
}

fn main() {
    let result_string = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<ListBucketResult xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\"><Name>relationalai</Name><Prefix>/</Prefix><KeyCount>0</KeyCount><MaxKeys>1000</MaxKeys><IsTruncated>false</IsTruncated></ListBucketResult>";
    let deserialized: ListBucketResult = serde_xml::deserialize(result_string.as_bytes()).expect("Parse error!");
}

And the result of running it is:

thread 'main' panicked at 'Parse error!: invalid type: string "false", expected a boolean', src/libcore/result.rs:860:4
stack backtrace:
   0:        0x10d695833 - std::sys::imp::backtrace::tracing::imp::unwind_backtrace::h5d6b821bcccc8af3
   1:        0x10d69681a - std::panicking::default_hook::{{closure}}::haca53f8b96e15b81
   2:        0x10d6964e2 - std::panicking::default_hook::h0029f59c1ec97ffc
   3:        0x10d698a22 - std::panicking::rust_panic_with_hook::hb8eae939c3fcaf9c
   4:        0x10d698884 - std::panicking::begin_panic_new::h3c5f9a0be81106be
   5:        0x10d6987d2 - std::panicking::begin_panic_fmt::hf585b6224c51a06c
   6:        0x10d69873a - rust_begin_unwind
   7:        0x10d6bffc3 - core::panicking::panic_fmt::h13ed235e8f32b1d5
   8:        0x10d636f7e - core::result::unwrap_failed::he22d59ef245624e4
   9:        0x10d62f274 - <core::result::Result<T, E>>::expect::hbd704600f0f822af
  10:        0x10d644522 - test_proj::main::ha13d119a4874b1d9
  11:        0x10d69982c - __rust_maybe_catch_panic
  12:        0x10d698d18 - std::rt::lang_start::heed3cc6f59fb65ca
  13:        0x10d645199 - main

I am using rustc 1.20.0 (f3d6973f4 2017-08-27) and my sample project is in test_proj.zip.

Can anyone help me to solve the issue?
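A convenience function of the kind the title asks for could look like the sketch below. It accepts the four lexical forms that XML Schema defines for booleans (true, false, 1, 0). The name `parse_xml_bool` is hypothetical and not part of serde-xml-rs; such a helper could be called from a custom function attached with serde's `deserialize_with` field attribute.

```rust
/// Parses the XML Schema boolean forms: "true", "false", "1", "0".
/// Returns `None` for anything else instead of guessing.
fn parse_xml_bool(raw: &str) -> Option<bool> {
    match raw.trim() {
        "true" | "1" => Some(true),
        "false" | "0" => Some(false),
        _ => None,
    }
}
```

Returning `None` for unknown input keeps values such as "yes" or "2" from silently turning into `true`.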

Boolean fields specified with integers always parse as true

Boolean fields specified as integers always seem to parse as true, even with contents of 0:

#[macro_use] extern crate serde_derive;
extern crate serde_xml_rs;

#[derive(Debug, Deserialize)]
struct Structure {
    boolean_field: bool,
}

fn main() {
    let s = r##"
        <Structure boolean_field="0">
        </Structure>
    "##;
    let test: Structure = serde_xml_rs::from_reader(s.as_bytes()).unwrap();
    println!("{:#?}", test);
}

This test program outputs:

Structure {
    boolean_field: true
}

Deserialize "$value" as String or Vec<SomeStruct> based on attributes

I have data which can look like either:

<data encoding="XML">
  <tile gid="0"/>
  <tile gid="1"/>
</data>

or

<data encoding="base64">
  aaabbbbbbbcc
</data>

And, as far as I can see, this is not (yet) possible to deserialize. With the old serde-xml I was able to get a serde_xml::Value, which was an enum that could be, e.g., Value::Element(...) or Value::Content(...), which made it possible to handle them individually. However, I have no idea how I would do that currently.

The regular Deserializer/Visitor pattern seems to fail here, as I need to dispatch based on what is actually inside the XML, and I don't know that beforehand.

Edit:
Here is something that looked like it should work, but didn't. It worked for the string case, but not for the XML case (it fails with a "can't match any variant" error):

#[derive(Debug, Deserialize)]
#[serde(rename="tile")]
struct Tile {
  gid: u32
}

#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum DataContent {
  Str(String),
  Tiles(Vec<Tile>)
}

#[derive(Debug, Deserialize)]
struct DataImpl {
  encoding: Option<String>,
  compression: Option<String>,
  #[serde(rename="$value")]
  value: DataContent,
}

Inlining either of the variants (String or Vec) into DataImpl actually makes it work for that case (but obviously not for the other).

All help is appreciated!

i64 is deserialized as string

#[derive(Debug, PartialEq, Deserialize, Serialize)]
struct Test_XML {
    t1: i32,
    t2: i32,
    wind_speed: i64
}

And the XML:

let s = r##"              
<?xml version="1.0"?>
<meas version="1.0">   
     <t1>-600</t1> 
    <t2>-600</t2>         
     <wind_speed>250</wind_speed> 
 </meas>  "##;

let project: Test_XML = deserialize(s.as_bytes()).unwrap();   
println!("{:?}",  project ); 

And the error:

thread 'tests::meteo_xml_deserialization' panicked at 'called Result::unwrap() on an Err value: invalid type: string "250", expected i64', src/libcore/result.rs:906:4

If I change wind_speed to i32, it works fine.

Internally tagged enums are confused

It seems when using the internally tagged enums, the deserializer gets somewhat confused. If I have this code:

#[derive(Deserialize)]
#[serde(tag = "z")]
enum Z {
    A { b: String },
    B { c: String },
}
#[derive(Deserialize)]
struct X {
    y: Z,
}
let x = br#"<x><y z="A"><b>hello</b></y></x>"#;
serde_xml_rs::deserialize::<_, X>(Cursor::new(x)).unwrap();

it produces this error:

invalid type: map, expected a string

I believe this should actually work. This code does work:

#[derive(Deserialize)]
struct Z {
    z: String,
    b: String,
}
#[derive(Deserialize)]
struct X {
    y: Z,
}
let x = br#"<x><y z="A"><b>hello</b></y></x>"#;
serde_xml_rs::deserialize::<_, X>(Cursor::new(x)).unwrap();

I know the internally-tagged thing is a bit weird in XML, but I actually discovered it in real-life (parsing the conntrack XML output).

Duplicate field $value error

I'm currently, as part of trying to learn how serde-xml-rs works, trying to parse Senate roll call data. When learning new XML libraries this is my go-to source as it usually is well formed and has some more advanced pieces. An example of a vote:

<vote>
  <vote_number>00024</vote_number>
  <vote_date>12-Jan</vote_date>
  <issue>S.Con.Res. 3</issue>
  <question>On the Motion <measure>S.Amdt. 180</measure></question>
  <result>Rejected</result>
  <vote_tally>
    <yeas>51</yeas>
    <nays>47</nays>
  </vote_tally>
  <title>Motion to Waive All Applicable Budgetary Discipline Re: Hatch Amdt. No. 180; To establish a deficit-neutral reserve fund relating to strengthening Social Security and repealing and replacing Obamacare, which has increased health care costs, raised taxes on middle-class families, reduced access to high quality care, created disincentives for work, and caused tens of thousands of Americans to lose coverage they had and liked, and replacing it with reforms that strengthen Medicaid and the Children's Health Insurance Program without prioritizing able-bodied adults over the disabled or children and lead to patient-centered step-by-step health reforms that provide access to quality, affordable private health care coverage for all Americans and their families by increasing competition, State flexibility, and individual choice, and safe-guarding consumer protections that Americans support.</title>
</vote>

When I try to deserialize the <question> tag I run into issues. I've been able to get it to (mostly) deserialize by setting up this structure:

// ...snip...
#[derive(Debug, Deserialize)]
struct Question {
    #[serde(default)]
    measure: String,
}
// ...snip...
#[derive(Debug, Deserialize)]
struct Vote {
    vote_number: String,
    vote_date: String,
    issue: String,
    question: Question,
    result: String,
    vote_tally: VoteTally,
    title: String,
}
// ...snip...

However, with this data, the text that is included in the <question> tag is also important. I've tried changing the Question struct to the below example, but I get a panic with an error stating duplicate field $value.

#[derive(Debug, Deserialize)]
struct Question {
    #[serde(rename="$value")]
    text: String,
    #[serde(default)]
    measure: String,
}

The error, with backtrace:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: duplicate field `$value`', src/libcore/result.rs:906:4
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
   1: std::panicking::default_hook::{{closure}}
   2: std::panicking::default_hook
   3: std::panicking::rust_panic_with_hook
   4: std::panicking::begin_panic
   5: std::panicking::begin_panic_fmt
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::result::unwrap_failed
   9: <core::result::Result<T, E>>::unwrap
  10: rust_parse_xml_serde::main
  11: __rust_maybe_catch_panic
  12: std::rt::lang_start
  13: main

Is this expected? From what I've gathered so far, it seems I should be able to gather the text through $value, and so I'm not sure if I'm doing something verboten or if I've run into an actual bug.

I placed my whole code (with an example XML file) in a GitHub repo.

Deserializing Vec fails if there's something in between

Using this code...

#[macro_use] extern crate serde_derive;
extern crate serde_xml_rs;

use std::io::stdin;

#[derive(Debug, Deserialize)]
struct Root {
    foo: Vec<String>,
    bar: Vec<String>
}

fn main() {
    let res: Root = match serde_xml_rs::deserialize(stdin()) {
        Ok(r) => r,
        Err(e) => panic!("{:?}", e),
    };

    println!("{:?}", res);
}

...to deserialize this file...

<root>
    <foo>abc</foo>
    <foo>def</foo>

    <bar>lmn</bar>
    <bar>opq</bar>

    <foo>ghi</foo>
</root>

...gives this error...

thread 'main' panicked at 'duplicate field `foo`', src/bin/bug.rs:15:18

This doesn't happen if the elements in the root are contiguous, like...

<root>
    <foo>abc</foo>
    <foo>def</foo>
    <foo>ghi</foo>

    <bar>lmn</bar>
    <bar>opq</bar>
</root>

Expected output in both cases is:

Root { foo: ["abc", "def", "ghi"], bar: ["lmn", "opq"] }

Am I doing something wrong here? Is there any way around this? The data I'm working with is formatted like the first example.
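
To make the expected behavior concrete, here is a minimal sketch of the accumulation the report asks for: every child element is appended to the list for its tag name, whether or not same-named elements are adjacent. It does not use serde-xml-rs. The `group_by_tag` function and the `(tag, text)` pair representation are hypothetical stand-ins for the children of `<root>` in document order.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: group (tag name, text) pairs by tag name,
/// keeping document order within each group.
fn group_by_tag(children: &[(&str, &str)]) -> BTreeMap<String, Vec<String>> {
    let mut grouped: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (tag, text) in children {
        // Append to the existing list for this tag, or start a new one.
        // A later, non-adjacent occurrence of a tag is not an error here.
        grouped
            .entry(tag.to_string())
            .or_default()
            .push(text.to_string());
    }
    grouped
}
```

Called with the children of the first (interleaved) example, this yields the same `foo` and `bar` lists as the contiguous example.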

Does not handle surrogate pairs?

I get a syntax error when decoding the following element:

 <sms protocol="0" address="redacted" date="1487032373307" type="2" subject="null" body="&#55357;&#56495;" toa="null" sc_toa="null" service_center="null" read="1" status="-1" locked="0" />
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { pos: 3304:101, kind: Syntax("Invalid decimal character number in an entity: #55357") }', /checkout/src/libcore/result.rs:859
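
For context, 55357 (0xD83D) and 56495 (0xDCAF) are a UTF-16 high/low surrogate pair. The XML `Char` production excludes the surrogate range, so a parser that rejects `&#55357;` on its own is arguably behaving as specified. The sketch below shows one possible workaround that reads such references as UTF-16 code units and joins them with the standard library's `char::decode_utf16`. The `decode_utf16_char_refs` helper is hypothetical, is not part of serde-xml-rs or xml-rs, and assumes the input consists only of decimal references.

```rust
use std::char::decode_utf16;

/// Hypothetical helper: decode a run of decimal character references such
/// as "&#55357;&#56495;" by treating each number as a UTF-16 code unit.
/// Returns None on malformed input or an unpaired surrogate.
fn decode_utf16_char_refs(input: &str) -> Option<String> {
    let mut units: Vec<u16> = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        // Each reference must look like "&#<decimal>;".
        let body = rest.strip_prefix("&#")?;
        let end = body.find(';')?;
        units.push(body[..end].parse::<u16>().ok()?);
        rest = &body[end + 1..];
    }
    // decode_utf16 joins a high/low surrogate pair into one scalar value
    // and yields an error for an unpaired surrogate.
    decode_utf16(units).collect::<Result<String, _>>().ok()
}
```

For the `body` attribute above, the pair combines to U+1F4AF. A preprocessing step like this would have to run on the raw text before it reaches the XML parser.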

Get tag attributes and body

<Field name="blabla">content</Field>

How can I get both the attribute name and the body into a struct?

struct Field {
    pub name: String,
    pub body: String,
}

Deserializer does not support borrowed strings

It appears that serde-xml-rs cannot be used to deserialize into borrowed strings. This is rather annoying.

Example:

#[macro_use]
extern crate serde_derive; // 1.0

extern crate serde; // 1.0
extern crate serde_xml_rs; // 0.2

#[derive(Debug, Deserialize)]
struct Foo<'a> {
    name: &'a str
}

fn main() {
    let foo: Result<Foo,_> = serde_xml_rs::deserialize(&b"<foo><name>test</name></foo>"[..]);
    println!("{:?}", foo);
}

This prints

Err(invalid type: string "test", expected a borrowed string)

Can't handle duplicate field between TAG and Attribute

The following XML data results in "Error: duplicate field fragments":

<?xml version="1.0" encoding="UTF-8"?>
<output_message timezone="UTC" id="3200" repeat="0" fragments="1">
  <fragments>
    <fragment cycle="0" frame="11" />
  </fragments>
</output_message>

The input struct is:

#[derive(Debug, Deserialize)]
pub struct OutputMessage {
    pub timezone: String,
    pub id: String,
    pub repeat: String,
    #[serde(rename = "fragments", default)]
    pub nr_of_fragments: String,
    pub fragments: Fragments,
}

#[derive(Debug, Deserialize)]
pub struct Fragments {
    pub fragment: Vec<Fragment>,
}

#[derive(Debug, Deserialize)]
pub struct Fragment {
    pub cycle: String,
    pub frame: String,
}

Structure for optionally-empty lists

I have a use case which involves parsing optionally-empty lists, like so:

<mylist>
    <item name="foo"></item>
    <item name="bar"></item>
</mylist>

or:

<mylist>
</mylist>

What would be the equivalent Rust struct to represent this with serde-xml-rs?

Duplicate field error on interwoven element sequences

I've got some XML that looks like this:

<?xml version='1.0' encoding='UTF-8'?>
<kml xmlns='http://www.opengis.net/kml/2.2' xmlns:gx='http://www.google.com/kml/ext/2.2'>
<Document>
    <Placemark>
        <open>1</open>
        <gx:Track>
            <altitudeMode>clampToGround</altitudeMode>
            <when>2017-01-01T00:00:00Z</when>
            <gx:coord>0 0 0</gx:coord>
            <when>2017-01-01T00:00:00Z</when>
            <gx:coord>0 0 0</gx:coord>
        </gx:Track>
    </Placemark>
</Document>
</kml>

and a Track struct looks like this:

#[derive(Debug,Deserialize)]
pub struct Track {
    #[serde(rename = "altitudeMode")]
    pub altitude_mode: String,

    #[serde(rename = "when", default)]
    pub whens: Vec<String>,

    #[serde(rename = "coord", default)]
    pub coords: Vec<String>,
}

This causes my parsing test to fail with

panicked at 'called `Result::unwrap()` on an `Err` value: duplicate field `when`

However, when I tweak the XML to read

<?xml version='1.0' encoding='UTF-8'?>
<kml xmlns='http://www.opengis.net/kml/2.2' xmlns:gx='http://www.google.com/kml/ext/2.2'>
<Document>
    <Placemark>
        <open>1</open>
        <gx:Track>
            <altitudeMode>clampToGround</altitudeMode>
            <when>2017-01-01T00:00:00Z</when>
            <when>2017-01-01T00:00:00Z</when>
            <gx:coord>0 0 0</gx:coord>
            <gx:coord>0 0 0</gx:coord>
        </gx:Track>
    </Placemark>
</Document>
</kml>

it works fine. Is there anything I can do to avoid this error?

Also, ideally I'd like to be able to transform the pairs of <when> and <gx:coord> into objects, e.g.

#[derive(Debug,Deserialize)]
pub struct Position {
    pub when: String,
    pub coord: String,
}

#[derive(Debug,Deserialize)]
pub struct Track {
    #[serde(rename = "altitudeMode")]
    pub altitude_mode: String,

    // Some attribute here I guess?
    pub positions: Vec<Position>,
}

I mention this in case there's a way to kill two birds with one stone, avoiding the error by pretending each pair of elements is one Position element/struct.
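
As a partial illustration of the pairing idea, here is a minimal sketch that combines two already-deserialized lists into `Position` values with the standard library's `zip`. It assumes the `whens` and `coords` vectors were obtained some other way (for example, from the contiguous form of the XML), so it does not by itself avoid the duplicate field error. The `pair_positions` function is hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub struct Position {
    pub when: String,
    pub coord: String,
}

/// Hypothetical helper: pair the i-th `when` with the i-th `coord`.
/// Returns None if the lists differ in length, since the pairing would
/// then be ambiguous.
pub fn pair_positions(whens: Vec<String>, coords: Vec<String>) -> Option<Vec<Position>> {
    if whens.len() != coords.len() {
        return None;
    }
    Some(
        whens
            .into_iter()
            .zip(coords)
            .map(|(when, coord)| Position { when, coord })
            .collect(),
    )
}
```

Checking the lengths up front is a design choice: `zip` alone would silently drop the unmatched tail of the longer list.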
