
wxr's Introduction

XML Schemas for WordPress Export RSS (WXR)

An XML Schema 1.1 schema for WXR.

Description

These schemas are intended to serve primarily as documentation of WXR.

Currently, there are schemas for:

  1. WXR 1.2
  2. a proposal for a new WXR 1.3

Purpose

There are 2 primary audiences for these schemas:

  1. WP Core contributors involved in maintaining the export/import code (i.e., to serve as a check for keeping the exporter/importer in sync when making modifications)
  2. plugin authors who want to write their own export/import code.

Not Purpose

These schemas are NOT intended to be used for run-time validation during an import. The primary reason is that the XML parsers included in PHP do NOT support XML Schema 1.1: they are all based on libxml, which only supports validation against XML Schema 1.0. XML Schema 1.0 is not expressive enough to capture the rules of RSS, so validating with a 1.0 schema would be useless, or worse :-)

Documentation

HTML browsable documentation generated from these schemas is available at:

  1. WXR 1.2
  2. WXR 1.3 Proposed

If I could figure out how to add that documentation as wiki pages here on GitHub I would, but I can't, so I won't :-)

Issues

These schema documents are sprinkled throughout with xs:annotation/xs:documentation elements. Most are intended to document the element/type they are children of. However, some contain @todo's where I know there is still something to do or where there is an open question about how something should be done. Many of these @todo's note that I intend to open trac tickets on a number of topics.

I'd be very interested in general feedback from the WP community on these schemas before I open any of those tickets.

If you have comments/suggestions, please open an issue here. General "discussion" issues are welcome!

wxr's People

Contributors

pbiron

wxr's Issues

change namespaceURI versioning policy

WXR 1.0, 1.1 and 1.2 all use different namespaceURIs:

  • http://wordpress.org/export/1.0/
  • http://wordpress.org/export/1.1/
  • http://wordpress.org/export/1.2/

respectively.

Doing namespace-aware parsing that is backwards compatible in the sense of handling WXR files from different versions with the same code would be MUCH easier if the WXR namespaceURI did NOT change with each version.

The namespaceURI in WXR 1.3-proposed is http://wordpress.org/export/ and would stay the same for any versions after that. The specific version of WXR in WXR 1.3-proposed (and any future version) is specified with rss/@wxr:version.

I recognize that this change would represent a MAJOR backwards incompatibility (and thus will require A LOT of discussion and consensus building before it could be enacted), but it would make FUTURE changes to WXR and the corresponding export/import code easier to deal with.
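To illustrate the difference, here is a minimal sketch of version detection in namespace-aware code. It is written in Python for brevity rather than in the PHP used by the importer, and the function name detect_wxr is hypothetical. With versioned namespaceURIs the code has to probe each known URI in turn; with the single proposed URI it only needs to read rss/@wxr:version.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by WXR 1.0, 1.1 and 1.2 (taken from the list above).
VERSIONED_NS = {
    "1.0": "http://wordpress.org/export/1.0/",
    "1.1": "http://wordpress.org/export/1.1/",
    "1.2": "http://wordpress.org/export/1.2/",
}
# Single, unversioned namespace URI used by WXR 1.3-proposed.
STABLE_NS = "http://wordpress.org/export/"


def detect_wxr(root):
    """Return (version, namespace_uri) for a parsed <rss> root element.

    Hypothetical helper for illustration; returns (None, None) when no
    WXR markup is found.
    """
    # WXR 1.3-proposed: the version is carried by rss/@wxr:version.
    version = root.get(f"{{{STABLE_NS}}}version")
    if version is not None:
        return version, STABLE_NS
    # Older versions: look for any element in one of the versioned namespaces.
    for element in root.iter():
        for version, ns in VERSIONED_NS.items():
            if element.tag.startswith(f"{{{ns}}}"):
                return version, ns
    return None, None
```

Once the namespaceURI is known, the rest of the parsing code can be shared; the probing loop is the part that a stable namespaceURI would make unnecessary.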

use RSS's <enclosure> element instead of <wxr:attachment_url>

The RSS Advisory Board's "Best Practices" profile says, in part

When a namespace element duplicates the functionality of an element defined in RSS, the core element should be used.

and the semantics of RSS's <enclosure> exactly match the semantics of <wxr:attachment_url>.

RSS's <enclosure> has 3 required attributes:

  • The length attribute indicates the size of the file in bytes
  • The type attribute identifies the file's MIME media type
  • The url attribute identifies the URL of the file

The file size and MIME type are available at the time an export is generated (and of course, so is the URL), so there would be no problem with using <enclosure> in newly generated WXR instances.

Making this change would slightly complicate matters for the transform from WXR 1.0, 1.1 and 1.2 instances into WXR 1.3-proposed instances that is part of the WordPress Importer Redux. However, the complication is only slight, since the response headers returned by a call to wp_remote_head() on the URL SHOULD provide the necessary information.
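As a rough sketch of that transform step, the following Python function builds an <enclosure> element from a URL and a set of response headers. It is an illustration only: the function name enclosure_from_headers is hypothetical, the headers argument is a plain dict standing in for whatever a HEAD request returns, and the real importer would do this in PHP.

```python
import xml.etree.ElementTree as ET


def enclosure_from_headers(url, headers):
    """Build an RSS <enclosure> element from a URL and HEAD response headers.

    `headers` is a plain dict of header name -> value (an assumption made
    for this sketch). Returns None when a required value is missing, since
    all three <enclosure> attributes are required.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    length = lowered.get("content-length")
    mime = lowered.get("content-type")
    if length is None or mime is None:
        return None
    # Drop parameters such as "; charset=utf-8" from the media type.
    mime = mime.split(";", 1)[0].strip()
    return ET.Element(
        "enclosure",
        {"url": url, "length": length.strip(), "type": mime},
    )
```

Returning None when a header is missing leaves it to the caller to decide what to do with an attachment whose size or type cannot be determined.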

document WXR on DevHub

It would be good to provide documentation about WXR (all versions?, as a Handbook?) on DevHub.

Exactly what form that documentation takes is open for discussion. Does anyone have any suggestions?

I'm not sure whether the documentation that I have auto-generated from the schemas (e.g., http://sparrowhawkcomputing.com/wxr/1.3/docs/wxr.html) is appropriate as is. However, that documentation is generated by an XSLT transform on the schema documents, and that transform could certainly be modified to produce something that would better integrate with the look-and-feel of DevHub.

investigate why WXR is an RSS Profile

Generating and consuming WXR (i.e., export/import) would be somewhat easier (how much is a matter of opinion) if WXR were a pure WP-defined markup language and not an RSS profile.

It would be helpful knowing the history of why WXR is an RSS profile. Anyone who knows the history of how this came to be all those years ago when WXR 1.0 was introduced, please comment here.

allow xs:anyAttribute on all elements with simple content

Define complexTypes with simpleContent that allow xs:anyAttribute for all of the simple types used throughout.

This applies to everything in WXR 1.3 and all elements defined in the RSS spec for WXR 1.2.

This is to better capture RSS's extension rules and my proposed extension rules for WXR.

post a "namespace document" at the namespaceURI(s)

When a markup language is defined in a namespace (such as WXR), it is considered good practice to post a "namespace document" at the namespaceURI.

Namespace documents are usually short and sweet and do not constitute full documentation on the markup defined in that namespace. However, they typically contain a link to where that full documentation can be found (see Issue 4: document WXR on DevHub).

Good examples of namespace documents include:

XML applications generally do NOT dereference namespaceURIs, but are certainly not prohibited from doing so. So, posting a namespace document at the namespaceURI(s) should not be undertaken lightly because it might result in increased HTTP hits on .org.

The RSS spec defines <docs>, which is an optional child of <channel>, whose value is:

A URL that points to the documentation for the format used in the RSS file....It's for people who might stumble across an RSS file on a Web server 25 years from now and wonder what it is.

WXR 1.3-proposed includes <docs> and WordPress Exporter Redux outputs it, with a value that is the same as the WXR 1.3-proposed namespaceURI.
