Comments (5)
I should have a patch for you tomorrow.
from gofeed.
For your information, numerical character references are decoded by `encoding/xml` starting at line 991 of https://golang.org/src/encoding/xml/xml.go.
@galdor thanks for the bug report.
For a number of reasons I was unable to use Go's built-in XML chardata decoding, mostly because I wanted to be able to parse naked XML/HTML for certain elements. That has me using the `,innerxml` attribute, which bypasses the built-in code you referenced.
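To make the difference concrete, here is a minimal sketch using only `encoding/xml`. The struct and function names are made up for illustration and are not taken from gofeed; the point is only that a `,chardata` field comes back with entities decoded while an `,innerxml` field keeps the raw text.

```go
package main

import "encoding/xml"

// withChardata and withInnerXML are hypothetical types for this sketch,
// not gofeed types.
type withChardata struct {
	Text string `xml:",chardata"`
}

type withInnerXML struct {
	Text string `xml:",innerxml"`
}

// decodeBoth unmarshals the same element twice: once through the
// ,chardata path (entities decoded by encoding/xml) and once through
// the ,innerxml path (raw markup kept verbatim).
func decodeBoth(src string) (chardata, innerxml string, err error) {
	var c withChardata
	if err = xml.Unmarshal([]byte(src), &c); err != nil {
		return "", "", err
	}
	var r withInnerXML
	if err = xml.Unmarshal([]byte(src), &r); err != nil {
		return "", "", err
	}
	return c.Text, r.Text, nil
}
```

Calling `decodeBoth` on `<title>caf&#233; &amp; bar</title>` should give `café & bar` for the chardata field and the untouched `caf&#233; &amp; bar` for the innerxml field, which is why the entities have to be decoded by hand afterwards.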
So, I think to do this correctly, I'm going to have to re-implement what they do myself. I'll have to replace the `strings.Replace` code that I use in `parseutils.go` with a proper parser which decodes both the numeric and non-numeric character entities it finds in the text.
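A rough sketch of what such a single-pass decoder could look like is below. This is not gofeed's code: `decodeEntities` and `lookupEntity` are hypothetical names, and the entity table is limited to the five predefined XML entities as an assumption (a real implementation would likely need the HTML named entities too). Unknown or malformed references are left untouched.

```go
package main

import (
	"strconv"
	"strings"
	"unicode/utf8"
)

// namedEntities covers only the five predefined XML entities (an
// assumption for this sketch).
var namedEntities = map[string]rune{
	"amp": '&', "lt": '<', "gt": '>', "quot": '"', "apos": '\'',
}

// decodeEntities (hypothetical helper) scans s once and decodes named,
// decimal (&#233;) and hexadecimal (&#xE9;) references.
func decodeEntities(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); {
		if s[i] != '&' {
			b.WriteByte(s[i])
			i++
			continue
		}
		end := strings.IndexByte(s[i:], ';')
		if end < 0 {
			// No terminator anywhere ahead: copy the rest as-is.
			b.WriteString(s[i:])
			break
		}
		if r, ok := lookupEntity(s[i+1 : i+end]); ok {
			b.WriteRune(r)
			i += end + 1
		} else {
			// Not a reference we understand: keep the ampersand.
			b.WriteByte('&')
			i++
		}
	}
	return b.String()
}

// lookupEntity resolves the text between '&' and ';' to a rune.
func lookupEntity(name string) (rune, bool) {
	if r, ok := namedEntities[name]; ok {
		return r, true
	}
	if !strings.HasPrefix(name, "#") {
		return 0, false
	}
	digits, base := name[1:], 10
	if strings.HasPrefix(digits, "x") {
		digits, base = digits[1:], 16
	}
	n, err := strconv.ParseUint(digits, base, 32)
	if err != nil || n > utf8.MaxRune || !utf8.ValidRune(rune(n)) {
		return 0, false
	}
	return rune(n), true
}
```

Because decoded output is never rescanned, an input like `&amp;lt;` comes out as `&lt;` rather than `<`, which a chain of `strings.Replace` calls can get wrong depending on the order of the replacements.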
I'm currently in the process of rewriting my unit tests for this project (due to an unintentional licensing violation on my part). When #53 is ready to merge with my completed Atom tests, I'll add some new unit tests for both Atom and RSS that ensure that character entities and numeric character entities get decoded correctly.
So there's only `DecodeEntities` to modify, i.e. parse the string to decode both types of entities? I could give it a try if you wish.
@galdor sure, give it a try.
I had considered implementing it the way I did the other entity decoding, but I was concerned that if there were a lot of numeric entities to match, calling `strings.Replace` for so many items could be expensive.
But perhaps doing it the slow way is better than nothing for now, and I can implement it on a character-by-character basis (like Go's xml package does) later when I have time.
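As a possible middle ground for the named entities, the standard library's `strings.NewReplacer` handles many old/new pairs in one pass over the input instead of one `strings.Replace` call per entity. The sketch below is only an illustration under that assumption; `replaceNamed` is a hypothetical name, the pair list is limited to the predefined XML entities, and numeric references would still need a real parser because they cannot be enumerated.

```go
package main

import "strings"

// namedReplacer is built once and can be reused across calls.
// The pair list here is an assumption for the sketch.
var namedReplacer = strings.NewReplacer(
	"&amp;", "&",
	"&lt;", "<",
	"&gt;", ">",
	"&quot;", `"`,
	"&apos;", "'",
)

// replaceNamed (hypothetical helper) decodes named entities in a single
// pass; replaced text is not scanned again.
func replaceNamed(s string) string {
	return namedReplacer.Replace(s)
}
```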