[ ] I’m using stock docker compose setup, unmodified.
[ ] I’m using docker compose setup, with modifications (modified .yml files, third party plugins/themes, etc.) - if so, describe your modifications in your post. Before reporting, see if your issue can be reproduced on the unmodified setup.
[x] I’m not using docker on my primary instance, but my issue can be reproduced on the aforementioned docker setup and/or official demo.
I’m using some feeds from feedburner; for a while now, one of these feeds has been broken because a script tag is appended at the end of it.
Steps to reproduce the problem:
- Subscribe to “Nerf NOW!!”
- After the updater runs, the feed is not updated and the following is found in the logs: “Update process for feed 12 ([Unknown], owner UID: 2) failed with exit code: 100 (LibXML error 5 at line 95 (column 1): Extra content at the end of the document).”
- Tiny Tiny RSS version (including git commit id): dc25a9cf6816b756cb38490eab93f02589c44a10
- Platform (i.e. Linux distro, Docker, PHP, PostgreSQL, etc) versions: Happens everywhere, but tested on the official demo and Debian
Extra info:
The issue is clearly on the feedburner side, but I’m pretty sure getting them to fix it will not happen at this point. I’m not sure what to do, except maybe allow a looser feed parser (which might not be a good idea).
I wonder if it would make sense as a toggle, or something to “pre-process” feeds, even as an advanced option.
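For reference, the parser error from the logs can be reproduced outside tt-rss with PHP’s DOM extension. This is only a minimal sketch: the sample document is made up (it is not the actual feed), and the helper name `first_xml_error` is hypothetical.

```php
<?php
// Hypothetical helper: parse a string as XML and return the first
// libxml error encountered, or null if the document parsed cleanly.
function first_xml_error(string $xml): ?LibXMLError {
    // Collect errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $errors[0] ?? null;
}

// Made-up sample: a tiny Atom document followed by a stray script tag.
$broken = "<?xml version=\"1.0\"?>\n"
    . "<feed xmlns=\"http://www.w3.org/2005/Atom\"><title>t</title></feed>\n"
    . "<script src=\"x.js\"></script>";

$err = first_xml_error($broken);
if ($err !== null) {
    printf("LibXML error %d at line %d: %s\n", $err->code, $err->line, trim($err->message));
}
```

Anything after the closing root element makes the document ill-formed, which is why a strict parser rejects the whole feed rather than just ignoring the trailing tag.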
fox
tt-rss requires valid XML input. that’s all there is to it.
you can do that using plugins. not sure what’s there for broken xml though.
Fair enough. Requiring valid XML is an obviously sane choice, but sometimes we have to deal with real-world data, and reaching out to feedburner is seemingly impossible.
I glanced at the plugin support, and put something together that fits my needs.
It might require more tuning, but so far it hasn’t broken any other feeds I’m subscribed to, so I’m good.
In case anyone cares to handle it, I’ll leave the code here:
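<?php
// Plugin that trims anything following the closing </feed> tag of a fetched feed.
class XML_Extraend extends Plugin {
    // Plugin metadata: version, description, author, system flag.
    function about() {
        return array(null, "Strip extra stuff at end of feed", "CleyFaye", false);
    }

    // Register for the hook that receives the raw feed data after it is fetched.
    function init($host) {
        $host->add_hook($host::HOOK_FEED_FETCHED, $this);
    }

    // Return the (possibly trimmed) feed data.
    function hook_feed_fetched($feed_data, $fetch_url, $owner_uid, $feed) {
        return $this->filter_feed_xml($feed_data);
    }

    // Cut everything after the last "</feed>". If the tag is absent
    // (e.g. a non-Atom feed), the data is returned unchanged.
    private function filter_feed_xml($feed_data) {
        $final_tag = "</feed>";
        $lastIndex = strrpos($feed_data, $final_tag);
        if ($lastIndex === false) return $feed_data;
        return substr($feed_data, 0, $lastIndex + strlen($final_tag));
    }

    function api_version() {
        return 2;
    }
}
?>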
(if anyone wants to take this and make it a proper plugin, feel free)
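Note that the plugin above only looks for `</feed>`, so it only helps with Atom feeds. A possible generalization, sketched here as a standalone function, also checks for `</rss>` and `</rdf:RDF>`. The function name `strip_after_root_close` is hypothetical, the sketch has not been tried against real feeds, and it assumes the closing root tag appears literally (no namespace prefix variations, no extra whitespace inside the tag).

```php
<?php
// Hypothetical helper: drop anything that follows the last closing root
// tag of an Atom, RSS 2.0, or RDF (RSS 1.0) document.
function strip_after_root_close(string $feed_data): string {
    $closing_tags = ["</feed>", "</rss>", "</rdf:RDF>"];
    $cut = false;

    foreach ($closing_tags as $tag) {
        // XML is case-sensitive, so a case-sensitive search is used.
        $pos = strrpos($feed_data, $tag);
        if ($pos === false) continue;

        // Keep the cut point that lies furthest into the document.
        $end = $pos + strlen($tag);
        if ($cut === false || $end > $cut) $cut = $end;
    }

    // No known closing tag found: return the data untouched.
    return $cut === false ? $feed_data : substr($feed_data, 0, $cut);
}
```

Inside the plugin, this could replace the body of `filter_feed_xml`. As with the original, the result is still handed to the strict parser, so any other problem in the feed will keep failing.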
I have a couple of feedburner feeds and don’t see this issue with them. The script is for loading Cloudflare’s email protection script and nerfnow.com looks to be behind Cloudflare according to a dns lookup. Given feedburner is a google product it’s most likely an issue on nerfnow’s/Cloudflare’s end and not feedburner’s.
If you open the broken feedburner XML source, you can see the RSS location it pulls from. Fetched directly from the nerfnow website, the Nerf NOW!! feed appears to be fine, so it may just be that a cached version is broken.
Interesting. Thanks for looking it up. When I curl the source, I still get the script tag at the end, so it might be more complex than that.
Not being a general issue with feedburner is interesting. I’ll try to ask the site owner if he can do something; but anyway it’s clearly not something to handle in ttrss.