& in XML stream stops parsing

Dear TTRSS developers, I recently started having problems with a particular feed (see debug log below). It is no longer parsing correctly on my installation of TTRSS (no Docker). The same happens when I use TTRSS’s demo URL on this site which runs Tiny Tiny RSS v21.05-b2f888e. Also, the myfeedsucks page shows errors on the stream. Here is the debug output from TTRSS:

[07:24:14/117] start
[07:24:14/117] setting basic feed info for 12…
[07:24:14/117] running HOOK_FETCH_FEED handlers…
[07:24:14/117] === 0.0000 (sec) Af_Comics
[07:24:14/117] feed data has not been modified by a plugin.
[07:24:14/117] local cache will not be used for this feed
[07:24:14/117] last unconditional update request: 2021-05-11 07:23:47
[07:24:14/117] stored last modified for conditional request:
[07:24:14/117] fetching Wissenschaftsstadt Darmstadt (force_refetch: 1)…
[07:24:15/117] fetch done.
[07:24:15/117] effective URL (after redirects): Wissenschaftsstadt Darmstadt (IP: 85.10.253.17)
[07:24:15/117] server last modified:
[07:24:15/117] saving to local cache: cache/feeds/7a6c0aff575b607d4c68790500645d96c1e9c385.xml
[07:24:15/117] running HOOK_FEED_FETCHED handlers…
[07:24:15/117] feed data has not been modified by a plugin.
[07:24:15/117] fetch error: LibXML error 68 at line 269 (column 643): xmlParseEntityRef: no name

[07:24:15/117] + LibXML error 68 at line 269 (column 643): xmlParseEntityRef: no name

[07:24:15/117] + LibXML error 68 at line 269 (column 1298): xmlParseEntityRef: no name

[07:24:15/117] update failed.

As far as I can tell, the “no name” error 68 is caused by two unescaped ampersands (&) in the XML stream. Since I do not control the feed, I would like to ask whether TTRSS can work around this.
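
For anyone who wants to reproduce this kind of failure, here is a minimal Python sketch. It uses the standard library's expat-based parser rather than libxml2, so the error text differs from the log above, and the sample strings are made up because the actual feed content is not quoted in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippets standing in for the feed (not the real feed content).
BROKEN = "<item><title>Kunst & Kultur</title></item>"
FIXED = "<item><title>Kunst &amp; Kultur</title></item>"


def parses(xml_text: str) -> bool:
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A bare "&" opens an entity reference that has no name, so parsing stops.
print("broken parses:", parses(BROKEN))
# Once the ampersand is escaped, the parser hands back the literal "&".
print("fixed parses:", parses(FIXED))
print("title:", ET.fromstring(FIXED).find("title").text)
```

A conforming XML parser has to treat the bare ampersand as a fatal error, which is why the whole update fails and not just one item.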

Unfortunately, PHP is not my language but for those who are able to code PHP, the following link contains a few ways to work around this issue on TTRSS side:

Maybe TTRSS can improve on its robustness. But I am also willing to forward any suggestions from your side to the RSS stream owner if there is potential for them to improve the situation. Best regards.

what if, instead, feed publishers stopped generating broken XML documents? :thinking:

Thanks for the quick reply. In which way is it broken? Is my assumption correct that it is due to the ampersands? Should they replace "&" with "&amp;" for example? I will try to improve the situation and let them know.

i’m not going to explain libxml errors to you, forum poster ripley.

they should replace whatever garbage they are using to generate XML documents.
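
As a rough illustration of what a fix on the publisher's side could look like, here is a Python sketch that builds the markup with an XML library instead of concatenating strings. How the city's site actually generates its feed is unknown, so `build_item`, the element names, and the example.org URL are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_item(title: str, link: str) -> str:
    """Serialize one feed item; the library escapes text content for us."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    return ET.tostring(item, encoding="unicode")


# Hypothetical item data containing raw ampersands.
xml_text = build_item("Kunst & Kultur", "https://example.org/?a=1&b=2")
print(xml_text)

# If the generator is template-based, every inserted value needs escaping.
print("<title>" + escape("Kunst & Kultur") + "</title>")
```

Either way, the ampersand ends up as `&amp;` in the output, and a consumer reads it back as a plain `&`.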

Great. Thanks for the info.

Ah, good old fox, always the charmer :wink:

Because it’s the feed of my home city’s website, I just shot them a short email about the problem. We’ll see what they make of it.

that’s just how things are. attempting to fix broken XML on the consuming side is an impossible task, it’s a complex format which might be broken in so many different ways.
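
To make that point concrete, here is a Python sketch of a naive consumer-side repair and one way it goes wrong. The regex and the `naive_repair` helper are illustrative assumptions, not something TTRSS does, and the sample documents are made up.

```python
import re
import xml.etree.ElementTree as ET

# Matches "&" that does not start something shaped like an entity or
# character reference (for example &amp; or &#38; or &#x26;).
BARE_AMP = re.compile(r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)")


def naive_repair(xml_text: str) -> str:
    """Blindly escape every ampersand that looks unescaped."""
    return BARE_AMP.sub("&amp;", xml_text)


# Hypothetical inputs: one broken document, one that is already valid.
broken = "<t>Kunst & Kultur</t>"
valid_cdata = "<t><![CDATA[Kunst & Kultur]]></t>"

# The broken document becomes parseable after the repair.
print(ET.fromstring(naive_repair(broken)).text)

# The valid document is damaged instead: inside CDATA the "&" was legal,
# and the repair turns it into the literal characters "&amp;".
print(ET.fromstring(valid_cdata).text)
print(ET.fromstring(naive_repair(valid_cdata)).text)
```

Handling CDATA correctly would mean partially parsing the document before repairing it, and the same problem repeats for comments, processing instructions, and every other kind of breakage a generator can produce.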

I agree with you with regard to the topic, but in my experience a little less… direct style of communication has often proved helpful. If you start off by stepping on people’s toes, they aren’t that much inclined to listen to what you have to say. I’m not suggesting to sugar coat everything, but a slap doesn’t sting as much if you do it with a smile instead of a scowl :wink:

Anyway, that’s off topic in this thread in any case.

I did the same and I hope they get it right. They were successful in quickly fixing a certificate problem a year ago. So, let’s keep our fingers crossed.