I’m wondering if these two plugins are causing problems.
entityclean is a bit sloppy… it just does a regex replace on the whole feed (as a massive text string) instead of properly parsing the DOM.
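To illustrate why a regex pass over the whole feed is risky, here is a minimal Python sketch. It is not the plugin's code: the pattern in `blanket_escape` is a hypothetical stand-in, and the sample feed is made up. It escapes every "bare" ampersand in the raw text, including one inside a CDATA section, where the ampersand was already legal and literal.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample feed; the item title uses CDATA, so "&" is literal text there.
FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title><![CDATA[AT&T outage]]></title></item>
</channel></rss>"""

def blanket_escape(feed_text: str) -> str:
    # Hypothetical pattern (an assumption, not taken from entityclean):
    # escape every ampersand that does not already start an entity reference.
    return re.sub(r"&(?![A-Za-z]+;|#\d+;|#x[0-9A-Fa-f]+;)", "&amp;", feed_text)

def item_titles(feed_text: str) -> list:
    # Parse the feed properly and return the text of each item title.
    root = ET.fromstring(feed_text)
    return [t.text for t in root.findall("./channel/item/title")]

before = item_titles(FEED)
after = item_titles(blanket_escape(FEED))
print(before, after)
```

The untouched feed parses to the title the publisher wrote, while the regex-processed feed yields a title with a stray `&amp;` in it, because the regex has no idea it is inside CDATA.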
ff_xmllint uses lint and tidy and those can definitely change the feed data depending on what they encounter.
Really, entityclean shouldn’t exist because it’s too careless in how it works, and ff_xmllint should be applied selectively, only to feeds that are known to be malformed.
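One way to get that selectivity, as an alternative to a per-feed setting, is to check first and repair only on failure. A rough Python sketch follows; `repair_if_needed` and `fake_repair` are hypothetical names, and the repair callback stands in for whatever lint/tidy step would actually run.

```python
import xml.etree.ElementTree as ET
from typing import Callable

def is_well_formed(feed_text: str) -> bool:
    # A feed that parses cleanly needs no repair.
    try:
        ET.fromstring(feed_text)
        return True
    except ET.ParseError:
        return False

def repair_if_needed(feed_text: str, repair: Callable[[str], str]) -> str:
    # Leave well-formed feeds byte-for-byte untouched;
    # only broken feeds are handed to the repair step.
    if is_well_formed(feed_text):
        return feed_text
    return repair(feed_text)

def fake_repair(feed_text: str) -> str:
    # Placeholder for a real repair tool (assumption for this example only).
    return feed_text.replace("&nbsp;", "&#160;")

good = "<rss><channel><title>Fine</title></channel></rss>"
bad = "<rss><channel><title>Broken&nbsp;feed</title></channel></rss>"

print(repair_if_needed(good, fake_repair) == good)
print(is_well_formed(repair_if_needed(bad, fake_repair)))
```

The point of the design is that a valid feed never passes through the repair tool at all, so the tool cannot change it.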
ah right, there are plugins which work on the entire feed before tt-rss processes individual articles. i’ve completely forgotten those exist. yeah, they can easily cause those kinds of problems.
e: maybe we should consider deprecating those hooks or hiding them behind a config.php knob with a bunch of warnings on top of it.
@JustAMacUser Thanks for the insight on entityclean, I’ll simply remove it. I installed it years ago when I was having trouble with a local newspaper’s feed being full of garbage that was throwing errors and I couldn’t get them to fix their shit. Ditto ff_xmllint. I didn’t even have lint or tidy enabled. I’ve deactivated them both.
There should be at least 2 new additions tomorrow that will then update the feed at 1pm GMT (although the modified date will be earlier than that as it uses the date it was detected and added to the database).
Yep, after chatting with @Reader_Refugee it was clear that it wasn’t a scraping attempt and there was nothing nefarious going on, so I unblocked the IP address.