Tiny Tiny RSS: Community

Readability got me banned from a server?

Here’s a full list of all active plugins:

auth_internal

af_fsckportal

af_fullpost

af_newspapers

af_readability

af_redditimgur

af_unburn

af_zz_noautoplay

af_zz_vidmute

auto_assign_labels

bookmarklets

cache_starred_images

close_button

entityclean

ff_xmllint

mail

no_url_hashes

note

share

vf_shared

it would probably be more helpful if you posted feed debugger (f D) logs for this feed.

I was going to suggest this as well, but @Matthew has blocked his IP, so it would be difficult to get real-world results…

oh duh

well maybe @Matthew would be kind enough to unblock op for diagnostic purposes

@fox Is this ok: https://i.imgur.com/zHZdnzG.png

@JustAMacUser Matthew very graciously unblocked me yesterday as a good-will gesture subsequent to this conversation.

oh. well he did implement conditional requests so until his feed posts something new, i think you’re going to be stuck with http 304.

which should largely solve this problem, i suppose…
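For anyone following along who hasn't run into conditional requests before, here is a rough Python sketch of the idea. It is not how tt-rss does it internally (tt-rss is PHP), and the function names `conditional_headers` and `fetch_if_changed` are made up for illustration. The client sends back the validators it saved from the last fetch, and the server answers 304 with no body if nothing has changed.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the validator headers for a conditional GET (hypothetical helper)."""
    headers = {}
    if etag:
        # Value of the ETag response header saved from the previous fetch.
        headers["If-None-Match"] = etag
    if last_modified:
        # Value of the Last-Modified response header saved from the previous fetch.
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (status, body, etag, last_modified); body is None on a 304."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (
                response.status,
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError; nothing new to parse.
        if err.code == 304:
            return 304, None, etag, last_modified
        raise
```

So until the feed actually changes, every poll comes back as a 304 and there is nothing for the reader (or any plugin) to process.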

I’m wondering if these two plugins are causing problems.

entityclean is a bit sloppy… it just does a regex replace on the whole feed (as a massive text string) instead of properly parsing the DOM.

ff_xmllint uses lint and tidy and those can definitely change the feed data depending on what they encounter.

Really entityclean shouldn’t exist because it’s too careless in how it works and ff_xmllint needs to be selectively applied to only feeds that are known to have invalid form.
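To show the kind of thing that can go wrong with a whole-feed regex, here is a small Python sketch. The regex is an assumption made up for the example, not the actual pattern entityclean uses, and the feed is a toy one. It "fixes" bare ampersands across the entire text, which also rewrites text inside CDATA, where `&` is already legal and literal.

```python
import re
import xml.etree.ElementTree as ET

# Toy feed: the item title is wrapped in CDATA, so the bare "&" is valid as-is.
FEED = """<rss version="2.0"><channel><title>demo</title>
<item><title><![CDATA[AT&T news]]></title></item>
</channel></rss>"""


def naive_clean(feed_text):
    """Hypothetical whole-string cleanup: escape any '&' that does not start an entity."""
    return re.sub(r"&(?![A-Za-z]+;|#\d+;)", "&amp;", feed_text)


def item_titles(feed_text):
    """Parse the feed properly and return the text of each item title."""
    root = ET.fromstring(feed_text)
    return [item.findtext("title") for item in root.iter("item")]


before = item_titles(FEED)               # title as published
after = item_titles(naive_clean(FEED))   # title after the whole-feed regex
print(before, after)
```

The title text differs between the two runs, so the article content no longer matches what the publisher sent even though the original feed was perfectly valid.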

ah right there are plugins which work on entire feed before tt-rss processes individual articles. i’ve completely forgotten those exist. yeah they can easily cause those kinds of problems.

e: maybe we should consider deprecating those hooks or hiding them behind a config.php knob with a bunch of warnings on top of it.

@JustAMacUser Thanks for the insight on entityclean, I’ll simply remove it. I installed it years ago when I was having trouble with a local newspaper’s feed being full of garbage that was throwing errors and I couldn’t get them to fix their shit. Ditto ff_xmllint. I didn’t even have lint or tidy enabled. I’ve deactivated them both.

af_comics uses some of those hooks.

let’s continue hook discussion here - https://community.tt-rss.org/t/troublesome-hooks-or-not/2890

There should be at least 2 new additions tomorrow that will then update the feed at 1pm GMT (although the modified date will be earlier than that as it uses the date it was detected and added to the database).

Yep, after chatting with @Reader_Refugee it was clear that it wasn’t a scraping attempt and that there was nothing nefarious going on, so I unblocked the IP address :smiley:

There were new items to fetch today, so I re-ran the feed debugger with forced refetch. Here’s the output: https://pastebin.com/AgFpFQfM

[15:43:33] stored article seems up to date [IID: 1599313], updating timestamp only

well it looks like previously existing items are no longer being processed needlessly, so it’s an improvement.