Since php-readability was updated recently I’ve been noticing that occasionally my feeds stop updating. If I revert that commit, all of the feeds update successfully again. Going back to the new version then works OK for a while before eventually failing again. I assume some specific posts are triggering a bug. I’m using PHP 7.3, but as I said, the older version works fine with PHP 7.3 if I git reset back to 487d06a20dc471fba487da579e69cf0cac291cc0.
[14:57:08/43903] Scheduled 37 feeds to update…
[14:57:09/43903] Base feed: ISPreview UK
[14:57:09/43903] => 2019-02-18 12:13:52.147662, 56 2
PHP Fatal error: Uncaught TypeError: Argument 1 passed to iterator_to_array() must implement interface Traversable, null given in /usr/www/ttrss/vendor/andreskrey/Readability/Nodes/NodeTrait.php:324
Stack trace:
#0 /usr/www/ttrss/vendor/andreskrey/Readability/Nodes/NodeTrait.php(324): iterator_to_array(NULL)
#1 /usr/www/ttrss/vendor/andreskrey/Readability/Nodes/NodeTrait.php(421): andreskrey\Readability\Nodes\DOM\DOMText->getChildren(true)
#2 /usr/www/ttrss/vendor/andreskrey/Readability/Readability.php(1270): andreskrey\Readability\Nodes\DOM\DOMText->hasSingleTagInsideElement('tr')
#3 /usr/www/ttrss/vendor/andreskrey/Readability/Readability.php(1166): andreskrey\Readability\Readability->prepArticle(Object(andreskrey\Readability\Nodes\DOM\DOMDocument))
#4 /usr/www/ttrss/vendor/andreskrey/Readability/Readability.php(155): andreskrey\Readability\Readability->rateNodes(Array)
#5 /usr/www/ttrss/plugins/af_readability/init.php(178): andreskrey\Readability\Readabi in /usr/www/ttrss/vendor/andreskrey/Readability/Nodes/NodeTrait.php on line 324
[14:57:11/43243] removing lockfile (43243)…
[14:57:11/41915] [reap_children] child 43243 reaped.
[14:57:11/41915] [SIGCHLD] jobs left: 1
[14:57:14/44892] Scheduled 0 feeds to update…
[14:57:14/44892] Sending digests, batch of max 15 users, headline limit = 1000
[14:57:14/44892] All done.
Since February 17th I’m getting Uncaught: TypeError when updating feeds
If possible include steps to reproduce the problem:
…
tt-rss version (including git commit id):
Version v18.12 (9e7bbf6)
Platform (i.e. Linux distro, PHP, PostgreSQL, etc) versions:
Arch Linux PHP 7.3.2 MySQL
Please provide any additional information below:
Strangely 1 feed is still updating sporadically. I rebooted the system after a kernel update yesterday which refreshed my feeds and I got about 25 news stories, but now it’s not refreshing properly. I use cron to update the feeds, but I also tried the update_daemon2.php file and get the same result.
CLI errors:
PHP Fatal error: Uncaught TypeError: Argument 1 passed to iterator_to_array() must implement interface Traversable, null given in /tt-rss/vendor/andreskrey/Readability/Nodes/NodeTrait.php:324
Stack trace:
#0 /tt-rss/vendor/andreskrey/Readability/Nodes/NodeTrait.php(324): iterator_to_array(NULL)
#1 /tt-rss/vendor/andreskrey/Readability/Nodes/NodeTrait.php(421): andreskrey\Readability\Nodes\DOM\DOMText->getChildren(true)
#2 /tt-rss/vendor/andreskrey/Readability/Readability.php(1272): andreskrey\Readability\Nodes\DOM\DOMText->hasSingleTagInsideElement('td')
#3 /tt-rss/vendor/andreskrey/Readability/Readability.php(1166): andreskrey\Readability\Readability->prepArticle(Object(andreskrey\Readability\Nodes\DOM\DOMDocument))
#4 /tt-rss/vendor/andreskrey/Readability/Readability.php(155): andreskrey\Readability\Readability->rateNodes(Array)
#5 /tt-rss/vendor/andreskrey/Readability/Nodes/NodeTrait.php on line 324
Ahhh so it’s not a ttrss issue, but an upstream issue with that library. OK. I don’t personally have a github account. Could somebody else who is seeing the same issue and already has an account report it, to save me having to register one?
Not much interest from the author to fix the problem, as I’m not much help in the coding department. Maybe someone can read through the few comments and help with what he’s asking for. Otherwise we’ll have to revert and hope for another type of fix.
So I rolled back to a previous commit. For those who would like help with that: I ran git log --oneline and went back 8 commits to find 13e7e775a, then ran git reset --hard 13e7e775a, and now my cron job is updating the feeds again. I use cron nightly to git pull the repo, so I commented out that line in my crontab so the revert sticks.
Some people also issue git clean -f after a git reset, YMMV.
Hope this helps! EDIT: Please make a copy of the original repo before issuing the above commands, in case of catastrophic errors. cp -r tt-rss tt-rss.bak
i guess i’ll make a VM or something with php 7.3 and take a closer look at this
UPD: i’ve subscribed to the feed in the OP but so far no errors. maybe the data is not in the feed anymore. op can you post more feeds / specific posts where this happens?
e: i’m using an ubuntu 18.04 test vm, php 7.3.2-3 from the ppa
This is the list of the feeds that I have the af_readability plugin enabled on… The issue is sporadic for me though. It will work fine for a day or two and then I’ll start seeing that error in the update process. So I guess if you subscribe to these and monitor it for a while you should see the same at some point.
I’m not sure about specific post titles. All my feeds are up to date at the moment, and if I git pull back up to HEAD then it will work fine. I guess if I purge the database of post entries it might trigger it though?
one of your feeds (the gazette) fails to open with a connection timeout, i guess it’s geoblocking or something. the rest seemingly updated without any errors
yeah, i’m not going to keep a separate vm running and updating a bunch of random feeds because it might trigger a readability error at some point, maybe. this sounds like too much effort for a third party library + bleeding edge php combination. instead i’m going to wrap this into try-catch.
next time this happens make it trigger reliably on specific post urls (use force rehash in feed debugger) at least and post those here.
alternatively be a normal person like the rest of us and use a server distro for your server stuff.
e: maybe support should be limited to stable distros like centos and debian (+ubuntu) to begin with, it’s not like i’m going to investigate any issues with meme-tier garbage like arch or gentoo or whatever
update: readability parsing is already inside a try-catch block, which means it crashes in constructor? strange. i’ll move it inside the block, i guess.
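One possible explanation, offered as an assumption and not confirmed anywhere in this thread: since PHP 7, TypeError extends Error rather than Exception, so a block that only catches Exception would let it through even when the failing call is inside the try. The sketch below uses a hypothetical stand-in function (children_of, not php-readability or tt-rss code) to show the difference between catching Exception and catching Throwable.

```php
<?php
// Hypothetical stand-in for the failing library call; this is NOT
// php-readability or tt-rss code. Passing null to a Traversable
// parameter raises a TypeError, the same error class as in the traces.
function children_of(Traversable $nodes): array {
    return iterator_to_array($nodes);
}

// Variant A (assumed, unconfirmed): the handler catches Exception only.
function parse_catching_exception(): string {
    try {
        children_of(null);
        return "parsed";
    } catch (Exception $e) {
        return "caught as Exception";
    }
}

// Variant B: the handler catches Throwable, which covers Exception and Error.
function parse_catching_throwable(): string {
    try {
        children_of(null);
        return "parsed";
    } catch (Throwable $e) {
        return "caught as " . get_class($e);
    }
}

// Variant A lets the TypeError escape, so the demo call itself is guarded.
try {
    $a = parse_catching_exception();
} catch (Throwable $e) {
    $a = "escaped as " . get_class($e);
}
$b = parse_catching_throwable();

echo $a, "\n", $b, "\n";
```

If the plugin's handler does turn out to catch Exception only, widening it to Throwable would be a smaller change than moving the constructor call, though both may be worth doing.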
it could be a good idea to dump entire article XML somewhere (i.e. with file_put_contents) so that we could train readability on it later and see if it crashes
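A minimal sketch of what such a dump could look like. The helper name dump_article, the dump directory and the file naming scheme are all assumptions made for illustration; nothing here is existing tt-rss code.

```php
<?php
// Hypothetical helper; the name, directory and file naming are
// assumptions for illustration, not part of tt-rss.
function dump_article(string $dump_dir, string $article_url, string $content): ?string {
    if (!is_dir($dump_dir) && !mkdir($dump_dir, 0700, true)) {
        return null; // could not create the dump directory
    }
    // Hash the URL so the file name is filesystem-safe and stable per article.
    $path = $dump_dir . DIRECTORY_SEPARATOR . sha1($article_url) . ".dump";
    // LOCK_EX avoids interleaved writes if several update processes run at once.
    $written = file_put_contents($path, $content, LOCK_EX);
    return $written === false ? null : $path;
}

$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . "readability-dumps";
$saved = dump_article($dir, "https://example.com/post/1", "<html><body><p>test</p></body></html>");
echo $saved === null ? "dump failed" : "dumped to $saved", "\n";
```

Calling something like this from the catch block, with the article URL and the raw content that was handed to readability, would leave one file per failing article that can be replayed against the library later.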
sorry about the 525. i’ve updated docker-ce and discourse, uh, didn’t take it well. i had to rebuild the container and since it’s such an overbloated monstrosity it always takes forever.