Since the new URL filters were introduced last week, some posts have not been processed by some of my plugins. Today, I tried to find the cause.
It turned out that the URLs of all these posts contain non-ASCII characters, and filter_var with the FILTER_VALIDATE_URL option, which is called in validate_url, does not consider such URLs valid. As a result, in the line
$entry_link = rewrite_relative_url($site_url, clean($item->get_link()));
in rssutils.php, the left-hand side ends up as the empty string. That value is then written to the DB, and my plugins skip these posts because they check $article['link'] in the article_filter hook.
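A minimal way to check the validation step outside tt-rss, using nothing but stock PHP. The two URLs below are made up for illustration (they are not the actual feed links) and differ only in one accented character; the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical URLs for illustration only; not taken from the affected feed.
$ascii_url     = 'https://example.com/r/brasil/comments/abc/noticia/';
$non_ascii_url = 'https://example.com/r/brasil/comments/abc/notícia/';

// filter_var returns the URL string on success and false on failure.
$ascii_ok     = filter_var($ascii_url, FILTER_VALIDATE_URL) !== false;
$non_ascii_ok = filter_var($non_ascii_url, FILTER_VALIDATE_URL) !== false;

var_dump($ascii_ok);     // expected: bool(true)
var_dump($non_ascii_ok); // expected: bool(false)
```

The PHP manual notes that FILTER_VALIDATE_URL only finds ASCII URLs to be valid, so the second check is expected to fail regardless of the installation.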
Example post, which is contained in the r/brasil feed. myfeedsucks shows the correct links, but that is to be expected if it only uses the FeedParser class, since validate_url isn’t called in that class, only clean.
Maybe this is an edge case of my PHP installation, and even if it isn’t, I don’t know what the right fix is. Changing the check in validate_url to
if (filter_var($url, FILTER_VALIDATE_URL) === false && filter_var(htmlentities($url), FILTER_VALIDATE_URL) === false)
could work, but I don’t know if that is a good idea (maybe apply htmlentities to the path component only?).
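As a possible alternative to htmlentities, here is a rough sketch that percent-encodes only the non-ASCII bytes before validating. The helper name encode_non_ascii is hypothetical and not part of tt-rss, and the example URL is made up. It assumes the URL is UTF-8 and that the non-ASCII characters sit in the path or query; an internationalized host name would need something like idn_to_ascii from the intl extension instead, which is not handled here.

```php
<?php
// Hypothetical helper (not part of tt-rss): percent-encode every byte >= 0x80
// so that the URL consists of ASCII characters only. ASCII characters,
// including "&", "?" and existing "%XX" escapes, are left untouched.
function encode_non_ascii(string $url): string {
    $encoded = preg_replace_callback(
        '/[\x80-\xff]/',                 // no /u flag: match single bytes
        function (array $m): string {
            return rawurlencode($m[0]);  // one byte becomes %XX
        },
        $url
    );
    // preg_replace_callback returns null on error; fall back to the input.
    return $encoded ?? $url;
}

// Made-up URL for illustration.
$url     = 'https://example.com/r/brasil/comments/abc/notícia/';
$encoded = encode_non_ascii($url);
$valid   = filter_var($encoded, FILTER_VALIDATE_URL) !== false;

var_dump($encoded);
var_dump($valid);
```

Unlike htmlentities, this leaves "&" in query strings alone and does not depend on a character having a named HTML entity, but whether validate_url should rewrite the URL or only validate an encoded copy is a separate design question.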
If somebody could confirm that this is not only happening to me, I’d be obliged. Otherwise, I’m sorry.