Feed parser doesn't ignore <template> tags

  • [x] I’m using stock docker compose setup, unmodified.
  • [ ] I’m using docker compose setup, with modifications (modified .yml files, third party plugins/themes, etc.) - if so, describe your modifications in your post. Before reporting, see if your issue can be reproduced on the unmodified setup.
  • [ ] I’m not using docker on my primary instance, but my issue can be reproduced on the aforementioned docker setup and/or official demo.

Describe the problem you’re having:

Some strange formatting appears in an RSS feed. When I brought this to the attention of the feed owner, they remarked that RSS readers are not supposed to render these <template> tags (see "RSS feed broken formatting · Issue #6 · space-wizards/space-wizards.github.io · GitHub").

Include steps to reproduce the problem:

Subscribe to this feed and look at its feed items: the contents of <template> tags show up in the rendered articles.

  • Tiny Tiny RSS version (including git commit id): v22.06-d4be821
  • Platform (i.e. Linux distro, Docker, PHP, PostgreSQL, etc) versions:
    Linux 5847d7bcdfcd 5.15.0-100-generic #110-Ubuntu SMP Wed Feb 7 13:27:48 UTC 2024 x86_64
    Docker version 25.0.4, build 1a576c5
    PHP Version 8.0.13

Guessing you’re using ttrss-af-readability? The feed itself doesn’t have <template> elements, but the articles/pages linked to (e.g. “Progress Report #38: Oh Baby a Triple”) do.

edit: A quick fix would be to disable Readability for that feed. A longer-term fix might be for https://gitlab.tt-rss.org/main/libraries/readability-php/-/blob/8ac5abdd497b37d2be4833bcf18d6819bba4d9c9/src/Readability.php#L290 to strip <template>, similar to what it already does for <script> and <noscript>.
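
To illustrate the idea of dropping <template> together with <script> and <noscript>, here is a rough sketch in Python using only the standard library. It is not the readability-php code (that library is PHP), and the names STRIPPED_TAGS, TagStripper and strip_inert_elements are made up for this example. It assumes reasonably well-formed input: an unclosed <template> would swallow everything after it, and comments and doctype declarations are dropped as a side effect.

```python
from html.parser import HTMLParser

# Hypothetical tag list for illustration; not taken from readability-php or tt-rss.
STRIPPED_TAGS = {"script", "noscript", "template"}


class TagStripper(HTMLParser):
    """Re-emit HTML while dropping STRIPPED_TAGS elements and their contents."""

    def __init__(self):
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a stripped element

    def handle_starttag(self, tag, attrs):
        if tag in STRIPPED_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a skipped region.
        if self.skip_depth == 0 and tag not in STRIPPED_TAGS:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in STRIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")


def strip_inert_elements(html: str) -> str:
    """Return html with the stripped elements (and everything inside them) removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = "<p>Hi</p><template><div>x</div></template><p>Bye</p>"
print(strip_inert_elements(page))  # prints <p>Hi</p><p>Bye</p>
```

Running something like this before any content scoring would keep the <template> markup from ever being considered as article text.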

That makes sense. For now I can’t disable Readability for that feed because the full post isn’t available in the feed, and I use the plugin to inline the article content. Yeah, it would be good for readability-php to strip this tag’s content then, if that’s appropriate.

we can also strip this tag out in sanitizer, it doesn’t seem very useful regardless of where html came from :thinking:

e: also i thought we had a tag whitelist, not blacklist. hm.

Haven’t really dug into the code, but it might be better to have Readability strip <template> before it does its “rating” of nodes.

probably both places then i guess

  • bad for readability
  • useless in general in case someone puts this stuff into feed xml (wouldn’t surprise me)

@redmond an update for the plugin is available, if you’d like to give it a try.