Get GitLab feeds working?

  • [x] I’m not using docker on my primary instance, but my issue can be reproduced on the aforementioned docker setup and/or official demo.

Apparently, apart from GitLab being "overbloated shit garbage", its Atom feeds seem to be malformed as well. Since no other topic here discusses this problem, I’m curious whether this is specific to the example feed below (unlikely) and/or whether there’s a way to remedy it.

It should be noted that once every month or so, the problematic posts that cause these errors are absent from the feed, and tt-rss then parses it successfully.

Here’s the output by myfeedsucks for https://gitlab.com/librewolf-community/browser/windows.atom:

Fetch error: N/A
Error: LibXML error 76 at line 157 (column 656): Opening and ending tag mismatch: img line 157 and a
Error: LibXML error 76 at line 157 (column 656): Opening and ending tag mismatch: img line 157 and a
Error: LibXML error 76 at line 157 (column 660): Opening and ending tag mismatch: a line 157 and p
Error: LibXML error 76 at line 157 (column 1316): Opening and ending tag mismatch: img line 157 and a
Error: LibXML error 76 at line 157 (column 1320): Opening and ending tag mismatch: a line 157 and p
Error: LibXML error 76 at line 158 (column 7): Opening and ending tag mismatch: p line 157 and div
Error: LibXML error 76 at line 159 (column 13): Opening and ending tag mismatch: p line 157 and ummary
Error: LibXML error 76 at line 160 (column 9): Opening and ending tag mismatch: div line 156 and entry
Error: LibXML error 76 at line 174 (column 752): Opening and ending tag mismatch: img line 174 and a
Error: LibXML error 76 at line 174 (column 756): Opening and ending tag mismatch: a line 174 and p
Error: LibXML error 76 at line 174 (column 1410): Opening and ending tag mismatch: img line 174 and a
Error: LibXML error 76 at line 174 (column 1414): Opening and ending tag mismatch: a line 174 and p
Error: LibXML error 76 at line 175 (column 7): Opening and ending tag mismatch: p line 174 and div
Error: LibXML error 76 at line 176 (column 13): Opening and ending tag mismatch: p line 174 and ummary
Error: LibXML error 76 at line 177 (column 9): Opening and ending tag mismatch: div line 173 and entry
Error: LibXML error 76 at line 191 (column 956): Opening and ending tag mismatch: img line 191 and a
Error: LibXML error 76 at line 191 (column 960): Opening and ending tag mismatch: a line 191 and p
Error: LibXML error 76 at line 191 (column 1611): Opening and ending tag mismatch: img line 191 and a
Error: LibXML error 76 at line 191 (column 1615): Opening and ending tag mismatch: a line 191 and p
Error: LibXML error 76 at line 191 (column 2261): Opening and ending tag mismatch: img line 191 and a
Error: LibXML error 76 at line 191 (column 2265): Opening and ending tag mismatch: a line 191 and p
Error: LibXML error 76 at line 191 (column 2909): Opening and ending tag mismatch: img line 191 and a
Error: LibXML error 76 at line 191 (column 2913): Opening and ending tag mismatch: a line 191 and p
Error: LibXML error 76 at line 192 (column 7): Opening and ending tag mismatch: p line 191 and div
Error: LibXML error 76 at line 193 (column 13): Opening and ending tag mismatch: p line 191 and ummary
Error: LibXML error 76 at line 194 (column 9): Opening and ending tag mismatch: p line 191 and entry
Error: LibXML error 76 at line 314 (column 8): Opening and ending tag mismatch: p line 191 and feed
Error: LibXML error 77 at line 315 (column 1): Premature end of data in tag div line 190
  • Tiny Tiny RSS version (including git commit id): c30b24d09f4096e612965af658540595262f6848 (latest)
  • Platform (i.e. Linux distro, Docker, PHP, PostgreSQL, etc) versions: doesn’t matter, can be reproduced on every platform, and with myfeedsucks

GitLab is sending invalid XHTML, which LibXML (used by tt-rss) doesn’t like. At minimum, img tags aren’t being closed properly in summary content. There are probably no feed items with img tags when you see successful parsing.
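To illustrate the point, here is a minimal sketch that feeds LibXML (via PHP's DOMDocument) a fragment with an unclosed img tag and then the same fragment with the tag self-closed. The helper name xml_errors and both sample fragments are made up for this example; they are not taken from the GitLab feed itself.

```php
<?php
// Parse an XML string and return the libxml errors it produced.
function xml_errors(string $xml): array {
    $previous = libxml_use_internal_errors(true); // collect errors instead of emitting warnings
    libxml_clear_errors();
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $errors;
}

// Hypothetical fragments: the first leaves <img> open, the second self-closes it.
$broken = '<div xmlns="http://www.w3.org/1999/xhtml"><p><a href="#"><img src="x.png"></a></p></div>';
$fixed  = '<div xmlns="http://www.w3.org/1999/xhtml"><p><a href="#"><img src="x.png" /></a></p></div>';

foreach (xml_errors($broken) as $e) {
    printf("LibXML error %d at line %d (column %d): %s\n", $e->code, $e->line, $e->column, trim($e->message));
}
printf("errors after closing the tag: %d\n", count(xml_errors($fixed)));
```

The first fragment should produce the same kind of "Opening and ending tag mismatch" messages as in the report above, while the second should parse cleanly.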

Options include contacting GitLab to get that fixed, or creating a tt-rss plugin to fix up the content (likely by using the HOOK_FEED_FETCHED hook).
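A rough sketch of what such a plugin could look like follows. Everything in it is an assumption rather than tested tt-rss code: the class name Gitlab_Feed_Fix and the helper close_void_tags are hypothetical, the hook_feed_fetched signature is written from memory of existing plugins and should be checked against your tt-rss checkout, and it assumes unclosed img/br tags are the only thing wrong with the feed.

```php
<?php
// Pure helper: self-close <img> and <br> tags so XHTML content becomes well-formed.
function close_void_tags(string $xml): string {
    return preg_replace('/<(img|br)\b([^>]*?)\s*\/?>/i', '<$1$2 />', $xml);
}

// Only define the plugin inside tt-rss, where the Plugin base class exists.
if (class_exists("Plugin")) {
    class Gitlab_Feed_Fix extends Plugin { // hypothetical plugin name
        function about() {
            return [1.0, "Self-closes img/br tags in GitLab Atom feeds", "example"];
        }

        function init($host) {
            $host->add_hook($host::HOOK_FEED_FETCHED, $this);
        }

        // Assumed signature; verify it against your tt-rss version.
        function hook_feed_fetched($feed_data, $fetch_url, $owner_uid, $feed) {
            // Leave every feed that is not served from gitlab.com untouched.
            if (parse_url($fetch_url, PHP_URL_HOST) !== "gitlab.com")
                return $feed_data;

            return close_void_tags($feed_data);
        }

        function api_version() {
            return 2;
        }
    }
}
```

Checking the host in the hook keeps the rewrite away from feeds that are already valid.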

i vaguely recall xmllint plugin being a thing.

Apparently, the fastcat/tt-rss-ff-xmllint plugin on GitHub ("Tiny Tiny RSS plugin to run xmllint and/or tidy") runs on all feeds when enabled. Since it can potentially break perfectly valid feeds, I don’t think it’s a great option.

Thanks for the further explanation. I’ve found that this was already reported over 7 months ago in the GitLab issue "Broken activity feed when image or line break used in comment" (#361722).

Additionally, there’s a related GitLab issue, "Invalid atom feed for tags (Missing entry element: updated)" (#26800), that was almost completely ignored for years. Great.

Ok, so this nasty little “preprocessor” works OK. This way the feed is valid on myfeedsucks, too.

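<?php
// Minimal feed "preprocessor": fetches the feed given in ?feed=... and wraps
// every <summary> body in CDATA so malformed markup inside it cannot break XML parsing.

if (!isset($_GET["feed"]))
	die("No parameter ?feed=... specified.");

$url = $_GET["feed"];
// Only fetch http(s) URLs, so the script cannot be used to read local files.
if (!preg_match("/^https?:\/\//i", $url))
	die("Only http(s) feed URLs are supported.");

$feed = file_get_contents($url, false);
if (!$feed)
	die("Feed not found.");

// Turn each XHTML summary into an opaque CDATA section.
$feed = str_replace("<summary type=\"xhtml\">", "<summary><![CDATA[", $feed);
$feed = str_replace("</summary>", "]]></summary>", $feed);

header("Content-Type: application/atom+xml; charset=utf-8");
echo $feed;

?>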

The images themselves don’t load, though, because the img elements carry lazy-load attributes from gitlab.com (the real URL sits in data-src, while src is only a placeholder):

<img src="data:image/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==" alt="image" decoding="async" class="lazy gfm" data-src="/librewolf-community/browser/windows/uploads/93bc65a882cc0ede21dbf67df8013995/image.png" data-canonical-src="/uploads/93bc65a882cc0ede21dbf67df8013995/image.png">

:man_facepalming:

why is gitlab always like this?

it’s not hard to make a plugin which is enabled for specific feeds, there’s tons of other plugins to copy-paste this from. maybe just ask the dev?

It’s pretty stupid indeed.

Because of the lazy load issue, I might as well keep using a “pre-processor” that also fixes that, instead of the xmllint plugin…

im not sure how your preprocessor works but turning it into a plugin shouldn’t be hard.

It’s really just doing the following replacements:

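// Wrap each summary body in CDATA so unclosed tags inside it no longer break XML parsing.
$feed = preg_replace("/<summary(.*?)>(.*?)<\/summary>/s", "<summary$1><![CDATA[$2]]></summary>", $feed);
// Drop the placeholder src, then promote data-src to src ([^>] keeps each match inside one tag).
$feed = preg_replace("/<img([^>]*?)\ssrc=\"[^\"]*\"/", "<img$1", $feed);
$feed = preg_replace("/<img([^>]*?)\sdata-src=/", "<img$1 src=", $feed);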
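For anyone who wants the lazy-load fix as a reusable function, here is a hedged sketch. The function name unlazy_images and the $base parameter are hypothetical, and resolving root-relative data-src paths against https://gitlab.com is an assumption about where the uploads live. Unlike the plain replacements, it leaves images without a data-src attribute alone.

```php
<?php
// Replace the lazy-load placeholder src of each <img> with its data-src value.
// $base is an assumed origin used to resolve root-relative paths.
function unlazy_images(string $html, string $base = "https://gitlab.com"): string {
    return preg_replace_callback('/<img\b[^>]*>/i', function (array $m) use ($base): string {
        $tag = $m[0];

        // Leave images without a data-src attribute untouched.
        if (!preg_match('/\sdata-src="([^"]*)"/i', $tag, $src))
            return $tag;

        $url = $src[1];
        // Root-relative paths ("/foo", but not "//host/foo") get the assumed base prepended.
        if (strncmp($url, "/", 1) === 0 && strncmp($url, "//", 2) !== 0)
            $url = rtrim($base, "/") . $url;

        // Remove the placeholder src and the data-src attribute, then add the real src.
        $tag = preg_replace('/\s(?:data-)?src="[^"]*"/i', '', $tag);
        return '<img src="' . $url . '"' . substr($tag, 4);
    }, $html);
}

// Shortened version of the img element quoted earlier in the thread.
$sample = '<img src="data:image/gif;base64,AAAA" alt="image" class="lazy gfm" data-src="/a/b.png" data-canonical-src="/uploads/b.png">';
echo unlazy_images($sample), "\n";
```

Working tag by tag in a callback avoids a greedy match spilling from one img element into the next.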