Some error with RSS feeds

  • [x] I’m using the stock docker compose setup, unmodified.
  • [ ] I’m using docker compose setup, with modifications (modified .yml files, third party plugins/themes, etc.) - if so, describe your modifications in your post. Before reporting, see if your issue can be reproduced on the unmodified setup.
  • [ ] I’m not using docker on my primary instance, but my issue can be reproduced on the aforementioned docker setup and/or official demo.

I have two versions of tt-rss installed on virtual servers.
1 - One version from two years ago
2 - The second is the latest version
On the old version, all my RSS feeds work perfectly.
On the new version, a number of RSS feeds fail with an error like this:
Couldn’t download the specified URL: Client error: `GET https://iz.ru/xml/rss/all.xml` resulted in a `403 Forbidden` response

  • Tiny Tiny RSS version (including git commit id):
  • Platform (i.e. Linux distro, Docker, PHP, PostgreSQL, etc) versions:
    Linux, Docker

we can’t do anything about third parties blocking your server for whatever reason.

if those virtual servers are on different vds providers, the ip address / subnet might be the cause. :man_shrugging:

The problem is with the user agent.
wget --user-agent works fine.
How do I add a user agent in tt-rss?

I tried to use
TTRSS_HTTP_USER_AGENT=Mozilla/5.0 (X11; Linux i686; rv:113.0) Gecko/20111914 Firefox/113.0

It works in the old version, but it doesn’t help with the new one.

I just tested that and it worked fine. If you’re using Docker Compose, make sure you’re doing docker compose up -d and not a restart.
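For reference, a minimal sketch of where that variable would live in a compose override file — the service name `app` is an assumption here; use whatever service name the stock compose file actually defines:

```yaml
# docker-compose.override.yml (hypothetical service name "app")
services:
  app:
    environment:
      - TTRSS_HTTP_USER_AGENT=Mozilla/5.0 (X11; Linux i686; rv:113.0) Gecko/20111914 Firefox/113.0
```

After editing, recreate the container with `docker compose up -d` so the new environment is actually applied; a plain restart keeps the old environment.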

The old and new tt-rss VM servers are located on the same network.
The old version works fine; the new one doesn’t.
I am talking about this RSS feed: Izvestia (Известия).

The user agent is the same in the old and the new version:
TTRSS_HTTP_USER_AGENT=Mozilla/5.0 (X11; Linux i686; rv:113.0) Gecko/20111914 Firefox/113.0

dunno what to tell you, if user agent is the same. second victim of guzzlehttp rework?

i’m getting 403 without user agent workarounds so maybe it didn’t actually apply for you?

p.s. you could also try passing the feed through feedburner, maybe they’ll allow it.

The problem is probably somewhere at the PHP library level.
Thanks for the test and the tips!

P.S. If I migrate the old tt-rss instance to PostgreSQL 15, will there be problems with the tt-rss application? I can upgrade the database itself easily.

Looks like they have “DDoS-Guard” in place. I’m getting HTTP 403 Forbidden from a couple places. Might just need to try later.

{
	"result": {
		"code": 5,
		"message": "Client error: `GET https://iz.ru/xml/rss/all.xml` resulted in a `403 Forbidden` response:\n<!doctype html><html><head><title>DDoS-Guard</title><meta charset=\"utf-8\"/><meta name=\"viewport\" content=\"width=device-w (truncated...)\n"
	}
}

If I open it in a regular browser, I sometimes see a second message, “Checking browser”.
Even in that case, wget --user-agent always works, just like the old version of tt-rss.

Looks like the trigger might be use of HTTP/1.1 vs HTTP/2.

# DDoS-Guard page
curl --http1.1 -A 'Tiny Tiny RSS/24.04-d83290712 (https://tt-rss.org/)' https://iz.ru/xml/rss/all.xml
# Normal content
curl --http2 -A 'Tiny Tiny RSS/24.04-d83290712 (https://tt-rss.org/)' https://iz.ru/xml/rss/all.xml

we could make a simple on-fetch plugin for izvestia which would use raw curl instead for those feed urls. if there’s a combination of curl options that works.

then again it might not help reliably against ddos screen they’re using.

shouldn’t be a problem but, as usual, make a backup (and verify that you can restore it).
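a rough sketch of the dump-and-restore route, assuming the stock compose layout — the service name `db` and the database/user name `ttrss` are assumptions, adjust to your setup:

```
# dump from the old instance's db container
docker compose exec -T db pg_dump -U ttrss ttrss > ttrss-backup.sql

# restore into the new postgres 15 instance
docker compose exec -T db psql -U ttrss ttrss < ttrss-backup.sql
```

verify the restored instance actually starts and shows your feeds before deleting anything on the old server.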

can guzzle prefer http2?

Yeah, just need to set 'version' => 2 (or GuzzleHttp\RequestOptions::VERSION => 2) in the request options; curl will fall back to 1.1 if needed.
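As a sketch of what that looks like against a plain Guzzle client (not tt-rss’s actual wrapper code — the URL and client setup here are just for illustration):

```php
<?php
// Sketch only: ask Guzzle/curl to prefer HTTP/2 for one request.
// curl negotiates the version and falls back to HTTP/1.1 when the
// server doesn't support HTTP/2.
require 'vendor/autoload.php';

$client = new \GuzzleHttp\Client();
$response = $client->get('https://iz.ru/xml/rss/all.xml', [
    \GuzzleHttp\RequestOptions::VERSION => 2.0, // equivalent to 'version' => 2.0
]);
echo $response->getProtocolVersion(); // version the server actually spoke
```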

I just tested it out and hit DDoS-Guard again. Even with HTTP/2 being used, it seems there’s still some discernible difference between CLI curl and what’s happening in PHP land.

edit: It looks like ALPN was being used to tell the difference. Adding \CURLOPT_SSL_ENABLE_ALPN => false to the curl request options got things working in PHP with both HTTP/1.1 and HTTP/2.

i wonder if that’s a sane enough configuration that we could use as a default.

enabling ALPN shouldn’t be a bad thing either.

Disabling ALPN feels slightly hacky (or maybe just “limited benefit”) to me, but I’m guessing things would keep working, just not as smoothly as they could. Since it seems like a widespread and generally useful feature, I’d probably lean towards the plugin (or a configuration option) unless disabling ALPN would also help with Cloudflare, etc.

oh, i misread it as force-enabling alpn instead of disabling it. yeah, disabling is hacky.

I looked at the git history of the working version.
The old version that works:
3b4e12ff Andrew Dolgov [email protected] on 02.04.2023 at 20:07

Are there any other differences between your old and new system (e.g. switching from host installation to the Docker image, OpenSSL and/or PHP version change, etc.)?

In both cases it’s the default docker compose file (old and new) from the instructions.
The old version has PHP 8.2 (default); the new one has PHP 8.3 (default).

Below is a very basic and brittle plugin you could try (place it in plugins.local/ddos_guard_workaround/init.php). No real error handling, best practices, etc.

<?php

class Ddos_Guard_Workaround extends Plugin {
	const SITES_TO_HANDLE = [
		'https://iz.ru/',
	];

	public function about() {
		return [
			null, // version
			'Workaround for DDoS Guard on certain sites', // description
			'', // author
			false, // is system
			'', // more info URL
		];
	}

	public function api_version() {
		return 2;
	}

	public function init($host): void {
		$host->add_hook($host::HOOK_SUBSCRIBE_FEED, $this);
		$host->add_hook($host::HOOK_FEED_BASIC_INFO, $this);
		$host->add_hook($host::HOOK_FETCH_FEED, $this);
	}

	public function hook_subscribe_feed($contents, $url, $auth_login, $auth_pass) {
		return self::should_handle($url) ? self::fetch($url) : $contents;
	}

	public function hook_feed_basic_info($basic_info, $fetch_url, $owner_uid, $feed_id, $auth_login, $auth_pass) {
		return self::should_handle($fetch_url) ? ['site_url' => $fetch_url, 'title' => $fetch_url] : $basic_info;
	}

	public function hook_fetch_feed($feed_data, $fetch_url, $owner_uid, $feed, $last_article_timestamp, $auth_login, $auth_pass) {
		if (!self::should_handle($fetch_url)) {
			return $feed_data;
		}
		$content = self::fetch($fetch_url);
		return $content ?: $feed_data;
	}

	private static function should_handle(string $url): bool {
		foreach (self::SITES_TO_HANDLE as $site_prefix) {
			if (str_starts_with($url, $site_prefix)) {
				return true;
			}
		}
		return false;
	}

	private static function fetch(string $url): string {
		$ch = curl_init($url);
		curl_setopt($ch, \CURLOPT_SSL_ENABLE_ALPN, false);
		curl_setopt($ch, \CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux i686; rv:113.0) Gecko/20111914 Firefox/113.0');
		curl_setopt($ch, \CURLOPT_RETURNTRANSFER, true);
		$result = curl_exec($ch);
		curl_close($ch);
		// curl_exec() returns false on failure; fall back to an empty string.
		return is_string($result) ? $result : '';
	}
}