fox
9
why would they imagine their apps being exempt from GDPR while their site isn’t
The apps she referred to are US-based, so they wouldn’t be subject to GDPR. Instead, I just bookmarked the RSS link, and open it in my browser. Of course, the problem with that is there’s no way to hide the headlines for articles I’ve already scanned.
fox
12
oh im fairly certain that’s not how it works
Me? I don’t have a clue how it works. I never heard the term GDPR before this discussion. My last comment was totally based on the other replies herein – in particular, the full error message text posted by mamil, and the response I received from the newspaper’s editor.
Assuming the content apps she referred me to live on servers in the US (or anywhere other than the EU), then requests sent by those apps to the newspaper’s RSS server would not be denied. Likewise, any RSS reader that’s not based in the EU could serve up their RSS feeds to me. Or I can display them in my browser, in which case the request comes through my local ISP. I think what she’s saying is only requests from the EU are being blocked. Whether or not that’s an appropriate use of the 451 error is a different question.
imgx64
14
What I find weird is that it’s impossible to collect private user information from RSS feeds because A) no cookies are sent, B) no JS is executed, and C) the IP address is that of the RSS-reader server, not the end user’s IP address. So why do these sites break their RSS feeds when accessed from Europe?
Oh, I know the answer. Incompetence.
A GET request for an RSS feed can just as easily include a cookie as a request for anything else on a web server. Anyway, they’re surely blocking all requests coming from some list of European IPs; they’re not going to add another check to permit European requests if they don’t include cookies.
There’s a lot of uncertainty about whether an IP address is enough to count as personally identifying. The fact that the request for the feed is coming from a server is immaterial; to the hosting site, TT-RSS on a server is just as much a client as a feed reader on a desktop or mobile device. The TT-RSS server could also be used by a single individual (which is often the case), so the server’s IP address could be just as personal, if not more so, than the IP assigned to you at home by your ISP.
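To make the point concrete: fetching a feed is an ordinary HTTP GET, so nothing stops the client from attaching a Cookie header. A minimal sketch in Python, where the feed URL and cookie value are placeholders, not a real feed:

```python
import urllib.request

# An RSS feed is fetched with a plain HTTP GET, so the client can attach
# a Cookie header exactly as it would for any other page on the server.
# example.com/feed.xml and the session cookie are placeholders.
req = urllib.request.Request(
    "https://example.com/feed.xml",
    headers={
        "Cookie": "session=abc123",          # sent like any other request header
        "User-Agent": "Tiny Tiny RSS/21.0",  # hypothetical reader user agent
    },
)

print(req.get_header("Cookie"))  # the cookie rides along with the feed request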
You should be able to proxify the feed with something like https://feed43.com
imgx64
17
Yes, that’s technically possible. But are there any RSS clients that save and send back cookies by default?
I don’t know but it doesn’t matter, it’s all just HTTP.
Interesting idea. I just tried to set up a proxy for the feed linked in my original post. Unfortunately, I was stumped at the step where it asks to define the required “Item (repeatable) search pattern” macro. The “?” pop-up wasn’t very helpful, at least not for a noob like myself. If it’s not too much trouble, could you take a look? Feed43 New Feed
Thanks, Homlett. Unfortunately I still get the 451 error. I checked the location of feed43’s IP address and it’s in the US. Go figure.
EDIT: The next time I opened TT-RSS, there was no error. The proxy feed worked! There must have been a delay before the proxy URL took effect. Thanks!
fox
22
well some people use reader apps, as opposed to something client-server like feedly or tt-rss
@homlett, the feed43 home page says it can be used to create an RSS feed from any page. I could use something like that for the Associated Press news feeds. They discontinued RSS support last Fall. Instead, the AP news feeds are now presented as web pages (e.g., Top News: US & International Top News Stories Today | AP News). There was a plugin that fixed this but it no longer works (see this discussion).
I found a tutorial @ feed43 that explains how to set up the extraction rules. However, the AP uses JS to generate the article feed so there’s nothing to use for the extraction rules. Does that mean it’s not possible to create a proxy RSS feed in this case?
This kind of page is a real pain. However, if you look at the requests made by the JavaScript (in your browser’s developer tools, “Network” tab), you can find a JSON file with everything you need to build your feed:
https://afs-prod.appspot.com/api/v2/feed/tag?tags=apf-topnews
It even includes the full text of the articles, though unfortunately with some ugly non-HTML line breaks.
Also, because the JSON file is quite heavy, Feed43 keeps only a tiny part of it, so you won’t be able to “extract” more than 9 entries. That should be enough with a high refresh rate, I guess (depends on Feed43).
Building a complete feed from this JSON would be easy with a tool like Huginn or even a dedicated TT-RSS plugin. Anyway, here it is:
https://feed43.com/feed.html?name=ap-top-news for editing
https://feed43.com/ap-top-news.xml for subscribing
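Building a feed from that JSON yourself, as suggested above, is mostly a matter of mapping JSON fields onto RSS elements. A minimal sketch; the field names below (`cards`, `headline`, `url`) are assumptions for illustration, not the actual appspot schema, and a real version would fetch the URL instead of using an inline sample:

```python
import json
from xml.sax.saxutils import escape

# Sample payload standing in for the appspot JSON; the field names here
# ("cards", "headline", "url") are illustrative assumptions only.
payload = json.loads("""
{
  "cards": [
    {"headline": "First story",  "url": "https://apnews.com/article/1"},
    {"headline": "Second story", "url": "https://apnews.com/article/2"}
  ]
}
""")

def to_rss(items, title="AP Top News (proxy)"):
    """Render a minimal RSS 2.0 document from headline/url pairs."""
    entries = "\n".join(
        "    <item><title>{}</title><link>{}</link></item>".format(
            escape(it["headline"]), escape(it["url"]))
        for it in items
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<rss version="2.0">\n  <channel>\n    <title>{}</title>\n{}\n'
        "  </channel>\n</rss>".format(escape(title), entries)
    )

feed_xml = to_rss(payload["cards"])
print(feed_xml)
```

A cron job or TT-RSS plugin could write `feed_xml` somewhere the reader can subscribe to, sidestepping Feed43’s 9-entry limit.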
fox
25
luckily for all of us frontend developers are incapable of doing anything without a nice JSON provided by someone who has functional brain matter
Thanks @homlett. That works, within the limitations you mentioned. With the free version of feed43, the refresh rate is 6 hours. However, the AP Top News feed often produces a lot more than 9 articles within that period, mostly repeats, so I’ll have to see how much it misses.
The author of the AP News plugin for TT-RSS replied to the other discussion I linked in my previous reply. He says the plugin still works, so we’re trying to figure out why @mamil and I are getting the "unable to download URL" error.
@mamil, is your instance of TT-RSS running on an EU server?
@mamil, so we can’t rule out GDPR. If so, I guess there’s not much to do about it other than stand up my own instance of TT-RSS here in the US. I’m afraid I’m not up for the technical challenge, nor do I have time to learn. Maybe next year. I recently built my first Linux box and I’ve had to spend way more time than I imagined learning how to do all the things I took for granted on my aging (but still productive) XP box.
@homlett, as I suspected, the feed43 proxy doesn’t pick up nearly all the AP top news articles. I guess my best option is to subscribe to the paid version, which updates every hour and has a larger page size allowance (250k vs 100k). Together those features should handle whatever AP throws at it. I really appreciate your effort to set that up!