Machine Learning for TTRSS

So I have been a user of TT-RSS for a long time, and recently started looking into machine learning algorithms as they are used for categorization of RSS feeds.

I was wondering if anyone is looking at getting this for TT-RSS.

So far, based on my investigation, there are a few ways of doing this. One of them is described in this Algorithmia blog. The things I was looking at are the machine learning Auto-Tagging from it, and also SentimentAnalysis.

Then there is Feedly, which created “Leo”; it looks like they are using a supervised learning model.

So the question is: has anyone thought about working on this? My knowledge comes purely from learning about ML, and my programming is very rusty.

What would be the objective? When I subscribe to feeds, I put them into categories. It’s not much work to categorize feeds as you add them.

Or do you want it to surface articles most interesting to you from within feeds? That would give you a lot of work to do teaching the AI logic what you like and don’t like. You’d potentially miss articles you might have liked, but didn’t train the algorithm to look for.

My thinking is the following:

  • By using natural language processing for tagging, as I have seen in a lot of AI/ML material, you can read the full article and set up the tags accordingly for easy searching. You should then be able to use the tags in rules to raise or lower the score of the article, giving you a better way of surfacing things to the top than regex alone, which matches as soon as a term appears once.
  • In general, you can use machine learning and training with natural language processing to affect scoring, again raising or lowering the score, so that rules work on the content as a whole rather than on a single word.
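To make the first idea a bit more concrete, here is a minimal Python sketch of tagging an article and then adjusting its score from the tags. It is not TT-RSS code and does not use any TT-RSS API; the stop-word list, the `TAG_RULES` table and the function names are all made up for illustration, and plain word frequency stands in for a real auto-tagging model.

```python
import re
from collections import Counter

# Hypothetical, very short stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "an", "and", "or", "of", "to", "in", "is", "for",
              "on", "with", "that", "this", "it"}

# Hypothetical rules: tag -> how much to raise or lower the article score.
TAG_RULES = {"vulnerability": 50, "exploit": 30, "sponsored": -40}

def extract_tags(text, max_tags=5):
    """Use the most frequent non-stop-words of the full text as tags."""
    words = re.findall(r"[a-z][a-z0-9-]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(max_tags)]

def score_article(text, base_score=0):
    """Return (score, tags); each tag with a rule adjusts the score once."""
    tags = extract_tags(text)
    score = base_score + sum(TAG_RULES.get(tag, 0) for tag in tags)
    return score, tags

article = ("A new vulnerability in OpenSSL. The vulnerability allows "
           "remote exploit; exploit code is public.")
score, tags = score_article(article)
print(score, tags)
```

The point of the sketch is only that the rules act on tags derived from the whole article, not on a single pattern match somewhere in the text.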

While regex is great, it still works as a string search for things that appear somewhere in the feed. This is good, but it is sometimes problematic when dealing with a lot of feeds (I need them for work) where you want to surface things like security vulnerabilities to the top. I think ML can help here by making relevance easier to find.
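As a rough illustration of the difference, here is a small Python sketch of a trained relevance score next to a regex match. It assumes a tiny naive Bayes classifier written from scratch; the class name, the two labels and the four training sentences are invented for the example, and a real setup would need far more training data and probably an existing ML library.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

class TinyNaiveBayes:
    """Toy two-class naive Bayes over word counts (hypothetical helper)."""

    def __init__(self):
        self.word_counts = {"relevant": Counter(), "other": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def relevance(self, text):
        """Log-odds that the text is 'relevant'; above 0 means relevant."""
        vocab = set(self.word_counts["relevant"]) | set(self.word_counts["other"])
        log_odds = math.log(self.doc_counts["relevant"] / self.doc_counts["other"])
        for label, sign in (("relevant", 1), ("other", -1)):
            total = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in vocab:
                    continue  # ignore words never seen in training
                # Laplace (add-one) smoothing avoids zero probabilities.
                p = (self.word_counts[label][word] + 1) / (total + len(vocab))
                log_odds += sign * math.log(p)
        return log_odds

model = TinyNaiveBayes()
model.train("Critical vulnerability allows remote code execution in web server", "relevant")
model.train("Patch released for buffer overflow exploit in kernel", "relevant")
model.train("Company announces new phone with better camera", "other")
model.train("Review of the best laptops for students this year", "other")

headline = "Remote code execution flaw found in kernel driver"
regex_hit = re.search(r"vulnerability", headline, re.IGNORECASE)
print("regex match:", bool(regex_hit))
print("learned relevance:", model.relevance(headline))
```

The headline never contains the word "vulnerability", so a filter on that word misses it, while the learned score can still come out positive because of the other words it shares with the security examples. The resulting number could then be mapped to a score adjustment in the same way a filter rule would.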