mazzy
May 21, 2019, 9:35am
1
I want to create a filter that finds articles with lots of images and assigns the label img to them.
I’ve tried the match expressions (<img\s.*){5,} and (<img\s.*){5,}, but the filter does not work.
For now I use the expression (\bimg\s.*){5,}, which is of course not quite accurate.
Is there another solution?
Tiny Tiny RSS v19.2 (f38a89a), Server: Ubuntu 18.04, MySQL, Client: Win10 Chrome. search_sphinx disabled.
See also:
https://discourse.tt-rss.org/t/html-in-filters-not-possible-any-more/766
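To make the "not quite accurate" part concrete, here is a minimal sketch of how the fallback pattern behaves in plain PHP. The sample strings are made up for illustration, and this runs outside tt-rss, so it says nothing about what text the filter is actually matched against; it only shows that the pattern cannot tell an image tag from the word "img".

```php
<?php
// The fallback expression from the post, wrapped in PCRE delimiters.
$pattern = '/(\bimg\s.*){5,}/';

// Hypothetical inputs, invented for this sketch.
$five_tags  = str_repeat('<img src="a.png"> ', 5);                   // five real tags
$plain_text = 'img one img two img three img four img five';       // no tags at all
$one_word   = 'A post about the img tag with one picture';         // "img" only once

var_dump(preg_match($pattern, $five_tags));   // matches
var_dump(preg_match($pattern, $plain_text));  // also matches: a false positive
var_dump(preg_match($pattern, $one_word));    // does not match
```

So the expression fires on any text in which the word "img" is followed by whitespace at least five times, whether or not there is a single image.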
fox
May 21, 2019, 9:55am
2
you can’t use html markup in filters, it’s a known limitation.
e: you did find that previous thread about it, why make another one?
mazzy
May 21, 2019, 10:11am
3
My question is how to assign the label img to articles with lots of images.
fox
May 21, 2019, 10:22am
4
well there’s one obvious solution: make a custom plugin.
mazzy
May 21, 2019, 10:25am
5
…server-side custom plugin.
is there one?
Untested, but this should probably do what you want (copied partly from auto_assign_labels).
Save it as ./plugins.local/auto_image_labels/init.php and don’t forget to activate it in the preferences.
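<?php
class Auto_Image_Labels extends Plugin {
    private $host;

    function about() {
        return array(1.0,
            "Assign labels to articles with more than x images",
            "fox / aeritir");
    }

    function init($host) {
        $this->host = $host;
        $host->add_hook($host::HOOK_ARTICLE_FILTER, $this);
    }

    function hook_article_filter($article) {
        $max_images = 5;    // label articles with more than this many images
        $my_label = 'img5'; // label to assign

        $doc = new DOMDocument();

        // loadHTML() returns false on failure; $doc itself is always truthy,
        // so test the return value. @ silences warnings about sloppy feed HTML.
        if (!empty($article["content"]) && @$doc->loadHTML($article["content"])) {
            $xpath = new DOMXPath($doc);
            $images = $xpath->query('//img');

            // ->length works on every PHP version; count() on a DOMNodeList needs PHP 7.2+
            if ($images->length > $max_images) {
                // untested: assumes tt-rss accepts a plain caption string here
                array_push($article["labels"], $my_label);
            }
        }

        return $article;
    }

    function api_version() {
        return 2;
    }
}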
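Since the plugin is untested, it may help to try the counting step on its own before installing anything. Below is a minimal standalone sketch of that step; count_images and the sample HTML are hypothetical names made up for this example and are not part of tt-rss.

```php
<?php
// Hypothetical helper, not part of tt-rss: counts <img> elements in an HTML fragment.
function count_images(string $html): int {
    if (trim($html) === '') {
        return 0; // nothing to parse; also avoids handing loadHTML() an empty string
    }

    $doc = new DOMDocument();

    // Feed HTML is rarely clean, so collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return 0;
    }

    $xpath = new DOMXPath($doc);
    return $xpath->query('//img')->length;
}

// Made-up sample: two image tags and one paragraph without images.
$sample = '<p>img tags: <img src="a.png"><img src="b.png" /></p><p>no image here</p>';
echo count_images($sample), "\n";
```

Running this with the PHP command-line interpreter needs only the DOM extension, so the threshold logic can be checked against saved article HTML without touching the tt-rss installation.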
mazzy
May 21, 2019, 1:11pm
7
smart! interesting. thanks!
Why load the content into an XML document?
Can I just use preg_match on $article["content"]?
Nothing prevents you from doing so, but most plugins use the DOMDocument/XPath approach because they often modify the article as well, and you want to make sure the HTML is still valid afterwards.
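For comparison, a minimal sketch of the regex route. It uses preg_match_all rather than preg_match, because preg_match only reports whether the pattern matched, not how many times. count_images_regex and the sample string are hypothetical and only for illustration.

```php
<?php
// Hypothetical helper for illustration; not part of tt-rss.
function count_images_regex(string $html): int {
    // preg_match_all() returns the number of full matches, or false on error.
    return (int) preg_match_all('/<img\b/i', $html);
}

// Made-up sample: one live image and one that is commented out.
$html = '<p><img src="a.png"><!-- <img src="old.png"> --></p>';
echo count_images_regex($html), "\n"; // the commented-out tag is counted too
```

A regex sees only characters, so it also counts the tag inside the HTML comment. A DOM parser treats that part as a comment node, which is one reason the counts from the two approaches can differ on real articles.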