ImageCerberusPLG5 high score, no?
-
- Posts: 25
- Joined: 09 Feb 2015 11:29
ImageCerberusPLG5 high score, no?
Hi, everyone
I found a false-positive email in which the rule ImageCerberusPLG5 hit with a high score of 4.50. All the email contained was a banner/letterhead image with the customer's logo.
I found it strange, as this rule is not in official SA and, as I said, the score is really high; it contributed a lot to marking the innocent message as spam. I'll try to analyse whether the rule helps at all...
I saw other posts in the forum from people asking about this score and considering lowering it. Any thoughts on whether this rule has good hits, why it has such a high score, and how it works? Does it try to catch pornographic images or something?
Thanks
- shawniverson
- Posts: 3650
- Joined: 13 Jan 2014 23:30
- Location: Indianapolis, Indiana USA
- Contact:
Re: ImageCerberusPLG5 high score, no?
For some this is the case. Image analysis is not always perfect.
Simply set a lower score for ImageCerberusPLG5 in /etc/mail/spamassassin/local.cf
-
- Posts: 25
- Joined: 09 Feb 2015 11:29
Re: ImageCerberusPLG5 high score, no?
Thanks, but is this an official SA rule? I don't see it in the SA rules. What exactly does it do, and what type of image does it catch, porn?
Why such a high score? I will try to analyse whether it also has some good hits...
What are other folks' experiences with this rule? Is it worth lowering the score?
Thanks
Re: ImageCerberusPLG5 high score, no?
Hello,
We have the same problem (EFA 3.0.0.9 installed), but nothing about ImageCerberusPLGx is written in /etc/mail/spamassassin/local.cf. Instead, the scores are configured in /etc/mail/spamassassin/ImageCerberusPLG.cf
So which file should we change to adjust the scores?
Thanks in advance!
dwmp
Re: ImageCerberusPLG5 high score, no?
You edit /etc/mail/spamassassin/ImageCerberusPLG.cf to lower your scores.
Re: ImageCerberusPLG5 high score, no?
No. Don't edit that file. That file could get overwritten on an update.
The proper answer is to override the values in local.cf.
I also found ImageCerberus scoring too high for the messages I received, so I reduced the scores to a tenth of what they were. Here is what I added to /etc/mail/spamassassin/local.cf:
Code:
# scoring too high. Reduce
score ImageCerberusPLG5 0.5 0.5 0.5 0.5
score ImageCerberusPLG4 0.4 0.4 0.4 0.4
score ImageCerberusPLG3 0.3 0.3 0.3 0.3
score ImageCerberusPLG2 0.2 0.2 0.2 0.2
score ImageCerberusPLG1 0.1 0.1 0.1 0.1
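If you would rather derive the override lines than type them by hand, here is a minimal Python sketch of the idea. It assumes the packaged file uses ordinary SpamAssassin `score RULE n.nn [n.nn n.nn n.nn]` lines; the sample text, the `scaled_overrides` helper and the 0.1 factor are made up for illustration and are not the real contents of ImageCerberusPLG.cf.

```python
import re

# Hypothetical sample input; the real ImageCerberusPLG.cf is not shown in this thread.
SAMPLE_CF = """\
score ImageCerberusPLG5 4.5
score ImageCerberusPLG1 1.0 1.0 1.0 1.0  # four scoresets
describe ImageCerberusPLG5 placeholder description
"""

# Matches "score RULE_NAME value [value value value]".
SCORE_LINE = re.compile(r"^\s*score\s+(\S+)\s+(.+?)\s*$")


def scaled_overrides(cf_text, factor=0.1, prefix="ImageCerberusPLG"):
    """Return 'score' lines for local.cf with each matching score multiplied by factor."""
    lines = []
    for raw in cf_text.splitlines():
        # Drop any trailing comment before matching.
        match = SCORE_LINE.match(raw.split("#", 1)[0])
        if not match or not match.group(1).startswith(prefix):
            continue
        rule, values = match.groups()
        scaled = " ".join(f"{float(v) * factor:.2f}" for v in values.split())
        lines.append(f"score {rule} {scaled}")
    return lines


if __name__ == "__main__":
    # Paste the printed lines into local.cf; the packaged file stays untouched.
    print("\n".join(scaled_overrides(SAMPLE_CF)))
```

The output is meant to go into local.cf, so the packaged file is never edited and an update cannot undo the change.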
Re: ImageCerberusPLG5 high score, no?
Sorry if I gave wrong advice. All the sites I've been browsing said that custom .cf and .pm files go into /etc/mail/spamassassin/, so I didn't expect anything to overwrite files in there. After reading up on it, it seems I was only partially right: you can place your custom files in there and they will stay, but everything that's already in there by default could be overwritten.
Re: ImageCerberusPLG5 high score, no?
ovizii, that is the place for custom cf and pm files (at least that is where I am putting mine), but it's not really the place to alter preexisting files (except local.cf), as they were created by other packages.
So yes, preexisting stuff (except local.cf, hopefully) could get overwritten. New stuff should be left alone.
Re: ImageCerberusPLG5 high score, no?
Alright, I will edit local.cf.
Thanks very much guys!
-
- Posts: 25
- Joined: 09 Feb 2015 11:29
Re: ImageCerberusPLG5 high score, no?
Hi,
Thanks, everyone! Is it not possible/worth it to lower these scores by default in EFA?
Are these official SA rules?
Thanks
- shawniverson
- Posts: 3650
- Joined: 13 Jan 2014 23:30
- Location: Indianapolis, Indiana USA
- Contact:
Re: ImageCerberusPLG5 high score, no?
robertboyl wrote: Hi,
Thanks, everyone! Is it not possible/worth it to lower these scores by default in EFA?
Are these official SA rules?
Thanks
https://github.com/E-F-A/v3/issues/284
-
- Posts: 25
- Joined: 09 Feb 2015 11:29
Re: ImageCerberusPLG5 high score, no?
Thanks a lot, Shawn, very nice of you.
Congrats on EFA and constant improvements!!
- Daniel Beardsmore
- Posts: 28
- Joined: 06 Jan 2016 18:54
- Location: Hertfordshire, UK
- Contact:
Re: ImageCerberusPLG5 high score, no?
robertboyl wrote: What are other folks experience with this rule? Worth lowering score?
I've just spotted ImageCerberusPLG3 trip up over a couple of innocuous images in a mail signature (one blank(!) and one being the company name in black text). This earned the message +3 for its audacity, taking it to 4.08 total (it got hit for 1.20 for KAM_LINEPADDING, but reprieved a little for using DKIM).
I seem to recall being concerned with ImageCerberus scoring in the past, so I think it's time to lower the scores myself too.
robertboyl wrote: Why such a high score? I will try to analyse to see if it does have some good hits also...
This reminds me: something MailScanner seems to lack is a way to ask it "what have the Romans^W^W^Whas this rule ever done for me?" It's all very well cursing a rule for a false positive, but maybe it's 99% accurate. I've yet to see any feature that allows you to conduct this search, although I've yet to actively seek a solution to this (unless I already tried, failed and forgot about it, which is possible …)
Re: ImageCerberusPLG5 high score, no?
There is a report that'll show you the SpamAssassin rule hits and the spam/non-spam scoring.
Sorry, not at a computer so cannot tell you exactly where. Look under reports or tools and you'll find it.
- Daniel Beardsmore
- Posts: 28
- Joined: 06 Jan 2016 18:54
- Location: Hertfordshire, UK
- Contact:
Re: ImageCerberusPLG5 high score, no?
pdwalker wrote: There is a report that'll show you the SpamAssassin rule hits and the spam/non-spam scoring.
That report doesn't bring up the individual messages associated with each rule. You can't determine from the report whether a rule is scoring too low (i.e. there are too many false negatives) or too high (i.e. there are too many false positives).
The one thing you do learn from it is the significance of the rule: the ratio of messages affected (positively or negatively) vs total messages for ImageCerberusPLG3 is 0.4% for me, so it doesn't seem a huge deal to largely write it out of the equation.
ImageCerberusPLG1 is the only one seeing a sizeable usage, of 7.5%, but (as I understand it) was only adding +1 anyway.
Re: ImageCerberusPLG5 high score, no?
Not sure what you are looking for, but in my opinion going to EFA => Reports => SA Rule Hits shows all that I need to fine-tune and tweak my scores.
Re: ImageCerberusPLG5 high score, no?
Has anyone got good experiences with this ImageCerberus plugin?
I just checked my stats and ImageCerberusPLG1 - ImageCerberusPLG4 are 100% HAM and ImageCerberusPLG5 is 50% HAM / 50% SPAM so this plugin basically does nothing to help me...
- Daniel Beardsmore
- Posts: 28
- Joined: 06 Jan 2016 18:54
- Location: Hertfordshire, UK
- Contact:
Re: ImageCerberusPLG5 high score, no?
So far as I can tell, the only information that you can gather from the SpamAssassin Rule Hits report is what percentage of messages are being affected by a rule. If this percentage is low, and the rule caused a false positive, it's safe to disable the rule, as it was doing very little anyway.
Let's imagine, however, that a rule was found to be involved with a lot of messages, and scored 50% spam/50% ham. There are several possible explanations for this. One is that it should be scoring 100% spam, but the rule is scored too low and isn't effective enough. Another explanation is that the rule is either entirely inappropriate or is being scored too high, and is causing a large number of false positives. It may be that the rule is actually largely ineffective and just happens to be there doing very little.
As I understand it, the "Ham" and "Spam" columns of the report don't tell you what the message really was, since SpamAssassin doesn't know. The figures only tell you how messages got classified, rather than how they should have been classified.
The report doesn't tell you how strong the rule is (that is, what score was applied), and the report doesn't give you any means to check for yourself whether the ham/spam classification is behaving as desired.
Going back to that 50/50 rule: if you could bring up a report of all messages affected by that rule, you could skim-read the list and check that all the messages in red appear to be spam, and that all the messages in grey appear to be ham. If you were to see ham messages in red and spam messages in grey, then it would become apparent that the messages are being misclassified, and you could check each message to see how the rule in question was being applied, and whether it was responsible for the misclassification.
Not unless I am missing something here?
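To make the missing report concrete, here is a rough Python sketch of the per-rule breakdown described above. It assumes you already have, for each message, the rules that hit, how it was classified, and a hand-checked verdict of what it really was. The field names and the sample records are hypothetical and are not data from anyone's system.

```python
from collections import Counter


def rule_breakdown(messages, rule):
    """For messages that hit `rule`, count each (classified, actual) pair."""
    counts = Counter()
    for msg in messages:
        if rule in msg["rules"]:
            counts[(msg["classified"], msg["actual"])] += 1
    return counts


# Hypothetical hand-reviewed sample, purely for illustration.
reviewed = [
    {"rules": {"ImageCerberusPLG5", "HTML_MESSAGE"}, "classified": "spam", "actual": "ham"},
    {"rules": {"ImageCerberusPLG5"}, "classified": "spam", "actual": "spam"},
    {"rules": {"HTML_MESSAGE"}, "classified": "ham", "actual": "ham"},
]

if __name__ == "__main__":
    breakdown = rule_breakdown(reviewed, "ImageCerberusPLG5")
    for (classified, actual), count in sorted(breakdown.items()):
        print(f"classified={classified} actual={actual}: {count}")
```

A pair such as classified=spam, actual=ham is the "ham in red" case: a message worth opening to see whether the rule in question tipped it over the threshold.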
Re: ImageCerberusPLG5 high score, no?
Makes sense, and you're right, but in my situation I can "trust" the report: I only started using EFA a few weeks ago, so I have been monitoring it very closely, daily, and have corrected every single mistake. I'd say that every SPAM got caught, or at least learned by SA as SPAM if it slipped through, and every HAM got marked and learned as HAM. TxRep and Bayes are working perfectly.
I did look at as many emails as I could which were 50/50 with ImageCerberusPLG5 and, well, they were 50% SPAM and 50% HAM. :-/
Yes, I second the idea that being able to pull out all messages marked by a specific SA plugin would be awesome, but I don't think it would be interesting to many people, or doable.
Re: ImageCerberusPLG5 high score, no?
If you're interested in playing with SQL, you can run a query to list which messages were affected by certain rules.
Of course, you may want to process the results further to make it a little more readable.
Code:
select
timestamp, id, from_address, to_address, subject, isspam, ishighspam, issaspam, sascore, spamreport
from
mailscanner.maillog
where
spamreport like '% ImageCerberusPLG5 %'
order by
sascore;
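As one way of processing the results further, here is a small Python sketch that pulls the rule names and scores out of a spamreport value. It assumes the report looks roughly like "verdict, SpamAssassin (not cached, score=N, required N, RULE score, ...)". That layout, the `parse_spamreport` helper and the sample string are assumptions for illustration, so check them against your own rows before relying on this.

```python
def parse_spamreport(report):
    """Return {rule_name: score} parsed from a MailScanner-style spamreport string."""
    start, end = report.find("("), report.rfind(")")
    if start == -1 or end <= start:
        return {}
    hits = {}
    for item in report[start + 1:end].split(","):
        parts = item.split()
        # Skip entries such as "score=4.1" and the "required 5" threshold.
        if len(parts) != 2 or parts[0] == "required":
            continue
        try:
            hits[parts[0]] = float(parts[1])
        except ValueError:
            continue  # e.g. "not cached"
    return hits


# Hypothetical report text, not copied from a real message.
sample = ("not spam, SpamAssassin (not cached, score=4.1, required 5, "
          "ImageCerberusPLG3 3.00, KAM_LINEPADDING 1.20, DKIM_VALID -0.10)")

if __name__ == "__main__":
    for rule, score in sorted(parse_spamreport(sample).items()):
        print(f"{rule:20s} {score:6.2f}")
```

With the hits as a dictionary you can see at a glance how much of a message's total came from the ImageCerberus rule and how much from everything else.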
-
- Posts: 25
- Joined: 09 Feb 2015 11:29
Re: ImageCerberusPLG5 high score, no?
Guys/Shawn,
Just curious, what score would you suggest for ImageCerberusPLG5? Maybe 1 point instead of 4.50?
I don't have root access, but I'll ask my sysadmin whether he can assess this: filter a few days of emails and see how many good results it has, etc. I see some very weird false positives. A very basic customer's signature in an email, with his company logo and name, hit ImageCerberusPLG5 at 4.50.
I wonder whether this works at all, and at what ratio. If anyone has any analysis, please send it.
I will also try to get more details on the ratio of correct hits to false positives.
Thanks!
Re: ImageCerberusPLG5 high score, no?
You can see my changes near the top of this thread. I've had fewer false positives since then.
-
- Posts: 25
- Joined: 09 Feb 2015 11:29
Re: ImageCerberusPLG5 high score, no?
Thanks a lot, pdwalker. Just curious, do you see it produce some legit hits?
How does it work, more or less? It analyses images, OCR-style, but trying to find patterns seems hard to do... The FPs are strange: basic company logos with people's names.