Training Bayes Script

Posted: 21 May 2019 07:53
by nicola.piazzi
A Bayesian database is very useful for fighting spam if it is trained correctly.
http://artinvoice.hu/spams/ maintains a daily archive of spam.
This script can be run from cron to train our Bayes system.


#!/bin/bash
# bayeslearn.sh
# -------------
#
# Download daily spam samples from http://artinvoice.hu/spams/ and train Bayes with them
# The script creates its own work folder if it does not exist
# Put a line like this in crontab to run it every night
#
# # Bayeslearn, train bayes spam from http://artinvoice.hu/spams/ daily
# 00 03 * * * /batch/bayeslearn.sh

# Parameters
VFOLDER=/batch/bayeslearn # Work Directory to use
VLOGFILE=/batch/bayeslearn.log # Logfile of last run

# Date & Time
NOW=$(date +"%m-%d-%Y %r")
start=$(date +%s.%N)

# Initialize log file
echo "$NOW" > "$VLOGFILE"


# Create Work Directory if not exist
if [ ! -d "$VFOLDER" ] ; then
    echo "Creating $VFOLDER" >> "$VLOGFILE"
    mkdir "$VFOLDER"
    chmod 755 "$VFOLDER"
fi

# Getting daily file in Work Directory and learning it
cd "$VFOLDER" || exit 1
spamfile_unpacked=spam--$(date '+%Y-%m-%d')
spamfile=$spamfile_unpacked.gz
sleep 5
wget "http://artinvoice.hu/spams/$spamfile" >> "$VLOGFILE" 2>&1
sleep 5
gunzip -f "$spamfile" >> "$VLOGFILE" 2>&1
sleep 5
sa-learn --mbox --spam --progress "$spamfile_unpacked" >> "$VLOGFILE" 2>&1
sleep 5
# Remove the archive if it is still present (gunzip deletes it on success)
rm -f "$spamfile"
sleep 5
/etc/init.d/spamassassin condrestart


# Logging (needs bc; on CentOS 7 install it with: yum install bc)
end=$(date +%s.%N)
runtime=$(bc <<< "$end-$start")
echo "Time elapsed: $runtime sec." >> "$VLOGFILE"
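
The script above keeps going even when the daily archive cannot be downloaded, so sa-learn may be pointed at a file that does not exist. A possible way to stop early is sketched below. It is only a sketch: fetch_and_learn is a hypothetical helper name that is not part of the script above, and it assumes the work directory already exists.

```shell
#!/bin/bash
# Sketch only: stop early when the daily archive cannot be downloaded.
# fetch_and_learn is a hypothetical helper, not part of the script above.
# The body runs in a subshell, so cd and exit do not affect the caller.
fetch_and_learn() (
    url=$1                 # full URL of the .gz archive
    workdir=$2             # existing work directory
    archive=${url##*/}     # file name part of the URL
    mbox=${archive%.gz}    # file name after gunzip

    cd "$workdir" || exit 1
    if ! wget -q -O "$archive" "$url"; then
        echo "download failed: $url" >&2
        rm -f "$archive"   # wget -O can leave an empty file behind
        exit 1
    fi
    gunzip -f "$archive" || { echo "gunzip failed: $archive" >&2; exit 1; }
    sa-learn --mbox --spam "$mbox"
)

# Possible call, using the paths from the script above:
# fetch_and_learn "http://artinvoice.hu/spams/spam--$(date '+%Y-%m-%d').gz" \
#     /batch/bayeslearn >> /batch/bayeslearn.log 2>&1
```

The function returns a non-zero status when any step fails, so the caller can skip the spamassassin restart in that case.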

Re: Training Bayes Script

Posted: 25 May 2019 15:38
by ladylinux
Hello,

Nice script thanks.

You have a "Top" command at the bottom of the script that should be "top". Also, at least on CentOS 7, you need to install the bc rpm, which was not present in my case:
Restarting spamassassin (via systemctl): [ OK ]
./bayeslearn.sh: line 48: bc: command not found
./bayeslearn.sh: line 50: Top: command not found
yum install bc
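
If installing bc is not an option, whole seconds are probably precise enough for this log line. A minimal sketch, assuming sub-second resolution is not needed (elapsed_since is a made-up helper name, not part of the original script):

```shell
#!/bin/bash
# Sketch: elapsed time in whole seconds using only shell arithmetic (no bc).
# elapsed_since is a hypothetical helper name.
elapsed_since() {
    # $1: start time in seconds since the epoch, e.g. from: date +%s
    echo $(( $(date +%s) - $1 ))
}

start=$(date +%s)
sleep 1    # stands in for the download and sa-learn steps
echo "Time elapsed: $(elapsed_since "$start") sec."
```

This drops the %N nanoseconds part of the original timestamps, because shell arithmetic only handles integers.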

Thanks!!

Frannie