
Learn Spam Mails from untroubled.org

Posted: 23 Jan 2018 10:42
by benscha
Hi Guys

For a long time I have been training my Bayes DB with the mails from http://untroubled.org/spam/. There are a lot of text mails and mails with attachments to train on.

So that I don't have to do this manually, I created some scripts that run every night via a cron job.

You need the following folder structure for the scripts to run, or you need to adjust the paths in the scripts:

Code:

home/
└── root/
    └── scripts/
        └── learn/
            ├── ham/
            └── spam/

You also need 7zip installed on your system, because the spam mails on the website are provided in a compressed format.
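
The folder layout and the 7zip requirement can be checked in one go. This is only a minimal sketch: the `LEARN_BASE` variable and the function names `setup_learn_dirs` and `check_7za` are made up for illustration and are not part of the original scripts.

```sh
# LEARN_BASE is a hypothetical override; the default matches the paths used in the scripts.
LEARN_BASE="${LEARN_BASE:-/home/root/scripts/learn}"

setup_learn_dirs() {
    # -p creates missing parent directories and does not fail if they already exist.
    mkdir -p "$LEARN_BASE/ham" "$LEARN_BASE/spam"
}

check_7za() {
    # The scripts call /usr/bin/7z/7za; adjust this path if 7zip lives elsewhere.
    if [ -x /usr/bin/7z/7za ] || command -v 7za >/dev/null 2>&1; then
        echo "7za found"
    else
        echo "7za not found, please install 7zip" >&2
        return 1
    fi
}
```

Running `setup_learn_dirs && check_7za` once before the first cron run should be enough.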

First Part - Download and train Spam File "download-spam"

Code:

#!/bin/sh
# Download this month's spam archive, extract it, and train the Bayes DB.
cd /home/root/scripts/learn/spam || exit 1
filename=$(date '+%Y-%m').7z   # e.g. 2018-01.7z
foldername=$(date '+%m')       # e.g. 01
wget "http://untroubled.org/spam/$filename"
echo "Extracting Files"
/usr/bin/7z/7za e "/home/root/scripts/learn/spam/$filename" -o/home/root/scripts/learn/spam/
sleep 10
rm "/home/root/scripts/learn/spam/$filename"
# Train on the extracted mails, then remove the month folder.
/home/root/scripts/spam-learn
rm -r "/home/root/scripts/learn/spam/$foldername"
# Fetch and train the attachment samples, then clean up leftover files.
/home/root/scripts/download-spam-attachments
rm -f -- *.orig *.txt
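
The script above always fetches the archive for the current month. As a sketch of how the same `YYYY-MM.7z` naming could be reused for another month, the helper below builds the download URL. The function name `archive_url` is hypothetical, and it assumes the site keeps older monthly archives under the same naming.

```sh
# Build the archive URL for a given year and month, following the
# YYYY-MM.7z naming used by the download script.
archive_url() {
    year=$1
    month=${2#0}   # strip one leading zero so printf does not read "08" as octal
    printf 'http://untroubled.org/spam/%s-%02d.7z\n' "$year" "$month"
}

# Example: archive_url 2018 1
# prints http://untroubled.org/spam/2018-01.7z
```

The result can be passed to wget in place of the date-based filename.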

Second Part - Download and train Spam Attachments "download-spam-attachments"

Code:

#!/bin/sh
# Download the attachment samples (*.orig) and train the Bayes DB.
wget -r -l1 -A.orig http://untroubled.org/spam/attachments/ -P /home/root/scripts/learn/spam
echo "Moving spam"
mv /home/root/scripts/learn/spam/untroubled.org/spam/attachments/*.orig /home/root/scripts/learn/spam/
echo "Waiting 5 seconds until the spam has been moved"
sleep 5
echo "Deleting folder"
rm -r /home/root/scripts/learn/spam/untroubled.org/
echo "Learning spam"
cd /usr/bin || exit 1
./sa-learn --spam /home/root/scripts/learn/spam --progress
./sa-learn --ham /home/root/scripts/learn/ham --progress
# Remove the trained mails so they are not learned again on the next run.
rm /home/root/scripts/learn/spam/*
rm /home/root/scripts/learn/ham/*

I start the scripts from a cron job, and this way I train a lot of new spam.
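
For reference, a nightly cron entry for the first script might look like the sketch below. The 02:30 start time and the helper name `print_new_crontab` are assumptions for illustration; only the script path comes from the post.

```sh
# Hypothetical crontab line: run download-spam every night at 02:30.
CRON_LINE='30 2 * * * /home/root/scripts/download-spam'

# Print the existing crontab (if any) followed by the new line, without
# installing anything. An identical existing line is dropped first so the
# entry is not duplicated.
print_new_crontab() {
    crontab -l 2>/dev/null | grep -vxF -- "$CRON_LINE" || true
    printf '%s\n' "$CRON_LINE"
}

# To install it, review the output and then run:
#   print_new_crontab | crontab -
```

Since download-spam calls spam-learn and download-spam-attachments itself, one entry should be enough.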


cheers

Re: Learn Spam Mails from untroubled.org

Posted: 19 Mar 2018 20:06
by genotix
Hi Buddy,

Your initiative looks pretty good, but while setting up your script I noticed that I'm missing the /home/root/scripts/spam-learn script.

Would you be so kind as to provide it too?

Re: Learn Spam Mails from untroubled.org

Posted: 19 Mar 2018 20:40
by genotix
OK, I did some googling and I think this does the trick for spam-learn:

Code:

#!/bin/sh
# The site doesn't provide ham I think so just disable it for now.
#nice -n 19 sa-learn --progress  --no-sync --ham  /home/root/scripts/learn/ham/* 
nice -n 19 sa-learn --progress  --no-sync --spam /home/root/scripts/learn/spam/*
nice -n 19 sa-learn --progress  --sync

Re: Learn Spam Mails from untroubled.org

Posted: 20 Mar 2018 10:32
by benscha
Hi genotix

Nice that someone can use my script. :dance:

My spam-learn script looks similar to yours.

Code:

cd /usr/bin
./sa-learn --spam /home/root/scripts/learn/spam --progress 
rm /home/root/scripts/learn/spam/*
echo '*** fight the spam ***'

Re: Learn Spam Mails from untroubled.org

Posted: 25 Mar 2018 07:25
by genotix
benscha wrote: 20 Mar 2018 10:32
Hi genotix

Nice that someone can use my script. :dance:

My spam-learn script looks similar to yours.

Hmm, I didn't get notified by the forum about your message.
Thanks for your reply!
Thanks to your post I have now trained the Bayes filter.
It has blocked quite a lot of spam since then.
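
To see how many messages the Bayes DB has learned, `sa-learn --dump magic` prints the database counters. Below is a small sketch that extracts the spam and ham counts from that output. The helper name `bayes_counts` is made up for this example, and the parsing assumes the usual `non-token data: nspam` and `non-token data: nham` lines with the count in the third column.

```sh
# Read `sa-learn --dump magic` output on stdin and print the learned counts.
bayes_counts() {
    awk '/non-token data: nspam$/ { print "spam: " $3 }
         /non-token data: nham$/  { print "ham: " $3 }'
}

# Usage: sa-learn --dump magic | bayes_counts
```

With the default settings SpamAssassin only uses Bayes once at least 200 spam and 200 ham messages have been learned, so the ham count matters as well.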