while tweaking the spam filters on our mail server i finally took the step of adding a cron job that learns from the inbox and junk folders of each user. as we are using spamassassin as part of our spam defense, this basically involves a couple of invocations of sa-learn to
- learn the “ham” from each user's inbox folder
- learn the “spam” from each user's junk folder
below is the shell script that gets invoked by cron once a day:
#!/bin/bash
echo "updating spamassassin bayesian spam/ham filter"
echo
for userDir in /home/*; do
    user=$(basename "$userDir")
    # brace expansion is not performed in a plain assignment, so collect
    # the cur and new folders in arrays instead
    ham=("$userDir"/Maildir/{cur,new})
    spam=("$userDir"/Maildir/.Junk/{cur,new})
    echo " learning from $user"
    echo " spam: ${spam[*]}"
    /usr/bin/sa-learn --no-sync --spam "${spam[@]}" | while read line; do
        echo " $line"
    done
    echo " ham: ${ham[*]}"
    /usr/bin/sa-learn --no-sync --ham "${ham[@]}" | while read line; do
        echo " $line"
    done
    echo
done
echo "syncing:"
/usr/bin/sa-learn --sync | while read line; do
    echo " $line"
done
echo
echo "stats:"
/usr/bin/sa-learn --dump magic | while read line; do
    echo " $line"
done
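a note on the {cur,new} paths: bash only expands braces when they appear literally in a command or an array assignment, not inside a plain variable assignment and not when an unquoted variable is expanded later. the small sketch below shows the difference; the directory /home/alice/Maildir is a made-up example and does not need to exist, since the paths are only handled as strings.

```bash
#!/bin/bash
# hypothetical directory, only used as a string here; nothing has to exist
base=/home/alice/Maildir

# plain assignment: the braces are stored literally
literal=$base/{cur,new}

# array assignment: the braces expand into one element per folder
folders=("$base"/{cur,new})

echo "$literal"
printf '%s\n' "${folders[@]}"
```

the first line of output still contains the braces, while the array holds the cur and new paths as two separate elements that can be handed to a command as "${folders[@]}".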
the while read line; do ... done bits are there so that i can nicely indent the output of sa-learn.
works rather nicely.
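if the repeated loops get tiresome, they could probably be pulled into a small helper. this is only a minimal sketch: the function name indent is my own invention and not part of the script above, and it uses IFS= and read -r so that whitespace and backslashes in the output survive untouched.

```bash
#!/bin/bash
# hypothetical helper, not part of the original script
indent() {
    # IFS= keeps leading whitespace, -r keeps backslashes as they are
    while IFS= read -r line; do
        printf ' %s\n' "$line"
    done
}

# usage: any command's output can be piped through it
printf 'first line\n  second line\n' | indent
```

with that in place each call would shrink to something like /usr/bin/sa-learn --sync | indent.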