Re: Improving my EFA Performance

Posted: 01 Apr 2015 17:04
by shawniverson
Try running sa-learn with --no-sync and report back please.

Re: Improving my EFA Performance

Posted: 02 Apr 2015 04:47
by pdwalker
Agreed it is too slow. Don't know why. Still working on it.

Re: Improving my EFA Performance

Posted: 02 Apr 2015 04:51
by pdwalker
shawniverson wrote:Try running sa-learn with --no-sync and report back please.
like so?

Code: Select all

sa-learn --no-sync --{ham|spam} --file /var/spool/MailScanner/quarantine/<date>/{spam|nonspam}/<queuefile>

Re: Improving my EFA Performance

Posted: 02 Apr 2015 05:34
by pdwalker
[edit: ignore this]
Is there an address I can send the results to? A lot of this information cannot be posted in public form, and there is too much to sanitize.

Re: Improving my EFA Performance

Posted: 02 Apr 2015 08:45
by pdwalker
I made the following change in /var/www/html/mailscanner/functions.php, just to get an idea of how long the UI was taking to train a message:

Code: Select all

diff -c functions.php functions.php.org
*** functions.php    2015-04-02 15:59:20.040362566 +0800
--- functions.php.org    2015-04-02 16:00:15.688371543 +0800
***************
*** 2760,2766 ****
      $status = array();
      if (!$rpc_only && is_local($list[0]['host'])) {
          foreach ($num as $key => $val) {
-             audit_log('learning started on message ' . $list[$val]['msgid'] . ' as ' . $type);
              $use_spamassassin = false;
              switch ($type) {
                  case "ham":
--- 2760,2765 ---- 
Here are the results from the audit log

Code: Select all

02/04/15 15:58:11	<username>	<ipaddr>	SpamAssassin was trained on message 52BAB181625.AF91B as ham
02/04/15 15:56:32	<username>	<ipaddr>	learning started on message 52BAB181625.AF91B as ham
That's 1m 39s to train a single message from the UI.
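
For anyone repeating the measurement, the elapsed time can be worked out from the two audit-log timestamps instead of by eye. A minimal sketch, assuming GNU date (for the -d option) and that the dd/mm/yy dates from the log are rewritten by hand as ISO dates; the function name is just for illustration:

```shell
# Elapsed time between two timestamps, using GNU date's -d option.
# $1 = start, $2 = end. -u avoids any local timezone surprises.
elapsed_between() {
    secs=$(( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ))
    printf '%dm %ds\n' $(( secs / 60 )) $(( secs % 60 ))
}

# The two audit-log entries above, rewritten as ISO dates.
elapsed_between "2015-04-02 15:56:32" "2015-04-02 15:58:11"   # prints: 1m 39s
```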

Also, while the web interface is "learning" the message, the entire mailscanner website becomes completely unresponsive until the learn completes. Even if I have another browser session open on another page (say the reports page for example), that session is also unresponsive. Apache itself is still responsive, but everything under /mailscanner/ is not.

After enabling Apache's server-status and viewing that page, I can see that all of the requests are waiting on Apache to return.

It's almost as if the PHP engine has become a single thread/single process, and everything waits for PHP to complete the exec call.

Re: Improving my EFA Performance

Posted: 02 Apr 2015 08:56
by pdwalker
Times without and with --no-sync:

Code: Select all

# su - apache
-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --ham --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m7.466s
user  0m2.987s
sys   0m0.143s

-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --spam --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m18.832s
user  0m4.140s
sys   0m0.140s

-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --ham --no-sync --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m42.449s
user  0m3.865s
sys   0m0.160s

-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --spam --no-sync --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m6.208s
user  0m2.833s
sys   0m0.118s
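So --no-sync makes no consistent difference. Each run uses only 3 to 4 seconds of CPU time but takes over a minute of wall-clock time, so sa-learn spends nearly all of its time waiting. Calling mysql appears to be the bottleneck. Not sure why, as the db is running from cache in memory, not from disk.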
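
To put a number on how much of each run is spent waiting rather than computing, the real/user/sys figures from `time` can be compared directly. This is only a sketch: the helper names are made up for illustration, and "waiting" here simply means wall-clock time not accounted for by user + sys CPU time.

```shell
# Convert a `time`-style value such as "1m7.466s" to seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, parts, "m")          # parts[1] = minutes, parts[2] = "7.466s"
        sub(/s$/, "", parts[2])
        printf "%.3f\n", parts[1] * 60 + parts[2]
    }'
}

# Percentage of wall-clock time not spent in user or sys CPU time.
# $1 = real, $2 = user, $3 = sys
wait_pct() {
    awk -v r="$(to_seconds "$1")" -v u="$(to_seconds "$2")" -v s="$(to_seconds "$3")" \
        'BEGIN { printf "%.1f\n", 100 * (r - u - s) / r }'
}

# First run above: real 1m7.466s, user 0m2.987s, sys 0m0.143s
wait_pct 1m7.466s 0m2.987s 0m0.143s   # prints: 95.4
```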

Still looking.

[edit]: the apache user was given a shell to use only for the duration of this test.

Re: Improving my EFA Performance - benchmarking mysql

Posted: 02 Apr 2015 11:16
by pdwalker
I've used sysbench to do some performance testing on mysql.

Short version:
- MySQL performance inside my efa kvm is 1/7th the speed of the mysql performance on another older physical machine.
- io wait % in the physical machine (and vm host machine) is very low.
- io wait % inside the vm is very high.

Something is killing the disk performance for me in KVM.
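
For anyone wanting to compare io wait on their own host and guest, the wa column of vmstat shows it, or it can be computed from /proc/stat. A rough sketch, assuming a Linux /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal after the "cpu" label); the function name is my own:

```shell
# iowait percentage between two snapshots of the aggregate "cpu" line
# of /proc/stat. Only the first eight counters are summed, because the
# guest counters that follow are already included in user and nice.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9 && i <= NF; i++) total1 += $i; wait1 = $6 }
        NR == 2 { for (i = 2; i <= 9 && i <= NF; i++) total2 += $i; wait2 = $6
                  d = total2 - total1
                  if (d > 0) printf "%.1f\n", 100 * (wait2 - wait1) / d }'
}

# Live use on Linux: take a snapshot, wait a second, take another.
if [ -r /proc/stat ]; then
    before=$(head -n 1 /proc/stat)
    sleep 1
    after=$(head -n 1 /proc/stat)
    iowait_pct "$before" "$after"
fi
```

Run it inside the vm and on the host while the benchmark is going; a longer sleep gives a steadier figure.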

Re: Improving my EFA Performance

Posted: 02 Apr 2015 11:58
by pdwalker
ok, I've gone from 1/7 the speed to 1/2 the speed (vm vs physical machine mysql performance).

I have two arrays in the vm host: a 2-drive raid1 array and a 6-drive raid6 array. The raid6 array has faster read/write performance than the raid1 array. The raw vm image for efa was on the slower raid1 array. Moving it to the faster array gave me a boost.

I still see high io wait though and that shouldn't be.

still looking.

Re: Improving my EFA Performance

Posted: 02 Apr 2015 12:49
by pdwalker
I'm going to call it a day on this one.

Summary:*
  • the efa kvm suffers from some disk performance issues, especially as processes spend a lot of time waiting for i/o, even when the host is not busy. Searching returns quite a lot on this issue over the years, but I could find no concrete solutions.
  • using myisam or innodb tables made no appreciable difference
  • switching from mysql 5.1 to percona server 5.6 didn't make an appreciable difference (well, myisam got a bit slower)
  • setting elevator=noop (recommended for VMs) on the kernel line in /etc/grub.conf didn't make an appreciable difference
  • setting the kvm disk settings to use the virtio driver, raw disk image preallocated, and io='direct' helped slightly.
  • moving the raw disk image to a faster drive array helped considerably; up to roughly half to a third of the native hardware speed, which is fast enough that I'm happy with the results.
* in my case with my hardware, not necessarily yours.
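
To double-check which disk driver settings a guest is actually running with, the driver elements can be pulled out of the libvirt domain XML. A small sketch only: "efa" is a hypothetical domain name, the helper name is made up, and the sample XML below is illustrative, not copied from my machine.

```shell
# Print the <driver .../> elements from libvirt domain XML read on stdin,
# so the type and cache attributes of each disk can be checked at a glance.
# (Network interfaces can carry a <driver> element too; those show up as well.)
show_disk_drivers() {
    grep -o "<driver [^>]*>"
}

# Usage on the KVM host; "efa" is a hypothetical domain name:
#   virsh dumpxml efa | show_disk_drivers

# Self-contained demonstration with a sample fragment (illustrative values):
show_disk_drivers <<'EOF'
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/path/to/image.img'/>
  <target dev='vda' bus='virtio'/>
</disk>
EOF
```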

The only other thing that I can think of is to move the database onto a real, physical machine to avoid the kvm disk io issue completely.

Re: Improving my EFA Performance

Posted: 02 Apr 2015 18:44
by darky83
Ah, you are using KVM, not vmware or hyper-v. That explains why I can't replicate it :).

Just wondering, what kind of disk speeds do you get with different block sizes?

Code: Select all

$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.57153 s, 301 MB/s
and using 10k:

Code: Select all

$ dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 0.77117 s, 1.4 GB/s
and using 1k:

Code: Select all

$ dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 1.46977 s, 713 MB/s
And using urandom (giving me 100% CPU usage on 1 core):

Code: Select all

$ dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 93.2176 s, 11.5 MB/s
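
One caveat with tests like these: a plain dd from /dev/zero can return before the data has actually reached the disk, so part of what it reports may be page-cache speed. A hedged sketch, assuming GNU dd, that adds conv=fdatasync so the final flush is included in the timing. The function name and the small default sizes are my own choices, not anything used elsewhere in this thread.

```shell
# Write test that includes the final flush to disk in the timing.
# Assumes GNU dd (conv=fdatasync).
# $1 = directory to write in, $2 = block size, $3 = block count
bench_write() {
    # dd reports its statistics on stderr; keep only the summary line.
    dd if=/dev/zero of="$1/ddtest.$$" bs="$2" count="$3" conv=fdatasync 2>&1 | tail -n 1
    rm -f "$1/ddtest.$$"
}

# Small runs by default (about 16 MB per block size) so it is quick to try;
# raise the counts to match the 1 GB tests in this thread.
for spec in "1M 16" "10k 1600" "1k 16384"; do
    set -- $spec
    printf 'bs=%-4s ' "$1"
    bench_write "${TMPDIR:-/tmp}" "$1" "$2"
done
```

Point the directory argument at the filesystem you actually want to measure; /tmp is only a default for trying it out.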

Re: Improving my EFA Performance

Posted: 02 Apr 2015 19:06
by pdwalker
Yeah, it seems to be an issue in KVM that's been known about for years. It's fine if the disk io is not too high, but when it does get high, the vm bogs down.

Here are the results from the efa vm itself

Code: Select all

[itsupport@efa tmp]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 12.7931 s, 83.9 MB/s

[itsupport@efa tmp]$ dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 7.64498 s, 137 MB/s

[itsupport@efa tmp]$ dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 10.9982 s, 95.3 MB/s

[itsupport@efa tmp]$ dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 175.187 s, 6.1 MB/s
and here are the results from the host machine, same drive array as the vm

Code: Select all

[root@kvm1 disk2]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.47506 s, 728 MB/s

[root@kvm1 disk2]# dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 4.97657 s, 211 MB/s

[root@kvm1 disk2]# dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 7.76032 s, 135 MB/s

[root@kvm1 disk2]# dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 166.427 s, 6.5 MB/s
and on my "slow" drive where I had the efa vm originally

Code: Select all

[root@kvm1 tmp]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.43541 s, 748 MB/s

[root@kvm1 tmp]# dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 14.0923 s, 74.4 MB/s

[root@kvm1 tmp]# dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 15.1012 s, 69.4 MB/s

[root@kvm1 tmp]# dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 171.508 s, 6.3 MB/s
Is your machine using flash drives, or spinning metal?

Re: Improving my EFA Performance

Posted: 02 Apr 2015 19:11
by darky83
I only use flash drives. Performance sometimes drops when multiple machines use high IO, but even then it's faster than what you are seeing.

Code: Select all

[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.56707 s, 685 MB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.73333 s, 393 MB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.727354 s, 1.5 GB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.704008 s, 1.5 GB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.745681 s, 1.4 GB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.57086 s, 418 MB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.32075 s, 813 MB/s

Re: Improving my EFA Performance

Posted: 03 Apr 2015 00:29
by shawniverson
You have me beat ;)

Code: Select all

[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.91086 s, 562 MB/s
[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.5845 s, 300 MB/s
[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.30461 s, 249 MB/s
[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.09217 s, 262 MB/s
I have a hybrid. Easy on the gas but definitely not a hot rod :lol:

Re: Improving my EFA Performance

Posted: 03 Apr 2015 06:25
by pdwalker
You guys sure do know how to make a guy feel inadequate. Thanks!

As for the original poster, Toddh, what version of the hypervisor are you running?

Re: Improving my EFA Performance

Posted: 04 Apr 2015 21:15
by mitgib
If you are using KVM and virtio, try this. If you get an improvement, add it to /etc/rc.local:

Code: Select all

echo deadline > /sys/block/vda/queue/scheduler
If you are not using virtio, why not? But change vda to sda in the above.

Just to add another datapoint, the node is 4 disk raid10 with 30 other containers running

Code: Select all

[root@hormel ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.29689 s, 250 MB/s
[root@hormel ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 5.09777 s, 211 MB/s
[root@hormel ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.78092 s, 225 MB/s
[root@hormel ~]#

Re: Improving my EFA Performance

Posted: 04 Apr 2015 21:19
by pdwalker
Nope, already using the virtio drivers and have the scheduler set. That's not it.

Also, you're better off using the deadline scheduler on the host, and the noop scheduler in the vm.
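
A quick way to confirm which scheduler is really in effect on the host and in each vm: the active one is the bracketed entry in /sys/block/<dev>/queue/scheduler. A minimal sketch; the function name is just for illustration, and devices that expose no bracketed entry are printed with an empty value.

```shell
# Extract the active I/O scheduler from a line such as "noop [deadline] cfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    printf '%s: %s\n' "${dev%%/*}" "$(active_scheduler "$(cat "$f")")"
done
```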

Re: Improving my EFA Performance

Posted: 04 Apr 2015 21:34
by mitgib
pdwalker wrote:Nope, already using the virtio drivers and have the scheduler set. That's not it.

Also, you're better off using the deadline scheduler on the host, and the noop scheduler in the vm.
I am not noticing a difference; I tried noop and deadline in the past, on about 40 nodes. I'd be hard pressed to find it, but I read some IBM case study and went with it. Also, how is your disk cache set for your container? And my final thought is, how is the VM set up? LVM or qcow2? I have noticed much better IO with cache set to none in libvirt while using qcow2 images.

Re: Improving my EFA Performance

Posted: 06 Apr 2015 16:32
by pdwalker
Cache is set to none, and the image is preallocated raw.

I got better performance with raw over qcow2 when I tested the vm performance 2 years ago. I haven't tested it since.

Underlying file system is ext4 over lvm which does reduce my overall performance.

Using the deadline (host) / noop (vms) schedulers reduced the disk io contention between vms considerably.

Re: Improving my EFA Performance

Posted: 06 Apr 2015 18:47
by toddh
We are running on Hyper-V 2012 R2.

I have 2 EFA boxes running and mail is split between them. 1st is on a Hyper-V 2012 R2 Cluster with attached SAN via 40Gb infiniband. 2nd is on SSDs installed in the Hyper-V server. The 2nd EFA on SSDs consistently runs 20 - 25% lower CPU on 15 min average.

FYI, I have the SQLGrey db broken out onto a separate MySQL server, leaving the local MySQL free to handle SA.

Re: Improving my EFA Performance

Posted: 06 Apr 2015 19:16
by pdwalker
And running some of the dd speed timings above gives you...? (Use the ssd one. Let's focus on that, since that machine/vm combo should be absolutely smoking.)

Re: Improving my EFA Performance

Posted: 23 Apr 2015 03:02
by cdburgess75
Do you have greylisting on? If so, ensure there is no other relay between EFA and the sending email servers.

I once used the postfix forwarder on pfsense (an add-on) in front of EFA and had a resource problem similar to the one you are describing. The sqlgrey and postfix forwarder get very confused, and it breaks greylisting too :)

Re: Improving my EFA Performance

Posted: 23 Apr 2015 03:21
by pdwalker
It sounds like you had an entirely different problem, perhaps related to some kind of misconfiguration.

I use EFA as the mail gateway, so there is nothing after EFA except the receiving mail servers.

Greylisting has caused me no serious issues once I had everything set up correctly, and it has reduced the amount of crap I receive.

Re: Improving my EFA Performance

Posted: 23 Apr 2015 09:25
by darky83
cdburgess75 wrote:Once I used postfix forwarder on pfsense (add on) in front of Efa and I had a similar resource problem you are describing. The sqlgrey and postfix forwarder get very confused, it breaks greylisting too :)
That is expected behaviour. If you place something in front of EFA, like a pfsense forwarder, load balancer etc., then all systems think that all the mail you are receiving comes from one single host. This makes a bunch of things in EFA useless (RBL checking, greylisting, razor, pyzor etc.), making your spam filter less effective.

You will also run into problems such as postfix rate limiting the number of connections from the forwarding box, and reverse DNS lookups for the sending mail server failing, which causes your spam score to always be much higher.
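
A quick way to see whether this is happening is to count connections per client IP in the postfix log: if nearly everything comes from one address, something in front of EFA is hiding the real senders. A rough sketch, assuming the usual postfix smtpd "connect from host[ip]" log lines. The log path in the usage comment is an assumption and differs per distribution, and the sample lines use documentation addresses, not real data.

```shell
# Count SMTP connections per client IP from postfix smtpd log lines on stdin.
# Matches lines such as:
#   ... postfix/smtpd[1234]: connect from host.example[192.0.2.10]
connections_per_client() {
    sed -n 's/.*postfix\/smtpd\[[0-9]*]: connect from [^[]*\[\([^]]*\)].*/\1/p' |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Usage on the EFA box (log path is an assumption):
#   connections_per_client < /var/log/maillog

# Self-contained demonstration with made-up sample lines:
connections_per_client <<'EOF'
Apr 23 03:00:01 efa postfix/smtpd[100]: connect from fw.example[192.0.2.1]
Apr 23 03:00:02 efa postfix/smtpd[101]: connect from fw.example[192.0.2.1]
Apr 23 03:00:03 efa postfix/smtpd[102]: connect from unknown[198.51.100.7]
EOF
```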