Improving my EFA Performance

Questions and answers about how to do stuff
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: Improving my EFA Performance

Post by shawniverson »

Try running sa-learn with --no-sync and report back please.
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

Agreed it is too slow. Don't know why. Still working on it.
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

shawniverson wrote: Try running sa-learn with --no-sync and report back please.
like so?

Code:

sa-learn --no-sync --{ham|spam} --file /var/spool/MailScanner/quarantine/<date>/{spam|nonspam}/<queuefile>
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

[edit: ignore this]
Is there an address I can send the results to? A lot of this information cannot be posted in public form, and there is too much to sanitize.
Last edited by pdwalker on 02 Apr 2015 08:45, edited 1 time in total.
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

I made the following change in /var/www/html/mailscanner/functions.php, adding an audit_log call just to get an idea of how long the UI was taking to train a message:

Code:

diff -c functions.php functions.php.org
*** functions.php    2015-04-02 15:59:20.040362566 +0800
--- functions.php.org    2015-04-02 16:00:15.688371543 +0800
***************
*** 2760,2766 ****
      $status = array();
      if (!$rpc_only && is_local($list[0]['host'])) {
          foreach ($num as $key => $val) {
-             audit_log('learning started on message ' . $list[$val]['msgid'] . ' as ' . $type);
              $use_spamassassin = false;
              switch ($type) {
                  case "ham":
--- 2760,2765 ---- 
Here are the results from the audit log

Code:

02/04/15 15:58:11	<username>	<ipaddr>	SpamAssassin was trained on message 52BAB181625.AF91B as ham
02/04/15 15:56:32	<username>	<ipaddr>	learning started on message 52BAB181625.AF91B as ham
That's 1m 39s to train a single message from the UI.
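As a quick check of that figure, the gap between the two audit log timestamps can be worked out in the shell. This is only a sketch: elapsed_seconds is a made-up helper name, not something from eFa or MailWatch, and it assumes both timestamps fall on the same day.

```shell
# Hypothetical helper: seconds between two HH:MM:SS timestamps on the same day.
elapsed_seconds() {
    awk -v start="$1" -v end="$2" 'BEGIN {
        split(start, s, ":")
        split(end, e, ":")
        print (e[1] * 3600 + e[2] * 60 + e[3]) - (s[1] * 3600 + s[2] * 60 + s[3])
    }'
}

# "learning started" and "was trained" times from the audit log above
elapsed_seconds 15:56:32 15:58:11   # prints 99, i.e. 1m 39s
```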

Also, while the web interface is "learning" the message, the entire mailscanner website becomes completely unresponsive until the learn completes. Even if I have another browser session open on another page (say the reports page for example), that session is also unresponsive. Apache itself is still responsive, but everything under /mailscanner/ is not.

After enabling Apache's server-status and viewing that page, I can see that all the requests are waiting on Apache to return.

It's almost like the php engine has become a single thread/single process and everything waits for php to complete the exec call.
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

times without and with --no-sync

Code:

# su - apache
-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --ham --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m7.466s
user  0m2.987s
sys   0m0.143s

-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --spam --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m18.832s
user  0m4.140s
sys   0m0.140s

-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --ham --no-sync --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m42.449s
user  0m3.865s
sys   0m0.160s

-bash-4.1$ time /usr/local/bin/sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --spam --no-sync --file /var/spool/MailScanner/quarantine/20150402/nonspam/07A1B181E17.A009C
Learned tokens from 1 message(s) (1 message(s) examined)

real  1m6.208s
user  0m2.833s
sys   0m0.118s
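So --no-sync makes no real difference. Every run takes over a minute of wall-clock time but only about 3 to 4 seconds of CPU time, so the process spends most of its time waiting. Calling mysql appears to be the bottleneck. Not sure why, as the db is running from cache in memory, not from disk.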

Still looking.

[edit]: the apache user was given a shell to use only for the duration of this test.
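One way to read the timings above is to compare wall-clock time (real) with CPU time (user + sys). The sketch below does that arithmetic. The helper names to_seconds and wait_percent are made up for illustration, and the parsing assumes the XmY.YYYs format that bash's time prints.

```shell
# Hypothetical helper: convert bash `time` output such as 1m7.466s to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Hypothetical helper: percentage of wall-clock time NOT spent on the CPU,
# i.e. (real - user - sys) / real.
wait_percent() {
    awk -v r="$(to_seconds "$1")" -v u="$(to_seconds "$2")" -v s="$(to_seconds "$3")" \
        'BEGIN { printf "%.0f\n", (r - u - s) / r * 100 }'
}

# First run above: real 1m7.466s, user 0m2.987s, sys 0m0.143s
wait_percent 1m7.466s 0m2.987s 0m0.143s   # prints 95
```

A value this high only says the process was waiting on something; it does not by itself show whether that something is the database, the disk, or a lock.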
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance - benchmarking mysql

Post by pdwalker »

I've used sysbench to do some performance testing on mysql.

Short version:
- MySQL performance inside my efa kvm is 1/7th the speed of the mysql performance on another older physical machine.
- io wait % in the physical machine (and vm host machine) is very low.
- io wait % inside the vm is very high.

Something is killing the disk performance for me in KVM.
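For anyone wanting to compare io wait inside a VM and on its host without extra tools, here is a rough sketch that derives the percentage from two samples of the first line of /proc/stat. It assumes the usual Linux field order (user, nice, system, idle, iowait, ...). The function names iowait_percent and sample_iowait are invented for this example.

```shell
# Hypothetical helper: iowait share (percent) between two aggregate "cpu" lines
# taken from /proc/stat at different times.
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user, nice, system, idle, iowait, irq, softirq, steal (fields 2-9).
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            w[NR] = $6   # iowait is the fifth counter, so field 6 after "cpu"
        }
        END {
            if (t[2] > t[1]) printf "%.1f\n", (w[2] - w[1]) / (t[2] - t[1]) * 100
        }'
}

# Hypothetical helper: sample /proc/stat twice, N seconds apart (default 5).
sample_iowait() {
    before=$(head -n 1 /proc/stat)
    sleep "${1:-5}"
    after=$(head -n 1 /proc/stat)
    iowait_percent "$before" "$after"
}

# Run while the benchmark is active, once in the VM and once on the host:
# sample_iowait 10
```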
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

ok, I've gone from 1/7 the speed to 1/2 the speed (vm vs physical machine mysql performance).

I have two arrays in the vm host: a 2-drive raid1 array and a 6-drive raid6 array. The raid6 array has faster read/write performance than the raid1 array. The raw vm image for efa was on the slower raid1 array. Moving it to the faster array gave me a boost.

I still see high io wait though and that shouldn't be.

still looking.
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

I'm going to call it a day on this one.

Summary:*
  • the efa kvm suffers from some disk performance issues, especially as processes spend a lot of time waiting for i/o, even when the host is not busy. Searching returns quite a lot on this issue over the years, but I could find no concrete solutions.
  • using myisam or innodb tables made no appreciable difference
  • switching from mysql 5.1 to percona server 5.6 didn't make an appreciable difference (well, myisam got a bit slower)
  • setting elevator=noop (recommended for VMs) on the kernel line in /etc/grub.conf didn't make an appreciable difference
  • setting the kvm disk settings to use the virtio driver, raw disk image preallocated, and io='direct' helped slightly.
  • moving the raw disk image to a faster drive array helped considerably; up to roughly a third to a half of the native hardware speed, fast enough that I'm happy with the results.
* in my case with my hardware, not necessarily yours.

The only other thing that I can think of is to move the database onto a real, physical machine to avoid the kvm disk io issue completely.

darky83
Site Admin
Posts: 540
Joined: 30 Sep 2012 11:03
Location: eFa

Re: Improving my EFA Performance

Post by darky83 »

Ah, you are using KVM, not vmware or hyper-v; that explains why I can't replicate it :).

just wondering what kind of disk speeds do you get with different block sizes?

Code:

$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.57153 s, 301 MB/s
and using 10k:

Code:

$ dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 0.77117 s, 1.4 GB/s
and using 1k:

Code:

$ dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 1.46977 s, 713 MB/s
And using urandom (giving me 100% CPU usage on 1 core):

Code:

$ dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 93.2176 s, 11.5 MB/s
Version eFa 4.x now available!
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

Yeah, it seems to be an issue in KVM that's been known about for years. It's fine if the disk io is not too high, but when it does get high, the vm bogs down.

Here are the results from the efa vm itself

Code:

[itsupport@efa tmp]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 12.7931 s, 83.9 MB/s

[itsupport@efa tmp]$ dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 7.64498 s, 137 MB/s

[itsupport@efa tmp]$ dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 10.9982 s, 95.3 MB/s

[itsupport@efa tmp]$ dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 175.187 s, 6.1 MB/s
and here are the results from the host machine, same drive array as the vm

Code:

[root@kvm1 disk2]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.47506 s, 728 MB/s

[root@kvm1 disk2]# dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 4.97657 s, 211 MB/s

[root@kvm1 disk2]# dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 7.76032 s, 135 MB/s

[root@kvm1 disk2]# dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 166.427 s, 6.5 MB/s
and on my "slow" drive where I had the efa vm originally

Code:

[root@kvm1 tmp]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.43541 s, 748 MB/s

[root@kvm1 tmp]# dd if=/dev/zero of=1000MB.bin bs=10k count=102400
102400+0 records in
102400+0 records out
1048576000 bytes (1.0 GB) copied, 14.0923 s, 74.4 MB/s

[root@kvm1 tmp]# dd if=/dev/zero of=1000MB.bin bs=1k count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 15.1012 s, 69.4 MB/s

[root@kvm1 tmp]# dd if=/dev/urandom of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 171.508 s, 6.3 MB/s
Is your machine using flash drives, or spinning metal?
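A caveat on the dd figures above: without a sync flag, dd may report as soon as the data sits in the page cache, so buffered runs can partly measure memory rather than disk. Below is a minimal sketch of a variant that flushes to disk before reporting. It assumes GNU dd (for conv=fdatasync). The function name write_bench and the ddtest file name are made up for illustration.

```shell
# Hypothetical helper: write COUNT blocks of size BS into DIR, flush to disk,
# print the summary line from dd, then delete the test file.
write_bench() {
    dir=$1
    bs=$2
    count=$3
    target="$dir/ddtest.$$.bin"
    # conv=fdatasync makes dd flush the file data to disk before it reports,
    # so the page cache does not inflate the throughput figure.
    dd if=/dev/zero of="$target" bs="$bs" count="$count" conv=fdatasync 2>&1 | tail -n 1
    rm -f "$target"
}

# Same sizes as the runs in this thread, here written to the current directory:
# write_bench . 1M 1024
# write_bench . 10k 102400
# write_bench . 1k 1024000
```

Numbers from this variant are not directly comparable with the buffered results posted in the thread.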
darky83
Site Admin
Posts: 540
Joined: 30 Sep 2012 11:03
Location: eFa

Re: Improving my EFA Performance

Post by darky83 »

I only use flash drives. Performance sometimes drops when multiple machines use high IO, but even then it's faster than what you are seeing.

Code:

[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.56707 s, 685 MB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.73333 s, 393 MB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.727354 s, 1.5 GB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.704008 s, 1.5 GB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.745681 s, 1.4 GB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.57086 s, 418 MB/s
[root@splitter ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.32075 s, 813 MB/s
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: Improving my EFA Performance

Post by shawniverson »

You have me beat ;)

Code:

[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.91086 s, 562 MB/s
[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.5845 s, 300 MB/s
[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.30461 s, 249 MB/s
[postmaster@efa ~]$ dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.09217 s, 262 MB/s
I have a hybrid. Easy on the gas but definitely not a hot rod :lol:
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

You guys sure do know how to make a guy feel inadequate. Thanks!

As for the original poster, Toddh, what version of the hypervisor are you running?
mitgib
Posts: 9
Joined: 27 Mar 2015 14:28
Location: Rock Hill SC

Re: Improving my EFA Performance

Post by mitgib »

If you are using KVM and virtio, try this; if you get an improvement, add it to /etc/rc.local:

Code:

echo deadline > /sys/block/vda/queue/scheduler
If you are not using virtio, why not? But change vda to sda in the above.
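Before changing anything, it may help to see which scheduler is currently active. The kernel marks the active one in square brackets when you read the same sysfs file. The sketch below pulls that name out. active_scheduler is a made-up helper name, and vda is simply the device name used in the command above.

```shell
# Hypothetical helper: print the active scheduler from a line such as
# "noop [deadline] cfq" (the format of /sys/block/<dev>/queue/scheduler).
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# On the VM (swap vda for sda if you are not using virtio):
# active_scheduler "$(cat /sys/block/vda/queue/scheduler)"

active_scheduler "noop [deadline] cfq"   # prints deadline
```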

Just to add another datapoint, the node is 4 disk raid10 with 30 other containers running

Code:

[root@hormel ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.29689 s, 250 MB/s
[root@hormel ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 5.09777 s, 211 MB/s
[root@hormel ~]# dd if=/dev/zero of=1000MB.bin bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.78092 s, 225 MB/s
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

Nope, already using the virtio drivers and have the scheduler set. That's not it.

Also, you're better off using the deadline scheduler on the host, and the noop scheduler in the vm.
mitgib
Posts: 9
Joined: 27 Mar 2015 14:28
Location: Rock Hill SC

Re: Improving my EFA Performance

Post by mitgib »

pdwalker wrote: Nope, already using the virtio drivers and have the scheduler set. That's not it.

Also, you're better off using the deadline scheduler on the host, and the noop scheduler in the vm.
I am not noticing a difference; I have tried noop and deadline in the past, on about 40 nodes. I'd be hard pressed to find it now, but I read an IBM case study and went with it. Also, how is the disk cache set for your container? And my final thought: how is the VM set up? LVM or qcow2? I have noticed much better IO with the cache set to none in libvirt while using qcow2 images.
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

none and preallocated raw.

I got better performance with raw over qcow2 when I tested the vm performance 2 years ago. Not tested it since.

Underlying file system is ext4 over lvm which does reduce my overall performance.

Using the deadline (host) / noop (vms) schedulers reduced the disk io contention between vms considerably.
toddh
Posts: 69
Joined: 16 Feb 2015 18:52

Re: Improving my EFA Performance

Post by toddh »

We are running on Hyper-V 2012 R2.

I have 2 EFA boxes running and mail is split between them. 1st is on a Hyper-V 2012 R2 Cluster with attached SAN via 40Gb infiniband. 2nd is on SSDs installed in the Hyper-V server. The 2nd EFA on SSDs consistently runs 20 - 25% lower CPU on 15 min average.

FYI, I have the SQLGrey db broken out into a separate MySQL server, leaving the local MySQL free to handle SA.

pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

and running some of the dd speed timings above gives you...? (use the ssd one - let's focus on that since that machine/vm combo should be absolutely smoking)
cdburgess75
Posts: 49
Joined: 11 Jun 2014 21:43

Re: Improving my EFA Performance

Post by cdburgess75 »

Do you have greylisting on? If so, ensure there is no other relay between efa and the sending email servers.

I once used the postfix forwarder on pfsense (an add-on) in front of Efa and had a resource problem similar to the one you are describing. The sqlgrey and postfix forwarder get very confused, and it breaks greylisting too :)
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: Improving my EFA Performance

Post by pdwalker »

It sounds like you had an entirely different problem, perhaps related to some kind of misconfiguration.

I use EFA as the mail gateway, so there is nothing after EFA except the receiving mail servers.

Greylisting has caused me no serious issues once I had everything set up correctly, and has caused a drop in the amount of crap I've received.
darky83
Site Admin
Posts: 540
Joined: 30 Sep 2012 11:03
Location: eFa

Re: Improving my EFA Performance

Post by darky83 »

cdburgess75 wrote: I once used the postfix forwarder on pfsense (an add-on) in front of Efa and had a resource problem similar to the one you are describing. The sqlgrey and postfix forwarder get very confused, and it breaks greylisting too :)
That is expected behaviour. If you place something in front of EFA, like a pfsense forwarder, load balancer, etc., then all systems think that all the mail you are receiving comes from one single host. This renders a bunch of things in EFA useless (RBL checking, greylisting, razor, pyzor, etc.), making your spam filter less effective.

You will also run into problems such as postfix rate limiting the number of connections from the forwarding box, and the reverse DNS lookup of the sending mail server failing, which causes your spam score to always be much higher.