Hello Shawnn,
I finally had some time to debug the message-id extraction logic a bit.
The first issue is that the current code returns an empty message-id if the Message-ID header is the last header. This happens more often than one might think.
The cause is that the loop needs one more iteration, on the following line, to recognise that the previous line was the end of the message-id. When there is no following line, that never happens.
To fix this, I moved the final extraction (stripping the header name and trimming) out of the loop, so it runs no matter where the loop ends.
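To make the failure mode concrete, here is a minimal sketch. It assumes the old code only copied the buffer into $messageid once the loop reached the line after the header. The sub name and the sample headers are made up for the demonstration, and this is not the actual previous MailWatch code.

```perl
use strict;
use warnings;

# Assumed shape of the old logic (hypothetical reconstruction): the buffer
# is only committed when the loop sees the line FOLLOWING the header.
sub extract_in_loop {
    my ($id, $buf, $in) = ("", "", 0);
    foreach (@_) {
        if (/^message-id:\s/i) {
            $buf = $_;
            $in  = 1;
            next;
        }
        if ($in) {
            if (/^\s/) { $buf .= $_; next; }        # continuation line
            ($id = $buf) =~ s/^message-id:\s+//i;   # commit happens here only
            last;
        }
    }
    return $id;
}

# Message-ID followed by another header: the commit is reached.
my $when_followed = extract_in_loop(
    "Message-ID: <abc123\@example.com>",
    "Subject: test",
);

# Message-ID as the last header: the loop ends before the commit.
my $when_last = extract_in_loop(
    "Subject: test",
    "Message-ID: <abc123\@example.com>",
);

print "followed by a header: '$when_followed'\n";
print "last header:          '$when_last'\n";
```

With these inputs the second call returns an empty string, which is the symptom described above.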
I also replaced the literal white spaces in the regexes with '\s' in case they were a problem.
Finally, when removing the header name from the buffer string, I used 'message-id:\s+' instead of 'message-id:\s' in case there are multiple spaces.
This seems to have fixed part of the unfolding problem, since the buffer can contain a double space at some point, as the debug output shows:
Nov 21 06:39:34 mx1 MailScanner[31696]: MailWatch: DEBUG step1: messageidbuffer: 'Message-ID: '
Nov 21 06:39:34 mx1 MailScanner[31696]: MailWatch: DEBUG step1-unfold: messageidbuffer: 'Message-ID: <DBBPR09MB3048541501331ADC215B1708F39E9@DBBPR09MB3048.eurprd09.prod.outlook.com>'
Nov 21 06:39:34 mx1 MailScanner[31696]: MailWatch: DEBUG HEADER -message-id- : '<DBBPR09MB3048541501331ADC215B1708F39E9@DBBPR09MB3048.eurprd09.prod.outlook.com>'
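The difference between the two patterns can be checked in isolation. This is only a sketch: the buffer below is a made-up example of a header line ending in a space that has been joined with a continuation line starting with a space, which is one way a double space could arise.

```perl
use strict;
use warnings;

# Hypothetical unfolded buffer: "Message-ID: " joined with " <id>",
# giving two whitespace characters after the colon.
my $buffer = "Message-ID: " . " <abc123\@example.com>";

(my $single = $buffer) =~ s/^message-id:\s//i;    # strips one whitespace char
(my $multi  = $buffer) =~ s/^message-id:\s+//i;   # strips the whole run

print "with \\s : '$single'\n";
print "with \\s+: '$multi'\n";
```

With '\s' a leading space is left in front of the '<', while '\s+' removes the whole run.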
Bonus: I also added a log warning in case $messageid is still blank after all the processing, to catch other possible issues.
Here is the code:
# Message-ID
my ($messageid, $inmessageid, $messageidbuffer);
$messageid = "";
$messageidbuffer = "";
$inmessageid = 0;
# Extract message id from header (unfold header if needed)
foreach (@{$message->{headers}}) {
    if ( $_ =~ /^message-id:\s/i ) {
        # RFC 822 unfold message-id
        $messageidbuffer = $_;
        $inmessageid = 1;
        next;
    } elsif ($inmessageid) {
        if ($_ =~ /^\s/) {
            # In continuation line
            $messageidbuffer .= $_;
        } else {
            # End of message-id field
            last;
        }
    }
}
# Set the re-formatted message-id and trim it
($messageid = $messageidbuffer) =~ s/^message-id:\s+//i;
$messageid =~ s/^\s+|\s+$//g;
# Warn if Message-ID was not found
if ($messageid eq "") {
    MailScanner::Log::WarnLog("MailWatch: Could not extract Message-ID for %s", $message->{id});
}
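For anyone who wants to try the logic outside MailScanner, here is a rough standalone harness. It wraps the same steps in a hypothetical sub, extract_message_id, which takes a plain list of header lines instead of $message->{headers}, and it uses a simple print where the real code calls MailScanner::Log::WarnLog. The sample headers are invented.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the same steps, for testing only.
sub extract_message_id {
    my @headers = @_;
    my ($buffer, $in) = ("", 0);
    foreach my $line (@headers) {
        if ($line =~ /^message-id:\s/i) {
            $buffer = $line;
            $in     = 1;
            next;
        }
        next unless $in;
        last unless $line =~ /^\s/;   # first non-continuation line ends the field
        $buffer .= $line;             # continuation line: unfold into the buffer
    }
    # Strip the header name and trim, outside the loop
    (my $id = $buffer) =~ s/^message-id:\s+//i;
    $id =~ s/^\s+|\s+$//g;
    return $id;
}

# Folded header: the id sits on a continuation line.
my $folded = extract_message_id(
    "Subject: hi",
    "Message-ID: ",
    " <folded\@example.com>",
    "To: bob\@example.com",
);

# Message-ID as the very last header.
my $last = extract_message_id("Subject: hi", "Message-ID: <last\@example.com>");

# No Message-ID header at all.
my $missing = extract_message_id("Subject: hi", "To: bob\@example.com");

print "folded:  '$folded'\n";
print "last:    '$last'\n";
print "missing: '$missing'\n";
print "would log a warning for the missing case\n" if $missing eq "";
```

The folded and last-header cases both return the id, and only the case with no Message-ID header ends up in the warning branch.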
In my tests here, this seems to have improved the message-id extraction success rate, and the unfolding now seems to work.