PDF attachment converting to Base64 plain text

mattch
Posts: 42
Joined: 28 Mar 2018 22:26

Post by mattch »

Some PDF attachments we receive are converted to Base64 plain text in the email body. It only happens with mail from one sender; PDF attachments from everyone else arrive fine, as attachments.

Any pointers on where to look to find out why they are coming in as plain text?

Email body:

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


---250551898-1560-1614715292=268
Content-Type: APPLICATION/pdf; NAME=?ISO-8859-1?Q?Statement_-_Lake_837.pdf?=
Content-Transfer-Encoding: BASE64

(data removed; the document is recovered if I copy and paste it into a Base64-to-PDF program)

---250551898-1560-1614715292=268--

[Image: screenshot]

Edit: forgot to add, the same email is also sent to my Gmail and arrives OK there.

Post by mattch »

Is there any log file I can check to glean more info on how the attachment is handled?

I don't know if this is a bug, or even how to test it, but I think it has something to do with the file name of the PDF attachment containing a "/" (not shown above). I don't know how to test this theory because I don't know how to add a "/" to the name of a file I send.

In the plain-text encoding I found a forward slash in the name part. I didn't catch it when I cleaned up the first post.

---250551898-1560-1614715292=268
Content-Type: APPLICATION/pdf; NAME=?ISO-8859-1?Q?MY/Statement_-_Lake_837.pdf?=
Content-Transfer-Encoding: BASE64 

I also noticed an anomaly in the screenshot above: the partialMessage.bin attachment.
On these messages the MIME type is in all caps: "TEXT/PLAIN" and "APPLICATION/pdf".
On good messages it is all lower case: "text/plain" and "application/pdf". I don't know whether that is relevant.
shawniverson
Posts: 3590
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Post by shawniverson »

The slash in the filename is interesting. I would not be surprised if it is causing a problem.

To test this I am going to try to construct a filename with a slash in it and see what happens.

Post by mattch »

Just following up with an update and looking for any ideas.

I made a file with a backslash in its name and can confirm it doesn't like it. I can't figure out how to reproduce the forward slash from the original attachment, but I have a theory: maybe the forward slash is Unicode \u002f. I tried contacting the sender, but it is a big company and they won't let me, the customer, talk to the IT department.

I searched and found that MailScanner has some ability to rename files with rules (still learning). I thought that if I can make a rule that fixes the backslash test, then I can try the same rule against the Unicode forward-slash-looking character.

I came up with this, but I don't think I'm doing it correctly.
In /etc/MailScanner/filename.rules.conf:
rename to File_$1_$3_$4 ^(.*)(\\)(.*)(\.text)
I also tried:
rename to _ \\
Examples:

rename to .ppt    \.pps$    Renamed pps to ppt    Renamed file
rename to Dangerous_$1_$2    ^(.*)\.(exe|com|scr)$    Renamed dangerous exes   Renamed file

Test with backslash:

--1268592695-1616609901=:24113
Content-ID: <20210324141821.24113.1@ubu21-1>
Content-Type: application/octet-stream; name=test-file\with-a-slash.text
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=test-file\with-a-slash.text

--1268592695-1616609901=:24113--

Gmail and MS Exchange accept these and just remove the slashes.



When using the new "rename" instruction in a rule, any matching file
will be automatically renamed using the new "Rename Pattern" setting in
MailScanner.conf. This allows you to add a prefix or a suffix to any
filename.

When using the new "rename to" instruction in a rule, any matching file
will be automatically renamed so that the portion of the filename that
matches the pattern string is replaced with new text. So for example,
you can rename all *.pps files to *.ppt with the rule

rename to .ppt \.pps$ Renamed pps to ppt Renamed file

If you want to be even cleverer, you can use parenthesised sections of
the match pattern within the replacement text. I'm not quite sure who
this will be useful to, but I'm sure you will find some clever uses (you
folks always do!). As a random example,

rename to Dangerous_$1_$2 ^(.*)\.(exe|com|scr)$ Renamed dangerous exes Renamed file

That will rename any file such as "PleaseRunMe.exe" to
"Dangerous_PleaseRunMe_exe" and rename "DodgyScreensaver.scr" to
"Dangerous_DodgyScreensaver_scr" which means the user cannot run it
without renaming it first.
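
One way to check the regular-expression part of such a rule separately from MailScanner is to try the same pattern on a sample filename. The sketch below uses Python's `re` module as a stand-in for MailScanner's Perl regex handling (an assumption: the two engines agree on patterns this simple). The helper name `rename_to` is made up for illustration and is not part of MailScanner.

```python
import re


def rename_to(replacement: str, pattern: str, filename: str) -> str:
    """Replace the part of filename that matches pattern, like "rename to"."""
    # Translate Perl-style $1, $2, ... into Python's \1, \2, ... references.
    py_replacement = re.sub(r"\$(\d+)", r"\\\1", replacement)
    return re.sub(pattern, py_replacement, filename, count=1)


# The documented examples.
print(rename_to(".ppt", r"\.pps$", "show.pps"))
print(rename_to("Dangerous_$1_$2", r"^(.*)\.(exe|com|scr)$", "PleaseRunMe.exe"))

# The two backslash rules tried in this thread.
sample = "test-file\\with-a-slash.text"
print(rename_to("File_$1_$3_$4", r"^(.*)(\\)(.*)(\.text)", sample))
print(rename_to("_", r"\\", sample))
```

If the pattern matches here but the rule still does not fire, the rule line itself may be worth a second look. The comments in the stock filename.rules.conf say that fields must be separated by TAB characters, so spaces between the fields are one possible cause.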

Post by mattch »

I can't seem to get any "rename to" rules working, not even a basic one. Plain "rename" works, which uses the default MailScanner.conf setting (.disarmed).

Rules I tried, to rename attachments ending in .test to .new:

rename to  .new \.test$

rename to  .new \.test$   -   -
ajmind
Posts: 53
Joined: 28 Mar 2017 15:26

Post by ajmind »

We may be having a similar problem with garbled messages arriving from one single sender domain (an automated system creating emails with PDF attachments).

When they arrive and are transferred to our internal Exchange server, the PDF attachment is not visible; partly corrupted text is shown instead:
[Image: img-2021-05-04 15.46.33.png]
In the MailWatch UI the PDF is listed as "partialMessage.bin":

MIME Typ: application/pdf
partialMessage.bin Download

--18446744072854621553-1179195149-1618495305=:2667
Content-Type: application/pdf;
        name="=?UTF-8?B?QmVzdGVsbHVuZyBOci4gMjE3MTk5ODM4OC5wZGY=?="
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment

---removed data---

--18446744072854621553-1179195149-1618495305=:2667--
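
The `name` value in the part header above is an RFC 2047 encoded-word, so the real filename is not visible at a glance. A minimal sketch for decoding it and checking for path separators, using only Python's standard `email.header` module, is below. The second sample value is hypothetical: it is the slash-containing name from earlier in the thread with a leading `=` added so that it forms a complete encoded-word.

```python
from email.header import decode_header, make_header


def decode_mime_name(raw_value: str) -> str:
    """Decode RFC 2047 encoded-words such as =?UTF-8?B?...?= to text."""
    return str(make_header(decode_header(raw_value)))


def has_path_separator(filename: str) -> bool:
    """True if the decoded name contains a forward slash or a backslash."""
    return "/" in filename or "\\" in filename


# Name taken from the header posted above.
posted = decode_mime_name("=?UTF-8?B?QmVzdGVsbHVuZyBOci4gMjE3MTk5ODM4OC5wZGY=?=")
print(posted, has_path_separator(posted))

# Hypothetical reconstruction of the name that contains a slash.
slashed = decode_mime_name("=?ISO-8859-1?Q?MY/Statement_-_Lake_837.pdf?=")
print(slashed, has_path_separator(slashed))
```

The name posted above decodes to "Bestellung Nr. 2171998388.pdf", which contains no slash, so a slash in the filename would not explain this particular message.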

If I resend the message via the MailWatch UI to another addressee on the same MTA, the message arrives correctly with the PDF attachment.

Obviously there is a difference between how the message is examined and forwarded when it first arrives at eFa and how it is handled when transmission is started manually via MailWatch.

Does anybody have an idea how to solve this? The sender is a huge international company that sends us automated purchase orders, and we do not want to lose any of them.
BR Andreas

Update:
I may have found a possible root cause. All messages received from that sender have a MIME boundary that contains a colon followed by a four-digit number:

boundary="18446744072854621553-907769865-1620118963=:6511"

Could this be the reason for the problem? Hmm.
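
For what it is worth, the boundary grammar in RFC 2046 (section 5.1.1) allows both "=" and ":" as long as the parameter value is quoted, so a boundary ending in "=:6511" is legal on paper. The sketch below checks a boundary string against that grammar; the function name `is_valid_boundary` is made up for illustration. A legal boundary does not rule out that some software in the path handles it badly.

```python
import re

# RFC 2046, section 5.1.1: 1 to 70 characters from a fixed set,
# and the last character must not be a space.
_BCHARS_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"
_BOUNDARY_RE = re.compile(rf"[{_BCHARS_NOSPACE} ]{{0,69}}[{_BCHARS_NOSPACE}]")


def is_valid_boundary(boundary: str) -> bool:
    """True if the string satisfies the RFC 2046 boundary grammar."""
    return _BOUNDARY_RE.fullmatch(boundary) is not None


# Boundary from a failing message in this thread.
print(is_valid_boundary("18446744072854621553-907769865-1620118963=:6511"))
# Boundary from a released and resent message.
print(is_valid_boundary("_004_05BCEB3A535D084BB84FFD845D1D64E40561E072FE20SERVER_"))
# An asterisk is not in the allowed character set.
print(is_valid_boundary("bad*boundary"))
```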

Post by mattch »

In my case, these arrive in my inbox as plain-text emails with the Base64 data (not readable text) in the body. I have been copying the Base64 text from the email into a decoder to get the PDF out.
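
As a stopgap, the copy-and-paste decoding step can be done locally instead of in an online decoder. This is a minimal sketch using Python's standard `base64` module; the function name and the tiny sample payload are made up for illustration, and the "%PDF-" check is only a sanity test on the decoded bytes.

```python
import base64


def base64_body_to_pdf(b64_text: str) -> bytes:
    """Decode Base64 text copied from an email body into PDF bytes."""
    # Mail wraps Base64 across lines, so drop all whitespace first.
    compact = "".join(b64_text.split())
    data = base64.b64decode(compact, validate=True)
    if not data.startswith(b"%PDF-"):
        raise ValueError("decoded data does not look like a PDF")
    return data


# Tiny stand-in payload: the Base64 form of the bytes b"%PDF-1.4\n".
pdf_bytes = base64_body_to_pdf("JVBERi0x\r\nLjQK\r\n")
print(len(pdf_bytes), "bytes decoded")
```

The returned bytes can then be written to a file opened in binary mode, with whatever filename suits.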

ajmind wrote: If I resend the message via MailWatch-UI to another addressee on the same MTA, the file is correctly shown with the PDF attachment.
I never tried this. Amazing that it works. When re-sending (releasing) to any email address, internal or external, the attachment arrives OK.

I believed it had something to do with the filename containing an invalid character, but I was not able to confirm that. Now I'm not sure why re-sending works if the file name is bogus. Maybe the filename is not the issue.

Thank you for posting, ajmind.

Post by mattch »

ajmind wrote: 04 May 2021 14:09

Update:
Maybe I have found a possible root cause of the problem. All received messages from that sender do contain a colon followed by a four digit number.

boundary="18446744072854621553-907769865-1620118963=:6511"

Could this be a reason for this problem? Mmmhh.


Same here on the message that does not work:

Content-Type: MULTIPART/MIXED; BOUNDARY="1427368158-23745-1620125044=:9728"

Released and resent:

Content-Type: multipart/mixed;
     boundary="_004_05BCEB3A535D084BB84FFD845D1D64E40561E072FE20SERVER_"

Post by shawniverson »

So, besides the weird boundary, is the data consistent in the partial message?

Post by mattch »

shawniverson wrote: 06 May 2021 16:39 So, besides the weird boundary, is the data consistent in the partial message?

Yes.
I get two different emails with this PDF attachment encoded as Base64 in the message body. Both come from the same company, just generated and sent from different systems. I bring it up in case it helps: I've noticed different MIME types and downloaded files between the two.

The first one:

MIME Type: TEXT/PLAIN
partialMessage.bin Download
Downloads partialMessage.bin, which contains the email message body.

MIME Type: APPLICATION/pdf
partialMessage.bin
Downloads "partialMessage.bin"; if the extension is renamed to .pdf, the document opens.


The second one:

MIME Type: TEXT/PLAIN
partialMessage.bin
Is an empty message body.

MIME Type: APPLICATION/octet-stream
partialMessage.bin
Downloads a file named "Viewpart.php"; if the extension is renamed to .pdf, the document opens.

Post by shawniverson »

I've been unable to trigger the problem so far. I'll probably need someone's help. What I need is the MIME part structure. This is basically the raw message before it hits MailScanner, which means intercepting it. The content and headers can be stripped out, I just need the MIME information and boundaries to troubleshoot further. I'm pretty sure we are dealing with something interesting with the MIME boundaries.
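
For anyone who manages to save a raw copy of an affected message, a MIME outline with the content stripped can be produced along these lines. This is a rough sketch using Python's standard `email` package; the function name `mime_outline` and the small sample message are made up for illustration, with the boundary copied from a failing message in this thread. Python's parser may accept or normalize things that MailScanner's MIME handling treats differently, so the outline shows structure only.

```python
from email import message_from_bytes
from email.message import Message


def mime_outline(raw_message: bytes) -> list[str]:
    """Return one line per MIME part: type, boundary, encoding, filename."""
    lines: list[str] = []

    def visit(part: Message, depth: int) -> None:
        fields = [part.get_content_type()]
        boundary = part.get_boundary()
        if boundary is not None:
            fields.append(f'boundary="{boundary}"')
        encoding = part.get("Content-Transfer-Encoding")
        if encoding is not None:
            fields.append(f"encoding={encoding}")
        filename = part.get_filename()
        if filename is not None:
            fields.append(f"filename={filename!r}")
        lines.append("  " * depth + " ".join(fields))
        if part.is_multipart():
            for child in part.get_payload():
                visit(child, depth + 1)

    visit(message_from_bytes(raw_message), 0)
    return lines


# Made-up sample message using a boundary seen in this thread.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: MULTIPART/MIXED; BOUNDARY="1427368158-23745-1620125044=:9728"\r\n'
    b"\r\n"
    b"--1427368158-23745-1620125044=:9728\r\n"
    b"Content-Type: TEXT/PLAIN; charset=US-ASCII\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--1427368158-23745-1620125044=:9728\r\n"
    b'Content-Type: APPLICATION/pdf; name="test.pdf"\r\n'
    b"Content-Transfer-Encoding: BASE64\r\n"
    b"\r\n"
    b"JVBERi0xLjQK\r\n"
    b"--1427368158-23745-1620125044=:9728--\r\n"
)

for line in mime_outline(SAMPLE):
    print(line)
```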

Post by mattch »

shawniverson wrote: 10 May 2021 01:11 I've been unable to trigger the problem so far. I'll probably need someone's help. What I need is the MIME part structure. This is basically the raw message before it hits MailScanner, which means intercepting it. The content and headers can be stripped out, I just need the MIME information and boundaries to troubleshoot further. I'm pretty sure we are dealing with something interesting with the MIME boundaries.
I'm getting tcpdump turned on with a rotating capture file on the eFa interface; that should give me a small window of time to capture the message raw. Thank you, thank you!

Post by mattch »

mattch wrote: 10 May 2021 18:26
shawniverson wrote: 10 May 2021 01:11 I've been unable to trigger the problem so far. I'll probably need someone's help. What I need is the MIME part structure. This is basically the raw message before it hits MailScanner, which means intercepting it. The content and headers can be stripped out, I just need the MIME information and boundaries to troubleshoot further. I'm pretty sure we are dealing with something interesting with the MIME boundaries.
I'm getting tcpdump turned on with a rotating capture file on the eFa interface; that should give me a small window of time to capture the message raw. Thank you, thank you!
I forgot that it's SMTP over TLS now. I can capture the traffic but am stumped on how to decrypt it. I tried with Wireshark and the eFa PEM file, but I don't think I know what I'm doing.
-puts on thinking cap-

Post by ajmind »

I have now created a scanMessages.rule so that all mail from this particular automated sender account is excluded from incoming processing (hmm, what exactly is excluded, I don't know).

In the meantime we have received a few messages that passed eFa successfully on their way to our MS Exchange 2010 server, showing the correct (expected) result (PDF attachment).

I could not see these incoming messages in MailWatch, only in /var/log/maillog; I assume that is due to the above-mentioned rule.

I would also like to help identify the problem; unfortunately, I do not know how to achieve what Shawn requested. :-?

Post by mattch »

ajmind wrote: 12 May 2021 16:35 I have now created a scanMessages.rule so that all mail from this particular automated sender account is excluded from incoming processing (hmm, what exactly is excluded, I don't know).

In the meantime we have received a few messages that passed eFa successfully on their way to our MS Exchange 2010 server, showing the correct (expected) result (PDF attachment).

I could not see these incoming messages in MailWatch, only in /var/log/maillog; I assume that is due to the above-mentioned rule.

I would also like to help identify the problem; unfortunately, I do not know how to achieve what Shawn requested. :-?

Do you mind sharing how you set up the scanMessages rule to exclude a sender from being processed? Exchange 2010 is the destination in my case too.
If the message is not processed (I assume), then it should be possible to capture the raw message along the way.

I don't know how involved it is to turn off TLS temporarily. I'm also willing to try that.
Wireshark can reconstruct the message if I can get around or decrypt TLS. The tcpdump command (tcpdump installed via yum):
sudo tcpdump -ni ens192 -s 0 -C 100 -W 10 -w /home/admin/captures/efa.pcap port 25
-n do not resolve addresses or ports to names
-i interface to capture on
-s 0 use the default snapshot length, so packets are not truncated
-C rotate the capture file once it reaches about 100 MB (the unit is 1,000,000 bytes)
-W number of files to keep in rotation (10)
-w path to save the capture

Post by ajmind »

In MailScanner.conf you could set a ruleset:

# Processing Incoming Mail
# ------------------------
...
Scan Messages = %rules-dir%/scan.messages.rules
...
The rule itself:

From:           somebody@domain.com        			no
From:           somebody.else@anotherdomain.com            	no
FromOrTo:       default                                 			yes

Capturing an incoming message requires knowing when the message will arrive... :-?

Post by mattch »

ajmind wrote: 17 May 2021 14:35 In MailScanner.conf you could set a ruleset:

# Processing Incoming Mail
# ------------------------
...
Scan Messages = %rules-dir%/scan.messages.rules
...
The rule itself:

From:           somebody@domain.com        			no
From:           somebody.else@anotherdomain.com            	no
FromOrTo:       default                                 			yes

Capturing an incoming message requires knowing when the message will arrive... :-?

Thank you!!
In my case the message comes at roughly the same time, +/- a few minutes.
I'm also considering setting up a test subdomain on a test eFa and having my PDF sent to this test email domain/eFa. That way I can experiment (for example with TLS) without interfering with regular email.

Post by ajmind »

mattch wrote: 17 May 2021 15:31
Thank you!!
In my case the message comes at roughly the same time, +/- a few minutes.
I'm also considering setting up a test subdomain on a test eFa and having my PDF sent to this test email domain/eFa. That way I can experiment (for example with TLS) without interfering with regular email.
Have you been able to capture messages showing the problem in the meantime?

Post by ajmind »

Today we again received a message showing the problem already reported:

Message-ID: <4204.3564.2616992.1629086691.830.0@eu.pneu.local>
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="1773658847-17789-1629086691=:4204"

We have now seen the problem from two senders. Both send automated order confirmations, and again we see the "=:" in the boundary.

Any idea how to solve this problem?

Post by mattch »

I'm so curious where it's getting changed, if it is being changed at all. For me it comes from a trusted sender, always from the same source, so bypassing has been my workaround and the issue fell onto the back burner. If the automated messages came from random sources, that would be a bigger headache.

In the next couple of months I'm setting up a fresh CentOS 8 Stream install as an attempt at migrating. I'm doing a little homework first. I'm interested to see whether the problem follows or not. A fresh default install would be ideal for troubleshooting anyway.
thewomble
Posts: 50
Joined: 17 Jan 2017 12:52

Post by thewomble »

I too am seeing this on the version 4 code. I forced a rejection of the connection, and the message was then processed by the version 3 box I still have running, which delivered it successfully.

If the attachment is too big, and so not processed by MailScanner, it is delivered as expected on the v4 box.

It's getting a lot of attention because the attachments tend to be invoices in PDF, so either senders complain that invoices are not being paid, or it leads to potential supply chain issues.

I will look further into the boundary details posted here, but at a quick look I cannot see anything like =:6511.

I did read an article pointing to "Content-Transfer-Encoding: quoted-printable" as a potential issue, but I'm still looking into it.

Post by thewomble »

I did get one of the companies to send the message to my personal email (external to eFa) and forwarded it to the internal users, and it was delivered as expected. Three of the companies use the same "advanced email filter", so it does seem to be server-specific.

Post by thewomble »

In case this means anything to anyone, or someone can point me to a resource to understand it further:

Content-Type: multipart/mixed;
boundary="_008_daa66855df2349c898d4c61b4b5403c0********couk_"
MIME-Version: 1.0

--_008_daa66855df2349c898d4c61b4b5403c0********couk_
Content-Type: multipart/related;
boundary="_007_daa66855df2349c898d4c61b4b5403c0********couk_";
type="multipart/alternative"

--_007_daa66855df2349c898d4c61b4b5403c0********couk_
Content-Type: multipart/alternative;
boundary="_000_daa66855df2349c898d4c61b4b5403c0********couk_"

--_000_daa66855df2349c898d4c61b4b5403c0********couk_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable


Further down, where the PDF is, we get:

--_008_daa66855df2349c898d4c61b4b5403c0arbtechcouk_
Content-Type: application/pdf; name="Invoice 36412 - PEA PRA - M5.pdf"
Content-Description: Invoice 36412 - PEA PRA - M5.pdf
Content-Disposition: attachment;
filename="Invoice 36412 - PEA PRA - M5.pdf"; size=167587;
creation-date="Fri, 21 Jan 2022 09:06:00 GMT";
modification-date="Fri, 21 Jan 2022 09:06:00 GMT"
Content-Transfer-Encoding: base64

Post by shawniverson »

So, those mime boundaries look correct. I'm now thinking we are dealing with a MIME property mismatch.

Is it possible that the size of the mime part representing the PDF is not the actual size of the PDF content, triggering MailScanner to flag it as a partial message?
Triumf
Posts: 13
Joined: 05 Jan 2014 13:18

Post by Triumf »

I have the same issue with invoices sent from one company. Releasing the same message from the eFa GUI to any recipient delivers it correctly.

Update: after playing with scan.messages.rules I finally got it configured correctly. I hope a proper fix will be found.