I suppose it's possible to handle this in bash, but I figured Perl would be easier.
And since I don't know Perl, I had to teach myself as I went.
First, we have to prepare our domains. Assuming one domain per line, I sort the file by its reversed strings, so my incoming domain text file:
Code: Select all
efa-project.com
test.efa-project.org
efa-project.org
demo.efa-project.org
my.efa-project.com
a.org
b.org
a.a.org
will look like this:
Code: Select all
a.org
a.a.org
b.org
efa-project.org
demo.efa-project.org
test.efa-project.org
efa-project.com
my.efa-project.com
meaning all my subdomains appear immediately after my top domain
I accomplish this with the following shell command:
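For reference, here is the sort step on its own, the same rev|sort|rev pipeline used in full further down. The sample file is created inline with printf purely so the snippet stands alone:

```shell
# write the sample domain list from above, one domain per line
printf '%s\n' efa-project.com test.efa-project.org efa-project.org \
    demo.efa-project.org my.efa-project.com a.org b.org a.a.org > domains-in.txt

# reverse each line, sort, reverse back: subdomains now sort
# immediately after their parent domain
rev domains-in.txt | sort | rev
```

rev comes from util-linux and should be available on most Linux systems.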
Now the algorithm is simple. Assuming the file is ordered as above, I grab the first line, then compare it with the next line. If the next line repeats the previous kept domain, or ends with a dot followed by it (i.e., it is a subdomain of it), I exclude it; otherwise I keep it and it becomes my new "previous line". Repeat until end of file. (Checking for the leading dot matters: a plain substring test would also throw away an unrelated domain like xa.org just because it contains a.org.) Like so:
Code: Select all
#!/usr/bin/perl
use strict;
use warnings;

my $pline = <>;
exit 0 unless defined $pline;    # nothing to do on empty input
chomp $pline;
print "$pline\n";

while ( my $nline = <> ) {
    chomp $nline;
    # keep the line unless it repeats the previous kept domain or is a
    # subdomain of it; \Q...\E stops the dots in the domain from acting
    # as regex wildcards
    if ( $nline ne $pline && $nline !~ /\.\Q$pline\E$/ ) {
        print "$nline\n";
        $pline = $nline;
    }
    # otherwise skip it and move on
}
Make the script executable with chmod u+x process.pl, then run it like so:
Code: Select all
$ cat domains-in.txt | rev | sort | rev | ./process.pl
a.org
b.org
efa-project.org
efa-project.com
Will this work in all cases? I don't know; I haven't tested it fully. But it's a start.
Is this the best solution? No idea. You could use awk. It's possible to do it with sed, though sed's syntax is awkward for this kind of thing, and bash (v4 and greater?) is an option since it has decent string handling. However, this seems to work, so you can take it from there.
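To illustrate the awk route just mentioned, here's a rough sketch of the same idea as process.pl, with the sample data recreated inline so the snippet stands alone. Treat it as a starting point, not a tested drop-in replacement:

```shell
# sample input, same domains as above
printf '%s\n' efa-project.com test.efa-project.org efa-project.org \
    demo.efa-project.org my.efa-project.com a.org b.org a.a.org > domains-in.txt

# same reversed sort, then keep a line only if it is not the previous
# kept domain and does not end in "." followed by it; p starts empty,
# so the first line is always kept
rev domains-in.txt | sort | rev |
    awk '$0 != p && substr($0, length($0) - length(p)) != "." p { print; p = $0 }'
```

This prints the same four top domains as the Perl version on the sample data.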
Good luck.