need some help for a bash script ...

anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

need some help for a bash script ...

Post by anti-spam »

Hello,

We are looking to write a bash script that extracts all the hosted domains from our hosting servers.
My main problem is subdomains. If a subdomain and its parent domain are both present in the Postfix transport file, EFA does not work correctly. If I remember correctly, we got "relay denied" errors. So my idea is to dump the list of all domains hosted on a server into a text file, then delete every subdomain whose parent domain is also present. An example of such a text file:

efa-project.com
efa-project.org
test.efa-project.org
demo.efa-project.org

I want to delete all the subdomains, in this example test and demo.
If someone can help me with this, I will also share the resulting bash script :P
Thanks
Last edited by anti-spam on 08 Oct 2015 15:58, edited 1 time in total.
:arrow: always fighting spams ... :hand:
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

Is there nobody here who can help me?
It can't be that complicated to do ...
Regards
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: need some help for a bash script ...

Post by pdwalker »

This is a forum for EFA, rather than for bash/sed/awk scripting. However, I'll take a stab at it. It is a nontrivial problem for a shell script to solve correctly, even though the problem is easy to specify.
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

Well, I know that this forum is dedicated to EFA, but the script we are looking for exports the domains so that our EFA appliances can import them into the Postfix transport file. With this kind of script, a hosting company could also get started with EFA.
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: need some help for a bash script ...

Post by pdwalker »

I suppose it is possible to use bash to handle this, but I figured perl would be easier.

And since I don't know perl, I had to teach myself that.

First, we have to prepare our domains. Assuming one domain per line, I sort the file by the reversed strings, so my incoming domain text file:

Code:

efa-project.com
test.efa-project.org
efa-project.org
demo.efa-project.org
my.efa-project.com
a.org
b.org
a.a.org
will look like this:

Code:

a.org
a.a.org
b.org
efa-project.org
demo.efa-project.org
test.efa-project.org
efa-project.com
my.efa-project.com
meaning that each domain's subdomains appear immediately after it.

I accomplish this with the following shell command:

Code:

cat domains-in.txt |rev|sort|rev
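A note on the sort itself: the result depends on the locale, and in the C locale '-' sorts before '.', so a hyphenated name such as b-a.org can land between a.org and x.a.org. A minimal sketch of one way around that, assuming GNU/Linux tools; the helper name sort_domains is only an illustration and not part of the original command:

```shell
#!/bin/bash
# Sketch: order domains so that each domain is followed directly by its subdomains.
# sort_domains is a hypothetical helper name.
sort_domains() {
  # rev:  reverse each line so the TLD comes first
  # tr:   map '.' to '!' because '!' sorts below '-' (and letters/digits) in the C locale
  # sort: byte-wise sort, independent of the system locale
  rev | tr '.' '!' | LC_ALL=C sort | tr '!' '.' | rev
}

# demo: x.a.org now comes directly after a.org, ahead of b-a.org
printf '%s\n' b-a.org x.a.org a.org | sort_domains
```

For the sample list above this gives the same order as the plain rev|sort|rev pipeline.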
Now the algorithm is simple. Assuming the text file is ordered correctly, I grab the first line and compare it with the next line. If the next line is a subdomain of the first line's domain, I skip it; otherwise I keep it and make it my "previous line". Repeat until end of file, like so:

Code:

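#!/usr/bin/perl
use strict;
use warnings;

my $pline = <> ;
exit 0 unless defined $pline;    # empty input, nothing to do
my $nline = '';

chomp $pline;
print $pline ."\n";

while ( $nline = <> )
{
        chomp $nline;
        next if $nline eq $pline;    # duplicate line

        # a subdomain must end with ".previous-domain"; a plain substring
        # test would also drop aa.org when a.org is listed
        my $suffix = '.' . $pline;

        if (length($nline) <= length($suffix)
            || substr($nline, -length($suffix)) ne $suffix)
        {
                # not a subdomain of the previous kept domain, so keep it and move on
                print $nline . "\n";
                $pline = $nline;
        }
        else
        {
                # this line is a subdomain of the previous kept domain, so skip it
                #print $nline . " is a subdomain of " . $pline . "\n";
        }
}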
chmod u+x the perl file, then run it like so:

Code:

$ cat domains-in.txt |rev|sort|rev|./process.pl
a.org
b.org
efa-project.org
efa-project.com
Will this work in all cases? Don't know, not sure, I didn't test it fully. But it's a start.

Is this the best solution? No idea. You could use awk. It's possible to do it with sed, but the syntax for sed is kinda weird, and bash (v4 and greater?) is an option because it does have decent string handling capabilities. However, this seems to work so you can take it from there.

Good luck.
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

hello,

Many thanks for your post. I gave it a try, but the sort is not working: it should put each domain name (domain.com) first, and then subdomain.domain.com under it.
I did a test, and my subdomains are not sorted directly under their domains.
But I tested your Perl code, and IT WORKS if the subdomain is directly under the domain.

So if we can get a sort like the one we need, for example:

efa-project.org
efa-project.com
test.efa-project.com

or if the Perl script can check all the lines, sorted or not, then we will have everything we need :)
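
For the second option (checking every line, sorted or not), a minimal sketch in bash might look like the following. It assumes bash 4 or later for associative arrays, and the function name strip_subdomains is made up for this example. It remembers every name in the list, then drops each name for which one of its parent domains is also listed:

```shell
#!/bin/bash
# Sketch: drop every name that has a parent domain somewhere in the same list.
# Needs bash 4+ (associative arrays); strip_subdomains is a hypothetical name.
strip_subdomains() {
  local -A seen=()
  local -a names=()
  local name parent
  # first pass: remember every name, skipping blank lines and duplicates
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    if [ -z "${seen[$name]+x}" ]; then
      seen[$name]=1
      names+=("$name")
    fi
  done
  # second pass: walk up each name's parents (a.b.org -> b.org -> org)
  for name in "${names[@]}"; do
    parent=$name
    while [[ $parent == *.* ]]; do
      parent=${parent#*.}
      # a parent is listed, so this name is a subdomain: drop it
      [ -n "${seen[$parent]+x}" ] && continue 2
    done
    printf '%s\n' "$name"
  done
}

# demo with an unsorted list
printf '%s\n' test.efa-project.org efa-project.com efa-project.org demo.efa-project.org |
  strip_subdomains
```

The kept names come out in their original input order, so no particular sort is required beforehand.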
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: need some help for a bash script ...

Post by pdwalker »

1/ show me an example of your input that fails. Give me the smallest test case you can create that does not sort correctly

2/ what environment (shell version, OS) are you running it under?
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

hi,

I think I was wrong ... I had so much to do yesterday ...
I still saw some subdomains in the filtered list, but did not realize that these are OK because we don't host the related domain ...

Today I ran a quick test again, and it seems to work fine!
My apologies :?
This weekend I will try to write a first script that imports CPanel domains and compiles them into a list that can be imported into EFA's Postfix transport file.
Once this is done, it can be modified to work with other shared hosting panels like Plesk, DirectAdmin, or others ;-)
I will let you know as soon as it's finished.
Many thanks to everyone for your help
Regards
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: need some help for a bash script ...

Post by pdwalker »

Like I said, I don't know perl, and I don't know your dataset, so my suggestion could break under different circumstances. If so, let me know and I'll find a fix.
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

Just a quick reply to say that I haven't forgotten this project; it's only that I'm still looking for a bash script that does this ...
pdwalker's Perl script is OK, but with a simple bash script I would be able to add the SMTP destination after every filtered domain.
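
As a rough illustration of that last step, the sketch below appends a destination to each filtered domain in the usual Postfix transport-map form. The function name to_transport, the relay host mail.example.net, and port 25 are placeholders chosen for this example, not values from this thread:

```shell
#!/bin/bash
# Sketch: turn a filtered domain list into Postfix transport-map lines.
# to_transport and the relay host used below are illustrative assumptions.
to_transport() {
  local nexthop=$1 domain
  while IFS= read -r domain; do
    [ -n "$domain" ] || continue
    # transport(5) format: "<domain> <transport>:<nexthop>";
    # the square brackets tell Postfix not to do an MX lookup on the host
    printf '%s smtp:[%s]:25\n' "$domain" "$nexthop"
  done
}

# demo: one transport line per domain
printf '%s\n' efa-project.com efa-project.org | to_transport mail.example.net
```

The same function can sit at the end of whichever filtering pipeline is used, since it only reads one domain per line from standard input.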

If anybody can help here ... :P
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: need some help for a bash script ...

Post by shawniverson »

Code:

#!/bin/bash
INPUTFILE="allmydomains"
OUTPUTFILE="myupperdomains"

# sort and remove dupes, if any
sort -u $INPUTFILE > $OUTPUTFILE

# loop and remove subdomains
while read line; do
  sed -i "/^.\+$line$/d" $OUTPUTFILE
done < $OUTPUTFILE
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32
Contact:

Re: need some help for a bash script ...

Post by anti-spam »

Wow ... Many thanks shawniverson, I will give it a try and test it.
If everything works as I hope, I will share my resulting scripts!
Please allow me a few days.
Thank you
Regards
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: need some help for a bash script ...

Post by shawniverson »

I probably should have escaped all '.' in the domain names before running through sed...

I'm thinking this scenario could occur...

a.org
b.org
c.org
a.b.c.org
c.b.a.org
b.a.c.org
aabbc.org
ccbba.org
bbaac.org

Result:

a.org
b.org
c.org

Expected result:
a.org
b.org
c.org
aabbc.org
ccbba.org
bbaac.org

Fixing....
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: need some help for a bash script ...

Post by shawniverson »

Code:

#!/bin/bash

INPUTFILE="domains"
OUTPUTFILE="result"

# sort and remove dupes, if any
sort -u $INPUTFILE > $OUTPUTFILE

# loop and remove subdomains
while read line; do
  line=$(echo $line | sed 's/\./\\\./g')
  sed -i "/^.\+\.$line$/d" $OUTPUTFILE
done < $OUTPUTFILE
Fixes two problems...

1. escapes all dots in domain names so that they aren't wildcard characters
2. checks for a dot preceding the $line so that a.org doesn't match aa.org but will match a.a.org
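
To see both fixes at work, here is a small sketch that builds the same pattern for a single parent domain and tests a few names against it with grep. The helper name is_subdomain_of is hypothetical, and the `\+` operator assumes GNU grep, just as the script above assumes GNU sed:

```shell
#!/bin/bash
# Sketch: show which names the pattern built from "a.org" would delete.
# is_subdomain_of is a hypothetical helper, not part of the script above.
is_subdomain_of() {
  local parent=$1 name=$2 escaped
  # escape the dots so they match literally: a.org -> a\.org
  escaped=$(printf '%s\n' "$parent" | sed 's/\./\\./g')
  # same shape as the sed address: at least one character, a literal dot, then the parent
  printf '%s\n' "$name" | grep -q "^.\+\.$escaped\$"
}

for name in a.a.org aa.org a.org; do
  if is_subdomain_of a.org "$name"; then
    echo "$name: deleted"
  else
    echo "$name: kept"
  fi
done
```

Only a.a.org matches; aa.org has no dot in front of a.org, and a.org itself is too short to match, so the parent domain stays in the list.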
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: need some help for a bash script ...

Post by pdwalker »

Eeek. That's a bit ugly, reading your output and rewriting it at the same time in multiple passes.

I'm pretty sure there is a way we can use sed to do it in one pass using some of sed's more advanced options. I'll see if I can puzzle out the syntax required.
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: need some help for a bash script ...

Post by shawniverson »

ugly scripts need love too! :lol:
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: need some help for a bash script ...

Post by shawniverson »

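Code:

#!/bin/bash

INPUTFILE="domains"
TMPFILE="temp"
OUTPUTFILE="result"

# sort and remove dupes, if any
sort -u "$INPUTFILE" > "$TMPFILE"
cp -f "$TMPFILE" "$OUTPUTFILE"

# loop over the untouched copy and remove subdomains from the output
while IFS= read -r line; do
  # escape dots so they match literally instead of matching any character
  line=$(printf '%s\n' "$line" | sed 's/\./\\./g')
  # delete every name that ends with ".<domain>"
  sed -i "/^.\+\.$line$/d" "$OUTPUTFILE"
done < "$TMPFILE"

rm -f "$TMPFILE"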
pdwalker
Posts: 1553
Joined: 18 Mar 2015 09:16

Re: need some help for a bash script ...

Post by pdwalker »

That's much better.

Now we just have to replace your O(N^2) algorithm with an O(N) version. :lol:
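
As a sketch of what a cheaper version might look like: one sort followed by a single linear pass, which is O(N log N) overall rather than strictly O(N), but avoids running sed once per domain. The function name top_domains is invented for this example, and the '.' to '!' mapping is an added assumption so that a domain's subdomains sort directly after it even when hyphenated names are present:

```shell
#!/bin/bash
# Sketch: one sort plus one linear pass, instead of one sed run per domain.
# top_domains is a hypothetical name; input is one domain per line on stdin.
top_domains() {
  local kept="" name
  # reverse each line, map '.' to '!' (lowest-sorting character in use),
  # sort byte-wise without duplicates, then undo both transformations
  rev | tr '.' '!' | LC_ALL=C sort -u | tr '!' '.' | rev |
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    # skip names that end with ".<last kept domain>"
    if [ -n "$kept" ] && [[ $name == *".$kept" ]]; then
      continue
    fi
    printf '%s\n' "$name"
    kept=$name
  done
}

# demo with the tricky list from earlier in the thread
printf '%s\n' a.org b.org c.org a.b.c.org c.b.a.org b.a.c.org aabbc.org ccbba.org bbaac.org |
  top_domains
```

The output follows the reversed-string order rather than the input order, so pipe it through sort if a plain alphabetical list is wanted.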
Last edited by pdwalker on 29 Oct 2015 00:48, edited 1 time in total.
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

Hi,

Just want to let you know that, thanks to shawniverson, I now have a CPanel domains export script and an EFA import script, compatible with the post "How-to Prevent external sender spoofing to EFA":
viewtopic.php?f=14&t=1278
If someone wants to help us test them, I'm ready to share them.
Just PM me, and I will give you the instructions.
Once these scripts are fully tested and rock solid, I will share them here for everybody.
I want to tune these scripts further, so that they also work if a CPanel server is down or refuses the wget from the EFAs ;-)
Regards
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

hello,

Is really nobody interested in testing our domain import solution?
It seems to work fine so far ...
Let me know
Regards
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: need some help for a bash script ...

Post by shawniverson »

I will, and perhaps we could roll it into EFA itself?
anti-spam
Posts: 40
Joined: 06 Oct 2015 14:32

Re: need some help for a bash script ...

Post by anti-spam »

Well, yes, but it's a package of scripts ...
The "end user" will have to:

- copy and configure a script on every CPanel server
- configure every EFA to import the domain list and compile one list for Postfix plus one list for the internal sender access

...
It's not a plug-and-play setup, but it's not very complicated to set up if you follow the howto.
Regards
shawniverson
Posts: 3644
Joined: 13 Jan 2014 23:30
Location: Indianapolis, Indiana USA

Re: need some help for a bash script ...

Post by shawniverson »

May make for a good wiki article instead :D