Home > OS >  BASH Recursive Function
BASH Recursive Function

Time:01-08

I have a test site that performs a chain of 302 redirects. I'm looking to write a BASH script that will print the redirect URL from the response header from each page in the chain.

The following command will strip the URL from an individual response header and print it.

$ curl -Is http://iperformalotofredirects.com/ | sed -ne 's/Location: //' -e6p
http://iperformalotofredirects.com/firststep/

$ curl -Is http://iperformalotofredirects.com/firststep/ | sed -ne 's/Location: //' -e6p
http://iperformalotofredirects.com/nextstep/

I wrote a python script that functions properly, but for the sake of research, I'm also trying to do the same thing in a BASH script.

#!/bin/bash

geturl() {
  if [[ $1 == http* ]]
  then
    echo $1
    geturl $(curl -Is $1 | sed -ne 's/Location: //' -e6p)
  fi
}

geturl http://iperformalotofredirects.com/

The script comes up empty after the first iteration

$ ./curlit.sh
http://iperformalotofredirects.com/
http://iperformalotofredirects.com/firststep/

What I need is for the script to continue until exhaustion. My manual results show that more comes after http://iperformalotofredirects.com/firststep/ whereas my script is failing at achieving this.

Edit: Included @konsolebox's answer as reference

#!/bin/bash

main() {
    local location=$1

    while [[ $location == http* ]]; do
        echo "$location"
        location=$(curl -Is "$location" | awk '/Location: / { print $2 }')
    done
}

main "$@"

Copied and pasted this script identically. Ran it:

$ ./curlit.sh http://iperformalotofredirects.com/
http://iperformalotofredirects.com/
http://iperformalotofredirects.com/firststep/

What I expect to see:

$ ./curlit.sh http://iperformalotofredirects.com/
http://iperformalotofredirects.com/
http://iperformalotofredirects.com/firststep/
http://iperformalotofredirects.com/nextstep/
http://iperformalotofredirects.com/anotherstep/
http://iperformalotofredirects.com/moresteps/
http://iperformalotofredirects.com/finalstep/

Edit: fravadona requested the entire header

$ curl -Is http://iperformalotofredirects.com/
HTTP/1.1 302 Moved Temporarily
Server: nginx/1.18.0
Date: Fri, 07 Jan 2022 09:20:56 GMT
Content-Type: text/html
Content-Length: 145
Location: http://iperformalotofredirects.com/firststep/
Connection: keep-alive

$ curl -Is http://iperformalotofredirects.com/firststep/
HTTP/1.1 302 Moved Temporarily
Server: nginx/1.18.0
Date: Fri, 07 Jan 2022 09:20:56 GMT
Content-Type: text/html
Content-Length: 145
Location: http://iperformalotofredirects.com/nextstep/
Connection: keep-alive

CodePudding user response:

Try this code for debugging:

#!/bin/bash

main() {
    local location=$1

    while [[ $location == http* ]]; do 
        printf '%q\n' "$location"
        location=$(curl -Is "$location" | awk '/^Location: / { print $2 }')
    done
}

main "$@"

With that I found out that running main https://kernel.org outputs $'https://www.kernel.org/\r' which means that there may be a carriage return at the end of the location, which obviously is the possible cause of the problem. The fix is simple:

#!/bin/bash

main() {
    local location=$1

    while [[ $location == http* ]]; do 
        printf '%q\n' "$location"
        location=$(curl -Is "$location" | awk '/^Location: / { print $2 }')
        location=${location%$'\r'}
    done
}

main "$@"

CodePudding user response:

This works but with no recursion:

#!/bin/bash

main() {
    local location=$1

    while [[ $location == http* ]]; do
        echo "$location"
        location=$(curl -Is "$location" | awk '/Location: / { print $2 }')
    done
}

main "$@"

You can examine the responses you receive by changing the location=$(curl ... line to this:

response=$(curl -Is "$location")
printf "---- Response ----\n%s\n------------------\n" "$response"
location=$(awk '/Location: / { print $2 }' <<< "$response")
echo "Found location: $location"
  •  Tags:  
  • Related