I have a booking SaaS system on which thousands of merchants run their businesses.
To complete their bookings daily, I have a cron job that runs every 5 minutes. In this cron job, a scheduler API is called from the main goroutine but completed independently in another goroutine using an exec command. I suddenly started getting this error in my cron system:
runtime: goroutine stack exceeds 1000000000-byte limit
Here is my code structure:
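package cron

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"

	"gopkg.in/robfig/cron.v3"
	// The project's own models package is also imported; its path is not shown in the question.
)

// RunCron schedules SendBookingMail to run every 5 minutes.
func RunCron() {
	c := cron.New()
	c.AddFunc("@every 0h5m0s", SendBookingMail)
	c.Start()
}

func SendBookingMail() {
	// This function gets all merchants and issues a curl command to the API URL for each merchant; the function below is then executed.
}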
func sendMailCron() {
	// booking, testCids, smsPermission and smsKeys are defined elsewhere (not shown in the question).
	completeBkMailData := struct {
		Booking         models.Booking    `json:"booking"`
		TestCustomerIds []int             `json:"test_customer_ids"`
		SmsPermission   bool              `json:"sms_permission"`
		SmsKeys         map[string]string `json:"sms_keys"`
	}{
		booking,
		testCids,
		smsPermission,
		smsKeys,
	}
	b, err := json.Marshal(completeBkMailData)
	if err != nil {
		fmt.Println(err)
		return // do not post a payload that failed to marshal
	}
	jsonString := string(b)
	command := "https://example.com/booking-mail"
	StartCurlCommand(command, "POST", jsonString)
}
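// StartCurlCommand starts a curl process for the given URL and returns without
// waiting for it; a separate goroutine reaps the process when it exits.
func StartCurlCommand(url, reqType, jsonData string, headers ...string) error {
	var ip, userAgent, bearerToken string
	var cmd *exec.Cmd
	// Require all three optional headers: indexing headers[1] and headers[2]
	// after checking only len(headers) > 0 would panic on a shorter slice.
	if len(headers) >= 3 {
		ip = headers[0]
		userAgent = headers[1]
		bearerToken = headers[2]
	}
	// The code shown never uses these values; discard them so the snippet compiles.
	_, _, _ = ip, userAgent, bearerToken
	if reqType == "POST" {
		cmd = exec.Command("curl", "-H", "Connection: close", "--no-keepalive", "-H", "Content-Type: application/json", "-X", "POST", "-d", jsonData, url)
	} else {
		cmd = exec.Command("curl", "-H", "Connection: close", "--no-keepalive", url)
	}
	var out bytes.Buffer
	var stderr bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &stderr
	err := cmd.Start()
	if err == nil {
		// Wait in the background so the finished process is reaped.
		go func(cmd *exec.Cmd) {
			_ = cmd.Wait()
		}(cmd)
	}
	return err
}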
I have already searched for this error and found that it can be caused by some sort of recursion in the code, but I am not able to identify where it is. What is wrong here?
CodePudding user response:
"But I am not able to identify where it is."
That might be because the fix for issue 7181 has yet to be released. A comment on that issue reads:
"Any way to vote on this? Our dev team just spent a day trying to guess which part of our code might have been in the stack because our entire codebase was elided from the trace."
Commit 3a81338 was reverted, but testing with a Go toolchain patched with this commit would help: instead of printing massive stack traces during endless recursion, which spam users and aren't useful, it prints only the top and bottom 50 frames.
That would go a long way to help identify the root cause for the recursion.
The OP, Amandeep Kaur, confirms in the comments:
"There was recursion mentioned in the stack trace. I fixed that. Now the system is working fine."
CodePudding user response:
The message "goroutine stack exceeds 1000000000-byte limit" means that your program has infinite (or excessively deep) recursion. The call stack is a limited resource, so recursion should be used sparingly.
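The 1000000000-byte figure is the Go runtime's default maximum stack size per goroutine on 64-bit systems, and it can be changed with debug.SetMaxStack from runtime/debug. As a rough sketch, lowering it in a test or debug build makes runaway recursion fail sooner and with less memory consumed. The 64 MiB value and the helper name lowerStackLimit are illustrative assumptions, not part of the original code.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// lowerStackLimit (hypothetical helper) sets the maximum stack size a single
// goroutine may use and returns the previous limit in bytes.
func lowerStackLimit(bytes int) int {
	return debug.SetMaxStack(bytes)
}

func main() {
	// 64 MiB is an arbitrary example value chosen for illustration.
	prev := lowerStackLimit(64 << 20)
	fmt.Println("previous stack limit in bytes:", prev)
}
```

This does not fix the recursion; it only makes the failure cheaper to reproduce while looking for it.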
Example:
package main

func test(x int) int {
	return x + test(x+1)
}

func main() {
	test(1)
}
Running it prints a fatal error:
$ go run .
runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc020160398 stack=[0xc020160000, 0xc040160000]
fatal error: stack overflow
runtime stack:
runtime.throw(0x474d4b, 0xe)
	/usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.newstack()
	/usr/local/go/src/runtime/stack.go:1067 +0x78d
runtime.morestack()
	/usr/local/go/src/runtime/asm_amd64.s:449 +0x8f
goroutine 1 [running]:
main.test(0xffffdf, 0x0)
	/home/test/gtest/test.go:3 +0x50 fp=0xc0201603a8 sp=0xc0201603a0 pc=0x45dcd0
main.test(0xffffde, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc0201603c8 sp=0xc0201603a8 pc=0x45dcaf
main.test(0xffffdd, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc0201603e8 sp=0xc0201603c8 pc=0x45dcaf
main.test(0xffffdc, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc020160408 sp=0xc0201603e8 pc=0x45dcaf
main.test(0xffffdb, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc020160428 sp=0xc020160408 pc=0x45dcaf
main.test(0xffffda, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc020160448 sp=0xc020160428 pc=0x45dcaf
main.test(0xffffd9, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc020160468 sp=0xc020160448 pc=0x45dcaf
main.test(0xffffd8, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc020160488 sp=0xc020160468 pc=0x45dcaf
main.test(0xffffd7, 0x0)
	/home/test/gtest/test.go:4 +0x2f fp=0xc0201604a8 sp=0xc020160488 pc=0x45dcaf
. . .
From the goroutine trace we can see that the same frame repeats: test.go line 4, the recursive call back into the function declared on line 3. That should give us enough information to fix our code.
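Once the recursive function is identified, one possible fix is to give it a base case and, as a safety net, a depth limit. A stack overflow in Go is a fatal error that recover cannot catch, so the guard has to run before the stack is exhausted. The following is only a sketch under assumptions: sumFrom, maxDepth and errTooDeep are hypothetical names, and the limit of 10000 is arbitrary.

```go
package main

import (
	"errors"
	"fmt"
)

// maxDepth is an arbitrary illustrative limit, not a value from the original post.
const maxDepth = 10000

var errTooDeep = errors.New("recursion depth limit exceeded")

// sumFrom mirrors the shape of test() above, but it has a base case and
// carries a depth counter, so it returns an error instead of overflowing.
func sumFrom(x, limit, depth int) (int, error) {
	if depth > maxDepth {
		return 0, errTooDeep
	}
	if x > limit { // base case that the original test() lacks
		return 0, nil
	}
	rest, err := sumFrom(x+1, limit, depth+1)
	if err != nil {
		return 0, err
	}
	return x + rest, nil
}

func main() {
	total, err := sumFrom(1, 100, 0)
	fmt.Println(total, err) // 5050 <nil>

	_, err = sumFrom(1, 1_000_000, 0)
	fmt.Println(err) // recursion depth limit exceeded
}
```

When the depth is not naturally bounded, rewriting the function as a loop avoids the problem entirely.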
