Another day another incident #02
Today's incident is all about Go context.
TL;DR: the context got canceled when it shouldn't have been.
The problem
Imagine a chain of APIs:
- Calling API A
- Calling API B
Normally, if API A fails, API B should not be called. But what if API A is optional? Whether it succeeds or fails, API B should be called anyway.
My buggy code looked like this:
if err := doA(ctx); err != nil {
	log.Error(err)
	// Skip error
}
doB(ctx)
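The problem is that doA takes too long and uses up most of the time budget of ctx. Cancellation only flows from a parent context down to its children, so doA timing out on its own child context does not cancel ctx by itself. But doB is called with the same ctx, and once the deadline of ctx passes, doB is canceled too (not what we want, but sadly that is what we got).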
Example buggy code (The Go Playground):
func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	doA(ctx)
	doB(ctx)
}

func doA(ctx context.Context) {
	ctx, ctxCancel := context.WithTimeout(ctx, 1*time.Second)
	defer ctxCancel()
	select {
	case <-time.After(2 * time.Second):
		fmt.Println("doA")
	case <-ctx.Done():
		fmt.Println("doA", ctx.Err())
	}
}

func doB(ctx context.Context) {
	ctx, ctxCancel := context.WithTimeout(ctx, 3*time.Second)
	defer ctxCancel()
	select {
	case <-time.After(2 * time.Second):
		fmt.Println("doB")
	case <-ctx.Done():
		fmt.Println("doB", ctx.Err())
	}
}
The output is:
doA context deadline exceeded
doB context deadline exceeded
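As you can see, both doA and doB are canceled: doA by its own 1-second timeout, and doB by the 2-second deadline it inherits from main, of which only about 1 second is left when doB starts.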
The (temporary) solution
A quick Google search led me to context: add WithoutCancel #40221, and I quote:
This is useful in multiple frequently recurring and important scenarios:
- Handling of rollback/cleanup operations in the context of an event (e.g., HTTP request) that has to continue regardless of whether the triggering event is canceled (e.g., due to timeout or the client going away)
- Handling of long-running operations triggered by an event (e.g., HTTP request) that terminates before the termination of the long-running operation
So besides waiting to upgrade to Go 1.21 to use context.WithoutCancel, you can use this workaround code:
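// DisconnectContext returns a context that keeps the values of parent
// but is never canceled and has no deadline.
func DisconnectContext(parent context.Context) context.Context {
	if parent == nil {
		return context.Background()
	}
	return disconnectedContext{
		parent: parent,
	}
}

type disconnectedContext struct {
	parent context.Context
}

// Deadline reports no deadline (zero time, ok == false).
func (ctx disconnectedContext) Deadline() (deadline time.Time, ok bool) {
	return
}

// Done returns a nil channel, which never becomes ready in a select.
func (ctx disconnectedContext) Done() <-chan struct{} {
	return nil
}

// Err is always nil because this context is never canceled.
func (ctx disconnectedContext) Err() error {
	return nil
}

// Value still looks up values in the parent.
func (ctx disconnectedContext) Value(key any) any {
	return ctx.parent.Value(key)
}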
So the buggy code becomes (The Go Playground):
func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	doA(ctx)
	doB(ctx)
}

func doA(ctx context.Context) {
	ctx, ctxCancel := context.WithTimeout(ctx, 1*time.Second)
	defer ctxCancel()
	select {
	case <-time.After(2 * time.Second):
		fmt.Println("doA")
	case <-ctx.Done():
		fmt.Println("doA", ctx.Err())
	}
}

func doB(ctx context.Context) {
	ctx, ctxCancel := context.WithTimeout(DisconnectContext(ctx), 3*time.Second)
	defer ctxCancel()
	select {
	case <-time.After(2 * time.Second):
		fmt.Println("doB")
	case <-ctx.Done():
		fmt.Println("doB", ctx.Err())
	}
}
The output is:
doA context deadline exceeded
doB
As you can see, only doA is canceled; doB completes normally. And that is what we want in this case.