1# GoKer
2
3The following examples are obtained from the publication
4"GoBench: A Benchmark Suite of Real-World Go Concurrency Bugs"
5(doi:10.1109/CGO51591.2021.9370317).
6
7**Authors**
8Ting Yuan (yuanting@ict.ac.cn):
9 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences,
10 University of Chinese Academy of Sciences, Beijing, China;
11Guangwei Li (liguangwei@ict.ac.cn):
12 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences,
13 University of Chinese Academy of Sciences, Beijing, China;
14Jie Lu† (lujie@ict.ac.an):
15 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences;
16Chen Liu (liuchen17z@ict.ac.cn):
17 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences,
18 University of Chinese Academy of Sciences, Beijing, China
19Lian Li (lianli@ict.ac.cn):
20 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences,
21 University of Chinese Academy of Sciences, Beijing, China;
22Jingling Xue (jingling@cse.unsw.edu.au):
23 University of New South Wales, School of Computer Science and Engineering, Sydney, Australia
24
25White paper: https://lujie.ac.cn/files/papers/GoBench.pdf
26
27The examples have been modified in order to run the goroutine leak profiler.
28Buggy snippets are moved from within a unit test to separate applications.
29Each is then independently executed, possibly as multiple copies within the
30same application in order to exercise more interleavings. Concurrently, the
31main program sets up a waiting period (typically 1ms), followed by a goroutine
32leak profile request. Other modifications may involve injecting calls to
33`runtime.Gosched()`, to more reliably exercise buggy interleavings, or reductions
34in waiting periods when calling `time.Sleep`, in order to reduce overall testing
35time.
36
37The resulting goroutine leak profile is analyzed to ensure that no unexpecte
38leaks occurred, and that the expected leaks did occur. If the leak is flaky, the
39only purpose of the expected leak list is to protect against unexpected leaks.
40
41The examples have also been modified to remove data races, since those create flaky
42test failures, when really all we care about are leaked goroutines.
43
44The entries below document each of the corresponding leaks.
45
46## Cockroach/10214
47
48| Bug ID | Ref | Patch | Type | Sub-type |
49| ---- | ---- | ---- | ---- | ---- |
50|[cockroach#10214]|[pull request]|[patch]| Resource | AB-BA leak |
51
52[cockroach#10214]:(cockroach10214_test.go)
53[patch]:https://github.com/cockroachdb/cockroach/pull/10214/files
54[pull request]:https://github.com/cockroachdb/cockroach/pull/10214
55
56### Description
57
58This goroutine leak is caused by different order when acquiring
59coalescedMu.Lock() and raftMu.Lock(). The fix is to refactor sendQueuedHeartbeats()
60so that cockroachdb can unlock coalescedMu before locking raftMu.
61
62### Example execution
63
64```go
65G1 G2
66------------------------------------------------------------------------------------
67s.sendQueuedHeartbeats() .
68s.coalescedMu.Lock() [L1] .
69s.sendQueuedHeartbeatsToNode() .
70s.mu.replicas[0].reportUnreachable() .
71s.mu.replicas[0].raftMu.Lock() [L2] .
72. s.mu.replicas[0].tick()
73. s.mu.replicas[0].raftMu.Lock() [L2]
74. s.mu.replicas[0].tickRaftMuLocked()
75. s.mu.replicas[0].mu.Lock() [L3]
76. s.mu.replicas[0].maybeQuiesceLocked()
77. s.mu.replicas[0].maybeCoalesceHeartbeat()
78. s.coalescedMu.Lock() [L1]
79--------------------------------G1,G2 leak------------------------------------------
80```
81
82## Cockroach/1055
83
84| Bug ID | Ref | Patch | Type | Sub-type |
85| ---- | ---- | ---- | ---- | ---- |
86|[cockroach#1055]|[pull request]|[patch]| Mixed | Channel & WaitGroup |
87
88[cockroach#1055]:(cockroach1055_test.go)
89[patch]:https://github.com/cockroachdb/cockroach/pull/1055/files
90[pull request]:https://github.com/cockroachdb/cockroach/pull/1055
91
92### Description
93
941. `Stop()` is called and blocked at `s.stop.Wait()` after acquiring the lock.
952. `StartTask()` is called and attempts to acquire the lock. It is then blocked.
963. `Stop()` never finishes since the task doesn't call SetStopped.
97
98### Example execution
99
100```go
101G1 G2.0 G2.1 G2.2 G3
102-------------------------------------------------------------------------------------------------------------------------------
103s[0].stop.Add(1) [1]
104go func() [G2.0]
105s[1].stop.Add(1) [1] .
106go func() [G2.1] .
107s[2].stop.Add(1) [1] . .
108go func() [G2.2] . .
109go func() [G3] . . .
110<-done . . . .
111. s[0].StartTask() . . .
112. s[0].draining == 0 . . .
113. . s[1].StartTask() . .
114. . s[1].draining == 0 . .
115. . . s[2].StartTask() .
116. . . s[2].draining == 0 .
117. . . . s[0].Quiesce()
118. . . . s[0].mu.Lock() [L1[0]]
119. s[0].mu.Lock() [L1[0]] . . .
120. s[0].drain.Add(1) [1] . . .
121. s[0].mu.Unlock() [L1[0]] . . .
122. <-s[0].ShouldStop() . . .
123. . . . s[0].draining = 1
124. . . . s[0].drain.Wait()
125. . s[0].mu.Lock() [L1[1]] . .
126. . s[1].drain.Add(1) [1] . .
127. . s[1].mu.Unlock() [L1[1]] . .
128. . <-s[1].ShouldStop() . .
129. . . s[2].mu.Lock() [L1[2]] .
130. . . s[2].drain.Add() [1] .
131. . . s[2].mu.Unlock() [L1[2]] .
132. . . <-s[2].ShouldStop() .
133----------------------------------------------------G1, G2.[0..2], G3 leak-----------------------------------------------------
134```
135
136## Cockroach/10790
137
138| Bug ID | Ref | Patch | Type | Sub-type |
139| ---- | ---- | ---- | ---- | ---- |
140|[cockroach#10790]|[pull request]|[patch]| Communication | Channel & Context |
141
142[cockroach#10790]:(cockroach10790_test.go)
143[patch]:https://github.com/cockroachdb/cockroach/pull/10790/files
144[pull request]:https://github.com/cockroachdb/cockroach/pull/10790
145
146### Description
147
148It is possible that a message from `ctxDone` will make `beginCmds`
149return without draining the channel `ch`, so that anonymous function
150goroutines will leak.
151
152### Example execution
153
154```go
155G1 G2 helper goroutine
156-----------------------------------------------------
157. . r.sendChans()
158r.beginCmds() . .
159. . ch1 <- true
160<- ch1 . .
161. . ch2 <- true
162...
163. cancel()
164<- ch1
165------------------G1 leak----------------------------
166```
167
168## Cockroach/13197
169
170| Bug ID | Ref | Patch | Type | Sub-type |
171| ---- | ---- | ---- | ---- | ---- |
172|[cockroach#13197]|[pull request]|[patch]| Communication | Channel & Context |
173
174[cockroach#13197]:(cockroach13197_test.go)
175[patch]:https://github.com/cockroachdb/cockroach/pull/13197/files
176[pull request]:https://github.com/cockroachdb/cockroach/pull/13197
177
178### Description
179
180One goroutine executing `(*Tx).awaitDone()` blocks and
181waiting for a signal `context.Done()`.
182
183### Example execution
184
185```go
186G1 G2
187-------------------------------
188begin()
189. awaitDone()
190return .
191. <-tx.ctx.Done()
192-----------G2 leaks------------
193```
194
195## Cockroach/13755
196
197| Bug ID | Ref | Patch | Type | Sub-type |
198| ---- | ---- | ---- | ---- | ---- |
199|[cockroach#13755]|[pull request]|[patch]| Communication | Channel & Context |
200
201[cockroach#13755]:(cockroach13755_test.go)
202[patch]:https://github.com/cockroachdb/cockroach/pull/13755/files
203[pull request]:https://github.com/cockroachdb/cockroach/pull/13755
204
205### Description
206
207The buggy code does not close the db query result (rows),
208so that one goroutine running `(*Rows).awaitDone` is blocked forever.
209The blocking goroutine is waiting for cancel signal from context.
210
211### Example execution
212
213```go
214G1 G2
215---------------------------------------
216initContextClose()
217. awaitDone()
218return .
219. <-tx.ctx.Done()
220---------------G2 leaks----------------
221```
222
223## Cockroach/1462
224
225| Bug ID | Ref | Patch | Type | Sub-type |
226| ---- | ---- | ---- | ---- | ---- |
227|[cockroach#1462]|[pull request]|[patch]| Mixed | Channel & WaitGroup |
228
229[cockroach#1462]:(cockroach1462_test.go)
230[patch]:https://github.com/cockroachdb/cockroach/pull/1462/files
231[pull request]:https://github.com/cockroachdb/cockroach/pull/1462
232
233### Description
234
235Executing `<-stopper.ShouldStop()` in `processEventsUntil` may cause
236goroutines created by `lt.RunWorker` in `lt.start` to be stuck sending
237a message over `lt.Events`. The main thread is then stuck at `s.stop.Wait()`,
238since the sender goroutines cannot call `s.stop.Done()`.
239
240### Example execution
241
242```go
243G1 G2 G3
244-------------------------------------------------------------------------------------------------------
245NewLocalInterceptableTransport()
246lt.start()
247lt.stopper.RunWorker()
248s.AddWorker()
249s.stop.Add(1) [1]
250go func() [G2]
251stopper.RunWorker() .
252s.AddWorker() .
253s.stop.Add(1) [2] .
254go func() [G3] .
255s.Stop() . .
256s.Quiesce() . .
257. select [default] .
258. lt.Events <- interceptMessage(0) .
259close(s.stopper) . .
260. . select [<-stopper.ShouldStop()]
261. . <<<done>>>
262s.stop.Wait() .
263----------------------------------------------G1,G2 leak-----------------------------------------------
264```
265
266## Cockroach/16167
267
268| Bug ID | Ref | Patch | Type | Sub-type |
269| ---- | ---- | ---- | ---- | ---- |
270|[cockroach#16167]|[pull request]|[patch]| Resource | Double Locking |
271
272[cockroach#16167]:(cockroach16167_test.go)
273[patch]:https://github.com/cockroachdb/cockroach/pull/16167/files
274[pull request]:https://github.com/cockroachdb/cockroach/pull/16167
275
276### Description
277
278This is another example of goroutine leaks caused by recursively
279acquiring `RWLock`.
280There are two lock variables (`systemConfigCond` and `systemConfigMu`)
281which refer to the same underlying lock. The leak invovlves two goroutines.
282The first acquires `systemConfigMu.Lock()`, then tries to acquire `systemConfigMu.RLock()`.
283The second acquires `systemConfigMu.Lock()`.
284If the second goroutine interleaves in between the two lock operations of the
285first goroutine, both goroutines will leak.
286
287### Example execution
288
289```go
290G1 G2
291---------------------------------------------------------------
292. e.Start()
293. e.updateSystemConfig()
294e.execParsed() .
295e.systemConfigCond.L.Lock() [L1] .
296. e.systemConfigMu.Lock() [L1]
297e.systemConfigMu.RLock() [L1] .
298------------------------G1,G2 leak-----------------------------
299```
300
301## Cockroach/18101
302
303| Bug ID | Ref | Patch | Type | Sub-type |
304| ---- | ---- | ---- | ---- | ---- |
305|[cockroach#18101]|[pull request]|[patch]| Resource | Double Locking |
306
307[cockroach#18101]:(cockroach18101_test.go)
308[patch]:https://github.com/cockroachdb/cockroach/pull/18101/files
309[pull request]:https://github.com/cockroachdb/cockroach/pull/18101
310
311### Description
312
313The `context.Done()` signal short-circuits the reader goroutine, but not
314the senders, leading them to leak.
315
316### Example execution
317
318```go
319G1 G2 helper goroutine
320--------------------------------------------------------------
321restore()
322. splitAndScatter()
323<-readyForImportCh .
324<-readyForImportCh <==> readyForImportCh<-
325...
326. . cancel()
327<<done>> . <<done>>
328 readyForImportCh<-
329-----------------------G2 leaks--------------------------------
330```
331
332## Cockroach/2448
333
334| Bug ID | Ref | Patch | Type | Sub-type |
335| ---- | ---- | ---- | ---- | ---- |
336|[cockroach#2448]|[pull request]|[patch]| Communication | Channel |
337
338[cockroach#2448]:(cockroach2448_test.go)
339[patch]:https://github.com/cockroachdb/cockroach/pull/2448/files
340[pull request]:https://github.com/cockroachdb/cockroach/pull/2448
341
342### Description
343
344This bug is caused by two goroutines waiting for each other
345to unblock their channels:
346
3471) `MultiRaft` sends the commit event for the Membership change
3482) `store.processRaft` takes it and begins processing
3493) another command commits and triggers another `sendEvent`, but
350 this blocks since `store.processRaft` isn't ready for another
351 `select`. Consequently the main `MultiRaft` loop is waiting for
352 that as well.
3534) the `Membership` change was applied to the range, and the store
354 now tries to execute the callback
3555) the callback tries to write to `callbackChan`, but that is
356 consumed by the `MultiRaft` loop, which is currently waiting
357 for `store.processRaft` to consume from the events channel,
358 which it will only do after the callback has completed.
359
360### Example execution
361
362```go
363G1 G2
364--------------------------------------------------------------------------
365s.processRaft() st.start()
366select .
367. select [default]
368. s.handleWriteResponse()
369. s.sendEvent()
370. select
371<-s.multiraft.Events <----> m.Events <- event
372. select [default]
373. s.handleWriteResponse()
374. s.sendEvent()
375. select [m.Events<-, <-s.stopper.ShouldStop()]
376callback() .
377select [
378 m.callbackChan<-,
379 <-s.stopper.ShouldStop()
380] .
381------------------------------G1,G2 leak----------------------------------
382```
383
384## Cockroach/24808
385
386| Bug ID | Ref | Patch | Type | Sub-type |
387| ---- | ---- | ---- | ---- | ---- |
388|[cockroach#24808]|[pull request]|[patch]| Communication | Channel |
389
390[cockroach#24808]:(cockroach24808_test.go)
391[patch]:https://github.com/cockroachdb/cockroach/pull/24808/files
392[pull request]:https://github.com/cockroachdb/cockroach/pull/24808
393
394### Description
395
396When we `Start` the `Compactor`, it may already have received
397`Suggestions`, leaking the previously blocking write to a full channel.
398
399### Example execution
400
401```go
402G1
403------------------------------------------------
404...
405compactor.ch <-
406compactor.Start()
407compactor.ch <-
408--------------------G1 leaks--------------------
409```
410
411## Cockroach/25456
412
413| Bug ID | Ref | Patch | Type | Sub-type |
414| ---- | ---- | ---- | ---- | ---- |
415|[cockroach#25456]|[pull request]|[patch]| Communication | Channel |
416
417[cockroach#25456]:(cockroach25456_test.go)
418[patch]:https://github.com/cockroachdb/cockroach/pull/25456/files
419[pull request]:https://github.com/cockroachdb/cockroach/pull/25456
420
421### Description
422
423When `CheckConsistency` (in the complete code) returns an error, the queue
424checks whether the store is draining to decide whether the error is worth
425logging. This check was incorrect and would block until the store actually
426started draining.
427
428### Example execution
429
430```go
431G1
432---------------------------------------
433...
434<-repl.store.Stopper().ShouldQuiesce()
435---------------G1 leaks----------------
436```
437
438## Cockroach/35073
439
440| Bug ID | Ref | Patch | Type | Sub-type |
441| ---- | ---- | ---- | ---- | ---- |
442|[cockroach#35073]|[pull request]|[patch]| Communication | Channel |
443
444[cockroach#35073]:(cockroach35073_test.go)
445[patch]:https://github.com/cockroachdb/cockroach/pull/35073/files
446[pull request]:https://github.com/cockroachdb/cockroach/pull/35073
447
448### Description
449
450Previously, the outbox could fail during startup without closing its
451`RowChannel`. This could lead to goroutine leaks in rare cases due
452to channel communication mismatch.
453
454### Example execution
455
456## Cockroach/35931
457
458| Bug ID | Ref | Patch | Type | Sub-type |
459| ---- | ---- | ---- | ---- | ---- |
460|[cockroach#35931]|[pull request]|[patch]| Communication | Channel |
461
462[cockroach#35931]:(cockroach35931_test.go)
463[patch]:https://github.com/cockroachdb/cockroach/pull/35931/files
464[pull request]:https://github.com/cockroachdb/cockroach/pull/35931
465
466### Description
467
468Previously, if a processor that reads from multiple inputs was waiting
469on one input to provide more data, and the other input was full, and
470both inputs were connected to inbound streams, it was possible to
471cause goroutine leaks during flow cancellation when trying to propagate
472the cancellation metadata messages into the flow. The cancellation method
473wrote metadata messages to each inbound stream one at a time, so if the
474first one was full, the canceller would block and never send a cancellation
475message to the second stream, which was the one actually being read from.
476
477### Example execution
478
479## Cockroach/3710
480
481| Bug ID | Ref | Patch | Type | Sub-type |
482| ---- | ---- | ---- | ---- | ---- |
483|[cockroach#3710]|[pull request]|[patch]| Resource | RWR Deadlock |
484
485[cockroach#3710]:(cockroach3710_test.go)
486[patch]:https://github.com/cockroachdb/cockroach/pull/3710/files
487[pull request]:https://github.com/cockroachdb/cockroach/pull/3710
488
489### Description
490
491The goroutine leak is caused by acquiring a RLock twice in a call chain.
492`ForceRaftLogScanAndProcess(acquire s.mu.RLock())`
493`-> MaybeAdd()`
494`-> shouldQueue()`
495`-> getTruncatableIndexes()`
496`->RaftStatus(acquire s.mu.Rlock())`
497
498### Example execution
499
500```go
501G1 G2
502------------------------------------------------------------
503store.ForceRaftLogScanAndProcess()
504s.mu.RLock()
505s.raftLogQueue.MaybeAdd()
506bq.impl.shouldQueue()
507getTruncatableIndexes()
508r.store.RaftStatus()
509. store.processRaft()
510. s.mu.Lock()
511s.mu.RLock()
512----------------------G1,G2 leak-----------------------------
513```
514
515## Cockroach/584
516
517| Bug ID | Ref | Patch | Type | Sub-type |
518| ---- | ---- | ---- | ---- | ---- |
519|[cockroach#584]|[pull request]|[patch]| Resource | Double Locking |
520
521[cockroach#584]:(cockroach584_test.go)
522[patch]:https://github.com/cockroachdb/cockroach/pull/584/files
523[pull request]:https://github.com/cockroachdb/cockroach/pull/584
524
525### Description
526
527Missing call to `mu.Unlock()` before the `break` in the loop.
528
529### Example execution
530
531```go
532G1
533---------------------------
534g.bootstrap()
535g.mu.Lock() [L1]
536if g.closed { ==> break
537g.manage()
538g.mu.Lock() [L1]
539----------G1 leaks---------
540```
541
542## Cockroach/6181
543
544| Bug ID | Ref | Patch | Type | Sub-type |
545| ---- | ---- | ---- | ---- | ---- |
546|[cockroach#6181]|[pull request]|[patch]| Resource | RWR Deadlock |
547
548[cockroach#6181]:(cockroach6181_test.go)
549[patch]:https://github.com/cockroachdb/cockroach/pull/6181/files
550[pull request]:https://github.com/cockroachdb/cockroach/pull/6181
551
552### Description
553
554The same `RWMutex` may be recursively acquired for both reading and writing.
555
556### Example execution
557
558```go
559G1 G2 G3 ...
560-----------------------------------------------------------------------------------------------
561testRangeCacheCoalescedRquests()
562initTestDescriptorDB()
563pauseLookupResumeAndAssert()
564return
565. doLookupWithToken()
566. . doLookupWithToken()
567. rc.LookupRangeDescriptor() .
568. . rc.LookupRangeDescriptor()
569. rdc.rangeCacheMu.RLock() .
570. rdc.String() .
571. . rdc.rangeCacheMu.RLock()
572. . fmt.Printf()
573. . rdc.rangeCacheMu.RUnlock()
574. . rdc.rangeCacheMu.Lock()
575. rdc.rangeCacheMu.RLock() .
576-----------------------------------G2,G3,... leak----------------------------------------------
577```
578
579## Cockroach/7504
580
581| Bug ID | Ref | Patch | Type | Sub-type |
582| ---- | ---- | ---- | ---- | ---- |
583|[cockroach#7504]|[pull request]|[patch]| Resource | AB-BA Deadlock |
584
585[cockroach#7504]:(cockroach7504_test.go)
586[patch]:https://github.com/cockroachdb/cockroach/pull/7504/files
587[pull request]:https://github.com/cockroachdb/cockroach/pull/7504
588
589### Description
590
591The locks are acquired as `leaseState` and `tableNameCache` in `Release()`, but
592as `tableNameCache` and `leaseState` in `AcquireByName`, leading to an AB-BA deadlock.
593
594### Example execution
595
596```go
597G1 G2
598-----------------------------------------------------
599mgr.AcquireByName() mgr.Release()
600m.tableNames.get(id) .
601c.mu.Lock() [L2] .
602. t.release(lease)
603. t.mu.Lock() [L3]
604. s.mu.Lock() [L1]
605lease.mu.Lock() [L1] .
606. t.removeLease(s)
607. t.tableNameCache.remove()
608. c.mu.Lock() [L2]
609---------------------G1, G2 leak---------------------
610```
611
612## Cockroach/9935
613
614| Bug ID | Ref | Patch | Type | Sub-type |
615| ---- | ---- | ---- | ---- | ---- |
616|[cockroach#9935]|[pull request]|[patch]| Resource | Double Locking |
617
618[cockroach#9935]:(cockroach9935_test.go)
619[patch]:https://github.com/cockroachdb/cockroach/pull/9935/files
620[pull request]:https://github.com/cockroachdb/cockroach/pull/9935
621
622### Description
623
624This bug is caused by acquiring `l.mu.Lock()` twice.
625
626### Example execution
627
628## Etcd/10492
629
630| Bug ID | Ref | Patch | Type | Sub-type |
631| ---- | ---- | ---- | ---- | ---- |
632|[etcd#10492]|[pull request]|[patch]| Resource | Double locking |
633
634[etcd#10492]:(etcd10492_test.go)
635[patch]:https://github.com/etcd-io/etcd/pull/10492/files
636[pull request]:https://github.com/etcd-io/etcd/pull/10492
637
638### Description
639
640A simple double locking case for lines 19, 31.
641
642## Etcd/5509
643
644| Bug ID | Ref | Patch | Type | Sub-type |
645| ---- | ---- | ---- | ---- | ---- |
646|[etcd#5509]|[pull request]|[patch]| Resource | Double locking |
647
648[etcd#5509]:(etcd5509_test.go)
649[patch]:https://github.com/etcd-io/etcd/pull/5509/files
650[pull request]:https://github.com/etcd-io/etcd/pull/5509
651
652### Description
653
654`r.acquire()` returns holding `r.client.mu.RLock()` on a failure path (line 42).
655This causes any call to `client.Close()` to leak goroutines.
656
657## Etcd/6708
658
659| Bug ID | Ref | Patch | Type | Sub-type |
660| ---- | ---- | ---- | ---- | ---- |
661|[etcd#6708]|[pull request]|[patch]| Resource | Double locking |
662
663[etcd#6708]:(etcd6708_test.go)
664[patch]:https://github.com/etcd-io/etcd/pull/6708/files
665[pull request]:https://github.com/etcd-io/etcd/pull/6708
666
667### Description
668
669Line 54, 49 double locking
670
671## Etcd/6857
672
673| Bug ID | Ref | Patch | Type | Sub-type |
674| ---- | ---- | ---- | ---- | ---- |
675|[etcd#6857]|[pull request]|[patch]| Communication | Channel |
676
677[etcd#6857]:(etcd6857_test.go)
678[patch]:https://github.com/etcd-io/etcd/pull/6857/files
679[pull request]:https://github.com/etcd-io/etcd/pull/6857
680
681### Description
682
683Choosing a different case in a `select` statement (`n.stop`) will
684lead to goroutine leaks when sending over `n.status`.
685
686### Example execution
687
688```go
689G1 G2 G3
690-------------------------------------------
691n.run() . .
692. . n.Stop()
693. . n.stop<-
694<-n.stop . .
695. . <-n.done
696close(n.done) . .
697return . .
698. . return
699. n.Status()
700. n.status<-
701----------------G2 leaks-------------------
702```
703
704## Etcd/6873
705
706| Bug ID | Ref | Patch | Type | Sub-type |
707| ---- | ---- | ---- | ---- | ---- |
708|[etcd#6873]|[pull request]|[patch]| Mixed | Channel & Lock |
709
710[etcd#6873]:(etcd6873_test.go)
711[patch]:https://github.com/etcd-io/etcd/pull/6873/files
712[pull request]:https://github.com/etcd-io/etcd/pull/6873
713
714### Description
715
716This goroutine leak involves a goroutine acquiring a lock and being
717blocked over a channel operation with no partner, while another tries
718to acquire the same lock.
719
720### Example execution
721
722```go
723G1 G2 G3
724--------------------------------------------------------------
725newWatchBroadcasts()
726wbs.update()
727wbs.updatec <-
728return
729. <-wbs.updatec .
730. wbs.coalesce() .
731. . wbs.stop()
732. . wbs.mu.Lock()
733. . close(wbs.updatec)
734. . <-wbs.donec
735. wbs.mu.Lock() .
736---------------------G2,G3 leak--------------------------------
737```
738
739## Etcd/7492
740
741| Bug ID | Ref | Patch | Type | Sub-type |
742| ---- | ---- | ---- | ---- | ---- |
743|[etcd#7492]|[pull request]|[patch]| Mixed | Channel & Lock |
744
745[etcd#7492]:(etcd7492_test.go)
746[patch]:https://github.com/etcd-io/etcd/pull/7492/files
747[pull request]:https://github.com/etcd-io/etcd/pull/7492
748
749### Description
750
751This goroutine leak involves a goroutine acquiring a lock and being
752blocked over a channel operation with no partner, while another tries
753to acquire the same lock.
754
755### Example execution
756
757```go
758G2 G1
759---------------------------------------------------------------
760. stk.run()
761ts.assignSimpleTokenToUser() .
762t.simpleTokensMu.Lock() .
763t.simpleTokenKeeper.addSimpleToken() .
764tm.addSimpleTokenCh <- true .
765. <-tm.addSimpleTokenCh
766t.simpleTokensMu.Unlock() .
767ts.assignSimpleTokenToUser() .
768...
769t.simpleTokensMu.Lock()
770. <-tokenTicker.C
771tm.addSimpleTokenCh <- true .
772. tm.deleteTokenFunc()
773. t.simpleTokensMu.Lock()
774---------------------------G1,G2 leak--------------------------
775```
776
777## Etcd/7902
778
779| Bug ID | Ref | Patch | Type | Sub-type |
780| ---- | ---- | ---- | ---- | ---- |
781|[etcd#7902]|[pull request]|[patch]| Mixed | Channel & Lock |
782
783[etcd#7902]:(etcd7902_test.go)
784[patch]:https://github.com/etcd-io/etcd/pull/7902/files
785[pull request]:https://github.com/etcd-io/etcd/pull/7902
786
787### Description
788
789If the follower gooroutine acquires `mu.Lock()` first and calls
790`rc.release()`, it will be blocked sending over `rcNextc`.
791Only the leader can `close(nextc)` to unblock the follower.
792However, in order to invoke `rc.release()`, the leader needs
793to acquires `mu.Lock()`.
794The fix is to remove the lock and unlock around `rc.release()`.
795
796### Example execution
797
798```go
799G1 G2 (leader) G3 (follower)
800---------------------------------------------------------------------
801runElectionFunc()
802doRounds()
803wg.Wait()
804. ...
805. mu.Lock()
806. rc.validate()
807. rcNextc = nextc
808. mu.Unlock() ...
809. . mu.Lock()
810. . rc.validate()
811. . mu.Unlock()
812. . mu.Lock()
813. . rc.release()
814. . <-rcNextc
815. mu.Lock()
816-------------------------G1,G2,G3 leak--------------------------
817```
818
819## Grpc/1275
820
821| Bug ID | Ref | Patch | Type | Sub-type |
822| ---- | ---- | ---- | ---- | ---- |
823|[grpc#1275]|[pull request]|[patch]| Communication | Channel |
824
825[grpc#1275]:(grpc1275_test.go)
826[patch]:https://github.com/grpc/grpc-go/pull/1275/files
827[pull request]:https://github.com/grpc/grpc-go/pull/1275
828
829### Description
830
831Two goroutines are involved in this leak. The main goroutine
832is blocked at `case <- donec`, and is waiting for the second goroutine
833to close the channel.
834The second goroutine is created by the main goroutine. It is blocked
835when calling `stream.Read()`, which invokes `recvBufferRead.Read()`.
836The second goroutine is blocked at case `i := r.recv.get()`, and it is
837waiting for someone to send a message to this channel.
838It is the `client.CloseSream()` method called by the main goroutine that
839should send the message, but it is not. The patch is to send out this message.
840
841### Example execution
842
843```go
844G1 G2
845-----------------------------------------------------
846testInflightStreamClosing()
847. stream.Read()
848. io.ReadFull()
849. <-r.recv.get()
850CloseStream()
851<-donec
852---------------------G1, G2 leak---------------------
853```
854
855## Grpc/1424
856
857| Bug ID | Ref | Patch | Type | Sub-type |
858| ---- | ---- | ---- | ---- | ---- |
859|[grpc#1424]|[pull request]|[patch]| Communication | Channel |
860
861[grpc#1424]:(grpc1424_test.go)
862[patch]:https://github.com/grpc/grpc-go/pull/1424/files
863[pull request]:https://github.com/grpc/grpc-go/pull/1424
864
865### Description
866
867The goroutine running `cc.lbWatcher` returns without
868draining the `done` channel.
869
870### Example execution
871
872```go
873G1 G2 G3
874-----------------------------------------------------------------
875DialContext() . .
876. cc.dopts.balancer.Notify() .
877. . cc.lbWatcher()
878. <-doneChan
879close()
880---------------------------G2 leaks-------------------------------
881```
882
883## Grpc/1460
884
885| Bug ID | Ref | Patch | Type | Sub-type |
886| ---- | ---- | ---- | ---- | ---- |
887|[grpc#1460]|[pull request]|[patch]| Mixed | Channel & Lock |
888
889[grpc#1460]:(grpc1460_test.go)
890[patch]:https://github.com/grpc/grpc-go/pull/1460/files
891[pull request]:https://github.com/grpc/grpc-go/pull/1460
892
893### Description
894
895When gRPC keepalives are enabled (which isn't the case
896by default at this time) and PermitWithoutStream is false
897(the default), the client can leak goroutines when transitioning
898between having no active stream and having one active
899stream.The keepalive() goroutine is stuck at “<-t.awakenKeepalive”,
900while the main goroutine is stuck in NewStream() on t.mu.Lock().
901
902### Example execution
903
904```go
905G1 G2
906--------------------------------------------
907client.keepalive()
908. client.NewStream()
909t.mu.Lock()
910<-t.awakenKeepalive
911. t.mu.Lock()
912---------------G1,G2 leak-------------------
913```
914
915## Grpc/3017
916
917| Bug ID | Ref | Patch | Type | Sub-type |
918| ---- | ---- | ---- | ---- | ---- |
919|[grpc#3017]|[pull request]|[patch]| Resource | Missing unlock |
920
921[grpc#3017]:(grpc3017_test.go)
922[patch]:https://github.com/grpc/grpc-go/pull/3017/files
923[pull request]:https://github.com/grpc/grpc-go/pull/3017
924
925### Description
926
927Line 65 is an execution path with a missing unlock.
928
929### Example execution
930
931```go
932G1 G2 G3
933------------------------------------------------------------------------------------------------
934NewSubConn([1])
935ccc.mu.Lock() [L1]
936sc = 1
937ccc.subConnToAddr[1] = 1
938go func() [G2]
939<-done .
940. ccc.RemoveSubConn(1)
941. ccc.mu.Lock()
942. addr = 1
943. entry = &subConnCacheEntry_grpc3017{}
944. cc.subConnCache[1] = entry
945. timer = time.AfterFunc() [G3]
946. entry.cancel = func()
947. sc = ccc.NewSubConn([1])
948. ccc.mu.Lock() [L1]
949. entry.cancel()
950. !timer.Stop() [true]
951. entry.abortDeleting = true
952. . ccc.mu.Lock()
953. . <<<done>>>
954. ccc.RemoveSubConn(1)
955. ccc.mu.Lock() [L1]
956-------------------------------------------G1, G2 leak-----------------------------------------
957```
958
959## Grpc/660
960
961| Bug ID | Ref | Patch | Type | Sub-type |
962| ---- | ---- | ---- | ---- | ---- |
963|[grpc#660]|[pull request]|[patch]| Communication | Channel |
964
965[grpc#660]:(grpc660_test.go)
966[patch]:https://github.com/grpc/grpc-go/pull/660/files
967[pull request]:https://github.com/grpc/grpc-go/pull/660
968
969### Description
970
971The parent function could return without draining the done channel.
972
973### Example execution
974
975```go
976G1 G2 helper goroutine
977-------------------------------------------------------------
978doCloseLoopUnary()
979. bc.stop <- true
980<-bc.stop
981return
982. done <-
983----------------------G2 leak--------------------------------
984```
985
986## Grpc/795
987
988| Bug ID | Ref | Patch | Type | Sub-type |
989| ---- | ---- | ---- | ---- | ---- |
990|[grpc#795]|[pull request]|[patch]| Resource | Double locking |
991
992[grpc#795]:(grpc795_test.go)
993[patch]:https://github.com/grpc/grpc-go/pull/795/files
994[pull request]:https://github.com/grpc/grpc-go/pull/795
995
996### Description
997
998Line 20 is an execution path with a missing unlock.
999
1000## Grpc/862
1001
1002| Bug ID | Ref | Patch | Type | Sub-type |
1003| ---- | ---- | ---- | ---- | ---- |
1004|[grpc#862]|[pull request]|[patch]| Communication | Channel & Context |
1005
1006[grpc#862]:(grpc862_test.go)
1007[patch]:https://github.com/grpc/grpc-go/pull/862/files
1008[pull request]:https://github.com/grpc/grpc-go/pull/862
1009
1010### Description
1011
1012When return value `conn` is `nil`, `cc(ClientConn)` is not closed.
1013The goroutine executing resetAddrConn is leaked. The patch is to
1014close `ClientConn` in `defer func()`.
1015
1016### Example execution
1017
1018```go
1019G1 G2
1020---------------------------------------
1021DialContext()
1022. cc.resetAddrConn()
1023. resetTransport()
1024. <-ac.ctx.Done()
1025--------------G2 leak------------------
1026```
1027
1028## Hugo/3251
1029
1030| Bug ID | Ref | Patch | Type | Sub-type |
1031| ---- | ---- | ---- | ---- | ---- |
1032|[hugo#3251]|[pull request]|[patch]| Resource | RWR deadlock |
1033
1034[hugo#3251]:(hugo3251_test.go)
1035[patch]:https://github.com/gohugoio/hugo/pull/3251/files
1036[pull request]:https://github.com/gohugoio/hugo/pull/3251
1037
1038### Description
1039
1040A goroutine can hold `Lock()` at line 20 then acquire `RLock()` at
1041line 29. `RLock()` at line 29 will never be acquired because `Lock()`
1042at line 20 will never be released.
1043
1044### Example execution
1045
1046```go
1047G1 G2 G3
1048------------------------------------------------------------------------------------------
1049wg.Add(1) [W1: 1]
1050go func() [G2]
1051go func() [G3]
1052. resGetRemote()
1053. remoteURLLock.URLLock(url)
1054. l.Lock() [L1]
1055. l.m[url] = &sync.Mutex{} [L2]
1056. l.m[url].Lock() [L2]
1057. l.Unlock() [L1]
1058. . resGetRemote()
1059. . remoteURLLock.URLLock(url)
1060. . l.Lock() [L1]
1061. . l.m[url].Lock() [L2]
1062. remoteURLLock.URLUnlock(url)
1063. l.RLock() [L1]
1064...
1065wg.Wait() [W1]
1066----------------------------------------G1,G2,G3 leak--------------------------------------
1067```
1068
1069## Hugo/5379
1070
1071| Bug ID | Ref | Patch | Type | Sub-type |
1072| ---- | ---- | ---- | ---- | ---- |
1073|[hugo#5379]|[pull request]|[patch]| Resource | Double locking |
1074
1075[hugo#5379]:(hugo5379_test.go)
1076[patch]:https://github.com/gohugoio/hugo/pull/5379/files
1077[pull request]:https://github.com/gohugoio/hugo/pull/5379
1078
1079### Description
1080
1081A goroutine first acquire `contentInitMu` at line 99 then
1082acquire the same `Mutex` at line 66
1083
1084## Istio/16224
1085
1086| Bug ID | Ref | Patch | Type | Sub-type |
1087| ---- | ---- | ---- | ---- | ---- |
1088|[istio#16224]|[pull request]|[patch]| Mixed | Channel & Lock |
1089
1090[istio#16224]:(istio16224_test.go)
1091[patch]:https://github.com/istio/istio/pull/16224/files
1092[pull request]:https://github.com/istio/istio/pull/16224
1093
1094### Description
1095
1096A goroutine holds a `Mutex` at line 91 and is then blocked at line 93.
1097Another goroutine attempts to acquire the same `Mutex` at line 101 to
1098further drains the same channel at 103.
1099
1100## Istio/17860
1101
1102| Bug ID | Ref | Patch | Type | Sub-type |
1103| ---- | ---- | ---- | ---- | ---- |
1104|[istio#17860]|[pull request]|[patch]| Communication | Channel |
1105
1106[istio#17860]:(istio17860_test.go)
1107[patch]:https://github.com/istio/istio/pull/17860/files
1108[pull request]:https://github.com/istio/istio/pull/17860
1109
1110### Description
1111
1112`a.statusCh` can't be drained at line 70.
1113
1114## Istio/18454
1115
1116| Bug ID | Ref | Patch | Type | Sub-type |
1117| ---- | ---- | ---- | ---- | ---- |
1118|[istio#18454]|[pull request]|[patch]| Communication | Channel & Context |
1119
1120[istio#18454]:(istio18454_test.go)
1121[patch]:https://github.com/istio/istio/pull/18454/files
1122[pull request]:https://github.com/istio/istio/pull/18454
1123
1124### Description
1125
1126`s.timer.Stop()` at line 56 and 61 can be called concurrency
1127(i.e. from their entry point at line 104 and line 66).
1128See [Timer](https://golang.org/pkg/time/#Timer).
1129
1130## Kubernetes/10182
1131
1132| Bug ID | Ref | Patch | Type | Sub-type |
1133| ---- | ---- | ---- | ---- | ---- |
1134|[kubernetes#10182]|[pull request]|[patch]| Mixed | Channel & Lock |
1135
1136[kubernetes#10182]:(kubernetes10182_test.go)
1137[patch]:https://github.com/kubernetes/kubernetes/pull/10182/files
1138[pull request]:https://github.com/kubernetes/kubernetes/pull/10182
1139
1140### Description
1141
1142Goroutine 1 is blocked on a lock held by goroutine 3,
1143while goroutine 3 is blocked on sending message to `ch`,
1144which is read by goroutine 1.
1145
1146### Example execution
1147
1148```go
1149G1 G2 G3
1150-------------------------------------------------------------------------------
1151s.Start()
1152s.syncBatch()
1153. s.SetPodStatus()
1154. s.podStatusesLock.Lock()
1155<-s.podStatusChannel <===> s.podStatusChannel <- true
1156. s.podStatusesLock.Unlock()
1157. return
1158s.DeletePodStatus() .
1159. . s.podStatusesLock.Lock()
1160. . s.podStatusChannel <- true
1161s.podStatusesLock.Lock()
1162-----------------------------G1,G3 leak-----------------------------------------
1163```
1164
1165## Kubernetes/11298
1166
1167| Bug ID | Ref | Patch | Type | Sub-type |
1168| ---- | ---- | ---- | ---- | ---- |
1169|[kubernetes#11298]|[pull request]|[patch]| Communication | Channel & Condition Variable |
1170
1171[kubernetes#11298]:(kubernetes11298_test.go)
1172[patch]:https://github.com/kubernetes/kubernetes/pull/11298/files
1173[pull request]:https://github.com/kubernetes/kubernetes/pull/11298
1174
1175### Description
1176
1177`n.node` used the `n.lock` as underlaying locker. The service loop initially
1178locked it, the `Notify` function tried to lock it before calling `n.node.Signal()`,
1179leading to a goroutine leak. `n.cond.Signal()` at line 59 and line 81 are not
1180guaranteed to unblock the `n.cond.Wait` at line 56.
1181
1182## Kubernetes/13135
1183
1184| Bug ID | Ref | Patch | Type | Sub-type |
1185| ---- | ---- | ---- | ---- | ---- |
1186|[kubernetes#13135]|[pull request]|[patch]| Resource | AB-BA deadlock |
1187
1188[kubernetes#13135]:(kubernetes13135_test.go)
1189[patch]:https://github.com/kubernetes/kubernetes/pull/13135/files
1190[pull request]:https://github.com/kubernetes/kubernetes/pull/13135
1191
1192### Description
1193
1194```go
1195G1 G2 G3
1196----------------------------------------------------------------------------------
1197NewCacher()
1198watchCache.SetOnReplace()
1199watchCache.SetOnEvent()
1200. cacher.startCaching()
1201. c.Lock()
1202. c.reflector.ListAndWatch()
1203. r.syncWith()
1204. r.store.Replace()
1205. w.Lock()
1206. w.onReplace()
1207. cacher.initOnce.Do()
1208. cacher.Unlock()
1209return cacher .
1210. . c.watchCache.Add()
1211. . w.processEvent()
1212. . w.Lock()
1213. cacher.startCaching() .
1214. c.Lock() .
1215...
1216. c.Lock()
1217. w.Lock()
1218--------------------------------G2,G3 leak-----------------------------------------
1219```
1220
1221## Kubernetes/1321
1222
1223| Bug ID | Ref | Patch | Type | Sub-type |
1224| ---- | ---- | ---- | ---- | ---- |
1225|[kubernetes#1321]|[pull request]|[patch]| Mixed | Channel & Lock |
1226
1227[kubernetes#1321]:(kubernetes1321_test.go)
1228[patch]:https://github.com/kubernetes/kubernetes/pull/1321/files
1229[pull request]:https://github.com/kubernetes/kubernetes/pull/1321
1230
1231### Description
1232
1233This is a lock-channel bug. The first goroutine invokes
1234`distribute()`, which holds `m.lock.Lock()`, while blocking
1235at sending message to `w.result`. The second goroutine
1236invokes `stopWatching()` function, which can unblock the first
1237goroutine by closing `w.result`. However, in order to close `w.result`,
1238`stopWatching()` function needs to acquire `m.lock.Lock()`.
1239
1240The fix is to introduce another channel and put receive message
1241from the second channel in the same `select` statement as the
1242`w.result`. Close the second channel can unblock the first
1243goroutine, while no need to hold `m.lock.Lock()`.
1244
1245### Example execution
1246
1247```go
1248G1 G2
1249----------------------------------------------
1250testMuxWatcherClose()
1251NewMux()
1252. m.loop()
1253. m.distribute()
1254. m.lock.Lock()
1255. w.result <- true
1256w := m.Watch()
1257w.Stop()
1258mw.m.stopWatching()
1259m.lock.Lock()
1260---------------G1,G2 leak---------------------
1261```
1262
1263## Kubernetes/25331
1264
1265| Bug ID | Ref | Patch | Type | Sub-type |
1266| ---- | ---- | ---- | ---- | ---- |
1267|[kubernetes#25331]|[pull request]|[patch]| Communication | Channel & Context |
1268
1269[kubernetes#25331]:(kubernetes25331_test.go)
1270[patch]:https://github.com/kubernetes/kubernetes/pull/25331/files
1271[pull request]:https://github.com/kubernetes/kubernetes/pull/25331
1272
1273### Description
1274
1275A potential goroutine leak occurs when an error has happened,
1276blocking `resultChan`, while cancelling context in `Stop()`.
1277
1278### Example execution
1279
1280```go
1281G1 G2
1282------------------------------------
1283wc.run()
1284. wc.Stop()
1285. wc.errChan <-
1286. wc.cancel()
1287<-wc.errChan
1288wc.cancel()
1289wc.resultChan <-
1290-------------G1 leak----------------
1291```
1292
1293## Kubernetes/26980
1294
1295| Bug ID | Ref | Patch | Type | Sub-type |
1296| ---- | ---- | ---- | ---- | ---- |
1297|[kubernetes#26980]|[pull request]|[patch]| Mixed | Channel & Lock |
1298
1299[kubernetes#26980]:(kubernetes26980_test.go)
1300[patch]:https://github.com/kubernetes/kubernetes/pull/26980/files
1301[pull request]:https://github.com/kubernetes/kubernetes/pull/26980
1302
1303### Description
1304
1305A goroutine holds a `Mutex` at line 24 and blocked at line 35.
1306Another goroutine blocked at line 58 by acquiring the same `Mutex`.
1307
1308## Kubernetes/30872
1309
1310| Bug ID | Ref | Patch | Type | Sub-type |
1311| ---- | ---- | ---- | ---- | ---- |
1312|[kubernetes#30872]|[pull request]|[patch]| Resource | AB-BA deadlock |
1313
1314[kubernetes#30872]:(kubernetes30872_test.go)
1315[patch]:https://github.com/kubernetes/kubernetes/pull/30872/files
1316[pull request]:https://github.com/kubernetes/kubernetes/pull/30872
1317
1318### Description
1319
1320The lock is acquired both at lines 92 and 157.
1321
1322## Kubernetes/38669
1323
1324| Bug ID | Ref | Patch | Type | Sub-type |
1325| ---- | ---- | ---- | ---- | ---- |
1326|[kubernetes#38669]|[pull request]|[patch]| Communication | Channel |
1327
1328[kubernetes#38669]:(kubernetes38669_test.go)
1329[patch]:https://github.com/kubernetes/kubernetes/pull/38669/files
1330[pull request]:https://github.com/kubernetes/kubernetes/pull/38669
1331
1332### Description
1333
1334No sender for line 33.
1335
1336## Kubernetes/5316
1337
1338| Bug ID | Ref | Patch | Type | Sub-type |
1339| ---- | ---- | ---- | ---- | ---- |
1340|[kubernetes#5316]|[pull request]|[patch]| Communication | Channel |
1341
1342[kubernetes#5316]:(kubernetes5316_test.go)
1343[patch]:https://github.com/kubernetes/kubernetes/pull/5316/files
1344[pull request]:https://github.com/kubernetes/kubernetes/pull/5316
1345
1346### Description
1347
1348If the main goroutine selects a case that doesn’t consumes
1349the channels, the anonymous goroutine will be blocked on sending
1350to channel.
1351
1352### Example execution
1353
1354```go
1355G1 G2
1356--------------------------------------
1357finishRequest()
1358. fn()
1359time.After()
1360. errCh<-/ch<-
1361--------------G2 leaks----------------
1362```
1363
1364## Kubernetes/58107
1365
1366| Bug ID | Ref | Patch | Type | Sub-type |
1367| ---- | ---- | ---- | ---- | ---- |
1368|[kubernetes#58107]|[pull request]|[patch]| Resource | RWR deadlock |
1369
1370[kubernetes#58107]:(kubernetes58107_test.go)
1371[patch]:https://github.com/kubernetes/kubernetes/pull/58107/files
1372[pull request]:https://github.com/kubernetes/kubernetes/pull/58107
1373
1374### Description
1375
1376The rules for read and write lock: allows concurrent read lock;
1377write lock has higher priority than read lock.
1378
1379There are two queues (queue 1 and queue 2) involved in this bug,
1380and the two queues are protected by the same read-write lock
1381(`rq.workerLock.RLock()`). Before getting an element from queue 1 or
1382queue 2, `rq.workerLock.RLock()` is acquired. If the queue is empty,
1383`cond.Wait()` will be invoked. There is another goroutine (goroutine D),
1384which will periodically invoke `rq.workerLock.Lock()`. Under the following
1385situation, deadlock will happen. Queue 1 is empty, so that some goroutines
1386hold `rq.workerLock.RLock()`, and block at `cond.Wait()`. Goroutine D is
1387blocked when acquiring `rq.workerLock.Lock()`. Some goroutines try to process
1388jobs in queue 2, but they are blocked when acquiring `rq.workerLock.RLock()`,
1389since write lock has a higher priority.
1390
1391The fix is to not acquire `rq.workerLock.RLock()`, while pulling data
1392from any queue. Therefore, when a goroutine is blocked at `cond.Wait()`,
1393`rq.workLock.RLock()` is not held.
1394
1395### Example execution
1396
1397```go
1398G3 G4 G5
1399--------------------------------------------------------------------
1400. . Sync()
1401rq.workerLock.RLock() . .
1402q.cond.Wait() . .
1403. . rq.workerLock.Lock()
1404. rq.workerLock.RLock()
1405. q.cond.L.Lock()
1406-----------------------------G3,G4,G5 leak-----------------------------
1407```
1408
1409## Kubernetes/62464
1410
1411| Bug ID | Ref | Patch | Type | Sub-type |
1412| ---- | ---- | ---- | ---- | ---- |
1413|[kubernetes#62464]|[pull request]|[patch]| Resource | RWR deadlock |
1414
1415[kubernetes#62464]:(kubernetes62464_test.go)
1416[patch]:https://github.com/kubernetes/kubernetes/pull/62464/files
1417[pull request]:https://github.com/kubernetes/kubernetes/pull/62464
1418
1419### Description
1420
1421This is another example for recursive read lock bug. It has
1422been noticed by the go developers that RLock should not be
1423recursively used in the same thread.
1424
1425### Example execution
1426
1427```go
1428G1 G2
1429--------------------------------------------------------
1430m.reconcileState()
1431m.state.GetCPUSetOrDefault()
1432s.RLock()
1433s.GetCPUSet()
1434. p.RemoveContainer()
1435. s.GetDefaultCPUSet()
1436. s.SetDefaultCPUSet()
1437. s.Lock()
1438s.RLock()
1439---------------------G1,G2 leak--------------------------
1440```
1441
1442## Kubernetes/6632
1443
1444| Bug ID | Ref | Patch | Type | Sub-type |
1445| ---- | ---- | ---- | ---- | ---- |
1446|[kubernetes#6632]|[pull request]|[patch]| Mixed | Channel & Lock |
1447
1448[kubernetes#6632]:(kubernetes6632_test.go)
1449[patch]:https://github.com/kubernetes/kubernetes/pull/6632/files
1450[pull request]:https://github.com/kubernetes/kubernetes/pull/6632
1451
1452### Description
1453
1454When `resetChan` is full, `WriteFrame` holds the lock and blocks
1455on the channel. Then `monitor()` fails to close the `resetChan`
1456because the lock is already held by `WriteFrame`.
1457
1458
1459### Example execution
1460
1461```go
1462G1 G2 helper goroutine
1463----------------------------------------------------------------
1464i.monitor()
1465<-i.conn.closeChan
1466. i.WriteFrame()
1467. i.writeLock.Lock()
1468. i.resetChan <-
1469. . i.conn.closeChan<-
1470i.writeLock.Lock()
1471----------------------G1,G2 leak--------------------------------
1472```
1473
1474## Kubernetes/70277
1475
1476| Bug ID | Ref | Patch | Type | Sub-type |
1477| ---- | ---- | ---- | ---- | ---- |
1478|[kubernetes#70277]|[pull request]|[patch]| Communication | Channel |
1479
1480[kubernetes#70277]:kubernetes70277_test.go
1481[patch]:https://github.com/kubernetes/kubernetes/pull/70277/files
1482[pull request]:https://github.com/kubernetes/kubernetes/pull/70277
1483
1484### Description
1485
1486`wait.poller()` returns a function with type `WaitFunc`.
1487the function creates a goroutine and the goroutine only
1488quits when after or done closed.
1489
1490The `doneCh` defined at line 70 is never closed.
1491
1492## Moby/17176
1493
1494| Bug ID | Ref | Patch | Type | Sub-type |
1495| ---- | ---- | ---- | ---- | ---- |
1496|[moby#17176]|[pull request]|[patch]| Resource | Double locking |
1497
1498[moby#17176]:(moby17176_test.go)
1499[patch]:https://github.com/moby/moby/pull/17176/files
1500[pull request]:https://github.com/moby/moby/pull/17176
1501
1502### Description
1503
1504`devices.nrDeletedDevices` takes `devices.Lock()` but does
1505not release it (line 36) if there are no deleted devices. This will block
1506other goroutines trying to acquire `devices.Lock()`.
1507
1508## Moby/21233
1509
1510| Bug ID | Ref | Patch | Type | Sub-type |
1511| ---- | ---- | ---- | ---- | ---- |
1512|[moby#21233]|[pull request]|[patch]| Communication | Channel |
1513
1514[moby#21233]:(moby21233_test.go)
1515[patch]:https://github.com/moby/moby/pull/21233/files
1516[pull request]:https://github.com/moby/moby/pull/21233
1517
1518### Description
1519
1520This test was checking that it received every progress update that was
1521produced. But delivery of these intermediate progress updates is not
1522guaranteed. A new update can overwrite the previous one if the previous
1523one hasn't been sent to the channel yet.
1524
1525The call to `t.Fatalf` terminated the current goroutine which was consuming
1526the channel, which caused a deadlock and eventual test timeout rather
1527than a proper failure message.
1528
1529### Example execution
1530
1531```go
1532G1 G2 G3
1533----------------------------------------------------------
1534testTransfer() . .
1535tm.Transfer() . .
1536t.Watch() . .
1537. WriteProgress() .
1538. ProgressChan<- .
1539. . <-progressChan
1540. ... ...
1541. return .
1542. <-progressChan
1543<-watcher.running
1544----------------------G1,G3 leak--------------------------
1545```
1546
1547## Moby/25384
1548
1549| Bug ID | Ref | Patch | Type | Sub-type |
1550| ---- | ---- | ---- | ---- | ---- |
1551|[moby#25384]|[pull request]|[patch]| Mixed | Misuse WaitGroup |
1552
1553[moby#25384]:(moby25384_test.go)
1554[patch]:https://github.com/moby/moby/pull/25384/files
1555[pull request]:https://github.com/moby/moby/pull/25384
1556
1557### Description
1558
1559When `n=1` (where `n` is `len(pm.plugins)`), the location of `group.Wait()` doesn’t matter.
1560When `n > 1`, `group.Wait()` is invoked in each iteration. Whenever
1561`group.Wait()` is invoked, it waits for `group.Done()` to be executed `n` times.
1562However, `group.Done()` is only executed once in one iteration.
1563
1564Misuse of sync.WaitGroup
1565
1566## Moby/27782
1567
1568| Bug ID | Ref | Patch | Type | Sub-type |
1569| ---- | ---- | ---- | ---- | ---- |
1570|[moby#27782]|[pull request]|[patch]| Communication | Channel & Condition Variable |
1571
1572[moby#27782]:(moby27782_test.go)
1573[patch]:https://github.com/moby/moby/pull/27782/files
1574[pull request]:https://github.com/moby/moby/pull/27782
1575
1576### Description
1577
1578### Example execution
1579
1580```go
1581G1 G2 G3
1582-----------------------------------------------------------------------
1583InitializeStdio()
1584startLogging()
1585l.ReadLogs()
1586NewLogWatcher()
1587. l.readLogs()
1588container.Reset() .
1589LogDriver.Close() .
1590r.Close() .
1591close(w.closeNotifier) .
1592. followLogs(logWatcher)
1593. watchFile()
1594. New()
1595. NewEventWatcher()
1596. NewWatcher()
1597. . w.readEvents()
1598. . event.ignoreLinux()
1599. . return false
1600. <-logWatcher.WatchClose() .
1601. fileWatcher.Remove() .
1602. w.cv.Wait() .
1603. . w.Events <- event
1604------------------------------G2,G3 leak-------------------------------
1605```
1606
1607## Moby/28462
1608
1609| Bug ID | Ref | Patch | Type | Sub-type |
1610| ---- | ---- | ---- | ---- | ---- |
1611|[moby#28462]|[pull request]|[patch]| Mixed | Channel & Lock |
1612
1613[moby#28462]:(moby28462_test.go)
1614[patch]:https://github.com/moby/moby/pull/28462/files
1615[pull request]:https://github.com/moby/moby/pull/28462
1616
1617### Description
1618
1619One goroutine may acquire a lock and try to send a message over channel `stop`,
1620while the other will try to acquire the same lock. With the wrong ordering,
1621both goroutines will leak.
1622
1623### Example execution
1624
1625```go
1626G1 G2
1627--------------------------------------------------------------
1628monitor()
1629handleProbeResult()
1630. d.StateChanged()
1631. c.Lock()
1632. d.updateHealthMonitorElseBranch()
1633. h.CloseMonitorChannel()
1634. s.stop <- struct{}{}
1635c.Lock()
1636----------------------G1,G2 leak------------------------------
1637```
1638
1639## Moby/30408
1640
1641| Bug ID | Ref | Patch | Type | Sub-type |
1642| ---- | ---- | ---- | ---- | ---- |
1643|[moby#30408]|[pull request]|[patch]| Communication | Condition Variable |
1644
1645[moby#30408]:(moby30408_test.go)
1646[patch]:https://github.com/moby/moby/pull/30408/files
1647[pull request]:https://github.com/moby/moby/pull/30408
1648
1649### Description
1650
1651`Wait()` at line 22 has no corresponding `Signal()` or `Broadcast()`.
1652
1653### Example execution
1654
1655```go
1656G1 G2
1657------------------------------------------
1658testActive()
1659. p.waitActive()
1660. p.activateWait.L.Lock()
1661. p.activateWait.Wait()
1662<-done
1663-----------------G1,G2 leak---------------
1664```
1665
1666## Moby/33781
1667
1668| Bug ID | Ref | Patch | Type | Sub-type |
1669| ---- | ---- | ---- | ---- | ---- |
1670|[moby#33781]|[pull request]|[patch]| Communication | Channel & Context |
1671
1672[moby#33781]:(moby33781_test.go)
1673[patch]:https://github.com/moby/moby/pull/33781/files
1674[pull request]:https://github.com/moby/moby/pull/33781
1675
1676### Description
1677
1678The goroutine created using an anonymous function is blocked
1679sending a message over an unbuffered channel. However there
1680exists a path in the parent goroutine where the parent function
1681will return without draining the channel.
1682
1683### Example execution
1684
1685```go
1686G1 G2 G3
1687----------------------------------------
1688monitor() .
1689<-time.After() .
1690. .
1691<-stop stop<-
1692.
1693cancelProbe()
1694return
1695. result<-
1696----------------G3 leak------------------
1697```
1698
1699## Moby/36114
1700
1701| Bug ID | Ref | Patch | Type | Sub-type |
1702| ---- | ---- | ---- | ---- | ---- |
1703|[moby#36114]|[pull request]|[patch]| Resource | Double locking |
1704
1705[moby#36114]:(moby36114_test.go)
1706[patch]:https://github.com/moby/moby/pull/36114/files
1707[pull request]:https://github.com/moby/moby/pull/36114
1708
1709### Description
1710
1711The the lock for the struct svm has already been locked when calling
1712`svm.hotRemoveVHDsAtStart()`.
1713
1714## Moby/4951
1715
1716| Bug ID | Ref | Patch | Type | Sub-type |
1717| ---- | ---- | ---- | ---- | ---- |
1718|[moby#4951]|[pull request]|[patch]| Resource | AB-BA deadlock |
1719
1720[moby#4951]:(moby4951_test.go)
1721[patch]:https://github.com/moby/moby/pull/4951/files
1722[pull request]:https://github.com/moby/moby/pull/4951
1723
1724### Description
1725
1726The root cause and patch is clearly explained in the commit
1727description. The global lock is `devices.Lock()`, and the device
1728lock is `baseInfo.lock.Lock()`. It is very likely that this bug
1729can be reproduced.
1730
1731## Moby/7559
1732
1733| Bug ID | Ref | Patch | Type | Sub-type |
1734| ---- | ---- | ---- | ---- | ---- |
1735|[moby#7559]|[pull request]|[patch]| Resource | Double locking |
1736
1737[moby#7559]:(moby7559_test.go)
1738[patch]:https://github.com/moby/moby/pull/7559/files
1739[pull request]:https://github.com/moby/moby/pull/7559
1740
1741### Description
1742
1743Line 25 is missing a call to `.Unlock`.
1744
1745### Example execution
1746
1747```go
1748G1
1749---------------------------
1750proxy.connTrackLock.Lock()
1751if err != nil { continue }
1752proxy.connTrackLock.Lock()
1753-----------G1 leaks--------
1754```
1755
1756## Serving/2137
1757
1758| Bug ID | Ref | Patch | Type | Sub-type |
1759| ---- | ---- | ---- | ---- | ---- |
1760|[serving#2137]|[pull request]|[patch]| Mixed | Channel & Lock |
1761
1762[serving#2137]:(serving2137_test.go)
1763[patch]:https://github.com/ knative/serving/pull/2137/files
1764[pull request]:https://github.com/ knative/serving/pull/2137
1765
1766### Description
1767
1768### Example execution
1769
1770```go
1771G1 G2 G3
1772----------------------------------------------------------------------------------
1773b.concurrentRequests(2) . .
1774b.concurrentRequest() . .
1775r.lock.Lock() . .
1776. start.Done() .
1777start.Wait() . .
1778b.concurrentRequest() . .
1779r.lock.Lock() . .
1780. . start.Done()
1781start.Wait() . .
1782unlockAll(locks) . .
1783unlock(lc) . .
1784req.lock.Unlock() . .
1785ok := <-req.accepted . .
1786. b.Maybe() .
1787. b.activeRequests <- t .
1788. thunk() .
1789. r.lock.Lock() .
1790. . b.Maybe()
1791. . b.activeRequests <- t
1792----------------------------G1,G2,G3 leak-----------------------------------------
1793```
1794
1795## Syncthing/4829
1796
1797| Bug ID | Ref | Patch | Type | Sub-type |
1798| ---- | ---- | ---- | ---- | ---- |
1799|[syncthing#4829]|[pull request]|[patch]| Resource | Double locking |
1800
1801[syncthing#4829]:(syncthing4829_test.go)
1802[patch]:https://github.com/syncthing/syncthing/pull/4829/files
1803[pull request]:https://github.com/syncthing/syncthing/pull/4829
1804
1805### Description
1806
1807Double locking at line 17 and line 30.
1808
1809### Example execution
1810
1811```go
1812G1
1813---------------------------
1814mapping.clearAddresses()
1815m.mut.Lock() [L2]
1816m.notify(...)
1817m.mut.RLock() [L2]
1818----------G1 leaks---------
1819```
1820
1821## Syncthing/5795
1822
1823| Bug ID | Ref | Patch | Type | Sub-type |
1824| ---- | ---- | ---- | ---- | ---- |
1825|[syncthing#5795]|[pull request]|[patch]| Communication | Channel |
1826
1827[syncthing#5795]:(syncthing5795_test.go)
1828[patch]:https://github.com/syncthing/syncthing/pull/5795/files
1829[pull request]:https://github.com/syncthing/syncthing/pull/5795
1830
1831### Description
1832
1833`<-c.dispatcherLoopStopped` at line 82 is blocking forever because
1834`dispatcherLoop()` is blocking at line 72.
1835
1836### Example execution
1837
1838```go
1839G1 G2
1840--------------------------------------------------------------
1841c.Start()
1842go c.dispatcherLoop() [G3]
1843. select [<-c.inbox, <-c.closed]
1844c.inbox <- <================> [<-c.inbox]
1845<-c.dispatcherLoopStopped .
1846. default
1847. c.ccFn()/c.Close()
1848. close(c.closed)
1849. <-c.dispatcherLoopStopped
1850---------------------G1,G2 leak-------------------------------
1851```
View as plain text