Golang: a take on passing by value vs. passing by pointer

golang topic: by value or by pointer


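You often hear the advice that when a struct is large you should pass it by pointer, so that the whole struct is not copied; passing only a pointer is said to be more efficient.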

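At compile time, Go performs escape analysis. When you pass a pointer around, the compiler may quietly move the variable it points to onto the heap. The basic rule of thumb: check whether the variable can still be referenced after the function that created it has returned; if it can, the variable escapes to the heap. See [1] for more details.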
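The escape behaviour described above can be observed with a small experiment. This is only a sketch: the `point` type and the two constructor functions are made up for illustration, and `//go:noinline` is used so that the compiler does not inline the calls and hide the effect. `testing.AllocsPerRun` reports the average number of heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// point is a hypothetical small struct used only for this experiment.
type point struct{ X, Y int }

// Package-level sinks keep the results alive so the calls are not discarded.
var (
	ptrSink *point
	valSink point
)

// newPointPtr returns the address of a local variable. The variable is still
// referenced after the function returns, so it has to live on the heap.
//
//go:noinline
func newPointPtr(x, y int) *point {
	p := point{X: x, Y: y}
	return &p
}

// newPointVal returns a copy, so nothing needs to outlive the call.
//
//go:noinline
func newPointVal(x, y int) point {
	return point{X: x, Y: y}
}

// allocsPtr measures heap allocations per call of newPointPtr.
func allocsPtr() float64 {
	return testing.AllocsPerRun(1000, func() { ptrSink = newPointPtr(1, 2) })
}

// allocsVal measures heap allocations per call of newPointVal.
func allocsVal() float64 {
	return testing.AllocsPerRun(1000, func() { valSink = newPointVal(1, 2) })
}

func main() {
	fmt.Println("pointer return, allocs per call:", allocsPtr())
	fmt.Println("value return, allocs per call:  ", allocsVal())
}
```

The pointer version typically reports one allocation per call and the value version none. Building with `go build -gcflags=-m` also prints the compiler's escape decisions for each variable.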

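As the heap keeps growing and reaches a certain size, it triggers GC (garbage collection). To mark which variables are still referenced, the collector cannot allow a variable to change state after it has been marked, otherwise its decisions would be inaccurate, so the whole program pauses briefly. This is the so-called STW (stop-the-world) pause, and triggering GC frequently hurts overall system performance.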

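The pacer decides under what conditions the next GC cycle is triggered. When a GC cycle runs, it clears out the garbage, marks the memory that is currently live, and sets the condition for the next cycle: the heap reaching twice the size of the live memory [2]. The factor of two is not fixed; it can be tuned with the GOGC environment variable, whose default value is 100 [3]. That value means garbage may grow to 100% of the live memory, so garbage + live memory = heap = 2 × live memory.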
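The arithmetic above can be written down as a tiny helper. This is a simplified model that follows the description in this post; `nextGCTarget` is a hypothetical name, and the real pacer takes more inputs into account (for example GC roots such as stacks and globals in newer Go releases).

```go
package main

import "fmt"

// nextGCTarget is a simplified model of the heap size that triggers the next
// GC cycle: live memory plus GOGC percent of live memory as allowed garbage.
// It is an illustration of the formula in the text, not the runtime's pacer.
func nextGCTarget(liveBytes, gogc uint64) uint64 {
	return liveBytes + liveBytes*gogc/100
}

func main() {
	const live = 64 << 20 // assume 64 MiB of live memory after the last cycle

	for _, gogc := range []uint64{50, 100, 200} {
		target := nextGCTarget(live, gogc)
		fmt.Printf("GOGC=%d -> next GC at about %d MiB\n", gogc, target>>20)
	}
	// Output:
	// GOGC=50 -> next GC at about 96 MiB
	// GOGC=100 -> next GC at about 128 MiB
	// GOGC=200 -> next GC at about 192 MiB
}
```

Besides the GOGC environment variable, the same percentage can be changed at run time with `debug.SetGCPercent` from `runtime/debug`.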

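With the drawbacks of by pointer covered, let's look at the benefits of by value. Passing by value copies the variable onto the stack, and the stack frame goes away on its own when the function returns, which can reduce the number of times GC is triggered.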

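So which approach is actually more efficient? If you only look at a single function call, passing a pointer is cheaper, but from the point of view of the whole system you still need to run benchmarks before you can judge.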
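A benchmark for this question could start from something like the sketch below. The `payload` type and the two `sum` functions are invented for illustration, and a micro-benchmark like this only measures the call itself; it says nothing about GC pressure in a real system. `testing.Benchmark` lets the comparison run as a normal program without `go test`.

```go
package main

import (
	"fmt"
	"testing"
)

// payload is a hypothetical "large" struct (8 KiB of data).
type payload struct {
	data [1024]int64
}

// sumByValue receives a full copy of the struct.
//
//go:noinline
func sumByValue(p payload) int64 {
	var total int64
	for i := range p.data {
		total += p.data[i]
	}
	return total
}

// sumByPointer receives only a pointer to the struct.
//
//go:noinline
func sumByPointer(p *payload) int64 {
	var total int64
	for i := range p.data {
		total += p.data[i]
	}
	return total
}

// sink stores results so the compiler cannot drop the calls.
var sink int64

func main() {
	var p payload
	for i := range p.data {
		p.data[i] = int64(i)
	}

	byValue := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumByValue(p)
		}
	})
	byPointer := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumByPointer(&p)
		}
	})

	fmt.Println("by value:  ", byValue)
	fmt.Println("by pointer:", byPointer)
}
```

The numbers depend on the machine, the struct size, and the Go version, so treat them as a starting point and confirm with a profile of the real workload.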

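When to use by pointer:

  1. You need to set (modify) the value of a field in the struct.
  2. The return type is an interface. The method set of T contains only the methods declared with a T receiver, while the method set of *T contains the methods declared with both T and *T receivers. To make sure the interface check sees every method, passing a pointer is the most thorough option.
  3. For package-level variables, returning a pointer guarantees that everyone is dealing with the same variable. A pointer is easy to compare with == in an if statement; otherwise you would have to compare the Error() string or use DeepEqual to tell whether it is a known error [5].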
e1 := errors.New("hey")
e2 := errors.New("hey")
fmt.Println(e1, e2, e1 == e2)
// prints: hey hey false. If the error type were a plain value instead of a pointer, the two errors would compare equal.
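To make the comparison in the snippet above concrete, here is a small sketch that puts a value-based error next to the pointer-based one from errors.New. The `valueErr` type, `ErrNotFound`, and `find` are hypothetical names used only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// valueErr is a hypothetical error type that is used as a plain value.
type valueErr struct{ msg string }

func (e valueErr) Error() string { return e.msg }

// ErrNotFound is a hypothetical package-level sentinel error.
var ErrNotFound = errors.New("not found")

// find returns the sentinel for an empty key.
func find(key string) error {
	if key == "" {
		return ErrNotFound
	}
	return nil
}

// sameValueErrs compares two separately built value errors with equal text.
func sameValueErrs() bool {
	var e1, e2 error = valueErr{"hey"}, valueErr{"hey"}
	return e1 == e2 // struct contents are compared
}

// samePointerErrs compares two separately built errors.New errors.
func samePointerErrs() bool {
	e1, e2 := errors.New("hey"), errors.New("hey")
	return e1 == e2 // pointer identity is compared
}

func main() {
	fmt.Println(sameValueErrs())   // true
	fmt.Println(samePointerErrs()) // false

	err := find("")
	fmt.Println(err == ErrNotFound, errors.Is(err, ErrNotFound)) // true true
}
```

Because the sentinel is a pointer, only the exact package-level variable matches, no matter how many other errors carry the same text.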

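When to use by value:

  1. The type only has getter-style methods and never modifies its fields.
  2. You are sure that T will only ever have value receivers, so interface satisfaction is not affected.
  3. The value is short-lived; passing by value avoids piling up variables on the heap.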
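The two lists above mostly come down to mutation and method sets, which the following sketch tries to show. All names here (`user`, `Namer`, `Renamer`, `renameCopy`) are made up for the example.

```go
package main

import "fmt"

// Namer and Renamer are hypothetical interfaces for this example.
type Namer interface{ Name() string }
type Renamer interface{ Rename(string) }

type user struct{ name string }

// Name has a value receiver, so it is in the method sets of both user and *user.
func (u user) Name() string { return u.name }

// Rename has a pointer receiver, so it is only in the method set of *user.
func (u *user) Rename(n string) { u.name = n }

// renameCopy has a value receiver: it changes a copy, and the caller sees nothing.
func (u user) renameCopy(n string) { u.name = n }

func main() {
	u := user{name: "ann"}

	// Value-receiver methods satisfy the interface through T and through *T.
	var _ Namer = u
	var _ Namer = &u

	// Pointer-receiver methods satisfy the interface only through *T.
	// "var r Renamer = u" would not compile.
	var r Renamer = &u

	u.renameCopy("bob")
	fmt.Println(u.Name()) // ann

	r.Rename("bob")
	fmt.Println(u.Name()) // bob
}
```

A type with only value receivers, like a getter-only struct, can be handed around as T or *T without changing which interfaces it satisfies.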

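After writing this post, even I am no longer quite sure which approach is better; I only wanted to tell others that by pointer is not without its drawbacks. The Go Code Review Comments give the following guidance on receiver types: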

If the receiver is a map, func or chan, don’t use a pointer to it.

If the receiver is a slice and the method doesn’t reslice or reallocate the slice, don’t use a pointer to it.

If the method needs to mutate the receiver, the receiver must be a pointer.

If the receiver is a struct that contains a sync.Mutex or similar synchronizing field, the receiver must be a pointer to avoid copying.

If the receiver is a large struct or array, a pointer receiver is more efficient. How large is large? Assume it’s equivalent to passing all its elements as arguments to the method. If that feels too large, it’s also too large for the receiver.

Can function or methods, either concurrently or when called from this method, be mutating the receiver? A value type creates a copy of the receiver when the method is invoked, so outside updates will not be applied to this receiver. If changes must be visible in the original receiver, the receiver must be a pointer.

If the receiver is a struct, array or slice and any of its elements is a pointer to something that might be mutating, prefer a pointer receiver, as it will make the intention more clear to the reader.

If the receiver is a small array or struct that is naturally a value type (for instance, something like the time.Time type), with no mutable fields and no pointers, or is just a simple basic type such as int or string, a value receiver makes sense.
A value receiver can reduce the amount of garbage that can be generated; if a value is passed to a value method, an on-stack copy can be used instead of allocating on the heap. (The compiler tries to be smart about avoiding this allocation, but it can't always succeed.) Don't choose a value receiver type for this reason without profiling first.

Finally, when in doubt, use a pointer receiver.

// Write writes the headers described in h to w.
//
// This method has a value receiver, despite the somewhat large size
// of h, because it prevents an allocation. The escape analysis
// isn't smart enough to realize this function doesn't mutate h.
func (h extraHeader) Write(w *bufio.Writer) {
...
}
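Among the guidelines quoted above, the sync.Mutex rule is perhaps the easiest one to get wrong, so here is a minimal sketch of it. `SafeCounter` and `countConcurrently` are hypothetical names; the point is only that every method uses a pointer receiver so that all goroutines share one mutex instead of locking copies.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeCounter is a hypothetical counter guarded by a mutex.
type SafeCounter struct {
	mu sync.Mutex
	n  int
}

// Inc uses a pointer receiver; a value receiver would lock a copy of mu.
func (c *SafeCounter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value also uses a pointer receiver so the struct is never copied.
func (c *SafeCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently increments one shared counter from many goroutines.
func countConcurrently(workers int) int {
	var c SafeCounter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(countConcurrently(100)) // 100
}
```

If Inc had a value receiver, `go vet` would report that the method passes a lock by value, and the increments would be applied to throwaway copies.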

Caesar's study review on Web development
