Concurrency in Go
Concurrency can be notoriously difficult to get right, but fortunately, the Go open source programming language makes working with concurrency tractable and even easy. If you're a developer familiar with Go, this practical book demonstrates best practices and patterns to help you incorporate concurrency into your systems. Author Katherine Cox-Buday takes you step-by-step through the process.
Amdahl’s law describes a way in which to model the potential performance gains from implementing the solution to a problem in a parallel manner. Simply put, it states that the gains are bounded by how much of the program must be written in a sequential manner.
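As a rough illustration of that bound, the usual statement of Amdahl's law gives the speedup as 1 / ((1 - p) + p/n), where p is the fraction of the work that can run in parallel and n is the number of workers. The sketch below simply evaluates that formula; the function name speedup and the value p = 0.9 are illustrative choices, not taken from the book.

```go
package main

import "fmt"

// speedup returns the theoretical speedup predicted by Amdahl's law for a
// program whose parallelizable fraction is p (between 0 and 1) when it is
// run on n workers. The name and signature are hypothetical.
func speedup(p float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n))
}

func main() {
	// With 90% of the work parallelizable, the speedup can never exceed
	// 1/(1-0.9) = 10, no matter how many workers are added.
	for _, n := range []int{1, 2, 8, 1024} {
		fmt.Printf("workers=%4d speedup=%.2f\n", n, speedup(0.9, n))
	}
}
```

The sequential 10% dominates quickly: going from 8 workers to 1024 buys far less than going from 1 to 8.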
A race condition occurs when two or more operations must execute in the correct order, but the program has not been written so that this order is guaranteed to be maintained. Most of the time this shows up in what’s called a data race, where one concurrent operation attempts to read a variable while at some undetermined time another concurrent operation is attempting to write to the same variable.
When something is considered atomic, or to have the property of atomicity, this means that within the context in which it is operating, it is indivisible, or uninterruptible.
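One place this idea shows up in Go is the standard library's sync/atomic package, whose operations are indivisible with respect to other goroutines. The following is a minimal sketch, assuming a simple shared counter; the helper name countAtomically is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countAtomically starts n goroutines that each increment a shared counter
// exactly once using an atomic add, then returns the final value.
// The function name is illustrative only.
func countAtomically(n int) int64 {
	var counter int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The read-modify-write happens as one indivisible step,
			// so no increment can be lost.
			atomic.AddInt64(&counter, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&counter)
}

func main() {
	fmt.Println(countAtomically(1000)) // prints 1000
}
```

A plain counter++ in the same position would be three separate steps (read, add, write) and would not be atomic in the context of other goroutines.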
fact, there’s a name for a section of your program that needs exclusive access to a shared resource. This is called a critical section. In this example, we have three critical sections: Our goroutine, which is incrementing the data variables. Our if statement, which checks whether the value of data is 0. Our fmt.Printf statement, which retrieves the value of data for output. There are various ways to guard your program’s critical sections, and Go has some better ideas on how to deal with this, but one way to solve this problem is to synchronize access to the memory between your critical sections. Let’s take a look and see what that looks like. The code below is not idiomatic Go (and I don’t suggest you attempt to solve your data race problems like this), but it very simply demonstrates memory access synchronization. If any of the types, functions, or methods in this example are foreign to you, that’s OK. Focus on the
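A minimal sketch of memory access synchronization, assuming a sync.Mutex guards the shared data variable, might look like the following. The function name observe and the returned messages are hypothetical, and this is not necessarily the listing the excerpt refers to.

```go
package main

import (
	"fmt"
	"sync"
)

// observe guards every access to data with the same mutex. This removes the
// data race, but it does not decide whether the goroutine or the caller
// runs first, so either message may be returned.
func observe() string {
	var mu sync.Mutex
	var data int

	go func() {
		mu.Lock()
		data++ // critical section: the write
		mu.Unlock()
	}()

	mu.Lock()
	defer mu.Unlock()
	if data == 0 { // critical section: the check
		// critical section: reading data for output
		return fmt.Sprintf("the value is %v", data)
	}
	return fmt.Sprintf("the value was incremented to %v", data)
}

func main() {
	fmt.Println(observe())
}
```

Note that the lock only makes each access exclusive. The order of operations is still nondeterministic, so the race condition in the broader sense remains.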
A deadlocked program is one in which all concurrent processes are waiting on one another. In this state, the program will never recover without outside intervention.
Livelocks are programs which are actively performing concurrent operations, but these operations do nothing to move the state of the program forward. Have you ever been in a hallway walking towards another person? She moves to one side to let you pass, but you’ve just done the same. So you move to the other side, but she’s also done the same. Imagine this going on forever, and you understand livelocks.
Starvation is any situation where a concurrent process cannot get all the resources it needs to perform work. When we discussed livelocks, the resource each goroutine was starved of was a shared lock. Livelocks warrant discussion separate from starvation because in a livelock, all the concurrent processes are starved equally, and no work is accomplished. More broadly, starvation usually implies that there are one or more greedy concurrent processes that are unfairly preventing one or more other concurrent processes from accomplishing work as efficiently as possible, or maybe at all.
CSP stands for “Communicating Sequential Processes,” which is both a technique and the name of the paper that introduced it. In 1978, Charles Antony Richard Hoare published the paper in the Association for Computing Machinery (more popularly referred to as ACM). In this paper, Hoare suggests that input and output are two overlooked primitives of programming, particularly in concurrent code. At the time Hoare authored this paper, research was still being done on how to structure programs, but most of this effort was being directed to techniques for sequential code: usage of the goto statement was being debated, and the object-oriented paradigm was beginning to take root. Concurrent operations weren’t being given much thought. Hoare set out to correct this, and thus his paper, and CSP, were born.
In the 1978 paper, CSP was only a simple programming language constructed solely to demonstrate the power of communicating sequential processes; in fact, he even says in the paper: “Thus the concepts and notations introduced in this paper … should not be regarded as suitable for use as a programming language, either for abstract or for concrete programming.” Hoare was deeply concerned that the techniques he was presenting did nothing to further the study of correctness of programs, and that the techniques may not be performant in a real language based on his own.
Over the next six years, the idea of CSP was refined into a formal representation of something called process calculus in an effort to take the ideas of communicating sequential processes and actually begin to reason about program correctness. Process calculus is a way to mathematically model concurrent systems and also provides algebraic laws to perform transformations on these systems to analyze their various properties, e.g., efficiency and correctness. Although process calculi are an interesting topic in their own right, they are beyond the scope of this book.
And since the original paper on CSP and the language that evolved from it were largely the inspiration for Go’s concurrency model, it’s these we’ll focus on.
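In Go, the CSP idea of treating input and output between processes as primitives surfaces most visibly as channels. The sketch below is one small illustration, assuming two sequential stages that share no memory and communicate only by sending and receiving values; the names squares and collect are hypothetical.

```go
package main

import "fmt"

// squares is a sequential process: it receives numbers on in and sends
// their squares on the returned channel, closing it when in is drained.
func squares(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			out <- n * n
		}
	}()
	return out
}

// collect feeds nums into the squares stage and gathers the results.
func collect(nums ...int) []int {
	in := make(chan int)
	go func() {
		defer close(in)
		for _, n := range nums {
			in <- n
		}
	}()

	var got []int
	for sq := range squares(in) {
		got = append(got, sq)
	}
	return got
}

func main() {
	fmt.Println(collect(1, 2, 3)) // prints [1 4 9]
}
```

Each stage is ordinary sequential code; the only coordination between them is the channel send and receive.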
A goroutine is a function that is running concurrently (remember: not necessarily in parallel!).
Goroutines are unique to Go (though some other languages have a concurrency primitive that is similar). They’re not OS threads, and they’re not exactly green threads (threads which are managed by a language’s runtime); they’re a higher level of abstraction known as coroutines. Coroutines are simply concurrent subroutines (functions, closures, or methods in Go) which are nonpreemptive; that is, they cannot be interrupted. Instead, coroutines have multiple points throughout which allow for suspension or reentry.
Join points are what guarantee our program’s correctness and remove the race condition. In order to create a join point, you have to synchronize the main goroutine and the sayHello goroutine. This can be done in a number of ways, but I’ll use one we’ll talk about in the section “The sync package”: sync.WaitGroup. Right now it’s not important to understand how this example creates a join point, only that it creates one between the two goroutines.
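A minimal sketch of such a join point, assuming sayHello is a closure that records a greeting, could look like this. The wrapper runSayHello and the greeting variable are illustrative additions so that the result can be inspected.

```go
package main

import (
	"fmt"
	"sync"
)

// runSayHello starts the sayHello goroutine and blocks at a join point
// until it has finished. The wrapper name is hypothetical.
func runSayHello() string {
	var wg sync.WaitGroup
	var greeting string

	sayHello := func() {
		defer wg.Done() // signal that this goroutine is finished
		greeting = "hello"
	}

	wg.Add(1) // one goroutine to wait for
	go sayHello()
	wg.Wait() // join point: block until sayHello has called Done

	// Safe to read: Wait returns only after the write in sayHello.
	return greeting
}

func main() {
	fmt.Println(runSayHello()) // prints "hello"
}
```

Without the call to wg.Wait, the calling goroutine could return before sayHello ever ran.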
In some sense, immutable data is ideal because it is implicitly concurrent-safe. Each concurrent process may operate on the same data, but it may not modify it. If it wants to create new data, it must create a new copy of the data with the desired modifications.
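The copy-on-modify idea can be sketched as follows. Go does not enforce immutability for slices, so this example assumes the shared slice is treated as read-only by convention; the function name scaled is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// scaled returns a new slice with every element multiplied by factor.
// The input slice is only read, never written.
func scaled(data []int, factor int) []int {
	out := make([]int, len(data))
	for i, v := range data {
		out[i] = v * factor
	}
	return out
}

func main() {
	shared := []int{1, 2, 3} // treated as immutable by convention
	results := make([][]int, 3)

	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine reads shared and writes only its own slot.
			results[i] = scaled(shared, i+1)
		}(i)
	}
	wg.Wait()

	fmt.Println(shared, results) // prints [1 2 3] [[1 2 3] [2 4 6] [3 6 9]]
}
```

No lock is needed around shared because nothing ever writes to it; each goroutine gets its modifications by building its own copy.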