I don't think anybody can ever learn enough about concurrency, because it's happening all the time whether you want it to or not.
How would a distinction between concurrency and parallelism benefit the modeling of program logic?
The programming language incorporates threads and locking mechanisms.
Parallelism introduces additional hazards such as data races, which are not present in concurrent code that lacks parallelism.
I'm of the opinion that we chose the wrong term for concurrency. Concurrency means multiple things going on simultaneously, which is not what's happening with our version of it. Only one thing is happening at any given time, but the tasks are being multiplexed on a shared resource. Parallelism = concurrency, "concurrency" = multiplexing.
Maybe there's a more appropriate term than multiplexing, but I think that's certainly better than concurrency at describing it.
Yes, that's true, the terms are confusing, but the distinction is nevertheless important. For example, having read through some of this book, it's still not clear whether it involves parallelism, especially since the book compares itself to Python, which does not provide parallelism without running C extensions.
Data races and other race conditions are still present in concurrent systems without parallelism (in the sense of actually executing at the same time, as with multiple cores). If they weren't, we wouldn't need most uses of mutexes and semaphores on single-core processors. As the book gets into, concurrency is about multiple tasks that are arbitrarily interleaved with each other. That interleaving is why you can have data races and other errors even on a single-core system.
Data races are not possible on a single core system.
They are entirely possible assuming preemptive scheduling.
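To make that concrete, here is a minimal C/pthreads sketch (not Harmony, and not from the book) that forbids parallelism by pinning the whole process to one CPU with the Linux-specific sched_setaffinity call, and still loses increments:

```c
// Sketch: a lost-update race with no parallelism at all (Linux, gcc -pthread).
// The whole process is pinned to CPU 0, so the two threads never run at the
// same time; counter++ is still a load, an add, and a store, and the
// scheduler can preempt a thread between the load and the store.
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

// volatile only keeps the per-iteration load/store in the generated code;
// it does NOT make the increment atomic.
static volatile long counter = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000000; i++)
        counter++;                       // read-modify-write, not atomic
    return NULL;
}

int main(void) {
    cpu_set_t one_cpu;
    CPU_ZERO(&one_cpu);
    CPU_SET(0, &one_cpu);
    sched_setaffinity(0, sizeof(one_cpu), &one_cpu);   // forbid parallelism

    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    // Expected 20000000; runs typically print less, by a different amount
    // each time, because increments were lost to unlucky interleavings.
    printf("counter = %ld\n", counter);
    return 0;
}
```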
Yes that's true and I was wrong to say otherwise. A data race can happen with preemptive multithreading on data whose size exceeds what the platform guarantees to access atomically, typically the word size.
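One way to see the "wider than the platform can access atomically" case without depending on the word size is a record of two fields that are supposed to stay equal, updated by one thread and read by another. A hedged C/pthreads sketch (again, not Harmony code):

```c
// Sketch: a value too wide to update atomically. The pair should satisfy
// lo == hi at all times, but the writer needs two separate stores, so a
// reader interleaved (or preempted) between them can observe a torn,
// half-updated value. gcc -pthread; volatile only keeps the individual
// loads and stores in the generated code.
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct pair { long lo, hi; };                  // invariant: lo == hi
static volatile struct pair shared = {0, 0};
static volatile bool done = false;

static void *writer(void *arg) {
    (void)arg;
    for (long v = 1; v <= 5000000; v++) {
        shared.lo = v;                         // first half of the update
        shared.hi = v;                         // second half of the update
    }
    done = true;
    return NULL;
}

int main(void) {
    pthread_t w;
    long torn = 0;
    pthread_create(&w, NULL, writer, NULL);
    while (!done) {
        long lo = shared.lo;                   // reading the pair also takes
        long hi = shared.hi;                   //   two separate loads
        if (lo != hi)
            torn++;                            // a state that "never exists"
    }
    pthread_join(w, NULL);
    // On a multi-core machine this typically reports many torn reads; on a
    // single core it still happens whenever a preemption lands between the
    // writer's two stores, just less often.
    printf("torn observations: %ld\n", torn);
    return 0;
}
```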
A more accurate statement would be that parallelism introduces additional possibilities for data races beyond those possible from concurrent execution without parallelism.
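For what parallelism adds on top of interleaving, the classic store-buffering litmus test is one concrete case: an outcome that no single-core interleaving can produce, but that multi-core hardware does. A rough C11/pthreads sketch, assuming a multi-core x86 machine and gcc or clang (the inline asm is a compiler-only barrier):

```c
// Sketch: an outcome interleaving alone cannot produce.
//   thread A: x = 1; r1 = y;        thread B: y = 1; r2 = x;
// Under any single-core interleaving, whichever load runs last must see the
// other thread's earlier store, so r1 == 0 && r2 == 0 is impossible. With
// two cores and store buffers, both loads can run before either store
// becomes visible, and the "impossible" result shows up.
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define ROUNDS 200000

static atomic_int x, y, r1, r2;
static pthread_barrier_t start, stop;

static void *thread_a(void *arg) {
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_barrier_wait(&start);
        atomic_store_explicit(&x, 1, memory_order_relaxed);
        __asm__ __volatile__("" ::: "memory");   // block compiler reordering
        int seen = atomic_load_explicit(&y, memory_order_relaxed);
        atomic_store_explicit(&r1, seen, memory_order_relaxed);
        pthread_barrier_wait(&stop);
    }
    return NULL;
}

static void *thread_b(void *arg) {
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_barrier_wait(&start);
        atomic_store_explicit(&y, 1, memory_order_relaxed);
        __asm__ __volatile__("" ::: "memory");
        int seen = atomic_load_explicit(&x, memory_order_relaxed);
        atomic_store_explicit(&r2, seen, memory_order_relaxed);
        pthread_barrier_wait(&stop);
    }
    return NULL;
}

int main(void) {
    long witnessed = 0;
    pthread_t a, b;
    pthread_barrier_init(&start, NULL, 3);
    pthread_barrier_init(&stop, NULL, 3);
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_create(&b, NULL, thread_b, NULL);
    for (int i = 0; i < ROUNDS; i++) {
        atomic_store(&x, 0);
        atomic_store(&y, 0);
        pthread_barrier_wait(&start);        // release both threads
        pthread_barrier_wait(&stop);         // wait for the round to finish
        if (atomic_load(&r1) == 0 && atomic_load(&r2) == 0)
            witnessed++;                     // unreachable without parallelism
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("r1 == r2 == 0 in %ld of %d rounds\n", witnessed, ROUNDS);
    return 0;
}
```

If a run reports zero, raising ROUNDS usually suffices; the point is only that the outcome is reachable at all once the threads really execute in parallel, whereas no amount of single-core preemption can produce it.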
Data races are covered: https://harmony.cs.cornell.edu/book/#sec-57
Threads and locking introduce data race issues into Harmony's model, as far as I can tell.
Robert van Renesse is a hugely respected distributed systems researcher with decades of influential publications, and he is highly regarded as a mentor in the community.
I'm sorry that you feel his contributions are meaningless because he hasn't caught up with the way that devops people talk about these things.
Then I'm even more disappointed. Appeal to authority does not make a good publication. He should know better or at least clearly define the terms he uses. Words have meaning. If you use the same words, but with a different meaning, it will produce confusion. Confusion may be a feature in marketing material, but if this publication is intended to teach something, then the confusion is a bug.
Programming with concurrency is hard. Concurrency can make programs faster than sequential ones, but having multiple threads read and update shared variables concurrently and synchronize with one another makes programs more complicated than programs where only one thing happens at a time. Why are concurrent programs more complicated than sequential ones? There are at least two reasons:

The execution of a sequential program is usually deterministic: if you run the program twice with the same input, the same output will be produced. Bugs are reproducible and thus easy to track down, for example by instrumenting the program. They are called Bohrbugs. However, the output of running a concurrent program depends on how the executions of the various threads are interleaved. Some bugs may occur only occasionally and may never occur when the program is instrumented to find them. We call these Heisenbugs: the overhead caused by instrumentation leads to timing changes that can make such bugs less likely to cause havoc.

In a sequential program, each statement and each function can be thought of as happening atomically (indivisibly) because there is no other activity interfering with their execution. Even though a statement or function may be compiled into multiple machine instructions, they are executed back-to-back until completion. Not so with a concurrent program, where other threads may update memory locations while a statement or function is being executed.
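To make the "a statement is not atomic" point concrete, here is a small C/pthreads sketch (my own illustration, not code from the book) that writes out the load and store hiding behind `balance = balance + 1` and yields the CPU in between, turning a once-in-a-million Heisenbug into one that shows up on nearly every run:

```c
// Sketch: the statement  balance = balance + 1  is a load, an add, and a
// store. Writing the steps out and yielding the CPU between them widens the
// race window, so the lost update that might otherwise strike very rarely
// (a Heisenbug) appears almost every time. gcc -pthread.
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static long balance = 0;                 // shared variable

static void *deposit_100(void *arg) {
    (void)arg;
    for (int i = 0; i < 100; i++) {
        long tmp = balance;              // 1. load the shared variable
        sched_yield();                   //    invite a context switch here
        balance = tmp + 1;               // 2. store the updated value back
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, deposit_100, NULL);
    pthread_create(&b, NULL, deposit_100, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    // Run sequentially this is always 200; with two threads it is usually
    // less, and often a different value from one run to the next.
    printf("balance = %ld\n", balance);
    return 0;
}
```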
The concurrency-versus-parallelism terminology debate is the bike shed of these discussions: people with nothing useful to add just parrot it. It is the worst.