Hmmm. Some stuff to like, but I do feel like this should have a big, noticeable cautionary note that it does not wait if you end early (e.g. via Err). Any pending actions continue running and draining in background goroutines, and are potentially VERY delayed if internal/core.Delay is ever exposed or your funcs sleep.
I've seen that kind of pattern lead to TONS of surprise race conditions in user code, because everyone always thinks that "it returned" means "it's done". Which is reasonable, since nothing else is measurable on the caller side; a change in that behavior won't be noticeable to callers, and it may violate their expectations and cause crashes.
Hi everyone. Posting on HN for the first time. I'd like to share Rill - a toolkit for composable channel-based concurrency that makes it easy to build concurrent programs from simple, reusable parts.

Example of what it looks like:

    // Convert a slice into a channel
    ids := rill.FromSlice([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)

    // Read users from API with concurrency=3
    users := rill.Map(ids, 3, func(id int) (*User, error) {
        return api.GetUser(ctx, id)
    })

    // Process users with concurrency=2
    err := rill.ForEach(users, 2, func(u *User) error {
        if !u.IsActive {
            u.IsActive = true
            return api.SaveUser(ctx, u)
        }
        return nil
    })

    // Handle errors
    fmt.Println("Error:", err)

Key features:

- Makes concurrent code composable and clean
- Works for both simple cases and complex pipelines
- Built-in batching support
- Order preservation when needed
- Centralized error handling
- Zero dependencies

The library grew from solving real concurrency challenges in production. Happy to discuss the design decisions or answer any questions.

I'm curious - what other technologies/libraries/APIs, including in other languages, did you draw on for inspiration, or would you say are similar to Rill?
The short answer would be: I kept writing code that spawns goroutines which read from a channel, do some processing, and write results to another channel. Add some wait/err groups to this and you get a lot of boilerplate repeated all over the place. I viewed this as "channel transformations" and wanted to abstract it away. When generics came out, it became technically possible.
Surprisingly, part of my inspiration came from Scala (which I haven't touched since 2014). Back then Scala had transformable streams and the "Try" type.
Is there an underlying assumption that the channels are containers and not streams?
No, it's the opposite - the library treats channels as streams, processing items as they arrive without needing to know the total size in advance. This is why it can handle infinite streams and large datasets that don't fit in memory.
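To make the streaming point concrete, here's a minimal sketch of consuming an unbounded source with bounded memory. It assumes Generate and ForEach behave as in the example above; Event, nextEvent, and handleEvent are hypothetical stand-ins.

    // Hypothetical sketch: Event, nextEvent, and handleEvent are placeholders.
    events := rill.Generate(func(send func(Event), sendErr func(error)) {
        for ctx.Err() == nil {
            ev, err := nextEvent(ctx) // e.g. read the next message from a queue
            if err != nil {
                sendErr(err)
                return
            }
            send(ev)
        }
    })

    // Only a bounded number of events is in flight at once (concurrency=8),
    // so memory stays flat no matter how long the stream runs.
    err := rill.ForEach(events, 8, func(ev Event) error {
        return handleEvent(ctx, ev)
    })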
Very intuitive API. Thanks!
Looks good, similar to https://github.com/sourcegraph/conc which we've been using for a while. Will give this a look.
There are also libraries like https://github.com/Jeffail/tunny or https://pkg.go.dev/go.uber.org/goleak or https://github.com/fatih/semgroup to help deal with concurrency limits and goroutine lifecycle management.
As the author of https://github.com/ahmetb/go-linq, it's hard to find adoption for libraries offering "syntactic sugar" in Go, as the language culture discourages that kind of abstraction in favor of keeping the code straightforward.
Thanks for sharing what is working for you in production. I made a somewhat similar library that also does batching (docs are sparse, but I am updating docs on my libraries this week) [1].
I would call this parallelism rather than concurrency.
The main issue I have with this library's implementation is how errors are handled. Errors are retrieved rather than assigned, but assignment is preferable because it gets verified by tools. In my library I used a channel for errors, which gives ultimate flexibility: it can be converted to wait and collect them into a slice, or to perform a cancellation on first error.
[1] https://github.com/gregwebs/go-parallel
Thank you for the feedback. My design decision is of course a tradeoff. When multiple channels are exposed to the users (not encapsulated inside the lib), this forces them to use "select", and that is very error prone in my experience.
I never needed to use select on an error channel for my use cases because at the point I operate on the error channel I want to block for completion. And I provide helpers for the desired behavior for the channel so I don't even directly receive from it. I see that some of Rill is designed to operate on continuous streams, and in that light the design decision makes sense. For my use cases though the stream always had an end.
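For illustration, here's the error-channel pattern sketched in plain Go rather than either library's actual API; Task, tasks, and Run are hypothetical.

    // Plain-Go sketch of the error-channel pattern; not go-parallel's or Rill's API.
    errs := make(chan error, len(tasks)) // buffered so workers never block on it
    var wg sync.WaitGroup
    for _, t := range tasks {
        wg.Add(1)
        go func(t Task) {
            defer wg.Done()
            if err := t.Run(ctx); err != nil {
                errs <- err
            }
        }(t)
    }
    wg.Wait()
    close(errs)

    // Collect every error into a slice...
    var all []error
    for err := range errs {
        all = append(all, err)
    }
    // ...or, alternatively, have a separate goroutine receive from errs and call
    // cancel() on the first error so the remaining work stops early.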
Love the idea, some weirdness though:
> Here's a practical example: finding the first occurrence of a specific string among 1000 large files hosted online. Downloading all files at once would consume too much memory, processing them sequentially would be too slow, and traditional concurrency patterns do not preserve the order of files, making it challenging to find the first match.
But this example will process ALL items, it won't break when a batch of 5 finds something?
It will. Otherwise the example wouldn't make sense. There's one important detail I haven't clarified enough in that part of the readme.
For proper pipeline termination the context has to be cancelled. So it should have been like:
    func main() {
        ctx, cancel := context.WithCancel(context.Background())
        defer cancel()

        urls := rill.Generate(func(send func(string), sendErr func(error)) {
            for i := 0; i < 1000 && ctx.Err() == nil; i++ {
                send(fmt.Sprintf("https://example.com/file-%d.txt", i))
            }
        })
        ...
    }
One of the reasons I've omitted context cancellation in this and some other examples is that everything's happening inside the main function. I'll probably add cancellations to avoid confusion. I've also just pushed a few small changes to the readme that clarify these things.
Nice! I do a lot of concurrency work with DAGs in https://github.com/purpleidea/mgmt/ and I would love to swap out some of those concurrency runners with a lib if possible.
I was wondering if this could be it... Any thoughts in that direction, please let me know!
this is a real problem in go, very easy to have bugs when working with channels and the way it handles errors etc.
If you write comprehensive unit tests, it is not easy to have bugs in golang. Especially as things change over time. A library like this isn't going to protect you from having bugs.
TIL: HN doesn't like writing tests. The downvotes on this are hilarious. "Job security" ¯\_(ツ)_/¯.
Hard disagree on this. Large production apps that use channels have very subtle bugs that cause all kinds of annoying issues that only come up under load in prod. I have been using Go for ten years and still pick it as my language of choice for most projects; however, I stay away from channels, and especially any complex use of them, unless it's 100% required to scale the application. Even then, you can most of the time come up with a better solution by re-architecting that part of the application. For pet projects, go crazy with them though.
What are you disagreeing with exactly, are you trying to argue against testing? Are you trying to argue that using a library protects you from bugs somehow?
You stay away from something you don't understand after 10 years of working with it? What kind of logic is that? Channels aren't magic.
Subtle bugs in what? Have you considered that maybe you have bugs because you aren't writing tests?
If you aren't unit testing that stuff, then how are you able to fix/change things and know it is resolved?
My experience is that I built a binary that had to run perfectly on 30,000+ servers across 7 data centers. It was full of concurrency. Without a litany of automated tests, there is no way that I would have trusted this to work... and it worked perfectly. The entire deployment cycle was fully automated. If it passed CI, I knew that it would work.
It wasn't easy, it took a lot of effort to build that level of testing. But it was also totally bug free in production, even over years of use and development.
You asserted that it's hard to have bugs if you write unit tests. The parent stated that some issues only occur under production load and a unit test will not catch them. Nowhere was it implied that unit tests are useless.
Perhaps a less defensive posture might invite more discussion.
> The parent stated that some issues only occur under production load and a unit test will not catch it.
I can't think of a single production problem that can't be replicated with a unit test. If you're seeing a problem in production, you need to fix it. How do you fix it? You write a test that replicates the problem and then fix the code, which fixes the test.
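As a simplified illustration of that workflow: a regression test that reproduces a data race on a shared counter, the kind of thing `go test -race` flags reliably once you've written the reproduction. The Counter type is a hypothetical stand-in for whatever misbehaved in production.

    // Hypothetical regression test for a race on a shared counter.
    func TestCounterConcurrentIncrement(t *testing.T) {
        c := &Counter{} // type under test; an unsynchronized Inc() fails under -race
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                c.Inc()
            }()
        }
        wg.Wait()
        if got := c.Value(); got != 100 {
            t.Fatalf("expected 100 increments, got %d", got)
        }
    }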
The original comment was about how concurrency expands the opportunity for errors / makes it easier for there to be errors in Go (which avoids LOTS of other errors just with compile time / type safety stuff).
"very easy to have bugs when working with channels and the way it handles errors etc"
If you've done some programming you'll find this to be true. You have to think a LOT harder if doing concurrency, and you generally have to do a lot more tests.
Go, even without that much testing, is often surprisingly error free out of the box compared to more dynamic languages, both on the language side and in how a lot of development happens. Python, by contrast, can have errors in dependencies, in the deployment environment (even if the code is fine), from platform differences (tz data on Windows), and plenty of runtime messes.
Channels are not as safe or simple by default, once the code compiles, as a lot of other Go.
Try programming without channels in go and this may become clearer.
I think you're getting downvoted for the unsupported assertion that "If you write comprehensive unit tests, it is not easy to have bugs in golang." Probably because you made that assertion in the context of a discussion of channels, widely believed to have underlying concurrency semantics that are subtle and easy to misunderstand, making "write comprehensive unit tests" seem like a strategy that's apt to let real-world problems slip through (because a programmer's belief that their tests are "comprehensive" is likely to be mistaken).
Go makes it easier to write concurrent code, but it's a serious chore to iron out all of the kinks in more complex tasks. I've missed some weird stuff over the years.
I don't blame Go. It's an inherently difficult problem space. As a result, testing isn't a trivial job either. I wish it was.
It is not a chore, it is our job. This is what we do. We write code. Of course you've missed stuff, we all have. Tests help alleviate the missed stuff. Even better is that they protect us over time, especially when we want to refactor things or find bugs in production. How do you fix a production bug without breaking something else? You write tests so that you know your code works.
Again with the HN downvotes, hilarious. People really hate the truth.
I think what you're missing is that "you write tests so that you know your code works" doesn't actually work for some important classes of "works," security and concurrency (the subject of this HN discussion) being two prominent ones. That's because testing only shows that your code works for the cases you test. And, when it comes to security and concurrency, identifying the cases that matter is a hard enough problem that if you can figure out how to do it more reliably, that's publishable research.
Think about it: If you're writing code and don't realize that it can deadlock under certain conditions, how are you going to realize that you need to test for whether it deadlocks under those conditions? If you're writing code that interpolates one kind of string into another and don't realize that you may have created an XSS vulnerability, are you suddenly going to gain that insight when you're writing the tests?
You’re getting downvoted because you’re essentially arguing that a language abstraction which is a known source of bugs can be solved simply by writing better code.
which misses the point of the OP.
They're also suggesting a method of testing that, under most circumstances, almost certainly doesn't offer sufficient assurance of uncovering all possible bugs. When I've got concurrency in an application, I'll use unit tests here and there, but mostly I want assurance that the entire system behaves as expected. It's too much complexity to rely on unit tests.
Very true. As the author of several multithreaded applications, I concur that unit testing thread interactions is hard and seldom exhaustive.
It is not exhaustive because you haven't taken the effort to do it. It isn't easy; you have to write your code in a way that can be tested. It takes planning and effort to do this, but it pays off with applications that aren't full of bugs.
You sound like the people who argue that, despite decades of security vulnerabilities offering evidence otherwise, C is perfectly safe if you know what you’re doing and just put more effort into it.
Technically you may be right, but it’s not a helpful viewpoint. What the world needs are abstractions that are easier to understand and program correctly, not assertions that everyone else is doing it wrong and just needs to be smarter/work harder.
It’s not exhaustive because complex multi-threaded software has a plethora of hidden edge cases, many of which actually fall outside the traditional remit of a unit test.
This is where other forms of software testing come into play. Such as integration tests.
All this "complexity" can be unit tested; I've done it.
Trying to handwave and say that your code is too complex to be tested is very strange to me. Maybe take a step back and ask yourself why your code is too complicated to test. Maybe refactor it a bit to be less complicated or more easily testable.
I think that using iterators in the public API would have been better than channels.
Rill might look like it tries to be a replacement for iterators, but it's not the case. It's a concurrency library, that's why it's based on channels
I disagree that using channels is necessary for concurrency. Consider the following iterator based signature for your Map function:
Channels are many-to-many; many goroutines can write to them simultaneously, as well as read from them. This library is pitched right at where people are using this, so it's rather a key feature. An iterator is not. Even if you wrap an iterator around a channel, you're still losing features that channels have: for example, channels can also participate in select statements, which has varying uses on the read and write sides, both useful for a variety of use cases that iterators are not generally used for.
They may not be "necessary" for concurrency but given the various primitives available they're the clear choice in this situation. They do everything an iterator does, plus some stuff. The only disadvantage is that their operations are relatively expensive, but as, again, we are already in a concurrency context, that is already a consideration in play.
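To make the select point concrete, here's a small plain-Go sketch (nothing Rill-specific) of something a channel can participate in that an iterator can't express directly: waiting on a result, a timeout, and cancellation at the same time. The results channel, handle, and ctx are assumed from surrounding code.

    select {
    case r := <-results: // results is written to by a worker goroutine
        handle(r) // hypothetical handler for a received result
    case <-time.After(2 * time.Second):
        log.Println("timed out waiting for a result")
    case <-ctx.Done():
        log.Println("cancelled:", ctx.Err())
    }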
The ability to participate in select statements is a good call out. Thanks for taking the time to reply.
that makes Scala look easy.
For context, here's the existing Map signature from the linked library:
Are you suggesting that this channel-based signature is significantly easier to understand than the one I shared?
No, it was more of a general comment that once Go added support for generics, doing functional-style programming starts to look as complex as (or actually more complex than) in languages that built in support from the beginning.
This is great. I am working on a robotics application and this seems like a better abstraction than alternatives such as local messaging servers. How do you deal with something like back pressure or not keeping up with incoming data?
The lib is based on channels and inherits the channel behavior in terms of backpressure. Simply put, if no one reads on one side of the pipeline, it isn't possible to write anything on the other side. Still, it's possible to add buffering at arbitrary points in the pipeline using the rill.Buffer function.
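In other words, backpressure falls out of ordinary channel semantics. A minimal plain-Go illustration (nothing Rill-specific):

    ch := make(chan int) // unbuffered: each send blocks until a receiver takes the value
    go func() {
        for i := 0; i < 10; i++ {
            ch <- i // blocks whenever the consumer below falls behind
        }
        close(ch)
    }()
    for v := range ch {
        time.Sleep(100 * time.Millisecond) // slow consumer: the producer is throttled to this pace
        fmt.Println(v)
    }
    // A buffered channel, make(chan int, 100), lets the producer run ahead by up
    // to 100 items before blocking - which is what adding a buffer stage does.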
Hi! Looks great. I might use this to fan out/in my RSS reader HTTP calls.
How would I implement timeout? In case a HTTP call takes too long?
You might be interested by something that has been designed specifically for this problem. I created a state machine library for Go on top of which I mapped retry[1] and some other patterns. And funnily enough one of the first applications I implemented is an RSS reader[2]
[1] https://pkg.go.dev/git.sr.ht/~mariusor/ssm#example-Retry
[2] https://git.sr.ht/~mariusor/frankenjack/tree/master/item/sou...
For now, the library is context-agnostic by design. For HTTP timeouts, you'd use Go's standard approaches: either set the HTTP client timeout or pass a context with timeout to each request. Please let me know more about your use case - I'll let you know if Rill isn't a good fit.
Based on the examples and documentation, rill doesn't manage context for you. You'd simply set the client timeout or give each http call a timeout context.
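Concretely, a sketch of both approaches; urls and ctx are assumed to come from the surrounding pipeline, and the Map stage just mirrors the one in the original example rather than any documented Rill HTTP helper.

    // Option 1: a client-wide timeout that applies to every request.
    client := &http.Client{Timeout: 10 * time.Second}

    // Option 2: a per-request timeout via context, inside a Map stage.
    bodies := rill.Map(urls, 3, func(url string) ([]byte, error) {
        reqCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
        defer cancel()
        req, err := http.NewRequestWithContext(reqCtx, http.MethodGet, url, nil)
        if err != nil {
            return nil, err
        }
        resp, err := client.Do(req)
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()
        return io.ReadAll(resp.Body)
    })
    // bodies is then consumed by later stages, e.g. rill.ForEach.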
Any plans to add context support?
I am thinking about it. To be honest, the current design works fine for my use cases: simply put, the function that defines a pipeline should have context.WithCancel() and defer cancel() calls.
I need feedback on this. What kind of built-in context support would work for you? Do you need something like errgroup's ability to automatically cancel the context on first error?
I've just pushed a few small changes to the readme that better explain context usage.
The API looks really nice and intuitive! What motivated you to build this?
Thank you! There are two pieces of motivation here. The first one is removing the boilerplate of spawning goroutines that read from one channel and write to another. Such code also needs a wait/err group to properly close the output channel. I wanted to abstract this away as a "channel transformation" with user-controlled concurrency.
Another part is to provide solutions for some real problems I had, most notably batching, error handling (in multi-stage pipelines), and order preservation. I thought they were generic enough to be part of a general-purpose library.
what sort of environment do you need to be in to have to compose concurrency like this instead of relying on native go's scaling?
The same sort of environment in which one uses abstractions like "functions" instead of relying on the language's native ability to run sequential instructions.
It's generally good for languages to provide relatively low-level functionality and let libraries be able to build on top of it, because as the programming language development world has now learned many times over, the hardest code to change is the code in the language and its standard library. It isn't the job of the language itself to provide every possible useful iteration on the base primitives it provides.
Batching is a pattern I’ve had to manually build in the past to push large amounts of analytic data to a database. I’d push individual events to be logged, map reduce those in batches and then perform insert on duplicate update queries on the database, otherwise the threshold of incoming events was enough to saturate the connection pool making the app inoperable.
Even optimizing to the point where, if an app instance knew it had already run the insert-on-duplicate-update for a specific unique index (by storing that in a hash map), it only ran updates from there on out to increase the count of occurrences of that event, was enough to find significant performance gains as well.
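A rough plain-Go sketch of that kind of batching stage (Event and flushBatch are hypothetical stand-ins, not the actual production code):

    // Collects events into batches of up to 500, flushing at least once per second,
    // so the database sees a few bulk upserts instead of a flood of single inserts.
    func batchEvents(events <-chan Event, flushBatch func([]Event) error) error {
        const maxBatch = 500
        ticker := time.NewTicker(time.Second)
        defer ticker.Stop()

        batch := make([]Event, 0, maxBatch)
        flush := func() error {
            if len(batch) == 0 {
                return nil
            }
            err := flushBatch(batch) // e.g. one INSERT ... ON DUPLICATE KEY UPDATE
            batch = batch[:0]
            return err
        }

        for {
            select {
            case ev, ok := <-events:
                if !ok {
                    return flush() // input closed: flush whatever is left
                }
                batch = append(batch, ev)
                if len(batch) == maxBatch {
                    if err := flush(); err != nil {
                        return err
                    }
                }
            case <-ticker.C:
                if err := flush(); err != nil {
                    return err
                }
            }
        }
    }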
I think it is time to face the fact that CSP-style channels are a bad idea if you don't also have the occam semantics for channel scope. (I know there are other people here that understood that sentence.)
The problem in golang is that channel cleanup, in particular, is a mess. In occam they come and go simply by existing as variables in seq or par blocks. Occam is very static though, so the equivalent of goroutines are all allocated at build time. (Occam-pi attempted to resolve this, iirc.)
https://en.m.wikipedia.org/wiki/Occam_(programming_language)
Some of the patterns in this library are reminiscent of constructs occam has as basic constructs, such as the for each block, although the occam one must have the number of blocks known at build time.
The fact so many people in golang reach for mutexes constantly is a sign things are not all well there.
Channels are just another synchronization primitive in your toolbox. They do make some things much simpler, but there's no reason to reach for one if a mutex does the job.
The usage of mutexes doesn't make channels "bad" for the same reason that usage of atomics doesn't make mutexes bad.
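For example, a shared counter is simpler with a mutex; a minimal sketch:

    // A mutex is the simpler tool for a tiny critical section like this.
    type counter struct {
        mu sync.Mutex
        n  int
    }

    func (c *counter) Inc() {
        c.mu.Lock()
        c.n++
        c.mu.Unlock()
    }
    // The channel-based alternative needs a dedicated goroutine that owns n plus a
    // request channel - more moving parts for no benefit here.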
Very handy!
It looks great. What are other existing tools? And how does it compare to them?
Sourcegraph Conc is broadly similar in providing pool helpers, but doesn't provide the same fine grained batching options: https://github.com/sourcegraph/conc
Uber CFF does code generation, and has more of a focus on readability and complex dependency chains: https://github.com/uber-go/cff
The batching concept is a cool idea and could be useful in the right context. That said, this feels like a JavaScript engineer's take on Go. Abstractions like Map and ForEach don't align with Go's emphasis on simplicity and explicitness. The lack of context.Context handling also seems like an oversight, especially when considering concurrency.
Judging by the praise, I'm probably in the minority, but as a code reviewer, I’d much rather see straightforward loops, channels, and Go's native constructs over something like Rill.
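For comparison, here's roughly the kind of plain version I mean, using a loop plus errgroup from golang.org/x/sync. It's sketched to match the "process users with concurrency=2" step from the readme example (users is assumed to be a plain channel of *User here); it's not a claim about how Rill works internally.

    g, ctx := errgroup.WithContext(ctx)
    g.SetLimit(2) // at most 2 goroutines at a time
    for u := range users {
        u := u
        g.Go(func() error {
            if !u.IsActive {
                u.IsActive = true
                return api.SaveUser(ctx, u)
            }
            return nil
        })
    }
    if err := g.Wait(); err != nil {
        fmt.Println("Error:", err)
    }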
> Map and ForEach don't align with Go's emphasis on simplicity and explicitness
I've never paid my bills with Go, but `Map` and `ForEach` don't seem all that different than `for _, u := range Users` to me. Yes, the former is "functional" but only mildly.
In that case there's no particular reason to use them. As far as Go's philosophy goes.
touché
If you were to build a library like `rill` in the Go-way, what would your Batch API usage look like?
I don’t agree with your comment about Map and ForEach, just by virtue of the fact that sync.Map exists in Go’s standard library.
But your point about the lack of contexts is definitely a deal breaker for me personally too.
:= for assigning a variable sounds and looks weird to me
It's the walrus operator. Pascal and Python use it as well. You get used to it pretty quickly.
Pascal: https://www.freepascal.org/docs-html/ref/refse104.html#x224-... Python: https://docs.python.org/3/whatsnew/3.8.html#assignment-expre...
It was something introduced like an April Fools' joke. A clean implementation of any programming language won't have that.
I actually think it’s more readable because it makes the distinction between assignment and equivalence very clear.
Bugs originating from the similarity of == vs = have probably cost the industry millions over the last 3 decades.
That ship has long sailed. ALGOL 1958 used := and Pascal popularized it.
https://en.m.wikipedia.org/wiki/Assignment_(computer_science...
I think they scan the code two characters at a time because one is not enough for <= and >=, which is why assignment is := or =:. Probably + is ++ too.