Sounds interesting. To me, the obvious next step would be to look at aggressive result caching for the micro-steps (subtasks).
By that I mean it sounds like the size of these micro-steps (including all input/context/etc passed to them) might be extremely small.
If their entire input is smaller than some yet-to-be-determined-threshold, then once the "correct" result is known (ie voted upon) it should be cached for extremely fast re-use rather than needing to run it through a sub-agent/model again.
Sounds interesting. To me, the obvious next step would be to look at aggressive result caching for the micro-steps (subtasks).
By that I mean it sounds like the size of these micro-steps (including all input/context/etc passed to them) might be extremely small.
If their entire input is smaller than some yet-to-be-determined-threshold, then once the "correct" result is known (ie voted upon) it should be cached for extremely fast re-use rather than needing to run it through a sub-agent/model again.
Calling a single LLM call "micro agent" is asinine.