I've been following agents.json for a little while. I think it has legs, and would love to see some protocol win this space soon.
Will be interesting to see where the stateful/stateless conversation goes. My gut tells me MCP and "something" (agents.json perhaps) will co-exist. My reasoning is purely organisational: MCP focuses on a lot more, and slimming it down into a stateless protocol might be nigh impossible.
---
Furthermore, if agents.json wants to win as a protocol through early adoption, the docs need to be far easier to grok. An example should be immediately viewable, with the schema close by. The pitch should be very succinct, and the fields in the schema need the same clarity at first glance. Maybe build a tool that anyone can paste their OpenAPI schema into, which passes it to an LLM to generate a first pass of what their agents.json could look like.
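Roughly what I'm imagining, as a minimal sketch. Everything here is an assumption on my part (the prompt, the model, and the output shape), since no such generator exists yet:

```python
# Hypothetical first-pass generator: paste in an OpenAPI spec, get back an
# LLM-drafted agents.json. The prompt and output shape are guesses, not the
# real agents.json schema.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def draft_agents_json(openapi_spec: dict) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": (
                "Convert this OpenAPI spec into a draft agents.json: group "
                "related endpoints into outcome-oriented flows, each with a "
                "name, a description, and an ordered list of API calls."
            )},
            {"role": "user", "content": json.dumps(openapi_spec)},
        ],
    )
    return json.loads(response.choices[0].message.content)
```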
---
The OpenAPI <> agents.json portability is a nice touch, but might actually be overkill. OpenAPI is popular, but it never actually took over the market imo. If there is added complexity in agents.json because of this, I'd really question whether it's worth supporting. They don't have to be 100% interoperable; custom converters could manage partial support.
---
A lot of people are using agentic IDEs now. It would be nice if agents.json shared a snippet, with instructions on how to use it, where to find docs, and how to pull a list and/or search the registry, that people can just drop straight into Windsurf/Cursor.
1) Thanks for being a part of the journey! We also want something that works for us as agent developers. We didn't feel like anything else was addressing this problem and felt like we had to do it ourselves.
We love feedback! This is our first time doing OSS. I agree - MCP and agents.json are not mutually exclusive at all. They solve for different clients.
2) Agreed. Something we're investing in soon is a generic SDK that can run any valid agents.json. That means the docs might get a revamp soon too.
3) While many API services may not use OpenAPI, their docs pages often do! For example, readme.com lets you export your REST API docs as OpenAPI. As we add more types of action sources, agents.json won't be 1:1 with OpenAPI. In that way, we left the future of agents.json extensible.
4) Great idea! I think this would be so useful.
In what ways is the agents.json file different from an OpenAPI Arazzo specification? Is it more native for LLM use? Looking at the example, I'm seeing similar concepts between them.
We've been in touch with Arazzo after we learned of the similarities. The long-term goal is to be aligned with Arazzo. However, the tooling around Arazzo isn't there today and we think it might take a while. agents.json is meant to be more native to LLMs, since Arazzo serves other use cases than LLMs.
To be more specific, we're planning to support multiple types of sources alongside REST APIs, like internal SDKs, GraphQL, gRPC, etc.
Thanks, that's helpful. I agree there are many sources other than REST APIs where this would be helpful. Beyond that, I'd be interested in understanding where Arazzo takes a broader approach that doesn't really fit an LLM use case.
It's not that Arazzo can't work for LLMs, just that it's not the primary use case. We want to add LLM-enabled transformations between linkages. Arazzo, having to serve other use cases like API workflow testing and guided docs experiences, may not be incentivized to support these types of features.
This is interesting but why do you make it so hard to view the actual agents.json file? After clicking around in the registry (https://wild-card.ai/registry) for 10 minutes I still haven't found one example.
That's a good point. I'll add a download button to the registry. The agents.json files are also available here: https://github.com/wild-card-ai/agents-json/tree/master/agen...
EDIT: updated
Great, thanks!
What's the license of the Python package: https://pypi.org/project/agentsjson/
AGPL? https://github.com/wild-card-ai/agents-json/blob/master/LICE...
Yup. The specification is under Apache 2.0 and the Python package is under AGPL.
The full licenses can be found here: https://docs.wild-card.ai/about/licenses
Cool idea, but it seems to be dead on arrival due to licensing. Would love to have the team explain how anyone can possibly adopt their AGPL package into their product.
A couple of people have mentioned some relevant things in this thread. This SDK isn't meant to be restrictive. It can be implemented in other open-source frameworks as a plugin (e.g. BrowserUse, Mastra, LangChain, CrewAI, ...). We just don't want someone like AWS to flip this into a proxy service.
Some have asked us to host a version of the agents.json SDK. We're torn on this, because we want to make it easier for people to develop with agents.json, but acting as a proxy isn't appealing to us or to many of the developers we've talked to.
That said, what do you think is the right license for something like this? This is our first time doing OSS.
Elastic License V2? https://www.elastic.co/licensing/elastic-license
Potentially. I also have reservations about it not technically being open source.
Sounds like the spec is Apache 2.0. The Python package is AGPLv3, but the vast majority of the code in there looks to be codegen from OpenAPI specs. I'd imagine someone could create their own implementation without too much headache, though I'm just making an educated guess.
Echoing this - is there a commercialization play you're hoping to make?
How does this compare to llms.txt? I think that's also emerging as a sort of standard to let LLMs understand APIs. I guess agents.json offers better packaging/structural understanding of different endpoints?
llms.txt is a great standard for making website content more readable to LLMs, but it doesn’t address the challenges of taking structured actions. While llms.txt helps LLMs retrieve and interpret information, agents.json enables them to execute multi-step workflows reliably.
This couldn't be simpler, which is a good thing. Well done!
BTW I might have found a bug in the info property title in the spec: "MUST provide the title of the `agents.json` specification. This title serves as a human-readable name for the specification."
It now reads "MUST provide the title of the `agents.json` specification file. ..." Thanks for the heads up!
Can someone help me understand why agents can't just use APIs documented by an OpenAPI spec? It seems to work well in my own testing, but I'm sure I'm missing something.
LLMs do well with outcome-described tools, but APIs are written as resource-based atomic actions. By describing an API as a collection of outcomes, LLMs don't need to re-reason each time an action needs to be taken.
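As a rough illustration of the difference (the field names below are invented for this example, not the actual agents.json schema):

```python
# Instead of exposing three atomic endpoints that the LLM must chain and
# re-reason about on every request, expose one outcome. Field names here
# are illustrative only.
flow = {
    "id": "send_email_to_contact",
    "description": "Look up a contact by name and send them an email.",
    "actions": [
        {"operationId": "searchContacts"},  # resolve name -> email address
        {"operationId": "createDraft"},     # draft the message
        {"operationId": "sendDraft"},       # send it
    ],
    "links": [
        # wire outputs of one step into inputs of the next, so the LLM
        # doesn't have to re-derive the plumbing each time
        {"from": "searchContacts.email", "to": "createDraft.to"},
        {"from": "createDraft.id", "to": "sendDraft.draftId"},
    ],
}
```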
Also, when an OpenAPI spec gets sufficiently big, you face a needle-in-the-haystack problem: https://arxiv.org/abs/2407.01437.
Does anyone have any pro tips for large tool collections? (mine are getting fat)
I plan on doing the two-layered system mentioned earlier, where the first layer of tool calls is as slim as it can be, with a second layer for more in-depth tool documentation.
And/or chunking tools, creating embeddings, and using RAG.
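A minimal sketch of what I mean, assuming OpenAI embeddings and a flat in-memory index (any embedding model or vector store would do):

```python
# Embed each tool's description once, offline; at request time, retrieve
# only the top-k most relevant tools and expose just those to the model.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

tools = [
    {"name": "send_email", "description": "Send an email to a recipient."},
    {"name": "create_invoice", "description": "Create and send an invoice."},
    # ... hundreds more
]
tool_vecs = embed([t["description"] for t in tools])  # compute once, cache

def top_k_tools(query: str, k: int = 5) -> list[dict]:
    q = embed([query])[0]
    sims = tool_vecs @ q / (np.linalg.norm(tool_vecs, axis=1) * np.linalg.norm(q))
    return [tools[i] for i in np.argsort(sims)[::-1][:k]]
```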
Funnily enough, a search tool to solve this problem was our product going into YC. Now it's a part of what we do with wild-card.ai and agents.json. I'd love to extend the tool search functionality for all the tools in your belt.
It took us a decently long time to get the search quality right. Just a heads-up in case you want to implement this yourself.
I can agree this is a huge problem with large APIs; we are doing it with Twilio's API and it's rough.
Thinking from the retrieval perspective, would it make sense to have two layers?
The first layer just describes, at a high level, the tools available and what they do, and makes the model pick or route the request (via the system prompt, or a small model).
The second layer implements the actual function calling or OpenAPI spec, which would then give the model more details on the params and structure of the request.
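Roughly, as a sketch (model names and prompts are placeholders, and the routing could just as well be a system prompt in a single model):

```python
# Layer 1: a cheap model picks from one-line tool summaries.
# Layer 2: full function calling with the chosen tool's complete schema.
from openai import OpenAI

client = OpenAI()

def route(request: str, summaries: dict[str, str]) -> str:
    """Layer 1: pick one tool name from slim, one-line summaries."""
    menu = "\n".join(f"- {name}: {desc}" for name, desc in summaries.items())
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content":
                   f"Pick the single best tool for: {request}\n\n"
                   f"Tools:\n{menu}\nReply with the tool name only."}],
    )
    return resp.choices[0].message.content.strip()

def call_tool(request: str, tool_schema: dict):
    """Layer 2: only now attach the full JSON schema for the chosen tool."""
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": request}],
        tools=[{"type": "function", "function": tool_schema}],
    )
```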
That approach does a lot better, but LLMs still have a positional bias problem baked into the transformer architecture (https://arxiv.org/html/2406.07791v1). This is where the LLM is biased toward information earlier in the prompt over information later, which is unfortunate for tool selection accuracy.
Since 2 steps are required anyway, you might as well use a dedicated semantic search for tools, like in agents.json.
Interesting. This is the first time I'm hearing about intrinsic positional bias in LLMs. I had some intuition on this but nothing concrete.
Looks cool! How is it similar/different from MCP?
Thanks! MCP takes a stateful approach, where every client maintains a 1:1 connection with a server. This means that for each user/client connected to your platform, you'd need a dedicated MCP server. We're used to writing software that interfaces with APIs as stateless and deployment-agnostic; agents.json keeps it that way.
For example, you can write a web-based chatbot that uses agents.json to interface with APIs. To do the same with MCP, you'd spin up a separate lambda or deployed MCP server for each user.
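A loose sketch of that stateless shape (`load_agents_json` and `execute_flow` are hypothetical stand-ins here, not the actual agentsjson SDK API):

```python
# The point is the shape: every request carries what it needs, the server
# holds no per-user session, and one deployment serves all users.
import json
import urllib.request
from fastapi import FastAPI

def load_agents_json(url: str) -> dict:
    """Hypothetical: fetch and parse a provider's agents.json."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def execute_flow(agents: dict, prompt: str, auth: str) -> str:
    """Hypothetical stand-in for the SDK's plan-and-execute step."""
    raise NotImplementedError

app = FastAPI()
AGENTS = load_agents_json("https://api.example.com/agents.json")  # cached once

@app.post("/chat")
def chat(user_msg: str, user_api_key: str) -> dict:
    # Auth and context arrive with the request, like any REST integration;
    # no per-user MCP server or lambda to spin up.
    return {"reply": execute_flow(AGENTS, prompt=user_msg, auth=user_api_key)}
```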
Pardon my ignorance, but how is this different from MCP servers with a supervisor agent selecting and executing the right MCP tool?
Hey, this looks pretty interesting. I saw that you guys are a YC company; how do you intend to make money deploying a protocol?
We think the main opportunity is to charge API providers for white-glove onboarding onto this standard.
Our team was just exploring an approach to building an AI agent builder by making API calls via an LLM, so this is very helpful. I'll give it a try!
Interesting! Reach out if you want to chat about it :)
Can you explain what the LLM sees in your Gmail example instead of the chain?
And how is that translation layer created? Do you write it yourself for whatever you need? Or is the idea for API owners to provide this?
I’m sure the details are there if I dig deeper but I just read the readme and this post.
We work with API providers to write this file. It takes a non-negligible amount of thought to put together, since we're encoding which outcomes would be useful to enable/disable for an LLM. The standard is open, so anyone can write and read an agents.json, but it's mainly intended for API providers to write.
Is this agents.json file automatically generated, or is one supposed to invest thousands of lines into it?
The end developer doesn't even need to see or read the agents.json file. It's a means for transparency, and meant to be implemented by the API provider. Tooling to make creating an agents.json easier is on our roadmap. Internally, we have a process where a validator guides creating an agents.json.
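For a rough idea of how small that check can be (assuming the spec ships a JSON Schema; the file names here are made up):

```python
# Validate a candidate agents.json against the published JSON Schema and
# print every violation with its path, so authors can fix them one by one.
import json
from jsonschema import Draft202012Validator

schema = json.load(open("agents.json.schema.json"))  # placeholder path
candidate = json.load(open("acme.agents.json"))      # placeholder path

validator = Draft202012Validator(schema)
errors = sorted(validator.iter_errors(candidate), key=lambda e: list(e.path))
for err in errors:
    print(f"{'/'.join(map(str, err.path))}: {err.message}")
if not errors:
    print("valid agents.json")
```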
So, the API provider, like Stripe, is supposed to publish a second API?
And then the "end developer", who is going to be making a chatbot/agent, is supposed to use that to make a chatbot?
Why does the plan involve multiple third-party developers making n products per provider? If the plan is to have third parties be creative and combine, say, Stripe with Google Ads, then how is a second API for LLMs useful?
I'm not seeing the vision here. I've seen something similar in a project where a guy wanted LLM developers to use his API for better website browsing. If your plan involves:
1) bigger players than you implementing your protocol, and
2) everybody else doing the work,
then it's just obviously not going to work, and you need to rethink your place in the food chain.
We're grateful that bigger players like Resend, Alpaca, etc. do want to implement the protocol. The problem, honestly, is onboarding them fast enough. That's one of the main areas we're going to build out in the next few weeks. Until then, we're writing every agents.json ourselves.
If you check out wild-card.ai and create your own collection, you'll find that it's actually really easy to develop with. As a developer, you never have to look at an agents.json if you don't want to.
The Resend API has around 10 endpoints.
I like your approach, but it's not clear to me whether it's MCP-compatible.
Anthropic just announced an MCP registry.
MCP is great for stateful systems, where shared context is a benefit, but this is a rarity. Developers generally write clients that use APIs in a stateless way, and we want to help this majority of users.
That said, agents.json is not mutually exclusive with MCP. I can see a future where an MCP server for agents.json is created to access any API.
I think MCP being stateful is only true in the short term. Adding statelessness to the protocol is currently at the top of their roadmap: https://modelcontextprotocol.io/development/roadmap.
We've been keeping a close eye on this topic: https://github.com/modelcontextprotocol/specification/discus...
The options being considered to do this are:
1) maintain a session token mapping to the state -- which is still statefulness (see the toy sketch below)
2) create a separate stateless MCP protocol and reimplement -- agents.json is already the stateless protocol
3) reimplement every MCP as stateless and abandon the existing stateful MCP initiative
As you can tell, we're not bullish on any of these.
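To make the objection to option 1 concrete, a toy sketch:

```python
# A session token that maps to server-side state makes the wire format
# look stateless, but the mapping itself *is* the state: the server still
# can't be treated as a plain stateless API.
import uuid

SESSIONS: dict[str, list] = {}  # token -> accumulated session context

def open_session() -> str:
    token = uuid.uuid4().hex
    SESSIONS[token] = []
    return token

def handle(token: str, message: dict) -> dict:
    SESSIONS[token].append(message)  # state survives between calls
    return {"ok": True, "turns": len(SESSIONS[token])}
```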
Isn't the idea to create a data lake to better inform models? Why are you bearish on stateful protocols? Could you elaborate on your thinking?
Bearish on everyone needing to be on stateful protocols. Developers should have the option to manage their state internally to their application.
Can't you simply use a stateful protocol and not report any state? Doesn't statefulness subsume statelessness? I am beginning to wrap my head around this space, so excuse the naive questions.
No worries! In other cases, I believe you would be right. But splitting up context is not optional with MCP. Part of the whole state will always reside in an external entity.
I've been down this road - with OpenPlugin. It's all technically feasible - we did it successfully. The question is, so what? If the new models can zero-shot the API call and fix issues with long responses, boot parameters, lookup fields, etc, what's your business model?
The tail of the problem is quite long. Even if the average model is perfect at these things, do we want it to re-reason each time there's an impasse of outcomes? Often, the outcomes we want to achieve have well-traversed flows anyway, and we can just encode those.
In fact, I'm looking forward to the day that models are better at this so we can generate agents.json automatically and self-heal with RL.
On the business model: ¯\_(ツ)_/¯. We don't charge developers, anyway.