> Confetti does not compete with JSON or XML, it competes with INI.
It clearly competes with JSON.
I think I would still much rather use JSON5 over this. It's quite similar in terms of structure and terseness, but I don't have to learn anything.
Still, it seems fairly well designed and elegant. Way better than YAML or TOML for example. Typeless seems like a bad decision in some ways but I can see the advantages.
Top marks on the name!
The JSON-style "everything is one big tree" type of config file is really hard to split up into multiple files. The "Confetti" style "every thing you want to configure is its own unit" makes it natural to split up files by adding an include directive or rules like "all files in config.d will be read".
JSON with comments and trailing commas is all I want.
Call it j5on
This exists: jsonc – and it's somewhat widely used, such as for VS Code configuration.
That's JSONC. But JSON5 adds some more nice stuff without being overkill IMO.
https://json5.org/
Mandatory quotes, commas, and colons aren't that terse, and it seems too simple to have to learn much of anything
Nice, I found one typo/editing thing though which kind of makes it contradict itself:
The first paragraph says:
[...] It is minimalistic, untyped, and opinionated. [...]
but then under "Notable features" it begins with a big bold *Unopinionated*, so that was very confusing.
Good catch! It's "unopinionated" for the user and "opinionated" in its design decisions. I'll stick with "unopinionated" for consistency.
I don't trust a config file that doesn't enforce quotes around strings. it's a footgun especially when it collides with ill-defined boolean
I think you missed the "typeless" idea. That basically means every value is a string and so it's up for the application to parse and validate.
so you mean a config file that can have a different meaning depending on who's parsing? sounds nice and not dangerous at all
Isn't the meaning of a config file always dependent on the application you are configuring?
Nice looking project! The page in one place says it's opinionated and in another place says it's unopinionated. (I guess that means it's unopinionated :) ).
It's opinionated about its unopinionatedness.
Like flammable and inflammable.
Looks similar to my favorite format KDL: https://kdl.dev/
Good to see a push towards less syntactic overhead, which is still considerable in JSON.
JSON, jsonc, json5, hcl, kdl, scfg, caddyfile... and that's just from earlier comments. After a brief search, puzzled, I ask: Is there really no more thorough comparison than wikipedia's[1]? No syntax-across-languages[2]? No design space characterization?
[1] https://en.wikipedia.org/wiki/Comparison_of_data-serializati...
[2] https://rigaux.org/language-study/syntax-across-languages.ht...
Pkl to bind them all?
https://pkl-lang.org/index.html
In the spec <https://confetti.hgs3.me/specification/>:
> Confetti source text consists of zero or more Unicode scalar values. For compatibility with source code editing tools that add end-of-file markers, if the last character of the source text is a Control-Z character (U+001A), implementations may delete this character.
I’ve heard of this once, when researching ASCII control codes and related ancient history, but never once seen it in real life. If you’re insisting on valid Unicode, it sounds to me like you’re several decades past that happening.
And then given that you forbid control characters in the next section… make up your mind. You’re saying both that implementations MAY delete this character, and that source MUST NOT use it. This needs clarification. In the interests of robustness, you need to specify what parsers MUST/SHOULD/MAY do in case of content MUST violations, whether it be reject the entire document, ignore the line, replace with U+FFFD, &c. (I would also recommend recapitalising the RFC 2119 terms. Decapitalising them doesn’t help readability because they’re often slightly awkward linguistically without the reminder of the specific technical meaning; rather it reduces their meaning and impact.)
> For compatibility with Windows operating systems, implementations may treat the sequence Carriage Return (U+000D) followed by Line Feed (U+000A) as a single, indivisible new line character sequence.
This is inviting unnecessary incompatibility. I recommend that you either mandate CRLF merging, or mandate CR stripping, or disallow special CRLF handling. Otherwise you can cause different implementations to parse differently, which has a long history of causing security problems, things like HTTP request smuggling.
I acknowledge this is intended as the base for a family of formats, rather than a strict single spec, but I still think allowing such variation for no good reason is a bad idea. (I’m not all that eager about the annexes, either.)
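To make the divergence concrete, here is a minimal Python sketch. It is not taken from the spec or from any real Confetti parser: `count_lines` and its `merge_crlf` flag are hypothetical, and it assumes a lone CR also counts as a line terminator. The same bytes yield a different number of lines depending on which optional behaviour an implementation picked.

```python
def count_lines(source: str, merge_crlf: bool) -> int:
    """Count line terminators under two hypothetical parser policies."""
    if merge_crlf:
        # Policy A: CR followed by LF is one indivisible line terminator.
        source = source.replace("\r\n", "\n")
    # Policy B (no merging): every CR and every LF terminates a line.
    return sum(1 for ch in source if ch in "\r\n")


doc = "a 1\r\nb 2\r\n"
merged = count_lines(doc, merge_crlf=True)      # two terminators
unmerged = count_lines(doc, merge_crlf=False)   # four terminators
```

Anything that depends on line boundaries (error positions, line continuations) could then disagree between two conforming parsers.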
A plain ASCII document is a valid UTF-8 document, but I agree that special support for ^Z is pointless for a file format invented 20+ years after the demise of MS-DOS. Handling ^Z would probably be MS-DOS’ job anyway.
Loving this! Like other commenters here the syntax reminds me of KDL, except a lot simpler. I checked it out and was fully nerdsniped, so wrote an implementation <https://github.com/Heliodex/confetti-go> that passes all conformance tests, giving me a good feel for the language. Pretty easy to get working as well, though I haven't tried adding any of the appendices yet.
Whoa. This is really cool. I've thought a lot about markup / configuration languages. Aside from types (won't get into typed/typeless here) there are basically just a few possible structures: lists, maps, tables (lists of maps with same keys), and trees (xml-like with nested nodes of particular types) are the ones I think about.
Most existing formats are really bad for at least one of these. Tables in JSON have tons of repetition. XML doesn't have a clear and obvious way to do maps. Almost anything other than XML is awkward at best for node trees.
Confetti seems to cover maps, trees, and non-nested lists really well, which isn't a combination any other format I'm aware of covers as well.
Nested lists and tables seem like they would be more awkward, though from what I can tell "-" is a legal argument, so you could do:
  nestedlist {
    - { - 1 ; - 2 }
    - {
      - { - a ; - b }
      - { - c ; - d }
    }
  }
To get something like [[1, 2], [[a, b], [c, d]]]. Of course you could also name the items (item { item 1 ; item 2 }), but either way this is certainly more awkward than a nested list in JSON or YAML.
I think a table could be done like JSON/HTML with repeated keys, but maybe also like:
  table name age favorite-color {
    row Bob 87 red
    row "Someone else" 106 "bright neon green"
  }
This is actually pretty nice.
In any event, I love seeing more exploration of configuration languages, so thanks for sharing this!
My number 1 request is a parser on the documentation page that shows parse tree and converts to JSON or other formats so you can play with it.
I like it! The spec could be more accessibly written, but it's somewhat understandable in casual reading. Perhaps it would benefit from a diagram like JSON's famous one.
One thing I didn't understand is this example on the homepage:
> password "${ENV:ANONPASS}"
The spec doesn't seem to mention any ${}. Is this for the program to manage rather than the parser of the config going out to fetch an env var? If so, I find this a bit out of scope to show; at least, it confused me about whether that's built-in/supported syntax or if it's just a literal with syntax intended for a different program.
Depending on how set in stone this is, another complaint I might have is that you still have the trailing comma issue from JSON, except it's not a comma but a backslash (reverse solidus, as the spec calls it—my mobile keyboard didn't even know that word). Maybe starting a list of arguments with [ could allow one to use any number of lines for the values, until a ] is encountered?
Yes, the "${}" would be for the program to evaluate; referencing environment variables that way isn't uncommon in Unix configuration files.
"Reverse Solidus" is the Unicode name for the character [1], so if you don't like the name, blame Unicode :)
I hadn't thought of using '[' and ']' for multi-line directives, that's an interesting suggestion. It vaguely resembles arrays as they appear in various other languages. It fits with Confetti's design of, ultimately, being user interpreted.
[1] https://www.compart.com/en/unicode/U+005C
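For anyone wondering what "for the program to evaluate" might look like in practice, here is a rough Python sketch. Nothing in it comes from Confetti itself: the `${ENV:NAME}` pattern is only modelled on the homepage example, and `expand_env` is a hypothetical helper an application could run on each argument after parsing.

```python
import os
import re

# Assumed placeholder syntax, modelled on the homepage example only.
_ENV_REF = re.compile(r"\$\{ENV:([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(argument: str, env=os.environ) -> str:
    """Replace ${ENV:NAME} with env[NAME]; unknown names are left untouched."""
    def substitute(match):
        name = match.group(1)
        return env.get(name, match.group(0))
    return _ENV_REF.sub(substitute, argument)


# The parser hands the application a plain string; the application expands it.
password = expand_env("${ENV:ANONPASS}", {"ANONPASS": "hunter2"})
```

Leaving unknown names untouched is just one possible policy; an application could equally treat them as an error.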
It reminds me of the Caddyfile.
https://caddyserver.com/docs/caddyfile
Nice! I like it. I've always liked INI for the exact advantage you cite - typelessness.
Blah blah blah it doesn't have a spec. Lack of a spec doesn't matter from the user's POV in this problem domain, as all configuration files are categorically application-specific anyway. It doesn't matter to the developer either, insofar as whatever implementation you use fits your needs. This isn't object notation, it's not data interchange, it's configuration.
This also reminds me of HCL
https://github.com/hashicorp/hcl?tab=readme-ov-file#informat...
Why is typeless considered something good?
You're going to read the configuration in a target programming language
So if the config format has its own type system you then have to convert between config types and language types
If the config type doesn't map exactly onto the target lang type you either ignore it and accept some values won't round-trip cleanly or without error, or you fall back to using strings (e.g. various possible integer type sizes, signed/unsigned etc, or decimal values via JSON)
Not saying it's always the right choice, but I can see why having lowest common denominator stringly-typed values as the config format can be seen as a feature allowing you to define the type system that the config will be parsed under to suit each particular application
I see your argument, though maybe I'm just not getting the real use case.
Because when you fall back to strings and therefore ignore typing, wouldn't that be the same as being typeless? Doesn't every format support strings, and therefore support typeless use?
Also, in how many cases do you need to parse the same configuration in multiple different languages?
I'm not saying it's not useful - I'm just trying to understand the use case behind these arguments.
I think parsing the config in multiple target langs is definitely a counter-example where having a type system in the config lang can be useful as a lowest common denominator that you then conform all the parsers to
Here's an actual case I ran into with JSON and its bizarre number treatment:
Neo4j uses 128-bit IDs. The JSON API retrieves these IDs as strings of digits, and the Python library reading this JSON decided to interpret them as double-precision floats. And sometimes it works, other times: not so much...
The whole selling point of configuration formats is to allow multiple languages to access the same data. So, cases when multiple languages have to read the same configuration format are exceedingly common.
The fact that the format supports other types means that someone (probably not you) will use those types, and then (probably you) will have to deal with the mess created by that person. I don't trust programmers in general to write good code... if possible to prevent them from writing bad code, then why not?
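The precision loss described above is easy to reproduce with Python's standard `json` module. This is only an illustrative sketch: it does not claim to show what the library in question actually did, and `parse_int=float` is used purely to simulate a reader that maps every JSON number to a double.

```python
import json

# An ID wider than the 53 bits of integer precision a double can hold.
big_id = 2**64 + 1
text = json.dumps({"id": big_id})

# Python's default behaviour keeps arbitrary-precision integers exact.
as_int = json.loads(text)["id"]

# Simulate a reader that turns every number into a double.
as_float = json.loads(text, parse_int=float)["id"]

lossless = as_int == big_id        # True: the default round-trips
lossy = int(as_float) != big_id    # True: the double silently changed the ID
```

With a typeless format the application receives the digits as text and picks the integer width itself, which is the trade-off this subthread is arguing about.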
Well, I work a lot with JSON in my job and in my private coding life, from all sorts of different languages, and so far I could always sort things out.
And well - if anyone trusts external coders he should be damned (or isn't he already for doing so? Never trust external data - the golden rule...)
Your case is interesting. I worked with Neo4j years ago in a PHP project and never ran into such issues, but maybe I was just lucky.
Nowadays I code mostly in golang, and I always make sure that whatever an external party sends me is what I'm expecting (validation ...).
To your point about preventing somebody from writing bad code - I've given up on that. Whenever I thought the environment would force someone to write proper code, people proved me wrong by finding new ways to do the most absurd things.
But yeah, it's worth a try.
So - why do I question such a thing? Because I'm not a fan of adding more and more 3rd-party dependencies to my projects. And while Confetti might be a good thing (I never said it can't be), it won't get into any default packaging in the foreseeable future, meaning I have to make sure that the dependency stays stable, which adds another task and liability on my end. So instead of dealing with the devil I know (validating JSON data), I have to deal with a new one to eliminate the old one.
Time will tell if Confetti makes its way into a stable, reliable state for common languages - then I might give it a try.
I guess it's because you can allow for custom data formats. You'll have to validate/parse the file anyways, and maybe having UTC timestamps is worse than local date-time notation. Especially if the user is supposed to edit the file by hand.
I know for sure I'd like "timeout: 1h6m10s" more than "timeout: 3970". So unless you want to support really specific datatypes just being typeless is better. Putting everything in double quotes to get a string, while spec-wise would be typed, is not enough when the backing data type is not going to be a string. So you might as well throw it away and let the program handle all type conversion.
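As a sketch of "let the program handle all type conversion", this is roughly what the application side could look like for the timeout example. The `parse_duration` helper and the h/m/s unit set are assumptions made for illustration, not something Confetti defines.

```python
import re

# Hypothetical unit table; a real application would choose its own.
_UNITS = {"h": 3600, "m": 60, "s": 1}
_PART = re.compile(r"(\d+)([hms])")


def parse_duration(text: str) -> int:
    """Convert a string such as '1h6m10s' into a number of seconds."""
    if not re.fullmatch(r"(?:\d+[hms])+", text):
        raise ValueError(f"not a duration: {text!r}")
    return sum(int(amount) * _UNITS[unit] for amount, unit in _PART.findall(text))


timeout = parse_duration("1h6m10s")  # 3600 + 360 + 10 = 3970 seconds
```

The config file stays readable for whoever edits it by hand, and the validation error comes from the one place that knows what a timeout is supposed to look like.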
So I get your point about date/time - though I may be an oldie who still prefers plain integer seconds - but that's subjective.
Though I can't fully agree with the string-quoting argument. Sure, in JSON for example I would have to quote the values if I want them "typeless as strings" - but JSON is supported nearly everywhere, and I'm able to interpret the parsed string values in whatever way I want.
Adding a new dependency (Confetti parsing) just to spare the quotes doesn't seem worth the convenience to me.
Though both are probably very subjective.
I like the look of it, very clean
Though I'm not sure why using keywords like `true`, `false` or `null` is seen as a negative. Especially the numeral digits, it's the system that most of the world uses...
"Most of the world" is less aligned with the transcultural goals of the project it seems :)
That said, comments starting with a hash sign (#), curly brackets for grouping, and double quotes as string delimiters are not that neutral on this front.
Congrats on shipping this. It's similar to something that was in the back of my mind for a while. I'll give it a try!
Nice that Unicode is supported, and the localization is a nice twist
Are there any examples of what's possible with extensions?
The "expression arguments extension" is intended to allow for 'richer' expressions wherever an argument is expected. In the following example, it shows how you might use it if you need basic control flow in your configuration:
  if (x > y) {
    print "x is greater than y"
  }
It's freeform so your application would need to interpret what's between the '(' and ')'.
The "punctuator arguments extension" is the one I'm most anxious about since it might be too flexible, but I'd like to hear feedback. It lets you define your own domain-specific punctuators for your configuration so, for example, you might decide that "=" is a punctuator which means the following is equivalent:
  x = 123
  x=123
Under standard interpretation, if you wanted "=" to be distinct from "x" and "123", then white space would be required.
The "comment syntax extension" is just C style comments.
My goal was to keep the language basic while encouraging custom flavors. If an extension becomes ubiquitous, then - depending on what it is - it might merge with the standard or be added to the annex.
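A toy illustration of the punctuator idea in Python, for readers who want to poke at it. This is not how any Confetti implementation works: `tokenize` is a hypothetical helper that ignores quoting, escapes, and everything else in the grammar, and only shows why "x=123" and "x = 123" come out the same once "=" is declared a punctuator.

```python
import re


def tokenize(line: str, punctuators=()):
    """Split a directive into arguments, treating each punctuator as its own argument."""
    if not punctuators:
        # Standard interpretation in this toy model: white space separates arguments.
        return line.split()
    # Try longer punctuators first so that e.g. '==' wins over '='.
    ordered = sorted(punctuators, key=len, reverse=True)
    pattern = re.compile("(" + "|".join(re.escape(p) for p in ordered) + ")")
    tokens = []
    for chunk in line.split():
        # The capture group makes split() keep the punctuators it finds.
        tokens.extend(part for part in pattern.split(chunk) if part)
    return tokens


plain = tokenize("x=123")              # one argument: ['x=123']
punct = tokenize("x=123", ["="])       # three arguments
spaced = tokenize("x = 123", ["="])    # the same three arguments
```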
Understand being anxious about flexibility, but that's also potentially one of the coolest differentiators! Would be interesting to see what people come up with...
To author:
In the "Material Definitions" example there are no { }. Why not? What's the difference? Is indentation significant?
Indentation is not significant. The example was supposed to demonstrate how you might use individual directives for pseudo-grouping. The example was inspired by premake [1] which takes this approach, but in Lua.
[1] https://premake.github.io/docs/Your-First-Script
So weird, I was toying with a DSL 1-2 years ago and strongly considered turning it into a configuration language because the ergonomics were much nicer than JSON or YAML, and reminded me of HCL in a way. It looked very similar to this.
I abandoned the effort, but nice to know that someone else had a similar idea. Will be trying this out!
Great work!
Suggestion, might be good to include Lua in the comparison table - since it’s also used for config as well.
So much continual effort wasted when for over 20 years we've had XML.
XML still works well as a configuration format.
Is it verbose? Very much so, but it ticks all the boxes:
- No ambiguity
- Typed
- Quick to parse
- Has Schemas that allow validation
- Widespread tooling support
All we needed was for applications to publish their XML schema files and any XML tool could allow for friendly editing.
> Quick to parse
eh ...
Okay, maybe it's quick. But it's also surprisingly hard to do "right". Just look at libexpat. Sure, many issues could be prevented with another programming language. But there are still regular updates because parsing custom entities is a minefield.
That said, I also like XML for all the other reasons you mentioned. Just don't do it like Maven.
I'm still a fan of HJSON, and JSON5 looks quite nice, but this does as well. That's all I can really say. There are so many choices, but looks like you did really well on this one.
Looks a lot like scfg.
https://git.sr.ht/~emersion/scfg
I really like this kind of config file. People are saying it's useless because you should just use JSON, but I think that misses the fundamental point of this style of config: you configure "things" not as part of a huge tree structure, but as their own free-standing structures. Users don't go into an array of users as a 3rd level of indentation, users are their own top-level thing.
This allows really cool things, like modular configs where one "main" config file can include multiple specific-purpose configs. One file can contain the "default users" while another can contain additional users, for example. Or each user can get its own file.
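A small Python sketch of the "all files in config.d will be read" rule, to show how little machinery it needs when every directive is its own unit. The directory name, the `*.conf` suffix, and the `collect_config_sources` helper are all assumptions for illustration; Confetti itself does not prescribe any of this.

```python
from pathlib import Path


def collect_config_sources(main_file: str, include_dir: str = "config.d") -> list[Path]:
    """Return the main file followed by every *.conf file in the drop-in directory."""
    main = Path(main_file)
    sources = [main]
    drop_in = main.parent / include_dir
    if drop_in.is_dir():
        # Sorting by name gives the usual "10-users.conf before 20-extra.conf" ordering.
        sources.extend(sorted(drop_in.glob("*.conf")))
    return sources
```

The application would then parse each file in turn and simply concatenate the resulting directives, which is much harder to do when everything has to merge into one big tree.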
Looks nice. Less syntactic noise than many other efforts, a good thing IMHO.
Not at all in the direction where I'd want a configuration language to go... The marginal "improvements" wrt punctuation are just inconsequential.
I'd take Prolog without I/O and (some? all?) extra-logical predicates as configuration language. Maybe if there's a way to require recursion to terminate, that'd be great, but not essential.
Can you please add comparison of your language with nix lang?
I like how the spec defines character classes by just passing the buck to Unicode
=====
Forbidden Characters
Forbidden characters are Unicode scalar values with general category Control, Surrogate, and Unassigned. Forbidden characters must not appear in the source text.
White Space
White space characters are those Unicode characters with the Whitespace property, including line terminators.
The "general category" [1] and "whitespace" [2] properties are real character properties defined by Unicode. Referring to them is, ideally, how a language that supports Unicode should do things.
[1] https://www.unicode.org/reports/tr44/#GC_Values_Table
[2] https://www.unicode.org/reports/tr44/#White_Space
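In a language that exposes the Unicode database, the forbidden-character rule quoted above is only a few lines. Here is a hedged Python sketch: `forbidden_positions` is a hypothetical helper, and it assumes white space wins over the Control category (otherwise tab and line feed would be rejected), which the quoted excerpt does not spell out. Which code points count as Unassigned also depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# General categories named in the quoted text: Control, Surrogate, Unassigned.
FORBIDDEN_CATEGORIES = {"Cc", "Cs", "Cn"}

# Control characters that carry the Unicode White_Space property
# (TAB, LF, VT, FF, CR, NEL). Assumed to be exempt; see the note above.
WHITE_SPACE_CONTROLS = {"\t", "\n", "\x0b", "\x0c", "\r", "\x85"}


def forbidden_positions(source: str) -> list[int]:
    """Return the indexes of characters whose general category is forbidden."""
    return [
        index
        for index, ch in enumerate(source)
        if ch not in WHITE_SPACE_CONTROLS
        and unicodedata.category(ch) in FORBIDDEN_CATEGORIES
    ]


# Control-Z (U+001A) has category Cc, so it is flagged like any other control.
hits = forbidden_positions("ok\x1a")
```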
I don’t intend this to be mean, but is this satire? Confetti seems to proudly use concepts which are very much NOT popular right now.
For example, you’ve reintroduced the Norway Problem. https://news.ycombinator.com/item?id=36745212
And I personally hope to never edit another file which lacks a strict schema like this does.
Did you read TFA? It does not reintroduce the Norway problem, as every value is a string. Also, popular right now does not necessarily mean "good".
> every value is a string
Somehow this design flaw escaped my notice.
Obligatory XKCD: https://xkcd.com/927/