412
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
this post was submitted on 27 Feb 2024
412 points (97.7% liked)
Open Source
31901 readers
391 users here now
All about open source! Feel free to ask questions, and share news, and interesting stuff!
Useful Links
- Open Source Initiative
- Free Software Foundation
- Electronic Frontier Foundation
- Software Freedom Conservancy
- It's FOSS
- Android FOSS Apps Megathread
Rules
- Posts must be relevant to the open source ideology
- No NSFW content
- No hate speech, bigotry, etc
Related Communities
- !libre_culture@lemmy.ml
- !libre_software@lemmy.ml
- !libre_hardware@lemmy.ml
- !linux@lemmy.ml
- !technology@lemmy.ml
Community icon from opensource.org, but we are not affiliated with them.
founded 5 years ago
MODERATORS
Have you read the spec? It’s a total mess
I have read the 1.2 spec (I'm trying to make a round trip parser for JS, and I do maintainance on a fork of the rumel yaml python package). I actually think its very well thought out, with things I hadn't considered like future extensibility, streaming applications, and data-corruption detection.
The diagrams, color coding, and less-formailty of the spec was much appreciated. Especially compared to something like the ECMA Script spec, which reads like a math textbook had a child with a legal document.
I'm not saying YAML is perfect; round trip (the thing I'm working on) is nearly impossible because it wasn't a design goal. It has a few too many features (I've never seen a declaration in the wild), but it does a good job at accomplishing the creators goals, and the additional features basically only slow down parser-implementers like me. I often pick it because of the tag support, which I've struggled to find an equivalent for in other serialization languages. I use anchors in recursive data structures, and complex keys for serializing complex data structures (not human readable). The "document end" marker has been nice when I'm worried about detecting partial-writes. And the merge key is nice for config files.
The application/perspective matters. Yaml might be bad for you but its not bad for everyone.
Even if anchors are pretty novel… I’ve watched myself & others fail for things that seem like they should be simple like scalars, quoting, & indentation rules all for being confusing (while failing to understand how/why the tab character isn’t supported).
That sounds like a skill issue. Something isn’t bad because you don’t understand it. Suggesting quoting is an issue for yaml is beyond the pale; it happens to be an issue everywhere.
Despite my love of yaml. I actually think he has a small point with unquoted strings. I teach students and see their struggles. Bash also does unquoted strings and basically all students go years and years without realizing
To know the difference between special and normal-but-no-quotes you have to know literally every special symbol. And, for example, its rare to realize the
--
in --help, isn't special at a language level, its only special at a convention level.Same thing can happen in yaml files, but actually a little worse I'd say. In bash all the "special" things are at least symbols. But in yaml there are more special cases. Imagine editing this kind of a list:
Three of those are not strings. Syntax highlighting can help (which is why I don't think its a real issue). But still "why are three not strings? Well ... just because". AKA there isn't a syntax pattern, there's just a hardcoded list of names that need to be memorized. What is actually challeging is, unless students start with a proper yaml tutorial, or see examples of quotes in the config, its not obvious that quotes will solve the problem (students think
"true"
behaves like"\"true\""
). So even when they seetrue
is highlighted funny, they don't really know what to do about it. I've seem some try stuff like \true.Still doesn't mean yaml is bad, every language has edge cases.
While the subjective assessment that quote handling in yaml is worse than bash is understandable, it is really just two of many many cases where quotes complicate things. And for a pretty good reason. They are used to isolate strings in many languages, even prose. They, therefore, always get special handling in lexical analysis. Understanding which languages use single quotes, double quotes, backticks, heredocs, etc and when to use them is really just part of the game or the struggle I guess.
Most languages require you to put quotes around strings as the norm… breaking that is part of what causes all of the confusion in the first place. Better design upfront would lead to less common errors. I have way more quoting issues in YAML than I do JSON, Nix, Nickel, Dhall, etc. because they aren’t trying to be cute with strings.
When you're editing yaml, why not just always write JSON?
Almost all nix attr keys are unquoted strings. Maybe I'm missing the point list, but I kinda wouldn't expect it to be on the list.
Its easy for me to say "just start writing JSON in the yaml. It doesn't get more simple than JSON", but actually I do think there's a small point with the unquoted strings.
Back before I knew programming, I was trying to change grammar settings sublime 2, which uses yaml. I had no idea what yaml was. The default setting values used unquoted strings fot regex. I knew PCRE regex and escapes, but suddenly they didnt work, and when I tried to match a single quote inside of regex that also didn't work. I didn't know I was editing yaml file (it had a
.tmLanguage
extension). Even worse, if I remeber correctly, unparsable settings just silently fail. Not only did I have no errors to google, I didn't have any reason to believe the escapes were the cause of the problem (they worked in the command line). Sometimes I edited the regex and it was fine, and other times it just seemed to break. I didn't learn about quoting in YAML until years later.For me that was an unfortuate combination, which was exacerbated by yaml unquoted weirdness. But when you're talking about "did you read the spec" that's a whole other story.
.nan
for nan, tabs vs spaces, unquted string weirdness, etc should just be one error message+google away. I think they're a small hiccups with what is overall a great format.