73
How far should a programming language aware diff go?
(semanticdiff.com)
Welcome to the main community in programming.dev! Feel free to post anything relating to programming here!
Cross posting is strongly encouraged in the instance. If you feel your post or another person's post makes sense in another community cross post into it.
Hope you enjoy the instance!
Rules
Follow the wormhole through a path of communities !webdev@programming.dev
Using a tool like this to hide sections of code presented for review places a lot of trust in the automation. If Mallory were to discover a blind spot in the semantic diff logic, she could slip in a small change for eventual use in an exploit, and it would never be seen by another human.
For example, consider this part of the exploit used in the recent xz backdoor. In case you don't see the problem, here's the fix.
Rather than hiding code from review, if a tool figured out a way to use semantic understanding to highlight code that might be overlooked by a human (and should therefore be reviewed more carefully), it could conceivably help find such things.
I don't have an opinion on the topic but I see a blind spot in your argument, so I have to be that kind of person ... 🥺
One could use the exact same example to argue that humans are very bad at parsing code (especially if whitespace kicks in). In that regard a tool that allows them to reason on a standardized representation of the AST can be a protection against a whole class of attacks.
That's not a blind spot in my comment. See my final paragraph.
It's only one sentence. Maybe it was easy to miss. :)
Oh yeah, so I'm that other kind of guy 🥺
I kinda like your idea, but I think it can be difficult to detect some confusing situations. I think it would be a better idea, but I don't think it's a full replacement.
I like the idea, but I can't come up with any method that won't devolve into most reviewers only checking the highlighted parts tbh.
This is a very big stretch IMO. That xz change wasn't actually the exploit, it was just used to make the exploit less detectable. And it was added by people with commit access so it didn't even have to go through code review.
On top of that, code review is not magic. It's easy to get bugs past it hiding in plain sight (if that wasn't the case Linux would be bug free!).
Can you think of an actually realistic example?