cross-posted from: https://discuss.tchncs.de/post/3979328
Engineers in Princeton managed to train GPT4 and extend AutoSVA to generate SVA (systemverilog assertions) from buggy RTL and functionality description. SVA is widely used to verify digital design for ASIC and FPGAs. AutoSVA2, which extends open-source AutoSVA, improves the flow to generate SVA from English description. LLM was trained in multiple iterations to generate SVA with correct syntax, which is something GPT fails to do by itself. Authors argue that GPT's "creativity" allows it to write correct assertion even from a buggy RTL. Later authors used this tool to write RTL from scratch as well. RTL written by GPT was tested against the SVA generated by this tool, and SVA corrected by an engineer was fed back to LLM, which generated functionally correct FIFO queue in a few iterations.
Abstract—Formal property verification (FPV) has existed for decades and has been shown to be effective at finding intricate RTL bugs. However, formal properties, such as those written as SystemVerilog Assertions (SVA), are time-consuming and error- prone to write, even for experienced users. Prior work has attempted to lighten this burden by raising the abstraction level so that SVA is generated from high-level specifications. However, this does not eliminate the manual effort of reasoning and writing about the detailed hardware behavior. Motivated by the increased need for FPV in the era of heterogeneous hardware and the advances in large language models (LLMs), we set out to explore whether LLMs can capture RTL behavior and generate correct SVA properties. First, we design an FPV-based evaluation framework that measures the correctness and completeness of SVA. Then, we evaluate GPT4 iteratively to craft the set of syntax and semantic rules needed to prompt it toward creating better SVA. We extend the open-source AutoSVA framework by integrating our improved GPT4-based flow to generate safety properties, in addition to facilitating their existing flow for liveness properties. Lastly, our use cases evaluate (1) the FPV coverage of GPT4-generated SVA on complex open-source RTL and (2) using generated SVA to prompt GPT4 to create RTL from scratch. Through these experiments, we find that GPT4 can generate correct SVA even for flawed RTL—without mirroring design errors. Particularly, it generated SVA that exposed a bug in the RISC-V CVA6 core that eluded the prior work’s evaluation.
Wow !
A machine building a better machine and beating humans at this.
Edit :
is "RTL" as descibed in the answer to my comment, by @AbouBenAdhem@lemmy.world or is it as I guessed ? :
In this article RTL refers to register transfer level. It is a way of describing hardware on very low level, it uses registers for memory (which usually translates to flip-flops when/if synthesized), wires, basic arithmetic and logic operations, but terminology may slightly change based on which rtl language is being used. It can be used to design a CPU, or any ASIC (application specific integrated circuit) chip. Instructions may resemble to processor instructions, but the end result is fundamentally different. You may run a set of instructions on a processor, while what rtl describes is often synthesized and becomes the hardware itself which performs the operations (e.g. arithmetic logic unit in the cpu).
In the paper, RTL stands for Register Transfer Level, in the domain of digital circuit design -> https://en.wikipedia.org/wiki/Register-transfer_level
Thanks for not putting the paper behind a paywall!
When I read the abstract, I assumed RTL stood for Register Transfer Language:
It almost makes sense using either term, though the references to ASIC and RISC (and the cross-post to Chip Design) point to your reading being correct.
(It doesn’t help that they identified almost every acronym except the key one in their title.)
Sorry about that, I should have spelled out RTL as Register Transfer Level in the paper. But yeah given the references to Verilog and hardware design it can be deducted...
Oups, I should have read the article or I should have asked instead of trying to guess ! I do not work in this domain.