this post was submitted on 26 Aug 2025
4 points (100.0% liked)

blueteamsec

467 readers
15 users here now

For [Blue|Purple] Teams in Cyber Defence - covering discovery, detection, response, threat intelligence, malware, offensive tradecraft and tooling, deception, reverse engineering etc.

founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
[–] nymnympseudonym@piefed.social 1 points 1 day ago* (last edited 1 day ago)

Perplexity prompt: "quotes from openai about safety testing for gpt-oss"

Reply below:


Safety Testing Measures

  • OpenAI’s safety process for open-weight release included technical audits (robustness, bias detection), structured adversarial testing, and evaluation of prompt-injection and jailbreak vulnerabilities.
  • The organization also employed both automated and human review to test for outputs that could cause harm, leveraging cross-disciplinary teams including outside researchers, ethicists, and cybersecurity experts.
  • Feedback from these rounds led to incremental model improvements before the open weights went public.

Transparency and External Collaboration

  • OpenAI has collaborated with third-party security and ethics researchers to validate its safety protocols and stress-test new models prior to release.
  • The company acknowledges that “Releasing open weights is a significant responsibility due to risks of misuse. We want to be transparent about our process and invite the AI community to help report and address issues that may arise post-release.”