| 1. | | j1-micro and j1-nano: Tiny (0.6B, 1.7B) and Mighty Reward Models (github.com/haizelabs) |
| 3 points by leonardtang 8 months ago |
|
| 2. | | Verdict: A Library for Scaling Judge-Time Compute (twitter.com/leonardtang_) |
| 3 points by leonardtang 11 months ago |
|
| 3. | | Awesome-LLM-Judges (github.com/haizelabs) |
| 2 points by leonardtang 11 months ago |
|
| 4. | | LLM Judges (github.com/haizelabs) |
| 2 points by leonardtang 12 months ago |
|
| 5. | | Cascade: A fast, automated, multi-turn LLM jailbreaking method (twitter.com/haizelabs) |
| 2 points by leonardtang on Nov 4, 2024 |
|
| 6. | | RBAC RAG (github.com/haizelabs) |
| 1 point by leonardtang on Sept 6, 2024 |
|
| 7. | | RBAC RAG with MongoDB (github.com/haizelabs) |
| 2 points by leonardtang on Sept 5, 2024 |
|
| 8. | | Simple and Safe RAG with RBAC (github.com/haizelabs) |
| 2 points by leonardtang on Sept 3, 2024 |
|
| 9. | | Inducing LLM Hallucinations (github.com/haizelabs) |
| 2 points by leonardtang on Aug 1, 2024 |
|
| 10. | | Sphynx: Fuzz Testing Hallucination Detection Models (github.com/haizelabs) |
| 2 points by leonardtang on July 31, 2024 |
|
| 11. | | It's a bad day to be a language model (github.com/haizelabs) |
| 2 points by leonardtang on June 12, 2024 | 1 comment |
|
| 12. | | Thorn in a HaizeStack test for evaluating long-context adversarial robustness (github.com/haizelabs) |
| 19 points by leonardtang on May 6, 2024 | 11 comments |
|
| 13. | | Thorn in a HaizeStack Long-Context Jailbreak Test (github.com/haizelabs) |
| 5 points by leonardtang on May 5, 2024 |
|
| 14. | | A Convenient Ensembled Perplexity API (github.com/haizelabs) |
| 1 point by leonardtang on May 2, 2024 |
|
| 15. | | A Trivial Llama 3 Jailbreak (github.com/haizelabs) |
| 70 points by leonardtang on April 20, 2024 | 47 comments |
|
| 16. | | Making a SOTA Adversarial Attack on LLMs 38x Faster (haizelabs.com) |
| 2 points by leonardtang on March 28, 2024 |
|
| 17. | | LLM Red-Teaming Resistance Leaderboard (huggingface.co) |
| 2 points by leonardtang on March 1, 2024 |
|
| 18. | | OpenAI Content Moderation Is Really, Really Bad (haizelabs.com) |
| 2 points by leonardtang on Jan 12, 2024 | 1 comment |
|
| 19. | | Degraded Polygons Raise Fundamental Questions of Neural Network Perception (arxiv.org) |
| 1 point by leonardtang on Oct 6, 2023 |
|
| 20. | | Edwin Armstrong: Pioneer of the Airwaves (columbia.edu) |
| 1 point by leonardtang on June 22, 2023 |
|
| 21. | | Learning the Wrong Lessons: Inserting Trojans During Knowledge Distillation (arxiv.org) |
| 1 point by leonardtang on May 14, 2023 |
|
| 22. | | The Naughtyformer: A Transformer Understands Offensive Humor (arxiv.org) |
| 7 points by leonardtang on Dec 1, 2022 |
|
| 23. | | Show HN: SharinGAN - Generating Naruto Sharingans with GANs (sharingans.com) |
| 1 point by leonardtang on Aug 11, 2022 |
|
| 24. | | SharinGAN: Generating Naruto Sharingans with GANs (sharingans.com) |
| 3 points by leonardtang on Aug 7, 2022 |
|