| 1. | | Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations (arxiv.org) |
| 2 points by mnk47 on Nov 22, 2024 | past |
|
| 2. | | Leveraging Large Language Models for Advanced Multilingual Text-to-Speech (arxiv.org) |
| 1 point by mnk47 on Nov 5, 2024 | past | 1 comment |
|
| 3. | | Hertz-dev, the first open-source base model for conversational audio (si.inc) |
| 296 points by mnk47 on Nov 3, 2024 | past | 56 comments |
|
| 4. | | Thinking LLMs: General Instruction Following with Thought Generation (arxiv.org) |
| 2 points by mnk47 on Oct 15, 2024 | past | 1 comment |
|
| 5. | | What's the Magic Word? A Control Theory of LLM Prompting (arxiv.org) |
| 1 point by mnk47 on Oct 14, 2024 | past |
|
| 6. | | Swarm, a new agent framework by OpenAI (github.com/openai) |
| 258 points by mnk47 on Oct 12, 2024 | past | 106 comments |
|
| 7. | | Ask HN: Why is .NET never talked about as an option for solo/small team dev? |
| 53 points by mnk47 on Sept 22, 2024 | past | 73 comments |
|
| 8. | | Recursive Introspection: Teaching Language Model Agents How to Self-Improve (arxiv.org) |
| 5 points by mnk47 on July 27, 2024 | past |
|
| 9. | | The rise–and fall–of the software developer (adpri.org) |
| 3 points by mnk47 on June 17, 2024 | past | 1 comment |
|
| 10. | | Building with OpenAI: What's Ahead [video] (vimeo.com) |
| 1 point by mnk47 on May 23, 2024 | past |
|
| 11. | | Chain of Thoughtlessness: An Analysis of CoT in Planning (arxiv.org) |
| 2 points by mnk47 on May 9, 2024 | past |
|
| 12. | | GPT-4 users, how are you using it? Is ChatGPT Plus still worth it? |
| 4 points by mnk47 on April 19, 2024 | past | 5 comments |
|
| 13. | | Can Large Language Models Reason and Plan? (arxiv.org) |
| 4 points by mnk47 on April 15, 2024 | past | 1 comment |
|
| 14. | | Wu's Method Can Boost AlphaGeometry to Outperform Gold Medalists at IMO Geometry (arxiv.org) |
| 7 points by mnk47 on April 13, 2024 | past | 1 comment |
|
| 15. | | Ask HN: Usefulness of formal verification (Coq) and formal specification (TLA+)? |
| 3 points by mnk47 on April 8, 2024 | past | 1 comment |
|
| 16. | | After 2 Weeks of Testing, What Do Developers Think About Claude 3? (favtutor.com) |
| 3 points by mnk47 on March 24, 2024 | past |
|
| 17. | | Ask HN: Anyone here using Gemini Pro 1.5? How does it compare to Claude Opus? |
| 1 point by mnk47 on March 20, 2024 | past | 1 comment |
|