Feb 27 2025 13 mins
The 18 papers in this episode:
[00:23] 🌐 Kanana: Compute-efficient Bilingual Language Models
[00:54] 👤 GHOST 2.0: generative high-fidelity one shot transfer of heads
[01:43] 🎥 TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding
[02:21] 🤖 Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
[03:02] 🤖 Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
[03:47] 🌍 Language Models' Factuality Depends on the Language of Inquiry
[04:27] 🧠 Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation
[05:11] 🤖 Towards an AI co-scientist
[05:52] 🇬🇷 Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance
[06:38] 🤖 VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model
[07:12] 📏 Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator
[07:52] 📚 Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs
[08:35] 🛡 AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement
[09:23] 🧠 BIG-Bench Extra Hard
[10:07] 🔍 CritiQ: Mining Data Quality Criteria from Human Preferences
[10:44] 🔬 MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra
[11:28] 📄 PosterSum: A Multimodal Benchmark for Scientific Poster Summarization
[12:08] 🧠 DOEI: Dual Optimization of Embedding Information for Attention-Enhanced Class Activation Maps

[Follow us]
You can also find us on the following platform for more content beyond the podcast:
Xiaohongshu (小红书): AI速递