Top-Conference Paper Deep Dives | Latest Results from NeurIPS/ICML - AI中国 | Tutorials | Tool Directory

Dynamic Tanh Reworks the Transformer Architecture as Meta AI Rethinks the Normalization Standard

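Meta AI has introduced Dynamic Tanh (DyT), which replaces conventional LayerNorm with a tanh function. On H100 GPUs it speeds up training and lowers cost, supports more efficient deployment of multimodal Transformers, and may become a standard for next-generation models.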
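As a rough illustration of swapping LayerNorm for a tanh, here is a minimal sketch. The summary above only says that tanh replaces LayerNorm; the specific form `gamma * tanh(alpha * x) + beta`, the parameter names, and the default `alpha=0.5` are assumptions based on how DyT is commonly described, not details confirmed by this page. In a real model `alpha`, `gamma`, and `beta` would be learned; here they are plain numbers.

```python
import math

def dyt(x, alpha=0.5, gamma=None, beta=None):
    """Element-wise Dynamic Tanh over one token's feature vector.

    Assumed form: gamma * tanh(alpha * x) + beta.
    alpha is a single scalar; gamma and beta are per-feature.
    """
    n = len(x)
    gamma = [1.0] * n if gamma is None else gamma  # per-feature scale
    beta = [0.0] * n if beta is None else beta     # per-feature shift
    return [g * math.tanh(alpha * v) + b for v, g, b in zip(x, gamma, beta)]

features = [-3.0, 0.0, 0.5, 40.0]
squashed = dyt(features)
```

Unlike LayerNorm, this computes no mean or variance across the feature vector: each element is squashed independently, so extreme activations saturate near the bounds set by `gamma` and `beta`.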

Linear-MoE Unifies Sequence Modeling, a Breakthrough for Long-Text Processing

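Linear-MoE, a framework developed jointly by academia and industry, unifies linear sequence modeling with mixture-of-experts (MoE). It processes million-token long texts 3x faster and may define the architecture standard for next-generation models.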

OCTS Algorithm Breaks the LLM Inference Bottleneck, Offering a Fix for the Inverse Long-Tail Problem

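The new OCTS algorithm uses answer clustering and a dynamic stopping mechanism to mitigate the inverse long-tail effect in LLM inference. On complex tasks it improves response speed by 45% and cuts compute consumption by 30%.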
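The summary names two ideas, answer clustering and dynamic stopping, without describing how OCTS implements them. The sketch below is a generic illustration of those two ideas only, not the OCTS algorithm: it groups sampled answers by a normalized string (a stand-in for real clustering) and stops sampling once the leading group is ahead by a margin. The function name, the `lead`, `min_samples`, and `max_samples` parameters, and the stopping rule are all hypothetical.

```python
from collections import Counter
from itertools import islice

def sample_until_stable(answers, max_samples=16, min_samples=3, lead=2):
    """Consume sampled answers lazily and stop once one cluster clearly leads.

    answers: any iterable of answer strings (e.g. a generator of LLM samples).
    Returns (best_answer_or_None, number_of_samples_used).
    """
    counts = Counter()
    used = 0
    for raw in islice(answers, max_samples):  # never draw more than the budget
        used += 1
        counts[raw.strip().lower()] += 1      # hypothetical clustering: normalized exact match
        ranked = counts.most_common(2)
        top = ranked[0][1]
        second = ranked[1][1] if len(ranked) > 1 else 0
        # Dynamic stop: enough samples seen and the leader is far enough ahead.
        if used >= min_samples and top - second >= lead:
            break
    best = counts.most_common(1)[0][0] if counts else None
    return best, used

samples = iter(["42", " 42", "41", "42 ", "42", "7"])
best, used = sample_until_stable(samples)
```

In this run sampling stops after the fourth answer, so the last two samples are never drawn. Skipping generations like these is where any saving in response time and compute would come from.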

Machine Learning Research

Large Language Models (LLMs)

Open Standard for Tool Use and Data Access Gains Momentum

OpenAI adopts Model Context Protocol to boost LLM tool integration

OpenAI embraced Model Context Protocol, providing powerful support for an o…

Machine Learning Research

Large Language Models (LLMs)

Toward LLMs That Understand Misspellings

New byte-based model beats Llama 3 on spelling, noise, and translation

Researchers built a model that’s more robust to noisy inputs like misspelli…
