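Dynamic Tanh Rethinks the Transformer Architecture as MetaAI Reshapes the Normalization Standard 36 1 MetaAI has introduced Dynamic Tanh (DyT), which replaces conventional LayerNorm with a tanh function. It speeds up training and lowers cost on H100 GPUs, supports more efficient deployment of multimodal Transformers, and may become a standard for next-generation models.
Linear-MoE Unifies Sequence Modeling, a Breakthrough for Long-Text Processing 44 1 The Linear-MoE framework, developed jointly by academia and industry, unifies linear sequence modeling with mixture-of-experts systems. It processes million-token texts 3x faster and may define the architecture standard for next-generation models.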
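To make the idea of swapping LayerNorm for a tanh function concrete, here is a minimal pure-Python sketch. It assumes the commonly cited DyT form, gamma * tanh(alpha * x) + beta, with a learnable scalar alpha and per-channel gamma and beta. The function names, the starting value alpha = 0.5, and the sample vector are illustrative assumptions, not details taken from this summary.

```python
import math


def layer_norm(x, eps=1e-5):
    """Plain LayerNorm over one token vector (no affine terms), for comparison."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def dyt(x, alpha=0.5, gamma=None, beta=None):
    """Dynamic Tanh sketch: gamma * tanh(alpha * x) + beta, element-wise.

    alpha is a scalar (learnable in a real model); gamma and beta are
    per-channel scale and shift. The defaults here are assumptions.
    """
    n = len(x)
    gamma = gamma if gamma is not None else [1.0] * n
    beta = beta if beta is not None else [0.0] * n
    return [g * math.tanh(alpha * v) + b for v, g, b in zip(x, gamma, beta)]


# A token vector with one extreme activation (hypothetical values).
token = [0.2, -1.5, 3.0, 40.0]
print("LayerNorm:", layer_norm(token))
print("DyT:      ", dyt(token))
```

Unlike LayerNorm, this sketch computes no mean or variance across the vector: each element is squashed independently, so extreme activations saturate toward the bounds of tanh instead of being rescaled by batch or token statistics.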
Business Machine Learning Research AI Agents Large Language Models (LLMs) Open Standard for Tool Use and Data Access Gains Momentum 166 0 OpenAI adopts Model Context Protocol to boost LLM tool integration
Business Machine Learning Research AI Agents Large Language Models (LLMs) Open Standard for Tool Use and Data Access Gains Momentum OpenAI adopts Model Context Protocol to boost LLM tool integration 167 0 OpenAI embraced Model Context Protocol, providing powerful support for an o
Machine Learning Research Transformers Large Language Models (LLMs) Toward LLMs That Understand Misspellings New byte-based model beats Llama 3 on spelling, noise, and translation 207 0 Researchers built a model that’s more robust to noisy inputs like misspelli