🤗 Upvotes: 12 | cs.CL
Authors: Yinjie Wang, Ling Yang, Guohao Li, Mengdi Wang, Bryon Aragam
Title: ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization
Arxiv: http://arxiv.org/abs/2502.04306v1
Abstract: Recent research has leveraged large language model multi-agent systems for complex problem-solving while trying to reduce the manual effort required to build them, driving the development of automated agent workflow optimization methods. However, existing methods rema...
Feb 08, 2025•21 min•Ep. 501
🤗 Upvotes: 10 | eess.AS, cs.AI, cs.CL, cs.MM, cs.SD
Authors: Zhen Ye, Xinfa Zhu, Chi-Min Chan, Xinsheng Wang, Xu Tan, Jiahe Lei, Yi Peng, Haohe Liu, Yizhu Jin, Zheqi DAI, Hongzhan Lin, Jianyi Chen, Xingjian Du, Liumeng Xue, Yunlin Chen, Zhifei Li, Lei Xie, Qiuqiang Kong, Yike Guo, Wei Xue
Title: Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis
Arxiv: http://arxiv.org/abs/2502.04128v1
Abstract: Recent advances in text-based large language models (LLMs), parti...
Feb 08, 2025•23 min•Ep. 500
🤗 Upvotes: 90 | cs.CL
Authors: Loubna Ben Allal, Anton Lozhkov, Elie Bakouch, Gabriel Martín Blázquez, Guilherme Penedo, Lewis Tunstall, Andrés Marafioti, Hynek Kydlíček, Agustín Piqueres Lajarín, Vaibhav Srivastav, Joshua Lochner, Caleb Fahlgren, Xuan-Son Nguyen, Clémentine Fourrier, Ben Burtenshaw, Hugo Larcher, Haojun Zhao, Cyril Zakka, Mathieu Morlon, Colin Raffel, Leandro von Werra, Thomas Wolf
Title: SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Arxiv: htt...
Feb 07, 2025•22 min•Ep. 499
🤗 Upvotes: 27 | cs.CE, cs.CY
Authors: Yuzhe Yang, Yifei Zhang, Minghao Wu, Kaidi Zhang, Yunmiao Zhang, Honghai Yu, Yan Hu, Benyou Wang
Title: TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
Arxiv: http://arxiv.org/abs/2502.01506v2
Abstract: The study of social emergence has long been a central focus in social science. Traditional modeling approaches, such as rule-based Agent-Based Models (ABMs), struggle to capture the diversity and complexity of human behavior, pa...
Feb 07, 2025•23 min•Ep. 498
🤗 Upvotes: 26 | cs.CL, cs.LG
Authors: Edward Yeo, Yuxuan Tong, Morry Niu, Graham Neubig, Xiang Yue
Title: Demystifying Long Chain-of-Thought Reasoning in LLMs
Arxiv: http://arxiv.org/abs/2502.03373v1
Abstract: Scaling inference compute enhances reasoning in large language models (LLMs), with long chains-of-thought (CoTs) enabling strategies like backtracking and error correction. Reinforcement learning (RL) has emerged as a crucial method for developing these capabilities, yet the conditions un...
Feb 07, 2025•21 min•Ep. 497
🤗 Upvotes: 24 | cs.CL, cs.AI
Authors: Yixin Ye, Zhen Huang, Yang Xiao, Ethan Chern, Shijie Xia, Pengfei Liu
Title: LIMO: Less is More for Reasoning
Arxiv: http://arxiv.org/abs/2502.03387v1
Abstract: We present a fundamental discovery that challenges our understanding of how complex reasoning emerges in large language models. While conventional wisdom suggests that sophisticated reasoning tasks demand extensive training data (>100,000 examples), we demonstrate that complex mathematical reason...
Feb 07, 2025•24 min•Ep. 496
🤗 Upvotes: 10 | cs.CL
Authors: Jinyang Wu, Mingkuan Feng, Shuai Zhang, Ruihan Jin, Feihu Che, Zengqi Wen, Jianhua Tao
Title: Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Arxiv: http://arxiv.org/abs/2502.02339v1
Abstract: Multimodal large language models (MLLMs) exhibit impressive capabilities but still face challenges in complex visual reasoning. While recent efforts attempt to enhance MLLMs' reasoning by incorporating OpenAI o1-like structured thinking through explicit...
Feb 07, 2025•22 min•Ep. 495
🤗 Upvotes: 7 | cs.CV
Authors: Yiren Song, Danze Chen, Mike Zheng Shou
Title: LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
Arxiv: http://arxiv.org/abs/2502.01105v1
Abstract: Generating cognitive-aligned layered SVGs remains challenging due to existing methods' tendencies toward either oversimplified single-layer outputs or optimization-induced shape redundancies. We propose LayerTracer, a diffusion transformer based framework that bridges this gap by learning de...
Feb 07, 2025•26 min•Ep. 494
🤗 Upvotes: 6 | cs.LG, cs.AI, cs.CL, stat.ML
Authors: Daniil Tiapkin, Daniele Calandriello, Johan Ferret, Sarah Perrin, Nino Vieillard, Alexandre Ramé, Mathieu Blondel
Title: On Teacher Hacking in Language Model Distillation
Arxiv: http://arxiv.org/abs/2502.02671v1
Abstract: Post-training of language models (LMs) increasingly relies on the following two stages: (i) knowledge distillation, where the LM is trained to imitate a larger teacher LM, and (ii) reinforcement learning from human feedback ...
Feb 07, 2025•21 min•Ep. 493
🤗 Upvotes: 5 | cs.LG, cs.AI
Authors: Isha Puri, Shivchander Sudalairaj, Guangxuan Xu, Kai Xu, Akash Srivastava
Title: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
Arxiv: http://arxiv.org/abs/2502.01618v2
Abstract: Large language models (LLMs) have achieved significant performance gains via scaling up model sizes and/or data. However, recent evidence suggests diminishing returns from such approaches, motivating scaling the computat...
Feb 07, 2025•24 min•Ep. 492
🤗 Upvotes: 4 | cs.CL, cs.AI, cs.CR, cs.LG
Authors: Yu-Ling Hsu, Hsuan Su, Shang-Tse Chen
Title: Jailbreaking with Universal Multi-Prompts
Arxiv: http://arxiv.org/abs/2502.01154v1
Abstract: Large language models (LLMs) have seen rapid development in recent years, revolutionizing various applications and significantly enhancing convenience and productivity. However, alongside their impressive capabilities, ethical concerns and new types of attacks, such as jailbreaking, have emerged. While most p...
Feb 07, 2025•20 min•Ep. 491
🤗 Upvotes: 29 | cs.CV
Authors: Hila Chefer, Uriel Singer, Amit Zohar, Yuval Kirstain, Adam Polyak, Yaniv Taigman, Lior Wolf, Shelly Sheynin
Title: VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models
Arxiv: http://arxiv.org/abs/2502.02492v1
Abstract: Despite tremendous recent progress, generative video models still struggle to capture real-world motion, dynamics, and physics. We show that this limitation arises from the conventional pixel reconstructi...
Feb 06, 2025•19 min•Ep. 490
🤗 Upvotes: 22 | cs.LG, cs.CV
Authors: Nikita Gushchin, David Li, Daniil Selikhanovych, Evgeny Burnaev, Dmitry Baranchuk, Alexander Korotin
Title: Inverse Bridge Matching Distillation
Arxiv: http://arxiv.org/abs/2502.01362v1
Abstract: Learning diffusion bridge models is easy; making them fast and practical is an art. Diffusion bridge models (DBMs) are a promising extension of diffusion models for applications in image-to-image translation. However, like many modern diffusion and flow models, DBM...
Feb 06, 2025•20 min•Ep. 489
🤗 Upvotes: 16 | cs.SE, cs.AI, cs.CL
Authors: Huaye Zeng, Dongfu Jiang, Haozhe Wang, Ping Nie, Xiaotong Chen, Wenhu Chen
Title: ACECODER: Acing Coder RL via Automated Test-Case Synthesis
Arxiv: http://arxiv.org/abs/2502.01718v1
Abstract: Most progress in recent coder models has been driven by supervised fine-tuning (SFT), while the potential of reinforcement learning (RL) remains largely unexplored, primarily due to the lack of reliable reward data/model in the code domain. In this paper, we add...
Feb 06, 2025•20 min•Ep. 488
🤗 Upvotes: 12 | cs.LG, cs.AI
Authors: Zongyu Lin, Yao Tang, Xingcheng Yao, Da Yin, Ziniu Hu, Yizhou Sun, Kai-Wei Chang
Title: QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
Arxiv: http://arxiv.org/abs/2502.02584v1
Abstract: Language agents have become a promising solution to complex interactive tasks. One of the key ingredients to the success of language agents is the reward model on the trajectory of the agentic workflow, which provides valuable guidance during training ...
Feb 06, 2025•18 min•Ep. 487
🤗 Upvotes: 12 | cs.CL, cs.AI
Authors: Maohao Shen, Guangtao Zeng, Zhenting Qi, Zhang-Wei Hong, Zhenfang Chen, Wei Lu, Gregory Wornell, Subhro Das, David Cox, Chuang Gan
Title: Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Arxiv: http://arxiv.org/abs/2502.02508v1
Abstract: Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse domains. Recent studies have shown that increasing test-time comput...
Feb 06, 2025•24 min•Ep. 486
🤗 Upvotes: 7 | cs.CL, cs.LG
Authors: Wenzhe Li, Yong Lin, Mengzhou Xia, Chi Jin
Title: Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
Arxiv: http://arxiv.org/abs/2502.00674v1
Abstract: Ensembling outputs from diverse sources is a straightforward yet effective approach to boost performance. Mixture-of-Agents (MoA) is one such popular ensemble method that aggregates outputs from multiple different Large Language Models (LLMs). This paper raises the question in...
Feb 06, 2025•24 min•Ep. 485
🤗 Upvotes: 7 | cs.CV
Authors: Xueqing Deng, Qihang Yu, Ali Athar, Chenglin Yang, Linjie Yang, Xiaojie Jin, Xiaohui Shen, Liang-Chieh Chen
Title: COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation
Arxiv: http://arxiv.org/abs/2502.02589v1
Abstract: This paper introduces the COCONut-PanCap dataset, created to enhance panoptic segmentation and grounded image captioning. Building upon the COCO dataset with advanced COCONut panoptic masks, ...
Feb 06, 2025•25 min•Ep. 484
🤗 Upvotes: 84 | cs.LG
Authors: Alexey Gorbatovski, Boris Shaposhnikov, Viacheslav Sinii, Alexey Malakhov, Daniil Gavrilov
Title: The Differences Between Direct Alignment Algorithms are a Blur
Arxiv: http://arxiv.org/abs/2502.01237v1
Abstract: Direct Alignment Algorithms (DAAs) simplify language model alignment by replacing reinforcement learning (RL) and reward modeling (RM) in Reinforcement Learning from Human Feedback (RLHF) with direct policy optimization. DAAs can be classified by their ran...
Feb 05, 2025•20 min•Ep. 483
🤗 Upvotes: 83 | cs.CV
Authors: Gaojie Lin, Jianwen Jiang, Jiaqi Yang, Zerong Zheng, Chao Liang
Title: OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Arxiv: http://arxiv.org/abs/2502.01061v1
Abstract: End-to-end human animation, such as audio-driven talking human generation, has undergone notable advancements in the recent few years. However, existing methods still struggle to scale up as large general video generation models, limiting their potential in r...
Feb 05, 2025•26 min•Ep. 482
🤗 Upvotes: 44 | cs.LG, cs.AI, cs.CL
Authors: Ganqu Cui, Lifan Yuan, Zefan Wang, Hanbin Wang, Wendi Li, Bingxiang He, Yuchen Fan, Tianyu Yu, Qixin Xu, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding
Title: Process Reinforcement through Implicit Rewards
Arxiv: http://arxiv.org/abs/2502.01456v1
Abstract: Dense process rewards have proven a more effective alternative to the sparse outcome...
Feb 05, 2025•22 min•Ep. 481
🤗 Upvotes: 25 | cs.CL
Authors: Ahmed Masry, Juan A. Rodriguez, Tianyu Zhang, Suyuchen Wang, Chao Wang, Aarash Feizi, Akshay Kalkunte Suresh, Abhay Puri, Xiangru Jian, Pierre-André Noël, Sathwik Tejaswi Madhusudhan, Marco Pedersoli, Bang Liu, Nicolas Chapados, Yoshua Bengio, Enamul Hoque, Christopher Pal, Issam H. Laradji, David Vazquez, Perouz Taslakian, Spandana Gella, Sai Rajeswar
Title: AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
Arxiv: http://arxiv.org/...
Feb 05, 2025•23 min•Ep. 480
🤗 Upvotes: 25 | cs.CR, cs.AI, cs.IR
Authors: Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Hanyu Wang, Feiyu Xiong, Jason Zhaoxin Fan, Bo Tang, Shichao Song, Mengwei Wang, Jiawei Yang
Title: SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model
Arxiv: http://arxiv.org/abs/2501.18636v1
Abstract: The indexing-retrieval-generation paradigm of retrieval-augmented generation (RAG) has been highly successful in solving knowledge-intensive tasks by integrating extern...
Feb 05, 2025•24 min•Ep. 479
🤗 Upvotes: 25 | cs.LG, cs.AI, cs.CL
Authors: Dawei Li, Renliang Sun, Yue Huang, Ming Zhong, Bohan Jiang, Jiawei Han, Xiangliang Zhang, Wei Wang, Huan Liu
Title: Preference Leakage: A Contamination Problem in LLM-as-a-judge
Arxiv: http://arxiv.org/abs/2502.01534v1
Abstract: Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods in model development. While their combination significantly enhances the efficiency of mod...
Feb 05, 2025•22 min•Ep. 478
🤗 Upvotes: 19 | cs.CV, cs.GR, cs.LG
Authors: Rohit Gandikota, Zongze Wu, Richard Zhang, David Bau, Eli Shechtman, Nick Kolkin
Title: SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
Arxiv: http://arxiv.org/abs/2502.01639v1
Abstract: We present SliderSpace, a framework for automatically decomposing the visual capabilities of diffusion models into controllable and human-understandable directions. Unlike existing control methods that require a user to specify attributes for eac...
Feb 05, 2025•25 min•Ep. 477
🤗 Upvotes: 15 | cs.AI, cs.CV
Authors: Huanqia Cai, Yijun Yang, Winston Hu
Title: MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models
Arxiv: http://arxiv.org/abs/2502.00698v1
Abstract: IQ testing has served as a foundational methodology for evaluating human cognitive capabilities, deliberately decoupling assessment from linguistic background, language proficiency, or domain-specific knowledge to isolate core competencies in abstraction and reasoning. Yet, artificial int...
Feb 05, 2025•25 min•Ep. 476
🤗 Upvotes: 12 | cs.CV, cs.AI, cs.CL, cs.HC, cs.LG
Authors: Ahmed Heakl, Sara Ghaboura, Omkar Thawkar, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan
Title: AIN: The Arabic INclusive Large Multimodal Model
Arxiv: http://arxiv.org/abs/2502.00094v1
Abstract: Amid the swift progress of large language models (LLMs) and their evolution into large multimodal models (LMMs), significant strides have been made in high-resource languages such as English and Chinese. While Arabic LLM...
Feb 05, 2025•21 min•Ep. 475
🤗 Upvotes: 54 | cs.CL, cs.AI, cs.LG
Authors: Niklas Muennighoff, Zitong Yang, Weijia Shi, Xiang Lisa Li, Li Fei-Fei, Hannaneh Hajishirzi, Luke Zettlemoyer, Percy Liang, Emmanuel Candès, Tatsunori Hashimoto
Title: s1: Simple test-time scaling
Arxiv: http://arxiv.org/abs/2501.19393v1
Abstract: Test-time scaling is a promising new approach to language modeling that uses extra test-time compute to improve performance. Recently, OpenAI's o1 model showed this capability but did not publicly share its...
Feb 04, 2025•23 min•Ep. 474
🤗 Upvotes: 28 | cs.CL, cs.AI
Authors: Baohao Liao, Yuhui Xu, Hanze Dong, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong
Title: Reward-Guided Speculative Decoding for Efficient LLM Reasoning
Arxiv: http://arxiv.org/abs/2501.19324v1
Abstract: We introduce Reward-Guided Speculative Decoding (RSD), a novel framework aimed at improving the efficiency of inference in large language models (LLMs). RSD synergistically combines a lightweight draft model with a more powerful target...
Feb 04, 2025•22 min•Ep. 473
🤗 Upvotes: 12 | cs.CL, cs.AI
Authors: Qika Lin, Tianzhe Zhao, Kai He, Zhen Peng, Fangzhi Xu, Ling Huang, Jingying Ma, Mengling Feng
Title: Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models
Arxiv: http://arxiv.org/abs/2501.18119v1
Abstract: Due to the presence of the natural gap between Knowledge Graph (KG) structures and the natural language, the effective integration of holistic structural information of KGs with Large Language Mode...
Feb 04, 2025•21 min•Ep. 472