HN handfuloflight 29d ago

Hierarchical-Context-Compressor

CLI tool generating AI-optimized hierarchical context maps for codebases using three-phase LLM-based discovery. Open source, GitHub Actions compatible.

Ax Xiaohang Nie, Zihan Guo, Zicai Cui, Jiachi Yang, Zeyi Chen, Leheyi De, Yu Zhang, Junwei Liao, Bo Huang, Yingxuan Yang, Zhi Han, Zimian Peng, Linyao Chen, Wenzheng Tom Tang, Zongkai Liu, Tao Zhou, Botao Amber Hu, Shuyang Tang, Jianghao Lin, Weiwen Liu, Muning Wen, Yuanjian Zhou, Weinan Zhang 29d ago

Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web

Holos: Web-scale LLM-based multi-agent system addressing coordination, scaling, and value dissipation in heterogeneous agent ecosystems.

Ax Xue Liu, Xin Ma, Yuxin Ma, Yongchang Peng, Duo Wang, Zhoufutu Wen, Ge Zhang, Kaiyuan Zhang, Xinyu Chen, Tianci He, Jiani Hou, Liang Hu, Ziyun Huang, Yongzhe Hui, Jianpeng Jiao, Chennan Ju, Yingru Kong, Yiran Li, Mengyun Liu, Luyao Ma, Fei Ni, Yiqing Ni, Yueyan Qiu, Yanle Ren, Zilin Shi, Zaiyuan Wang, Wenjie Yue, Shiyu Zhang, Xinyi Zhang, Kaiwen Zhao, Zhenwei Zhu 29d ago

XpertBench: Expert Level Tasks with Rubrics-Based Evaluation

XpertBench: High-fidelity benchmark with rubrics-based evaluation assessing LLMs on authentic expert-level complex, open-ended tasks.

Ax Anugyan Das, Omkar Ghugarkar, Vishvesh Bhat, Asad Aali 29d ago

Compositional Neuro-Symbolic Reasoning

Neuro-symbolic architecture combining neural networks and symbolic systems for abstract reasoning tasks, with improved generalization.

Ax Ramaneswaran Selvakumar, Kaousheik Jayakumar, S Sakshi, Sreyan Ghosh, Ruohan Gao, Dinesh Manocha 29d ago

Do Audio-Visual Large Language Models Really See and Hear?

Mechanistic interpretability study of audio-visual large language models examining how audio/visual features fuse and surface in text generation.

Ax Bernd Bohnet, Michael C. Mozer, Kevin Swersky, Wil Cunningham, Aaron Parisi, Kathleen Kenealy, Noah Fiedel 29d ago

Analysis of Optimality of Large Language Models on Planning Problems

Analyzes frontier LLMs on classic AI planning problems, examining whether models reason optimally or rely on heuristic strategies in the Blocksworld domain.

Ax Qianshan Wei, Yishan Yang, Siyi Wang, Jinglin Chen, Binyu Wang, Jiaming Wang, Shuang Chen, Zechen Li, Yang Shi, Yuqi Tang, Weining Wang, Yi Yu, Chaoyou Fu, Qi Li, Yi-Fan Zhang 29d ago

Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?

Benchmark evaluating multimodal LLM agents that use tools, including visual expansion and web search, through agentic reasoning.

Ax Fabian Gloeckle, Ahmad Rammal, Charles Arnal, Remi Munos, Vivien Cabannes, Gabriel Synnaeve, Amaury Hayat 29d ago

Automatic Textbook Formalization

AI system that automatically formalizes a 500+ page graduate-level algebraic combinatorics textbook in Lean, producing 130K lines of formal code.

Ax Mengzhou Wu, Yuzhe Guo, Yuan Cao, Haochuan Lu, Songhe Zhu, Pingzhe Qu, Xin Chen, Kang Qin, Zhongpu Wang, Xiaode Zhang, Xinyi Wang, Wei Dai, Gang Cao, Yuetang Deng, Zhi Gong, Dezhi Ran, Linyi Li, Wei Yang, Tao Xie 29d ago

UI-Oceanus: Scaling GUI Agents with Synthetic Environmental Dynamics

Framework for scaling GUI agents using synthetic environmental dynamics and self-supervised learning from ground-truth interaction feedback.

Ax Dun Yuan, Fuyuan Lyu, Ye Yuan, Weixu Zhang, Bowei He, Jiayi Geng, Linfeng Du, Zipeng Sun, Yankai Chen, Changjiang Han, Jikun Kang, Alex Chen, Haolun Wu, Xue Liu 29d ago

Beyond Message Passing: Toward Semantically Aligned Agent Communication

Analysis of agent communication protocols for LLM systems, organized into communication, syntactic, and semantic layers, with a systematic evaluation of 18 protocols.