Aligned Agents, Biased Swarm: Measuring Bias Amplification in Multi-Agent Systems
Aligned Agents, Biased Swarm: Empirical study measuring how multi-agent system topologies and feedback loops amplify bias in emergent behaviors.
Litmus ReAgent: Benchmark and agentic system for evaluating multilingual LLM performance prediction across 1,500 questions spanning six tasks and five evidence scenarios.
Neighbourhood Transformer: Graph neural network architecture using switchable attention to handle heterophilic graph learning where dissimilar nodes are frequently connected.
PerMix-RLVR: Training method for aligning LLM personas with reward models while preserving output diversity, avoiding inference-time computation overhead.
PinpointQA dataset and benchmark for evaluating small object localization and spatial reasoning in video MLLMs.
ASTRA: adaptive semantic tree reasoning architecture for LLM-based complex table question answering.
Survey and construction of linguistically-informed representations for English as a second/foreign language.
Named entity identification and anonymization system for cybercrime datasets using speech-to-text and image processing.
Regime-conditional retrieval with transferable router for two-hop question answering with theoretical foundations.
Noise-aware in-context learning approach to mitigate hallucinations in auditory large language models.
ImageProtector uses visual prompt injection to prevent multi-modal LLMs from analyzing protected images.
Vision-language models for image geolocation with structured geographic reasoning and autonomous self-evolution.
CONDESION-BENCH evaluates LLM decision-making with compositional action spaces and conditional feasibility constraints.
U-Cast: simple probabilistic weather forecasting with a standard U-Net architecture that achieves frontier performance.
Watt Counts: open-access energy consumption benchmark for LLM inference across 50 models and 10 GPU architectures.
PDYffusion combines diffusion models with physics-informed dynamics for long-horizon spatiotemporal prediction.
Vision-Language-Action models for autonomous driving combining perception, reasoning, and temporal dynamics modeling.
Frequency-enhanced diffusion models for zero-shot skeleton action recognition in computer vision.
NyayaMind framework for transparent legal reasoning and judgment prediction in Indian courts using LLMs.
Method integrating graph-based embeddings into event sequence models for improved user prediction on digital platforms.
DeepGuard improves secure code generation by LLMs through multi-layer semantic aggregation to mitigate vulnerable patterns.
CLIP-Inspector detects backdoor attacks in prompt-tuned vision-language models through out-of-distribution trigger inversion.
Research on detecting covert misaligned AI behavior in real-world settings using open-source intelligence methods.
TensorHub introduces Reference-Oriented Storage for efficient weight transfer in LLM reinforcement learning across heterogeneous computational resources.
PS-TTS method for phonetic synchronization in automated dubbing, addressing duration and lip-sync challenges in AI-based video translation.
Interactive ASR system with human-like interaction and semantic coherence evaluation, replacing WER metric with agent-based correction mechanisms.
EquiformerV3: SE(3)-equivariant graph attention Transformer for 3D atomistic modeling, improving efficiency, expressivity, and physical consistency.
CORA framework for risk-controlled GUI automation agents using conformal prediction to provide formally verified, user-tunable safety guarantees for VLM-powered mobile automation.
LLM-based agents for scaffolding diagnostic reasoning in educational settings, combining scenario-based learning with learning analytics and personalized support.
Dataset for personality-shaped emotional responses to text events, addressing limitations of LLM role-playing and personality illusion in affective computing.
Theoretical analysis of generalization and scaling laws for Mixture-of-Experts Transformers, separating active capacity from routing combinatorics with covering-number bounds.
Symbolic-Neural Consistency Audit framework extracting and formalizing LLM self-stated safety policies.
Vision transformer application predicting chemotherapy response in ovarian cancer from preoperative CT scans.
GNN-based deep reinforcement learning scheduler for cloud workflow DAG assignment minimizing time and energy.
GRM gradient-ratio masking attack on audio LLMs balancing jailbreak success with utility preservation.
Computational model of Von Economo neurons implementing biological speed-accuracy tradeoff in decision-making.
Neural distribution prior method for LiDAR out-of-distribution detection in autonomous driving.
Statistical analysis of the I-Ching King Wen sequence, showing that it yields no improvement in neural network training.
Mosaic multimodal jailbreak attack against closed-source VLMs via multi-view ensemble optimization.
SkillMOO multi-objective optimization framework automatically evolving agent skill bundles for coding tasks.
Visually-guided policy optimization improving visual faithfulness in vision-language models via reinforcement learning.
LLM-Rosetta hub-and-spoke intermediate representation for cross-provider LLM API translation and interoperability.
BadSkill: backdoor attack formulation exploiting model artifacts bundled in agent skills.
AI Codebase Maturity Model framework for systematic progression from assisted coding to self-sustaining systems.
Policy proposal for nuanced consent frameworks in generative AI training data usage.
PhysInOne dataset with 2M videos of physical phenomena for training physics-aware AI systems.
Co-design of accessible 3D data visualization tool for blind and low-vision users.
Video diffusion model learning joint distribution of videos and camera trajectories for novel viewpoint rendering.
Experimental evaluation framework for quantum-inspired 1024-D document embeddings in RAG and information retrieval applications.
Physics-guided surrogate learning for zero-shot control of turbulent aerodynamic flows using reinforcement learning.