Far Out: Evaluating Language Models on Slang in Australian and Indian English
Evaluation of slang comprehension in state-of-the-art LLMs for Indian and Australian English varieties.
SecCodeBench-V2: benchmark of 98 scenarios evaluating LLM code generation security across 5 languages and 22 CWE categories.
STAPO method stabilizing RL fine-tuning for LLMs by controlling rare spurious token gradients to prevent training collapse.
Content-based framework for consistent refusal decisions in LLM-based cybersecurity agents avoiding over-restriction and brittleness.
Koopman-Bayesian framework combining nonlinear dynamics and haptic rendering for surgical simulation.
Benchmark for selecting humorous manga panel responses in conversations using contextual meme understanding.
ML model for personalized exercise recommendation using behavior-aware memory augmentation for student learning.
Distributed physics-informed neural networks framework using domain decomposition for faster flow reconstruction from sparse velocity measurements.
Adaptive semi-supervised training method for P300 brain-computer interface speller with reduced calibration effort.
R²Energy benchmark for evaluating deep learning models on renewable energy forecasting under extreme weather conditions.
B-DENSE: branching approach to improve diffusion model inference speed while preserving structural information from intermediate trajectory steps.
Multi-armed bandit algorithm using Thompson sampling with Gaussian priors for clustered arm problems in communications and portfolio management.
Flow model expansion method using verifier constraints for scientific discovery beyond training data distribution in molecular design space.
Mechanistic study of capability emergence tracking geometric measures across model scales, revealing scale-invariant representation collapse during training.
Flow-matching generative model for predicting molecular crystal structures with periodic arrangements handling large molecules and complex interactions.
Carbon-aware evaluation metric for AI models that incorporates energy consumption and carbon emissions alongside traditional performance benchmarks.
Expert budgeting optimization for efficient speculative decoding in Mixture-of-Experts LLMs to reduce memory pressure and maintain speedup.
Multi-objective alignment method for LLMs in psychotherapy applications balancing patient preferences with clinical safety using preference rankings.
Multi-view tensor decomposition framework to extract and analyze driver behavior patterns at railway crossings across temporal phases.
Theoretical analysis of generative AI robustness under data contamination from AI-generated content in recursive training, with guarantees on model survival.
Large-scale intracranial EEG dataset and benchmark for epilepsy research with automated seizure localization to reduce manual clinical review burden.
Continual learning approach for wheel fault detection in railway systems using semantic-aware sensor fusion for online predictive maintenance.
Computational pipeline for statistical analysis of shape graph datasets using feature extraction to capture both connectivity and geometric variations.
Investigation of source screening for learning shared feature extractors across heterogeneous data sources to filter low-quality or irrelevant data.
Analysis of graph neural network convergence on large random graphs with correlated node features to assess GNN expressiveness in realistic settings.
Study on cross-subject generalization in EEG-based brain-computer interfaces using spectral versus temporal representations for neural signal analysis.
Distributionally robust optimization method combining differential privacy with worst-case loss minimization for handling distribution shifts and adversarial perturbations.
Hierarchical reinforcement learning framework for training LLM agents on long-horizon tasks with sparse rewards using explicit credit assignment across action hierarchies.
Optimization method using orthogonalized updates for physics-informed neural networks and neural operators to handle ill-conditioned gradients and multi-scale behavior.
Non-autoregressive generation using masked diffusion language models with remasking samplers to reduce decoding latency while addressing error accumulation in iterative refinement.
Federated learning approach for energy theft detection in resource-constrained smart meters with privacy preservation.
Temporal-Prior Conditioning method for time series forecasting with LLMs, treating time as a first-class modality across model depths.
Physics-informed neural networks using geometric compactification mappings to address multi-scale PDEs and gradient stiffness issues.
Graphon-based mean-field method for multi-agent reinforcement learning that handles heterogeneous agents and improves computational efficiency.
ModalImmune framework for multimodal robustness via intentional modality collapse during training to handle missing input channels.
Training-free adaptation method for diffusion models using Doob's h-transform, with no differentiability assumptions.
Neurochaos Learning applied to linked data classification, demonstrating small-sample learning and low compute requirements on graph-structured data.
Neural operators for PDEs using Lie group-constrained latent dynamics to improve stability in multi-layer iteration and long-horizon rollout.
Graph neural network approach for sea ice modeling using collision physics in a one-dimensional framework.
Uncertainty-aware hybrid CNN-Transformer architecture for ECG arrhythmia detection with reliability quantification.
Comprehensive survey of Bayesian quadrature as a probabilistic approach to numerical integration and uncertainty quantification.
Lightweight MLP-Mixer model with random attention for multiscale long-term time series forecasting.
Amortized training framework addressing low-predictability samples in time series forecasting and classification.
Factored latent action world models learning from action-free video for scalable controllable video generation.
High-probability regret bounds for online Q-learning in infinite-horizon MDPs without optimism or bonus terms.
KV cache compaction method via attention matching to reduce memory overhead for long-context LLM inference.
Graph meta-network architecture for weight-space models to predict neural network accuracy on Kolmogorov-Arnold networks.
Machine learning approach to predict off-target behavior in CRISPR genome editing applications.
Hardware-aware framework for DNN approximation using multi-level sensitivity scoring and heterogeneous approximation blocks.
Theoretical analysis of implicit bias in momentum-based optimizers (Adam, Muon, MomentumGD) on homogeneous neural networks.