Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding
Leak@k: study showing existing LLM unlearning methods fail under probabilistic decoding despite success under greedy decoding evaluation.
Blind-IGT: inverse game theory method jointly decoding rewards and rationality in entropy-regularized competitive games with unknown rationality parameter.
Quasimetric learning method for goal-conditioned RL using multi-step returns to estimate temporal distance between observations over long horizons.
FlowCast: conditional flow matching method for radar-based precipitation nowcasting addressing uncertainty and high-dimensional data modeling.
Gradient estimation method for multi-objective and meta reinforcement learning, partitioning n objectives into k groups for language model preference optimization.
InTAct: continual learning approach using interval-based task activation consolidation with mathematical guarantees against catastrophic forgetting.
MIST: neural network-based mutual information estimator trained on 625K synthetic distributions with known ground-truth MI.
E2E-GRec framework for end-to-end joint training of GNNs and recommender systems, replacing two-stage pipeline approach.
SelfAI multi-agent system for self-directed long-horizon scientific discovery with human-in-the-loop workflows and exploration trade-offs.
ML-Tool-Bench framework for tool-augmented planning in autonomous ML agents orchestrating data analysis and model optimization workflows.
GRAPE framework unifying positional encoding mechanisms using group actions for multiplicative rotations and additive biases.
ECHO benchmark for evaluating graph neural networks on long-range graph propagation and interaction tasks.
Clustered personalized federated learning framework using Population Stability Index to handle non-IID data across clients.
Soft-gated fractional mixture-of-experts with randomized adversarial training to defend ML models against adversarial attacks.
RL framework for adaptive precision tuning in linear solvers using contextual bandit approach to balance precision and efficiency.
HeurekaBench benchmarking framework for evaluating LLM-based AI co-scientist agents on end-to-end scientific analysis tasks.
Mathematical framework for polyphonic music generation using structural inductive bias and smart embeddings on Beethoven sonatas.
Multitask learning framework with denoising autoencoder for EEG signal analysis combining motor imagery and emotion recognition.
Mixture-of-experts model with self-augmentation for Quality of Service prediction in web service recommendation systems.
Method for inverting Self-Organizing Maps as generative models using activation patterns and distance geometry to reconstruct inputs.
Physics-informed inverse modeling framework for Arctic snow depth prediction combining process-based constraints with data-driven learning.
Machine unlearning method addressing long-tailed distributions in forget sets using forgetting-aware loss reweighting for privacy compliance.
Systematic comparison of explainability methods for detecting malicious hardware trojans in integrated circuits.
Online optimization algorithm for bi-level resource provisioning and scheduling with switching costs and cross-level constraints.
Framework combining VAE and deep metric learning for causal analysis of greenwashing in mining industry environmental disclosure.
Gradient-aligned calibration method for post-training quantization of diffusion models to accelerate inference and reduce memory usage.
Research on trade-off between LLM watermarking strength and speculative sampling efficiency, proposing methods to improve both simultaneously.
Method for learning per-layer equivariance relaxation in neural networks without manual hyperparameter tuning, improving optimization dynamics.
TextME enables multimodal expansion using text-only training, projecting diverse modalities into LLM embeddings without paired datasets.
Machine learning-based anomaly detection for 5G networks evaluated under realistic conditions, without IID assumptions and against adaptive attackers.
EBPO addresses stability issues in Group Relative Policy Optimization for LLM reasoning via Empirical Bayes shrinkage, reducing variance and gradient problems.
arXiv paper on causal Schrödinger bridges for constrained optimal transport in generative modeling under causal interventions.
arXiv paper on amortized neural symbolic regression addressing expression simplification bottleneck for discovering interpretable analytical expressions.
arXiv paper analyzing optimal batch size scheduling for large-scale deep learning using functional scaling law framework.
arXiv paper on formal temporal logic specifications enhancing safety of reinforcement learning control in aerospace F-16 simulation.
arXiv paper on HAWX, hardware-aware framework for fast DNN approximation using multi-level sensitivity scoring and heterogeneous approximate computing.
arXiv paper analyzing role of optimizer choice in Neural Collapse emergence during deep neural network training terminal phase.
arXiv paper proposing Action-Graph Policies for modeling action dependencies and coordination in multi-agent reinforcement learning systems.
arXiv paper on zeroth-order optimization for fine-tuning large-scale models via subspace gradient orthogonalization, improving accuracy-efficiency tradeoff.
Theoretical analysis of graph Laplacian methods for detecting singularities in point cloud manifolds with explicit bounds and geometric estimation tests.
Investigates stochastic localization techniques for sampling from unnormalized densities using score-based learning.
Studies optimal sampling complexity for estimating model order and parameters in one-dimensional Gaussian mixture models.
Research on local convergence rates of stochastic first-order methods under Polyak-Łojasiewicz conditions, a theoretical ML optimization problem.
Essay examining the interface problem between AI capabilities and real-world impact, citing Sakana AI's autonomous research system achieving peer-review publication.
Research paper demonstrating that multiple AI agents connected to live trading APIs all went bankrupt within 30 minutes because LLM hallucinations produced false market citations.
Curated directory of indie AI tools, startups, and APIs created by independent developers and solo founders with searchable categorization.
AI-native document database built in Rust enabling AI agents to reason through documents via structural reasoning rather than vector similarity retrieval.
AI voice agent that autonomously navigates IVR phone systems and negotiates customer retention discounts.
thisorthis.ai compares responses from 47+ text and image models side-by-side. Users submit one prompt and see outputs from ChatGPT, Claude, Gemini and others simultaneously with SmartPick LLM evaluation.
Anno API extracts clean structured text from web pages, reducing AI agent token consumption by 93% (600 vs 15,000 tokens per page). HTTP-based with ensemble extraction and confidence scoring.