Show HN: Skyreels V4 – AI Video Generator with Native Audio Sync
Multimodal AI video generator with native audio synchronization using dual-stream Diffusion Transformer architecture.
Personal anecdote about an AI system that disabled wireless drivers during automated recovery in an RV, leaving the machine offline.
Analysis of human-in-the-loop elicitation as security control for AI agent systems. Discusses exploit chain interruption.
Framework detecting sycophancy in LLM agents with dynamic behavioral gating for factual integrity. ArXiv research on agent behavior.
Experience-driven lifelong learning agent for psychological counseling with memory-augmented planning. AI agent research paper.
Auto-research guided system for discovering effective lifelong multimodal memory architectures for AI agents. ArXiv research paper.
Evidence that language reasoning models encode tool-calling decisions before chain-of-thought generation. Analysis of model decision-making timing.
Framework using LLMs to generate curriculum for RL agents. Applied to Blackjack with progressive action introduction.
Energy-based models framework for port-Hamiltonian system identification with provable stability guarantees. Physical AI application.
Analysis of OOD anomaly where deep networks assign higher density to simple out-of-distribution data than in-distribution test data.
MOON3.0 multimodal representation learning for e-commerce product understanding using reasoning-aware MLLMs to capture fine-grained attributes.
Think, Act, Build agentic framework using vision language models for zero-shot 3D visual grounding without relying on preprocessed point clouds.
UniMixer unified architecture examining scaling laws across attention, TokenMixer, and factorization-machine recommendation systems.
Test-time learning for language agents with learnable adaptation policies. Improves agent behavior through iterative refinement at inference.
Dignified Peer framework countering sycophancy and evasiveness in aligned LLMs through anti-sycophancy and empathy.
MyPhoneBench evaluation framework measuring privacy compliance in phone-use agents during mobile task completion.
ORBIT generates 20K training queries for search agents integrating LMs with web search using scalable and verifiable methods.
DR-LoRA assigns dynamic ranks to expert modules in MoE models for parameter-efficient fine-tuning of LLMs.
Triadic Cognitive Architecture for tool-using agents with principled bounds on information-acquisition costs and deliberation.
BIOGEN multi-agent reasoning framework using evidence-grounding for transcriptomic interpretation in antimicrobial resistance.
TaCarla comprehensive benchmarking dataset for end-to-end autonomous driving with perception and planning tasks.
MemFactory unified framework for training and inference in memory-augmented LLMs using RL to optimize memory operations.
Sven optimization algorithm exploiting natural loss decomposition using Moore-Penrose pseudoinverse for efficient neural network training.
Framework training LLMs to forecast supply chain disruptions using calibrated probabilistic forecasts from disruption outcomes.
UQ-SHRED adds uncertainty quantification to shallow recurrent decoder networks for sparse spatiotemporal reconstruction.
Online machine learning framework for multi-resolution energy system design optimization and performance analysis.
JetPrism diagnoses convergence issues in Conditional Flow Matching for physics simulations and inverse problems.
Distributed graph modeling approach for detecting money laundering transaction patterns at scale.
Tutorial on Bayesian Optimization as a principled framework for automating scientific discovery using surrogate models.
Principled layer-wise optimization approach for model merging via data-free covariance estimation without task-specific training.
SECURE framework addressing robustness issues in deep learning models for autonomous driving collision prediction.
GPU-accelerated inference algorithm for multivariate Hawkes processes achieving O(N) complexity with parallelization.
Novel Langevin-based algorithm for adaptive inverse reinforcement learning using Malliavin calculus for gradient estimation.
PI-JEPA: Physics-informed surrogate model for multiphysics simulation exploiting unlabeled parameter fields via latent prediction.
Residuals-based offline reinforcement learning approach for high-stakes applications with restrictive data coverage assumptions.
Benchmark datasets and evaluation protocols for machine learning methods on photoplethysmography medical signals.
Train-to-Test scaling laws optimizing model size, training tokens, and inference samples jointly for compute-optimal LLM deployment.
Study of reward hacking in LLM RL showing reproducible failure patterns and mitigation strategies using representation-level signals.
Hierarchical RL framework for privacy-preserving synthetic clinical data generation combining LLMs with structured learning.
CuTeGen: LLM-based agentic framework for automated generation and optimization of high-performance GPU kernels using CuTe abstraction.
Comparative study of Evolution Strategies vs GRPO for LLM post-training showing ES achieves comparable accuracy with different parameter geometry.
Residual decomposition framework for improving classifier performance on long-tailed datasets beyond standard logit adjustment.
Self-supervised framework for learning clinical ECG image representations without access to raw signal recordings.
ZEUS: Training-free acceleration method for diffusion models using second-order predictors to reduce sampling steps.
Care-Conditioned Neuromodulation framework for LLM-based dialogue agents that balances helpfulness with user autonomy preservation.
EEG seizure detection method using graph neural networks with self-supervised learning and information bottleneck principles.
Influence-Guided PPO framework for LLM post-training that filters noisy rollouts using data attribution to improve training efficiency.
Research on training LLMs to develop both in-context and in-weights learning capabilities simultaneously via contrastive context sampling.
Novel reinforcement learning algorithm addressing noisy temporal difference errors in deep RL through pseudo-quantization methods.
arXiv paper on expert-choice routing for diffusion language models. Deterministic load balancing improves throughput and convergence compared with token-choice routing.
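
For readers unfamiliar with the routing terms in the last entry, here is a minimal sketch of the general idea, not the paper's implementation. It assumes a hypothetical score matrix `scores[t][e]` (the router's affinity of token `t` for expert `e`) and a hypothetical per-expert `capacity`. In token-choice routing each token picks its best experts, so load can pile up on one expert. In expert-choice routing each expert picks its top `capacity` tokens, so every expert's load is fixed by construction.

```python
from typing import Dict, List


def expert_choice_route(scores: List[List[float]], capacity: int) -> Dict[int, List[int]]:
    """Each expert selects its `capacity` highest-scoring tokens.

    scores[t][e] is an assumed router affinity of token t for expert e.
    Every expert ends up with exactly `capacity` tokens (given enough tokens).
    """
    num_tokens = len(scores)
    num_experts = len(scores[0])
    assignment: Dict[int, List[int]] = {}
    for e in range(num_experts):
        # Rank tokens by this expert's score, highest first.
        ranked = sorted(range(num_tokens), key=lambda t: scores[t][e], reverse=True)
        assignment[e] = sorted(ranked[:capacity])
    return assignment


def token_choice_route(scores: List[List[float]], top_k: int = 1) -> Dict[int, List[int]]:
    """Each token selects its `top_k` highest-scoring experts (load may be uneven)."""
    num_experts = len(scores[0])
    assignment: Dict[int, List[int]] = {e: [] for e in range(num_experts)}
    for t, row in enumerate(scores):
        ranked = sorted(range(num_experts), key=lambda e: row[e], reverse=True)
        for e in ranked[:top_k]:
            assignment[e].append(t)
    return assignment


# Hypothetical scores: all four tokens prefer expert 0.
scores = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.7, 0.3],
    [0.6, 0.4],
]
ec = expert_choice_route(scores, capacity=2)
tc = token_choice_route(scores, top_k=1)
print("expert-choice:", ec)
print("token-choice: ", tc)
```

With these made-up scores, token-choice sends every token to expert 0 and leaves expert 1 idle, while expert-choice gives each expert two tokens. Note that under expert-choice a token can in general be picked by several experts or by none; how a real system handles that is outside this sketch.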