Structured Legal Document Generation in India: A Model-Agnostic Wrapper Approach with VidhikDastaavej
VidhikDastaavej: model-agnostic wrapper for structured legal document generation in India using LLMs, with a large-scale anonymized private-document dataset.
arXiv paper comparing linguistic features of AI-generated vs human responses to mental health queries. Application of LLMs in healthcare.
arXiv paper introducing Distance Explainer for generating post-hoc explanations of embedded vector spaces using saliency-based techniques.
arXiv paper on Bottlenecked Transformers using periodic KV cache consolidation for improved reasoning with auxiliary latent-space computation.
arXiv paper on RestoreVAR using visual autoregressive modeling for fast all-in-one image restoration, replacing slower diffusion approaches.
arXiv paper on 3D Gaussian Splatting for wideband RF signal modeling. Computer vision/signal processing, not AI/ML development.
arXiv paper demonstrating in-context reinforcement learning (ICRL) emerges during LLM inference. Proposes ICRL prompting for inference-time self-improvement.
arXiv paper exploring persona prompts as jailbreak attack vector against LLMs. Security analysis of prompt injection vulnerabilities.
arXiv paper on SafeSieve, progressive pruning algorithm for LLM-based multi-agent systems to reduce token overhead and communication redundancy.
arXiv paper introducing TimeAlign, representation learning method using contrastive alignment for time series forecasting.
arXiv paper on PromptLoop for iterative prompt refinement in diffusion models via latent RL feedback. Improves generalization and robustness.
Proposes declarative OS interfaces to improve computer-use agents' ability to interact with GUIs, replacing error-prone imperative action sequences.
SyTTA: Label-free test-time adaptation for LLMs using only 4 extra tokens, enabling domain-specific deployment without fine-tuning on expensive labeled data.
Network traffic characterization study analyzing ChatGPT, Copilot, and Gemini usage patterns and their impact on internet infrastructure.
Proposes future summary prediction as alternative to next-token prediction during LLM pretraining, improving long-horizon reasoning and planning capabilities.
OffSim: Model-based offline inverse reinforcement learning framework that learns environment dynamics and reward functions from offline data without manual specification.
Multilingual LLM watermarking robustness study showing current methods fail on low-resource languages, proposes back-translation approach for 100+ language coverage.
QUARK: FPGA acceleration framework leveraging quantization and common patterns in nonlinear operations to accelerate transformer inference.
Curiosity-driven quantized Mixture-of-Experts framework using Bayesian uncertainty routing for accurate inference on resource-constrained devices.
ContagionRL: Gymnasium-compatible RL platform for reward engineering in spatial epidemic simulations, enabling systematic evaluation of behavioral learning strategies.
Unified distillation and adaptation framework for diffusion models enabling fast, high-quality image generation in novel domains with single-stage pipeline.
Vision-Language-Action models enhanced via Tweedie discrete diffusion for improved generalization and fine-grained control in robotic manipulation tasks.
Proposes goal-oriented multi-agent semantic networking architecture for 6G services integrating AI-native communication with network-level intelligence.
Biomedical vision-language pretraining approach that captures fine-grained correspondences in scientific figures and text, improving domain-specific representations.
Proposes adaptive frame selection method for long-form video understanding with large multimodal models, reducing computational overhead while maintaining query awareness.
Research on LLM-based agents for decision support, proposing collaborative sensemaking approach where agents act as partners rather than answer engines to improve human-AI complementarity.
ODMA proposes on-demand memory allocation strategy for efficient LLM serving on low-bandwidth accelerators, addressing limitations of static pre-allocation and fine-grained paging.
Physics-driven computing using magnetic tunnel junction dynamics for neuromorphic working memory, demonstrating energy efficiency over GPUs on vision tasks.
Theoretical comparison between DNNs as discrete dynamical systems and physics-based differential equation solvers on benchmark PDEs.
Analysis of textual reasoning in blind image quality assessment models. Investigates information flow between image, text, and quality predictions.
Probability-guided token selection for SFT to address overfitting to single reference answers. Leverages multiple references while managing data costs.
100M high-quality Chinese image-text dataset for vision-language pre-training. Addresses bottleneck in Chinese VLP model development.
Scalable compliance evaluation framework for multi-policy AI governance. Integrates comprehensive model-card format and streamlines policy compliance burden.
Reference-free hallucination detection for LLM-generated code review comments. Identifies context misalignment without ground truth, enabling practical adoption in code review automation.
Framework addressing sycophancy in LLM decision support systems through premise governance. Proposes structured verification for deep-uncertainty decisions.
Self-distillation approach for machine unlearning in text-to-image diffusion models. Balances effective forgetting with retention of unrelated concepts.
Statistical analysis of variance in agentic system evaluations. Shows single-run pass@1 scores on SWE-Bench vary substantially (2.2-6.0%), calling for improved evaluation methodology.
Hierarchical framework for log anomaly detection that preserves component execution structure. Addresses spurious correlations in flat-sequence approaches.
AceGRPO combines adaptive curriculum learning with GRPO for autonomous ML engineering agents. Addresses behavioral stagnation and data inefficiency in long-horizon optimization tasks.
Joint audio-video generation model for synchronized customization of video identity and audio timbre from reference inputs.
Heterogeneous multi-agent framework treating diverse LLM models as specialized tools. Introduces orchestrator calibration for efficient test-time scaling through coordinated tool calling.
Smooth gate functions for stabilizing GRPO LLM training. Replaces hard clipping with sigmoid-based gating to improve optimization stability in reasoning tasks.
Theoretical analysis of offline reinforcement learning with general function approximation and parametric policies, extending beyond finite action spaces.
Open-source framework for deploying DARPA AIxCC cyber reasoning systems locally. Makes competition CRSs usable outside original infrastructure with improved accessibility.
Evaluation framework for persona-adaptive LLM-powered agents in multi-modal settings, addressing user-aware behavior in customer experience management.
Mathematical analysis of Collatz conjecture dynamics using modular arithmetic and combinatorial methods, conducted with LLM assistance; otherwise pure mathematics research rather than AI/ML development.
Red-teaming Vision-Language-Action models through quality diversity prompt generation to improve robot policy robustness.
AgentDrift: reveals safety risks in LLM agent recommendations when tools are corrupted, showing that these failures are hidden by standard evaluation metrics.