Search: Agentic AI costs

Paper page - FeatCal: Feature Calibration for Post-Merging Models

…Paper ID: 2605.13030…

May 14, 2026

Paper page - Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why

…Community: https://x.com/MFarajtabar/status/2054275640946458785 (paper ID: 2605.10889)…

May 12, 2026

Paper page - Triplet-Block Diffusion RWKV

…AI-generated summary: Causal Transformer language models suffer from strictly sequential decoding and a quadratic per-step attention cost. While linear-time causal models and discrete diffusion models each address these weaknesses…

May 28, 2026

Paper page - Position: LLM Inference Should Be Evaluated as Energy-to-Token Production

…Listed API prices vary by over an order of magnitude across providers, but we use price dispersion only as directional motivation, not as causal evidence of marginal cost. The core physical question…

May 14, 2026

Paper page - FAAST: Forward-Only Associative Learning via Closed-Form Fast Weights for Test-Time Supervised Adaptation

…AI-generated summary: Adapting pretrained models typically involves a trade-off between the high training costs of backpropagation and the heavy inference overhead of memory-based or in-context learning. We propose…

May 14, 2026

Paper page - DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models

…Encoding cost stays roughly constant in K instead of scaling with it. Findings. Multi-token helps every diffusion backbone we test, on every benchmark (MS MARCO, TREC DL'19/'20, BEIR-7…

May 12, 2026

Paper page - Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States

…We introduce Policy Optimization with Internal State Value Estimation, which obtains a baseline at negligible cost by using the policy model's internal signals already computed during the policy forward pass. A…

May 13, 2026

Paper page - MASCing: Configurable Mixture-of-Experts Behavior via Activation Steering Masks

Paper page - Teaching Language Models to Think in Code

Paper page - Fast Byte Latent Transformer