benqi

Aidabenk

AI & ML interests

NLP

Recent Activity

liked a dataset 6 days ago

tencent/WildSpeech-Bench

Organizations

None yet

upvoted a paper 5 days ago

SWE-QA: Can Language Models Answer Repository-level Code Questions?

Paper • 2509.14635 • Published 12 days ago • 34

upvoted a paper 7 days ago

RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation

Paper • 2509.16198 • Published 10 days ago • 117

upvoted a paper about 1 month ago

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19 • 119

upvoted 4 papers 4 months ago

VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos

Paper • 2505.23693 • Published May 29 • 55

GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control

Paper • 2505.22421 • Published May 28 • 11

Table-R1: Inference-Time Scaling for Table Reasoning

Paper • 2505.23621 • Published May 29 • 94

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Paper • 2505.22651 • Published May 28 • 50

upvoted 3 papers 5 months ago

DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models

Paper • 2504.15716 • Published Apr 22 • 12

Towards Understanding Camera Motions in Any Video

Paper • 2504.15376 • Published Apr 21 • 159

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 93

upvoted a paper 7 months ago

SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models

Paper • 2503.07605 • Published Mar 10 • 68

upvoted 9 papers 8 months ago

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published Feb 13 • 194

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

Paper • 2502.06772 • Published Feb 10 • 22

MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents

Paper • 2502.05957 • Published Feb 9 • 16

Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding

Paper • 2502.05609 • Published Feb 8 • 18

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 59

VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation

Paper • 2502.07531 • Published Feb 11 • 14

Retrieval-augmented Large Language Models for Financial Time Series Forecasting

Paper • 2502.05878 • Published Feb 9 • 42

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 54

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11 • 44
