
107 59

Felix Tuma

floom


AI & ML interests

NLP

Recent Activity

updated a collection 8 days ago

PotentialApplication

updated a collection 12 days ago

PotentialApplication

Organizations

None yet

upvoted a paper about 17 hours ago

TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them

Paper • 2509.21117 • Published 5 days ago • 25

upvoted a paper 16 days ago

MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML

Paper • 2509.06806 • Published 22 days ago • 63

upvoted a paper 19 days ago

Language Self-Play For Data-Free Training

Paper • 2509.07414 • Published 21 days ago • 28

upvoted 4 papers about 1 month ago

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Paper • 2508.19229 • Published Aug 26 • 20

Hermes 4 Technical Report

Paper • 2508.18255 • Published Aug 25 • 36

Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19 • 48

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6 • 127

upvoted 7 papers about 2 months ago

AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators

Paper • 2508.09101 • Published Aug 12 • 8

Can LLM-Generated Textual Explanations Enhance Model Classification Performance? An Empirical Study

Paper • 2508.09776 • Published Aug 13 • 3

Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

Paper • 2508.09968 • Published Aug 13 • 15

Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models

Paper • 2508.09138 • Published Aug 12 • 36

Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL

Paper • 2508.07976 • Published Aug 11 • 49

LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?

Paper • 2508.01780 • Published Aug 3 • 18

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 236

upvoted 5 papers 2 months ago

Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21 • 64

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 299

Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning

Paper • 2507.16784 • Published Jul 22 • 119

Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR

Paper • 2507.15778 • Published Jul 21 • 20

Seq vs Seq: An Open Suite of Paired Encoders and Decoders

Paper • 2507.11412 • Published Jul 15 • 28

upvoted a paper 3 months ago

Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

Paper • 2507.07095 • Published Jul 9 • 54
