Training with Jobs

Hugging Face Jobs lets you run training scripts on fully managed infrastructure—no need to manage GPUs or local environment setup.

In this guide, you’ll learn how to:

  • Use TRL Jobs to easily run pre-optimized TRL training
  • Run any TRL training script with uv scripts

For general details about Hugging Face Jobs (hardware selection, job monitoring, etc.), see the Jobs documentation.

Requirements

To train with Jobs, you need:

  • A Hugging Face account on a plan that includes Jobs (Pro, Team, or Enterprise).
  • A Hugging Face token with write access to the Hub, so the trained model can be pushed when training finishes.

Using TRL Jobs

TRL Jobs is a high-level wrapper around Hugging Face Jobs and TRL that streamlines training. It provides optimized default configurations so you can start quickly without manually tuning parameters.

Example:

pip install trl-jobs
trl-jobs sft --model_name Qwen/Qwen3-0.6B --dataset_name trl-lib/Capybara

TRL Jobs supports everything covered in this guide, with additional optimizations to simplify workflows.

Using uv Scripts

For more control, you can run your own training scripts on Hugging Face Jobs as uv scripts.

Create a Python script (e.g., train.py) containing your training code:

from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset,
)
trainer.train()
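# Push the trained model to the Hub: the Jobs environment is ephemeral,
# so anything that is not pushed is lost when the job ends.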
trainer.push_to_hub("Qwen2.5-0.5B-SFT")

Launch the job using either the hf jobs CLI or the Python API:

hf jobs uv run \
    --flavor a100-large \
    --with trl \
    --secrets HF_TOKEN \
    train.py
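
For reference, the same launch from Python might look like the following sketch. It assumes the run_uv_job helper from huggingface_hub; check your installed huggingface_hub version for the exact signature, in particular how secrets are passed.

import os

from huggingface_hub import run_uv_job

# Sketch only: assumes huggingface_hub's `run_uv_job` helper.
run_uv_job(
    "train.py",
    dependencies=["trl"],                          # plays the role of --with trl
    flavor="a100-large",                           # plays the role of --flavor a100-large
    secrets={"HF_TOKEN": os.environ["HF_TOKEN"]},  # plays the role of --secrets HF_TOKEN
)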

To run successfully, the script needs:

  • TRL installed: Use the --with trl flag or the dependencies argument. uv installs these dependencies automatically before running the script.
  • An authentication token: Required to push the trained model (or perform other authenticated operations). Provide it with the --secrets HF_TOKEN flag or the secrets argument.

When training with Jobs, be sure to:

  • Set a sufficient timeout. Jobs time out after 30 minutes by default. If your job exceeds the timeout, it will fail and all progress will be lost. See Setting a custom timeout, or the sketch after this list.
  • Push the model to the Hub. The Jobs environment is ephemeral—files are deleted when the job ends. If you don’t push the model, it will be lost.
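
If you launch from Python, a longer timeout can be requested at submission time. The following is a hedged sketch that assumes run_uv_job accepts a timeout argument; check the Jobs documentation for the exact name and supported values.

import os

from huggingface_hub import run_uv_job

# Sketch only: assumes `run_uv_job` accepts a `timeout` argument.
run_uv_job(
    "train.py",
    dependencies=["trl"],
    flavor="a100-large",
    secrets={"HF_TOKEN": os.environ["HF_TOKEN"]},
    timeout="2h",  # assumption: a duration string; the default limit is 30 minutes
)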

You can also run a script directly from a URL:

hf jobs uv run \
    --flavor a100-large \
    --with trl \
    --secrets HF_TOKEN \
    "https://gist.githubusercontent.com/qgallouedec/eb6a7d20bd7d56f9c440c3c8c56d2307/raw/69fd78a179e19af115e4a54a1cdedd2a6c237f2f/train.py"

To make a script self-contained, declare dependencies at the top:

# /// script
# dependencies = [
#     "trl",
#     "peft",
# ]
# ///

from datasets import load_dataset
from peft import LoraConfig
from trl import SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset,
    peft_config=LoraConfig(),
)
trainer.train()
trainer.push_to_hub("Qwen2.5-0.5B-SFT")

You can then run the script without specifying dependencies:

hf jobs uv run \
    --flavor a100-large \
    --secrets HF_TOKEN \
    train.py

TRL example scripts are fully uv-compatible, so you can run a complete training workflow directly on Jobs. You can customize training with standard script arguments plus hardware and secrets:

hf jobs uv run \
    --flavor a100-large \
    --secrets HF_TOKEN \
    https://raw.githubusercontent.com/huggingface/trl/refs/heads/main/examples/scripts/prm.py \
    --model_name_or_path Qwen/Qwen2-0.5B-Instruct \
    --dataset_name trl-lib/prm800k \
    --output_dir Qwen2-0.5B-Reward \
    --push_to_hub

See the full list of examples in Maintained examples.

Docker Images

An up-to-date Docker image with all TRL dependencies is available at huggingface/trl and can be used directly with Hugging Face Jobs:

hf jobs uv run \
    --flavor a100-large \
    --secrets HF_TOKEN \
    --image huggingface/trl \
    train.py

Each job runs in a Docker image from Hugging Face Spaces or Docker Hub, so you can also specify any custom image:

hf jobs uv run \
    --flavor a100-large \
    --secrets HF_TOKEN \
    --image <docker-image> \
    train.py