progressive-llm/docs
2025-07-10 22:25:11 +09:00
..
README.md こんにちは 2025-07-10 22:25:11 +09:00

Progressive LLM Training Documentation

Setup

curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync

Training

Single GPU

uv run scripts/train_progressive.py --config config/training_config_gemma3_1b.yaml

8 GPUs

./scripts/train_gemma3_1b_8gpu.sh --strategy deepspeed

Configuration

  • config/training_config_gemma3_1b.yaml - Single GPU
  • config/training_config_gemma3_1b_8gpu_deepspeed.yaml - 8 GPUs

Environment

Copy .env.example to .env and set:

  • HF_TOKEN - HuggingFace token
  • WANDB_API_KEY - W&B API key

Troubleshooting

  • Reduce per_device_batch_size for memory issues
  • export NCCL_DEBUG=INFO for NCCL errors
  • nvidia-smi to check GPUs