# Progressive LLM Training Documentation

## Setup

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
```
## Training

### Single GPU

```bash
uv run scripts/train_progressive.py --config config/training_config_gemma3_1b.yaml
```

### 8 GPUs

```bash
./scripts/train_gemma3_1b_8gpu.sh --strategy deepspeed
```
## Configuration

- `config/training_config_gemma3_1b.yaml` - single GPU
- `config/training_config_gemma3_1b_8gpu_deepspeed.yaml` - 8 GPUs
## Environment

Copy `.env.example` to `.env` and set:

- `HF_TOKEN` - HuggingFace token
- `WANDB_API_KEY` - W&B API key
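The environment setup above can be sketched as a shell session. The scratch directory, the stand-in `.env.example` contents, and the `hf_dummy` token value are all illustrative placeholders, not the repo's real files or a real secret:

```shell
cd "$(mktemp -d)"                     # scratch dir so nothing real is touched
printf 'HF_TOKEN=\nWANDB_API_KEY=\n' > .env.example  # stand-in for the repo's file
cp .env.example .env
echo 'HF_TOKEN=hf_dummy' >> .env      # placeholder; put your real token here
set -a                                # auto-export every variable assigned while sourcing
. ./.env
set +a
echo "$HF_TOKEN"                      # the last assignment wins: hf_dummy
```

The `set -a` / `set +a` pair exports the variables into the current shell so that child processes such as the training scripts can read them.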
## Troubleshooting

- Reduce `per_device_batch_size` for memory issues
- `export NCCL_DEBUG=INFO` for NCCL errors
- `nvidia-smi` to check GPUs
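A minimal debugging sequence combining these steps might look like the following. The `NCCL_DEBUG_SUBSYS` setting is an optional extra (not mentioned above) to keep the logs readable, and the `nvidia-smi` call is guarded so the snippet degrades gracefully on a machine without a GPU driver:

```shell
# Hypothetical NCCL debugging session before re-running the 8-GPU script.
export NCCL_DEBUG=INFO             # verbose NCCL logs on the next launch
export NCCL_DEBUG_SUBSYS=INIT,NET  # assumption: restrict logs to init/network subsystems
# Confirm all GPUs are visible and check memory headroom before deciding
# whether per_device_batch_size needs lowering.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv
fi
```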