centra/progressive-llm

Last commit: Soma Nakamura, 6d823eb371, "Hello", 2025-07-10 22:43:31 +09:00

Progressive LLM Training Documentation

Setup

pip install -r requirements.txt

Training

Single GPU

python scripts/train_progressive.py --config config/training_config_gemma3_1b.yaml

8 GPUs

./scripts/train_gemma3_1b_8gpu.sh --strategy deepspeed

Configuration

config/training_config_gemma3_1b.yaml - Single GPU
config/training_config_gemma3_1b_8gpu_deepspeed.yaml - 8 GPUs
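
The pairing of GPU count with launch command and config file can be sketched as a small helper. This is a hypothetical convenience wrapper, not part of the repository: the name `launch_command` is an assumption, and it only reproduces the two commands and two config paths listed in this README. Which config the 8-GPU script reads internally is not stated here, so the sketch does not pass one.

```python
from __future__ import annotations

# Config files as listed in this README, keyed by GPU count.
CONFIGS = {
    1: "config/training_config_gemma3_1b.yaml",
    8: "config/training_config_gemma3_1b_8gpu_deepspeed.yaml",
}


def launch_command(num_gpus: int) -> list[str]:
    """Return the documented launch command for 1 or 8 GPUs (hypothetical helper)."""
    if num_gpus == 1:
        return ["python", "scripts/train_progressive.py", "--config", CONFIGS[1]]
    if num_gpus == 8:
        # The README documents only this invocation for the 8-GPU case.
        return ["./scripts/train_gemma3_1b_8gpu.sh", "--strategy", "deepspeed"]
    raise ValueError(f"No documented setup for {num_gpus} GPUs; expected 1 or 8")


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(" ".join(launch_command(1)))
```

The resulting list could be passed to `subprocess.run` from the repository root.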

Environment

Copy .env.example to .env and set:

HF_TOKEN - Hugging Face access token
WANDB_API_KEY - Weights & Biases (W&B) API key
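
Before a long run, it can help to confirm that both variables are actually set. The following is a minimal sketch, assuming `.env` uses plain `KEY=value` lines; how the training scripts themselves load `.env` is not described in this README, and the helper names `parse_env_file` and `missing_vars` are hypothetical.

```python
from __future__ import annotations

import os
from pathlib import Path

# The two variables this README asks you to set.
REQUIRED_VARS = ("HF_TOKEN", "WANDB_API_KEY")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: dict[str, str]) -> list[str]:
    """Return the required variables that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    file_values = parse_env_file(env_path.read_text()) if env_path.exists() else {}
    # Variables already exported in the shell take precedence over the file.
    merged = {**file_values, **os.environ}
    missing = missing_vars(merged)
    print("Missing:", ", ".join(missing) if missing else "none")
```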

Troubleshooting

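Out-of-memory errors: reduce per_device_batch_size
NCCL errors: run export NCCL_DEBUG=INFO before launching to get verbose NCCL logs
GPU status: run nvidia-smi to check that all GPUs are visible and how much memory each is using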
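
The nvidia-smi check can also be scripted. The sketch below is an illustration rather than part of the repository: it calls nvidia-smi with its CSV query options and parses the result, and the helper name `parse_gpu_csv` is hypothetical. It assumes every queried field comes back as a plain number, which may not hold on devices that report `[N/A]`.

```python
from __future__ import annotations

import shutil
import subprocess

# Standard nvidia-smi CSV query; with nounits, memory values are in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(output: str) -> list[dict]:
    """Turn nvidia-smi CSV rows into dicts, one per GPU."""
    gpus = []
    for line in output.splitlines():
        if not line.strip():
            continue
        index, name, used, total = [field.strip() for field in line.split(",")]
        gpus.append(
            {"index": int(index), "name": name, "used_mib": int(used), "total_mib": int(total)}
        )
    return gpus


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
        for gpu in parse_gpu_csv(result.stdout):
            print(f"GPU {gpu['index']} ({gpu['name']}): {gpu['used_mib']}/{gpu['total_mib']} MiB")
```

If fewer GPUs are listed than expected, or memory is already in use before training starts, that is worth resolving before changing the batch size.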