# Progressive LLM Training

Progressive training for LLMs with 8-GPU support, for the Matsuo Lab LLM Competition 2025 (松尾研LLMコンペ2025).

## Quick Start

```bash
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Set up the project
git clone <repository-url>
cd progressive-llm-training
uv sync

# Start training on a single GPU
uv run scripts/train_progressive.py --config config/training_config_gemma3_1b.yaml

# Or launch on 8 GPUs with DeepSpeed
./scripts/train_gemma3_1b_8gpu.sh --strategy deepspeed
```

## Training Stages

1. **basic_cot** - Basic reasoning
2. **math_reasoning** - Math with OpenR1-Math-220k
3. **complex_reasoning** - Complex reasoning with Mixture-of-Thoughts
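
The staged order above can be sketched in a few lines of Python. This is only an illustration, not the code in `scripts/train_progressive.py`. It assumes the stages run in the listed order and that each stage starts from the previous stage's checkpoint. The `Stage` class, `run_stages`, `fake_train`, and the `outputs/<stage>` paths are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Stage:
    name: str                # stage name as listed in this README
    dataset: Optional[str]   # dataset named in this README, or None if not stated


# Names and datasets come from the list above; no dataset is named for basic_cot.
STAGES = [
    Stage("basic_cot", None),
    Stage("math_reasoning", "OpenR1-Math-220k"),
    Stage("complex_reasoning", "Mixture-of-Thoughts"),
]


def run_stages(
    stages: List[Stage],
    train_one: Callable[[Stage, Optional[str]], str],
) -> List[str]:
    """Run stages in order, handing each stage's checkpoint to the next.

    `train_one` is a hypothetical callback: it receives the stage and the
    previous checkpoint path (None for the first stage) and returns the
    path of the checkpoint it produced.
    """
    checkpoints: List[str] = []
    previous: Optional[str] = None
    for stage in stages:
        previous = train_one(stage, previous)
        checkpoints.append(previous)
    return checkpoints


def fake_train(stage: Stage, previous: Optional[str]) -> str:
    # Stand-in for real training: it only reports the hand-off between stages.
    print(f"{stage.name}: start from {previous!r}, data={stage.dataset!r}")
    return f"outputs/{stage.name}"  # hypothetical checkpoint path
```

Calling `run_stages(STAGES, fake_train)` walks the three stages in order without training anything, which makes the hand-off between stages easy to inspect before wiring in a real training function.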

## Commands

```bash
uv sync                # Install dependencies
uv run scripts/train_progressive.py --config config/training_config_gemma3_1b.yaml  # Single GPU
./scripts/train_gemma3_1b_8gpu.sh --strategy deepspeed  # 8 GPUs
uv run pytest          # Run tests
```

## Key Files

- `config/training_config_gemma3_1b_8gpu_deepspeed.yaml` - 8-GPU config
- `scripts/train_progressive.py` - Main training script
- `scripts/train_gemma3_1b_8gpu.sh` - 8-GPU launcher
- `src/progressive_model.py` - Core model implementation

Ready to train! 🚀