
Progressive LLM Training

Progressive training for LLMs, with 8-GPU support, built for the Matsuo Lab LLM Competition 2025 (松尾研LLMコンペ2025).

Quick Start

# Setup project
git clone <repository-url>
cd progressive-llm-training

# Install dependencies
pip install -r requirements.txt

# Start training on a single GPU
python scripts/train_progressive.py --config config/training_config_gemma3_1b.yaml

# Or launch on 8 GPUs with DeepSpeed
./scripts/train_gemma3_1b_8gpu.sh --strategy deepspeed

Training Stages

  1. basic_cot - Basic chain-of-thought (CoT) reasoning
  2. math_reasoning - Mathematical reasoning on the OpenR1-Math-220k dataset
  3. complex_reasoning - Complex reasoning on the Mixture-of-Thoughts dataset
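
The stage ordering above can be sketched as a simple loop. This is a minimal illustration, not the code in scripts/train_progressive.py: it assumes each stage resumes from the checkpoint produced by the previous one, and the names `Stage`, `run_progressive`, and `fake_train_stage` are hypothetical. Only the three stage names come from this README.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Stage:
    """One progressive training stage (hypothetical container)."""
    name: str
    description: str


# Stage names as listed in this README, in training order.
STAGES = [
    Stage("basic_cot", "Basic reasoning"),
    Stage("math_reasoning", "Math with OpenR1-Math-220k"),
    Stage("complex_reasoning", "Complex reasoning with Mixture-of-Thoughts"),
]


def run_progressive(
    stages: List[Stage],
    train_stage: Callable[[Stage, Optional[str]], str],
    start: Optional[str] = None,
) -> List[str]:
    """Run stages in order, handing each stage's checkpoint to the next.

    Assumption: later stages build on earlier ones. The real project may
    chain stages differently.
    """
    checkpoint = start
    history = []
    for stage in stages:
        checkpoint = train_stage(stage, checkpoint)
        history.append(checkpoint)
    return history


def fake_train_stage(stage: Stage, previous: Optional[str]) -> str:
    """Placeholder trainer: does no training, only returns a checkpoint label."""
    return f"{previous}->{stage.name}" if previous else stage.name


if __name__ == "__main__":
    for ckpt in run_progressive(STAGES, fake_train_stage):
        print(ckpt)
```

Passing the trainer in as a callable keeps the ordering logic separate from the training itself, so a real trainer could replace `fake_train_stage` without touching the loop.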

Commands

pip install -r requirements.txt                                                     # Install dependencies
python scripts/train_progressive.py --config config/training_config_gemma3_1b.yaml  # Single GPU
./scripts/train_gemma3_1b_8gpu.sh --strategy deepspeed                              # 8 GPUs
pytest                                                                              # Run tests

Key Files

  • config/training_config_gemma3_1b_8gpu_deepspeed.yaml - 8-GPU config
  • scripts/train_progressive.py - Main training script
  • scripts/train_gemma3_1b_8gpu.sh - 8-GPU launcher
  • src/progressive_model.py - Core model implementation

Ready to train! 🚀