Secrets access
To access the Gemma 3 model from Keras Hub, you will need to provide your Kaggle username and API key. The Colab web UI has a user data feature to store and access secrets without hardcoding them in the notebook; however, this is not yet supported in the VS Code extension.
My workaround for secrets is to create a widget that lets the user upload an environment variables file, which the notebook then processes.
The input environment variables file has the format:
KAGGLE_USERNAME='myuser'
KAGGLE_KEY='a1b2c3d4e5f6g7h8i9j0'
This is a more secure option than copying and pasting the secrets directly into the notebook's cells.
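Below is a minimal sketch of what that upload-and-parse step could look like, using an ipywidgets FileUpload control. The widget choice and the load_env_from_upload helper are my own assumptions for illustration, not the article's exact code:

```python
import os
import ipywidgets as widgets

# File-upload widget for the environment variables file.
uploader = widgets.FileUpload(multiple=False)
display(uploader)

def load_env_from_upload(upload_widget):
    """Hypothetical helper: parse KEY='value' lines into os.environ."""
    # Assumes ipywidgets >= 8, where .value is a tuple of dicts with a 'content' memoryview.
    content = upload_widget.value[0]["content"].tobytes().decode("utf-8")
    for line in content.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        os.environ[key.strip()] = value.strip().strip("'\"")

# Run this in a later cell, after the file has been uploaded.
load_env_from_upload(uploader)
print("Kaggle credentials loaded:", "KAGGLE_USERNAME" in os.environ and "KAGGLE_KEY" in os.environ)
```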
Configuring LoRA tuning
There are various settings you can tune and experiment with in this example, depending on your Colab subscription plan (a configuration sketch follows the list below).
- Keras backend: The underlying deep learning engine that performs the heavy computation, model compilation, and execution. The three options are TensorFlow, JAX, and PyTorch; TensorFlow is the most mature, while JAX focuses on high-performance training.
- Number of examples used from the dataset: Out of the 15k rows of examples, I am using only 1,000 for tuning. You can use more, but it will increase your fine-tuning time.
- Rank: Adjusts the number of trainable parameters. Start small for better efficiency, then gradually increase it in future trials to see whether it yields better results.
- Epochs: The number of times the learning algorithm works through the entire dataset. More epochs generally lead to better learning, but too many can cause overfitting.
- Batch size: The number of data samples loaded at once. Training and tuning are memory-intensive, so you want to load samples from your dataset in small batches. If you are on the free tier, you may run into out-of-memory errors if you set the batch size too high.
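Putting those settings together, a configuration cell might look roughly like the sketch below. The preset name, optimizer settings, and the train_data variable are assumptions for illustration rather than values from this article, so check the Keras Hub Gemma 3 model page for the exact preset ID:

```python
import os

# Select the Keras backend before importing keras ("tensorflow", "jax", or "torch").
os.environ["KERAS_BACKEND"] = "jax"

import keras
import keras_hub

# Tunable settings discussed above (values here are illustrative).
NUM_EXAMPLES = 1000   # subset of the ~15k-row dataset
LORA_RANK = 4         # start small, increase in later trials
EPOCHS = 1
BATCH_SIZE = 1        # keep low on the free tier to avoid out-of-memory errors

# Requires a recent keras-hub release with Gemma 3 support; the preset name is an assumption.
gemma_lm = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_instruct_1b")

# Enable LoRA on the backbone so only the low-rank adapter weights are trainable.
gemma_lm.backbone.enable_lora(rank=LORA_RANK)

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.AdamW(learning_rate=5e-5, weight_decay=0.01),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

# `train_data` is assumed to be a list of formatted prompt/response strings.
gemma_lm.fit(train_data[:NUM_EXAMPLES], epochs=EPOCHS, batch_size=BATCH_SIZE)
```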
Training metrics for comparison
The TPU training results were decent, but a direct performance comparison with the NVIDIA GPUs is difficult because I used the TensorFlow backend for the TPUs and the JAX backend for the GPUs.
With the CPU-only Colab server, I stopped the training after 50 minutes as it had only gone through 1/10 of the dataset samples and I wasn’t about to wait another 8 hours for it to complete.
The NVIDIA A100 was blazingly fast. The NVIDIA T4 also performed quite well, and it is part of the Google Colab free-tier offering.
Tuning results
Even though the LoRA tuning parameters were specifically chosen to drastically reduce the number of trainable parameters for faster training, you can already see the results. After only one epoch, the loss was already around 0.75, which is quite good and demonstrates how effective LoRA is at quickly adapting the model:
Source Credit: https://medium.com/google-cloud/local-code-meets-cloud-compute-using-google-colab-in-vs-code-206ff69483f4
