Simple Story Generator
1 About this project
- As they stand, large language models are something like abstract text-based terminals: they receive text input and generate text output. This project explores how to wrap model generation in a GUI: both on the front end, with the drop-downs and sliders of the application interface, and on the model side, by fine-tuning a language model to receive and use that user-supplied information.
- The application front end was quickly and easily built with Streamlit and is hosted on Hugging Face. Getting text streaming to work took some effort, but the resulting effect is quite nice; a sketch of the approach appears in section 3.
- This project runs on a 35-million-parameter (tiny) base language model, so inference is possible on Hugging Face's free CPU tier. More capable base models will naturally yield more sophisticated generation; the same training principles apply, however, and the included training script can be reused with only minor modification.
- The training script uses quantized LoRA (QLoRA) for parameter-efficient fine-tuning. Because the model is designed to receive user input and generate output from it, this is supervised fine-tuning: the dataset is rendered into a chat template and training is run with SFTTrainer. A minimal sketch of this setup follows the list.
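Below is a minimal sketch of what that training setup can look like, assuming the Hugging Face transformers/peft/trl stack. The base model id, dataset file, column names, and hyperparameters are placeholders rather than the values used in the included training script, and exact argument names vary slightly between trl versions.

```python
# Minimal QLoRA supervised fine-tuning sketch (placeholder names and hyperparameters).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

BASE_MODEL = "your-org/your-35m-base-model"  # placeholder for the tiny base model

# Load the base model in 4-bit; this is the "quantized" half of quantized LoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Render each example with the tokenizer's chat template: the user turn carries the
# structured options supplied by the GUI, the assistant turn carries the target story.
def to_chat(example):
    messages = [
        {"role": "user", "content": example["prompt"]},       # e.g. genre/character/setting
        {"role": "assistant", "content": example["story"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = load_dataset("json", data_files="stories.json", split="train").map(to_chat)

# LoRA adapters on the linear layers keep the number of trainable parameters small.
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules="all-linear", task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,  # SFTTrainer picks up the "text" column by default
    peft_config=peft_config,
    args=SFTConfig(output_dir="story-lora",
                   num_train_epochs=3,
                   per_device_train_batch_size=8),
)
trainer.train()
```

The resulting LoRA adapter can then be merged into the base weights or loaded alongside them at inference time.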
2 Training notebook
View the training notebook here, or open it on GitHub or Colab.
3 Streamlit application
The application front end was built with Streamlit. View the code on GitHub. It is also hosted as a standalone application here and here.
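As a rough illustration of how the streaming effect can be wired up, here is a minimal sketch assuming a transformers model served through Streamlit. The model id, widget labels, and prompt format are illustrative placeholders, not the app's actual code; the key pieces are TextIteratorStreamer, which yields decoded text while generate() runs in a background thread, and st.write_stream, which renders that iterator incrementally.

```python
# Minimal Streamlit streaming sketch (placeholder model id, widgets, and prompt format).
from threading import Thread

import streamlit as st
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

MODEL_ID = "your-org/your-fine-tuned-story-model"  # placeholder

@st.cache_resource  # load the model once per server process, not on every rerun
def load_model():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)  # small enough for CPU
    return tokenizer, model

tokenizer, model = load_model()

# Drop-downs and sliders supply the structured input the model was fine-tuned on.
genre = st.selectbox("Genre", ["adventure", "mystery", "fairy tale"])
length = st.slider("Approximate length (words)", 50, 300, 150)

if st.button("Generate"):
    prompt = f"Write a {genre} story of about {length} words."  # illustrative prompt format
    inputs = tokenizer(prompt, return_tensors="pt")

    # TextIteratorStreamer yields decoded text as generate() produces tokens,
    # so generation runs in a background thread while the UI consumes the stream.
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    Thread(target=model.generate,
           kwargs=dict(**inputs, streamer=streamer, max_new_tokens=400)).start()

    st.write_stream(streamer)  # renders tokens incrementally as they arrive
```

Caching the model with st.cache_resource avoids reloading it on every widget interaction, which matters on the free CPU tier.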