This repo runs a local LLM to turn raw course-evaluation comments into a structured summary (positives, neutral/mixed, negatives, and an overall tone) using llama-cpp-python and a quantized GGUF model.
- Download a GGUF model
  - Any GGUF model should work as long as the path is correct in the code or via an environment variable.
  - Example (recommended): download a Qwen 2.5 7B Instruct GGUF from Hugging Face:
    https://huggingface.co/paultimothymooney/Qwen2.5-7B-Instruct-Q4_K_M-GGUF/tree/main
  - Either place the file at the repo root with the filename used in the script, or point to it with an environment variable (see below).
- Install the dependency
  pip install llama-cpp-python
- Prepare your inputs
  - system_prompt.txt is already included in this repo. You may customize it if you want to change the summarization style/thresholds.
  - input.txt (the user prompt) already contains some starter text. Append your course-evaluation comments at the bottom where the placeholder data lives. Keep each comment numbered, e.g.:
    Feedback 1: ...
    Feedback 2: ...
  - The script won't run if input.txt is empty or if system_prompt.txt is empty/missing.
- Run
  python run_file_io.py
  The model will load, read input.txt and system_prompt.txt, and write the summary to output.txt.
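Under the hood, the run step amounts to a chat-completion call through llama-cpp-python. A minimal sketch, assuming a chat-style model and the file names from this repo (the `build_messages` helper and the placeholder model path are illustrative, not the script's exact code):

```python
def build_messages(system_text: str, user_text: str) -> list[dict]:
    """Assemble the chat-style message list that create_chat_completion() expects."""
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]

if __name__ == "__main__":
    # Requires: pip install llama-cpp-python, plus a downloaded GGUF model file.
    from llama_cpp import Llama

    llm = Llama(model_path="your-model.gguf", n_ctx=4096)  # substitute your model path
    with open("system_prompt.txt", encoding="utf-8") as f:
        system_text = f.read()
    with open("input.txt", encoding="utf-8") as f:
        user_text = f.read()
    result = llm.create_chat_completion(
        messages=build_messages(system_text, user_text),
        max_tokens=500,  # the script caps MAX_TOKENS at 500
    )
    with open("output.txt", "w", encoding="utf-8") as f:
        f.write(result["choices"][0]["message"]["content"])
```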
- By default the script looks for a file named like a Qwen 2.5 7B Instruct GGUF in the repo root. If you use a different model or filename, either:
  - Set an environment variable before running:
    # Example
    export GGUF_PATH=/absolute/path/to/your-model.gguf
    python run_file_io.py
  - Or edit MODEL_PATH at the top of run_file_io.py.
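The path resolution described above can be sketched like this (the default filename is an assumption for illustration; check the actual MODEL_PATH value in run_file_io.py):

```python
import os

# GGUF_PATH, when set, overrides the built-in default model path.
DEFAULT_MODEL = "qwen2.5-7b-instruct-q4_k_m.gguf"  # assumed default filename
MODEL_PATH = os.environ.get("GGUF_PATH", DEFAULT_MODEL)
```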
- Optional tuning knobs (set as environment variables):
  - N_CTX (context length), N_THREADS (CPU threads), N_GPU_LAYERS (offload layers to GPU if supported)
  - MAX_TOKENS (capped at 500 in the script), LLM_TEMP, TOP_P, REPEAT_PEN
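Reading these knobs from the environment might look like the following sketch (the default values here are illustrative assumptions; only the 500-token cap on MAX_TOKENS is stated by the script):

```python
import os

def env_int(name: str, default: int) -> int:
    # Read an integer knob from the environment, falling back to a default.
    return int(os.environ.get(name, default))

def env_float(name: str, default: float) -> float:
    # Same, for floating-point sampling parameters.
    return float(os.environ.get(name, default))

N_CTX = env_int("N_CTX", 4096)             # context length (illustrative default)
N_THREADS = env_int("N_THREADS", 4)        # CPU threads
N_GPU_LAYERS = env_int("N_GPU_LAYERS", 0)  # 0 = CPU only
MAX_TOKENS = min(env_int("MAX_TOKENS", 500), 500)  # the script caps this at 500
LLM_TEMP = env_float("LLM_TEMP", 0.2)
TOP_P = env_float("TOP_P", 0.95)
REPEAT_PEN = env_float("REPEAT_PEN", 1.1)
```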
- What the script does:
  - Loads the GGUF model (path from GGUF_PATH or the default in code)
  - Reads system_prompt.txt (already included) and input.txt (your comments go at the bottom)
  - Streams the generated summary into output.txt
  - Exits with a clear message if a required file is missing or empty
- Tips:
  - Keep comments short and clear, one per line, and maintain numbering (Feedback 1, Feedback 2, ...).
  - If you edit system_prompt.txt, keep instructions concise and consistent with how your comments are formatted.
  - If you have a GPU-supported build of llama-cpp-python, try setting N_GPU_LAYERS to offload some layers.