Programming Assignment 2: GPT-2 Text Generation
- Due Nov 5, 2019 by 11:59pm
- Points 33
- Submitting a file upload
- File Types pdf
For the second programming assignment we will be learning about text generation using modern language models, specifically GPT-2. To do this you will be using a Colab notebook that has already been set up to facilitate fine-tuning and generating from GPT-2.
The parts of the assignment are as follows:
- Identify a corpus of text that exemplifies a kind of text or text style that you want to generate. Briefly describe the text type or style, and provide a pointer to where you got your corpus. For example, when I was experimenting with GPT-2, I decided to generate text in the style of H.P. Lovecraft's weird fiction. I found a corpus of Lovecraft stories online, which I downloaded and concatenated to make my data set for fine-tuning. But the texts don't have to be stories. You might try more structured texts, like recipes, or card descriptions (like the Magic: The Gathering example we looked at in class), or whatever your creativity (and text searching and scraping skills) can dream up.
- Provide five examples of generated text (you pick a reasonable length), one for each of five different prompts, generated using your fine-tuned model. Also provide generated examples for the same prompts from the untuned 774M (large) model. This will demonstrate the effect of tuning. For example, when I used Lovecraftian prompts with the untuned large model, the resulting texts were much less "Lovecrafty" than those from a tuned version of the smallest (124M) model. In both cases, list the generation parameters you used.
- Search for your generated strings (or subsets of the strings, like sentences) in your corpus to see if the model is plagiarizing text from the corpus. This can happen if your tuning corpus isn't big enough, or if your generation parameters are too restrictive (e.g. your temperature or top-k is too low).
- Finally, provide a half-page to page-long description of a playable experience you could create using GPT-2 as a fundamental part of the design. For example, in the gameplay loop the player(s) might influence the generation prompt in some way, or you might envision some kind of playful writing tool, or something used to facilitate play of some other game (some kind of word game/GPT-2 hybrid, or something used to facilitate role playing). These are just a few examples to get your creativity flowing on how GPT-2 could be used in an AI-based design.
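The corpus search described in the third part can be automated with a few lines of Python. The following is only a minimal sketch, not part of the provided notebook: the function names, the `min_words` threshold, and the toy strings are all hypothetical, and the sentence splitter is deliberately naive. It checks whether any sufficiently long sentence of a generated text appears verbatim in the corpus after lowercasing and collapsing whitespace.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so line breaks don't hide matches."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def find_copied_sentences(generated, corpus, min_words=5):
    """Return sentences of `generated` that occur verbatim in `corpus`.

    Sentences shorter than `min_words` words are skipped, since very short
    sentences can match by coincidence. (The threshold is an assumption.)
    """
    corpus_norm = normalize(corpus)
    copied = []
    for sentence in split_sentences(generated):
        if len(sentence.split()) < min_words:
            continue
        if normalize(sentence) in corpus_norm:
            copied.append(sentence)
    return copied


# Toy data for illustration only; replace with your own corpus and samples.
corpus = (
    "The old house stood at the edge of the marsh,\n"
    "and nobody in the village would speak of it. The wind was cold."
)
generated = (
    "The old house stood at the edge of the marsh, and nobody in the "
    "village would speak of it. A pale light moved behind the shutters."
)

copied = find_copied_sentences(generated, corpus)
for sentence in copied:
    print("Found in corpus:", sentence)
```

In practice you would read the corpus from the file you used for fine-tuning and run each generated sample through `find_copied_sentences`. An exact-match check like this misses near-copies with small word changes, so it is worth spot-checking suspicious passages by hand as well.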
Submit the assignment as a PDF.
Remember that you can work in pairs on this project. But if you do work in pairs, make sure to include both names on the PDF, and both partners should submit the assignment (so I have an assignment from each person).