Masculinity Bot

Training an LLM on what it means to be a man

Python, Scrapy, spaCy, GPT-2, Hugging Face Transformers, GQ, Unsplash API

I made a generative art bot to tell me what masculinity is by fine-tuning GPT-2 on 8,000 articles scraped from GQ magazine.

What does it mean to be a man? I examined masculinity through machine learning and generative art.

every man should always have a beard
men should be ashamed of their bodies, right?

Creating the data set

Most men's magazines cover similar topics: health, style, gear, entertainment, news, grooming, parenting, and relationships. I narrowed the list down to the topics that appeared in more than one magazine, then used the Python library Scrapy to crawl the GQ website. For each article, I gathered the topic, title, date, and full text.
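A minimal sketch of how the topic-narrowing step could look. The topic sets below are made-up placeholders, not the magazines' real navigation, and `shared_topics` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical topic lists for illustration only; the real site sections may differ.
topics_by_magazine = {
    "GQ": {"style", "grooming", "entertainment", "news"},
    "Men's Health": {"health", "style", "grooming", "relationships"},
    "Men's Journal": {"gear", "health", "entertainment", "style"},
}


def shared_topics(topics_by_source, min_sources=2):
    """Return the topics that appear in at least `min_sources` sources, sorted."""
    counts = Counter(
        topic for topics in topics_by_source.values() for topic in topics
    )
    return sorted(topic for topic, n in counts.items() if n >= min_sources)


print(shared_topics(topics_by_magazine))
# ['entertainment', 'grooming', 'health', 'style']
```

Raising `min_sources` to 3 would keep only the topics that every one of the three sample magazines covers.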

My original idea for this project was to scrape more than just GQ. I intended to include Men's Health and Men's Journal as well, but I was not able to scrape those two. GQ, however, let me scrape about 8,000 articles in one go.

Natural Language Processing

To create a dataset for training, I had to clean up the scraped articles. I removed all the sponsored content, which was conveniently tagged as a topic. I then used the Python NLP library spaCy to process all of the article texts. Virgil Abloh and John Galliano are both mentioned a lot; GQ is primarily a style magazine, after all.

I kept only full sentences and ended up with about 6,000 cleaned-up strings. This dataset lives in my Hugging Face repository.
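A cleaning pass along these lines might look like the sketch below. It is an illustration under assumptions, not the project's actual code: the `"topic"` and `"text"` keys, the `"sponsored"` tag value, the `en_core_web_sm` model, and the full-sentence heuristic are all guesses.

```python
def drop_sponsored(articles, sponsored_topic="sponsored"):
    """Keep only articles whose topic tag is not the (assumed) sponsored tag."""
    return [
        a for a in articles
        if a["topic"].strip().lower() != sponsored_topic
    ]


def looks_like_full_sentence(text, min_words=4):
    """Rough heuristic: enough words, a leading capital, closing punctuation."""
    text = text.strip()
    return (
        len(text.split()) >= min_words
        and text[0].isupper()
        and text[-1] in ".!?"
    )


def split_sentences(texts, model="en_core_web_sm"):
    """Yield sentence strings from article texts using spaCy's segmentation."""
    import spacy  # imported here so the helpers above work without spaCy

    nlp = spacy.load(model)
    for doc in nlp.pipe(texts):
        for sent in doc.sents:
            yield sent.text.strip()


# Hypothetical scraped records, for illustration only.
sample = [
    {"topic": "Style", "text": "A good coat lasts for years. Buy once."},
    {"topic": "Sponsored", "text": "This post is paid for by a brand."},
]
kept = drop_sponsored(sample)
# With spaCy and the model installed, the sentence dataset could be built as:
# sentences = [s for s in split_sentences(a["text"] for a in kept)
#              if looks_like_full_sentence(s)]
```

The heuristic is deliberately strict, so short fragments such as "Buy once." are dropped rather than passed on to training.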

Hugging Face Transformers

To create a text generation pipeline, I wrote about 40 prompts for the LLM to complete by predicting the next words. They include phrases like:

"Every man should"

"Being a dad means"

"A man is"

"A real man"

and a few dozen more.

I used the Hugging Face Transformers framework to fine-tune GPT-2, training it for 5 epochs on my dataset of sentences. Then I fed my prompts to the text generation pipeline, which produced over 1,000 statements on what it means to be a man.
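For readers who want to try something similar, here is a hedged sketch of fine-tuning and prompting with Transformers. Only GPT-2 and the 5 epochs come from the write-up; the output directory name, `max_length`, `max_new_tokens`, the number of samples per prompt, and the `tidy_statement` post-processing are assumptions.

```python
def tidy_statement(generated_text):
    """Keep the first line, cut at the first sentence break, and lowercase."""
    stripped = generated_text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    sentence = first_line.split(". ")[0].rstrip(".")
    return " ".join(sentence.split()).lower()


def fine_tune(sentences, output_dir="masculinity-gpt2", epochs=5):
    """Fine-tune GPT-2 on a list of sentences with the Trainer API."""
    from datasets import Dataset
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    dataset = Dataset.from_dict({"text": list(sentences)})
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=64),
        batched=True,
        remove_columns=["text"],
    )
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=epochs),
        train_dataset=tokenized,
        # mlm=False selects causal (next-token) language modeling
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)


def generate_statements(prompts, model_dir="masculinity-gpt2", per_prompt=30):
    """Sample several completions per prompt from the fine-tuned model."""
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_dir)
    statements = []
    for prompt in prompts:
        outputs = generator(
            prompt,
            max_new_tokens=20,
            do_sample=True,
            num_return_sequences=per_prompt,
        )
        statements.extend(tidy_statement(o["generated_text"]) for o in outputs)
    return statements
```

With roughly 40 prompts, sampling 30 completions each would yield about 1,200 statements, which is in the range described above.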

Image Generation Script

I wrote a Python script that generates each image for social media in four steps:

  1. Pull a random image of “man in nature” via the Unsplash API
  2. Manipulate the photo using Pillow
  3. Stylize with an ASCII art effect using DrawBot
  4. Stylize and add generated text using DrawBot
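The steps above can be roughed out as follows. This is a sketch under assumptions, not the project's script: it builds the ASCII effect in plain Python on top of Pillow rather than in DrawBot, and the character ramp, the `regular` image size, the output width, and every function name are illustrative choices.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ASCII_RAMP = "@%#*+=-:. "  # darkest to lightest; an arbitrary choice


def random_photo_request(query, access_key):
    """Build a request for Unsplash's random-photo endpoint."""
    url = "https://api.unsplash.com/photos/random?" + urlencode({"query": query})
    return Request(url, headers={"Authorization": f"Client-ID {access_key}"})


def fetch_photo_url(query, access_key):
    """Return the URL of one random photo (needs network and a valid key)."""
    with urlopen(random_photo_request(query, access_key)) as resp:
        return json.load(resp)["urls"]["regular"]


def image_to_rows(path, width=80):
    """Load an image with Pillow and return rows of 0-255 brightness values."""
    from PIL import Image  # imported here so to_ascii works without Pillow

    img = Image.open(path).convert("L")  # "L" is 8-bit grayscale
    # Halve the height because characters are taller than they are wide.
    height = max(1, int(img.height * width / img.width * 0.5))
    img = img.resize((width, height))
    return [[img.getpixel((x, y)) for x in range(width)] for y in range(height)]


def to_ascii(rows, ramp=ASCII_RAMP):
    """Map each brightness value to a character from the ramp."""
    last = len(ramp) - 1
    return ["".join(ramp[value * last // 255] for value in row) for row in rows]


# Example with a tiny hand-made gradient instead of a real photo:
print("\n".join(to_ascii([[0, 64, 128, 192, 255]])))
```

Keeping the brightness-to-character mapping separate from image loading makes the effect easy to try on small hand-made grids before running it on a downloaded photo.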

Automation

I had the bot running on Twitter, but decided to take it down in favour of figuring out how to get it working on Instagram. It's not currently running but will be back in action soon.

the best dads are probably the ones who know how to get a good haircut
a real man would say, you need to get into the gym
a man needs more clothes
every dad should never have to worry about his kids having too many bags on their shoulders
it's time to man up and get out the door
all men including the majority of the men who work in the entertainment industry are more likely to date
being a father means to your kids that you be there for them
hot guys are always having fun

This was my final project for INFO 697 - How to Build a Bot - with Professor Filipa Calado at Pratt Institute, 2025.
