I Prototyped a Tool to Detect AI Slop on YouTube Kids Videos
Remember Elsagate? In 2017, YouTube struggled with disturbing videos that hid behind familiar characters like Elsa and Peppa Pig to lure children into watching violence and fetish content (BBC). In response, YouTube changed its algorithms and updated its “Kids Quality Principles.”
AI has made the problem exponentially worse.
“Educational” slop is flooding YouTube. It’s pedagogically bankrupt, downright dangerous, and a disservice to kids, parents, and teachers.
A taste of what’s happening:
My Little Pony e Jesus (Crianças com Jesus BR): A Portuguese song about My Little Ponies and Jesus. The ponies definitely aren’t part of the original Bible stories.
Learn ABCs at Breakfast (Lala Loony TV): A Pixar-looking fever dream of AI hallucinations, including a baby biting an apple and then drooling a red, blood-like liquid.
Teacher Shan teaches the Full Alphabet (Fantastic Fun Times): A photorealistic human teacher eats a raw elderberry (elderberries are poisonous unless cooked). She also takes a bite of an unpeeled melon.
I’ve also written about AI “educational” slop here, here, and here.
YouTube is taking action against AI slop, but that action is reactive and slow, because creators upload over 500 hours of video to YouTube every minute.
Humans can’t keep up with reviewing this content. They couldn’t even *before* AI slop hit the scene.
But reactive is too late if a child watches and imitates a dangerous behavior.
AI-facilitated moderation holds real promise to help solve this crisis. If the models can watch a video and actually evaluate it against the same criteria a human moderator or expert would use, we could shift from reactive moderation (report it after your kid already watched it) to proactive screening.
So I built my own tool to detect AI slop.
The prototype totally works. Not perfectly, but it’s a great place to start a discussion.
I’m Dr. Carla Engelbrecht. I’ve spent 25 years building education and entertainment media at Sesame Street, Netflix, PBS Kids, and more. Now I work at the intersection of AI and early childhood education to help families and educators get more out of the tools.
The AI Slop Evaluation Framework
Before building anything, I defined what “good” and “bad” actually mean for children’s content.
I asked Claude Opus 4.6 to build an evaluation rubric grounded in YouTube’s own Best Practices for Kids & Family Content plus AI hallucination risks.
1: Imitable Danger. Not “rate dangerous behavior 0-3” but: list every action in the video. For each one, answer: what happens if a 2-year-old copies it? The tool has to inventory every food item eaten, every physical activity shown, every environmental setup depicted. A character eating raw elderberries can’t slip through when the model is required to write down “character eats small berries from bush” and then evaluate whether that’s safe.
2: Cognitive Accuracy. Not “is the content educational?” but: test 15 nouns. Count every item yourself. Check every label. Verify every fact. When the audio says “apple,” is there an apple on screen? When the song says “four,” are there actually four objects? The Point Test, counting verification, and label checks are all mandatory systematic tests. The model can’t hand-wave past them.
3: AI Visual Fidelity. Not “are there AI artifacts?” but: count the limbs. Check the fingers. Read the text on screen. Does gravity work? Specific, verifiable checks that catch five-legged dogs and gibberish text that a toddler would absorb as normal.
The framework is also calibrated for skepticism. A 3 out of 5 means “I found nothing dangerous.” That’s the baseline, not a compliment. Scores of 4 or 5 require active evidence that the content promotes safety. And the overall rating is based on the lowest module score, not the average. A video that’s perfectly accurate but shows a toddler climbing a bookshelf unsupervised still gets flagged. One severe finding overrides everything else.
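The lowest-score-wins calibration is simple to sketch. Here’s a minimal illustration of the logic described above (the module names and the severity-override rule are my own shorthand, not the tool’s actual code):

```python
# Sketch of the "lowest module score wins" calibration described above.
# Module names and the severity override are illustrative assumptions.

def overall_rating(module_scores: dict[str, int], severe_findings: list[str]) -> int:
    """Overall rating is the minimum module score, never the average.
    Any severe finding caps the rating regardless of other scores."""
    if severe_findings:
        return 1
    return min(module_scores.values())

scores = {"imitable_danger": 5, "cognitive_accuracy": 4, "visual_fidelity": 3}
print(overall_rating(scores, []))  # 3: the weakest module drags everything down
print(overall_rating(scores, ["toddler climbs bookshelf unsupervised"]))  # 1
```

The key design choice: averaging would let a video “buy back” a dangerous scene with good counting accuracy, and that is exactly what this framework refuses to allow.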
I also had to solve for the chronic issue that AI evaluators are too nice. They want to balance every criticism with praise. “While there are some concerns about the elderberry scene, the video does a wonderful job with color recognition.” No. The rubric explicitly instructs the model: Do not lead with what the video does well. Start with problems. Do not balance criticism with praise. This is a safety audit, not a review.
Version 1: The Short-Lived Gemini Chat Experiment
Since Gemini has the ability to view videos built right into the Gemini chat interface, I initially tried the simplest approach: just paste a YouTube link into Gemini and ask it to evaluate the video.
I tried this multiple times. The results were... inconsistent. Sometimes Gemini would analyze the video thoughtfully. Other times it would error out, claim it couldn’t access the video, or produce a surface-level analysis that missed the structural problems entirely. The chat interface isn’t designed for systematic, rubric-driven evaluation (yet). It’s a general-purpose conversation tool. I needed something more reliable.
Version 2: The Custom Code AI Slop Detector
Note: This tool is a proof of concept demonstrating that AI models are capable of identifying slop. The goal is to show that proactive screening is possible, which would allow platforms to build this natively using their own infrastructure.
Workflow
The AI Slop Detector is a Streamlit web application backed by a Python evaluation pipeline. Here’s how it works.
Step 1: Enter YouTube URL
Paste a YouTube video URL into the ingestion page. The tool downloads the video using yt-dlp and fetches all available metadata (title, channel, view count, duration, hashtags, captions, thumbnail).
Step 2: Extract Frames + Transcribe
The pipeline extracts video frames at configurable intervals (default: 1 frame per second) using OpenCV, then runs the audio through OpenAI’s Whisper model locally for transcription. For a typical 2-minute children’s video, that’s roughly 120 frames and a complete timestamped transcript.
This is the key architectural decision: rather than trying to send a full video to an AI model (which most APIs don’t support well yet), we decompose the video into its visual and audio components (frames and transcript) and send those together. The model sees what a child sees, frame by frame, alongside what they hear.
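That decomposition step can be sketched in a few lines. The pure sampling math is separated from the OpenCV decoding loop so the core logic is clear (this is my own illustration of the approach, not the tool’s exact code):

```python
def sample_indices(total_frames: int, video_fps: float, interval_s: float = 1.0) -> list[int]:
    """Frame indices to keep when sampling one frame every `interval_s` seconds."""
    step = max(1, round(video_fps * interval_s))
    return list(range(0, total_frames, step))

def extract_frames(path: str, interval_s: float = 1.0):
    """Decode the video and yield (timestamp_seconds, frame) pairs.
    Requires OpenCV (pip install opencv-python)."""
    import cv2  # imported lazily so the sampling math above stays dependency-free
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    for idx in sample_indices(total, fps, interval_s):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            yield idx / fps, frame
    cap.release()

# A 2-minute video at 30 fps yields about 120 sampled frames:
print(len(sample_indices(total_frames=3600, video_fps=30.0)))  # 120
```

The transcript side is similar: the audio goes through Whisper locally (roughly `whisper.load_model("medium").transcribe(audio_path)`), producing the timestamped text that gets paired with these frames.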
Step 3: Evaluate Against the Rubric
The frames and transcript get sent to an AI evaluator along with a detailed rubric prompt. The model analyzes a frame per second, checks whether visuals match narration, and produces a structured evaluation with scores, evidence, and recommendations.
The tool supports multiple evaluator models, including:
Gemini (3 Flash or 3 Pro)
Claude (Opus 4.6)
Ollama for fully offline evaluation using open models like LLaVA. Uses a batch-then-synthesize approach since local models have smaller context windows. (This was fun to try, but ultimately Gemini and Claude were worth the API calls.)
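The batch-then-synthesize workaround for small context windows looks roughly like this (a toy sketch; the function names and batch size are my own illustration, with stand-in lambdas where the real model calls would go):

```python
def batch_then_synthesize(frames, batch_size, evaluate_batch, synthesize):
    """Evaluate frames in small batches that fit a local model's context
    window, then merge the per-batch findings in a final synthesis pass."""
    batches = [frames[i:i + batch_size] for i in range(0, len(frames), batch_size)]
    partial_findings = [evaluate_batch(b) for b in batches]
    return synthesize(partial_findings)

# Toy stand-ins for the two model calls, just to show the data flow:
fake_eval = lambda batch: f"{len(batch)} frames reviewed"
fake_merge = lambda parts: " | ".join(parts)
print(batch_then_synthesize(list(range(10)), 4, fake_eval, fake_merge))
# "4 frames reviewed | 4 frames reviewed | 2 frames reviewed"
```

In the real pipeline, `evaluate_batch` would be a vision call to a local model like LLaVA and `synthesize` a final text-only pass over the partial findings.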
Each evaluation is saved as structured JSON with the full rubric scores, evidence citations with timestamps, and plain-language summaries that can be transformed into reports with screenshots.
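As a concrete (and entirely hypothetical) illustration of that output shape, here is what one saved record might look like. The field names are my assumption for illustration, not the tool’s actual schema:

```python
import json

# Hypothetical evaluation record; the real tool's field names may differ.
evaluation = {
    "video_id": "example123",
    "module_scores": {"imitable_danger": 1, "cognitive_accuracy": 4, "visual_fidelity": 3},
    "overall_rating": 1,  # lowest module score wins, per the rubric's calibration
    "findings": [
        {
            "timestamp": "0:38",
            "risk": "High",
            "evidence": "Presenter eats a raw elderberry, which is toxic uncooked.",
        }
    ],
    "parent_summary": "One high-risk imitable behavior detected; not recommended.",
}
print(json.dumps(evaluation, indent=2))
```

Structured JSON like this is what makes the downstream reports possible: timestamps can be matched back to extracted frames for screenshots, and summaries can be rendered into plain language for parents.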
The Stack
Frontend: Streamlit (Python web UI with pages for ingestion, evaluation, rubric manager, video detail, etc.)
Video Processing: yt-dlp for video access, OpenCV for frame extraction, ffmpeg for audio
Transcription: OpenAI Whisper Medium model (running locally)
Evaluation Models: Gemini 3 Pro, Claude Opus 4.6
Data Storage: Local filesystem + optional Supabase sync
Built with: Claude Code in VS Code
Results
I ran the rubric against 5 videos using Gemini 3 Pro as the evaluation model. The videos included the three mentioned above, plus one of my own AI-generated videos and Cocomelon’s Wheels on the Bus, with 8.5 billion views. (That’s basically a view for every person on the planet.)
This is just the critical findings summary from each video. The full reports are here.
Teacher Shan teaches the Full Alphabet (Fantastic Fun Times)
⚠️ 0:38 (High Risk): The presenter introduces “E for Elderberry” and is shown eating a raw elderberry. Raw elderberries are toxic (containing cyanogenic glycosides) and can cause severe nausea, vomiting, and diarrhea if ingested. They must be cooked before consumption. Modeling the eating of raw berries gathered from a bush is a significant safety hazard for children.
⚠️ 1:35 (High Risk): The presenter holds a whole, unpeeled mini-pineapple and bites directly into the spiky, tough skin. A child imitating this action with a real pineapple would suffer mouth injuries from the spikes and potential choking hazards from the tough rind.
My Little Pony e Jesus (Crianças com Jesus BR)
⚠️ Character Coherence (Severe): The video inserts a real-world religious figure (Jesus) into a copyrighted fictional universe (My Little Pony). This falls under the rubric’s “Severe” concern level for character coherence, as it conflates religious instruction with commercial entertainment IP, creating a confusing cognitive model for a toddler.
⚠️ Visual Artifacts (High): At 00:50, the character Applejack’s face severely distorts, with her eye turning into a white void/glitch. This is a disturbing visual artifact typical of low-quality AI generation.
Note: I didn’t build the tool with any particular language in mind. It handled this non-English video flawlessly.
Learn ABCs at Breakfast (Lala Loony TV)
⚠️ Inappropriate Language / Audio Ambiguity (0:05, 0:45, 1:25): The audio transcript contains the phrase “Sexy cereal in my bowl” in place of “C is for Cereal.” Whether this is a hallucinated lyric or a severe pronunciation failure, the resulting word is sexually explicit and entirely inappropriate for the target demographic.
⚠️ Severe Choking Hazard (0:13, 0:53, 1:33): The video depicts a baby/toddler character eating whole grapes. Whole grapes are a leading cause of fatal choking in children under 4. Showing a toddler eating them without cutting them lengthwise is dangerous modeling.
⚠️ Severe Choking Hazard (0:26, 1:06, 1:46): The video depicts “N for Nuts” with a baby character. Whole nuts are a known choking hazard for children under 4 and should not be modeled as a snack for this age group.
You’re My Letter A (Hippo Polka)
✅ No critical findings (High/Severe risk) identified.
Note that I’m not totally out of the woods here: it did flag a few potential points of cognitive confusion caused by AI characters changing appearance between scenes.
Wheels on the Bus (Cocomelon)
✅ No critical findings identified.
Where It Failed
The rubric has a blind spot (and probably more). It cannot distinguish between content that presents misinformation earnestly and content that uses absurdity intentionally as humor.
I used it to evaluate my Weird Wheels on the Bus song (a parody where the wheels are replaced with funny objects, like rabbits and dinosaurs).
AI Slop Detector Report
Imitable Danger: Live rabbits depicted as vehicle wheels — children could involve pets in dangerous play
The rubric’s adversarial posture (“assume the worst”) is correct for catching genuine dangers, but it produced false positives on an entire genre of children’s content: silly songs, absurdist stories, imaginative play scenarios.
A production version of this rubric would need a genre/intent detection layer before the safety modules run. Something that identifies “this is a comedic ‘what if’ song” vs. “this is an educational video teaching facts” and adjusts the scoring thresholds accordingly. The same square wheels that are harmless in a silly song would be genuinely dangerous in a video titled “How Wheels Work.”
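To make the idea concrete, here is a toy sketch of what such a gating layer might do (the genre labels, finding types, and function are entirely hypothetical; a real system would classify genre with a model, not a lookup):

```python
def keep_finding(finding_type: str, genre: str) -> bool:
    """In a comedic 'what if' song, impossible physics IS the joke;
    in a factual video, the same finding is a real error.
    Genuine hazards (toxic food, choking) are kept regardless of genre."""
    if genre == "absurdist_comedy" and finding_type == "impossible_physics":
        return False
    return True

print(keep_finding("impossible_physics", "absurdist_comedy"))   # False: rabbit wheels are the gag
print(keep_finding("impossible_physics", "educational_factual")) # True: wrong in "How Wheels Work"
print(keep_finding("toxic_food", "absurdist_comedy"))            # True: danger is never excused
```

The important property is that genre detection only relaxes findings that depend on literal interpretation; it never waives the imitable-danger checks.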
Thoughts?
With the right rubric, it’s effective. The tool catches exactly the things that worry parents everywhere: toxic and dangerous behaviors, wrong information, animals with extra limbs.
A good human reviewer would catch all of these things too.
The problem was never capability. It was volume.
AI slop is being generated faster than any human team could ever review it. The only way to match the speed of AI-generated content is with AI-powered evaluation.
This tool isn’t scalable in its current form. It downloads videos locally and sends hundreds of frames through an API via my Mac Studio. That works for a proof of concept. It needs a lot more infrastructure to screen at the scale of YouTube.
If only we knew a tech company with incredible data center capabilities and advanced AI models…
The question was never “can I build a production moderation system on my laptop?”
It was can AI models do this kind of evaluation at all?
And the answer is clearly yes.
Which means the interesting question becomes: what could you actually do with a concept like this?
1. Proactive screening at scale, not reactive reporting: If AI evaluation can catch the majority of problematic content and route it to human moderators for final review, those moderators can spend their time on edge cases instead of drowning in volume. YouTube has the infrastructure, within the Google ecosystem using Gemini, to evaluate content *before* it’s recommended to children rather than waiting for parents to report it after the damage is done.
2. Systematic library evaluation and tagging: Companies with large back catalogs (content studios, streaming services, edtech platforms) could run every video through an educational content rubric to systematically tag, organize, and surface their content. Pair it with curriculum tools like those in development with Learning Commons and you could build personalized learning trajectories: “This child is working on letter recognition. Here are 50 videos that are verified safe AND teach phonics effectively.”
3. Better visibility for parents: The evaluation produces plain-language summaries written specifically for parents, complete with discussion guides and co-viewing tips. Imagine opening YouTube Kids and seeing not just an age rating, but: “This video teaches counting 1-10. No safety concerns detected. Pauses for interaction at 0:45, 1:12, and 2:03. Try counting objects around the house together afterward.”
4. Organize those personal video libraries: You know those 4,000 baby videos on your phone? The ones from the first year that you’ll never organize? The same multimodal analysis that catches five-legged dogs in YouTube slop could be pointed at your own library to automatically tag, timestamp, and summarize your home videos. “First steps at 0:34. Laughing at the dog at 1:12. Eating spaghetti with hands at 2:45.” That’s not moderation. That’s memory preservation. And the same technology can find the moment your kid first waved goodbye.
So the technology works with relatively simple rubrics. Imagine what we could do if we put some resources behind this?
Help me test this! Send videos your kids watch that you want me to try evaluating.
💡 Want more from Dr. Carla?
Follow me here on Substack and on LinkedIn for more articles like this.
Share your thoughts or questions directly with me at carla@hippopolka.com. I’d love to hear what’s working for your family (or what’s not!).
Subscribe on YouTube to get all our latest Hippo Polka videos for kids.