There’s a lot of hype about using AI for content. Most of it is about writing blog posts faster. That’s fine, but it misses the bigger opportunity: using AI as a component in a programmatic content pipeline that generates thousands of unique pages.
I run a stock comparison site with 287,000 pages across 12 languages. Every page has AI-generated narrative sections — not generic filler, but analysis that’s specific to each stock pair. The AI doesn’t write the page. It writes the parts of the page that need to feel human, while structured data handles everything else.
Here’s how the system works and what I’ve learned about making AI content that Google actually accepts.
Why a Local LLM (and Why Llama 3)
First question everyone asks: why not just use the OpenAI API?
Cost. At 287,000 pages with roughly 500-800 tokens of AI-generated content per page, that's on the order of 190 million output tokens; even at GPT-3.5 prices you're looking at $200-400 just for the initial generation pass. Then factor in regeneration when you update templates, multilingual variants, and iterative improvements. It adds up fast.
Speed. API rate limits mean generating content for 287k pages would take days of queued requests. With a local Llama 3 instance running on a decent GPU, I can generate content for thousands of pages per hour without worrying about rate limits or API outages.
Control. I can fine-tune prompts, adjust generation parameters, and rerun entire batches without worrying about cost. This matters more than you’d think — I went through about 15 iterations of my prompt templates before landing on output that consistently passed quality checks.
Privacy. Financial data flowing through a third-party API isn’t ideal. Running locally means all data stays on my machine.
That said, cloud APIs absolutely have their place. For one-off content, complex reasoning tasks, or when you need the best possible quality on a small number of pages, GPT-4 or Claude is better. But for batch generation at scale, local is the way to go.
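As a concrete sketch of what "local generation" means in practice: here's a minimal client for a locally served model. This assumes the model runs behind Ollama's HTTP API on its default port — the endpoint, payload fields, and option names below are Ollama's, not anything specific to my pipeline, and you'd swap them out for whatever serving stack you use.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumes a local Ollama server

def build_request(prompt: str, model: str = "llama3", temperature: float = 0.7) -> dict:
    """Build the JSON payload for a single non-streaming generation call."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature},
    }

def generate(prompt: str) -> str:
    """Send one prompt to the local model and return the generated text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because there's no per-request billing or rate limit, you can loop `generate()` over thousands of pages and let it run overnight.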
The Template + AI Hybrid
The key concept: AI doesn’t write the entire page. It writes specific sections within a structured template.
A stock comparison page has roughly this structure:
```
[Header — ticker names, logos, prices]        ← Structured data
[Key Metrics Table — P/E, market cap, etc.]   ← Structured data
[AI Comparison Summary — 2-3 paragraphs]      ← AI generated
[Dividend Analysis]                           ← Mix of data + AI narrative
[Growth Metrics Chart]                        ← Structured data
[AI Investment Considerations]                ← AI generated
[FAQ Section]                                 ← AI generated from data
[Schema Markup]                               ← Auto-generated
```
Maybe 60-70% of each page is structured data rendered by the template. The AI fills in the 30-40% that needs natural language — summaries, analysis, and FAQs.
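The merge step itself is trivial — the template has named slots, some filled from the database and some from the generation run. A stripped-down sketch with a hypothetical two-metric template (a real page template has far more slots):

```python
# Hypothetical page template: plain-data fields come from the database,
# the *_ai fields come from the LLM generation step.
PAGE_TEMPLATE = """\
<h1>{ticker_a} vs {ticker_b}</h1>
<table><tr><td>P/E</td><td>{pe_a}</td><td>{pe_b}</td></tr></table>
<section class="summary">{summary_ai}</section>
<section class="considerations">{considerations_ai}</section>
"""

def render_page(data: dict, ai_sections: dict) -> str:
    """Merge structured data and AI narrative into one HTML page."""
    return PAGE_TEMPLATE.format(**data, **ai_sections)
```

The point of keeping the two dicts separate is that you can regenerate the AI sections without touching the data pipeline, and vice versa.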
This hybrid approach solves two problems at once:
- Uniqueness. Every page has different data AND different narrative, so Google doesn’t flag it as duplicate content.
- Accuracy. The AI generates text based on actual financial data passed in the prompt, not from its training data. This means the content is factually grounded, not hallucinated.
The Prompt Architecture
I can’t share my exact production prompts (those are in the full blueprint), but here’s the general approach.
Context injection. Every prompt starts with the actual data for that specific stock pair. The AI isn’t generating from scratch — it’s analyzing and narrating data it’s been given.
```
You are analyzing {STOCK_A} vs {STOCK_B}.
Here is the current financial data:
- {STOCK_A} P/E: {pe_a}, Market Cap: {mcap_a}, Dividend Yield: {div_a}
- {STOCK_B} P/E: {pe_b}, Market Cap: {mcap_b}, Dividend Yield: {div_b}

Write a 2-paragraph comparison focusing on...
```
Variation instructions. To prevent all pages from reading the same way, I include randomized style directives: vary sentence length, alternate between starting with stock A or stock B, use different comparison frameworks (value vs growth, income vs appreciation, etc.).
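One detail worth getting right: seed the randomness from the page itself, so regenerating a page reproduces the same style choices (stable diffs, reproducible debugging). A sketch — the directive list here is illustrative, not my production set:

```python
import hashlib
import random

STYLE_DIRECTIVES = [  # illustrative examples, not the production list
    "Open with the larger company by market cap.",
    "Frame the comparison as value vs. growth.",
    "Frame the comparison as income vs. capital appreciation.",
    "Use short, punchy sentences in the first paragraph.",
]

def pick_directives(page_slug: str, k: int = 2) -> list[str]:
    """Deterministically pick style directives for a page.

    Seeding the RNG from the slug means the same page always gets the
    same directives, while different pages get different combinations.
    """
    seed = int(hashlib.sha256(page_slug.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return rng.sample(STYLE_DIRECTIVES, k)
```

The picked directives get appended to the prompt before generation.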
Output constraints. Word count limits, formatting requirements, and explicit instructions about what NOT to include (no financial advice disclaimers in the body text, no “as an AI” self-references, no generic filler phrases).
Quality gates. After generation, every piece of content runs through automated checks: minimum uniqueness score against other generated pages, readability score, factual consistency against the source data, and keyword density checks.
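A minimal version of those gates can be built from the standard library alone. The thresholds below are placeholders — the real ones need tuning per site — and the "every source number appears verbatim" check is a loose proxy for factual consistency, not a real fact-checker:

```python
import re
from difflib import SequenceMatcher

def passes_quality_gates(text: str, source_numbers: list[str],
                         published: list[str],
                         max_similarity: float = 0.6,
                         min_words: int = 80) -> bool:
    """Cheap automated checks on one generated narrative section."""
    # Length gate: reject thin output.
    if len(text.split()) < min_words:
        return False
    # Self-reference gate: reject obvious AI boilerplate.
    if re.search(r"as an ai", text, re.IGNORECASE):
        return False
    # Factual gate (loose proxy): every number injected into the prompt
    # should appear verbatim in the output.
    if not all(num in text for num in source_numbers):
        return False
    # Uniqueness gate: compare against already-published narratives.
    for other in published:
        if SequenceMatcher(None, text, other).ratio() > max_similarity:
            return False
    return True
```

Anything that fails a gate goes back into the regeneration queue instead of being published.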
What Google’s Helpful Content Update Means for This
Google’s official stance: AI-generated content isn’t automatically bad. Low-quality content is bad, regardless of how it’s made.
In practice, here’s what I’ve observed across my 287k pages:
Pages that get indexed tend to have: unique data points, specific analysis tied to that data, proper schema markup, and content that answers a real search query a human would actually type.
Pages that DON'T get indexed tend to have: generic narrative that could apply to any stock pair, thin analysis that just restates the numbers in sentence form, and structural patterns that repeat too closely across pages.
The lesson: the AI content needs to actually say something specific. “Stock A has a higher P/E than Stock B” is thin. “Stock A’s P/E of 35 suggests the market expects significant growth, which makes sense given their 40% revenue increase last quarter, while Stock B’s P/E of 12 reflects a more mature business with stable but slower growth” — that’s useful analysis.
Getting the AI to consistently produce the latter instead of the former is the core challenge. It’s solvable with good prompts, but it took me 15+ iterations.
Multilingual Generation
One of the biggest advantages of programmatic SEO is going multilingual cheaply. Here’s how I handle it:
Template strings (headers, labels, button text) are translated once by a human translator and stored in locale files. This is maybe 200-300 strings per language.
AI narrative sections are generated in the target language directly, not translated from English. This produces more natural-sounding content than translation. The prompt is in English, but I instruct the model to output in the target language with the data injected.
Data stays the same. Numbers, ticker symbols, percentages — these are universal. The template handles formatting (date formats, number separators) based on locale.
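The locale-dependent formatting can be as simple as a lookup table. This is a deliberately tiny stand-in for the per-language locale files described above — a production system would pull separator rules from real CLDR data (e.g. via Babel) rather than hardcoding three languages:

```python
# Illustrative separator table: (thousands separator, decimal separator).
# A real system would use CLDR locale data instead of hardcoding these.
SEPARATORS = {
    "en": (",", "."),
    "de": (".", ","),
    "fr": ("\u202f", ","),  # narrow no-break space for thousands
}

def format_number(value: float, lang: str) -> str:
    """Format a number with the target language's separators."""
    thousands, decimal = SEPARATORS.get(lang, SEPARATORS["en"])
    whole, _, frac = f"{value:,.2f}".partition(".")
    return f"{whole.replace(',', thousands)}{decimal}{frac}"
```

So the same underlying figure renders as `1,234.50` on the English page and `1.234,50` on the German one, with no change to the data layer.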
Result: 12 languages with minimal per-language effort. The heavy lifting is in building the system. Each additional language is maybe 2-3 days of work.
Cost Breakdown
For anyone wondering about the economics:
| Item | Cost |
|---|---|
| Local GPU (one-time, used RTX 3090) | ~$700 |
| Electricity for generation runs | ~$5/batch |
| Supabase (free tier + small paid) | $25/mo |
| Hosting (DigitalOcean Spaces) | $5/mo |
| Cloudflare CDN | Free tier |
| Domain | $12/year |
| Monthly recurring | ~$30/mo |
Compare that to paying for API calls at scale or hiring writers. The local LLM pays for itself after generating content for maybe 5,000-10,000 pages.
Mistakes I Made
Starting with too many pages. I generated content for 287k pages before validating that Google would index them. Should have started with 5k, gotten indexed, then scaled.
Not enough variation in prompts. My first generation pass used identical prompt structures for every page. The output was technically unique but structurally identical. Google noticed. Second pass included randomized style directives and it made a big difference.
Ignoring readability. Early AI output was dense and clinical. Real financial analysis writing varies between technical detail and accessible explanations. I had to explicitly prompt for this variation.
No quality gate initially. I generated everything in batch and published it all. Should have implemented automated quality checks before publishing. Catching the bottom 10% of AI output before it goes live saves you from thin content flags.
The System Today
After iterating, my current pipeline looks like this:
- Data refresh — Pull latest financial data via yfinance (daily cron job)
- Content generation — Run Llama 3 on pages with stale or missing narrative (weekly)
- Quality check — Automated scoring: uniqueness, readability, factual accuracy
- Build — Astro generates static HTML for all pages
- Deploy — Push to CDN
The whole thing runs on a single machine. Total human time per week: about 2 hours of monitoring and occasional prompt tweaking.
If you want the full technical details — the exact prompt templates, the quality scoring system, the Astro project structure, and the complete deployment pipeline — I wrote it all up in the Programmatic SEO Blueprint. It includes the MIT-licensed code examples you can adapt for your own projects.
Follow for more on building AI-powered content systems at scale.
