Engagement · E-commerce · Case study · 03

Catalog & PDP generation agent for 40,000 SKUs.

Under NDA (name redacted) · E-commerce · D2C brand · 6 weeks · 2025
11× faster listings (vs. human writer baseline)
+19% PDP conversion (4-week A/B vs. control)
6 locales live (EN, FR, ES, DE, IT, NL)
40,000 SKUs generated (across 3 collection waves)

The brief.

A fast-growing D2C fashion brand was launching three new collections a month — and bottlenecking on the team writing product descriptions, alt text and SEO copy. Each new collection meant 800–1,200 new SKUs, six locales, and a brand voice that took the founder thirty minutes per product to get right.

The team had tried generic AI writing tools and a freelancer pool. The first produced bland, off-brand copy that hurt their differentiation. The second was inconsistent — some pages read like the brand, others read like Amazon listings. Neither could keep up with the volume.

Devmint partnered with the brand's product and marketing teams to ship a content generation agent that's on-brand by default, image-conditioned, and accountable to a measurable conversion outcome.

What we shipped.

  • Generation · 01 · Image-conditioned copy. The agent sees the product image alongside the structured attributes (material, colour, season, occasion). Copy is generated against both, not from a SKU sheet in isolation.
  • Voice · 02 · Brand-voice eval. Every generated page is scored against 1,200 hand-written reference pages on tone, sentence cadence, CTA style and emotive register. Below 88 → reviewer queue. Above 92 → ships.
  • Locale · 03 · 6-market variants. EN, FR, ES, DE, IT, NL, and not translations of each other. Each locale is generated with its own brand-voice dictionary so the tone reads native, not localised.
  • Outcome · 04 · Conversion-tracked output. Every generated page is tagged in the analytics layer. The brand can see, per SKU, which generated copy converted higher than its predecessor, and feed that signal back into the eval set.
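
To make the brand-voice thresholds concrete, here is a minimal sketch of the routing rule in Python. The 88 and 92 cut-offs come from the description above; the `Route` enum, the `route_page` function and, in particular, the handling of scores from 88 to 92 are assumptions, since the write-up does not say what happens in that band.

```python
from enum import Enum

REVIEW_BELOW = 88.0  # from the case study: below 88 goes to the reviewer queue
SHIP_ABOVE = 92.0    # from the case study: above 92 ships


class Route(Enum):
    SHIP = "ship"
    REVIEW = "review"


def route_page(score: float, middle_band: Route = Route.REVIEW) -> Route:
    """Route one generated page by its brand-voice score.

    Scores from 88 to 92 inclusive are not covered by the write-up.
    Sending them to review by default is an assumption of this sketch,
    exposed as a parameter so the policy is explicit.
    """
    if score > SHIP_ABOVE:
        return Route.SHIP
    if score < REVIEW_BELOW:
        return Route.REVIEW
    return middle_band


if __name__ == "__main__":
    for s in (95.0, 90.0, 80.0):
        print(s, route_page(s).value)
```

Defaulting the unspecified band to review is the conservative choice: a page only ships when it clears the stated bar.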

How we built it.

The framing the founder asked for was: “don't make me write product copy at midnight ever again, and don't hurt the brand.” Those were the two non-negotiables.

The technical answer was a generation pipeline with the brand-voice eval as the spine. Without that eval, the system would've shipped bland copy at scale — the worst possible outcome. With the eval, the system was forced to match the standard the founder had been holding for years.
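
One way to picture "the eval as the spine" is a loop in which no page can reach the storefront without passing through the scorer. The Python sketch below is illustrative only: `run_sku`, `Page`, and the `generate` and `score` callables are hypothetical stand-ins for the real model call and the custom brand-voice scorer, and treating everything at or below 92 as held for review is an assumption.

```python
from dataclasses import dataclass
from typing import Callable

LOCALES = ("EN", "FR", "ES", "DE", "IT", "NL")  # the six live locales
SHIP_ABOVE = 92.0  # ship threshold quoted in the case study


@dataclass
class Page:
    sku: str
    locale: str
    copy: str
    score: float
    shipped: bool  # False means held for a human reviewer (assumption)


def run_sku(
    sku: str,
    image_ref: str,
    attributes: dict[str, str],
    dictionaries: dict[str, dict[str, str]],
    generate: Callable[[str, dict[str, str], dict[str, str], str], str],
    score: Callable[[str, str], float],
) -> list[Page]:
    """Generate and score one SKU in every locale; nothing skips the eval."""
    pages = []
    for locale in LOCALES:
        # Each locale gets its own brand-voice dictionary rather than a
        # translation of another locale's copy.
        copy = generate(image_ref, attributes, dictionaries[locale], locale)
        s = score(copy, locale)
        pages.append(Page(sku, locale, copy, s, shipped=s > SHIP_ABOVE))
    return pages
```

Passing `generate` and `score` in as callables keeps the gate independent of any particular model or scorer, so either can be swapped or versioned without touching the routing logic.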

The 4-week A/B test was the operational outcome contract. Both arms had identical product images, identical prices, identical SKUs. The only variable was the PDP copy: human-written baseline vs. agent-generated. The agent arm showed a +19% conversion lift on assisted sessions, statistically significant at week three.
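
For readers who want to see how a significance check on a two-arm test like this can be run, here is a sketch of a standard two-proportion z-test in Python. The case study does not publish session or conversion counts or name the test that was used, so the numbers below are made up purely to produce a 19% relative lift, and reading the +19% as a relative lift is itself an assumption.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare conversion rates of control (a) and variant (b).

    Returns (relative_lift, z, two_sided_p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_lift = (p_b - p_a) / p_a
    return relative_lift, z, p_value


if __name__ == "__main__":
    # Hypothetical counts, NOT the brand's data.
    lift, z, p = two_proportion_z_test(500, 20_000, 595, 20_000)
    print(f"lift={lift:.1%} z={z:.2f} p={p:.4f}")
```

If significance is checked repeatedly as the weeks go by, a fixed 0.05 cut-off overstates confidence; a sequential testing correction would be the usual remedy.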

The stack.

AI & eval
  • Anthropic Claude (multimodal)
  • Custom brand-voice scorer
  • Per-locale dictionaries
  • Langfuse for traces
  • Versioned eval against 1,200 refs
Pipeline + integration
  • Next.js admin surface
  • Shopify integration · push to PDP
  • Postgres + S3 (images)
  • Inngest for batch generation
  • Per-SKU conversion tagging
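
The per-SKU conversion tagging is what makes the feedback loop possible: once each page carries a copy-version tag, generated copy can be compared with its predecessor for the same SKU. The sketch below shows one possible shape for that comparison in Python. `CopyStats`, `eval_set_candidates`, the `copy_version` tag and the `min_sessions` floor are all hypothetical names and values, not details from the engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyStats:
    sku: str
    copy_version: str  # hypothetical tag attached to each published page
    sessions: int
    conversions: int

    @property
    def rate(self) -> float:
        return self.conversions / self.sessions if self.sessions else 0.0


def eval_set_candidates(
    previous: dict[str, CopyStats],
    current: dict[str, CopyStats],
    min_sessions: int = 500,  # assumed traffic floor to limit noise
) -> list[CopyStats]:
    """Return current copy versions that out-converted their predecessor."""
    winners = []
    for sku, new in current.items():
        old = previous.get(sku)
        if old is None:
            continue  # no predecessor to compare against
        if min(old.sessions, new.sessions) < min_sessions:
            continue  # too little traffic on one side to trust the gap
        if new.rate > old.rate:
            winners.append(new)
    return winners
```

A raw rate comparison like this is only a first filter; before a winning page joins the reference set, a significance check and a human read would be sensible.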

“We launched three collections in three weeks instead of three months. The PDP copy reads like I wrote it. Two months later the conversion data made the decision for us.”

Founder · D2C E-commerce

Outcomes.

Across the three-month rollout, the agent generated content for 40,000 SKUs across six locales. The A/B test ran for four weeks on a matched cohort and showed a statistically significant +19% PDP conversion lift on the agent arm.

The brand has retained Devmint on a quarterly retainer to maintain the brand-voice eval, add locales (Portuguese on roadmap for the next quarter), and tune the conversion-feedback loop.

Your case study next

Got an e-commerce content problem? Or a different volume problem?

A 30-minute call is enough to scope it. We come back with a written proposal in 48 hours.