Original research · 2026-05

What 132 Trustpilot 1-stars actually say about AI-video tools.

We tagged 132 1-star reviews across 8 AI-video platforms by complaint category. The headline finding: 77% of paid-tier 1-stars cite billing-mechanic complaints, not output quality. The dominant pattern is not “the model is bad” — it is “the vendor cheated me.”

AIVideoAuditor desk · Collected 2026-05-13 to 2026-05-19 · Triangulated against neutral Reddit cross-check

Methodology

How the corpus was built.

Source: trustpilot.com 1-star filter, pages 1-3, per platform. Reviews collected manually between 2026-05-13 and 2026-05-19. Each review was tagged with one or more complaint categories from a 6-tag taxonomy (tags are non-exclusive — one review can hit multiple).

Sample: 132 reviews across Higgsfield, Krea, Pollo, Pika, Runway, Luma, Sora, and Vidu.

Important caveat: the 1-star pool is self-selected (people who hate the product). It is not a representative customer sample. Read these tags as “what people who churned complain about,” not “what is wrong with the product on average.”

Update · 2026-05-20 · Reddit cross-check

We triangulated the 77% figure against Reddit. The honest read is more interesting.

The aggregate 77% billing-predation figure is correct for the Trustpilot 1-star pool. But Trustpilot is a self-selected angry-payment cohort. To check whether that figure survives outside Trustpilot, we ran neutral Reddit searches (“review,” “experience,” “honest opinion”) on three representative vendors and categorized the top 15 results by complaint type.

Vendor | Trustpilot billing % | Reddit neutral billing % | Δ
Pollo | ~77% | 78% | +1pp
Higgsfield | ~77% | 64% | -13pp
Runway | ~77% | 29% | -48pp
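
The Δ column is a difference in percentage points (Reddit share minus Trustpilot share), not a relative change. A minimal sketch that reproduces it from the table's figures, treating the approximate ~77% as exactly 77 (the variable and function names are ours, for illustration only):

```python
# Billing share of complaints, in percent, as reported in the table above.
# Assumption: the aggregate "~77%" Trustpilot figure is treated as exactly 77.
TRUSTPILOT_BILLING_PCT = 77
reddit_billing_pct = {"Pollo": 78, "Higgsfield": 64, "Runway": 29}


def delta_pp(reddit_pct, trustpilot_pct=TRUSTPILOT_BILLING_PCT):
    """Percentage-point gap; positive means Reddit is more billing-heavy."""
    return reddit_pct - trustpilot_pct


deltas = {vendor: delta_pp(pct) for vendor, pct in reddit_billing_pct.items()}
for vendor, d in deltas.items():
    print(f"{vendor:<11}{d:+d}pp")
```

A gap near zero (Pollo) means the Trustpilot figure survives outside Trustpilot; a large negative gap (Runway) means it does not.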

What this means: the AI-video vendor universe contains two structurally different vendor types. The predatory tier (Pollo, Higgsfield, Krea, and similar small vendors with hostile subscription mechanics) is mostly a billing problem — the Trustpilot 77% reproduces almost identically in neutral Reddit samples. The mainstream tier (Runway, Luma, Sora, Veo) is mostly a quality problem in neutral samples; Trustpilot's 77% on these vendors is sample-bias amplification of a smaller billing-pain subpopulation.

Implication for buyers: the kind of risk you take when you subscribe is vendor-specific, not category-wide. Subscribing to Pollo is mostly a billing bet; subscribing to Runway is mostly a quality bet. Our prompt scoring + vendor reality check addresses both, but the per-vendor risk profile changes which surface matters more for your particular use case.

Method: Firecrawl Reddit search, 5 queries (billing-loaded + neutral), 75 results categorized BILLING / QUALITY / SUPPORT / POSITIVE / NEUTRAL by title + snippet excerpt. Full method + per-thread categorization committed in the source repo at docs/REDDIT-TRIANGULATION-2026-05-20.md.

The 6-tag taxonomy

  • Billing predation — refund denied · post-cancel charge · bait-and-switch on advertised features · price/credit-cost changed mid-sub · features removed mid-subscription · stacked-promo no-refund
  • Cost burn (pure) — “credits drain too fast” · explicit per-generation cost complaint without other tag
  • Quality — bad output · ignores reference image · marketing oversells the product
  • Support failure — no human · AI bot replies · ignored emails · weeks-long back-and-forth with no resolution
  • Access / technical — can't log in · page deactivated · features missing post-purchase
  • NSFW / brand filter — overzealous filter on legitimate commercial work · forfeited credits to filter blocks
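
Because tags are non-exclusive, per-tag rates are computed against the number of reviews, not the number of tags, so the rates for one platform can sum to more than 100%. A minimal sketch of that tally, assuming each review is stored as a set of tag strings; the short tag identifiers and the four sample reviews are hypothetical, not rows from the corpus:

```python
from collections import Counter

# Hypothetical short identifiers for the six tags listed above.
TAGS = {"billing", "cost_burn", "quality", "support", "access", "nsfw_filter"}


def tag_rates(reviews):
    """Return {tag: share of reviews carrying that tag}, as fractions of 1."""
    if not reviews:
        return {}
    counts = Counter()
    for tags in reviews:
        unknown = set(tags) - TAGS
        if unknown:
            raise ValueError(f"unknown tags: {sorted(unknown)}")
        # set() guards against a tag being listed twice on one review
        counts.update(set(tags))
    return {tag: counts[tag] / len(reviews) for tag in sorted(counts)}


# Illustrative data only.
sample = [
    {"billing", "support"},
    {"billing"},
    {"quality"},
    {"billing", "cost_burn"},
]
rates = tag_rates(sample)
```

In this toy sample three of four reviews carry the billing tag, and the four rates add up to 1.5. That is why a single platform can show 78% + 37% + 29% + 22% without contradiction.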

Per-platform breakdown

Runway — 11 reviews (May 2026)

Trustpilot score 1.1/5. 100% of 1-stars cite billing/routing. Modal complaint: “Unlimited” tier routes to credit-metered behaviour after day 7. Wait times jumped from 5-10 min to 25-40 min over the course of May. 6 of 11 reviewers independently name the same wait-time shift in the same week — that is a vendor change, not crowding.

Higgsfield — 41 reviews (Mar-May 2026)

Trustpilot score 4/5 on 2,433 reviews overall — so the 1-star pool is self-selected complainers, not the average customer. Within that pool: 78% billing predation (refund denial after a single test gen, post-cancel charges, bait-and-switch on advertised features, price/credit-cost changed mid-sub), 37% cost burn, 29% quality, 22% support failure.

Krea — 78 reviews

Trustpilot score 2.4/5. 60% billing predation. The most common patterns are auto-renewals after a cancelled subscription and a cancel button that exists only in the billing portal (not in the main app, which routes to upsell pages). “Unlimited Seedance 2.0” promo routes to credit-metered after day 7.

Pollo — 1-star segment of 3,508 reviews

Aggregate score 4.4/5 hides a brutal 1-star segment. 42% billing + 32% NSFW filter policy churn. The modal NSFW complaint: “NSFW worked before I subscribed, now it doesn't.” — a silent filter policy revision mid-subscription.

Pika — 56 reviews

Trustpilot score 1.6/5. 63% quality failure + 53% support black-hole. Different pathology from the others: the vendor is not extracting money via bait-and-switch — the product itself just does not deliver, and when it fails, no human responds. Generations fail, credits expire, no resolution.

Sora — wind-down period

Sora 2 web sunset on 2026-04-26 (officially announced). Trustpilot 1-stars during the wind-down: 71% billing, with users still being charged through the deprecation window.

Luma — selection

Cancel UI lives at billing.luma.ai/portal, NOT at any “Manage subscription” link in the main app. Refund denial language pattern: “credits were consumed during the billing cycle.” Pushing back with a generation log export usually reverses the denial.

Vidu — smaller dataset

Platform-parity issues are the most common complaint: model behaviour differs measurably from marketing claims on specific shot types (text-to-video performance below benchmark on character work).

The two vendor pathologies

Reading the 132 reviews end-to-end, complaints cluster into two distinct vendor pathologies.

1. Billing predation

Runway, Higgsfield, and Krea fit this. The product works; the vendor's billing mechanic is the complaint. Refund denial after a single test gen. “Unlimited” tier that secretly converts to credit-metered. Auto-renewal traps. Cancel UI hidden behind a billing-portal sub-page.

2. Product + support failure

Pika fits this most clearly. The vendor is not trying to extract money via bait-and-switch — the product just does not deliver, and when it does not, no human responds. Generations fail, credits expire, no resolution.

Why this matters for paid users

The headline credit cost is the disclosed cost. The effective cost is approximately:

effective_cost = disclosed_cost × (1 / first_try_success_rate) × (1 + refund_denial_rate)

For Higgsfield Ultra: estimated effective cost is 2-3× the marketing math. Plugging 60% first-try success and the 1-star pool's 78% refund-denial rate into the formula gives 1/0.60 × 1.78 ≈ 2.97, the upper end of that range.
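
The Higgsfield estimate can be checked by plugging the quoted rates into the formula. A minimal sketch; the function name and the input validation are ours, and the two rates are the ones quoted in the text:

```python
def effective_cost_multiplier(first_try_success_rate, refund_denial_rate):
    """Multiplier on the disclosed cost, per the effective_cost formula."""
    if not 0 < first_try_success_rate <= 1:
        raise ValueError("first_try_success_rate must be in (0, 1]")
    if not 0 <= refund_denial_rate <= 1:
        raise ValueError("refund_denial_rate must be in [0, 1]")
    return (1 / first_try_success_rate) * (1 + refund_denial_rate)


# Inputs quoted for Higgsfield Ultra: 60% first-try success, 78% refund denial.
multiplier = effective_cost_multiplier(0.60, 0.78)
print(f"{multiplier:.2f}x the disclosed cost")
```

Since the 78% is measured on a self-selected 1-star pool, it likely overstates the denial rate a typical customer faces. A lower denial rate pulls the multiplier toward the bottom of the range.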

For Runway Unlimited (May 2026): wait-time changes alone cut effective throughput by roughly 4-5× like for like, and by up to 8× at the extremes (5-10 min → 25-40 min per gen).
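
How large the slowdown is depends on which ends of the two wait-time ranges are compared. A quick sketch of the arithmetic, assuming generations run one at a time so that throughput is inversely proportional to wait time:

```python
# Wait per generation in minutes, (low, high), as quoted above.
before = (5, 10)
after = (25, 40)

# Assumption: one generation at a time, so throughput ~ 1 / wait and the
# slowdown factor is simply new_wait / old_wait.
like_for_like = (after[0] / before[0], after[1] / before[1])  # low/low, high/high
best_case = after[0] / before[1]   # shortest new wait vs longest old wait
worst_case = after[1] / before[0]  # longest new wait vs shortest old wait

print(like_for_like, best_case, worst_case)
```

The like-for-like ratios are 5.0 and 4.0; the extremes are 2.5 and 8.0.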

Raw corpus — direct download

The Higgsfield 1-star slice (41 reviews) is fully tagged with verbatim quotes + per-review category tags. Cross-vendor complaint rates (8 vendors, 132 reviews aggregate) are in a second file. No signup needed.

Source: Trustpilot 1-star pages, collected 2026-05-13 to 2026-05-19. Tagging method documented above. The full per-row reading + per-vendor breakdown also lives on this page; the CSV is for tooling / journalism / spreadsheet analysis.

Built on this data

AIVideoAuditor scores your prompt against this corpus

The 105 failure modes in our catalogue are tagged per vendor against this corpus + ongoing platform monitoring. Pre-flight prompt scoring tells you the failure-rate forecast for the specific platform and prompt shape you intend to use — before you commit credits to a generation that will fail.