Clank Reviews
This site aggregates publicly available community data. Some links are affiliate links. We earn a commission if you purchase through our links, at no cost to you.

How We Rate AI Apps

Last updated: May 2026

Clank Reviews does not test apps first-hand. Instead, we aggregate publicly available community data — Reddit threads, app store reviews, Google Trends signals, and editorial consensus — to surface what real users consistently report. Every score on this site is derived from that data, not from personal experience.

This page explains exactly how each score is computed so you can evaluate our methodology and weigh our ratings accordingly.

The Four Signal Sources

Before scoring any app, we collect data from all four sources below. A review is not published if any signal type is missing.

1. Reddit & Forums

We search app-specific subreddits and r/AICompanions for top-voted threads. We analyze recurring praise and complaint themes across hundreds of comments — not just headline posts.

2. App Store Ratings

We pull current ratings and review counts from Google Play and the Apple App Store, and note whether the rating is trending up or down. A declining rating is a red flag regardless of its absolute value.

3. Google Trends

Brand search interest over time tells us whether an app is growing, plateauing, or losing mindshare. We use this as a proxy for user retention and word-of-mouth health.

4. Editorial Consensus

We read at least three independent review publications and extract their numeric scores or quality judgments. We average these into an editorial consensus score per dimension.
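The publication rule and the editorial averaging step described above can be sketched as follows. This is a minimal illustration, not the site's actual tooling: the names `REQUIRED_SIGNALS`, `can_publish`, and `editorial_consensus` are hypothetical, and the sketch assumes every publication's judgment has already been converted to a number on the 1–10 scale.

```python
from statistics import mean
from typing import Mapping, Set

# Hypothetical identifiers for the four signal sources named on this page.
REQUIRED_SIGNALS = {"reddit_forums", "app_store", "google_trends", "editorial"}
MIN_PUBLICATIONS = 3  # "a minimum of three independent review publications"


def can_publish(collected_signals: Set[str]) -> bool:
    """A review is publishable only when all four signal types are present."""
    return REQUIRED_SIGNALS.issubset(collected_signals)


def editorial_consensus(scores_by_publication: Mapping[str, float]) -> float:
    """Average one dimension's scores across independent publications.

    Assumes each value is already a number on the 1-10 scale.
    """
    if len(scores_by_publication) < MIN_PUBLICATIONS:
        raise ValueError("need at least three independent publications")
    return mean(list(scores_by_publication.values()))


# Made-up example values, for illustration only.
print(can_publish({"reddit_forums", "app_store", "editorial"}))    # False
print(editorial_consensus({"pub_a": 8.0, "pub_b": 7.0, "pub_c": 9.0}))  # 8.0
```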

The Six Scoring Dimensions

Each app is scored across up to six dimensions on a 1–10 scale. All applicable dimensions carry equal weight. If a dimension is not relevant to the app (e.g. image generation for a text-only chatbot), it is marked N/A and excluded from the overall average — the app is not penalised for features it does not offer.

Conversation Quality

How natural, engaging, and coherent the AI's responses are across extended sessions. Scored from Reddit community threads and editorial review consensus.

Memory & Continuity

Whether the app reliably remembers context across sessions. Memory failures are the single most-complained-about issue in AI companion communities — this dimension weights those reports heavily.

NSFW Access

Availability and reliability of adult content features. Scored from community reports of what is actually accessible vs. advertised. Marked N/A for companion-tier apps where adult content is not part of the product.

Customization

Depth of control over the AI's appearance, personality, voice, and relationship framing. Scored from power-user threads and feature comparison guides.

Image Generation

Quality, speed, and reliability of AI-generated images. Scored from quality comparison articles and community image posts. Marked N/A for apps that do not offer this feature.

Pricing

Value relative to cost, accounting for token systems, hidden upgrade paths, and what the free tier actually delivers. The lower the real cost for a given level of capability, the higher the score.

How the Overall Score Is Computed

The overall score is the straight average of all applicable dimension scores. No dimension is weighted above another — an app that excels at conversation but has broken memory cannot hide behind its strengths.

overall = mean(applicable dimension scores)
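As a rough sketch of the formula above, the snippet below averages only the applicable dimensions. The function name `overall_score`, the dictionary keys, and the use of `None` to represent N/A are illustrative assumptions, and the example scores are invented.

```python
from statistics import mean
from typing import Mapping, Optional


def overall_score(dimensions: Mapping[str, Optional[float]]) -> float:
    """Straight average of applicable dimension scores.

    A value of None stands for N/A and is excluded from the average,
    so an app is not penalised for features it does not offer.
    """
    applicable = [score for score in dimensions.values() if score is not None]
    if not applicable:
        raise ValueError("no applicable dimensions to average")
    return mean(applicable)


# Invented scores for a text-only chatbot: two dimensions are N/A.
example = {
    "conversation_quality": 8.0,
    "memory_continuity": 6.0,
    "nsfw_access": None,
    "customization": 7.0,
    "image_generation": None,
    "pricing": 7.0,
}
print(overall_score(example))  # 7.0, the mean of 8, 6, 7 and 7
```

Because the N/A entries are dropped before averaging, the divisor here is four rather than six.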

Hard Score Ceilings

Certain trust signals are severe enough that no combination of high dimension scores can override them. After computing the raw average, we apply the following caps:

Condition | Score ceiling
App store rating below 3.5 stars | max 7.0
Reddit community sentiment net negative | max 6.0
Both conditions above | max 5.5
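The ceilings can be read as a cap applied after averaging. The sketch below is one way to express that, under stated assumptions: `apply_ceilings` is a hypothetical name, a single `store_rating` value is assumed (the page does not say how Google Play and Apple App Store ratings are combined), and net-negative sentiment is passed in as a boolean that has already been determined.

```python
def apply_ceilings(raw_average: float,
                   store_rating: float,
                   reddit_net_negative: bool) -> float:
    """Cap the raw average according to the three ceiling conditions."""
    low_rating = store_rating < 3.5  # strictly below 3.5 stars

    if low_rating and reddit_net_negative:
        ceiling = 5.5
    elif reddit_net_negative:
        ceiling = 6.0
    elif low_rating:
        ceiling = 7.0
    else:
        return raw_average  # no trust signal triggered, no cap

    # A ceiling never raises a score; it only limits it.
    return min(raw_average, ceiling)


# Invented inputs, for illustration only.
print(apply_ceilings(8.2, store_rating=3.2, reddit_net_negative=False))  # 7.0
print(apply_ceilings(8.2, store_rating=4.5, reddit_net_negative=True))   # 6.0
print(apply_ceilings(8.2, store_rating=3.0, reddit_net_negative=True))   # 5.5
```

An app whose raw average is already under the relevant ceiling keeps its raw average unchanged.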

What the Numbers Mean

Score | Meaning
9–10 | Best-in-class. Overwhelming positive consensus across all signal sources.
7–8 | Solid. Community generally recommends it with minor, well-known caveats.
5–6 | Mixed. Meaningful limitations or community complaints; worth reading the caveats before subscribing.
3–4 | Below average. Notable community frustration; consider alternatives.
1–2 | Poor. Widespread negative consensus, significant issues, or major trust concerns.
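For readers who want to map a score to its band programmatically, here is a small sketch. It rests on an assumption the table does not state: a fractional score such as 8.5 is placed in the band whose lower bound it has reached (so 8.5 counts as "Solid"). The names `BANDS` and `band_for` are hypothetical.

```python
# Lower bound of each band, highest first. Treating each band as
# "at or above this bound" is an assumption for fractional scores.
BANDS = [
    (9.0, "Best-in-class"),
    (7.0, "Solid"),
    (5.0, "Mixed"),
    (3.0, "Below average"),
    (1.0, "Poor"),
]


def band_for(score: float) -> str:
    """Return the band label for a score on the 1-10 scale."""
    if not 1.0 <= score <= 10.0:
        raise ValueError("score must be on the 1-10 scale")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: every valid score is >= 1.0")


print(band_for(9.2))  # Best-in-class
print(band_for(5.5))  # Mixed
```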

Affiliate Disclosure

Some links on this site are affiliate links. We earn a commission if you subscribe through them, at no cost to you. Affiliate relationships do not influence scores — all ratings are computed from the signal data described above before any editorial decisions about which apps to feature.