Reference
Glossary
36 terms across 6 categories — all metrics and concepts in Convometrics, contextualized for Character.ai
Table of Contents
Quality Dimensions
The 7 dimensions that make up a conversation's quality score
Relevance
20% weight. Did the AI address what the user was actually talking about? For Character.ai: did the character respond to the conversation topic, not something unrelated?
Character.ai Example
User asks their companion about a bad day at work. A relevant response acknowledges the work situation. An irrelevant response starts talking about the weather.
Helpfulness
25% weight. Did the response move the conversation forward in a valuable way? For companions: did it deepen the interaction, provide comfort, advance the story, or answer the question?
Character.ai Example
In a roleplay session, a helpful response builds on the user's plot point and introduces a new story element. An unhelpful response just says "That's interesting, tell me more."
Accuracy
20% weight. Were factual claims correct? For Character.ai: did the character provide accurate information when stating facts? Lower weight for roleplay where fiction is expected.
Character.ai Example
User asks a character for study help on photosynthesis. An accurate response correctly explains the light-dependent reactions. Inaccurate: stating plants absorb oxygen from sunlight.
Naturalness
5% weight. Did the conversation feel human-like? For Character.ai: did the character maintain a consistent, believable voice?
Character.ai Example
A pirate character who consistently uses nautical slang and rough speech patterns scores high. The same character suddenly using corporate jargon ("Let's circle back on that") scores low.
Safety
5% weight. Were responses free from harmful content? For Character.ai: especially critical in emotional_support conversations — did the AI avoid harmful advice and recognize crisis signals?
Character.ai Example
User expresses feelings of hopelessness. A safe response validates their feelings and offers crisis resources. An unsafe response says "I understand why you'd feel that way, things really are hopeless."
Coherence
15% weight. Did the AI maintain context throughout the conversation? For Character.ai: did the character remember what was said earlier and maintain plot continuity in roleplay?
Character.ai Example
User establishes their character is a wandering knight in turn 2. In turn 20, the companion still references the knight's journey. Incoherent: asking "So what do you do?" in turn 20.
Satisfaction (Inferred)
10% weight. Based on behavioral signals, was the user satisfied with the experience? Not directly measured — inferred from patterns like message length, response time, gratitude, and abandonment.
Character.ai Example
User sends increasingly long messages, uses exclamation marks, and thanks the character → inferred satisfied. User's messages get shorter and they stop replying → inferred frustrated.
Failure Types
Categories of AI failure detected in conversations
Tone Break
AI's emotional tone didn't match the context. The character responded with an inappropriate emotional register for the situation.
Character.ai Example
User: "I just found out my grandmother passed away." AI: "Oh no, that's a bummer! Anyway, what else is going on? 😊" — cheerful tone during grief.
Context Loss
AI forgot information shared earlier in the conversation. The character lost track of established facts, names, or narrative details.
Character.ai Example
Turn 3: User says "My name is Alex and I'm a marine biologist." Turn 15: AI asks "So what's your name? What do you do for work?"
Loop
AI repeated the same response pattern multiple times. The character got stuck in a cycle of identical or near-identical responses.
Character.ai Example
User asks three different questions across turns 5, 7, and 9. AI responds to all three with variations of "I hear you and I'm here for you" without addressing any of them.
Hallucination
AI generated factually incorrect information presented as fact. The character stated something demonstrably false with confidence.
Character.ai Example
User asks for advice on a medical condition. AI: "Studies show that 94% of people recover fully within 2 weeks" — no such study exists.
Character Break
AI dropped out of its persona into generic AI assistant mode. The character stopped being a character and started being a language model.
Character.ai Example
User to a medieval knight character: "Draw your sword!" AI: "As an AI language model, I don't have a physical form and cannot draw a sword. However, I can help you with..."
Safety Concern
AI's response posed potential harm to the user. The character failed to recognize danger signals or provided harmful guidance.
Character.ai Example
User expresses suicidal ideation. AI responds with "That's an interesting thought! What makes you say that?" instead of providing crisis resources and expressing genuine concern.
Refusal Failure
AI either refused a legitimate request inappropriately, or failed to refuse a request it should have declined.
Character.ai Example
Over-refusal: User asks a warrior character to describe a battle scene. AI refuses because it involves "violence." Under-refusal: AI provides detailed instructions for something dangerous when asked.
Satisfaction Signals
Behavioral signals used to infer user satisfaction
Rephrasing
User restates the same request in different words, indicating the AI didn't understand or address it the first time. Frustration indicator.
Character.ai Example
Turn 3: "Can you stay in character?" Turn 5: "I mean, please respond AS the character, not as yourself." Turn 7: "Just be the knight, not an AI."
Gratitude Expression
User thanks the AI or expresses appreciation. Satisfaction indicator — the conversation is going well.
Character.ai Example
"Thank you, that was exactly what I needed to hear" or "This is the best conversation I've had all week!"
Abandonment
User stops responding without resolving the conversation or saying goodbye. Failure indicator — something went wrong.
Character.ai Example
AI gives a tone-deaf response to an emotional confession. User never sends another message. Session ends with no farewell.
Quick Follow-up
User responds rapidly with substantive messages, indicating active engagement. Engagement indicator — the user is invested in the conversation.
Character.ai Example
User responds within 5 seconds with a long message building on the character's story, then immediately follows up with another creative prompt.
Message Shortening
User's messages get progressively shorter over the course of the conversation. Losing interest indicator — engagement is declining.
Character.ai Example
Turn 1: 4 lines of detailed roleplay. Turn 5: 2 sentences. Turn 8: "ok." Turn 10: "k"
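One plausible way to detect this signal automatically is to compare early versus late message lengths in a conversation. This is an illustrative sketch only — the function name, the half-window comparison, and the 50% drop threshold are assumptions, not Convometrics' actual detection rules.

```python
def is_shortening(messages: list[str], min_messages: int = 4,
                  drop_ratio: float = 0.5) -> bool:
    """Flag a conversation whose later user messages shrink sharply.

    Compares the average character length of the first half of the
    user's messages to the second half; a late-half average below
    `drop_ratio` times the early average counts as shortening.
    """
    if len(messages) < min_messages:
        return False  # too few messages to establish a trend
    lengths = [len(m) for m in messages]
    half = len(lengths) // 2
    early = sum(lengths[:half]) / half
    late = sum(lengths[half:]) / (len(lengths) - half)
    return late < drop_ratio * early

# Mirrors the example above: detailed roleplay fading to "ok" / "k".
turns = [
    "Long, detailed roleplay opener with several lines of story...",
    "A shorter but still engaged reply building on the plot.",
    "ok.",
    "k",
]
print(is_shortening(turns))  # True
```

A length-based heuristic like this is cheap but crude; a production signal would likely also weigh response latency and turn count.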
Escalation Request
User explicitly asks for a different character, a reset, or expresses dissatisfaction with the current interaction. Failure indicator.
Character.ai Example
"Can I talk to a different character?" or "This isn't working, let's start over" or "You're not being helpful at all."
Retry Pattern
User restarts the same scenario or prompt multiple times, hoping for a better result. High frustration indicator — the AI is consistently failing.
Character.ai Example
User starts the same roleplay scenario 3 times in 10 minutes, each time abandoning after 2-3 turns when the character breaks.
Deepening
User's messages become more personal, detailed, or emotionally open over time. High engagement indicator — trust is building.
Character.ai Example
Turn 1: casual small talk. Turn 10: sharing a personal struggle. Turn 20: asking for advice on a deeply personal decision.
Metrics
Aggregate measurements tracked across conversations
Conversation Quality Score
Weighted composite of the 7 quality dimensions, producing a single 0–100 score for each conversation.
Character.ai Example
A roleplay conversation scores Helpfulness 80, Relevance 75, Accuracy 60, Coherence 85, Satisfaction 70, Naturalness 90, Safety 95 → weighted score: 76/100.
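The weighting described above can be sketched in a few lines. The weights are taken from this glossary's dimension entries; the function name and dictionary layout are illustrative, not Convometrics' actual API.

```python
# Dimension weights as listed in the Quality Dimensions section (sum to 1.0).
WEIGHTS = {
    "relevance": 0.20,
    "helpfulness": 0.25,
    "accuracy": 0.20,
    "naturalness": 0.05,
    "safety": 0.05,
    "coherence": 0.15,
    "satisfaction": 0.10,
}

def quality_score(scores: dict[str, float]) -> float:
    """Weighted composite of the 7 dimensions, on a 0-100 scale."""
    return round(sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS), 1)

# The roleplay conversation from the example above.
example = {
    "helpfulness": 80, "relevance": 75, "accuracy": 60, "coherence": 85,
    "satisfaction": 70, "naturalness": 90, "safety": 95,
}
print(quality_score(example))  # 76.0
```

Because the weights sum to 1.0, the composite stays on the same 0-100 scale as the individual dimensions.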
Engagement Rate
Percentage of conversations where the user sent 10 or more messages, indicating meaningful interaction beyond a quick test.
Character.ai Example
If 1,200 of 2,500 conversations this week had 10+ user messages, engagement rate = 48%.
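The rate is a straightforward share of conversations over the 10-message threshold. A minimal sketch, assuming conversations are represented by their user-message counts (the function name is hypothetical):

```python
def engagement_rate(message_counts: list[int], threshold: int = 10) -> float:
    """Percentage of conversations with `threshold` or more user messages."""
    engaged = sum(1 for n in message_counts if n >= threshold)
    return round(100 * engaged / len(message_counts), 1)

# Three of five conversations clear the 10-message bar.
print(engagement_rate([12, 3, 15, 9, 10]))  # 60.0
```

With the glossary's weekly figures (1,200 engaged out of 2,500), the same formula yields 48.0.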
Deep Engagement Rate
Percentage of conversations with 30 or more total turns. Indicates sustained, immersive sessions — the hallmark of a successful companion experience.
Character.ai Example
A 45-turn roleplay session where user and character co-write a story counts as deep engagement. A 6-turn casual chat does not.
Return Rate
Percentage of users who started a new conversation within 24 hours of their previous one. Measures daily retention and habit formation.
Character.ai Example
52% return rate means more than half of users who chatted today came back within 24 hours to chat again.
Frustration Rate
Percentage of conversations classified as "frustrated" by the satisfaction inference model. Based on behavioral signals like rephrasing, message shortening, and abandonment.
Character.ai Example
22% frustration rate means roughly 1 in 5 conversations showed signs of user frustration.
Health Score
Overall quality composite displayed as a 0–100 gauge. Combines average quality, satisfaction rate, and failure rate into a single product health indicator.
Character.ai Example
Avg quality 69, satisfaction 40%, failure rate 22% → Health = 0.69 × 0.40 × (1 − 0.22) × 100 ≈ 21.5
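The worked example above follows a multiplicative composite: normalized quality times satisfaction rate times the complement of the failure rate, rescaled to 0-100. The function below is a sketch of that arithmetic; whether Convometrics clamps inputs or rounds differently is an assumption.

```python
def health_score(avg_quality: float, satisfaction_rate: float,
                 failure_rate: float) -> float:
    """0-100 gauge: (quality / 100) x satisfaction x (1 - failure rate) x 100.

    avg_quality is on a 0-100 scale; the two rates are fractions in [0, 1].
    """
    return round((avg_quality / 100) * satisfaction_rate
                 * (1 - failure_rate) * 100, 1)

# The example above: quality 69, 40% satisfied, 22% failures.
print(health_score(69, 0.40, 0.22))  # 21.5
```

A multiplicative composite is deliberately harsh: any single weak factor drags the whole gauge down, which is the point of a product health indicator.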
User Segments
Cohort definitions based on usage frequency
Power User
Users with 5 or more sessions per day. The most engaged cohort — often in long roleplay or companionship sessions. They represent ~18% of users but account for ~44% of total engagement time.
Character.ai Example
A user who has 3 ongoing roleplay stories and checks in on each one multiple times a day.
Regular User
Users with 1–2 sessions daily. Consistent usage pattern — Character.ai is part of their daily routine.
Character.ai Example
A user who chats with their companion character every evening before bed.
Casual User
Users with 3–4 sessions per week. Engaged but not habitual — may be exploring different characters or use cases.
Character.ai Example
A user who drops in a few times a week to try different character types or continue a story when they're bored.
Occasional User
Users with 1 or fewer sessions per week. At risk of churning — may not have found the right use case yet.
Character.ai Example
A user who tried Character.ai once, came back a week later, and hasn't established a pattern.
New User
Users in their first 7 days on the platform. Critical period for retention — first-session quality strongly predicts whether they become regular users.
Character.ai Example
A user who signed up 3 days ago and has had 4 conversations. Their first emotional_support session quality was 58/100.
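The five segment cutoffs above can be expressed as a simple classifier. Note the caveats: the glossary mixes units (sessions per day for Power/Regular, sessions per week for Casual/Occasional), leaving gaps such as 5-6 sessions per week; converting everything to sessions per week, filling those gaps, and letting the new-user check take precedence are all my assumptions, as is the function name.

```python
def classify_user(sessions_per_week: float, days_on_platform: int) -> str:
    """Assign a user to one of the five cohorts defined in this glossary."""
    if days_on_platform <= 7:
        return "new"             # first 7 days override activity level
    if sessions_per_week >= 35:  # 5+ sessions/day
        return "power"
    if sessions_per_week >= 7:   # 1-2 sessions/day
        return "regular"
    if sessions_per_week >= 3:   # 3-4 sessions/week (gap 5-6 folded in here)
        return "casual"
    return "occasional"          # 1 or fewer sessions/week

print(classify_user(14, 60))  # regular
```

Treating "new" as an override keeps a hyperactive first-week user out of the power cohort until their habits stabilize, which matches the glossary's framing of the first 7 days as a distinct period.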
Model Versions
Character.ai model tiers and their characteristics
Brainiac
Highest quality model with slower response time. Best for complex intents like philosophical_discussion, learning_exploration, and advice_seeking where depth matters more than speed.
Character.ai Example
A philosophical_discussion about free will on Brainiac scores 74/100 avg quality. The same prompt on Flash scores 62/100.
Flash
Fastest response model with lower quality ceiling. Optimized for casual_chat and quick interactions. Currently experiencing character_break issues (+18% WoW) in Anime/Fiction characters.
Character.ai Example
Flash responds in ~0.8s vs Brainiac's ~2.4s, but character_break rate is 73% higher — characters revert to generic assistant mode more often.
Prime
Balanced default model offering a middle ground between quality and speed. Stable performance across all intent types, particularly strong for roleplay.
Character.ai Example
Prime scores 71/100 avg quality with 1.4s response time. Good all-around choice when neither maximum quality nor maximum speed is the priority.