Trust & Safety

AI content moderation platform

High-throughput tone check, HITL escalation, PII masking, and breach alerting for content platforms.

Use this package in Studio →Test in simulator ← All packages

What this package does

Governance guardrails for AI content moderation at scale. Applies tone analysis to every piece of AI-generated or AI-evaluated content, escalates borderline cases to a human moderator, masks PII before it enters moderation logs, and alerts when bulk content volumes approach breach thresholds. Designed for social, community, and UGC platforms running AI moderation pipelines.

Designed for

Social and community platforms with AI moderation
UGC platforms using AI content review
Trust & safety engineering teams

What's included

Tone check on every AI evaluation
HITL escalation within 1 hour for flagged content
PII masking before moderation logs
High-throughput limits: 3000 req/min
Breach alert on bulk content events

Controls in this bundle

Control	Action	Parameters
`rate_per_endpoint`	block	`max_per_minute=3000`
`velocity_per_user`	block	`max_events=1000, window_seconds=3600`
`outbound_tone_check`	review	`action="review", exclamation_threshold=0.05`
`outbound_hitl_escalation`	review	`sla_hours=1`
`pii_redaction`	flag	`strategy="mask"`
`breach_threshold`	review	`max_records=500`
`immutable_log`	flag	`retention_days=365`

Profile tiers

Switch profiles in Studio to retune default thresholds across the whole bundle without rewriting any control by hand. This package ships at balanced , anything you've already tightened by hand is preserved on switch.

Every control in this package uses identical parameters across all four profiles. Switching profile in Studio has no effect here.

Attestation

The canonical hash of these bundle bytes is sha256:530698bfea0edbf0fa2f95d7ca5d35c354ddc4be5a48abf2969631b63e536b76. The same hash is computed at lock time, at Stripe checkout, and again on the runtime side before any byte is honored.

Want to customize first? Opening this package in Studio prefills the canvas with the 7 controls above. You can add, remove, or retune any of them before you lock the hash and pay.

Open in Studio

What this package does

Designed for

Social and community platforms with AI moderation

UGC platforms using AI content review

Trust & safety engineering teams

What's included

Tone check on every AI evaluation

HITL escalation within 1 hour for flagged content

PII masking before moderation logs

High-throughput limits: 3000 req/min

Breach alert on bulk content events