## Overview
A lightweight model for detecting hallucinations in RAG pipeline outputs. Rather than performing simple binary classification, it uses a claim-by-claim decomposition approach inspired by FActScore: the answer is decomposed into individual atomic claims, and each claim is verified against the source context as Supported, Unsupported, or Contradicted.
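The decompose-then-verify flow can be sketched as follows. This is a minimal illustration, not the model's actual implementation: the naive sentence split and substring check stand in for the two model calls (claim extraction and per-claim verification), and all names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Literal

Verdict = Literal["Supported", "Unsupported", "Contradicted"]

@dataclass
class ClaimResult:
    claim: str
    verdict: Verdict

def decompose(answer: str) -> list[str]:
    # Stand-in for model-based claim extraction: naive sentence split.
    return [s.strip() for s in answer.split(".") if s.strip()]

def verify_claim(claim: str, context: str) -> Verdict:
    # Stand-in for the model's verification call: a trivial substring check.
    return "Supported" if claim.lower() in context.lower() else "Unsupported"

def check_answer(answer: str, context: str) -> list[ClaimResult]:
    # An answer is flagged as hallucinated if any claim fails verification.
    return [ClaimResult(c, verify_claim(c, context)) for c in decompose(answer)]
```

The per-claim verdicts can then be aggregated however the pipeline requires, e.g. flagging the whole answer if any claim is Unsupported or Contradicted.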
## Key Results
### Benchmark Evaluation (500 samples, 10 benchmarks)
| Method | Accuracy | Hallu-F1 | Faith-F1 |
|---|---|---|---|
| Qwen3.5-9B (base) | 83.0% | 0.757 | 0.869 |
| Qwen3.5-9B + LoRA | 81.6% | 0.774 | 0.845 |
| GPT-5.4 | 69.8% | 0.691 | 0.705 |
The LoRA model achieved the highest Hallu-F1 (0.774).
### Claude 4.6 Agreement
| Method | Agreement | False Positives | False Negatives |
|---|---|---|---|
| Qwen3.5-9B (base) | 75.0% | 113 | 12 |
| Qwen3.5-9B + LoRA | 89.6% | 48 | 4 |
| GPT-5.4 | 91.0% | 4 | 41 |
A 9B model trained in 22 minutes achieved 89.6% agreement with Claude 4.6, nearly matching GPT-5.4's 91.0%.
### Per-Benchmark Performance
| Benchmark | Accuracy |
|---|---|
| RAGBench-HotpotQA | 95.1% |
| FaithEval-CF | 93.3% |
| RAGBench-MSMARCO | 89.8% |
| HaluEval | 87.5% |
| RAGBench-FinQA | 87.0% |
| HaluBench | 84.1% |
Although trained only on RAGTruth data, the model generalizes to 80%+ accuracy on external benchmarks.
## Architecture
- Base model: Qwen3.5-9B (Self-Attention + Mamba hybrid, 48 layers)
- Fine-tuning: LoRA (rank 16, alpha 32), ~100M trainable params (1.1% of total), 510MB adapter
- Training: 22 minutes on an RTX 5090, 3 epochs, 980 balanced samples
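The trainable-parameter count follows directly from the LoRA rank. A rank-r adapter on a weight matrix of shape (d_out, d_in) adds two low-rank factors, A (r, d_in) and B (d_out, r). The dimensions below are illustrative assumptions, not the actual Qwen3.5-9B layer shapes:

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA replaces the frozen update dW with B @ A, so the trainable
    # parameters per adapted matrix are r*d_in (for A) plus d_out*r (for B).
    return rank * (d_in + d_out)

# Example with an assumed 4096x4096 projection at rank 16:
# 16 * (4096 + 4096) = 131,072 trainable params per adapted matrix.
per_projection = lora_params(4096, 4096, 16)
```

Summing this over the adapted projections in all 48 layers yields the ~100M trainable parameters (about 1.1% of the 9B total) reported above.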
## Data Quality Matters
| Version | Method | Accuracy |
|---|---|---|
| v2 (label leakage) | Ground truth in prompts | 61.1% |
| v3 (clean) | Label-free analysis | 82.1% |
Same model, same code: a 21-percentage-point difference from data quality alone.
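The v2 failure mode can be made concrete with a sketch of prompt construction. This is a hypothetical reconstruction of the bug, not the project's actual prompt template: the point is simply that the ground-truth label must never appear anywhere in the prompt the model sees.

```python
from typing import Optional

def build_prompt(context: str, answer: str, label: Optional[str] = None) -> str:
    prompt = (
        "Context:\n" + context + "\n\n"
        "Answer:\n" + answer + "\n\n"
        "Verify each claim in the answer against the context."
    )
    # v2-style leakage (do NOT do this): appending the ground-truth label
    # teaches the model to copy it instead of analyzing the answer.
    if label is not None:
        prompt += f"\n\nGround truth: {label}"
    return prompt
```

The clean v3 pipeline corresponds to always calling `build_prompt(context, answer)` with no label argument, for both training and evaluation.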
## Deployment
- 4-bit quantization plus the 510MB adapter fits in 16GB VRAM
- GGUF Q4 can reduce this to 6-8GB
- Inference: ~4.3s/sample (batch 8)
- Suited for async verification pipelines
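At ~4.3s per sample, verification is too slow to sit on the response path, but it drops naturally into a background pipeline. A minimal asyncio sketch, with `verify` as a hypothetical stand-in for the batched model call:

```python
import asyncio

async def verify(sample: str) -> str:
    # Stand-in for the real model inference (~4.3s/sample at batch size 8).
    await asyncio.sleep(0)
    return "Supported"

async def verify_all(samples: list[str], batch_size: int = 8) -> list[str]:
    # Process samples in fixed-size batches, running each batch concurrently.
    results: list[str] = []
    for i in range(0, len(samples), batch_size):
        batch = samples[i:i + batch_size]
        results += await asyncio.gather(*(verify(s) for s in batch))
    return results
```

In a real deployment the verdicts would be written back asynchronously (e.g. to flag answers for review) rather than blocking the user-facing response.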