Eugene Yan: If you were building a Q&A feature (or chatbot) based on very long documents (like books), what evals would you focus on?

Eugene Yan eugeneyan.com
If you were building a Q&A feature (or chatbot) based on very long documents (like books), what evals would you focus on?
Apr 9, 2025 01:48

Pamper Me Network pampermenetwork.bsky.social · Apr 15
Great question! I’d focus on:
• Context retention – Can the bot reference key details across long spans?
• Answer accuracy – Especially for nuanced or indirect questions.
• Faithfulness – Does it stick to the source or hallucinate?
• Latency – Long docs can slow things down. Speed matters.
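
To make two of these concrete, here is a rough Python sketch of a faithfulness check and a latency measurement. It is only an illustration under stated assumptions: `support_ratio` and `timed_answer` are hypothetical helper names, and token overlap is a crude lexical stand-in for faithfulness (a real eval would more likely use an entailment model or an LLM judge).

```python
from __future__ import annotations

import re
import time
from typing import Callable


def _tokens(text: str) -> set[str]:
    # Lowercase alphanumeric tokens; deliberately simplistic.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_ratio(answer: str, source: str) -> float:
    """Fraction of distinct answer tokens that also appear in the source.

    Hypothetical proxy for faithfulness: a low value suggests the answer
    uses words the source passage never mentions.
    """
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & _tokens(source)) / len(answer_tokens)


def timed_answer(answer_fn: Callable[[str], str], question: str) -> tuple[str, float]:
    """Call the Q&A function and return (answer, elapsed seconds)."""
    start = time.perf_counter()
    answer = answer_fn(question)
    return answer, time.perf_counter() - start
```

Here `answer_fn` stands for whatever callable wraps the bot. In practice you would run both helpers over a fixed question set and look at the distribution (for example median and tail latency) rather than single values.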
