Goal:
A leading tech company partnered with Firstsource to enhance its virtual assistant’s GenAI model using Reinforcement Learning from Human Feedback (RLHF). The goal was to improve accuracy and reliability by training the model on high-quality annotated data, which meant creating and verifying multi-turn conversations across multiple domains within a tight deadline.
How we made it happen:
Our tailored approach ensured precise execution at scale. Here’s how:
Data Diversity and SME Expertise:
Deployed over 1,000 domain-specific subject matter experts (SMEs) from a global talent pool to meet the high-volume demand of approximately 1 million annotations. These SMEs covered eight domains: Physics, Chemistry, Biology, Medicine, Mathematics, Coding, Social Sciences, and Finance.
Domain Expert Content Generation:
Human domain experts wrote multi-turn prompts and model responses, creating diverse conversations for the GenAI model.
Evaluation and Verification:
Each response was rigorously assessed against several criteria to ensure it was harmless, honest, and helpful, as well as factual and accurate.
Ethical AI Practices:
Strict ethical guidelines were followed to ensure all responses were unbiased, truthful, and supportive, backed by a fully customized human-tech quality control mechanism.
Platform and Governance:
Designed a custom UI and established robust governance for data and timeline management, ensuring a seamless project kick-off and efficient delivery.
Burst Capacity Management:
Leveraged domain experts with a focus on burst capacity to effectively manage 1–2-day project stand-ups, ensuring on-time delivery of weekly batches while maintaining quality at scale.