Role Description
Mercor is collaborating with a leading AI lab on a short-term project focused on improving preference ranking models for conversational AI systems. We’re seeking detail-oriented generalists—ideally with prior experience in data labeling or content evaluation—to assess and rank model outputs across a variety of domains. This opportunity is well-suited for professionals comfortable with nuanced judgment tasks and working independently in a remote setup.
Key Responsibilities
- Evaluate and compare AI-generated responses based on quality, coherence, and helpfulness
- Assign preference rankings to pairs or sets of model outputs
- Follow detailed labeling guidelines and adjust based on evolving criteria
- Provide brief written explanations for ranking decisions when required
- Flag edge cases or inconsistencies in task design or model output
Qualifications
- Prior experience in data labeling, content moderation, or preference ranking tasks
- Excellent critical thinking and reading comprehension skills
- Comfort working with evolving guidelines and ambiguity
- Strong attention to detail and consistency across repetitive tasks
- Availability for regular part-time work on a weekly basis
Requirements
- Remote and asynchronous; set your own hours
- Expected commitment: 10–20 hours/week
- Flexible workload depending on your availability and performance
Benefits
- $25–35/hour depending on experience and location
- Payments issued weekly via Stripe Connect
- This is a freelance engagement; you'll be classified as an independent contractor
Application Process
- Submit your resume to get started
- Complete a short form to highlight your relevant experience
- You may be asked to complete a brief assessment to evaluate task fit
- Expect a response within 3–5 business days