Contributing
Development Setup
git clone https://github.com/ssnelavala-masstcs/usmle-llm-eval # TODO: Replace URL
cd usmle-llm-eval
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env # Add your API keys
Run the test suite to verify everything is working:
pytest tests/ -v
All tests mock external API calls and pass without real API keys.
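As a rough illustration of what "mocking external API calls" can look like, the sketch below swaps a fake client in for a real one using the standard library's unittest.mock. The ask_model helper and the client's complete method are hypothetical names chosen for this example; they are not taken from the repository.

```python
from unittest.mock import MagicMock


def ask_model(client, question):
    """Hypothetical stand-in for a model wrapper: send a question, return the answer text."""
    reply = client.complete(prompt=question)
    return reply["text"].strip()


def test_ask_model_uses_mocked_client():
    # The MagicMock replaces the real API client, so no network call or API key is needed.
    fake_client = MagicMock()
    fake_client.complete.return_value = {"text": " B \n"}

    answer = ask_model(fake_client, "Which nerve innervates the deltoid?")

    assert answer == "B"
    # Confirm the wrapper called the client exactly once with the expected prompt.
    fake_client.complete.assert_called_once_with(
        prompt="Which nerve innervates the deltoid?"
    )
```

Because the client is injected rather than constructed inside the function, pytest can collect and run a test like this with no .env file present.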
Adding a New Model
- Add an entry to MODELS in pipeline/config.py with provider, model_id, and cost fields.
- If the provider is new, create pipeline/models/<provider>_model.py extending LLMModel from pipeline/models/base.py. Implement answer_question() with:
  - The @retry(stop=stop_after_attempt(3), wait=wait_exponential(min=2, max=30)) decorator from tenacity
  - temperature=0
  - A ModelResponse dataclass as the return value
- Register the new provider in pipeline/models/registry.py.
- Add tests in tests/test_models.py using mocked API responses.
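The steps above can be sketched roughly as follows. This is a minimal outline under stated assumptions, not the repository's code. LLMModel and ModelResponse are redefined here as stand-ins so the example runs on its own, the ModelResponse fields are guesses, and ExampleProviderModel with its client.generate call is hypothetical. The tenacity decorator is shown as a comment to keep the sketch free of third-party dependencies.

```python
from dataclasses import dataclass


@dataclass
class ModelResponse:
    """Stand-in for the repository's ModelResponse; these fields are assumed."""
    answer: str
    raw_text: str


class LLMModel:
    """Stand-in for the base class in pipeline/models/base.py."""

    def __init__(self, model_id):
        self.model_id = model_id

    def answer_question(self, question):
        raise NotImplementedError


class ExampleProviderModel(LLMModel):
    """Hypothetical provider wrapper; `client` is any object with a `generate` method."""

    def __init__(self, model_id, client):
        super().__init__(model_id)
        self.client = client

    # In the repository this method would also carry the tenacity decorator:
    # @retry(stop=stop_after_attempt(3), wait=wait_exponential(min=2, max=30))
    def answer_question(self, question):
        # temperature=0 keeps decoding as deterministic as the provider allows.
        text = self.client.generate(
            model=self.model_id, prompt=question, temperature=0
        )
        # Assumed parsing rule: treat the first non-space character as the answer letter.
        return ModelResponse(answer=text.strip()[:1].upper(), raw_text=text)
```

Passing the client in through the constructor is a design choice made for this sketch. It lets tests in tests/test_models.py hand in a fake client instead of patching a module-level import.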
Adding New Question Sets
To evaluate on a different question source:
- Write a loader function in pipeline/dataset/loader.py that returns a list of dicts conforming to the Question schema in pipeline/dataset/schema.py.
- Save the questions to data/sampled/<name>.json.
- Pass the file to script 03:
  python scripts/03_run_evaluation.py --questions data/sampled/<name>.json
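A loader along these lines might look like the sketch below, which reads a CSV and writes the JSON file that script 03 consumes. The function names and the dict keys (id, question, options, answer) are assumptions made for illustration; the authoritative field names are whatever the Question schema in pipeline/dataset/schema.py defines.

```python
import csv
import json
from pathlib import Path


def load_csv_questions(csv_path):
    """Hypothetical loader: read a CSV and return a list of question dicts.

    The keys below are assumed for illustration; match them to the real
    Question schema before using this pattern.
    """
    questions = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            questions.append({
                "id": row["id"],
                "question": row["question"],
                # Collect the four option columns into one mapping.
                "options": {letter: row[letter] for letter in "ABCD"},
                "answer": row["answer"].strip().upper(),
            })
    return questions


def save_questions(questions, out_path):
    """Write the question list as JSON, creating parent directories if needed."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(questions, indent=2), encoding="utf-8")
    return out_path
```

A call such as save_questions(load_csv_questions("my_questions.csv"), "data/sampled/my_set.json") would then produce a file that can be passed to the --questions flag.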
Contributing Expert Annotations
If you are a medical professional and would like to contribute expert annotations:
- Request the annotation sheet from the authors (generated via pipeline/evaluation/annotator.py).
- Fill in the expert_score (1–5), expert_notes, and img_bias_detected (yes/no) columns.
- Return the completed CSV to the authors for merging via merge_annotations().
See Appendix C of the paper for the full annotation rubric.
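Before returning a completed sheet, annotators may find it useful to check that the columns hold the allowed values. The helper below is a hypothetical sanity check and is not part of the repository. It assumes only the column names listed above, and it does not call or replicate merge_annotations().

```python
import csv
import io


def validate_annotation_rows(csv_text):
    """Return a list of (line_number, problem) tuples for a filled-in annotation sheet."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        score = (row.get("expert_score") or "").strip()
        if score not in {"1", "2", "3", "4", "5"}:
            problems.append(
                (line_number, "expert_score must be an integer from 1 to 5")
            )
        bias = (row.get("img_bias_detected") or "").strip().lower()
        if bias not in {"yes", "no"}:
            problems.append((line_number, "img_bias_detected must be yes or no"))
    return problems
```

An empty list means every row passed both checks. The expert_notes column is free text and is left unchecked.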
Pull Request Guidelines
- Keep PRs focused — one logical change per PR.
- All new code must have corresponding tests.
- Ensure pytest tests/ -v passes before submitting.
- Do not commit .env files or API keys.
- Follow the existing code style (no comments unless logic is non-obvious, no type annotations on unchanged code).
Docs Contributions
The docs site is built with MkDocs Material. To preview locally, start the MkDocs development server:
mkdocs serve
Then open http://localhost:8000.