SyntaxGym is a unified platform where language and NLP researchers can evaluate the performance of language models on targeted syntactic tests. Our goal is to make psycholinguistic assessment of language models more standardized, reproducible, and accessible to a wide variety of researchers. The project is run out of the MIT Computational Psycholinguistics Laboratory.
There are two primary components of SyntaxGym: test suites and language models.
SyntaxGym represents targeted syntactic evaluation experiments as test suites. Each test suite evaluates language models’ knowledge of a particular grammatical phenomenon. Our standardized format for test suites is described in the SyntaxGym core documentation.
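To make the idea of a targeted syntactic test concrete, here is a minimal sketch in Python of one common setup: comparing a model's surprisal at a critical region across a grammatical and an ungrammatical condition. Everything in it is an assumption made for illustration. The in-memory suite layout, the function names, and the toy bigram "model" are hypothetical, and none of it reflects the official test suite format or the SyntaxGym API.

```python
from typing import Callable, Dict, List

# Hypothetical layout (NOT the official SyntaxGym schema): each item maps a
# condition name to the same sentence split into aligned regions.
Item = Dict[str, List[str]]
SurprisalFn = Callable[[List[str]], List[float]]


def region_surprisal(model: SurprisalFn, regions: List[str], region_idx: int) -> float:
    """Sum the per-token surprisal over the tokens of one region."""
    tokens = [tok for region in regions for tok in region.split()]
    per_token = model(tokens)
    start = sum(len(r.split()) for r in regions[:region_idx])
    end = start + len(regions[region_idx].split())
    return sum(per_token[start:end])


def accuracy(suite: List[Item], model: SurprisalFn, region_idx: int,
             expected_lower: str, expected_higher: str) -> float:
    """Fraction of items where the predicted surprisal inequality holds."""
    hits = 0
    for item in suite:
        low = region_surprisal(model, item[expected_lower], region_idx)
        high = region_surprisal(model, item[expected_higher], region_idx)
        hits += low < high
    return hits / len(suite)


def toy_model(tokens: List[str]) -> List[float]:
    """Stand-in for a real language model: a tiny bigram lookup that
    assigns a fixed high surprisal to any pair it has not seen."""
    seen = {("dog", "barks"), ("dogs", "bark")}
    surprisals = []
    prev = "<s>"
    for tok in tokens:
        surprisals.append(1.0 if (prev, tok) in seen else 5.0)
        prev = tok
    return surprisals


# Two subject-verb agreement items; region 1 (the verb) is the critical region.
suite = [
    {"match": ["The dog", "barks", "loudly"],
     "mismatch": ["The dog", "bark", "loudly"]},
    {"match": ["The dogs", "bark", "loudly"],
     "mismatch": ["The dogs", "barks", "loudly"]},
]

score = accuracy(suite, toy_model, region_idx=1,
                 expected_lower="match", expected_higher="mismatch")
print(score)
```

In this sketch, an item counts as correct when the verb region is less surprising in the grammatical condition than in the ungrammatical one. A real evaluation would replace the toy model with surprisals from an actual language model.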
Language models are adapted to a standard API and then evaluated on our test suites. To browse the language models currently available on SyntaxGym, please visit the LM Zoo registry.
Users can also add their own language models, which involves the following two steps. First, you will need to build an API-compliant Docker image for your model. Second, you will need to register the model on the SyntaxGym site. A SyntaxGym account is required to add language models.
For more information about the (independent) tools, please read about our Open-source tools.