• Application Process
  • Project Team
  • Project Description
  • Institute Expectations and Support

Application Process

  1. Applicants should carefully read our “About” page before filling out this form. If you have questions, please feel free to write to us at ajanco@haverford.edu and nataliae@princeton.edu with "NEWNLP" in the subject line.
  2. The application has two rounds. In the first round, we will evaluate this form. This form is due Sunday, January 10, 2021.
  3. If you make it to the second round, we may ask you for additional information and/or invite you to a 30-minute Zoom interview between January 18 and January 29, 2021.
  4. Final notifications will be made by February 15, 2021.

Project Team

About the Applicant(s)

Applicants may apply as individuals or as pairs. Some pairs will include a language or domain expert working with a data scientist or computational linguist. Other pairs will have two language or domain experts. Team members may be from different institutions. Non-US individuals and those based at non-US institutions are eligible to apply.

Team Member 1

Team Member 2

Project Description

What is the "new" language you are proposing for this workshop?

How might your research benefit from NLP?

Describe any previous experience with text analysis or NLP (no previous experience is necessary to participate).

For the Institute, you will need a collection of machine-readable texts containing 20,000 or more words/tokens in your language (here, machine-readable means OCR'd text, transcribed documents, or born-digital texts). Do you presently have such a corpus? If yes, please describe how it was created. If no, how do you plan to gather these materials?

Please paste an example of text in your language here (100-200 words).

Have your texts been cleaned or post-processed? If not, how do you plan to clean or post-process them before Workshop 1 (June 2021)?

Do you plan to use Library of Congress materials in your project? (Check if "yes"; leave blank if "no".)

Institute Expectations and Support

Time commitment

The Institute is a series of three workshops during the 2021-2022 academic year. Participants will attend all three workshops and complete a variety of tasks between the workshops, including corpus annotation, model training, readings, and monthly 1-hour check-ins with Institute staff. We expect the average time commitment for the full term of the Institute to be approximately 4 hours per week.


Participants should expect to join a remote project kick-off on Friday, May 14, 2021, after which they will start sharing project materials and preparing for Workshop 1.

Workshop 1: Annotation

Workshop 1 is tentatively scheduled for June 21-25, 2021 (remote). Between Workshops 1 and 2, teams will continue to annotate their texts.

Workshop 2: Model Training

Workshop 2 is tentatively scheduled for January 10-13, 2022 (hopefully onsite in Princeton). Between Workshops 2 and 3, teams will work on their models and prepare their presentations.

Workshop 3: Conference and Presentations

Workshop 3 is tentatively scheduled for May 12-13, 2022 (hopefully onsite in Princeton). Attendees should be available until August 2022 for Institute wrap-up activities (reflections, surveys, etc.), which will not exceed 20 hours.

Participant support and funding

Each team member (maximum 2) will receive a $1,000 USD stipend for participating in the Institute. Stipends will be distributed in three installments over the course of the Institute.

Additionally, each team will have a budget of $1,000 USD to pay other team support members (for annotation work or other approved project needs).

For onsite workshops in Princeton, the grant will cover travel, lodging, and per diem. Our grant funding provides up to $400 in travel costs per participant. If necessary, we can discuss travel costs over $400 on a case-by-case basis. We are able to offer a letter of support for U.S. visas.

Please check to confirm that you are available to participate during the grant period from May 2021 to August 2022.

Is there anything you would like to tell us about your availability or needs to participate in this project?