The Vedic Treebank is a new treebank of Vedic Sanskrit annotated according to the Universal Dependencies standard, with minor revisions. It contains texts belonging to the Vedic literature, covering both metrical and prose texts and spanning from the R̥gveda to the early Upaniṣads.

The Vedic Treebank is a valuable resource for historical linguists, computational linguists, philologists, and Sanskrit scholars.
The current version can be downloaded from the GitHub repository in CoNLL-U format. It can be queried with processing and visualization tools provided by Universal Dependencies, such as TrEd, Udapi, and CoNLL-U Viewer; see https://universaldependencies.org/tools.html
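
For readers who prefer to inspect the CoNLL-U file directly rather than through one of these tools, the following is a minimal sketch using only the Python standard library. It relies on the standard ten tab-separated CoNLL-U columns. The sample sentence, the field names in FIELDS, and the helper names read_conllu and deprel_counts are illustrative assumptions and are not part of the treebank or of any Universal Dependencies tool.

```python
from collections import Counter

# Hypothetical two-token sample in CoNLL-U layout (made up for illustration,
# not taken from the Vedic Treebank).
SAMPLE = (
    "# sent_id = example-1\n"
    "# text = devaḥ gacchati\n"
    "1\tdevaḥ\tdeva\tNOUN\t_\tCase=Nom|Number=Sing\t2\tnsubj\t_\t_\n"
    "2\tgacchati\tgam\tVERB\t_\tNumber=Sing|Person=3\t0\troot\t_\t_\n"
    "\n"
)

# Names for the ten CoNLL-U columns, in file order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def read_conllu(text):
    """Yield each sentence as a list of token dicts keyed by FIELDS."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if tokens:
                yield tokens
                tokens = []
        elif line.startswith("#"):
            # Sentence-level comment (sent_id, text, ...): skipped here.
            continue
        else:
            cols = line.split("\t")
            if len(cols) == len(FIELDS):
                tokens.append(dict(zip(FIELDS, cols)))
    # Emit the last sentence if the text lacks a trailing blank line.
    if tokens:
        yield tokens


def deprel_counts(text):
    """Count dependency relations over all ordinary tokens."""
    counts = Counter()
    for sentence in read_conllu(text):
        for tok in sentence:
            # Skip multiword ranges ("1-2") and empty nodes ("1.1").
            if tok["id"].isdigit():
                counts[tok["deprel"]] += 1
    return counts


if __name__ == "__main__":
    for rel, n in deprel_counts(SAMPLE).most_common():
        print(rel, n)
```

To run the same count on the downloaded treebank, read the file as UTF-8 text (for example, open("sanskrit.conllu", encoding="utf-8").read(), assuming that is the local file name) and pass the result to deprel_counts.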

Contacts

Oliver Hellwig, University of Zurich: hellwig7@gmx.de

Credits

  • The Vedic Treebank is an ongoing project spearheaded by Oliver Hellwig and Sven Sellmer (University of Düsseldorf).
  • Other contributors were Prof. Paul Widmer (University of Zurich), Salvatore Scarlata (University of Zurich), Erica Biagetti (University of Pavia), and Elia Ackermann (University of Zurich).

Publications

  • Hellwig, Oliver, Salvatore Scarlata, Elia Ackermann, and Paul Widmer. 2020. The Treebank of Vedic Sanskrit. In Proceedings of the 12th Language Resources and Evaluation Conference, pp. 5137-5146.
  • Biagetti, Erica, Oliver Hellwig, Salvatore Scarlata, Elia Ackermann, and Paul Widmer. (submitted). The Vedic Treebank Reloaded. International Journal of Corpus Linguistics.

Links

  • Guidelines: https://github.com/OliverHellwig/sanskrit/blob/master/papers/2020lrec/paper/guidelines.pdf
  • GitHub Repository: https://raw.githubusercontent.com/OliverHellwig/sanskrit/master/papers/2020lrec/treebank/sanskrit.conllu
  • Digital Corpus of Sanskrit: http://www.sanskrit-linguistics.org/dcs/index.php