Portuguese Language Blog

Portuguese Treebank

Posted on Nov 8, 2007 in Grammar

Here’s one for the linguistics enthusiasts out there!

A treebank, according to Wikipedia:

…is a text corpus in which each sentence has been annotated with syntactic structure. Syntactic structure is commonly represented as a tree structure, hence the name treebank. Treebanks can be used in corpus linguistics for studying syntactic phenomena or in computational linguistics for training or testing parsers.

Simple, right?

Seriously though, this is cool stuff. I have sort of a peripheral interest in sociolinguistics and computational linguistics, and though I don’t have much direct use for these tools, I recognize that having a corpus and a treebank for a given language opens up a lot of doors. Say you want to record a conversation, then analyze it syntactically: a parser trained on a treebank would let you feed in a transcription of the recording and get its syntactic structure back.
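To make the idea a bit more concrete, here is a rough sketch of what a single treebank entry could look like and how a few lines of Python might read it. The bracketed notation is one common plain-text convention for writing trees; the sentence, the labels (S, NP, VP and so on), and the function names are my own assumptions for illustration, not taken from any particular treebank.

```python
# Hypothetical entry: a made-up sentence with made-up labels,
# written in bracketed notation. Not copied from a real treebank.
BRACKETED = "(S (NP (PRON Eu)) (VP (V falo) (NP (N português))))"


def tokenize(text):
    """Split the bracketed string into parentheses, labels, and words."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into a nested (label, children) structure."""
    token = tokens.pop(0)
    if token != "(":
        return token  # a bare word, i.e. a leaf of the tree
    label = tokens.pop(0)
    children = []
    while tokens[0] != ")":
        children.append(parse(tokens))
    tokens.pop(0)  # discard the closing ")"
    return (label, children)


def leaves(node):
    """Collect the words of the sentence, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def show(node, depth=0):
    """Print the tree with one level of indentation per depth."""
    if isinstance(node, str):
        print("  " * depth + node)
        return
    label, children = node
    print("  " * depth + label)
    for child in children:
        show(child, depth + 1)


tree = parse(tokenize(BRACKETED))
show(tree)
print(" ".join(leaves(tree)))
```

The annotation is the whole point: the plain sentence is still in there (the leaves), but every word and phrase also carries a label that a parser can learn from.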

OK, OK, this is admittedly pretty esoteric stuff. What does this have to do with Portuguese? Good question…

I found this Portuguese Treebank and wanted to share it with you, meus caros leitores (my dear readers).

About the Author: Transparent Language

Transparent Language is a leading provider of best-practice language learning software for consumers, government agencies, educational institutions, and businesses. We want everyone to love learning language as much as we do, so we provide a large offering of free resources and social media communities to help you do just that!