Many text corpora contain linguistic annotations, representing POS tags, named entities, syntactic structures, semantic roles, and so forth. NLTK provides convenient ways to access several of these corpora, and has data packages containing corpora and corpus samples, freely downloadable for use in teaching and research. 1.2 lists some of the corpora. For information about downloading them, and for more examples of how to access NLTK corpora, please consult the Corpus HOWTO at
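As a rough illustration of what accessing an annotated corpus can look like, the sketch below counts POS tags in a list of (word, tag) pairs. The helper names `tag_frequencies` and `brown_news_tags` are hypothetical, and the choice of the Brown corpus "news" category is only an assumption made for the example. It presumes NLTK is installed and the `brown` data package has already been downloaded; if either is missing, it falls back to a tiny hand-made sample.

```python
from collections import Counter
from itertools import islice


def tag_frequencies(tagged_words):
    """Count how often each POS tag occurs in an iterable of (word, tag) pairs."""
    return Counter(tag for _word, tag in tagged_words)


def brown_news_tags(limit=1000):
    """Return the first `limit` (word, tag) pairs from the Brown 'news' category.

    Assumption: NLTK is installed and the 'brown' data package was
    downloaded beforehand, for example with nltk.download('brown').
    """
    from nltk.corpus import brown  # imported lazily so the module loads without NLTK

    return list(islice(brown.tagged_words(categories="news"), limit))


if __name__ == "__main__":
    try:
        sample = brown_news_tags()
    except (ImportError, LookupError):
        # NLTK or the corpus data is not available: use a hand-made sample instead.
        sample = [("The", "AT"), ("jury", "NN"), ("said", "VBD")]
    print(tag_frequencies(sample).most_common(5))
```

Keeping the counting step separate from the corpus loading means the same function can be pointed at any other tagged corpus that yields (word, tag) pairs.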

Perhaps the single most popular tool used by linguists for managing data is Toolbox, previously known as Shoebox since it replaces the field linguist's traditional shoebox full of file cards. Toolbox is freely downloadable from

First off, download the .tar.* file, and save it. Don't open it. (In these examples, I'll be installing the Dropbox Beta build, because I was going to install it anyway, so I figured that I might as well document the installation.)

It is generally not advised to download and install applications from files found on the internet. Most applications for Ubuntu are available through the "Ubuntu Software Center" on your system (for example, K3B). Installing from the Software Center is much more secure, much easier, and will allow the app to get updates from Ubuntu.

The best way is to download the tar.bz2 or tar.gz package to your system first. Next, right-click the file and select "Extract" to decompress it. Open the folder you extracted, look for the Readme file, and double-click to open it. Then follow its instructions to install that particular package, because a package may have its own installation steps, and the usual routine may produce errors if you ignore them.
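For anyone who prefers a script to the file manager, the extract-then-read-the-Readme steps can be sketched in Python with the standard `tarfile` module. This is only an illustration: the function name `extract_and_find_readme` is made up for this example, and it assumes the Readme file's name starts with "readme" in some letter case.

```python
import tarfile
from pathlib import Path


def extract_and_find_readme(archive_path, dest_dir):
    """Extract a .tar.gz/.tar.bz2 archive and return any Readme files found in it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # "r:*" lets tarfile detect the compression (gzip, bzip2, xz) by itself.
    with tarfile.open(archive_path, "r:*") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions offer a filter that rejects unsafe members
            # (absolute paths, links pointing outside the destination, ...).
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)

    # Assumption: the Readme's file name begins with "readme" (any letter case).
    return sorted(
        p for p in dest.rglob("*")
        if p.is_file() and p.name.lower().startswith("readme")
    )
```

Calling it with the downloaded archive and an empty target folder returns the paths of the Readme files, which can then be opened and read before installing anything.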

I could not find anything like this in the apt-get manual page. The closest I found was the --download-only switch, but this puts the package in /var/cache/apt/archives (which requires root permissions) and not in the current directory.




