Software and technical resources
AntConc is an easy-to-use, free concordancer; compiles KWIC (key word in context) concordances, word clusters, n-grams, word frequencies, etc.; downloadable from http://www.laurenceanthony.net. The website also contains tutorials on using the concordancer.
WordSmith Tools is lexical analysis software, an integrated suite of programs that compile KWIC (key word in context) concordances, word clusters, n-grams, p-frames, word frequencies, etc.; only available for PC; paid, but a demo version is available free of charge here http://www.lexically.net/wordsmith
MonoConc Pro is a concordance program that provides KWIC concordance results, wordlists, and collocation information; comes with a range of features such as Context Search, Regular Expression Search, Part-of-Speech Tag Search, Collocations, and Corpus Comparison; paid; available here http://www.athel.com/mono.html
MonoConc Easy has many of the features of MonoConc Pro but lacks some of the advanced ones, such as Advanced Sort and Corpus Comparison; intuitive interface; better suited to student use and teaching than to corpus research; paid; available here http://www.athel.com/mono.html
ParaConc is a multilingual concordance program for parallel texts (translations); analyzes up to four languages in parallel; includes collocation tables, word frequency lists, collocation lists, regular expression search, as well as a parallel search option and translation and "hot words" utilities; paid; available here http://www.athel.com/mono.html
The Compleat Lexical Tutor website provides a range of tools for text analysis.
Simple Concordance Program (SCP) is a concordance and word listing program; creates word lists and searches natural language text files for words, phrases, and patterns; available for free here http://www.textworld.com/scp
The Sketch Engine by Adam Kilgarriff and Pavel Rychly is a corpus search engine incorporating word sketches, grammatical relations, and a distributional thesaurus; paid, but a free demo account is available after registration here http://www.sketchengine.co.uk
Custom List Analyzer (CLA) is a simple but powerful text analysis tool that allows users to analyze texts using their own list dictionaries; list dictionaries can be of unlimited length and can consist of words, words with wildcards, and n-grams; available here http://www.kristopherkyle.com/cla.html
Sentiment Analysis and Cognition Engine (SEANCE) is a tool for sentiment analysis; includes 254 core indices and 20 component indices; allows for a number of customized indices including filtering for particular parts of speech and controlling for instances of negation; available here http://www.kristopherkyle.com/seance.html
The Simple Natural Language Processing Tool (SiNLP) is a simple tool that allows users to analyze texts using their own custom dictionaries; for each text processed, it reports the name of the text, the number of words, the number of types, the type-token ratio (TTR), letters per word, the number of paragraphs, the number of sentences, and the number of words per sentence; available here http://www.kristopherkyle.com/sinlp.html
Tool for the Automatic Analysis of Cohesion (TAACO) is a tool that calculates 150 indices of both local and global cohesion, including a number of type-token ratio indices (e.g. for parts of speech, lemmas, bigrams, and trigrams), adjacent overlap indices, and connectives indices; available here http://www.kristopherkyle.com/taaco.html
Tool for the Automatic Analysis of Lexical Sophistication (TAALES) is a tool that measures 135 different indices of lexical sophistication, including indices of frequency, range, academic language, and psycholinguistic word information; includes indices for both single words and n-grams; TAALES indices have been used to inform models of second language (L2) speaking proficiency, first language (L1) and L2 writing proficiency, genre differences, and satirical language; available here http://www.kristopherkyle.com/taales.html
Lexical Complexity Analyzer is designed to automate lexical complexity analysis of English texts using 25 different measures of lexical density, variation and sophistication proposed in the first and second language development literature; available here http://www.personal.psu.edu/xxl13/downloads/lca.html
L2 Syntactic Complexity Analyzer is designed to automate syntactic complexity analysis of written English language samples produced by advanced learners of English using fourteen different measures proposed in the second language development literature; available here http://www.personal.psu.edu/xxl13/downloads/l2sca.html
IntelliText is a web-based corpus tool run by the Centre for Translation Studies, University of Leeds; allows access to monolingual and bilingual corpora for various languages; includes a “Build Your Own Corpus” function that allows users to create and annotate their own corpora; freely available for download or for use on the server http://www.corpus.leeds.ac.uk/itweb/htdocs/Query.html
Constituent Likelihood Automatic Word-tagging System (CLAWS) is a POS tagger developed by UCREL at Lancaster University. The latest version of the tagger, CLAWS4, was used to POS-tag c. 100 million words of the British National Corpus; consistently achieves 96-97% accuracy; can be accessed through the web-based Wmatrix interface; paid.
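
Several of the tools above report the same basic measures: KWIC concordance lines, n-gram and word frequencies, and the type-token ratio. The following is a minimal Python sketch of how these measures can be computed, included only to make the terms concrete. It does not reproduce the implementation of any tool listed here; the tokenizer, the function names, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of letters/apostrophes as tokens.

    This is a deliberately naive tokenizer (an assumption for illustration);
    real concordancers apply more careful rules.
    """
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for each hit of keyword."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Hypothetical sample text, not taken from any corpus.
sample = "The cat sat on the mat. The dog sat on the log."
tokens = tokenize(sample)

word_freq = Counter(tokens)               # word frequency list
bigram_freq = Counter(ngrams(tokens, 2))  # bigram frequency list

# Print each concordance line with the keyword aligned in the middle.
for left, key, right in kwic(tokens, "sat", window=2):
    print(f"{left:>12} [{key}] {right}")

print(word_freq.most_common(3))
print(round(type_token_ratio(tokens), 2))
```

Note that the type-token ratio is sensitive to text length, so values are only comparable across texts of similar size.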