Updating AntConc for spoken and educational language data

Start date: 1 November 2018
End date: 31 January 2019
Funder: Japan Society for the Promotion of Science (JSPS)
Value: £9000
Primary investigator: Dr Love
External co-investigators: Prof Laurence Anthony

About the project

In corpus linguistics, a range of online and offline tools is available to researchers for the computational analysis of language data. These tools are designed primarily for the analysis of written rather than spoken language: corpus linguistics has traditionally focused on written language because it is easier to gather and convert into a digital format for corpus analysis.

I and other language researchers at the University of Leeds regularly use tools such as these to analyse written and spoken data, including educational materials, interview transcripts and transcripts of recorded lesson observations. However, the writing-centric nature of these tools means that analysing spoken data is more complex, more time-consuming, and ultimately more expensive than analysing written data.

As a result, the full potential of analysing spoken data cannot always be reached.

Prof Anthony is the creator and developer of AntConc – a freeware, multiplatform, corpus analysis toolkit for concordancing and text analysis – which is one of the most widely used corpus tools in the world.

In this project, Dr Love and Prof Anthony worked together to add new search and processing functions to AntConc, making the processing of spoken language data easier and more effective.

Publications and outputs

Love, R. & Anthony, L. (in preparation). A case for improving the textual and sub-textual analysis of corpora.