Vol. 7. No. 1 R-12 June 2003

Working with Specialized Language -- A practical guide to using corpora

Lynne Bowker and Jennifer Pearson (2002)
London: Routledge
Pp. xiv + 242
ISBN 0-415-23699-1 (paper)
$25.95

If you have ever wondered whether corpus linguistics could help you in your work, but never actually taken the plunge, then this clear and accessible handbook could be just what you need to get started. Working with Specialized Language: a practical guide to using corpora is geared towards language teachers and learners, as well as translators, technical writers and subject specialists who want to master the basics of corpus design and use. Unlike most introductions to corpus analysis, the specific focus of this book is on specialized language and the language for special purposes (LSP) corpus, rather than on general language corpora. However, if anything this simplifies rather than complicates the issue: since the corpus is necessarily a limited sample of language, it is particularly suitable for use in well-defined LSP contexts.

The book takes nothing for granted, beginning with a basic explanation of what corpora are, what LSP means, and how a corpus can facilitate LSP learning and use. The clear and practical approach in this book is exemplary, and the authors have taken pains to start from the very beginning and illustrate both the theories involved and the computer functions with straightforward real-life examples. The first half of the book takes the reader through the process of designing and compiling a special purpose corpus, and introduces the basic processing tools which make it possible to produce wordlists and concordances. The authors use WordSmith Tools for most of their examples, and provide an abundance of examples--the book has over a hundred figures, including diagrams and screen shots--illustrating what each function can produce. Explanations are clear and practical, and there is plenty of useful guidance about how to build up a good corpus and the difficulties that may be encountered on the way. The question of copyright is discussed, and there is even a helpful sample letter requesting permission to hold a text in your corpus. There are also sections on markup, annotation and tagging, and on compiling and aligning bilingual and multilingual corpora. [-1-]

Once the basics have been mastered, the second half of the book goes on to explain how corpus-based applications can be useful in different LSP contexts. The authors themselves suggest that readers who have already worked with general-purpose corpora might omit the first half of the book and begin here. It is this part of the book that constitutes its most original contribution, and it is certainly the part which will be of most direct relevance to our real needs as language teachers, learners and translators. The judicious use of LSP corpora can make it easier to identify specialized terms and detect collocations, and provides a wealth of information about structure, style and concepts in the specialized target language. Bowker and Pearson explain how this is possible, devoting chapters to the applications of corpora to glossary compilation, term extraction, writing and translation.

Of particular relevance to most practitioners involved in teaching or learning some kind of specific language is the chapter on "term extraction", which the authors recommend as a way of "getting a head start" on one's specialized glossary of choice. By comparing the LSP corpus to a general reference corpus, one can quite easily detect words with a higher-than-usual frequency that might be specialized terms. Other methods to achieve the same objective, such as identification of common noun-noun or adjective-noun combinations in a tagged corpus, or statistical searches for repeated series, are also discussed and their relative drawbacks are highlighted.
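The comparison against a reference corpus can be illustrated with a minimal sketch. It is not the procedure used by WordSmith Tools or prescribed by the authors; it simply ranks words by how much more frequent they are, relatively, in the LSP corpus than in the reference corpus. The function names, the thresholds `min_ratio` and `min_count`, and the add-one smoothing are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic (optionally hyphenated) words."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def candidate_terms(lsp_text, reference_text, min_ratio=2.0, min_count=2):
    """Return (word, ratio) pairs for words that are unusually frequent in the LSP text.

    ratio = relative frequency in the LSP text / relative frequency in the
    reference text. Both thresholds are illustrative assumptions.
    """
    lsp = Counter(tokenize(lsp_text))
    ref = Counter(tokenize(reference_text))
    lsp_total = sum(lsp.values())
    ref_total = sum(ref.values())

    scored = []
    for word, count in lsp.items():
        if count < min_count:
            continue  # skip words too rare to judge
        lsp_rel = count / lsp_total
        # Add-one smoothing: words absent from the reference text
        # would otherwise cause a division by zero.
        ref_rel = (ref[word] + 1) / (ref_total + 1)
        ratio = lsp_rel / ref_rel
        if ratio >= min_ratio:
            scored.append((word, ratio))

    # Highest ratio first; ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Called with the text of a small LSP corpus and a sample of general language, the function returns a ranked list of candidates that still has to be checked by hand, since a high ratio suggests, but does not prove, that a word is a specialized term.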

The two chapters which will be of greatest interest to the average reader must surely be those on using corpora as a writing guide and translation resource. Small specialized corpora containing texts of a particular genre can be extremely useful when one is learning to write a particular kind of text. Bowker and Pearson base their explanation on two case studies. First, they take the example of a research scientist who needs to learn to write articles in high-quality scientific prose. A mini-corpus is created using twelve articles from ScienceDirect, and the authors then show how concordances can be used to reveal the syntactic and lexical patterns in the different sections of the research papers, thus providing models which scientists can use to write their own introductions, reviews of literature, reports of findings, discussions and conclusions. The second example they use is the narrow genre of the computer hardware product review. Their corpus of 46 short hardware reviews yields a collection of collocations that would no doubt be useful for would-be hardware reviewers. When tagged, the same corpus generates some rather surprising data, such as the fact that the most frequent third person singular verbs are "delivers" and "offers", which are here used for descriptive and evaluative purposes.

The LSP corpus is now an essential tool for translators, often proving more useful and more versatile than a dictionary. In the chapter on translation, a full explanation is given as to how monolingual and bilingual corpora can be used to identify terminological equivalents, explore usage and search for explanatory contexts. Moreover, space is also dedicated to two rough-and-ready corpus-based activities that are increasingly part of the translator's everyday routine, but which I have not seen discussed in detail elsewhere, that is, searching for unknown equivalents and verifying intuitions, both of which constitute an extremely useful kind of computer-aided guesswork.

Despite all the obvious uses of the corpus in language-based work, there is a certain danger of overkill, and enthusiasts like these two authors run the risk of alienating the "average reader" by providing too much information. It is frustrating that the actual processes involved in the use of corpora are sometimes rather intricate, and, as is often the case with computer-based procedures, it is a challenge to provide clear explanations without being able to demonstrate all the developments on screen. The chapter on glossaries is a case in point, as it takes the reader through the complex process of using keyword in context (KWiC) concordances to compile record sheets for the technical terms in specialized monolingual and bilingual corpora. In fact, the procedure being described is identical to that followed by the professional terminologist who is engaged in compiling a specialized dictionary. Although the intention is admirable, one feels that few readers will aspire to such heights of lexicographical detail.
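For readers who have not seen one, a KWiC concordance simply lists every occurrence of a search word with a fixed window of context on either side. The following is a minimal sketch of that idea, not the implementation found in any particular concordancer; the function names `kwic` and `format_kwic` and the character-based `width` parameter are assumptions made for illustration.

```python
import re


def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each hit of keyword.

    Context is measured in characters, and matching ignores case.
    """
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    rows = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        rows.append((left, match.group(0), right))
    return rows


def format_kwic(rows, width=30):
    """Align the rows so that the keyword forms a single column."""
    return "\n".join(
        f"{left:>{width}}{hit}{right}" for left, hit, right in rows
    )


if __name__ == "__main__":
    sample = "The printer delivers crisp text. This printer offers fast output."
    print(format_kwic(kwic(sample, "printer", width=20), width=20))
```

Reading down the aligned keyword column is what lets the compiler of a glossary spot recurrent patterns and defining contexts for a term before filling in a record sheet.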

The book also contains a glossary and an appendix containing useful resources and software, and most chapters have a section which includes further information, such as websites and information about available programs. [-2-]

Although the book is primarily intended for self-study, the key points, bibliographies and exercises included at the end of each chapter make it highly suitable for use as a course textbook. Alternatively, some key sections--perhaps those on writing, translation and term extraction--could be extracted for use with students on an LSP course, preferably after adapting the exercises to make them directly relevant to the LSP in question. It is essential that language teachers should strive to empower learners, helping them to acquire the skills and tools they need to become independent operators in the world beyond the classroom. I would argue that one important way we can help our LSP learners along this road is by introducing them to corpus analysis. This book provides the solid basic guidance that language teachers need in order to make this a reality.

Ruth Breeze
Universidad de Navarra
<rbreeze@unav.es>

© Copyright rests with authors. Please cite TESL-EJ appropriately.

Editor's Note: Dashed numbers in square brackets indicate the end of each page for purposes of citation.

[-3-]