Detailed description
Introduction
This book is for the most part a record of the project in lexical computing which has led to the publication of a new kind of English Dictionary - the Collins Cobuild English Language Dictionary. It was undertaken by a research team in the English Department of the University of Birmingham and is an example of co-operation between academic and industrial expertise. It has been in financial terms the University's largest single research project, and it has required co-ordination of resources within the University going well beyond normal practice.
Work started in 1980 on a feasibility study, swiftly followed by the appointment of a small core team. The starting point was research dating back to 1961 and the earliest academic computing. The plan was to gather a large and representative selection of contemporary English - spoken and written - and put it into machine-readable form. The original corpus thus formed, called the Main Corpus, was over seven million words in length.
The computer sorts the words in various ways, and delivers information on each word to a team of editors and compilers. They study the words and build up an elaborate profile of their meanings and uses in a database, back inside the computer. The database is then the primary source of a family of books which will span many years of editorial work.
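The sorting step described above can be illustrated with a minimal sketch. It is not the Cobuild software, whose design is not described here; the function names, the tokenizing rule and the sample sentence are all assumptions made for illustration. It shows two common ways of presenting words to an editor: a frequency list, and a concordance giving each occurrence of a word with a little context on either side.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every occurrence of keyword."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, keyword, right))
    return lines

# Invented sample text, standing in for a corpus of millions of words.
sample = "I see what you mean. You see, we keep warm and give a talk. I see."
tokens = tokenize(sample)

for word, count in frequency_list(tokens)[:3]:
    print(word, count)

for left, keyword, right in concordance(tokens, "see", width=2):
    print(f"{left:>10} [{keyword}] {right}")
```

A real corpus would need more careful tokenizing and far more efficient storage, but the editor's view of the material would be of much the same kind.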
What is new about the project, apart from the technology, is the ability to get for the first time a view of a language which is both broad and comprehensive. Many thousands of the observations are about the commonest patterns in the language. For example, we think of verbs like see, give, keep, as having each a basic meaning; we would probably expect those meanings to be the commonest. However, the database tells us that see is commonest in uses like I see, you see, give in uses like give a talk and keep in uses like keep warm.
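One simple way to look for patterns of this kind is to count the words that stand immediately next to a chosen word. The sketch below assumes a corpus already split into word tokens; the function name and the toy data are invented for illustration and do not reproduce the project's own method or its findings.

```python
from collections import Counter

def neighbour_counts(tokens, node):
    """Count the words immediately before and after each occurrence of node.

    Sentence boundaries are ignored here, which a fuller treatment would respect.
    """
    before, after = Counter(), Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        if i > 0:
            before[tokens[i - 1]] += 1
        if i + 1 < len(tokens):
            after[tokens[i + 1]] += 1
    return before, after

# Invented toy data, not drawn from the corpus.
toy_tokens = "you see i keep warm you see they keep warm we keep quiet".split()

before_keep, after_keep = neighbour_counts(toy_tokens, "keep")
print(after_keep.most_common(1))

before_see, after_see = neighbour_counts(toy_tokens, "see")
print(before_see.most_common(1))
```

On a large corpus, ranking these counts shows which phrases a word most often occurs in, and that is the kind of evidence the paragraph above appeals to.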
The power of meanings made by phrases and near-phrases like the above is gradually being understood, and the database holds copious examples for future work. The following chapters give plentiful examples of the kind of information we have recorded.
Cobuild has created the first wholly new dictionary for many years, and the first which is based on a thorough study of the way words are used. So it was felt appropriate to overhaul the usual way in which dictionaries are written and presented. Many users have difficulty understanding the conventions of dictionaries - brackets, abbreviations, special symbols and different type faces. The new dictionary is aimed at the whole world of English users, who need a simple and practical presentation.
The text corpus of general English now stands at around 20 million words in daily use, backed up by a range of more specialised texts coming to a total of about another 20 million. The project is already the stimulus for some exciting and up-to-date research, and is attracting attention from all corners of the world.
This volume is written by several members of the Cobuild project team, each one taking an area of their expertise and writing about the way in which