VOCABULARY ACQUISITION SOFTWARE: USER PREFERENCES AND
TUTORIAL GUIDANCE
R.E. Cooley
The Computing Laboratory, The University of Kent at Canterbury, Canterbury, Kent CT5 1EH, UK
What should be the role of AI in computer supported vocabulary acquisition? This paper presents a software system to aid the teaching and learning of vocabulary in the light of this question. It discusses the need to strike a balance between, on the one hand, providing tutorial guidance based on the knowledge and expertise of experienced language teachers, and on the other hand, providing facilities that a learner or teacher can use to control the acquisition process. The software system incorporates a tutoring module, and it has a user interface which is based on computer simulations of flash cards. The tutoring module is able to help either a teacher or learner specify various aspects of the selection, presentation and sequencing of lexical items. The flash card design is extended to allow the user a range of functions that can control the sequence of items and the amount of information that is presented.
At an elementary level, learning the
vocabulary of a foreign language can strike the learner as being both difficult
and time-consuming. A desire to ease the learning process has attracted much software development,
some of it of a rather indifferent standard (Green, D. and Meara, P. 1995[1]).
Moreover, language teachers have not consistently given vocabulary acquisition
a prominent place in the syllabus[2].
This is perhaps because the traditional manner of teaching vocabulary by means
of lists of words paired with their
translations is associated with behaviourism, which now has few if any
followers. However, drills, rote
memorization and related techniques can still be seen as useful (Schmitt, N.
1997[3]). Work by Groot compared teaching
vocabulary using bilingual lists with
presenting words in appropriate
contexts using a computer program called CAVOCA (Groot, P.J.M. 2000[4]).
The design of this program is based on theories of first language vocabulary
acquisition that recognise that several stages may be involved. Three are
distinguished: 1) observation, 2) storage and linkage, 3) consolidation. It is
assumed that it will benefit learners of foreign languages if they experience
the same three-stage process. This is
operationalised in CAVOCA so that the user spends some two minutes on
average learning and using a word before passing on to the next. Groot’s
experimental results are interpreted as supporting the staged theory of
vocabulary acquisition. But Groot also suggests that a student learning a
foreign language may be able to exploit their knowledge of conceptual
categories of their own language when extending their foreign language
vocabulary. He concludes that “a simple bilingual presentation followed by some
rehearsal practice may be more efficient” (Groot, P.J.M. 2000).
Groot’s theory of learning and its
implementation are not unique. Goodfellow’s system “Lexica” has the same
underpinning (Goodfellow, R. 1995[5]).
The system described in this paper, in contrast to CAVOCA and Lexica, does not build on the idea of a staged
learning process, and does not attempt to tackle the problem of conceptual
differences between languages. A range of functions are provided by the system,
which contains a database of lexical items and has an interface modelled on
flash cards. There is a tutorial module, which is primarily concerned with the
selection of items to be learnt and the pacing of subsequent learning.
Flash cards, made of light cardboard with, at
a minimum, a single word from the
target language on one side and its translation into the user’s first language
on the other, are simple, cheap and widely used as an aid to rote learning. It is very easy to write a simple program to simulate flash cards on a
computer, and it must be this sort of software that Mark Warschauer is
criticising in his brief sketch of the history of Computer-Assisted Language
Learning. He calls it behaviourist
“drill and practice or …. drill and kill” software, (Warschauer, 1996[6]).
In this context of the first generation of CALL software, the word “tutor” is
used to describe the role of the
computer since it unflaggingly leads the learner through a drill. Of course,
the computer is here only the delivery mechanism: it is simulating pieces of
cardboard rather than a human tutor.
Although it has not been the focus of professional
attention, flash card software has evolved slowly in recent years. This
evolution can be traced in Goodfellow’s review
(Goodfellow, R. 1995[7]). The most
significant innovation has been the addition of sound. It is no longer
necessary for learners to struggle with the rewind button on tape recorders and
CDs to hear repetitions of a word or phrase.
They just have to click a button on the VDU screen. Voice input could
also be included in flash card software. It is already found in a range of
language tutoring software. This feature
allows a comparison to be made between the learners’ efforts to pronounce a
word and a model pronunciation. From an
AI-ED perspective, perhaps the most interesting innovation has been the
incorporation of automatic revision strategies within flash card programs,
(Zhao J. et al. 1998[8], Houser C. et al. 2000[9],
Wozniak, P[10]). The strategies
are not based on ideas of language teaching, but on early psychological studies
of memory (Ebbinghaus, H 1911)[11].
The work described in this paper is an attempt
to augment the flash card program with a tutoring module. It builds on both
recent results in language learning and developments in the application of AI
to education. It also reflects current thinking in the design of computer
interfaces. Work by Shneiderman has shown the potential advantages of systems
that allow direct manipulation of data as opposed to systems that change or
adapt in use (Shneiderman, B. 1997[12]).
The intensity of a learner’s motivation, which is a difficult factor to measure
in experimental or indeed other settings,
is more likely to be enhanced by a learning process which places the
user in control, than by one which, though adaptive, may well from time to time
strike the user as inconsistent, perverse or unhelpful.
User preferences can be interpreted as aspects
of learning strategies. Language learning strategies have been usefully
reviewed both in general and in connection with the development of “intelligent
CALL (ICALL)” systems by Susan Bull, (Bull, S. 1997[13]).
But as well as facilitating users’ natural desires to control their own
learning, it seems civilised to avoid arbitrary restrictions, and at a minimum
allow the users as much manipulative freedom with computer mediated flash cards
as they would have with a cardboard version.
This view motivates the following facilities:
· sub-sets of a pack can be created,
· the order of presentation of the target language word and the word in the users’ first language can be changed,
· the audio content can be suppressed or activated,
· the order of the cards within the pack can be reversed, shuffled at random, or left unchanged.
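The manipulations listed above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the names `Card`, `Pack`, `subset`, `faces` and the `audio` field are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Card:
    target: str                  # word in the target language
    native: str                  # translation into the user's first language
    audio: Optional[str] = None  # hypothetical path to a sound clip


@dataclass
class Pack:
    cards: List[Card]
    target_first: bool = True    # which side of the card is shown first
    audio_on: bool = True        # whether sound is played at all

    def subset(self, positions):
        """Create a sub-pack from chosen positions, keeping the display settings."""
        return Pack([self.cards[i] for i in positions],
                    self.target_first, self.audio_on)

    def reverse(self):
        """Reverse the order of the cards in place."""
        self.cards.reverse()

    def shuffle(self, rng=None):
        """Shuffle the cards in place; a seeded random.Random may be supplied."""
        (rng or random).shuffle(self.cards)

    def faces(self, card):
        """Return (front, back) according to the chosen presentation order."""
        if self.target_first:
            return (card.target, card.native)
        return (card.native, card.target)
```

Leaving the pack untouched corresponds to the "remain constant" option; no method is needed for it.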
Freedom to make textual annotations on cards
is constrained by the format of the
display. Pictorial annotations are currently not available. It would be in
keeping with the design philosophy of the system to permit their inclusion,
though concern with space limitations argues against the inclusion of arbitrary
image files in the database.
These facilities succeed in allowing the users to implement those strategies that
would be possible with cardboard flash cards. It seems natural to augment these
features with those that might reasonably be expected in any PC application: a
count of the number of cards, the “current” position in the pack, the number of
cards whose words have been learnt, and those that have been viewed but which
the learners feel the need to revise.
Stevick, though not an enthusiast for flash cards, makes two creative suggestions for their
use (Stevick, E.W. 1982[14]).
As well as recommending that learners annotate their cards, he suggests a
strategy for dealing with words that learners find difficult. Rather than just
replacing the card bearing the difficult word in its original place in the
sequence, he suggests advancing its position in the pack so that it will be
re-encountered after just a few intervening words. Although this could be
implemented in a straightforward fashion, it is rejected in favour of another
approach. The user may specify the number of items on which they wish to
concentrate. The group of items so specified may be viewed repeatedly, perhaps
until mastered. Then a further group of the same size becomes the focus of
attention. A related facility is the
option to record whether or not a user “knows” an item. When the translation is
presented, the user may change this recorded assessment. At any stage the user
may opt to be retested on those words recorded as being “unknown”. Using
cardboard flash cards, the same effect is achieved by separating the known from
the unknown cards. The main advantage of the computerised facility is that the
information can be stored and accessed independently of the physical
arrangement of the items.
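The two facilities described above, studying in groups of a user-specified size and retesting items recorded as "unknown", might be sketched as follows. The function names and the dictionary used to hold the self-assessments are assumptions made for illustration, not details of the system.

```python
from typing import Dict, List


def focus_groups(items: List[str], size: int) -> List[List[str]]:
    """Split a pack into consecutive groups of the user-specified size."""
    if size < 1:
        raise ValueError("group size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def unknown_items(items: List[str], known: Dict[str, bool]) -> List[str]:
    """Items to retest: those not recorded as known, in the current pack order."""
    return [word for word in items if not known.get(word, False)]
```

Because the assessments are held separately from the list of items, the pack can be reversed or shuffled without losing them, which is the advantage over physically separating cardboard cards.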
A similar feature that a user might also wish
to control is the amount of information that they can view about individual items. Houser et al. (2000) stress the value of
single word translations of the Kanji, but there are obvious linguistic objections
to this practice. In many cases there would be a strong desire to record synonyms
and phrasal examples on a flash card. This is the style that McNaughton and Li adopted in their book,
which has a very close similarity to a pack of flash cards: the characters are
printed on the left hand side of flash card sized panels, and the translation
is given on the right (McNaughton and Li Ying, 1999[15]).
The amount of information made available originally is determined by the author
of the vocabulary. However, the results of Laufer and Hill’s study of the use
of CALL dictionaries indicate that “different people have different lookup
preferences and that the use of multiple dictionary information seems to
reinforce retention” (Laufer, B. and Hill, M. 2000[16]).
To cater for this, there is provision
for a URL to be recorded with every lexical item. This can be used for several
purposes including accessing dictionary
entries and web pages that present a lexical item in context. The following screen dump
illustrates the user interface. The buttons below the display area are used for
learners’ self-assessment both before and after the presentation of a
translation.
Figure 1: The user interface
For a single learning session, it is necessary to select the
items to be learnt from some syllabus.
In the absence of a reliable model of vocabulary acquisition (Meara, P.
1997), the best that can be done is to enable teachers and learners to devise
programmes that match their own needs. The selection may be defined by some
external requirement, possibly by a teaching strategy or by the need to fit in
with the strategy of an adopted textbook. However, if the selection is not
defined externally, it may be advisable to choose items for learning up to a
predetermined workload. The general approach is to select items from externally
specified thematic categories in accordance with specified quotas and ranking
principles.
Item difficulty
The notion of a workload implies a weighting system that
recognises that some items will be found to be more difficult than others. This
is in accord with the work of de Groot, A.M.B. and Keijzer, R. (2000)[17],
who found cognate and concrete words were both easier to learn and less likely
to be forgotten than abstract and non-cognate words. They also found that word
frequency, (i.e. the prominence of words in frequency of occurrence lists), had
no effect. Items in the flash card system are allocated a default category of
either “easy”, “middling” or “difficult”. The person carrying out the
classification might choose a category based on de Groot and Keijzer’s criteria
or upon individual experience.
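As an illustration of how difficulty categories could feed a workload, the sketch below expresses each category as a weight relative to an item of unit weight. The paper does not give the weights, so the values in `WEIGHTS`, and the names `workload` and `take_up_to`, are purely hypothetical.

```python
# Hypothetical weights: the paper does not state the values actually used.
WEIGHTS = {"easy": 1.0, "middling": 1.5, "difficult": 2.0}


def workload(categories):
    """Total workload, in equivalent unit-weight items, of a list of categories."""
    return sum(WEIGHTS[c] for c in categories)


def take_up_to(items, limit):
    """Take (word, category) pairs in order until the next one would exceed limit."""
    chosen, total = [], 0.0
    for word, category in items:
        weight = WEIGHTS[category]
        if total + weight > limit:
            break
        chosen.append(word)
        total += weight
    return chosen, total
```

Under these assumed weights, one easy and one difficult item together count as three unit-weight items.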
State of learning
The user interface of the system allows users to record
their response to an item. The response is either (a) that they “know” the item at the
level of detail at which it is presented, or (b) that they do not “know” the item; such
items are said to be “seen”. This information is recorded along with a time
stamp. The user can specify that a selection of items is to contain a quota,
specified as a percentage, of items which are “seen” and of items tagged “known”
which are to be revised.
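A minimal sketch of this record keeping is given below. The names `Response`, `record` and `revision_quota` are hypothetical, and the choice to fill a quota with the least recently answered items is an assumption of the sketch; the paper says only that a time stamp is stored.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Response:
    state: str    # "known" or "seen"
    stamp: float  # time of the response, in seconds since the epoch


def record(log: Dict[str, Response], word: str, knows: bool,
           now: Optional[float] = None) -> None:
    """Store the learner's latest self-assessment together with a time stamp."""
    stamp = time.time() if now is None else now
    log[word] = Response("known" if knows else "seen", stamp)


def revision_quota(log: Dict[str, Response], state: str,
                   session_size: int, percent: int) -> List[str]:
    """Pick items in the given state, oldest response first (an assumption),
    up to the stated percentage of the session size."""
    count = session_size * percent // 100
    pool = sorted((w for w, r in log.items() if r.state == state),
                  key=lambda w: log[w].stamp)
    return pool[:count]
```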
Topic and Utility
The selection of items may be prioritised on the basis of
their expected utility to the learner. Items are classified as “essential”,
“central” and “peripheral”, and they are also classified by thematic content.
Although the centrality of a word might well depend on the theme of discourse
under consideration, the size of the task of specifying a utility category for
every word in every thematic area is too large to be contemplated. Moreover, it
seems sensible to use the utility category as a proxy for frequency, but also
to recognise that particular syllabus needs may make it desirable to study some
items that have a very low frequency of occurrence.
Level of presentation
Learning lexical items is complicated, depending on the
language, by a range of factors such as polysemy and morphological variation
not all of which are amenable to presentation on flash cards. The system has no
specific mechanism for handling such complexities, but it does provide two
general presentation devices which may be used. Firstly, as mentioned above,
provision is made to enable URLs to be linked with lexical items. Secondly, items may be presented in a staged
fashion. Currently, the system uses two levels, and items have to have the
status “known” at the first level before they can be presented at the second.
Second level presentations have their own “item difficulty” category, which can
be used to control the priority of selection.
Collocation and Semantic Fields
Collocation, semantic fields, antonyms and similar relationships that exist between a lexical item and other items in the syllabus vocabulary are handled by a uniform mechanism. Any item may be paired with a list of other items, without any restriction. During selection, the inclusion of an item increases the desirability of including any other item from its associated list. Currently, this is implemented as a small fixed percentage increase. This percentage is chosen so that preference is only effective for items with the same utility category.
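The following sketch shows one way such a boost could behave. The numeric base scores, the 5% figure and the function name `desirability` are assumptions made for illustration; the paper states only that the increase is small, fixed and too small to lift an item out of its utility category.

```python
# Hypothetical base scores and boost; the paper does not give numeric values.
BASE = {"essential": 3.0, "central": 2.0, "peripheral": 1.0}
BOOST = 0.05  # small enough that a boosted item never overtakes a higher category


def desirability(word, utility, associates, selected):
    """Score a candidate word, boosted if an already selected item lists it."""
    score = BASE[utility[word]]
    if any(word in associates.get(chosen, ()) for chosen in selected):
        score *= 1 + BOOST
    return score
```

With these values a boosted "peripheral" item scores 1.05, still below every "central" item, so the preference only reorders items within one utility category.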
The user may fully and explicitly determine
the items which are presented for learning. However, in many cases users will
benefit from tutorial guidance in selecting items to be studied; and in almost
all cases will benefit from the pre-categorisation of the lexical items. To
take advantage of this guidance, a user will need to specify a workload in
terms of the equivalent number of lexical items with unit weight. The following
selection constraints must be specified:
· The topics to be represented,
· The percentage of “known” items to be included,
· The percentage of “seen” items to be included,
· The percentage of easy, middling and difficult items to be included.
The selection algorithm enforces user
specified priorities for “Level of difficulty” and “Utility”. The specification
is a list such as the following:
“essential”, “easy”, “middling”, “central”, “peripheral”
which is interpreted as meaning:
· select all the “essential” words, and from what remains,
· next select the “easy” words, and from what remains,
· next select the “middling” words, etc.
At any of the stages the collocation weighting
influences the process.
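The staged interpretation of a priority list can be sketched as below. This is a simplified illustration, not the system's algorithm: it caps the selection by a plain item count rather than a weighted workload, it ignores topic quotas and the collocation weighting, and the names `Item` and `select_by_priority` are hypothetical.

```python
from typing import List, NamedTuple


class Item(NamedTuple):
    word: str
    utility: str     # "essential", "central" or "peripheral"
    difficulty: str  # "easy", "middling" or "difficult"


def select_by_priority(items: List[Item], priorities: List[str],
                       limit: int) -> List[str]:
    """Work through the priority labels in order; at each stage take every
    remaining item carrying that label, stopping once limit items are chosen."""
    chosen: List[str] = []
    remaining = list(items)
    for label in priorities:
        still_waiting = []
        for item in remaining:
            if label in (item.utility, item.difficulty) and len(chosen) < limit:
                chosen.append(item.word)
            else:
                still_waiting.append(item)
        remaining = still_waiting
    return chosen
```

Because a label may name either a utility category or a difficulty category, a single list is enough to interleave the two kinds of priority, as in the example list above.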
Database Preparation
The system is designed to be used with the
collaboration of an experienced teacher who specifies the syllabus and is
responsible for the categorisation of individual items. The teacher also has
the task of specifying the priority list and recommending appropriate selection
constraints. This enables the workload of every learner to match their own
level of knowledge as well as complying with any collective strategy of the
teacher. The detailed ordering of items within a study session is under the
control of the learner.
This system represents a compromise between the
“direct manipulation” approach to design advocated by Shneiderman and the need
to exploit general tutorial expertise in a range of specific circumstances.
[1] Green, D. and Meara, P. (1995), “Guest Editorial”, Computer Assisted Language Learning, Vol. 8, No. 2-3, pp. 97-101.
[2] Lewis, M. (1993), “The Lexical Approach: The state of ELT and the way forward”, Language Teaching Publications, Hove.
[3] Schmitt, N. (1997), “Vocabulary learning strategies”, in “Vocabulary Description, Acquisition and Pedagogy”, ed. N. Schmitt and M. McCarthy, Cambridge University Press.
[4] Groot, P.J.M. (2000), “Computer assisted second language acquisition”, Language Learning and Technology, Vol. 4, No. 1, pp. 60-81.
[5] Goodfellow, R. (1995), “A review of the types of CALL program for vocabulary instruction”, Computer Assisted Language Learning, Vol. 8, No. 2-3, pp. 205-226.
[6] Warschauer, M. (1996), “Computer assisted Language Learning: An introduction”, in “Multimedia Language Teaching”, ed. S. Fotos, Logos International, Tokyo.
[7] Goodfellow, R. (1995), “A review of the types of CALL program for vocabulary instruction”, Computer Assisted Language Learning, Vol. 8, No. 2-3, pp. 205-226.
[8] Zhao Jiali, Wang Zhiming and Luo Siwei (1998), “Two methods to Improve Current Software of CAVL (Computer-aided vocabulary learning)”, Proceedings of ICCE’98, Volume 2, pp. 264-267.
[9] Houser, C., Shigeki Yokoi and Takami Yasuda (2000), “A new method for efficient study of Kanji using mnemonics and software”, Proceedings of ICCE/ICCAI 2000, Volume 1, pp. 383-387, Taiwan, November 21-24.
[10] Wozniak, P., Supermemo, http://www.supermemo.com/
[11] Ebbinghaus, H. (1911), “Grundzüge der Psychologie”, Veit, Leipzig.
[12] Shneiderman, B. (1997), “Direct Manipulation Versus Agents: paths to predictable, controllable and comprehensible interfaces”, chapter 6 in “Software Agents”, ed. J.M. Bradshaw, AAAI Press, MIT Press, Cambridge, Mass.
[13] Bull, S. (1997), “Promoting Effective Learning Strategy Use in CALL”, Computer Assisted Language Learning, Vol. 10, No. 1, pp. 3-40.
[14] Stevick, E.W. (1982), “Teaching and Learning Languages”, Cambridge University Press.
[15] McNaughton, W. and Li Ying (1999), “Reading and Writing Chinese”, Revised edition, Tuttle Publishing, Rutland, Vermont.
[16] Laufer, B. and Hill, M. (2000), “What lexical information do L2 learners select in a CALL dictionary and how does it affect word retention”, Language Learning and Technology, Vol. 3, No. 2, pp. 58-76.
[17] De Groot, A.M.B. and Keijzer, R. (2000), “What is hard to learn is easy to forget: the roles of word concreteness, cognate status, and word frequency in foreign language vocabulary learning and forgetting”, Language Learning, Vol. 50, No. 1, pp. 1-56.