4 Case studies

Every empirical study needs a dataset. The methodological orientation of this project means that it does not aim for a linguistic description of some phenomenon in itself, but for the development of a tool that could aid such a description. Therefore, in order to test the workflow described in Chapter 2 and the visualization tools described in Chapter 3, the methodology was applied to a dataset. For that purpose, 32 Dutch nouns, adjectives and verbs exemplifying a range of semasiological phenomena were selected. The phenomena include: homonymy in the case of nouns, interaction between semantic variation and argument structure in the case of verbs and, for all parts of speech, metaphor, metonymy and generalization/specialization. The goal was to explore which phenomena were revealed by distributional models and whether they were related to certain parameter settings.

Homonymy occurs when the same lemma has two or more (sets of) senses that are not semantically or etymologically related. The rest of the relationships between senses can be broadly classified as generalization/specialization, metaphor, or metonymy. Specialization and generalization are two sides of the same coin: one of the senses involved is applied to a particular context or situation, and the other has a much broader application. Crucially, this process involves some additional semantic feature. For example, herstructureren ‘to restructure’ can be applied to a range of situations, but when it applies to companies or parts of companies in particular it does not only mean ‘to change the structure of something’ but also ‘to reduce the personnel,’ which is missing in the general application. The direction of the relationship, i.e. whether the first sense is a generalization of the second or the other way around, is not relevant for the purposes of this study. The relevance is instead linked to the expectation that specialized senses would be more easily identified than general ones. Within Cognitive Linguistics, metaphor and metonymy are understood as cognitive principles that influence semantic structure, rather than mere expressive tools. They are found to interact and, at the same time, the distinction between them is not always unambiguous (Lakoff & Johnson 2003; Barcelona 2015; Lemmens 2015; Geeraerts 2003). While metaphor is described in terms of comparison, similarity and mapping between different domains, metonymy is described in terms of reference, contiguity and mappings within a domain (Lemmens 2015). For example, when grijs ‘gray’ is applied to a weather-related term, e.g. grijze avond ‘gray evening,’ the colour of the overcast sky stands for the weather in a metonymical mapping; when it is applied to an abstract entity like a buurt ‘neighbourhood’ a metaphorical sense ‘boring, sad’ is activated instead. 
However, the definition of what counts as a domain is not without problems, leaving the boundaries between metaphor and metonymy challenging to define as well (Croft 2003). For the purposes of these case studies, the distinction is relevant to the extent that metonymical senses are more likely than metaphorical senses to occur in the same contexts as their literal counterparts.

In practice, the situation is even more complicated. In the case of structural metaphors (Lakoff & Johnson 2003), metaphorical extensions might be elaborated by means of longer expressions. For example, in we richten de spots op de zoektocht naar kandidaten ‘we aim the spotlights towards the search for candidates,’ richten ‘to direct’ and op ‘on’ can co-occur with either the literal or metaphorical senses of spots ‘spotlight,’ and zoektocht ‘search’ is the cue that makes the literal sense less appropriate. This leads us to a situation already discussed by Geeraerts (2003) regarding the interaction of metaphor and metonymy in idiomatic and composite expressions. In a case like hete aardappel ‘hot potato,’ which in the sample always refers to delicate situations that nobody wants to deal with, is the adjective ‘hot’ literal or metaphorical? Following Geeraerts’ prismatic model of composite expressions, it could be explained as a combination of literal heet ‘hot to the touch’ with literal aardappel ‘potato’ that together is metaphorically understood as a delicate situation; a reinterpretation could then complete the mapping between the potato and the situation, and between the property of being hot to the touch and that of being delicate and to be avoided. The degree to which these reinterpreted mappings match systematic metaphorical or metonymical mappings of the individual elements is a separate issue: it could be argued for heet, which has a ‘conflictive’ meaning in non-idiomatic constructions, but not for aardappel ‘potato.’ As a rule, these cases have been annotated as literal, understanding that it is the situation as a whole that is metaphorical.

It should be noted that these criteria are argumentative and justify the selection of the lemmas, but cannot go further than that. It is unfortunate, but the intriguing question about mapping parameter settings to these phenomena has a negative answer. As the second part of the dissertation will show, other factors play a role in the formation of the clouds, relegating these traditional semantic categories to a secondary place, if not as extras on the show. Nevertheless, the phenomena are accounted for, the questions have been asked and, no matter how unsatisfactorily, they have been answered.

Hence, this chapter focuses on the selection, collection and annotation of the dataset on which the methodology was tested. First, Section 4.1 will introduce the 32 selected lemmas and their senses, making explicit which of the aforementioned phenomena they exhibit. I will not discuss each lemma in detail; instead, I will expand on those used for illustration in Part II as it becomes relevant. Section 4.2 will focus on how the concordance lines were collected and on the manual annotation procedure. Relevant information regarding the annotation itself will also be provided. Finally, Section 4.3 rounds off the description and the technical part of this dissertation.

4.1 The lemmas

The selection of lemmas aimed to cover a wide range of phenomena: metaphor, metonymy, generalization/specialization, and more. The nouns were chosen because they exhibit both homonymy and polysemy: they have unrelated (groups of) meanings and at least one of them presents finer distinctions. The selection of adjectives also includes different kinds of semantic extension which are mostly related to the kind of noun that is modified by it. Finally, the verbs combine syntactic and semantic dimensions. The definitions provided to the annotators with their respective examples and their translations to English will be listed in tables, but no other examples will be shown in this chapter. Instead, relevant tokens and their contexts will be reproduced in the second part of the dissertation to illustrate the results from the analyses. Empty cells in the Dutch columns of the definitions indicate sense tags that were not present in the original selection of senses but instead were included after the annotation procedure — and assigned in a second stage — based on the results of the annotation itself. The Dutch definitions themselves are adaptations made by Dirk Geeraerts and me based on consultation of dictionaries (e.g. Sterkenburg 1991; Boon, Geeraerts & Arts 2007) and pilot surveys of small concordances from the corpus.

The selection of phenomena was attached to certain expectations. We expected specific senses to be easier to identify than general senses, i.e. to have a more identifiable context. With regard to nouns, homonyms were expected to be discriminated more easily than their internal distinctions. For verbs, instead, the expectation was to find more confusion between senses that shared either the semantic or the syntactic dimension than between senses that did not. We also expected metonymical senses to be harder to disambiguate than synaesthetic or metaphorical senses, since they are more likely to have an overlapping context with the more concrete, literal senses.
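The expectation for the verbs can be made concrete with a small sketch. It takes the first three senses of harden as they are defined in Table 4.3 and lists, for each pair of senses, which dimension they share. The labels ("transitive", "literal", etc.) are read off the definitions in that table; the function and variable names are hypothetical and only serve the illustration, and nothing here is part of the actual modelling workflow.

```python
from itertools import combinations

# Senses 1-3 of harden, as described in Table 4.3:
# sense tag -> (syntactic frame, semantic reading)
harden = {
    "1": ("transitive", "literal"),
    "2": ("intransitive", "literal"),
    "3": ("transitive", "figurative"),
}


def shared_dimensions(senses):
    """For every pair of senses, list the dimensions they have in common."""
    result = {}
    for (a, (syn_a, sem_a)), (b, (syn_b, sem_b)) in combinations(senses.items(), 2):
        shared = []
        if syn_a == syn_b:
            shared.append("syntactic")
        if sem_a == sem_b:
            shared.append("semantic")
        result[(a, b)] = shared
    return result


pairs = shared_dimensions(harden)
for pair, shared in pairs.items():
    print(pair, shared)
```

Under the expectation stated above, senses 1 and 2 (same semantic reading) and senses 1 and 3 (same syntactic frame) would be confused more often than senses 2 and 3, which share neither dimension.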

4.1.1 The nouns

A set of 7 nouns was selected that exhibit both homonymy and polysemy in at least one of the homonyms³⁰, as shown in Table 4.1. The purpose of this selection was to examine how models dealt with granularity, i.e. hierarchies of senses: homonyms should be easier to disambiguate than their senses, since they will apply to very different contexts, but maybe it would be possible to tune the parameter settings for different levels of granularity, like adjusting the focus on a camera.

Table 4.1: Definitions and examples for the senses of each of the 7 analysed nouns. In each sense, the first number indicates the homonym and, if there is a second number, the sense within the homonym.
Dutch	sense	English
blik
oogopslag (een blik werpen op iets, een blik van verstandhouding)	1.1	gaze (throw a look at something, a look of understanding)
gezichtsvermogen (een scherpe blik)	1.2	sight (a sharp sight)
inzicht, in intellectuele zin (een brede blik)	1.3	perspective, in intellectual sense (a wide view)
dun geplet metaal, i.h. bijz. vertind dun plaatstaal (dozen uit blik)	2.1	thin flattened metal, in particular thin tin-plated steel (boxes of tin)
voorwerp (i.h.bijz. doos voor voedsel) vervaardigd uit zulk materiaal (stoffer en blik, een blik erwtjes, een maaltijd uit blik)	2.2	object (in particular food container) made of tin (brush and dustpan, a can of peas, canned meal)
voedsel bewaard in een voorwerp als bedoeld in 2.2 (eet je niet teveel blik?)	2.3	food contained in an object as described by 2.2 (don’t you eat too much canned food?)
hoop
ongeordende stapel (een hoop rommel, gooi maar op de hoop)	1.1	unordered mass (a pile of junk, just drop it on the pile)
grote hoeveelheid (een hoop mensen, een hele hoop geld)	1.2	great quantity (a bunch of people, a lot of money)
positieve verwachting, vertrouwen op iets positiefs (hoop koesteren, de hoop uitspreken dat…)	2	positive expectation, trust in something positive (to nurture hope, express the hope that…)
horde
bende, ordeloze groep personen (een woeste horde)	1	band, unordered group of people (a ferocious horde)
	1.2	unordered group of non-people (a horde of computers)
materiële hindernis, m.n. houten raamwerk gebruikt bij het hordelopen (de 400m horden bij de vrouwen)	2.1	material obstacle, namely wooden frames used for hurdling (the 400m hurdles for women)
hindernis in figuurlijke zin (een horde nemen)	2.2	obstacle in figurative sense (to take a hurdle)
schaal
een geordende reeks cijfers, afstanden, hoeveelheden e.d. waarmee iets gemeten wordt (de schaal van Celsius, Richter, op een schaal van 1 tot 5)	1.1	an ordered list of numbers, distances, quantities and such, with which something is measured (the scale of Celsius, Richter, on a scale from 1 to 5)
de verhouding tussen de grootte van iets en de weergave ervan in een kaart, model, grafiek etc. (een schaal van 1:20, een schaal van 10 km)	1.2	the ratio between the size of something and its representation in a map, model, graph etc. (a scale of 1:20, a scale of 10km)
grootteorde, omvang (de schaal van een probleem, op grote/kleine schaal)	1.3	magnitude, size (the scale of a problem, on a large/small scale)
harde buitenbekleding van zekere organische zaken (de schaal van een ei, de schalen van een mossel)	2.1	hard exterior of certain organic things (the shell of an egg, the shell of a mussel)
ondiepe en wijde schotel (een schaal met vruchten)	2.2	shallow and wide dish (a platter with fruits)
elk van de beide schotels die aan de armen van een balans hangen (gewicht in de schaal leggen)	2.3	each of the dishes hanging from the arms of a scale (lay a weight on the (dish of a) scale)
spot
	0	(idiosyncratic usage in sports headlines) (Spot op 1ste)
oneerbiedige, ridiculiserende uitspraak of behandeling (de spot drijven met, bijtende spot)	1	disrespectful, mocking expression or behaviour (mock someone, sarcasm)
reclameboodschap via radio, televisie, bioscoop (een spotje voor tandpasta)	2.1	advertisement via radio, television, cinema (a spot for toothpaste)
schijnwerper (de spots richten op)	2.2	spotlight (direct the spotlights on)
	2.3	metaphorical spotlight (he likes to be in the spotlight)
staal
zeer hard ijzer met laag koolstofgehalte (twaalf ton staal, ijzer en staal, een man van staal)	1.1	very hard iron with low carbon content (twelve tons of steel, iron and steel, man of steel)
	1.3	steel industry (steel is striking)
voorwerp of deel van een voorwerp uit zulk metaal (het staal van de velgen is verroest)	1.2	object or part of an object made of such metal (the steel in the rims is rusted)
monster van een stof of materiaal, bij wijze van proef (een staal vragen, goederen op staal verkopen)	2.1	sample of a substance or material, as evidence or proof (to ask for a sample, to buy a sample of goods)
proef, voorbeeld, bewijs (een staaltje van hun kunnen, een staaltje van bekwaamheid)	2.2	proof, example, evidence (a sample of their abilities, a proof of competence)
	2.3	sample taken from a population for statistical analysis (a representative sample)
stof
materie, substantie van een bepaald type (giftige stoffen, vaste stof, grijze stof)	1.1	matter, substance of a certain kind (poisonous substances, solid substances, gray matter)
weefsel (wollen en katoenen stoffen)	1.2	fabrics (woolen and cotton fabrics)
onderwerp waarover men spreekt, schrijft, nadenkt etc. (stof voor een roman, stof tot onenigheid)	1.3	topic about which people talk, write, think, etc. (material for a novel, topic of disagreement)
massa zeer kleine droge deeltjes van verschillende oorsprong, door de lucht meegevoerd (een wolk stof, stof afnemen)	2.1	mass of very small dry particles of various origin, floating in the air (a cloud of dust, to clean dust (=to dust))
massa zeer kleine deeltjes als toestand van een specifieke substantie (iets tot stof vermalen, tot stof verpulveren)	2.2	mass of very small particles as state of a specific substance (to bring something to dust)
	2.3	idiomatic uses of ‘dust’ (lift up dust)

Three nouns have one frequent, monosemous homonym and a less frequent, polysemous one: hoop ‘hope/heap,’ spot ‘ridicule/show or spotlight’ and horde ‘horde/hurdle.’ The polysemy phenomena are varied. First, horde ‘hurdle’ can refer to literal hurdles, e.g. in races, while the other sense is metaphorical: abstract difficulties are talked about as obstacles to be surpassed. In addition, after the annotation a new sense tag derived from ‘horde’ was included, for the cases in which the members of the horde were not human beings, but insects, cars or other entities. Second, one of the hoop ‘heap’ senses refers to literal heaps of things that can form a pile, while the other one is a generalization to large quantities, e.g. een hoop werk ‘a lot of work.’ Finally, the polysemous homonym of spot has two main senses linked by metonymy, namely ‘short video,’ e.g. an advertisement spot, or ‘spotlight.’ The ‘spotlight’ sense can also be used either literally or metaphorically (‘to be in the spotlight’); this distinction was not included in the original definitions, but the annotators pointed it out and it was added afterwards.³¹

The other four nouns have two polysemous homonyms: schaal ‘scale/dish,’ blik ‘gaze/tin,’ stof ‘substance/dust…,’ and staal ‘steel/sample.’ First, the frequent homonym of blik (‘gaze’) has a concrete sense with two metaphoric extensions: ‘intellectual look,’ which was not attested in the sample, and ‘perspective,’ which is quite infrequent. The infrequent homonym, ‘tin,’ can either refer to the material itself, to an object made of that material (‘tin can’) or its content (‘canned food’); due to their low frequency and the difficulty on the part of the annotators to distinguish between the senses, the last two senses were later combined into one. Second, the frequent homonym of stof has two concrete, referentially distinct senses (‘substance’ and ‘fabric’) and an abstract one (‘topic, material’). In contrast, for the less frequent homonym we distinguished two senses presenting a subtle, context-specific difference: between ‘dust (in the air)’ and the ‘dust’ in ‘reducing something to dust, to pulverize.’ The last sense was so infrequent that it was excluded, but another distinction emerged from the annotation, namely between literal ‘dust’ and ‘dust’ in idiomatic expressions, such as stof doen opwaaien ‘to be controversial, lit. to stir up dust.’ The new sense was added because, even though within the idiomatic expression the meaning of stof is still ‘dust,’ the annotators kept confusing it with the ‘topic, material’ sense, which actually refers to expressions such as stof voor een roman ‘material for a novel.’ Third, schaal exhibits subtle perspective shifts in one homonym (‘scale’) and refers to different concrete objects with the second (‘shell/dish’), of which the very distinctive ‘shell’ sense was removed due to its low frequency.
Finally, staal ‘steel’ could refer, like blik ‘tin,’ to either the material or the part of an object that is made from it; the latter is very infrequent in our sample, but another sense could be identified instead, namely ‘steel industry.’ The ‘sample’ homonym, on the other hand, originally presented a metaphorical distinction between material samples and ‘evidence’ of abstract characteristics, but was modified after annotation to a specialization distinction between general samples, e.g. a urine sample, and (statistically) representative samples.

As we can see, the nouns present a variety of semantic phenomena at a finer granularity than homonymy: metaphor in the case of blik ‘gaze,’ horde ‘hurdle’ and spot ‘spotlight,’ metonymy in the case of horde ‘horde,’ blik ‘tin,’ staal ‘steel’ and spot ‘videoclip/spotlight,’ generalization/specialization in the case of staal ‘sample,’ schaal ‘dish’ and hoop ‘heap,’ perspective shifts for schaal ‘scale’ and other relationships in the frequent stof homonym.
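The two levels of granularity encoded in the sense tags of Table 4.1 (first number for the homonym, optional second number for the sense within it) can be illustrated with a minimal sketch. It parses the tags and groups them by homonym, using the tags listed for hoop in the table. The names SenseTag, parse_tag and group_by_homonym are hypothetical and introduced only for this illustration; they are not part of the annotation tooling.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class SenseTag(NamedTuple):
    homonym: int
    sense: Optional[int]  # None when the tag has no second number


def parse_tag(tag: str) -> SenseTag:
    """Split a tag such as '2.1' into homonym 2 and sense 1; '2' has no sense part."""
    head, _, tail = tag.partition(".")
    return SenseTag(int(head), int(tail) if tail else None)


def group_by_homonym(tags):
    """Coarse granularity: collect the sense tags under each homonym."""
    groups = defaultdict(list)
    for tag in tags:
        groups[parse_tag(tag).homonym].append(tag)
    return dict(groups)


# Sense tags listed for hoop in Table 4.1.
hoop_tags = ["1.1", "1.2", "2"]
print(group_by_homonym(hoop_tags))  # {1: ['1.1', '1.2'], 2: ['2']}
```

Grouping by the first number gives the homonym-level classification (‘heap’ against ‘hope’ for hoop), while the full tags keep the finer distinction between the senses of the polysemous homonym.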

4.1.2 The adjectives

The selection of adjectives includes 13 lemmas presenting different kinds of polysemy phenomena (Table 4.2). The purpose of this selection was to examine how models dealt with their semantic relationships and whether they could extract them from the different nouns modified by the target adjective.

Three adjectives have a metonymic reading: hoopvol ‘hopeful,’ geestig ‘witty’ and hachelijk ‘dangerous/critical.’ For geestig and hoopvol, one of the senses is anthropocentric, i.e. it is mainly or exclusively applied to people: witty people against the witty things they say or do, and people who express hope against things that inspire it. In hachelijk’s case, the difference is a matter of temporal or telic perspective: between things that might go wrong and situations that are already problematic.

Table 4.2: Definitions and examples for the senses of each of the 13 analysed adjectives.
Dutch	sense	English
dof
(van kleuren en zichtbare dingen) mat, zonder glans, vaal (een doffe blik)	1	(of colours and visible things) matte, without shine, pale (a dull gaze)
(van geluiden) niet luid of scherp, onderdrukt, gesmoord (een doffe kreet)	2	(of sounds) not loud or sharp, suppressed, smothered (a dull cry)
(van personen, gevoelens e.d.) niet opgewekt, lusteloos, zonder energie (doffe onverschilligheid, doffe ellende)	3	(of people, feelings, etc.) not cheerful, apathetic, without energy (dull apathy, dull misery)
(van denkbeelden e.d.) niet scherp voor de geest staand (een doffe herinnering)	4	(of ideas and such) not sharp in the mind (a dull memory)
geestig
scherpzinnig en humoristisch van aard (een geestige collega)	1	of witty and humoristic nature (a witty colleague)
blijk gevend van, uitdrukking gevend aan, gekenmerkt door scherpzinnigheid en humor (een geestig boek, een geestige blik, een geestige opmerking)	2	giving an impression of, expressing, characterized by wittiness and humor (a witty book, a witty look, a witty remark)
	3	being perceived as witty (I find this funny)
gekleurd
met kleur, in letterlijke zin (in het bijzonder, niet zwart, wit of grijs) (gekleurde wangen)	1	with colour, in a literal sense (in particular, not black, white or gray) (colored cheeks)
(van personen e.a.) niet blank (de gekleurde medemens, van gekleurde afkomst zijn)	2	(of people a.o.) not white (the fellow colored man, to be of colored origin)
(van uitspraken, opvattingen e.d.) niet neutraal, tendentieus (een gekleurde voorstelling van zaken)	3	(of expressions, concepts) not neutral, tendentious (a colored representation of things)
geldig
van kracht, van toepassing, van waarde zijnde volgens wettelijke of andere regels (een geldig vervoerbewijs, betaalmiddel, juridisch bewijs)	1	valid, acceptable, with value according to legal or other rules (a valid driving license, currency, legal evidence)
van kracht, van toepassing, van waarde in ruimere zin (een geldige redenering)	2	valid, acceptable, with value in general sense (a valid reasoning)
gemeen
gemeenschappelijk in gebruik of bezit, gedeeld (gemene kosten, een gemene muur)	1	common property or of common use, shared (common costs, a common wall)
openbaar, publiek (de gemene zaak)	2	public (the public business)
alledaags, gewoon, tot de middelmaat behorend (het gemene volk, de gemene man)	3	commonplace, normal, mediocre (the common people, the common man)
boosaardig, kwaadaardig, laaghartig, malicieus (een gemene streek)	4	malicious, evil, mean (a mean trick)
ordinair, plat, onkies, vulgair (gemene praatjes)	5	ordinary, flat, indecent, vulgar (mean conversations)
	6	cool, awesome, badass
goedkoop
laag in prijs, betaalbaar, voordelig (goedkope wijn)	1	of low price, affordable, advantageous (cheap wine)
geen hoge prijzen vragend (een goedkoop winkeltje, een goedkope loodgieter)	2	not asking a high price (a cheap shop, a cheap plumber)
waar de prijzen laag zijn (een goedkope buurt)	3	where the prices are low (a cheap neighborhood)
van weinig waarde, makkelijk verkregen, oppervlakkig, banaal (goedkope lof, goedkoop succes, goedkope argumenten)	4	with little value, received easily, superficial, banal (cheap praise, cheap success, cheap arguments)
grijs
met een kleur die ligt tussen wit en zwart; vaalwit, grauw (grijs van het stof, de grijze dolfijn)	1	with a color between white and black, pale white (gray from the dust, the gray dolphin)
(van periodes e.d.) zonder veel zonneschijn, bewolkt, betrokken (een grijze dag)	2	(of periods and such) without much sunlight, cloudy, covered (a gray day)
(van haar) zijn kleur verloren hebbend, m.n. door gevorderde leeftijd (een grijs baardje)	3	(of hair) having lost its color, namely because of old age (a gray beard)
(van personen e.a.) grijsharig, en vandaar, betrekking hebbend op ouderen (de grijze golf)	4	(of people and related) gray haired, and thus, related to old people (the gray wave)
saai, kleurloos, vervelend (een grijze buurt)	5	boring, not colorful, tedious (a gray neighborhood)
niet helemaal volgens de wet of de regels, halflegaal (de grijze economie)	6	not exactly following the law or rules, half legal (the gray economy)
hachelijk
met kans op een ongunstige afloop, (potentieel) gevaarlijk (een hachelijke onderneming)	1	with chances of unfavorable outcome, (potentially) dangerous (a dangerous enterprise)
(reëel) gevaarlijk, netelig, kritiek, benard (een hachelijke situatie)	2	(actually) dangerous, tricky, critical, dire (a dangerous situation)
heet
(van dingen) zeer warm (een gloeiend hete kachel)	1	(of things) very warm (a very hot stove)
(van het lichaam) warm aanvoelend, een hogere temperatuur dan normaal hebbend (hete wangen, het heet hebben)	2	(of the body) feeling warm, having a higher temperature than normal (hot cheeks, to feel hot)
(van het weer) zeer warm (hete dagen, hete zomer)	3	(of the weather) very warm (hot days, hot summer)
(van voedsel) pikant (hete sauzen)	4	(of food) spicy (hot sauce)
(van personen) sexueel hartstochtelijk, geil (een hete bok)	5	(of people) sexually attractive, horny (a hot buck)
(van gebeurtenissen, periodes e.d.) gekenmerkt door heftige strijd (het ging er heet aan toe, een hete herfst)	6	(of events, periods, etc.) characterized by fierce conflict (it was getting hot, a hot autumn)
	7	popular, interesting or new, recent
heilzaam
(letterlijk) bijdragend tot gezondheid en lichamelijk welzijn (een heilzaam dieet)	1	(lit.) that brings health and physical wellbeing (a healthy diet)
(figuurlijk) nuttig, een gunstig effect hebbend (een heilzaam besluit)	2	(fig.) useful, having a beneficial effect (a beneficial decision)
hemels
betrekking hebbend op de hemel (de hemelse Vader, de hemelse boodschap)	1	related to heaven (the heavenly Father, the heavenly message)
verrukkelijk, heerlijk, zalig, goddelijk (een hemelse verschijning, een hemelse stem)	2	delightful, lovely, blissful, divine (a heavenly appearance, a heavenly voice)
hoekig
(van voorwerpen, figuren e.d.) met hoeken of scherpe kanten (een hoekige tekening, een hoekig gezicht)	1	(of objects, figures, etc.) with angles or sharp edges (an angular drawing, an angular face)
(van bewegingen, ritmes e.d.) niet vloeiend (een hoekig melodietje)	2	(of movements, rhythms, etc.) not fluent (a broken melody)
(van personen) houterig, stijf, onhandig in de omgang (een hoekig karakter)	3	(of people) rigid, stiff, clumsy (a clumsy character)
hoopvol
(van personen, uitingen, gedragingen etc.) blijk gevend van hoop, vol hoop, optimistisch (een hoopvolle stemming, dat stemt mij hoopvol)	1	(of people, expressions, behaviors, etc.) giving an impression of hope, full of hope, optimistic (a hopeful mood, that brings me hope (makes me hopeful))
reden tot hoop gevend, beloftevol (hoopvolle perspectieven)	2	giving reason for hope, promising (hopeful perspectives)

Four adjectives have metaphoric readings: hoekig ‘angular,’ dof ‘dull,’ heilzaam ‘healthy/beneficial’ and gekleurd ‘colourful/person of colour/tainted.’ Heilzaam has two senses, distinguishing between things that are literally healing, or beneficial for the health, and things that are metaphorically healing, or beneficial in general. Hoekig and gekleurd present three sense distinctions, one of which is particularly concrete and the most frequent, ‘of angular form’ and ‘colourful’ respectively, and another one explicitly anthropocentric: ‘clumsy’ and ‘non white.’ The third sense distinction has a different quality: synaesthetic for hoekig, applied to rhythms, and metaphoric for gekleurd, meaning ‘tainted, corrupted.’ Finally, dof has a concrete sense applied to the visual domain, a synaesthetic extension applied to sounds, and an abstract meaning applied to feelings and emotions; the fourth meaning listed in the table was not attested.

Three adjectives present some other form of similarity between the readings: geldig ‘valid,’ hemels ‘heavenly’ and gemeen ‘shared/mean….’ Geldig ‘valid’ and hemels ‘heavenly’ offer two options: one restricted to a specific context (laws and regulations for geldig and Heaven for hemels) and one much broader. The case of gemeen is quite complex, involving a number of rather subtle distinctions that often co-exist in the same attestation: i.e. ‘common’ and ‘shared,’ or ‘average’ and ‘ordinary.’

Finally, the remaining three adjectives present a more complex picture: heet ‘hot’ and goedkoop ‘cheap’ have literal senses with different kinds of entities but also have metaphorical extensions, while grijs ‘gray’ has both metaphorical and metonymical extensions. Heet ‘hot’ presents, first, three very concrete senses that differ in perspective: temperatures of objects, of weather and as it is felt in the body; the other three senses are metaphorical, i.e. the objects to which heet is applied cannot be physically hot. Crucially, there is no exclusive sense tag for idiomatic expressions in which the combination of heet ‘hot’ and its concrete object (e.g. hangijzer ‘iron,’ aardappel ‘potato’) is used metaphorically. Goedkoop, on the other hand, presents a modest set of 4 sense distinctions: a concrete, prototypical and frequent sense (i.e. cheap products), two perspectival shifts (i.e. cheap shops and cheap areas) and a clear metaphor (i.e. of little value). Finally, grijs presents a very frequent, concrete sense, three specific metonymic extensions (to weather and to hair, and from there to old people or generations) and two metaphorical ones (‘boring’ and ‘half legal’). In practice, the ‘boring’ reading can include ‘sad, not cheerful,’ and the ‘half legal’ sense is more general, applying to ‘gray areas’ between two poles.

In sum, the adjectives include simpler semasiological structures with only one kind of semantic extension involved as well as more complex interactions between the phenomena.

4.1.3 The verbs

The criterion to select the 12 verbs analysed here was to cover a range of combinations of syntactic and semantic variation, with the goal of exploring how different parameter settings dealt with their interaction or whether certain types of models would focus on one or the other aspect.³² Their senses and translations are shown in Table 4.3.

Four verbs are always transitive and their senses can be distinguished by the objects they can take: people or objects for haten ‘to hate,’ people or opinions for huldigen ‘to honour/to believe,’ concrete objects or taxes for heffen ‘to levy/to lift,’ and statements or decisions for herroepen ‘to recant/to void.’

Two of the verbs can be transitive, with a distinction based on the direct object, or intransitive: helpen ‘to help’ and herstructureren ‘to restructure.’ In both cases the intransitive sense is semantically similar to one of the transitive senses. For example, the intransitive sense and one of the transitive senses of herstructureren only apply to companies, with the connotation that the personnel is being reduced, while the other transitive sense has a much more general application.

Three verbs can be transitive, with a distinction based on the direct object, or reflexive: diskwalificeren ‘to disqualify,’ herhalen ‘to repeat’ and herinneren ‘to remember/to remind.’ In the case of diskwalificeren ‘to disqualify’ and, to a lesser degree, herhalen ‘to repeat,’ this opposition can be interpreted as a specific situation where the object and the subject coincide. In contrast, herinneren means ‘to remember’ in the reflexive construction and ‘to remind’ in the transitive construction with the preposition aan; the transitive construction without the preposition can also be attested (e.g. ik word herinnerd als, ‘I am remembered as’) but very infrequently.

Table 4.3: Definitions and examples for the senses of each of the 12 analysed verbs.
Dutch	sense	English
diskwalificeren
(trans.) ongeschikt verklaren en uitsluiten van een bepaalde functie of positie (een getuige diskwalificeren)	1	(trans.) declare unsuitable and exclude from a certain function or position (disqualify a witness)
(trans.) wegens onregelmatigheden uitsluiten bij een wedstrijd (FC De Trappers werd gediskwalificeerd wegens wangedrag)	2	(trans.) exclude from a competition because of irregularities (FC De Trappers was disqualified because of misbehaviour)
(reflex.) zichzelf buiten spel zetten, zich onmogelijk maken (met zulk gedrag diskwalificeer je jezelf)	3	(reflex.) exclude oneself, make oneself impossible (with such a behaviour you disqualify yourself)
haken
(trans.) met of als met een haak vastmaken (aan, in, achter iets) (een wagen aan een locomotief haken, een sleutel in een ring haken)	1	(trans.) fix something with or as if with a hook (at, to, behind something) (hook a wagon to a locomotive, a key in a key ring)
(intrans.) met of als met een haak vastraken (de doornen haakten aan haar jas, haar paraplu bleef haken aan de deurknop)	2	(intrans.) get stuck with or as if with a hook (the thorns got stuck in her coat, her umbrella got stuck in the doorknob)
(trans.) over een uitgestoken been doen struikelen (hij werd gehaakt in de elfmeter, iemand pootje haken)	3	(trans.) make trip over a stuck out leg (he was made to trip in the penalty kick, make someone trip)
(intrans., met ‘blijven’) van gedachten, blikken e.d.: haperen, telkens terugkeren (aan of bij iets) (ik bleef haken bij de herinnering aan mijn broer)	4	(intrans., with ‘to keep’) of thoughts, gazes and such: falter, come back (to something) (I kept going back to the memory of my brother)
(intrans./trans.) zeker handwerk maken door met een staafje met een weerhaak lussen samen te weven (haken tijdens het televisiekijken, hoe ontspannend!, een babymutsje haken)	5	(intrans./trans.) make handcraft by weaving loops together with a hooked needle (crocheting while watching tv, so relaxing!, crochet a baby hat)
	6	(with ‘towards’) desire, aim for
harden
(trans.) hard maken, in letterlijke zin (staal harden)	1	(trans.) make hard, in literal sense (harden steel)
(intrans.) hard worden, in letterlijke zin (snel hardende verven)	2	(intrans.) become hard, in literal sense (quickly hardening paint)
(trans.) hard maken in figuurlijke zin; weerstand en veerkracht bijbrengen (een kind harden tegen het klimaat)	3	(trans.) make hard in figurative sense; impart resistance and resilience (toughen a child against the weather)
(reflex.) bij zichzelf weerstand en veerkracht aankweken (zich harden tegen het lot)	4	(reflex.) develop resistance and resilience by oneself (toughen oneself against fate)
(trans.) uithouden, verdragen (niet te harden)	5	(trans.) endure, tolerate (unbearable (‘not to bear’))
haten
(trans.) iem. haat toedragen, een sterk gevoel van afkeer en vijandschap t.o.v. iem. hebben (waarom haat hij mij zo?)	1	(trans.) feel hatred, have a strong feeling of aversion and enmity towards someone (why does he hate me so much?)
(trans.) iets onaangenaam, verfoeilijk, verwerpelijk vinden (hoe zou iemand de taalkunde kunnen haten?)	2	(trans.) consider something unpleasant, detestable, reprehensible (how could someone hate linguistics?)
heffen
(trans.) m.b.t. materiële zaken: in de hoogte brengen, optillen (met geheven hoofd; hij heft met gemak 80 kilo in de hoogte)	1	(trans.) w.r.t. material objects: move to a higher position, lift (lifting their head; he easily lifted 80 kg)
(trans.) m.b.t. geld e.d.: invorderen, eisen, opleggen (belasting, rente, accijns heffen)	2	(trans.) w.r.t. money and such: collect, demand, impose (collect tax, interest, excise)
helpen
(trans.) ondersteunen in materiële of morele zin, bijstaan (met raad en daad helpen, een helpende hand, uit de nood helpen)	1	(trans.) support in material or moral sense, assist (help in word and deed, a helping hand, help out)
(trans.) iem. assisteren door met hem samen te werken (helpen met het huiswerk; heb je dat alleen gedaan of heeft iemand je geholpen?)	2	(trans.) assist someone by collaborating with them (help with homework, did you do that by yourself or did someone help you?)
(intrans.) voordeel opleveren, nuttig zijn (dat drankje heeft goed geholpen)	3	(intrans.) yield advantage, be useful (that drink helped a lot)
	4	(trans.) with inanimate entities, be helpful, useful
	5	(with ‘to/for’) to provide
herhalen
(trans.) m.b.t. handelingen of activiteiten: opnieuw uitvoeren (een experiment, een les, een bezoek herhalen)	1	(trans.) w.r.t. acts or activities: perform again (repeat an experiment, a lesson, a visit)
(trans.) m.b.t. zinnen, boodschappen e.d.: opnieuw uitspreken (kunt u dat even herhalen?)	2	(trans.) w.r.t. utterances, messages and such: pronounce again (Could you please repeat that?)
(reflex.) zich opnieuw voordoen (de geschiedenis herhaalt zich)	3	(reflex.) occur again (history repeats itself)
	4	(trans.) of a show or an episode, broadcast again
herinneren
(met ‘aan’) weer te binnen brengen, in het geheugen terugroepen (iemand aan iets herinneren)	1	(with ‘of’) bring back to the mind, to the memory (remind someone of something)
(reflex.) in het geheugen aanwezig hebben, niet vergeten (zich een gebeurtenis, een persoon herinneren)	2	(reflex.) have present in the memory, not forget (remember an event, a person)
(trans.) met een plechtigheid, monument o.i.d. gedenken (we herinneren vandaag de Slag bij Ronceval)	3	(trans.) remember with a celebration, monument and such (today we remember the Battle of Roncevaux Pass)
herroepen
(trans.) m.b.t. wetten, besluiten e.d.: intrekken, niet langer geldig verklaren (een besluit, volmacht, decreet herroepen)	1	(trans.) w.r.t. laws, decisions and such: withdraw, declare not valid anymore (annul a decision, power of attorney, decree)
(trans.) m.b.t. uitspraken, meningen e.d.: terugnemen en rechtzetten (Trump moest weer een van zijn dwaze tweets herroepen)	2	(trans.) w.r.t. statements, opinions and such: retract and correct (Trump had to retract one of his crazy tweets again)
herstellen
(trans.) repareren, de eraan ontstane schade wegwerken (het dak herstellen)	1	(trans.) repair, get rid of the damage in something (repair the roof)
(trans.) tot de vorige toestand terugbrengen, doen terugkeren (de goede verstandhouding herstellen)	2	(trans.) bring back, make return to the previous state (repair the understanding)
(trans.) goedmaken, weer doen vergeten (een fout herstellen)	3	(trans.) make good, make forget (fix a mistake)
(reflex.) tot de oorspronkelijke toestand terugkeren (de rust herstelt zich)	4	(reflex.) return to the original state (peace is restored)
(intrans.) genezen (van een ziekte herstellen)	5	(intrans.) heal (heal from a disease)
	6	(intrans.) of a financial/economic entity, recover
herstructureren
(trans.) reorganiseren, een nieuwe structuur geven (je kunt deze tekst maar beter herstructureren)	1	(trans.) reorganize, give a new structure (you should restructure this text)
(trans.) m.b.t. bedrijven in problemen: activiteiten of personeel afstoten, downsizen (Bayer herstructureert zijn plasticdivisie)	2	(trans.) w.r.t. businesses in difficulties: remove activities or personnel, downsize (Bayer restructures its plastic division)
(intrans.) van bedrijven in problemen: activiteiten of personeel afstoten, downsizen (de chemie moet zich herstructureren)	3	(intrans.) of businesses in difficulties: remove activities or personnel, downsize (chemistry must restructure (itself))
huldigen
(trans.) iets of iem. eer bewijzen, vieren (we huldigen de uitvinder van de herbruikbare broodzak)	1	(trans.) celebrate, pay homage to someone or something (we honor the inventor of the reusable bread bag)
(trans.) erkennen, aankleven, toegedaan zijn (een opvatting, mening, theorie huldigen)	2	(trans.) acknowledge, follow, be committed to (hold a view, an opinion, a theory)

Two more verbs can be transitive, intransitive or reflexive, with semantic distinctions within the transitive structure: harden ‘to make or become hard/ to tolerate’ and herstellen ‘to repair/ to heal….’ The senses of harden can be split in two main groups. One is more closely related to the property of ‘hardness,’ i.e. to turn something or someone hard or to become hard, in literal or figurative sense, with different constructions: from the intransitive literal sense in om hun kaas te laten harden ‘in order to make their cheese harden’ to the transitive figurative one in Verdriet heeft haar gehard ‘Grief has hardened her.’ The second group, however, includes one transitive construction in a very specific pattern but is more frequent in the sample than all the others combined: (niet) te harden (‘to (not) tolerate,’ always negative).

Finally, haken ‘to hook’ presents semantic distinctions within both the transitive and the intransitive structures. It can refer literally or metaphorically to hooking something or remaining hooked, but there are also two very specific senses: one is characteristic of the football context and means ‘to make someone trip (by placing a foot in front of them),’ and the other means ‘to crochet.’

In sum, the set of verbs includes cases where only the kind of direct object plays a role in the disambiguation and cases where it interacts with syntactic patterns. Moreover, the specific ways in which these kinds of direct objects are defined differ across verbs: from animacy or agency in the case of haten to concreteness in the case of heffen. The semantic distinctions can also rely on a broader context: diskwalificeren will typically have people as direct object, but the sports-related context defines a specific sense, characterised by distinct motivations and consequences.

4.2 The dataset

For each of the 32 lemmas listed above, about 300 tokens were collected from the QLVLNewsCorpus (described in Section 2.3.1). All attestations were manually annotated by at least three different people based on the definitions found in the Dutch column of Tables 4.1, 4.2 and 4.3. Next to the sense assignment, which was later revised for uniformity — and to include senses emerging from the annotation itself, as mentioned above — the annotation included confidence assignment and selection of disambiguating context words.

The selection of the lemmas involved some introspection as well as consultation of lexical resources and corpus data: thinking of potential candidates, checking the senses reported in dictionaries (Sterkenburg 1991; Boon, Geeraerts & Arts 2007) and estimating their relative frequencies in small concordances. We tried to avoid extremely skewed distributions approximating a monosemous structure or numerous infrequent senses that would be unlikely to stand out in a model.³³ In the end, as we will see, sense frequency turned out not to be a real issue, because the clouds do not model senses anyway.

The exploration of these samples of concordances also served for the calculation of the number of tokens to model and annotate. Regardless of the actual frequency of the items in the corpus, the minimum sample contained 240 tokens; it was raised to 280 if any of the senses had a relative frequency below 20% in the sample, to 320 if it was below 10%, and to 360 if there were many senses and therefore some had a low frequency (e.g. heet). The lower and upper bounds were estimated from pilot studies of clouds as an amount large enough to warrant the use of this methodology and small enough to make sense of in the visualization tool. Table 4.4 shows the absolute frequency (in the 520-million-word QLVLNewsCorpus) of each selected lemma, the size of the sample and the distribution of the senses: the more the boxplot in the rightmost column goes to the right, the more frequent one of the senses. For example, the long boxplots for blik and hoop indicate a very skewed distribution, i.e. a sense with very high frequency and senses with very low frequencies, while the narrow, centred boxplots for hachelijk and hemels indicate that their senses are equally frequent. The sample extraction was almost completely random, with the only restriction that no two instances of the same lemma would be extracted from the same file. There were, however, a few duplicates, due to repetition of the same fragment on different dates.
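The thresholds just described can be written down as a small decision rule. The following Python sketch is only an illustration, not the code used in the study: the function name and the `many_low_frequency_senses` flag are hypothetical, and the flag stands in for the manual judgement that a lemma has "many senses", for which the text gives no numeric criterion.

```python
def sample_size(sense_shares, many_low_frequency_senses=False):
    """Return the number of tokens to sample for one lemma.

    sense_shares: relative frequencies of the senses in the pilot concordance.
    many_low_frequency_senses: hypothetical manual flag (assumption); the
        text does not define "many senses" numerically.
    """
    if many_low_frequency_senses:
        return 360
    rarest = min(sense_shares)
    if rarest < 0.10:   # some sense below 10% of the pilot sample
        return 320
    if rarest < 0.20:   # some sense below 20% of the pilot sample
        return 280
    return 240          # minimum sample size


print(sample_size([0.55, 0.45]))        # balanced senses
print(sample_size([0.85, 0.15]))        # one sense below 20%
print(sample_size([0.95, 0.05]))        # one sense below 10%
```

Every returned value is a multiple of 40, which matches the batch size used for the annotation.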

Table 4.4: Absolute frequency of the lemmas in the corpus, size of the sample and distribution of their senses. The number next to the boxplots indicates the number of different senses.
lemma	frequency	sample	senses
nouns
spot	3496	240	5
horde	3224	280	4
blik	22175	280	4
staal	5796	320	5
schaal	14249	320	5
stof	24502	320	5
hoop	41946	320	3
adjectives
hachelijk	1307	240	2
hemels	1417	240	2
heilzaam	1476	240	2
hoopvol	3680	240	2
geldig	5128	240	2
hoekig	1242	280	3
geestig	3970	280	3
gekleurd	4520	280	3
dof	1268	320	4
gemeen	2997	320	7
grijs	13567	320	7
goedkoop	40669	320	4
heet	10676	360	7
verbs
herroepen	848	240	2
herstructureren	936	240	3
diskwalificeren	1084	240	3
huldigen	4091	240	2
heffen	4799	240	2
haten	4828	240	3
herstellen	28814	240	6
herinneren	33432	240	3
helpen	87136	240	6
harden	1050	320	5
herhalen	16856	320	4
haken	1403	360	6

For each of the tokens a concordance line was extracted with 15 words to either side. Bachelor students of Linguistics at KU Leuven were recruited and hired to manually annotate the samples of the selected lemmas. Each of them was tasked with annotating 40 tokens of each of 12 types (at least three nouns, four adjectives and four verbs, plus one of either of the categories)³⁴: a total of 480 tokens³⁵, to annotate in 6 weeks. In total, each of the 9600 tokens was annotated by at least three annotators; 10% of them were annotated by four. Each lemma was split into 6-9 batches of 40 tokens, each of them annotated by a different group of annotators. The annotators were offered an introductory meeting, a video tutorial and written guidelines, but the procedure itself was performed individually.
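The split of a sample into batches of 40 tokens can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the original script: the function name, the seed and the use of integer token identifiers are hypothetical. The sketch shuffles a copy of the sample with a seeded generator before slicing it, so that the order in which the tokens were extracted does not determine the composition of the batches.

```python
import random


def split_into_batches(token_ids, batch_size=40, seed=0):
    """Shuffle the token ids reproducibly and cut them into equal batches."""
    shuffled = list(token_ids)             # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded shuffle (seed is an assumption)
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]


# A sample of 280 tokens yields 7 batches of 40 tokens each.
batches = split_into_batches(range(280))
print(len(batches), [len(b) for b in batches])
```

Checking that the concatenated batches contain exactly the original identifiers is a cheap safeguard against tokens being lost or duplicated during the split.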

Both the lemmas and the batches were assigned randomly, while keeping in mind the part-of-speech distribution. The intention was to shuffle the sample of each lemma before splitting it into batches, but due to an error in the code the tokens remained ordered by source, so that each batch consisted mostly of tokens from a different newspaper. The annotation involved three tasks:

Assign a sense from a predefined set of definitions, namely the Dutch column in Tables 4.1 through 4.3. If none of the tags apply, select “None of the above” and explain why;
Express the confidence of the decision on a scale of 6 values;
Identify the words of the context that helped in the disambiguation.

Since entering textual information in a spreadsheet can easily lead to typos and inconsistencies and, furthermore, annotating the helpful context words is particularly challenging in such a tool, a user-friendly visual interface was designed that received input from buttons and returned the output in json format. The interface, which is not available in its original form any more, had a menu with the list of lemmas and two tabs: an overview of the concordance lines of the selected type and an annotation workspace (Figure 4.1). The annotation workspace focused on one concordance line³⁶ (or token) at a time, offering first the text, then a series of long radio buttons with the definitions and examples, a star rating option for the confidence evaluation, followed by a clickable reproduction of the text, and a text input field for comments. The long radio buttons meant that the annotators had the full definitions and examples at their disposal every time they had to assign a sense for a given lemma, while the final output transformed their decisions into more manageable codes, such as sense_1, sense_2, etc. The clickable concordance lines let them select the context words they deemed most useful to the annotation procedure by simply clicking on them; the program then translated this as an array of positions relative to the target, e.g. ["R1", "L2"] if the first word to the right and the second to the left are selected.³⁷ Finally, the text input field at the bottom was available to leave any sort of comment and was compulsory when “None of the above” was selected.

Figure 4.1: Screenshot of the options in the annotation tool.
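The encoding of the selected context words as positions relative to the target can be illustrated with a short Python sketch. The original interface is no longer available, so the function below is a hypothetical reconstruction of the encoding only: it assumes that the words of a concordance line are indexed from 0 and that the index of the target is known.

```python
import json


def relative_positions(target_index, selected_indices):
    """Encode clicked word indices as positions relative to the target.

    Words to the right of the target become "R1", "R2", ...; words to the
    left become "L1", "L2", ...; a click on the target itself is ignored.
    """
    codes = []
    for i in selected_indices:
        if i == target_index:
            continue
        side = "R" if i > target_index else "L"
        codes.append(f"{side}{abs(i - target_index)}")
    return codes


# With 15 words of left context, the target sits at index 15.
cues = relative_positions(15, [16, 13])
print(json.dumps(cues))  # ["R1", "L2"]
```

Storing relative positions rather than word forms keeps the output compact and lets other information about the same position, such as the lemma or the dependency relation, be looked up in the corpus afterwards.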

The dataset obtained from this procedure is very rich and interesting for a variety of purposes. For each token we have sense assignment, confidence evaluation and selection of informative cues by at least three different independent annotators, as well as comments on at least the cases which did not receive a sense. Agreement between the annotators can be measured with coefficients such as Fleiss’ \(\kappa\) (Fleiss 1971), illustrated in Figure 4.2, but the resulting picture may be unnecessarily complex. First, disagreement is susceptible to granularity: annotators might disagree between senses of a noun but not between the homonyms, except for their confusion between idiomatic senses of stof ‘dust’ and its ‘topic, material’ sense. Second, annotators were not very sensitive to grammatical distinctions (e.g. between transitive and intransitive senses), which was a strong reason for disagreement in herstructureren, helpen, haken and herstellen. Third, disagreements were sometimes concentrated on one annotator, who showed a strong preference for a certain sense; as such, they were not an indicator of the ambiguity of the token but of misunderstandings on the part of the annotator. Some annotators exhibited an almost excessive attention to nuances, while others were much less thorough.

More importantly, for the great majority of the tokens (83.8%) the majority of the annotators agreed on one tag that remained as the official sense for that token. After gathering and exploring the data, I reread the tokens and made a final decision on their sense tags. Figure 4.3 shows the number of tokens with full agreement, a majority agreement (i.e. only one annotator disagreed) or no agreement and whether the same chosen sense was kept in the final annotation, another tag was applied or the token was removed (e.g. tokens of heet that corresponded to the verb heten). The Other category includes new senses suggested by the annotators themselves as well as corrections from misunderstandings, such as the second original sense of blik, which annotators interpreted in different ways and was actually not attested in the dataset. The very few cases of Same with no agreement were tokens annotated by four annotators where two of them selected the sense that remained, while the other two disagreed.

Figure 4.2: Agreement between annotators per batch per lemma, computed with irr::kappam.fleiss() (Gamer et al. 2019).
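The values in the figure were obtained with the R function cited in the caption. For readers who want to see what the coefficient measures, the following pure-Python sketch implements the standard formula of Fleiss' \(\kappa\): the mean observed pairwise agreement per token is compared with the agreement expected from the overall proportions of the tags. It is an illustration only and assumes that every token in a batch was annotated by the same number of annotators; the function name and input layout are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of tokens, each a list of tags (one per rater)."""
    n_raters = len(ratings[0])
    if any(len(tags) != n_raters for tags in ratings):
        raise ValueError("every token needs the same number of ratings")
    n_tokens = len(ratings)

    totals = Counter()   # how often each tag was used overall
    observed = 0.0       # sum of per-token agreement
    for tags in ratings:
        counts = Counter(tags)
        totals.update(counts)
        # Proportion of rater pairs that agree on this token.
        observed += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1))
    observed /= n_tokens

    # Agreement expected by chance from the overall tag proportions.
    expected = sum((c / (n_tokens * n_raters)) ** 2 for c in totals.values())
    return (observed - expected) / (1 - expected)


print(fleiss_kappa([["sense_1"] * 3, ["sense_2"] * 3]))  # perfect agreement: 1.0
```

The coefficient is undefined when all annotators use one single tag for every token in the batch, because the expected agreement is then 1 and the denominator is 0.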

Figure 4.3: Number of tokens per lemma with full, partial (majority) or no agreement, split by whether the majority sense was kept or changed. Removed tokens are not included.
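The three levels of agreement shown in the figure can be made explicit with a small Python sketch. It is an illustration under one assumption that goes beyond the text: a majority is taken to mean that strictly more than half of the annotators chose the same tag, so that a two-against-two split among four annotators counts as no agreement. The function name and the tag strings are hypothetical.

```python
from collections import Counter


def agreement(tags):
    """Classify the tags given to one token as full, majority or no agreement.

    Returns a pair (level, tag); tag is None when there is no agreement.
    """
    top_tag, top_votes = Counter(tags).most_common(1)[0]
    if top_votes == len(tags):
        return "full", top_tag
    if top_votes > len(tags) / 2:   # strict majority (assumption)
        return "majority", top_tag
    return "none", None


print(agreement(["sense_1", "sense_1", "sense_1"]))
print(agreement(["sense_1", "sense_2", "sense_1"]))
print(agreement(["sense_1", "sense_2", "sense_1", "sense_2"]))
```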

In addition, the final sense distribution is not significantly different from that in the much smaller pilot samples. Distribution across batches, instead, was affected by regional variation. For example, Belgian sources include more sports-related articles than the Netherlandic sources, leading to variation in the sense distribution of lemmas with such a sense (diskwalificeren ‘to disqualify,’ haken ‘to make someone trip’ and horde ‘hurdle’) across regions. This discrepancy in distribution across batches could have been avoided if the tokens had been properly shuffled.

Around 4% of all the assigned tags were “None of the above,” with a clearly uneven distribution. The lemma with the largest number of such tags was haken, with 117 tokens in which three annotators chose “None of the above” and 72 in which two of them did. Heet and harden follow with 69 and 90 tokens with three such tags and 14 and 10 with two, respectively. Many of these were due to wrong lemmatization: the concordance of haken had many instances of afhaken ‘to stop’ or met haken en ogen, an idiomatic expression in which it is a noun; the concordances of heet and harden included instances of the verb heten ‘to call, to be named’ and the adjective hard respectively. In a similar way, many of the tokens in the concordance of heffen were instances of opheffen ‘to lift/to cancel,’ but the annotators did not always catch these cases. The verbs afhaken and opheffen are separable verbs in Dutch: in some constructions, the prefix is separated from the root, so that a syntactic parser might confuse them with a different verb and a preposition. Next to these issues, annotators assigned “None of the above” in cases where the tokens did not match any of the suggested senses, especially in cases of idiomatic expressions such as hete aardappel ‘hot potato.’ All these annotations were classified into four categories: wrong_lemma, for the cases of wrongly selected concordance lines, was assigned 413 times; not_listed, assigned 421 times, indicated that the lemma was correct but none of the suggestions applied; unclear (240 times) was used when the token could not be parsed by the annotator, and between (45 cases) referred to doubt between two or more of the given options. These different classes informed later decisions such as whether to add or remove senses or tokens.

Tokens were removed for different reasons. Next to the cases where the concordance line did not belong in the sample to begin with (including adverbial uses of the adjectives), there were some indecipherable tokens, extremely infrequent senses (e.g. 4 or 5 tokens out of 250) and duplicated tokens. In total, 424 tokens were removed, 109 of which belonged to haken.

Confidence values were explored but not used, because they tend to be similar across batches, lemmas and senses, with a tendency towards the highest values and variation across annotators instead: what is low confidence for some of them is high confidence for others. Figure 4.4 breaks this down in terms of the degree of agreement and whether the assigned tag matched one of the senses offered or not. Note that the top facet, “None of the above,” has much lower counts than the lower facet. We would expect confidence ratings to be lower for annotations that do not agree with the other votes for the same token and, in relative terms, that is the case. Confidence assignment to a “None of the above” tag is ambiguous: some annotators tend to give them the minimum confidence because they are not confident about the meaning of the concordance line, while others assign a high value because they are confident that none of the other options applies.

Figure 4.4: Distribution of confidence values across annotations, by whether the annotators agreed with another in the same token and by whether they selected a sense or “None of the above.”

The selection of cues was consulted when defining parameter settings (Section 2.3): if two annotators agreed on both the sense tag and a context word for a given token, that context word was considered an official cue for that sense. From the relative position representing the cue in the output of the annotation tool, other information available in the corpus could be extracted and counted, such as the lemma of the context word, its dependency relation (or distance) to the target and its bow distance to the target. For example, Tables 4.5 and 4.6 list the most frequent dependency paths, lemmas and window sizes across the official cues of heilzaam ‘healthy/beneficial’ for each of its senses. As we will see again in Section 6.2.1, this lemma is characterised by frequent nouns modified by the target, namely werking ‘effect,’ effect and invloed ‘influence,’ which are ambiguous in terms of the senses of heilzaam: in a sentence such as de heilzame werking van look ‘the healing power of garlic,’ garlic is a better cue in the ‘health/beneficial’ distinction than werking ‘effect, power.’ Nonetheless, annotators did select these context words as cues for both senses, not realising that they were not distinctive of one or the other sense. The pattern fulfilled by garlic in this example was indeed captured by some cues, as shown in the third line of Table 4.5, but it is much less frequent.

Table 4.5: Four most frequent dependency paths among the cues of heilzaam, with counts per sense. NA indicates that the cue is not in the sentence of the target. In the path, CW stands for the cue and T stands for the target: the head is at the left of \(\rightarrow\) and its dependents are to the right, preceded by the name of the dependency relation.
path	examples	beneficial	healthy
CW \(\rightarrow\) mod:T	heilzame werking ‘healing power’	60	41
NA	Different sentence	13	32
werking \(\rightarrow\) [mod:T,mod:van \(\rightarrow\) obj1:CW]	de heilzame werking van look ‘the healing power of garlic’	8	14
ben \(\rightarrow\) [predc:T,mod:voor \(\rightarrow\) obj1:CW]	look is heilzaam voor de gezondheid ‘garlic is beneficial for the health’	7	1

Table 4.6: Six most frequent lemmas and window spans among the cues of heilzaam, with counts per sense.
CW	healthy	beneficial	BOW	healthy	beneficial
werking/noun	20	12	1	74	48
effect/noun	5	9	4	38	28
gezondheid/noun	5	0	3	27	23
lichamelijk/adj	4	0	2	18	22
medisch/adj	4	0	5	21	20
economie/noun	0	4	6	21	14
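The rule for official cues stated in this section, namely that a context word counts as a cue for a sense when two annotators agree on both the sense tag and the context word for a given token, can be sketched in Python as follows. The data layout, the function name and the reading of "two annotators" as "at least two" are assumptions made for this illustration.

```python
from collections import Counter


def official_cues(annotations, min_annotators=2):
    """Return the (sense, cue) pairs of one token selected by enough annotators.

    annotations: one (sense_tag, cue_positions) pair per annotator, with cue
        positions encoded relative to the target, e.g. "L1" or "R2".
    min_annotators: minimum number of agreeing annotators (assumed to be 2).
    """
    votes = Counter()
    for sense, cues in annotations:
        for cue in set(cues):          # count each annotator once per cue
            votes[(sense, cue)] += 1
    return sorted(pair for pair, n in votes.items() if n >= min_annotators)


token_annotations = [
    ("sense_1", ["L1", "R2"]),
    ("sense_1", ["L1"]),
    ("sense_2", ["L1", "R2"]),   # same cues, but a different sense tag
]
print(official_cues(token_annotations))  # [('sense_1', 'L1')]
```

In this example R2 is not an official cue, because the two annotators who selected it did not agree on the sense.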

4.3 Summary

In this chapter we looked at the dataset used to test and explore the workflow and the visualization tools. The selection of lemmas was described along with the semantic phenomena they would allow us to test. Afterwards, the annotation procedure was delineated, from the extraction of concordances to the assignment of senses, confidence values and cues.

As was mentioned before, for each of the lemmas, 200-212 models were generated following the workflow described in Chapter 2. The cues selected by the annotators informed some of the decisions involved in the parameter settings. The sense annotation was applied to assess how well the models performed at disambiguation: initially, we did not try to match senses to clustering solutions, but looked for a spatial configuration that might hide more subtle relationships. As a few examples in Chapter 3 have shown, this is much more straightforward in some lemmas than in others.

The range of semantic phenomena was meant to provide different possible aspects of meaning that distributional models might be able to capture. From a lexicological point of view, “similarity of distribution correlates with similarity of meaning” is not enough. What is similarity of meaning?³⁸ Does this mean that more granular distinctions, such as senses within homonyms, will be more difficult to capture than coarser distinctions, i.e. the homonyms themselves? Are metonymy, metaphor and specialization modelled by the same parameter settings? Can they be discriminated, can we fine-tune models to capture one or the other? And what is the role of constructions: does argument structure interfere in the modelling of senses? These were the questions that the case studies presented here tried to address, and the following part of this dissertation will present the answers.
