Metaphor Annotation Criteria: A Step towards Building a Metaphor Corpus
Zabanpazhuhi
Articles in press, accepted, available online from 11 Aban 1401
Article type: Research article
Digital Object Identifier (DOI): 10.22051/jlr.2022.40681.2192
Author
Mohammad Saeid Miri
Allameh Tabataba'i University, Tehran, Iran.
Abstract
Understanding the meaning of human language has always been a challenge for machines, and metaphor is one of the hard problems in semantic processing. Understanding metaphor matters greatly for advancing work in natural language processing. This paper introduces a method for identifying metaphors in Persian. Its aim is to propose a guideline with which a corpus of Persian metaphors can be compiled. Doing so requires criteria for identifying and annotating metaphor. The Metaphor Identification Procedure of Vrije Universiteit Amsterdam (MIPVU) can identify different types of metaphor and can also be used to build a metaphor corpus, so this paper takes that procedure as its basis. Using the procedure is useful in two respects: first, it makes it possible to build a corpus of Persian metaphors; second, it helps metaphor researchers, especially corpus linguists, analyze Persian metaphors with good validity and reliability. To test the effectiveness of the proposed procedure, a corpus of Persian data (news and academic texts) was collected and annotated by three experts, with good results (kappa coefficient of 0.964); these results will be discussed in a separate study.
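As a rough illustration of how agreement among three annotators can be measured, the sketch below computes Fleiss' kappa. This is an assumption on our part: the abstract reports only a "kappa coefficient" and does not say which variant was used. The function name `fleiss_kappa`, the label strings, and the toy ratings are all hypothetical and are not data from the study.

```python
from collections import Counter
from typing import List


def fleiss_kappa(ratings: List[List[str]]) -> float:
    """Fleiss' kappa for items that were each labeled by the same number of annotators.

    ratings[i] holds the labels given to item i, one label per annotator.
    Raises ZeroDivisionError if every rating falls in a single category,
    because expected agreement is then 1.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: proportion of agreeing annotator pairs per item, averaged.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall share of each category.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(s ** 2 for s in shares)

    return (p_observed - p_expected) / (1 - p_expected)


# Toy data (invented for illustration): four lexical units, three annotators.
toy_ratings = [
    ["MRW", "MRW", "MRW"],
    ["non-MRW", "non-MRW", "non-MRW"],
    ["MRW", "MRW", "non-MRW"],
    ["non-MRW", "non-MRW", "non-MRW"],
]
kappa = fleiss_kappa(toy_ratings)
print(round(kappa, 3))  # 0.657 for this toy data
```

Values close to 1, such as the 0.964 reported here, indicate near-complete agreement beyond chance.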
Keywords
Metaphor; Corpus linguistics; Persian metaphor corpus; Metaphor identification; Metaphor Identification Procedure (MIPVU)
Article title [English]
Metaphor Annotation Criteria: A Step towards Building a Metaphor Corpus
Authors [English]
Mohammad Saeid Miri
Allameh Tabataba'i University, Tehran, Iran.
Abstract [English]
1. INTRODUCTION

With the development of smart devices, the ability of computers to understand human language has become a key issue in technology. Computers come to comprehend human language by learning from and analyzing machine-readable, annotated linguistic data (a corpus), so corpora play a crucial role in this task. Metaphor is among the most complex linguistic phenomena for computers to comprehend. Despite the prevalence of metaphor in everyday language use and the importance of identifying it, no metaphor corpus has yet been published for Persian. Compiling a corpus of Persian metaphors is the first step toward enabling computers to learn metaphors.

To compile such a corpus, two main requirements must be met. The first is choosing the best definition of metaphor. The best definition is both comprehensive and applicable: comprehensive in the sense that it covers a significant proportion of metaphorical instances, and applicable in the sense that it can be used to build a corpus. The second requirement is a method for metaphor identification. Without a straightforward data annotation method, metaphors cannot be identified consistently.

2. MATERIALS AND METHODS

Various definitions and theories of metaphor exist in the academic literature (Black, 1993; Fauconnier & Turner, 2002; Gibbs, 1999; Glucksberg & Keysar, 1990; Lakoff & Johnson, 1980; Ortony, 1993). In addition to theoretical work, the literature on operationalizing metaphor identification is expanding (Cameron, 1999, 2003; Deignan, 1999; Low, 1999; Steen, 1999). The Pragglejaz Group (2007) introduced the first serious method for identifying ‘linguistic’ (not ‘conceptual’) metaphor: the Metaphor Identification Procedure (MIP).

Although MIP is an explicit, step-by-step procedure, feedback from numerous studies pointed to disagreement among experts, only average reliability, and the exclusion of other kinds of metaphor (inadequate validity). This prompted Steen et al. (2010) to introduce MIPVU, a revised method for metaphor identification. The general guideline is as follows (Steen et al., 2010: 23-24):

1. Find metaphor-related words (MRWs) by examining the text on a word-by-word basis.
2. When a word is used indirectly and that use may potentially be explained by some form of cross-domain mapping from a more basic meaning of that word, mark the word as metaphorically used (MRW).
3. When a word is used directly and its use may potentially be explained by some form of cross-domain mapping from a more basic referent or topic in the text, mark the word as direct metaphor (MRW, direct).
4. When words are used for the purpose of lexico-grammatical substitution, such as third person personal pronouns, or when ellipsis occurs where words may be seen as missing, as in some forms of co-ordination, and when a direct or indirect meaning is conveyed by those substitutions or ellipses that may potentially be explained by some form of cross-domain mapping from a more basic meaning, referent, or topic, insert a code for implicit metaphor (MRW, implicit).
5. When a word functions as a signal that a cross-domain mapping may be at play, mark it as a metaphor flag (MFlag).
6. When a word is a new-formation coined by the author, examine the distinct words which are its independent parts according to steps 2 through 5.

Because the main objective of this paper is to operationalize metaphor identification, MIPVU is adopted as the most ‘comprehensive’ and ‘applicable’ method available. The full MIPVU guidelines are provided in Steen et al. (2010). This paper assesses whether the procedure can be applied to Persian, given the language-specific characteristics of Persian.

3. RESULTS AND DISCUSSION

While MIPVU is explicit, detailed, and (in some cases) adaptable, it is not without flaws. Steen et al. (2010) welcome new versions of MIPVU (in multiple languages) as a way of identifying its shortcomings. This section examines one of the difficulties of identifying metaphors: the demarcation of lexical units.

An essential element of MIPVU is the unit of analysis. Steen et al. (2010: 27) called a word a lexical unit “for theoretical reasons.” Even in the English version of MIPVU there are exceptions (such as phrasal verbs), for which Steen et al. (2010) provided guidelines. The most prevalent issue in the other variants of MIPVU is lexical unit demarcation (Herrmann et al., 2019; Nacey, Greve et al., 2019; Pasma, 2019). In Persian, polywords, compound verbs, and compound nouns pose the greatest difficulty in demarcating lexical units. This paper suggests defining three labels: ‘cv’ (for compound verbs), ‘p’ (for polywords), and an ‘extra element’ that carries a numerical attribute. The demarcation issue can then be solved by labeling the ‘extra element’ with ‘cv’ (or ‘p’) and assigning it a number.

4. CONCLUSION

This paper aimed to introduce a method for identifying Persian metaphors based on MIPVU. The most significant finding of this study is the validation of the proposed method for identifying Persian metaphors. The procedure is straightforward, takes language-specific properties into account, and allows the researcher to make case-specific decisions. The proposed method passed all reliability tests (κ = 0.964) and is an effective method for identifying Persian metaphors. Statistical analysis and reliability results will be discussed in another paper. A further contribution of this paper is that it makes the study of metaphors in Persian quantifiable. With its various constraints on the demarcation of lexical units and the analysis of their basic and contextual meanings, the proposed method enables the researcher to provide a quantitative, detailed, measurable, and trustworthy analysis.
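The labeling idea described under Results and Discussion can be sketched in code. The following is a minimal sketch under stated assumptions, not the paper's actual annotation format. It assumes that the parts of one multi-word lexical unit share a label (‘cv’ or ‘p’) and a number, so that they can be regrouped into a single unit before MIPVU tags are applied. The names `MetaphorTag`, `Token`, and `group_lexical_units` are hypothetical, and the transliterated example (*tasmim gereft*, ‘decided’, a Persian compound verb) is ours.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple


class MetaphorTag(Enum):
    """Tags named after the MIPVU guideline; NONE is a hypothetical default."""
    NONE = "non-MRW"
    INDIRECT = "MRW"
    DIRECT = "MRW, direct"
    IMPLICIT = "MRW, implicit"
    FLAG = "MFlag"


@dataclass
class Token:
    text: str
    unit_label: Optional[str] = None  # 'cv' (compound verb), 'p' (polyword), or None
    unit_id: Optional[int] = None     # number shared by all parts of one unit
    tag: MetaphorTag = MetaphorTag.NONE


def group_lexical_units(tokens: List[Token]) -> List[List[Token]]:
    """Group tokens into lexical units.

    Tokens sharing the same (unit_label, unit_id) form one unit, even when
    they are not adjacent; unlabeled tokens are single-word units. Units are
    ordered by the position of their first token.
    """
    units: List[List[Token]] = []
    position: Dict[Tuple[str, Optional[int]], int] = {}
    for tok in tokens:
        if tok.unit_label is None:
            units.append([tok])
            continue
        key = (tok.unit_label, tok.unit_id)
        if key not in position:
            position[key] = len(units)
            units.append([])
        units[position[key]].append(tok)
    return units


# "u tasmim gereft" ('he/she decided'): the compound verb is one lexical unit.
sentence = [
    Token("u"),
    Token("tasmim", unit_label="cv", unit_id=1),
    Token("gereft", unit_label="cv", unit_id=1),
]
unit_texts = [[t.text for t in unit] for unit in group_lexical_units(sentence)]
print(unit_texts)  # [['u'], ['tasmim', 'gereft']]
```

Grouping by a shared number rather than by adjacency is a design choice made here so that the parts of a compound verb are still treated as one unit when other words intervene.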
Keywords [English]
Metaphor, Corpus linguistics, Persian metaphor corpus, Metaphor identification, Metaphor Identification Procedure (MIPVU)