Example I: Shadow

When we mention “shadow," semantically it often carries certain negative symbolism: it is where light cannot reach, the obscured and overlooked corners, the hidden recesses and anxieties of the human heart. In terms of meaning, “shadow" is not merely a physical phenomenon but also a psychological metaphor—it points to fragility, fear, and the weight of memory. However, if we shift our perspective to the language of architecture and space, “shadow" reverses its imagery. It is no longer the antithesis of light but becomes “an extension of light." Shadow in architecture creates coolness, shelter, and rhythm; it constructs inhabitable spaces under scorching sunlight and allows light to be “seen." Therefore, “shadow" in spatial contexts contains a gentle protection, a vocabulary of dialogue between architecture and nature. Just like the vast shadow beneath the steel structure in this photograph—it is not a symbol of oppression but a form of existential restraint. Between light and darkness, shadow teaches us how to situate ourselves.
例 I : 陰影
當我們提到「陰影」,語義上往往帶有某種負面的象徵:那是光無法抵達的地方,是被遮蔽、被忽視的角落,是人心的隱秘與不安所在。語意上,「陰影」不僅是物理現象,更是一種心理隱喻——它指向脆弱、恐懼、記憶的重量。然而,若我們將視角轉向建築與空間的語言,「陰影」卻反轉了意象的方向。它不再是遮蔽光的對立,而成為「光的延伸」。建築中的陰影,創造了涼爽、庇護與節奏;它在酷熱的陽光下構築出人可棲身之所,也讓光得以被「看見」。因此,「陰影」在空間中蘊含著一種溫柔的保護,是建築與自然之間的對話語彙。就如同這張照片中鋼構支架下的大片陰影,不是壓抑的象徵,而是一種存在的節制——在光與暗之間,陰影讓我們學會如何安放自己。
Example II: Factory

The word “factory" semantically first refers to a concrete place—a production space where machinery, energy, and human labor collaborate; it is steely, noisy, and regulated. In terms of meaning, however, “factory" is not merely a physical structure but a metaphor: it represents the extreme of human-made order, a symbol of humanity’s transformation of nature into productive logic.
Yet, if we re-examine from a systemic perspective, the essence of “factory" is not just “an artificial mechanical domain" but more like an organism—with input, transformation, and output. Raw materials enter like an organism consuming food; waste is expelled like biological excretion; and in this process, energy is converted, structures are reorganized, giving birth to products with “meaning."
This rhythm of input and output is itself one of the forms of “life.” The difference is only that the factory simulates an artificial metabolism with cold steel structures and pipelines: it devours the world’s resources and spits out civilization’s artifacts.
Therefore, “factory" is not just a site of production but an extension of human thought—we let machines breathe and metabolize in our place, making them another kind of body. In that metallic clamor and shadow, “factory" simultaneously symbolizes creation and consumption, order and decay—an “artificial life" co-constructed by humans and machines.
例 II : 工廠
「工廠」這個詞在語義上,首先指涉一個具體的場所——一個以機械、能源與人力協作的生產空間;是鋼鐵的、噪音的、規律的。而在語意上,「工廠」卻不僅僅是物理結構,而是一種隱喻:它代表人造秩序的極致,是人類將自然轉化為生產邏輯的象徵。
然而,若我們從系統性的角度重新審視,「工廠」的本質並非僅是「人工的機械場域」,而更像是一個有機體——有輸入、有轉化、有輸出。原料進入,如同生物的進食;廢料排出,如同生物的排泄;而在這過程中,能量被轉換、結構被重新編排,誕生出具有「意義」的產物。
這種輸入與輸出的節奏,正是「生命」的形式之一。只不過,工廠以冷冽的鋼構與管線模擬出一種人造的新陳代謝:它吞噬世界的資源,又吐出文明的器物。
因此,「工廠」不只是生產的場域,更是人類思維的延伸——我們讓機械替代自身去呼吸、去代謝,於是它成為我們的另一種身體。在那金屬的喧囂與陰影之中,「工廠」同時象徵著創造與消耗、秩序與衰變,是人與機械共構的「人工生命」。
Semantics and Meaning in Language Models
I. The Concepts of “Semantics" and “Meaning"
1.1 Etymology and Definitions
Semantics
The term derives from the Greek “semantikos,” meaning “significant.” In linguistics, semantics studies the relationship between linguistic signs and their referents, emphasizing objective, systematic structures of meaning. It leans toward formal semantics, pursuing verifiable truth conditions and asking under what conditions a sentence is true. This is a compositional viewpoint: the meaning of the whole is composed of the meanings of its parts.
Meaning
Meaning, by contrast, centers on subjective understanding and interpretation, involving how language users comprehend messages and infer intentions in specific contexts. It concerns “the understanding of significance” rather than merely “the structure of significance.” It aligns more closely with pragmatics and cognitive linguistics, emphasizing how context, cultural background, and speaker intention shape comprehension. This is a dynamic, context-dependent perspective.
| Term | English Equivalent | Etymological Structure | Focus | Common Application Fields |
|---|---|---|---|---|
| 語義 (Semantics) | semantics | language (語) + meaning (義) | “Meaning structure of linguistic signs" or “relationship between language and referential world" | Linguistics, logic, semiotics, artificial intelligence |
| 語意 (Meaning) | meaning | language (語) + intention/sense (意) | “Speaker or text’s intention, connotation, psychological dimension of meaning" | Psycholinguistics, literature, pragmatics, sense analysis |
II. “Semantic Structure" and Its Correspondence in LLM Transformers
Contemporary language models are built on distributional semantics, whose core assumption is that “a word’s meaning is determined by its context of occurrence” (Firth’s dictum: “You shall know a word by the company it keeps”). Through large-scale corpus training, models learn statistical co-occurrence patterns between vocabulary and sentence structures.
Language model training is based on massive textual data, with core capabilities in learning statistical regularities and contextual relationships. LLMs built on the Transformer architecture in particular, through their core self-attention mechanism, can:
- Capture syntactic and semantic dependency relationships: Models understand syntactic structures and logical relationships between words and phrases. For example, in the sentence “The cat chases the mouse," the model knows “cat" is the subject, “chases" is the verb, “mouse" is the object, and understands this as a relationship between action initiator and recipient. This constitutes an internalized, statistical “semantic structure".
- Establish vector space mapping: Through word embeddings or more advanced token representations, models map vocabulary into high-dimensional vector spaces. In this space, semantically similar or related words are closer together (e.g., “apple" and “banana"), enabling the model to “understand" word meanings and relationships at an abstract level, handling tasks like synonymy, antonymy, or analogy.
- Contextualized meaning parsing: Models can adjust token representations based on context, achieving polysemy disambiguation. For instance, distinguishing “apple" as fruit in “I ate an apple" versus as a brand name in “Apple Inc." This capability is built on deep grasp of semantic structure in entire texts.
Overall, language models’ understanding of “semantic structure" is more like efficient, large-scale pattern matching and relationship mapping rather than human-level cognition or reasoning. They know which word sequence combinations are statistically “correct" and “meaningful."
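To make the contextualized-parsing point concrete, here is a minimal sketch (not from the original article) that extracts contextual embeddings for the same surface word in two different sentences and compares them with cosine similarity. It assumes the Hugging Face `transformers` and `torch` packages and the public `bert-base-uncased` checkpoint, all of which are illustrative choices rather than anything the article prescribes.

```python
# Sketch: the same surface token ("apple") receives different contextual vectors
# depending on the sentence it appears in. Model choice is illustrative only.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_word(sentence: str, word: str) -> torch.Tensor:
    """Contextual embedding of the first occurrence of `word` in `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]          # (seq_len, hidden_dim)
    word_id = tokenizer.convert_tokens_to_ids(word)
    position = (enc["input_ids"][0] == word_id).nonzero()[0].item()
    return hidden[position]

fruit = embed_word("i ate an apple for breakfast.", "apple")
brand = embed_word("apple released a new phone today.", "apple")
banana = embed_word("i ate a banana for breakfast.", "banana")

cos = torch.nn.functional.cosine_similarity
print("apple(fruit) vs banana      :", cos(fruit, banana, dim=0).item())
print("apple(fruit) vs apple(brand):", cos(fruit, brand, dim=0).item())
```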
Within the model, “semantics” corresponds to a structured space of linguistic meaning (the semantic space): the part that can be explicitly represented by mathematical vector operations.
Actual Correspondence Levels:
| Semantic Concept | Transformer Corresponding Structure |
|---|---|
| Meaning differences between words | Distance in semantic space of embedding vectors (cosine similarity) |
| Syntactic structure/roles | Attention head associations (subject-predicate-object correspondence) |
| Logical consistency (true/false, entailment) | Sentence patterns and syntax-semantic structure mappings learned during pretraining |
| Semantic inference (semantic entailment) | Contextual embeddings formed by accumulated weights across Transformer layers, capturing semantic logical relationships |
Example:
The model knows “dog” and “cat” are close in semantic distance because they appear in similar contexts, but both are far from “table.” This is a statistical representation at the “semantic layer.”
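As a toy illustration of that statistical representation, the following sketch hand-codes three tiny vectors (the values are invented purely for demonstration; real embeddings are learned from corpus statistics) and measures their cosine similarity.

```python
# Toy cosine-similarity demo of "semantic distance"; vector values are made up.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Dimensions loosely read as (animal-ness, furniture-ness, pet-context frequency).
embeddings = {
    "dog":   np.array([0.9, 0.1, 0.8]),
    "cat":   np.array([0.8, 0.1, 0.9]),
    "table": np.array([0.1, 0.9, 0.1]),
}

print("dog ~ cat:  ", cosine(embeddings["dog"], embeddings["cat"]))
print("dog ~ table:", cosine(embeddings["dog"], embeddings["table"]))
```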
2.1 Formal Symbol Manipulation vs. True Understanding
Philosopher John Searle’s “Chinese Room Argument" poses a classic challenge: does a system that can perfectly manipulate symbols and produce correct outputs truly “understand" the meaning of those symbols?
Language models’ operations can be analogized as:
- Syntactic level: Highly precise, capable of handling complex syntactic structures
- Semantic level: Can establish vocabulary relationship networks, demonstrating semantic consistency
- Meaning level: Lacks real-world experiential foundation and subjective consciousness
Cognitive scientist Stevan Harnad points out that for symbols to have meaning, they must be “grounded" in perceptual experience. Language models face fundamental challenges:
- Their “semantics" are built on relationships between texts
- Lack direct interactive experience with real-world physical objects
- Cannot connect the word “red" with actual visual perception
This means models may grasp “semantic relationships" (red is a color, different from blue) but may not possess “meaning understanding" (knowing what it feels like to see red).
III. “Meaning Tendency" and Its Correspondence in LLM Transformers
Compared to structural semantic understanding, “meaning tendency" focuses on speaker intention, psychological representation, and contextual understanding. True meaning comprehension involves:
- Theory of Mind: Understanding others’ beliefs, intentions, and emotions
- Common sense reasoning: Applying unstated background knowledge
- Pragmatic inference: Deriving implicatures beyond the literal meaning
Language models show unstable performance in these areas:
- Prone to failure with irony, metaphor, and other phenomena requiring deep intention inference
- Lack intuitive understanding of physical causal relationships
- Cannot truly experience subjective quality of emotions (qualia)
Nevertheless, modern language models demonstrate surprisingly strong contextual understanding:
- Can disambiguate based on context (e.g., “bank" as financial institution or riverbank)
- Understand implicit referential relationships and elliptical structures
- Maintain topic coherence in multi-turn dialogue
These capabilities seem to touch the edge of “meaning understanding," but fundamental limitations remain.
Compared to structural understanding, “meaning tendency” manifests more in the model’s output and generation process. It refers to the emotional coloring, thematic direction, stylistic choices, and value orientations the model exhibits under specific prompts and contextual guidance (a minimal external probe of this is sketched after the list below).
- Emotional and stylistic tendencies: When generating text, models present specific sentiment (positive, negative, neutral) or writing styles (formal, colloquial, academic, humorous) based on training data patterns. This tendency is the model’s statistical choice of style weights when converting input semantic structure to output.
- Thematic and content focus: When users input a prompt about “climate change," the model tends to generate content related to “greenhouse effect," “environmental policies," “extreme weather," etc. This focus and elaboration on specific themes represents a manifestation of meaning tendency.
- Embodiment of bias and values: This is the most controversial aspect of meaning tendency. Since models are trained on real-world human texts, if training data contains inherent social biases (gender, racial, regional biases), models may unconsciously reproduce or amplify these biases in generated content, displaying specific value orientations. Although corrections can be made through alignment and fine-tuning, the model’s essence remains reflecting the meaning tendencies of its training data.
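One hedged way to make such tendencies observable from the outside is to sample several continuations of a prompt and score them with a sentiment classifier. The sketch below uses the public `gpt2` checkpoint and the default `sentiment-analysis` pipeline from Hugging Face `transformers` as stand-ins; neither is a model the article discusses, and the probe only measures surface tendency, not intent.

```python
# Sketch: probe "meaning tendency" by scoring sampled continuations for sentiment.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
sentiment = pipeline("sentiment-analysis")

prompt = "Climate change is"
continuations = generator(prompt, max_new_tokens=30, num_return_sequences=3,
                          do_sample=True, temperature=0.9)

for c in continuations:
    text = c["generated_text"]
    score = sentiment(text)[0]            # e.g. {"label": "NEGATIVE", "score": 0.98}
    print(f'{score["label"]:8s} {score["score"]:.2f}  {text!r}')
```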
“Meaning,” by contrast, belongs to the interpretive layer of model output for human understanding, involving:
- Pragmatic context
- User intention (instruction)
- Model alignment (aligning with human values, tone, emotion, pragmatic habits)
In other words, “meaning" in LLMs is not a built-in structure but a “pragmatic response layer" formed by context + prompt + fine-tuning stages.
Actual Correspondence Levels:
| Meaning Concept | LLM Corresponding Mechanism |
|---|---|
| Contextual interpretation, pragmatic inference | Attention dynamic weighting + contextual embedding |
| Tone, emotion, style | Response patterns learned during alignment (RLHF, DPO) |
| User intention understanding | Prompt conditioning + chain-of-thought decoding |
| Figurative meaning and polysemy | Nonlinear combinations in high-dimensional semantic space (multi-layer attention capturing multiple contexts) |
Example: When a user inputs “You’re so smart," the model judges from training context whether this is “praise" or “sarcasm." This is understanding and generative behavior at the meaning layer, not structural mapping at the semantic layer.
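A rough external analogue of this meaning-layer judgment is zero-shot intent classification: score the utterance against candidate readings such as “sincere praise” and “sarcasm.” The sketch assumes the public `facebook/bart-large-mnli` checkpoint and makes no claim about how chat models perform such judgments internally.

```python
# Sketch: zero-shot classification of tone/intent for a single utterance.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "Oh wow, you're so smart, you forgot your keys again.",
    candidate_labels=["sincere praise", "sarcasm"],
)
print(result["labels"][0], result["scores"][0])   # top label and its confidence
```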
Understanding is a matter of degree: perhaps the question is not whether language models “do” understand meaning, but “to what extent” they understand. We can view understanding as a continuum:
Shallow meaning → Semantic association → Contextual reasoning → Intention understanding → Deep empathy
Language models may be in the middle of this spectrum, possessing a form of “functional meaning understanding" but lacking the experiential and subjective dimensions of human understanding.
IV. The Transformation Process from Semantic Structure → Meaning Tendency (Internal Model Correspondence)
We can view Transformer’s abstraction levels this way:
[Token Embedding Layer] → Formal semantics (word-level semantics)
[Self-Attention Layer] → Structural semantics (syntactic-semantic composition)
[Contextual Layer Stack] → Composite semantics (conceptual relations)
[Decoder / Output Layer] → Pragmatic meaning
[Alignment + RLHF] → Social meaning (socialized meaning)
In other words: The Transformer body processes “semantics," while the alignment stage endows it with “meaning."
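The following schematic sketch (layer names and sizes are assumptions, not the article’s code) shows how those abstraction levels line up in one forward pass of a toy model; alignment happens at training time and is only indicated by a comment.

```python
# Toy encoder stack (no causal masking), used only to label the abstraction levels.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab=1000, dim=64, heads=4, layers=2):
        super().__init__()
        # [Token Embedding Layer] -> word-level (formal) semantics
        self.embed = nn.Embedding(vocab, dim)
        # [Self-Attention + Contextual Layer Stack] -> structural / composite semantics
        block = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.stack = nn.TransformerEncoder(block, layers)
        # [Output Layer] -> next-token logits; decoding over them realizes pragmatic meaning
        self.head = nn.Linear(dim, vocab)
        # [Alignment + RLHF] would further tune these weights toward human-preferred
        # ("social") outputs; it is not modeled in this sketch.

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.stack(self.embed(ids)))

ids = torch.randint(0, 1000, (1, 8))                        # a dummy token sequence
logits = TinyLM()(ids)
next_token = logits[0, -1].softmax(dim=-1).argmax().item()  # greedy "meaning realization"
print(next_token)
```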
Comparison with human language understanding (philosophical layer):
| Dimension | Human | LLM |
|---|---|---|
| Semantics | Meaning formed through concept and perceptual categorization (symbol-object correspondence) | Statistical associations between symbols learned from corpus distribution |
| Meaning | Interpreted through consciousness, context, and intention (has psychological content) | Context-consistent output generated through prompt and context conditioning (no subjective intention) |
| Consciousness association | Has self and referential capability (subjectivity) | Only statistical representations and probability weighting (no consciousness, though pragmatics can be simulated) |
So philosophically speaking:
LLM’s “semantics" is simulated semantics,
while “meaning" is probabilistic pragmatic mimicry.
Engineering perspective on semantic/meaning layer division:
| Architectural Component | Corresponding Layer | Function |
|---|---|---|
| Tokenizer / Embedding | Semantic foundation | Converts symbols into operable semantic vectors |
| Attention Heads | Semantic composition | Captures syntactic relationships and inter-word dependencies |
| Feed-forward Layers | Semantic compression | Abstracts local semantics into high-level concepts |
| Decoder Sampling / Beam Search | Meaning realization | Generates contextually appropriate sentences based on context |
| Alignment (RLHF / DPO) | Meaning socialization | Aligns with human pragmatics and value orientations |
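To illustrate the “meaning realization” row, the sketch below decodes the same prompt with beam search and with temperature sampling, using the public `gpt2` checkpoint as an illustrative stand-in; what changes between the two outputs is the decoding strategy and its stylistic character, not the underlying semantic representation.

```python
# Sketch: same model and prompt, two decoding strategies.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("The factory at night looked like", return_tensors="pt")

# Beam search: conservative, favors high-probability (often flatter) continuations.
beam = lm.generate(**inputs, num_beams=4, do_sample=False, max_new_tokens=20)

# Sampling with temperature: more varied wording, a stronger stylistic "tendency".
sampled = lm.generate(**inputs, do_sample=True, temperature=0.9, top_p=0.95,
                      max_new_tokens=20)

print(tok.decode(beam[0], skip_special_tokens=True))
print(tok.decode(sampled[0], skip_special_tokens=True))
```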
Correspondence Table:
| Item | Semantics | Meaning / Pragmatics |
|---|---|---|
| Abstraction level | Internal model structure and vector relationships | External context and generative intention |
| Transformer correspondence | embedding, attention, layer representation | decoding, alignment, prompt interpretation |
| Nature | Structured, formalizable | Contextualized, alignable (context-conditioned) |
| Primary learning source | Distributional statistics of pretraining corpus | Instruction fine-tuning and human feedback (RLHF) |
| Verification method | Similarity, word embedding distance, semantic consistency tests | Dialogue coherence, tone, intention, emotional consistency |
| Philosophical correspondence | Systematic correspondence between symbols and meaning (Frege’s reference) | Subjective interpretation and pragmatic behavior (Wittgenstein’s language game) |
Distinction and Connection Between Structural Understanding and Tendency Generation
| Feature | Semantic Structure Understanding (Structure) | Meaning Tendency Generation (Tendency) |
|---|---|---|
| Essence | Internalized statistical patterns and relationship mapping | Externalized stylistic choices and content biases |
| Function | Ensures text is grammatically and logically correct and interpretable | Determines the style, emotion, and stance of generated text |
| Level | Belongs to the model’s internal core capability | Belongs to the model’s output performance and application |
| Human correspondence | Similar to cognitive judgment of syntax, semantics, and logic | Similar to the stance, emotion, or rhetoric adopted during expression |
Summary:
- Structure is the foundation of tendency: Language models must first accurately understand semantic structure (e.g., knowing the affirmative semantic structure of “climate change is a serious threat") before they can manifest tendency in generation (e.g., elaborating with anxious tone and environmental advocacy stance).
- Tendency guides structural choices: Conversely, anticipated meaning tendency also influences the model’s choice of vocabulary, sentence patterns, and discourse structure during generation. A requirement for “optimistic" tendency will guide the model to select positively-oriented vocabulary like “bright" and “hope" to construct semantic structure.
V. Breakthroughs and Limitations of Multimodal Models
5.1 Attempts at Perceptual Grounding
Next-generation multimodal models integrate vision and language:
- Can “see" images and describe content
- Connect visual features with linguistic concepts
- Alleviate the grounding problem to some extent
Does this mean they’re closer to true “meaning understanding"? The answer remains unclear. Even if models can process visual input, this processing is still statistical pattern recognition, not “perceptual experience" in the phenomenological sense.
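A minimal sketch of such grounding is scoring candidate captions against an image in a shared vision-language embedding space, for example with CLIP. The checkpoint name and the image path `factory.jpg` below are assumptions chosen for illustration; as the text notes, this remains statistical pattern matching rather than perceptual experience.

```python
# Sketch: image-text similarity in a shared embedding space (CLIP).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("factory.jpg")                       # hypothetical local image
captions = ["a factory under harsh sunlight", "a cat sleeping on a sofa"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
logits = model(**inputs).logits_per_image              # (1, num_captions) similarity scores
probs = logits.softmax(dim=-1)
for caption, p in zip(captions, probs[0]):
    print(f"{p.item():.2f}  {caption}")
```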
5.2 Absence of Embodied Cognition
Philosophers and cognitive scientists emphasize the importance of embodied cognition: meaning formation cannot be separated from bodily interaction with the environment. Language models:
- Have no body, cannot engage in physical interaction
- Have no survival needs, lack motivation for purposeful behavior
- Have no temporally continuous subjective experience
These fundamental differences may limit their possibility of achieving human-like “meaning understanding."
Conclusion
Transformer’s internal mechanisms process “the formation of semantic structure," while alignment and generation processes enable it to present “the flow of meaning."
In other words:
“Semantics is the skeleton by which models understand language; meaning is the soul by which models imitate human language.”
The distinction between “semantics" and “meaning" is not merely terminological differentiation but reflects different approaches to understanding language and mind. Language models demonstrate exceptional capability in processing semantic structure, capturing vocabulary relationships, maintaining semantic consistency, and generating coherent text. However, in complete meaning understanding—involving subjective experience, intention inference, and contextual interpretation—they still have fundamental limitations.
The power of language models lies in their skillful combination of statistical, precise understanding of semantic structure with flexible, stylistically rich generation of meaning tendency.
One-sentence summary:
“Semantics" is logical structural meaning within the symbol system,
“Meaning" is human understanding and interpretive intention of language.
In other words, semantics is the “logic" of language, while meaning is the “soul" of language.
| Perspective | “Semantics" | “Meaning" |
|---|---|---|
| Logical philosophy (Frege, Russell) | Systematic structure of reference and sense; semantics can be formalized (e.g., first-order logic semantics) | In Frege’s system, meaning is closer to “Sinn" (sense), the psychological understanding of sentences |
| Semiotics (Saussure, Peirce) | Stable meaning of the sign system (signifier ↔ signified) | Meaning refers to how sign users attribute meaning (interpretant) |
| Heidegger and hermeneutics | Semantics is “already-given structural meaning" | Meaning is “the operation of understanding between beings," flowing with context |
語言模型的「語義結構」 與「語意傾向」
一、「語義」與「語意」的概念
1.1 詞源與定義
語義(semantics)
源自希臘文「semantikos」,意為「有意義的」。在語言學中,語義學研究的是語言符號與其所指涉對象之間的關係,強調的是客觀、系統性的意義結構。
語義傾向於形式語義學(formal semantics)的視角,追求可驗證的真值條件,關注句子在何種條件下為真。這是一種組合性(compositional)的觀點,認為整體的意義由部分的意義組合而成。
語意(meaning)
則更側重於主觀理解與詮釋,涉及語言使用者在特定情境中對訊息的理解與意圖推斷。語意關注的是「意義的理解」,而非僅是「意義的結構」。
語意則更貼近語用學(pragmatics)與認知語言學的關懷,強調語境、文化背景、說話者意圖等因素對意義理解的影響。這是一種動態、情境依賴的觀點。
| 詞彙 | 英文對應 | 詞源構造 | 著重點 | 常見應用領域 |
|---|---|---|---|---|
| 語義 | semantics | 語(language)+ 義(meaning) | 指「語言符號的意義結構」或「語言與指涉世界的關係」 | 語言學、邏輯學、符號學、人工智慧 |
| 語意 | meaning | 語(language)+ 意(intention, sense) | 指「說話者或文本的意向、內涵、心理層面的意義」 | 心理語言學、文學、語用學、語感分析 |
二、「語義結構」在 LLM Transformer 中的對應
當代語言模型建立在分布語義學(distributional semantics)的基礎上,核心假設是「一個詞的意義由其出現的語境決定」(You shall know a word by the company it keeps)。透過大規模語料訓練,模型學習到詞彙、句子結構之間的統計共現模式。
語言模型的訓練基於海量的文本資料,其核心能力在於學習語言的統計規律和上下文關係。尤其是採用 Transformer 架構的 LLM,透過其核心的自注意力機制(Self-Attention Mechanism),能夠:
- 捕捉句法與語義的依存關係: 模型能夠理解句子中詞彙與詞彙之間,以及短語與短語之間的句法結構和邏輯關係。例如,在「貓追老鼠」這個句子中,模型知道「貓」是主語、「追」是動詞、「老鼠」是賓語,並理解這是一個動作的發出者與承受者的關係。這構成了一種內部化的、統計性的「語義結構」。
- 建立詞向量空間的映射: 透過詞嵌入(Word Embeddings)或更先進的 token 表示,模型將詞彙映射到一個高維向量空間中。在這個空間裡,意義相似或語義相關的詞彙會彼此靠近(例如「蘋果」和「香蕉」),這使得模型能夠在抽象層面「理解」詞彙的含義及其之間的關係,從而處理同義、反義或類比等語義任務。
- 語境化的含義解析: 模型能夠根據上下文來調整詞元的表示,實現一詞多義的區分。例如,它能區分「蘋果」在「我吃了蘋果」中作為水果和在「蘋果公司」中作為品牌名稱的不同含義。這種能力是建立在對整段文本語義結構的深度把握之上。
總體而言,語言模型對「語義結構」的理解,更像是一種高效、大規模的模式匹配與關係映射,而非人類層面的認知或推理。它知道什麼詞彙序列的組合在統計上是「正確」且「有意義」的。
在模型內部,「語義」對應的是 結構化的語言意義空間(semantic space)。
這是可被數學向量操作明確表示的部分。
實際對應層級:
| 語義學概念 | Transformer 對應結構 |
|---|---|
| 詞與詞的意義差異 | embedding 向量在語義空間中的距離(cosine similarity) |
| 語法結構/句法角色 | attention head 關聯(主詞-謂語-受詞對應) |
| 邏輯一致性(true/false, entailment) | 在 pretraining 階段學到的語句模式與句法—語義結構映射 |
| 意義推論(semantic entailment) | Transformer 層間權重累積形成 contextual embedding,可捕捉語義邏輯關係 |
舉例:模型知道 “dog” 和 “cat” 的語義距離近,因為它們在類似語境出現;但與 “table” 的語義距離遠。這是「語義層」的統計表徵。
2.1 形式符號操縱與真實理解
哲學家約翰·塞爾(John Searle)的「中文房間論證」(Chinese Room Argument)提出了經典質疑:一個系統即使能完美地操縱符號並產生正確的輸出,是否代表它真正「理解」這些符號的意義?
語言模型的運作可類比為:
- 語法層面:高度精確,能掌握複雜的句法結構
- 語義層面:能建立詞彙關係網絡,展現語義一致性
- 語意層面:缺乏真實世界的經驗基礎與主體性意識
認知科學家 Stevan Harnad 指出,符號要具有意義,必須「接地」於感知經驗。語言模型面臨的根本挑戰是:
- 它們的「語義」建立在文本與文本之間的關係上
- 缺乏與真實世界物理對象的直接互動經驗
- 無法將「紅色」這個詞與實際的視覺感知連結
這意味著模型可能掌握了「語義關係」(紅色是一種顏色,與藍色不同),卻未必擁有「語意理解」(知道看到紅色是什麼感受)。
三、「語意傾向」在 LLM Transformer 中的對應
相較於語義結構性的理解,「語意傾向」著重在說話者的意圖、心理表徵、語境理解,真正的語意理解涉及:
- 心智理論(Theory of Mind):理解他人的信念、意圖、情感
- 常識推理:運用未明說的背景知識
- 實用推論:從字面意義推導言外之意
語言模型在這些方面的表現不穩定:
- 對於諷刺、隱喻等需要深層意圖推斷的語言現象容易失敗
- 缺乏對物理因果關係的直觀理解
- 無法真正體會情感的主觀質性(qualia)
現代語言模型雖展現了令人驚訝的語境理解能力:
- 能根據上下文消歧義(如「銀行」可指金融機構或河岸)
- 理解隱含的指代關係與省略結構
- 在多輪對話中維持主題連貫性
這些能力似乎觸及了「語意理解」的邊緣,但仍有根本限制。
相較於結構性的理解,「語意傾向」則更多地體現在模型的輸出和生成過程中。它指的是模型在特定提示(Prompt)和上下文的引導下,所表現出的情感色彩、主題方向、風格選擇以及價值觀偏向。
- 情感與風格的傾向性: 模型在生成文本時,會依據訓練數據的模式,呈現出特定的情感基調(Sentiment)(如積極、消極、中性)或寫作風格(如正式、口語化、學術性、幽默)。這種傾向性是模型將輸入的語義結構轉換為輸出時,對風格權重的統計選擇。
- 主題與內容的聚焦: 當使用者輸入一個關於「氣候變遷」的提示時,模型會傾向於生成與「溫室效應」、「環保政策」、「極端天氣」等相關的主題內容。這種對特定主題的聚焦和展開,就是一種語意傾向的表現。
- 偏見與價值的體現: 這是語意傾向中最具爭議的部分。由於模型是在真實世界的人類文本上訓練的,如果訓練數據中存在固有的社會偏見(如性別、種族、地域偏見),模型在生成內容時就可能不自覺地重現或放大這些偏見,表現出特定的價值觀傾向。儘管透過對齊(Alignment)和微調(Fine-tuning)可以進行修正,但模型的本質仍是反映其訓練數據的語意傾向。
而「語意」則屬於 模型輸出對人類理解的詮釋層,涉及:
- 上下文脈絡(pragmatic context)
- 使用者意圖(instruction)
- 模型 alignment(對齊人類價值、語氣、情緒、語用習慣)
也就是說,「語意」在 LLM 裡不是內建結構,而是由語境 + prompt + fine-tuning 階段形成的「語用反應層」。
實際對應層級:
| 語意概念 | LLM 對應機制 |
|---|---|
| 語境解讀、語用推論 | Attention 動態加權 + contextual embedding |
| 語氣、情感、風格 | 在 alignment (RLHF, DPO) 階段學到的回應模式 |
| 使用者意圖理解 | prompt conditioning + chain-of-thought decoding |
| 文意隱喻與多義性 | 高維語意空間的非線性組合(多層注意力捕捉多重語境) |
例如當使用者輸入「你真聰明」,模型會依訓練語境判斷這是「誇獎」或「諷刺」。這是語意層的理解與生成行為,不是語義層的結構映射。
這是「理解」的程度問題:或許問題不在於語言模型「是否」理解語意,而是「在何種程度上」理解。我們可以將理解視為一個連續體:
淺層語意 → 語義關聯 → 語境推理 → 意圖理解 → 深層共情
語言模型可能處於這個光譜的中段,擁有一定形式的「功能性語意理解」,但缺乏人類理解中的經驗性與主體性維度。
四、語義結構→ 語意傾向的轉換過程(模型內部對應)
可以這樣看 Transformer 的抽象層級:
[Token Embedding 層] → 形式語義(word-level semantics)
[Self-Attention 層] → 結構語義(syntactic-semantic composition)
[Contextual Layer Stack] → 複合語義(conceptual relations)
[Decoder / Output 層] → 語用語意(pragmatic meaning)
[Alignment + RLHF] → 社會語意(socialized meaning)
也就是說:Transformer 本體處理「語義」, 而 alignment(對齊)階段賦予它「語意」。
與人類語言理解對比(哲學層):
| 層面 | 人類 | LLM |
|---|---|---|
| 語義 | 經由概念與知覺分類形成意義(符號與對象對應) | 經由語料分佈學習符號間的統計關聯 |
| 語意 | 由意識、情境與意圖詮釋語言(有心理內容) | 由 prompt 與 context 條件化生成語境一致的輸出(無主體意圖) |
| 意識關聯 | 有自我與指涉能力(subjectivity) | 僅有統計意象與概率加權(無意識,但可模擬語用) |
所以哲學上說: LLM 的「語義」是模擬語義(simulated semantics), 而「語意」是對語境的機率性擬態(probabilistic pragmatics)。
工程視角下的語義/語意層分工:
| 架構元件 | 對應層 | 功能 |
|---|---|---|
| Tokenizer / Embedding | 語義基底 | 把符號轉成可操作的語義向量 |
| Attention Heads | 語義組構 | 捕捉語法關係與詞間依存 |
| Feed-forward Layers | 語義壓縮 | 將局部語義抽象成高層概念 |
| Decoder Sampling / Beam Search | 語意實現 | 根據上下文生成語境恰當的句子 |
| Alignment (RLHF / DPO) | 語意社會化 | 對齊人類語用與價值傾向 |
對應關係表:
| 項目 | 語義(Semantics) | 語意(Meaning / Pragmatics) |
|---|---|---|
| 抽象層級 | 模型內部結構與向量關係 | 模型外部語境與生成意圖 |
| 對應 Transformer | embedding, attention, layer representation | decoding, alignment, prompt interpretation |
| 性質 | 結構化、可形式化 | 語境化、可對齊(context-conditioned) |
| 主要學習來源 | 預訓練語料的分佈統計 | 指令微調與人類反饋(RLHF) |
| 驗證方式 | 相似度、詞嵌入距離、語義一致性測試 | 對話連貫性、語氣、意圖、情緒一致性 |
| 哲學對應 | 符號與意義的系統對應(Frege 的 reference) | 主體詮釋與語用行為(Wittgenstein 的 language game) |
結構理解與傾向生成的區別與聯繫
| 特徵 | 語義結構的理解(Structure) | 語意傾向的生成(Tendency) |
|---|---|---|
| 本質 | 內部化的統計模式和關係映射。 | 外部化的風格選擇和內容偏向。 |
| 功能 | 確保文本在文法和邏輯上正確且可解釋。 | 決定生成文本的風格、情感和立場。 |
| 層次 | 屬於模型的內部核心能力。 | 屬於模型的輸出表現與應用。 |
| 人類對應 | 類似於對句法、語義、邏輯的認知判斷。 | 類似於在表達時所採取的立場、情感或修辭。 |
小結:
- 結構是傾向的基礎 : 語言模型必須先準確地理解語義結構(例如,知道「氣候變遷是嚴重的威脅」的肯定語義結構),才能在生成時表現出傾向(例如,以焦慮的語氣和呼籲環保的立場來闡述)。
- 傾向指導結構的選擇: 反過來,預期的語意傾向也會影響模型在生成過程中對詞彙、句式和論述結構的選擇。一個要求「樂觀」的傾向,會引導模型選擇「光明」、「希望」等積極傾向的詞彙來構建語義結構。
五、多模態模型的突破與局限
5.1 感知接地的嘗試
新一代多模態模型整合了視覺與語言:
- 能夠「看見」圖片並描述內容
- 將視覺特徵與語言概念連結
- 在一定程度上緩解了接地問題
這是否意味著它們更接近真正的「語意理解」?答案仍不明確。即使模型能處理視覺輸入,這種處理仍是統計模式識別,而非現象學意義上的「感知經驗」。
5.2 具身認知的缺席
哲學家與認知科學家強調具身認知(embodied cognition)的重要性:意義的形成離不開身體與環境的互動。語言模型:
- 沒有身體,無法進行物理互動
- 沒有生存需求,缺乏目的性行為的動機
- 沒有時間連續性的主體經驗
這些根本性的差異可能限制了它們達到人類式「語意理解」的可能性。
結論
Transformer 的內部機制處理的是「語義結構的形成」,
而 Alignment 與生成過程讓它呈現「語意的流動」。
換言之:
“語義是模型理解語言的骨架,語意是模型模仿人類對語言的靈魂。”
「語義」與「語意」的區分不僅是術語的辨析,更反映了理解語言與心智的不同取徑。語言模型在處理語義結構方面展現了卓越能力,能夠捕捉詞彙關係、維持語義一致性、生成連貫文本。然而,在完整的語意理解——涉及主觀經驗、意圖推斷、情境詮釋——方面,它們仍有根本性的限制。
語言模型的強大,在於它能夠巧妙地結合對語義結構的統計性、精準的理解,與對語意傾向的靈活、富有風格的生成。
一句總結:
「語義」是符號系統裡的邏輯結構意義, 「語意」是人對語言的理解與詮釋意圖。
換言之語義是語言的「邏輯」,語意是語言的「靈魂」。
| 觀點 | 「語義」 | 「語意」 |
|---|---|---|
| 邏輯哲學(Frege, Russell) | 指稱(reference)與意義(sense)的系統結構;語義可形式化(如一階邏輯語義)。 | 在 Frege 體系中,語意更接近「Sinn」(意涵),即心理上對語句的理解。 |
| 符號學(Saussure, Peirce) | 符號系統的穩定意義(signifier ↔ signified)。 | 語意則指符號使用者如何賦予意義(interpretant)。 |
| 海德格(Heidegger)與詮釋學 | 語義是「已給出的結構意義」。 | 語意是「存在者之間的理解運作」,隨語境流動。 |
參考資料
學術文獻
- Searle, J. R. (1980). “Minds, brains, and programs." Behavioral and Brain Sciences, 3(3), 417-424.
- 提出中文房間論證,質疑純符號操縱是否等同理解。
- Harnad, S. (1990). “The symbol grounding problem." Physica D: Nonlinear Phenomena, 42(1-3), 335-346.
- 探討符號如何獲得意義,指出接地的必要性。
- Bender, E. M., & Koller, A. (2020). “Climbing towards NLU: On meaning, form, and understanding in the age of data." Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5185-5198.
- 分析語言模型在語義形式與真實理解之間的差距。
- Mitchell, M., & Krakauer, D. C. (2023). “The debate over understanding in AI’s large language models." Proceedings of the National Academy of Sciences, 120(13), e2215907120.
- 討論 LLM 是否真正理解語言的當代辯論。
- Piantadosi, S. T., & Hill, F. (2022). “Meaning without reference in large language models." arXiv preprint arXiv:2208.02957.
- 探討語言模型如何在缺乏外部指涉的情況下處理意義。
語言學與認知科學
- Evans, V., & Green, M. (2006). Cognitive Linguistics: An Introduction. Edinburgh: Edinburgh University Press.
- 介紹認知語言學對意義的理解,強調具身與經驗基礎。
- Firth, J. R. (1957). “A synopsis of linguistic theory, 1930-1955." Studies in Linguistic Analysis, 1-32.
- 提出「詞的意義由其語境決定」的經典觀點。
- Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.
- 系統性介紹語用學,探討語境如何影響意義理解。
人工智慧與自然語言處理
- Mikolov, T., et al. (2013). “Distributed representations of words and phrases and their compositionality." Advances in Neural Information Processing Systems, 26.
- Word2Vec 的經典論文,開啟現代詞向量時代。
- Vaswani, A., et al. (2017). “Attention is all you need." Advances in Neural Information Processing Systems, 30.
- Transformer 架構的原始論文,奠定現代語言模型基礎。
- Brown, T., et al. (2020). “Language models are few-shot learners." Advances in Neural Information Processing Systems, 33, 1877-1901.
- GPT-3 論文,展示大規模語言模型的驚人能力。
- Marcus, G., & Davis, E. (2020). “GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about." MIT Technology Review.
- 批判性分析語言模型的理解局限。
哲學與理論
- Wittgenstein, L. (1953). Philosophical Investigations. Oxford: Blackwell.
- 探討意義即使用,影響當代語言哲學深遠。
- Putnam, H. (1975). “The meaning of ‘meaning’." Minnesota Studies in the Philosophy of Science, 7, 131-193.
- 提出語義外在論,挑戰傳統意義理論。
- Lakoff, G., & Johnson, M. (1980). Metaphors We Live By. Chicago: University of Chicago Press.
- 探討隱喻與具身認知在意義形成中的角色。
多模態與具身 AI
- Lake, B. M., et al. (2017). “Building machines that learn and think like people." Behavioral and Brain Sciences, 40, e253.
- 討論 AI 如何更接近人類式學習與理解。
- Bisk, Y., et al. (2020). “Experience grounds language." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 8718-8735.
- 論證語言理解需要感知與行動經驗的支持。
中文資源
- 何萬順 (2015)。《語言學概論》。台北:五南圖書出版公司。
- 系統性介紹語言學基本概念,包含語義與語用。
- 鄭昭明 (2006)。《認知心理學:理論與實踐》。台北:桂冠圖書公司。
- 從認知心理學角度探討語言理解機制。
- 湯志民、林宏洲 (2023)。"大型語言模型的語義理解能力評估"。《人工智慧學刊》,15(2), 45-68。
- 針對中文語言模型的語義能力進行實證研究。





