Semantic Search

🇹🇭 Thai

Searching for information by meaning rather than by exact keyword matching. Text is converted into vectors in an embedding space, and the documents that are semantically “closest” to the query are retrieved.

Principle

Texts with similar meanings are mapped to vectors that lie close together in a high-dimensional space, even when they use different words. For example, “ราคาหุ้นตก” (“stock price dropped”) and “มูลค่าตลาดลดลง” (“market value declined”) sit close together in the embedding space.

Query  → Embed → Vector ──────┐
                              ↓
Corpus → Embed → Vectors → Similarity search (cosine/dot product) → Top-K results

Usage in This Wiki

System	How it uses semantic search
MemPalace	Main retrieval layer: searches conversation history
MiroFish	Part of the GraphRAG pipeline
LLM Wiki Pattern	Not used; relies on `index.md` + LLM reading instead

Related

RAG vs Wiki: semantic search is the retrieval mechanism on the RAG side
MemPalace: an implementation that uses semantic search as its primary retrieval method
MiroFish: uses it in its GraphRAG pipeline

🇬🇧 English

A retrieval method that finds information based on meaning rather than exact keyword matching. Text is converted to vectors in an embedding space, and documents that are semantically “close” are retrieved — even if they use entirely different words.

How It Works

Text with similar meanings is mapped to nearby vectors in a high-dimensional space. For example, “stock price dropped” and “market value declined” would be close in embedding space even though they share no keywords.

Query  → Embed → Vector ──────┐
                              ↓
Corpus → Embed → Vectors → Similarity search (cosine/dot product) → Top-K results
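
The pipeline above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `WORD_VECTORS`, `embed`, `cosine`, `top_k`, and `CORPUS` are hypothetical names, and the hand-written word table stands in for a trained embedding model, which is what a real system would call at the "Embed" step. The ranking step (cosine similarity, then keep the best K) is the part that carries over.

```python
import math

# Hypothetical toy "embedding model". Dimensions: finance, decrease, increase, animal.
# A real system would replace this table with a trained embedding model.
WORD_VECTORS = {
    "stock": [1, 0, 0, 0], "price": [1, 0, 0, 0], "market": [1, 0, 0, 0],
    "value": [1, 0, 0, 0], "shares": [1, 0, 0, 0],
    "dropped": [0, 1, 0, 0], "declined": [0, 1, 0, 0],
    "climbed": [0, 0, 1, 0],
    "kitten": [0, 0, 0, 1],
}

def embed(text):
    """Sum the vectors of all known words; unknown words contribute nothing."""
    vec = [0.0, 0.0, 0.0, 0.0]
    for word in text.lower().split():
        for i, x in enumerate(WORD_VECTORS.get(word, [0, 0, 0, 0])):
            vec[i] += x
    return vec

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, corpus, k=2):
    """Embed the query, score every document, return the k best (text, score) pairs."""
    q = embed(query)
    scored = [(doc, cosine(q, embed(doc))) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

CORPUS = ["market value declined", "shares climbed today", "the kitten slept"]

for doc, score in top_k("stock price dropped", CORPUS, k=2):
    print(f"{score:.3f}  {doc}")
```

With this toy table, "market value declined" ranks first for the query "stock price dropped" even though the two share no words, which is the behavior described above. The brute-force loop in `top_k` compares the query against every document, which is fine for small corpora only.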

Usage in This Wiki

System	How it uses semantic search
MemPalace	Core retrieval layer — finds relevant conversation history from a natural language query
MiroFish	Used in the graph-building step of the GraphRAG pipeline
LLM Wiki Pattern	Not used — relies on `index.md` + direct LLM page reading at current scale

Key Properties

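Language-agnostic similarity: finds related content across paraphrases and synonyms, and across languages when the embedding model is multilingual
Embedding-model dependent: retrieval quality is limited by how well the chosen embedding model represents the domain and language of the corpus
Approximate nearest neighbor (ANN): large corpora use ANN algorithms (HNSW, IVF) that trade a small amount of recall for much faster search
Common backends: ChromaDB (MemPalace default), FAISS, Pinecone, Weaviate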
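
To make the ANN trade-off concrete, here is a minimal sketch of the idea behind IVF (inverted file) indexing: vectors are grouped into buckets by their nearest centroid, and a query scans only the closest bucket(s) instead of the whole corpus. Everything here is an assumption for illustration: `TinyIVFIndex` is a hypothetical class, the centroids are fixed by hand (real IVF indexes typically learn them with k-means), and the parameter name `nprobe` is borrowed from common usage. It is not how any particular backend is implemented.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale to unit length so that dot product equals cosine similarity."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

class TinyIVFIndex:
    """Toy inverted-file index: each vector lives in the bucket of its nearest centroid."""

    def __init__(self, centroids):
        self.centroids = [normalize(c) for c in centroids]
        self.buckets = [[] for _ in centroids]  # one list of (doc_id, vector) per centroid

    def _rank_centroids(self, v):
        # Centroid indices ordered from most to least similar to v.
        return sorted(range(len(self.centroids)),
                      key=lambda i: dot(v, self.centroids[i]), reverse=True)

    def add(self, doc_id, vector):
        v = normalize(vector)
        self.buckets[self._rank_centroids(v)[0]].append((doc_id, v))

    def search(self, query, k=3, nprobe=1):
        """Scan only the nprobe closest buckets, then rank those candidates exactly."""
        q = normalize(query)
        candidates = []
        for i in self._rank_centroids(q)[:nprobe]:
            candidates.extend(self.buckets[i])
        scored = sorted(((dot(q, v), doc_id) for doc_id, v in candidates), reverse=True)
        return [doc_id for _, doc_id in scored[:k]]

index = TinyIVFIndex(centroids=[[1, 0], [0, 1]])
for doc_id, vec in [("a", [0.9, 0.1]), ("b", [0.8, 0.3]), ("c", [0.1, 0.9]),
                    ("d", [0.55, 0.45]), ("e", [0.45, 0.55])]:
    index.add(doc_id, vec)

print(index.search([0.6, 0.4], k=3, nprobe=1))  # ['d', 'b', 'a']: fast, but misses 'e'
print(index.search([0.6, 0.4], k=3, nprobe=2))  # ['d', 'b', 'e']: scans all buckets, exact
```

With `nprobe=1` the search skips the second bucket and returns "a" in place of the truly closer "e"; raising `nprobe` recovers the exact answer at the cost of scanning more vectors. That recall-versus-speed dial is what "approximate" refers to.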
