RAG for Financial Research

🇹🇭 Thai

RAG for Financial Research (based on Chapter 22 of Machine Learning for Algorithmic Trading) examines why Retrieval-Augmented Generation is necessary in finance, a domain where guessing or fabricating answers (hallucination) causes serious damage.

Key Takeaways

Limitations of basic RAG: Fixed-size chunking breaks the tables and structure of financial documents. Production systems need structure-aware parsing and domain-specific embeddings (e.g., Fin-E5) instead of generic models.
Hybrid search is essential: Semantic search often fails to find tickers or specific figures, so it must be paired with hybrid search (vector + BM25 keyword) and strict metadata filtering (e.g., fiscal year).
Layered hallucination defense: Use constraint-based prompting and re-ranking, and move arithmetic out of the LLM into a tool that computes precisely.
RAG is a bridge to agents: The framework offers a systematic way to analyze RAG errors (split into retrieval, context, synthesis, and computation errors) and positions RAG as just one tool within a larger agentic framework.
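
The idea of handing arithmetic to a deterministic tool can be illustrated with a small calculator that the LLM would call instead of computing figures itself. This is a minimal sketch, not the chapter's implementation: the function name `evaluate_expression` is hypothetical, and it assumes the model emits a plain arithmetic expression as a string.

```python
import ast
import operator
from decimal import Decimal

# Only these four binary operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_expression(expression: str) -> Decimal:
    """Evaluate +, -, *, / on numeric literals using Decimal arithmetic.

    Hypothetical helper: the LLM supplies the expression, this tool does
    the math. Decimal avoids binary floating-point rounding for +, -, *;
    division is rounded to the context precision (28 digits by default).
    """

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            value = node.value
            # Accept int/float literals only (bool is a subclass of int).
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return Decimal(str(value))
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Names, calls, attribute access, etc. are never evaluated.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


if __name__ == "__main__":
    # Made-up figures: percentage change from 100 to 120.
    print(evaluate_expression("(120 - 100) / 100 * 100"))
```

Parsing with `ast` rather than calling `eval` means the tool refuses anything that is not plain arithmetic, so a malformed or adversarial model output raises an error instead of being executed.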

🇬🇧 English

RAG for Financial Research (based on Chapter 22 of Machine Learning for Algorithmic Trading) explains why Retrieval-Augmented Generation (RAG) is necessary in finance, a domain where LLM hallucination can cause serious damage.

Key Takeaways

Limitations of Naive RAG: Fixed-size chunking breaks financial tables and loses temporal metadata. Production systems need structure-aware parsing and domain-specific embeddings (e.g., Fin-E5 or Voyage AI embeddings) rather than general-purpose models.
Hybrid Search is Essential: Pure semantic search often misses exact tickers and specific financial figures. Financial RAG needs hybrid search (dense vector retrieval combined with BM25 keyword matching) plus strict metadata filtering (e.g., fiscal year, company).
Multi-layered Hallucination Defense: The pipeline enforces grounding through constraint-based prompting and cross-encoder re-ranking, and it delegates arithmetic to deterministic tools instead of relying on the LLM’s math capabilities.
A Bridge to Agents: The chapter provides a diagnostic framework for RAG failures (retrieval, context, synthesis, and computation errors) and positions RAG as one tool within broader agentic frameworks such as ReAct.
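
The hybrid search takeaway above can be sketched in a few lines: filter by metadata first, rank the surviving chunks by BM25 and by vector similarity, then merge the two rankings. The summary does not say how the rankings are fused, so reciprocal rank fusion (with the conventional constant 60) is an assumption here, as are the whitespace tokenizer, the BM25 parameters, the function names, and the toy chunks with hand-written two-dimensional "embeddings" and fictional tickers.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer keeps tickers such as "ACME" as exact tokens.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def to_ranks(scores):
    """Map item index -> 1-based rank, best score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {idx: pos + 1 for pos, idx in enumerate(order)}


def hybrid_search(query, query_vec, chunks, filters, top_k=3, rrf_k=60):
    # 1. Strict metadata filter: drop chunks from the wrong company or
    #    fiscal year before any scoring happens.
    candidates = [c for c in chunks
                  if all(c["meta"].get(key) == val for key, val in filters.items())]
    if not candidates:
        return []
    # 2. Rank the survivors twice: by keyword match and by vector similarity.
    kw_rank = to_ranks(bm25_scores(query, [c["text"] for c in candidates]))
    vec_rank = to_ranks([cosine(query_vec, c["embedding"]) for c in candidates])
    # 3. Reciprocal rank fusion (assumed fusion method): sum 1 / (rrf_k + rank).
    fused = {i: 1 / (rrf_k + kw_rank[i]) + 1 / (rrf_k + vec_rank[i])
             for i in range(len(candidates))}
    best = sorted(fused, key=fused.get, reverse=True)[:top_k]
    return [candidates[i]["id"] for i in best]


# Toy data: fictional tickers, made-up text, hand-written embeddings.
CHUNKS = [
    {"id": "acme-2023-revenue", "text": "ACME revenue grew on cloud demand",
     "meta": {"ticker": "ACME", "fiscal_year": 2023}, "embedding": [0.9, 0.1]},
    {"id": "acme-2023-risk", "text": "ACME risk factors include supply chain disruption",
     "meta": {"ticker": "ACME", "fiscal_year": 2023}, "embedding": [0.2, 0.8]},
    {"id": "acme-2022-revenue", "text": "ACME revenue declined on weak demand",
     "meta": {"ticker": "ACME", "fiscal_year": 2022}, "embedding": [0.9, 0.2]},
    {"id": "globx-2023-revenue", "text": "GLOBX sales increased in all segments",
     "meta": {"ticker": "GLOBX", "fiscal_year": 2023}, "embedding": [0.8, 0.3]},
]

if __name__ == "__main__":
    print(hybrid_search("ACME revenue", [1.0, 0.0], CHUNKS,
                        {"ticker": "ACME", "fiscal_year": 2023}))
    # → ['acme-2023-revenue', 'acme-2023-risk']
```

Filtering before scoring is the point of the "strict" metadata step: the 2022 chunk is a close match on both keywords and vectors, yet it can never be returned for a fiscal-2023 question. In a real system the embeddings would come from an embedding model and the search from a vector store; the in-memory lists here only stand in for those components.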

PrasitN Wiki
