Concepts

Chatbot Evaluation Benchmarks

Benchmarks for LLM chatbots: MT-Bench, MMLU, GSM8K, and related tasks for dialogue quality, reasoning, and trustworthiness.

25.05.2025

Benchmarking Text Embeddings

MTEB, BEIR, STS, and related benchmarks for evaluating text embedding quality across retrieval, classification, and clustering.

25.05.2025

Conversational ReAct Description

Conversational ReAct builds on the original ReAct paradigm by weaving in two critical capabilities: (1) persistent memory so that ...

06.13.2025

Model Context Protocol - An Architectural Pattern

The Model Context Protocol (MCP) is a structured framework for designing intelligent systems powered by large language models ...

02.01.2025

Prompt Injection

Prompt injection is a critical security vulnerability specific to AI applications that use large language models (LLMs). Unlike ...

06.10.2025

AI Vulnerabilities Exposed

OWASP Top 10 for LLM applications: prompt injection, supply chain, model theft, training data poisoning, secret management, and secure tool design.

25.04.2025

Reflexion Actor

The Reflexion Actor is a cognitive AI architecture that enhances an agent's ability to learn from its mistakes through self-reflection. It ...

06.14.2025

Zero-Shot ReAct Description

Zero-Shot ReAct Description is a technique used in AI agents to make decisions dynamically without requiring prior training examples or ...

06.16.2025