Tuesday, September 16, 2025

Amazon Kendra | Overview.

Scope:

  • Quick elevator (one-liner),
  • Core concepts & components,
  • How Kendra “thinks” about relevance,
  • Query types & behavior,
  • GenAI integration (what’s different),
  • Security & governance,
  • Scalability, performance & capacity,
  • Observability & metrics to watch,
  • Costs & pricing model (high level),
  • Integrations & common architectures,
  • Best practices & tuning checklist,
  • Common pitfalls & gotchas,
  • Sample reference architecture (textual diagram),
  • An Architecture diagram,
  • Link to AWS Documentation,
  • Insights.

Quick elevator (one-liner)

    • Amazon Kendra is a managed, ML-powered enterprise search service that provides semantic / natural-language search over heterogeneous content (S3, SharePoint, databases, FAQs, etc.).
    • Amazon Kendra is equipped with features for:
      • Relevance tuning, 
      • Connectors, 
      • Document enrichment, 
      • (newer) GenAI-index integration for retrieval-augmented responses.

Core concepts & components

    • Index: the primary unit. It stores documents, attributes, embeddings/representations, and ranking metadata.
    • Index types include Developer, Enterprise, and GenAI Enterprise (for easier reuse with generative tools).
    • Data sources / Connectors: native, scheduled connectors for S3, SharePoint, Salesforce, RDS, Confluence, Drive, etc.
    • twtech can also implement custom connectors or push documents via API.
    • Ingestion pipeline: fetch (connector) → convert/parse (file types: PDF, Office, HTML, text, etc.) → optional pre-processing/enrichment → indexing. Custom Document Enrichment lets twtech alter attributes/content at ingest time.
    • Document chunking & fields: documents are broken into passages/paragraph chunks; fields/metadata (author, source, date, doc-type) are indexed separately so twtech can boost/filter by them.
    • Query engine: combines semantic matching (deep models / embeddings), lexical matching, and a ranking model that uses field boosts, recency, and other signals to produce a ranked result set.
    • Supports direct answers (answer extraction) and returning source passages.
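The "push via API" path above can be sketched with the BatchPutDocument operation. This is a minimal sketch, not a definitive implementation: the helper names (build_document, push_documents) are hypothetical, and it assumes boto3 is installed with AWS credentials configured and that the attribute keys used are reserved fields or custom fields already defined on the index.

```python
from typing import Any


def build_document(doc_id: str, title: str, text: str,
                   attributes: dict[str, str]) -> dict[str, Any]:
    """Build one entry for the Documents list of BatchPutDocument."""
    return {
        "Id": doc_id,
        "Title": title,
        "Blob": text.encode("utf-8"),
        "ContentType": "PLAIN_TEXT",
        # Keys must be reserved fields (e.g. _category) or custom fields
        # that already exist on the index.
        "Attributes": [
            {"Key": key, "Value": {"StringValue": value}}
            for key, value in sorted(attributes.items())
        ],
    }


def push_documents(index_id: str,
                   documents: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Send documents in batches of 10 and collect the FailedDocuments entries."""
    import boto3  # imported lazily so build_document works without AWS access

    kendra = boto3.client("kendra")
    failed: list[dict[str, Any]] = []
    for start in range(0, len(documents), 10):
        response = kendra.batch_put_document(
            IndexId=index_id,
            Documents=documents[start:start + 10],
        )
        failed.extend(response.get("FailedDocuments", []))
    return failed
```

Keeping the payload builder separate from the AWS call lets twtech unit-test the document shape without touching a real index.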

How Kendra “thinks” about relevance

    • Hybrid scoring: Kendra blends semantic similarity (contextual/embedding-like signals) with traditional retrieval and engineered ranking features.
    • Relevance tuning: twtech can boost documents by field, source, or attribute (e.g., boost FAQs, fresh docs, or a trusted source). A boost biases ranking toward the favored documents; it does not force them into the results. Tuning is available both in the console and through the API.
    • Synonyms & domain vocab: upload synonym lists and domain-specific terminology so queries with alternate terms still match the correct content.
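The API side of relevance tuning can be sketched as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the 30-day duration and the weights are illustrative values, and the resulting list is meant for the DocumentMetadataConfigurationUpdates parameter of UpdateIndex.

```python
from typing import Any


def freshness_boost(importance: int, duration: str = "2592000s") -> dict[str, Any]:
    """Favor recently created documents (duration is in seconds, here 30 days)."""
    if not 1 <= importance <= 10:
        raise ValueError("importance must be between 1 and 10")
    return {
        "Name": "_created_at",
        "Type": "DATE_VALUE",
        "Relevance": {
            "Freshness": True,
            "Importance": importance,
            "Duration": duration,
        },
    }


def value_boost(field: str, weights: dict[str, int], importance: int) -> dict[str, Any]:
    """Favor specific values of a string field, e.g. FAQs over other categories."""
    if not 1 <= importance <= 10:
        raise ValueError("importance must be between 1 and 10")
    return {
        "Name": field,
        "Type": "STRING_VALUE",
        "Relevance": {
            "Importance": importance,
            "ValueImportanceMap": dict(weights),
        },
    }


def apply_boosts(index_id: str, updates: list[dict[str, Any]]) -> None:
    """Apply the tuning to an index (requires boto3 and AWS credentials)."""
    import boto3  # lazy import keeps the builders usable offline

    boto3.client("kendra").update_index(
        Id=index_id,
        DocumentMetadataConfigurationUpdates=updates,
    )
```

A call such as apply_boosts(index_id, [freshness_boost(8), value_boost("_category", {"faq": 10}, 5)]) would bias ranking toward fresh documents and FAQ content without excluding anything else.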

Query types & behavior

    • Natural-language questions (Who, What, How): Kendra extracts concise answers when the exact answer is present in content.
    • Descriptive questions: Kendra returns passages or full documents for how-to or explanatory queries.
    • Keyword / faceted search: supports filters/facets on metadata and structured attributes.
    • Limits: Kendra is not designed to aggregate answers requiring cross-document synthesis or to perform arbitrary calculations over content (it favors direct extraction and ranking).
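The query behaviors above map onto the Query API roughly as sketched below. The helper names are hypothetical, and the sketch assumes the reserved _category field has been made facetable on the index. Result items carry a Type of ANSWER, QUESTION_ANSWER, or DOCUMENT, which is what the grouping relies on.

```python
from collections import defaultdict
from typing import Any, Optional


def build_query(index_id: str, text: str,
                category: Optional[str] = None) -> dict[str, Any]:
    """Build keyword arguments for the Kendra Query call."""
    request: dict[str, Any] = {
        "IndexId": index_id,
        "QueryText": text,
        "PageSize": 10,
        # Ask for facet counts on the category field.
        "Facets": [{"DocumentAttributeKey": "_category"}],
    }
    if category is not None:
        # Restrict results to one category value.
        request["AttributeFilter"] = {
            "EqualsTo": {"Key": "_category", "Value": {"StringValue": category}}
        }
    return request


def group_by_type(response: dict[str, Any]) -> dict[str, list[str]]:
    """Group result excerpts by item type (ANSWER, QUESTION_ANSWER, DOCUMENT)."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for item in response.get("ResultItems", []):
        excerpt = item.get("DocumentExcerpt", {}).get("Text", "")
        grouped[item["Type"]].append(excerpt)
    return dict(grouped)
```

With boto3 available, the request would be sent as boto3.client("kendra").query(**build_query(index_id, "How do I reset my VPN password?")) and the response passed to group_by_type.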

GenAI integration (what’s different)

    • GenAI Enterprise Index: allows Kendra-indexed data to be used as a retriever for generative systems (e.g., Bedrock, other RAG setups) — easier reuse of indexed knowledge for assistants and prompt flows.
    • It also has more granular capacity options.
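Using an index as a retriever typically goes through the Retrieve API, which returns longer passages than Query. The following is a minimal sketch: the function names and the 4,000-character budget are assumptions for illustration, and the generation step (e.g., a Bedrock model call) is left out.

```python
from typing import Any


def retrieve_passages(index_id: str, question: str, page_size: int = 5) -> dict[str, Any]:
    """Call the Retrieve API (requires boto3 and AWS credentials)."""
    import boto3  # lazy import so the formatter below works offline

    return boto3.client("kendra").retrieve(
        IndexId=index_id, QueryText=question, PageSize=page_size
    )


def passages_to_context(response: dict[str, Any], max_chars: int = 4000) -> str:
    """Concatenate numbered passages until the character budget is reached."""
    blocks: list[str] = []
    used = 0
    for number, item in enumerate(response.get("ResultItems", []), start=1):
        block = f"[{number}] {item.get('DocumentTitle', '')}\n{item.get('Content', '')}"
        if used + len(block) > max_chars:
            break  # stop before exceeding the budget
        blocks.append(block)
        used += len(block)
    return "\n\n".join(blocks)
```

Numbering the passages lets a downstream prompt ask the generator to cite its sources by index.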

Security & governance

    • Encryption: data is encrypted at rest (with an AWS-owned, AWS-managed, or customer-managed KMS key) and in transit (TLS).
    • Access control: integrates with IAM and supports document-level access control (ACL propagation from source connectors) so search results can be filtered by user permissions.
    • Audit / compliance: use CloudTrail and CloudWatch for logging queries, index actions, and connector activity.
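Document-level access control has two halves: an AccessControlList on each document at ingest time, and a UserContext on each query. A minimal sketch of both follows, assuming user IDs and group names come from twtech's identity provider; the helper names are hypothetical.

```python
from typing import Any


def acl_entry(name: str, principal_type: str = "GROUP",
              access: str = "ALLOW") -> dict[str, str]:
    """One principal for a document's AccessControlList."""
    if principal_type not in {"USER", "GROUP"}:
        raise ValueError("principal_type must be USER or GROUP")
    if access not in {"ALLOW", "DENY"}:
        raise ValueError("access must be ALLOW or DENY")
    return {"Name": name, "Type": principal_type, "Access": access}


def secured_query(index_id: str, text: str, user_id: str,
                  groups: list[str]) -> dict[str, Any]:
    """Query arguments that filter results by the caller's identity."""
    return {
        "IndexId": index_id,
        "QueryText": text,
        "UserContext": {"UserId": user_id, "Groups": list(groups)},
    }
```

The ACL entries would be attached under the AccessControlList key of a document passed to BatchPutDocument, and the query arguments passed to the Query call; connectors that crawl ACLs populate the document side automatically.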

Scalability, performance & capacity

    • Capacity model: twtech provisions index capacity as storage units and query units, billed per hour; query throughput and latency correlate with provisioned capacity.
    • Sync modes: connectors can schedule incremental syncs; for heavy-change environments twtech can push deltas via API to avoid full crawls.
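Sizing extra capacity is ceiling arithmetic over what the base index already includes. In the sketch below every limit (documents and queries included in the base index and added per unit) is a parameter that twtech must take from the current Kendra quotas for the chosen edition; the numbers in the usage note are placeholders, not documented values.

```python
def ceil_div(amount: int, unit: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-amount // unit)


def extra_units(docs: int, queries_per_day: int,
                base_docs: int, docs_per_unit: int,
                base_queries: int, queries_per_unit: int) -> dict[str, int]:
    """Additional capacity units needed beyond what the base index includes."""
    storage = max(0, ceil_div(docs - base_docs, docs_per_unit))
    query = max(0, ceil_div(queries_per_day - base_queries, queries_per_unit))
    # Keys match the CapacityUnits parameter of UpdateIndex.
    return {"StorageCapacityUnits": storage, "QueryCapacityUnits": query}
```

For example, with placeholder limits of 100,000 documents and 8,000 queries per day for both the base index and each unit, 250,000 documents at 40,000 queries per day needs 2 storage units and 4 query units. The result could then be passed as CapacityUnits to update_index.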

Observability & metrics to watch

    • Query latency,
    • Query units utilization,
    • Error rate for connectors,
    • Document ingestion throughput,
    • Freshness lag,
    • Top-queries with no matches (helpful for gap analysis).
    • Use CloudWatch dashboards + logs to track and alert.
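Two of the metrics above, zero-result queries and query latency, can also be derived from an application-side query log. The sketch below assumes a hypothetical log format (one dict per query with query, result_count, and latency_ms keys) that twtech's application would record itself; it is not a Kendra export format.

```python
import math
from collections import Counter
from typing import Any


def zero_result_queries(log: list[dict[str, Any]]) -> list[tuple[str, int]]:
    """Queries that returned nothing, most frequent first (case-insensitive)."""
    counts = Counter(
        entry["query"].strip().lower()
        for entry in log
        if entry["result_count"] == 0
    )
    return counts.most_common()


def p95_latency_ms(log: list[dict[str, Any]]) -> float:
    """95th-percentile latency using the nearest-rank method."""
    latencies = sorted(entry["latency_ms"] for entry in log)
    if not latencies:
        return 0.0
    rank = math.ceil(0.95 * len(latencies))
    return float(latencies[rank - 1])
```

The zero-result list is the starting point for gap analysis: each entry is either missing content, a missing synonym, or a permissions problem.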

Costs & pricing model (high level)

Main bill drivers:

    • Number/type of indexes (index hours),
    • Storage units for index content,
    • Query units (query-hours and/or per-query costs depending on edition),
    • Connectors and connector hours.
    • GenAI index tiers may have different starting prices and finer-grained capacity.
    • Always validate current pricing for twtech's AWS account and region.
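The bill drivers above combine into a simple hourly sum. The sketch below is only that arithmetic: all rates are parameters, the values in the usage note are placeholders rather than real prices, and connector charges are deliberately left out.

```python
def monthly_estimate(base_rate_per_hour: float,
                     extra_storage_units: int, storage_unit_rate: float,
                     extra_query_units: int, query_unit_rate: float,
                     hours: int = 730) -> float:
    """Estimated index cost for a month of continuous provisioning."""
    hourly = (
        base_rate_per_hour
        + extra_storage_units * storage_unit_rate
        + extra_query_units * query_unit_rate
    )
    return round(hourly * hours, 2)
```

With placeholder rates of 1.00 per hour for the index, 0.50 per storage unit, and 0.25 per query unit, an index with 2 extra storage units and 4 extra query units running for 100 hours comes to 300.00. Because the index is billed while provisioned, idle development indexes are worth deleting.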

Integrations & common architectures

    • Intranet / knowledge portal: Kendra index + Experience Builder (or custom UI) + authentication via twtech's SSO + connectors to S3/SharePoint/Drive.
    • Customer support / help center: Kendra behind a chat UI (Lex/third-party) to power FAQ answering and surfacing KB articles; use Custom Document Enrichment to surface product metadata/intent.
    • RAG / virtual assistant: Kendra GenAI index used as retriever in a RAG pipeline (retriever → reranker → generator) to ground generative responses in company documents.
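The retriever → reranker → generator flow can be sketched as three pluggable stages. Everything here is an illustrative assumption: the retriever and generator are passed in as callables (in practice a Kendra Retrieve call and a generation model), and the term-overlap reranker is a simple stand-in, not what Kendra or any specific reranking model does.

```python
from typing import Callable

Passage = dict[str, str]


def rerank_by_overlap(question: str, passages: list[Passage],
                      top_k: int = 3) -> list[Passage]:
    """Stand-in reranker: order passages by shared terms with the question."""
    terms = set(question.lower().split())

    def score(passage: Passage) -> int:
        return len(terms & set(passage["Content"].lower().split()))

    return sorted(passages, key=score, reverse=True)[:top_k]


def answer(question: str,
           retrieve: Callable[[str], list[Passage]],
           generate: Callable[[str], str],
           top_k: int = 3) -> str:
    """Run retrieve, rerank, and generate in sequence."""
    passages = rerank_by_overlap(question, retrieve(question), top_k)
    context = "\n\n".join(passage["Content"] for passage in passages)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)
```

Passing the stages as callables keeps the pipeline testable with stubs and lets twtech swap the reranker or generator without touching the retrieval code.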

Best practices & tuning checklist

    1. Start small, iterate: index a representative subset (top 5–10 content sources) and tune relevance there.
    2. Enrich on ingest: normalize metadata (product IDs, region, department) via CDE (Custom Document Enrichment) so twtech can filter/boost reliably.
    3. Use synonyms & FAQs: capture domain synonyms and explicit FAQ entries for high-value QA mapping.
    4. Relevance tuning: add source/freshness boosts for authoritative or time-sensitive content; measure impact with A/B tests (query logs).
    5. Permissions: propagate ACLs from sources or enforce via attributes to prevent oversharing.
    6. Instrument queries: capture zero-result queries and answer-quality feedback so twtech can iteratively re-index or add clarifying content.
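For item 3, Kendra takes synonyms as a thesaurus file in Solr synonym format, where each line lists equivalent terms separated by commas. The sketch below only builds the file content; the function name is hypothetical, and uploading the file to S3 and registering it with CreateThesaurus are left out.

```python
def thesaurus_text(groups: list[list[str]]) -> str:
    """Render synonym groups as Solr-format lines, one group per line."""
    lines: list[str] = []
    for group in groups:
        # Trim whitespace, drop blanks, and remove duplicates in order.
        terms = list(dict.fromkeys(term.strip() for term in group if term.strip()))
        if len(terms) >= 2:  # a single term has no synonyms to map
            lines.append(", ".join(terms))
    return "".join(line + "\n" for line in lines)
```

For example, thesaurus_text([["VPN", "virtual private network"]]) produces the single line "VPN, virtual private network". Keeping the groups in version control makes it easy to review vocabulary changes alongside relevance changes.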

Common pitfalls & gotchas

    • Expectations mismatch: Kendra extracts direct answers only when the exact answer text exists; generative paraphrasing is limited unless twtech pairs with a generation model.
    • Poor metadata: if sources lack structured attributes, tuning and faceting are weaker — invest in enrichment.
    • Overfitting boosts: overly aggressive boosts can hide useful documents; monitor search quality after each change.
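One way to monitor for overfitted boosts is to compare the top results for a fixed set of test queries before and after each tuning change. This is a minimal sketch with hypothetical names; it assumes twtech has captured the ranked document IDs for each test query under both configurations.

```python
def compare_rankings(before: list[str], after: list[str],
                     k: int = 5) -> dict[str, object]:
    """Jaccard overlap of the top-k results plus the documents that dropped out."""
    top_before, top_after = set(before[:k]), set(after[:k])
    union = top_before | top_after
    overlap = len(top_before & top_after) / len(union) if union else 1.0
    return {
        "overlap": overlap,
        "dropped": sorted(top_before - top_after),
    }
```

A low overlap is not bad in itself, but the dropped list shows exactly which previously visible documents a boost has pushed out, so twtech can check whether any of them were useful.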

Sample reference architecture (textual diagram)

    1. Sources: SharePoint, S3 (documents), RDS (DB), Salesforce. 
    2. Connectors: scheduled syncs + API pushes for near real-time updates.
    3. Preprocessing: optional Lambda jobs / Comprehend / Transcribe to extract entities, OCR, or audio transcripts. 
    4. Kendra Index (GenAI/Enterprise) with Custom Document Enrichment. 
    5. Application layer: Search API / Experience Builder / Chat UI (with RAG: Kendra as retriever + generator service). 
    6. Security & Ops: IAM, KMS, CloudWatch, CloudTrail.
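Step 4's Custom Document Enrichment can be configured with inline rules of the form "if attribute X matches, set attribute Y". The sketch below builds that configuration; the helper names and the example rule are hypothetical, and the result is meant for the CustomDocumentEnrichmentConfiguration parameter when creating or updating a data source.

```python
from typing import Any


def inline_rule(condition_key: str, contains: str,
                target_key: str, target_value: str) -> dict[str, Any]:
    """If condition_key contains a substring, set target_key to target_value."""
    return {
        "Condition": {
            "ConditionDocumentAttributeKey": condition_key,
            "Operator": "Contains",
            "ConditionOnValue": {"StringValue": contains},
        },
        "Target": {
            "TargetDocumentAttributeKey": target_key,
            "TargetDocumentAttributeValueDeletion": False,
            "TargetDocumentAttributeValue": {"StringValue": target_value},
        },
        "DocumentContentDeletion": False,
    }


def enrichment_config(rules: list[dict[str, Any]]) -> dict[str, Any]:
    """Wrap inline rules in the enrichment configuration structure."""
    return {"InlineConfigurations": list(rules)}
```

For example, enrichment_config([inline_rule("_source_uri", "/hr/", "_category", "hr")]) would tag every document whose source URI contains /hr/ with the hr category. Logic that goes beyond simple attribute rules would need the Lambda-based enrichment hooks instead.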

An Architecture diagram image showing:

 Connectors → Kendra → App → Security.

Link to AWS Documentation

https://docs.aws.amazon.com/kendra/latest/dg/hiw-index-types.html

twtech-Insights:


