FAQ: "Prompt is too long" when uploading large documents (AWS Bedrock)

What you might see

When you attach a very large PDF (or similar) to a normal chat and ask the model to read it, the request can fail with a "prompt is too long" error. In one such case, the combined prompt grew to about 210,000 tokens while the model's limit was 200,000 tokens, so the call could not complete.

Why this happens

Flows such as Upload as text turn the whole file into text and send it in one prompt. Every model has a maximum context size, and if the document (plus your message and system instructions) exceeds that limit, you will see a "prompt is too long" (or similar) error. This is expected behavior for oversized content; it is not a sign that your PDF is corrupt.

What to do instead: use a LibreChat Agent with File Search (RAG)

For large manuals, reports, or policies, use an Agent that indexes your files and answers using retrieval (RAG). Only the relevant chunks are sent to the model instead of the entire document at once, so you stay within context limits.

Requirements: your organization must have Agents and File Search / RAG (and the RAG API backend) enabled and working. If you are unsure, ask your administrator.

Steps: create an Agent and attach your document (Claude Sonnet 4.5)

1. Open Agents. From the LibreChat UI, go to Agents (sometimes labeled Agent Builder).
2. Create a new agent. Choose Create / New agent.
3. Choose the model. Under Model (and AWS Bedrock if applicable), select Claude Sonnet 4.5, or the Sonnet model your administrator has enabled. If you don't see it, Bedrock access or model allowlisting may need to be updated by an admin.
4. Enable file / knowledge search. Turn on File Search, Knowledge, Files, or RAG (labels vary by version) so the agent uses indexed documents, not a single giant paste of text.
5. Add your file. Upload your PDF (e.g. nova2-ug.pdf) to the agent's files or knowledge area. Wait until indexing / processing completes.
6. Chat with the agent. Start a conversation with that agent and ask your questions (summary, specific sections, comparisons, etc.).
7. Reuse. Save or pin the agent if you will query the same document again later.

Alternatives (when Agent + RAG is not the right fit yet)

Option A: Amazon Nova Pro (larger context in normal chat)

Nova Pro on Amazon Bedrock supports a ~300,000-token context window, compared with ~200,000 tokens for many Claude models in the same "whole file in one prompt" scenario.

Good for: short-term work where you need to process a large file once (or rarely) in regular chat / upload-as-text, and the combined prompt fits under 300k tokens.

Not ideal for: documents you will query again and again across many sessions. Agent + File Search (RAG) stays the better option there: indexed retrieval scales better and avoids stuffing entire manuals into every request.

How: in the model picker, switch the conversation to Nova Pro (or the exact Bedrock model ID your admin exposes, e.g. amazon.nova-pro-v1:0; names vary by region and release). Then retry your upload and question.

Note: context limits and model IDs change over time. Confirm the latest figures in the AWS Bedrock documentation for your region.
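For readers who call Bedrock directly rather than through LibreChat, here is a minimal sketch of the same one-off call using the Converse API. It assumes boto3 with valid AWS credentials, that Nova Pro is enabled in your region, and that amazon.nova-pro-v1:0 is the correct ID there; nova2-ug.pdf is the example file from this article.

```python
# Sketch only: Bedrock Converse API call with a document attachment.
# Assumptions: boto3 installed, AWS credentials configured, Nova Pro
# enabled in this region, and the model ID valid there (IDs vary by
# region and release; confirm with your administrator).
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("nova2-ug.pdf", "rb") as f:
    doc_bytes = f.read()

response = client.converse(
    modelId="amazon.nova-pro-v1:0",  # example ID from this article
    messages=[{
        "role": "user",
        "content": [
            # The Converse API accepts document blocks alongside text.
            {"document": {"name": "nova2-ug", "format": "pdf",
                          "source": {"bytes": doc_bytes}}},
            {"text": "Summarize the key sections of this manual."},
        ],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```

If the document still exceeds the model's context window, this call fails with the same validation error described above; the Agent + RAG route remains the durable fix.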
Option B: very large context models (roadmap)

AWS has indicated that models with on the order of 1 million tokens of context will become available on Bedrock in the near term (often discussed on a roughly 1-2 month horizon, subject to AWS announcements and regional rollout). When such a model is enabled for your organization and appears in LibreChat's model list, many "prompt too long" failures for single-shot full-document chats will diminish or disappear for workloads that still fit under that window.

Expectation: a strong fix for oversized single-prompt uploads once the model is generally available to you.

Still true: for recurring access to the same corpora (policies, manuals, libraries), Agent + RAG often remains preferable for cost, latency, and governance (search over chunks rather than sending megatokens every time).

Disclaimer: release dates and context sizes are not guaranteed until AWS publishes them, and your administrator must enable new models per your org's policy.

Quick comparison

| Approach | Behavior with large PDFs |
|---|---|
| Whole file in normal chat (e.g. Claude-class, ~200k context) | Often fails when over that model's token limit |
| Whole file in normal chat with Nova Pro (~300k context) | Short-term option for one-off processing if you stay under the limit; weaker fit for long-term use of the same files (prefer Agent + RAG) |
| Agent + File Search / RAG | Best for ongoing use; retrieves slices instead of loading full documents every time |
| Future: ~1M context on Bedrock (when available) | Should greatly reduce single-prompt limit errors; confirm with AWS and your admin when listed |

If something still fails

- Confirm File Search / RAG is enabled for your deployment.
- If you are just under the old limit, try Nova Pro (~300k context) for a one-off; see Option A above.
- Try a smaller file in regular chat only to verify connectivity, not as the main workflow for full manuals.
- Ask your admin which Bedrock model IDs are approved (e.g. Nova Pro, Claude Sonnet 4.5, Claude 3.5 Sonnet) and whether the inference profiles are correct for your region.

Related

- LibreChat: File Search (RAG)
- LibreChat: Agents
- LibreChat: Upload as Text (best for smaller files that fit in context)
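As a rough pre-flight check for the "just under the limit" case in the troubleshooting list above, you can estimate prompt size before retrying. This is a minimal sketch assuming a ~4-characters-per-token average, which is only a heuristic; real token counts depend on the model's tokenizer, and manual.txt is a hypothetical file name for the extracted document text.

```python
# Rough pre-flight size check before a one-off whole-file upload.
# The 4-characters-per-token ratio is a heuristic average for English
# prose, not the model's real tokenizer; keep a generous safety margin.

def estimate_tokens(text: str) -> int:
    return len(text) // 4

def fits_in_context(text: str, context_limit: int, reserve: int = 8_000) -> bool:
    # Reserve headroom for system instructions, your question, and the reply.
    return estimate_tokens(text) + reserve <= context_limit

with open("manual.txt", encoding="utf-8") as f:  # hypothetical extracted text
    document_text = f.read()

print(fits_in_context(document_text, context_limit=200_000))  # Claude-class
print(fits_in_context(document_text, context_limit=300_000))  # Nova Pro-class
```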
