KB1
FAQ: "Prompt is too long" when uploading large documents (AWS Bedrock)
What you might see
When you attach a very large PDF (or similar) to a normal chat and ask the model to read it, the request can fail with a "prompt is too long" (or similar) validation error.
In one such case, the combined prompt grew to about 210,000 tokens, but the model's limit was 200,000 tokens, so the call could not complete.
Why this happens
Flows such as Upload as Text turn the whole file into text and send it in a single prompt. Every model has a maximum context size; if the document (plus your message and system instructions) exceeds that limit, you will see a "prompt is too long" (or similar) error.
This is expected behavior for oversized content; it is not a sign that your PDF is corrupt.
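A rough way to predict this failure before uploading is to estimate the document's token count from its size. The 4-characters-per-token heuristic, the 200,000-token limit, and the headroom figure below are illustrative assumptions, not exact values; real tokenizers and model limits vary.

```python
# Pre-flight check: will this document fit in the model's context window?
# All constants here are illustrative assumptions, not official figures.

CHARS_PER_TOKEN = 4            # crude heuristic; real tokenizers vary
MODEL_CONTEXT_TOKENS = 200_000  # Claude-class limit cited in this article
RESERVED_TOKENS = 4_000        # headroom for system prompt, question, reply

def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str) -> bool:
    """True if the document plus headroom fits under the model limit."""
    return estimate_tokens(text) + RESERVED_TOKENS <= MODEL_CONTEXT_TOKENS

# A 1,000,000-character manual is roughly 250,000 tokens: over the limit.
big_doc = "x" * 1_000_000
print(estimate_tokens(big_doc))   # 250000
print(fits_in_context(big_doc))   # False
```

If the estimate is anywhere near the limit, go straight to the Agent + File Search approach described next rather than retrying the upload.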
What to do instead: use a LibreChat Agent with File Search (RAG)
For large manuals, reports, or policies, use an Agent that indexes your files and answers using retrieval (RAG). Only the relevant chunks are sent to the model, instead of the entire document at once, so you stay within context limits.
Requirements: Your organization must have Agents and File Search / RAG (and the RAG API backend) enabled and working. If you are unsure, ask your administrator.
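The core idea behind File Search can be sketched in a few lines. LibreChat's actual RAG API uses vector embeddings and a proper index; the keyword-overlap scoring below is a toy stand-in, and every name in it is illustrative rather than LibreChat's implementation.

```python
# Minimal sketch of the retrieval idea behind File Search / RAG:
# split the document into chunks, rank chunks by relevance to the
# question, and send only the top few to the model.
# Real systems use vector embeddings; keyword overlap is a toy proxy.

def chunk(text: str, size: int = 200) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(chunk_text: str, question: str) -> int:
    """Count question words appearing in the chunk (toy relevance)."""
    words = set(question.lower().split())
    return sum(1 for w in words if w in chunk_text.lower())

def retrieve(document: str, question: str, top_k: int = 2) -> list[str]:
    """Return the top_k most relevant chunks for the question."""
    ranked = sorted(chunk(document), key=lambda c: score(c, question),
                    reverse=True)
    return ranked[:top_k]

doc = ("Chapter 1 covers installation and setup. " * 10
       + "Chapter 7 explains battery replacement and charging. " * 10)
context = retrieve(doc, "How do I replace the battery?")
# Only these few chunks reach the model, not the whole document.
print(len(doc), sum(len(c) for c in context))
```

This is why an indexed agent stays under context limits even for manuals far larger than any model's window: the prompt contains a handful of chunks, not the file.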
Steps: create an Agent and attach your document (Claude Sonnet 4.5)
Open Agents
From the LibreChat UI, go to Agents (sometimes labeled Agent Builder).
Create a new agent
Choose Create / New agent.
Choose the model
Under Model (and AWS Bedrock, if applicable), select Claude Sonnet 4.5, or whichever Sonnet model your administrator has enabled.
If you don't see it, an admin may need to update Bedrock access or the model allowlist.
Enable file / knowledge search
Turn on File Search, Knowledge, Files, or RAG (labels vary by version) so the agent uses indexed documents, not a single giant paste of text.
Add your file
Upload your PDF (e.g. nova2-ug.pdf) to the agent's files or knowledge area. Wait until indexing / processing completes.
Chat with the agent
Start a conversation with that agent and ask your questions (summary, specific sections, comparisons, etc.).
Reuse
Save or pin the agent if you will query the same document again later.
Alternatives (when Agent + RAG is not the right fit yet)
Option A: Amazon Nova Pro (larger context in normal chat)
Nova Pro on Amazon Bedrock supports a ~300,000-token context window, compared with ~200,000 tokens for many Claude models in the same "whole file in one prompt" scenario.
Good for
Short-term work where you need to process a large file once (or rarely) in regular chat / upload-as-text, and the combined prompt fits under 300k tokens.
Not ideal for
Documents you will query again and again across many sessions; Agent + File Search (RAG) stays the better option, since indexed retrieval scales better and avoids stuffing entire manuals into every request.
How: In the model picker, switch the conversation to Nova Pro (or the exact Bedrock model ID your admin exposes, e.g. amazon.nova-pro-v1:0; names vary by region and release). Then retry your upload and question.
Note: Context limits and model IDs change over time. Confirm the latest figures in AWS Bedrock documentation for your region.
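The decision in Option A amounts to a simple pre-flight check against each model's limit. The token figures below are the ones cited in this article, and the entries are illustrative; confirm current limits and model IDs for your region before relying on them.

```python
# Choose an approach by comparing the estimated prompt size against
# each model's context limit. Figures are the ones cited in this
# article (illustrative); confirm current values with AWS / your admin.

MODEL_LIMITS = {
    "Claude Sonnet in normal chat": 200_000,
    "amazon.nova-pro-v1:0 in normal chat": 300_000,
}

def choose_approach(prompt_tokens: int) -> str:
    """Return the first single-prompt option that fits, else fall back
    to indexed retrieval (Agent + File Search)."""
    for model, limit in MODEL_LIMITS.items():
        if prompt_tokens <= limit:
            return model
    # Nothing fits in one prompt: index the file and retrieve instead.
    return "Agent + File Search (RAG)"

print(choose_approach(150_000))  # fits a Claude-class chat
print(choose_approach(210_000))  # the failing example: needs Nova Pro
print(choose_approach(500_000))  # too big for either: use RAG
```

Note that the 210,000-token example from the top of this article falls in the narrow band where switching to Nova Pro is enough; anything much larger needs the agent route regardless.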
Option B: Very large context models (roadmap)
AWS has indicated that models with context windows on the order of 1 million tokens will become available on Bedrock in the near term (often discussed on a roughly 1-2 month horizon, subject to AWS announcements and regional rollout).
Once such a model is enabled for your organization and appears in LibreChat's model list, many "prompt too long" failures will diminish or disappear for single-shot full-document chats that still fit under that window.
Expectation
A strong fix for oversized single-prompt uploads once the model is generally available to you.
Still true
For recurring access to the same corpora (policies, manuals, libraries), Agent + RAG often remains preferable for cost, latency, and governance: searching over indexed chunks beats resending hundreds of thousands of tokens with every request.
Disclaimer: Release dates and context sizes are not guaranteed until AWS publishes them; your administrator must enable new models per your org's policy.
Quick comparison
Whole file in normal chat (Claude-class, ~200k context): often fails once the prompt exceeds that model's token limit.
Whole file in normal chat with Nova Pro (~300k context): short-term option for one-off processing if you stay under the limit; a weaker fit for files you need long-term, where Agent + RAG is preferable.
Agent + File Search / RAG: best for ongoing use; retrieves relevant slices instead of loading full documents every time.
Future ~1M-token context on Bedrock (when available): should greatly reduce single-prompt limit errors; confirm with AWS and your admin once listed.
If something still fails
Confirm File Search / RAG is enabled for your deployment.
If you are just under the old limit, try Nova Pro (~300k context) for a one-off; see Option A above.
Try a smaller file in regular chat only to verify connectivity, not as the main workflow for full manuals.
Ask your admin which Bedrock model IDs are approved (e.g. Nova Pro, Claude Sonnet 4.5, Claude 3.5 Sonnet) and that inference profiles are correct for your region.
Related
LibreChat โ File Search (RAG)
LibreChat โ Agents
LibreChat โ Upload as Text (best for smaller files that fit in context)