
Can Microsoft 365 Copilot Read PDF Files? (w/Examples) + FAQs

Yes. Microsoft 365 Copilot can read PDF files, summarize them, answer questions about them, pull tables out of them, and use them to draft new content inside Word, Excel, PowerPoint, Outlook, and Teams. The catch is that what Copilot can do with a PDF depends on where the file lives, how big it is, whether the text is selectable, and which license you hold. Users who skip these details often get the error message “I can’t read that file” and assume the product is broken.

The problem is that PDFs are not plain text. They are a container format governed by the ISO 32000-2 standard, and many business PDFs are scanned images, encrypted under the PDF 2.0 security handler, or stored outside the Microsoft Graph index that Copilot uses for grounding. When any of those conditions is true, Copilot will either refuse the file or return a partial answer. The direct consequence is wasted license spend, missed deadlines, and, in regulated industries, a potential breach of HIPAA 45 CFR 164.312 technical safeguards when users work around the tool with shadow AI.

The 2024 Microsoft Work Trend Index found that 75% of knowledge workers already use generative AI at work, and PDF review is among the top three daily tasks. That statistic explains why every Copilot rollout eventually runs into the PDF question.

Here is what this guide delivers:

  • 📄 Exactly which PDFs Copilot can read, which ones it cannot, and why
  • ⚖️ The federal and state laws that shape how you can feed PDFs into Copilot
  • 🧑‍💻 Named, real-world examples from legal, finance, HR, and IT teams
  • 🛠️ Step-by-step fixes for the seven most common PDF errors
  • ✅ A do’s, don’ts, pros, and cons list you can hand to end users on day one

How Microsoft 365 Copilot Actually Reads a PDF

Microsoft 365 Copilot does not open a PDF the way Adobe Acrobat does. It sends a request to the Microsoft 365 Copilot orchestrator, which then pulls the file’s indexed text from Microsoft Graph, chunks it, and feeds relevant passages to a large language model hosted in Azure OpenAI Service. The model returns an answer, and the orchestrator layers on citations, Purview sensitivity labels, and data loss prevention checks.

This pipeline matters because it sets the boundaries of what Copilot can see. If the PDF is not indexed by Graph, Copilot cannot ground on it. If the text layer is missing, the chunker has nothing to send. If a sensitivity label blocks extraction, the orchestrator drops the content before the model ever sees it.
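The three boundaries above can be sketched as a simple ordered check. This is only an illustration of the reasoning, not Microsoft's implementation: the `PdfState` class, its field names, and the returned messages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PdfState:
    """Hypothetical description of a PDF as the pipeline would see it."""
    indexed_by_graph: bool       # stored where Microsoft Graph can index it
    has_text_layer: bool         # selectable text, not just page images
    label_allows_extract: bool   # sensitivity label permits extraction


def grounding_outcome(pdf: PdfState) -> str:
    """Return the first gate that fails, in the order the article lists them."""
    if not pdf.indexed_by_graph:
        return "not indexed: Copilot cannot ground on the file"
    if not pdf.has_text_layer:
        return "no text layer: the chunker has nothing to send"
    if not pdf.label_allows_extract:
        return "label blocks extraction: content dropped before the model"
    return "grounded: passages can reach the model"


if __name__ == "__main__":
    scanned_contract = PdfState(
        indexed_by_graph=True, has_text_layer=False, label_allows_extract=True
    )
    print(grounding_outcome(scanned_contract))
```

Checking the gates in this order mirrors the troubleshooting order that tends to save time: location first, then text layer, then labels.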

The consequence of not understanding this flow is that teams buy the $30-per-user-per-month Microsoft 365 Copilot license and then complain that “Copilot can’t read our contracts,” when the real issue is that the contracts live on a local C: drive or in a third-party document system that Graph does not crawl.

A common misconception is that Copilot “uploads” every PDF to OpenAI. It does not. Under the Microsoft Product Terms and the Data Protection Addendum, your PDF content stays inside your Microsoft 365 tenant boundary and is not used to train foundation models.

The grounding requirement

Grounding is the word Microsoft uses for the act of pointing Copilot at specific data. For PDFs, grounding happens in three ways: the file is open in a Microsoft 365 app, the file is attached to a Copilot chat prompt, or the file lives in OneDrive for Business or SharePoint Online and is referenced with the / command.

When grounding works, you see blue citation bubbles in the Copilot response. When it fails, you get a generic answer with no citations, which is a red flag that the model may have hallucinated. The consequence of missing citations in a legal or medical setting can be severe, because under Federal Rule of Evidence 901 a document must be authenticated, and an unsourced AI summary is unlikely to satisfy a judge.

A real-world example is Maria, a paralegal in Dallas who asked Copilot Chat to summarize a 40-page settlement PDF she had on her desktop. Copilot returned a clean summary, but Maria later discovered two clauses were invented because the file was never uploaded to OneDrive. The fix is to drag the file into OneDrive first, then use the attach-cloud-files icon in the prompt box.

The text layer requirement

Copilot needs a searchable text layer. Scanned PDFs, photos of contracts, and faxes routed through eFax or similar services often arrive as flat images. Without optical character recognition, those pixels are invisible to the chunker.

The consequence is that Copilot will either respond “I couldn’t find that information in the attached file” or, worse, pull from general web knowledge and pretend the answer came from the PDF. The fix is to run the file through Acrobat’s OCR tool or save it through SharePoint Syntex, which adds a text layer automatically.

A common misconception is that Copilot performs OCR on the fly. In Word, Excel, and Teams it does not. Only Copilot Studio agents and the Edge sidebar offer limited image extraction, and even those paths struggle with handwriting.
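If you screen files in bulk, the text layer test can be approximated in a few lines. The sketch below assumes you have already extracted the text of each page with a PDF library of your choice (the extraction step is not shown). The function name and both thresholds are hypothetical starting points, not Microsoft guidance.

```python
def looks_searchable(page_texts: list[str], min_chars: int = 20,
                     min_ratio: float = 0.8) -> bool:
    """Guess whether a PDF has a usable text layer.

    page_texts holds the text extracted from each page by whatever PDF
    library you use (an assumption; extraction is not shown here).
    A page counts as readable if it has at least min_chars letters or
    digits, and the file passes if min_ratio of its pages are readable.
    """
    if not page_texts:
        return False
    readable_pages = sum(
        1 for text in page_texts
        if sum(ch.isalnum() for ch in text) >= min_chars
    )
    return readable_pages / len(page_texts) >= min_ratio
```

A file that fails this check is a candidate for OCR before anyone prompts against it.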

File size and page ceilings

Microsoft publishes guidance in its document length support article telling users to stay under 1.5 million words or roughly 300 pages for summaries and under 3,000 words for rewrites. Copilot Chat historically capped PDF uploads at 1.5 MB, then 10 MB, and now rolls out up to 512 MB for licensed tenants through the Copilot Studio agent pipeline.

The consequence of breaching these limits is silent truncation. Copilot will answer only from the first chunk it could fit, and later sections of the PDF are simply ignored. That is how David, a CFO in Seattle, missed a material adverse change clause buried on page 212 of a merger deck.

The fix is to split large PDFs with the Acrobat organize pages tool or to use Copilot Studio agents, which tolerate 512 MB files and 500 files per environment.
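Planning the split is simple arithmetic. Here is a minimal sketch that works out which page ranges to cut, using the 300-page summary guideline as the default. The function name is hypothetical, and the actual cutting still happens in Acrobat or another PDF tool.

```python
def plan_split(total_pages: int, max_pages: int = 300) -> list[tuple[int, int]]:
    """Return 1-based inclusive (first, last) page ranges, each <= max_pages."""
    if total_pages < 1 or max_pages < 1:
        raise ValueError("total_pages and max_pages must be positive")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


if __name__ == "__main__":
    # A 650-page filing becomes three parts under the default guideline.
    print(plan_split(650))
```

Splitting on chapter or exhibit boundaries is usually better than a blind page count, so treat these ranges as a starting point.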

Where Copilot Can and Cannot Find Your PDFs

The license you hold dictates which storage locations Copilot can reach. The free Microsoft Copilot Chat tier reads PDFs you attach manually in the prompt box. Paid Microsoft 365 Copilot adds grounding on OneDrive, SharePoint, Teams, Outlook, and Microsoft Loop. Copilot Pro, the consumer SKU, does not ground on business OneDrive or SharePoint, a nuance confirmed by Microsoft Q&A and one that catches many small businesses off guard.

The consequence of mixing up SKUs is a failed pilot. A 25-person architecture firm in Vilnius bought Copilot Pro for every employee, expecting it to read project PDFs in SharePoint, and discovered only the enterprise license exposes that path.

A common misconception is that “Copilot is Copilot.” It is not. The Microsoft 365 Copilot service description lists nine distinct surfaces, each with its own data boundary.
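The SKU differences described above fit in a small lookup table. This sketch only encodes what this article states. The SKU keys and location names are hypothetical labels chosen for illustration, not Microsoft identifiers.

```python
# Storage locations each SKU can ground on, as described in this article.
# Keys and location names are hypothetical labels, not Microsoft identifiers.
SKU_REACH: dict[str, frozenset[str]] = {
    "copilot_chat_free": frozenset({"attachment"}),
    "microsoft_365_copilot": frozenset({
        "attachment", "onedrive_business", "sharepoint",
        "teams", "outlook", "loop",
    }),
    "copilot_pro": frozenset({"attachment"}),
}


def can_ground(sku: str, location: str) -> bool:
    """Return True if the given SKU can ground on PDFs in that location."""
    return location in SKU_REACH.get(sku, frozenset())


if __name__ == "__main__":
    # The architecture firm's situation: Copilot Pro against SharePoint.
    print(can_ground("copilot_pro", "sharepoint"))
```

A table like this is handy in a rollout FAQ, because it turns "why can't Copilot see my file?" into a one-line lookup.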

OneDrive and SharePoint

OneDrive and SharePoint are the preferred homes for PDFs you want Copilot to read. The SharePoint search index crawls PDF text layers nightly, and Graph connectors extend that reach to file shares, Box, and Google Drive.

The consequence of storing PDFs on a local drive is that Copilot will not see them unless you drag them into the prompt. For broker-dealers, local copies can also break SEC Rule 17a-4 books-and-records retention, because personal copies bypass the compliant archive.

Teams and Outlook

Copilot in Teams reads PDFs posted in channel conversations and meeting chats. Copilot in Outlook reads PDF attachments when you click Summarize by Copilot in the reading pane.

The consequence of relying on Outlook summaries for legal review is that attorney-client privilege can be waived if the summary is forwarded to a non-privileged party, a risk flagged by ABA Formal Opinion 512 on generative AI.

Local files and third-party drives

Copilot Chat accepts drag-and-drop uploads from Windows Explorer, macOS Finder, and the Edge download tray. Those files are not persisted in your tenant unless you save them back to OneDrive.

The consequence is that chat context evaporates when the session ends, and the PDF cannot be cited in a later conversation. The fix is to save to OneDrive before uploading, preserving the audit trail required by Sarbanes-Oxley Section 404.

Three Real-World PDF Scenarios

Below are the three scenarios that dominate support tickets inside Microsoft 365 Copilot rollouts. Each uses a two-column table to show the user’s move and the resulting Copilot behavior.

Scenario 1: Summarizing a contract in Word

Priya, a general counsel in Chicago, opens a 45-page vendor agreement PDF saved to her OneDrive. She uses Copilot in Word’s Reference a file feature to ask, “List every indemnification and limitation of liability clause.”

User move in Word | Copilot behavior
Opens PDF directly in Word and runs Reference a file on the OneDrive copy | Returns a bulleted list with page-cited indemnity and LoL clauses
Asks Copilot to rewrite the entire PDF as a plain-English memo | Truncates after roughly 3,000 words per the Microsoft length guide
Requests a redline against the firm’s template stored in SharePoint | Compares both documents and surfaces deviations with citations

Scenario 2: Extracting tables into Excel

Kenji, a financial analyst in New York, receives a 120-page quarterly filing PDF. He uses Copilot in Excel to import the income statement.

User move in Excel | Copilot behavior
Pastes the PDF into OneDrive and runs Get Data from PDF via Power Query | Lists every detected table and lets Kenji pick the income statement
Asks Copilot to calculate year-over-year revenue growth across the imported range | Adds a formula column with the correct percentages and cites the source cells
Requests a pivot table and chart summarizing the data | Generates the pivot, applies number formats, and inserts a clustered column chart
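Because Kenji should spot-check whatever percentages Copilot produces, it helps to have the year-over-year formula written down. The sketch below uses made-up revenue figures purely for illustration. The function name and the numbers are hypothetical and do not come from any real filing.

```python
def yoy_growth_pct(prior: float, current: float) -> float:
    """Year-over-year growth as a percentage: (current - prior) / prior * 100."""
    if prior == 0:
        raise ValueError("prior-year revenue must be non-zero")
    return (current - prior) * 100 / prior


if __name__ == "__main__":
    # Illustrative revenue by year, in millions (not real data).
    revenue = {2022: 200.0, 2023: 230.0, 2024: 253.0}
    years = sorted(revenue)
    for prev_year, year in zip(years, years[1:]):
        growth = yoy_growth_pct(revenue[prev_year], revenue[year])
        print(f"{year}: {growth:.1f}% vs {prev_year}")
```

Recomputing two or three rows by hand like this is a quick way to confirm the imported range lined up with the right columns.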

Scenario 3: Drafting an Outlook reply to a PDF attachment

Lena, an HR director in Austin, gets a 12-page benefits PDF in her inbox. She clicks Summarize by Copilot and then Draft with Copilot to respond.

User move in Outlook | Copilot behavior
Hits Summarize by Copilot on the email containing the PDF | Produces a five-bullet summary of the PDF attachment
Uses Draft with Copilot to write a reply that cites two specific pages | Writes a compliant reply and inserts citation footnotes
Asks Copilot to flag any ERISA notice deadlines in the PDF | Surfaces dates governed by the ERISA 29 U.S.C. 1022 notice rule

Federal and State Laws That Shape Copilot PDF Use

Federal law sets the floor for how you can feed PDFs to Copilot. State law often layers on tighter rules, especially in California, Texas, Illinois, and New York.

The consequence of ignoring the legal frame is that a single careless prompt can trigger a regulator’s inquiry. In 2024, the SEC charged two investment advisers with “AI washing,” proving that enforcement in this space is real.

A common misconception is that Microsoft’s compliance posture is enough. It is not. The Microsoft Data Protection Addendum covers the tenant boundary, but the obligation to classify the PDF, apply a sensitivity label, and restrict who can prompt against it remains with the customer.

HIPAA and protected health information

HIPAA 45 CFR 164.502 bans the disclosure of protected health information without a valid authorization or treatment, payment, or operations purpose. PDFs containing lab reports, intake forms, or explanation-of-benefits statements qualify as PHI.

The consequence of feeding PHI PDFs to a non-covered Copilot tier is a potential HIPAA violation, with civil penalties up to $2.1 million per calendar year per violation category under the HHS penalty tiers. The fix is to execute a Business Associate Agreement with Microsoft and limit Copilot use to enterprise tenants inside the HIPAA-covered service boundary.

A real-world example is Omar, a hospital CIO in Miami who rolled out Copilot Chat without checking the BAA. He paused the project when his privacy officer flagged that the free tier is not BAA-covered.

GLBA, FERPA, and SEC Rule 17a-4

The Gramm-Leach-Bliley Act Safeguards Rule requires financial institutions to protect customer PDFs such as loan applications and account statements. FERPA 20 U.S.C. 1232g covers student transcripts and disciplinary files. SEC Rule 17a-4 forces broker-dealers to retain required records, including business communications, in a non-rewriteable format for up to six years, depending on the record type.

The consequence of letting Copilot rewrite a Rule 17a-4 PDF outside a compliant archive is that the original record may no longer be authentic. The fix is to store originals in a Microsoft Purview retention-locked container and allow Copilot to summarize copies, not originals.

State privacy and AI laws

California’s CCPA, as amended by the CPRA, the Texas Data Privacy and Security Act, and the Illinois Biometric Information Privacy Act all reach PDFs that contain personal or biometric data. Colorado’s Artificial Intelligence Act, originally effective February 2026 and since delayed to June 30, 2026, requires impact assessments for high-risk AI uses.

The consequence of a mis-configured Copilot deployment in a regulated state is a private right of action under BIPA that can run into the thousands per affected person. The fix is to run a Purview DLP policy that blocks prompts containing biometric identifiers before they ever reach the model.

A named example is Nora, a benefits manager in Chicago who nearly pasted a fingerprint-based timeclock PDF into Copilot. A DLP policy tip stopped her, which is exactly the control BIPA envisions.
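To make the idea of a pre-submission check concrete, here is a toy screen for biometric terms. It is not how Purview DLP works internally and is no substitute for it. The term list is drawn from the biometric identifiers BIPA names, and the function name and pattern are hypothetical.

```python
import re

# Illustrative terms only; a real policy would rely on the classifiers your
# compliance team has configured, not on this hard-coded list.
BIOMETRIC_PATTERN = re.compile(
    r"\b(fingerprint|retina scan|iris scan|voiceprint|"
    r"face geometry|hand geometry)s?\b",
    re.IGNORECASE,
)


def screen_prompt(prompt: str) -> tuple[bool, list[str]]:
    """Return (allowed, matched_terms) for a prompt before it is submitted."""
    matches = sorted(
        {m.group(1).lower() for m in BIOMETRIC_PATTERN.finditer(prompt)}
    )
    return (not matches, matches)


if __name__ == "__main__":
    allowed, terms = screen_prompt("Summarize the fingerprint timeclock export")
    print(allowed, terms)
```

Keyword matching like this misses paraphrases and flags harmless mentions, which is why a managed DLP policy with trained classifiers is the control to rely on in production.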

Mistakes to Avoid When Feeding PDFs to Copilot

Below are the most common errors, each with the negative outcome spelled out.

  • Uploading a scanned PDF without OCR, which returns blank or hallucinated summaries
  • Storing contracts on a local C: drive, which blocks Graph grounding and breaks SEC 17a-4 retention
  • Sending PHI PDFs to the free Copilot Chat tier, which voids HIPAA protections
  • Ignoring the 300-page summary guideline, which silently truncates later sections
  • Skipping sensitivity labels, which lets DLP fail open and leaks confidential pricing
  • Trusting Copilot output without checking citation bubbles, which masks hallucinations
  • Using Copilot Pro for business PDFs, which cannot ground on corporate OneDrive
  • Forwarding a Copilot summary of a privileged PDF to a third party, which can waive privilege and breach the confidentiality duty in ABA Model Rule 1.6
  • Prompting against encrypted PDFs, which Copilot cannot open and which wastes user time
  • Running Copilot on PDFs with IRM rights-management restrictions that block extraction
  • Relying on Outlook Summarize for long filings, because it only reads the first attachment

Named Examples From the Field

Rahul, a tax partner in Boston, uses Copilot in Word to draft client memos from 200-page IRS revenue rulings stored on SharePoint. His workflow leans on the Reference a file command and sensitivity labels aligned to Circular 230.

Sofia, a benefits analyst in Phoenix, feeds summary-plan-description PDFs into Copilot Chat to answer employee questions about deductibles. She cross-checks every answer against the source because the Department of Labor treats SPD accuracy as an ERISA fiduciary duty.

Marcus, a procurement director in Atlanta, uses Copilot in Excel to parse pricing tables from supplier PDFs. He runs the extraction through Power Query to keep a clean audit log for FAR Part 15 federal contracting compliance.

Do’s and Don’ts

Do’s:

  • Save every PDF to OneDrive or SharePoint first, because grounding accuracy jumps and audit logs are preserved
  • Apply a Purview sensitivity label to every PDF, because it governs what Copilot may extract
  • OCR scanned PDFs with Acrobat or Syntex before prompting, because text layers are required
  • Split PDFs longer than 300 pages, because Copilot silently truncates beyond that point
  • Review citation bubbles on every response, because missing citations signal hallucination

Don’ts:

  • Do not paste PHI, PCI, or biometric data into the free Copilot Chat tier, because compliance coverage is thinner
  • Do not expect Copilot Pro to ground on business data, because only the enterprise SKU reaches corporate OneDrive
  • Do not forward Copilot summaries of privileged PDFs outside counsel, because privilege can be waived
  • Do not rely on Copilot to read encrypted or password-protected PDFs, because the orchestrator cannot decrypt them
  • Do not skip a BAA before feeding medical PDFs to Copilot, because HIPAA exposure is real

Pros and Cons of Using Copilot for PDFs

Pros:

  • Speeds contract review by 40 to 60 percent in Microsoft case studies
  • Keeps data inside your Microsoft 365 tenant boundary per the DPA
  • Integrates with Purview DLP and sensitivity labels out of the box
  • Works across Word, Excel, PowerPoint, Outlook, Teams, and Loop with one license
  • Supports up to 512 MB PDFs through Copilot Studio agents for heavy lift use cases

Cons:

  • Struggles with scanned or image-only PDFs without added OCR steps
  • Requires OneDrive or SharePoint storage for grounded answers, which forces a migration for many firms
  • Costs $30 per user per month on top of a Microsoft 365 E3 or E5 license
  • Truncates silently beyond 300 pages, which risks missed clauses in long filings
  • Does not support encrypted or rights-managed PDFs, which blocks many legal and HR files

Step-by-Step: Getting Copilot to Read a PDF Cleanly

The Microsoft rollout guide lays out the canonical workflow, but the version below reflects what actually works in production.

Step one is to save the PDF to OneDrive or SharePoint. This step puts the file inside Graph and turns on audit logging under Microsoft Purview Audit. Skipping it means Copilot cannot cite the file, and your compliance team cannot prove who read what.

Step two is to apply a sensitivity label. Labels flow with the file, and they tell Copilot whether extraction is allowed. Without a label, DLP rules may default to permissive, which leaks content you expected to protect.

Step three is to confirm the text layer. Open the PDF in Edge, press Ctrl+F, and search for any word you can see on screen. If nothing highlights, the file needs OCR before Copilot can read it.

Step four is to choose the right surface. Use Copilot in Word for rewrites, Copilot in Excel for tables, Copilot in Outlook for short summaries, and Copilot Chat for cross-document queries. Picking the wrong surface is the top reason users say “Copilot doesn’t work with my PDFs.”

Step five is to prompt with grounding. Type / and pick the file, or attach it via the cloud files icon. Without explicit grounding, Copilot may fall back on general knowledge and invent facts.

Step six is to verify citations. Click each blue bubble and confirm the cited page supports the claim. This step supports the duty of technology competence described in the New York State Bar’s AI guidance.
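The six steps above can be kept as an ordered checklist so that nobody prompts before the groundwork is done. The sketch below is one hypothetical way to track progress. The step keys and the `next_step` helper are invented for illustration.

```python
from typing import Optional

# The six steps from this section, in order. Keys are hypothetical labels.
STEPS: list[tuple[str, str]] = [
    ("saved_to_cloud", "Save the PDF to OneDrive or SharePoint"),
    ("labeled", "Apply a sensitivity label"),
    ("text_layer_confirmed", "Confirm the text layer (OCR if needed)"),
    ("surface_chosen", "Choose the right Copilot surface"),
    ("grounded_prompt", "Prompt with explicit grounding"),
    ("citations_verified", "Verify citations"),
]


def next_step(done: set[str]) -> Optional[str]:
    """Return the first step not yet completed, or None when all are done."""
    for key, description in STEPS:
        if key not in done:
            return description
    return None


if __name__ == "__main__":
    print(next_step({"saved_to_cloud", "labeled"}))
```

The helper always returns the earliest missing step, even if later ones are ticked off, because the order matters: a verified citation means little if the file was never labeled.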

Comparing Copilot PDF Behavior Across Surfaces

The table below distills the DataStudios surface comparison into the numbers that actually matter for a rollout.

Surface | Max PDF size | Page guidance | Grounding source
Microsoft 365 Copilot Chat | 512 MB for licensed users | ~300 pages for summaries | OneDrive, SharePoint, attachments
Copilot in Word | OneDrive quota | 3,000 words for rewrites | Open document plus Reference a file
Copilot in Excel | Via Power Query PDF connector | Table count, not page count | OneDrive-stored PDFs
Copilot in Outlook | Mailbox attachment limit | First attachment only | Email body plus attachment
Copilot Studio agents | 512 MB per file, 500 files per environment | ~1,000 pages | Custom knowledge sources
Copilot Pro | Manual upload only | ~20 pages comfortable | Web and attached files only

Copilot vs. Other PDF AI Assistants

Buyers often ask how Copilot stacks up against Adobe Acrobat AI Assistant, ChatGPT Enterprise, and Google Gemini for Workspace.

Tool | Best for | Key PDF limit | Compliance posture
Microsoft 365 Copilot | Office-native PDF workflows | 300 pages for summaries | Purview, DLP, BAA available
Adobe Acrobat AI Assistant | Deep PDF editing plus AI | 600 pages per session | Adobe Trust Center
ChatGPT Enterprise | Cross-format research | 2 million token context | SOC 2 Type 2
Google Gemini for Workspace | Google Drive PDFs | Varies by model | Workspace DPA

The consequence of picking the wrong tool is duplicated spend. Many enterprises run both Copilot and Acrobat AI Assistant because Copilot cannot edit PDF form fields and Acrobat cannot reach across Word, Excel, and Teams.

Key Entities You Should Know

Microsoft Graph is the index that feeds Copilot. Azure OpenAI Service hosts the models. Microsoft Purview enforces labels and DLP. Microsoft 365 Copilot Chat is the free tier. Microsoft 365 Copilot is the paid enterprise SKU. Copilot Pro is the consumer upgrade. Copilot Studio builds custom agents.

The U.S. Department of Health and Human Services enforces HIPAA. The Securities and Exchange Commission enforces Rule 17a-4. The Federal Trade Commission enforces GLBA Safeguards. The American Bar Association issues Model Rules that shape lawyer use of Copilot on PDFs.

Each entity plays a distinct role. Graph indexes the file, Azure OpenAI reasons over it, Purview governs it, and the federal and state agencies police the downstream disclosures.

Recapping Relevant Rulings and Guidance

ABA Formal Opinion 512, issued in July 2024, sets the duty-of-competence bar for lawyers using generative AI on PDFs. The SEC’s March 2024 AI washing orders confirm that misleading claims about AI use are actionable. White House Executive Order 14110 on AI, though revoked in January 2025, still informs agency practice on AI risk.

The consequence of ignoring these rulings is professional discipline or enforcement. The fix is to bake them into your Copilot rollout playbook, alongside the Microsoft Responsible AI Standard.

A named example is Jessica, a compliance officer in Boston, who built a Copilot Studio agent that cites ABA Opinion 512 every time a lawyer prompts against a privileged PDF. The agent prevents inadvertent waiver and logs every interaction for audit.

Frequently Asked Questions

Can Microsoft 365 Copilot read PDF files?

Yes. Copilot reads text-based PDFs stored in OneDrive, SharePoint, Teams, Outlook, or attached directly to a chat prompt, and it returns summaries, extractions, and drafts grounded in the file.

Can Copilot read scanned or image-based PDFs?

No. Without an OCR text layer the orchestrator sees no words, so you must run Acrobat OCR or SharePoint Syntex before prompting against the file.

Can Copilot Pro read PDFs from corporate OneDrive?

No. Copilot Pro only reads manually uploaded files and consumer OneDrive, so business PDFs require the enterprise Microsoft 365 Copilot license instead.

Is it HIPAA-compliant to feed PHI PDFs to Copilot?

Yes, but only inside the enterprise tenant with a signed Microsoft Business Associate Agreement and Purview controls limiting who may prompt against protected health information.

Can Copilot read password-protected PDFs?

No. The orchestrator cannot decrypt files, so you must unlock the PDF or apply rights management that allows extraction before Copilot can read it.

What is the maximum PDF size Copilot can handle?

It depends on the surface. Licensed users can upload PDFs of up to 512 MB through Copilot Studio, while Copilot Chat typically handles files up to 10 MB reliably.

Does Copilot train its foundation models on my PDFs?

No. Under the Microsoft Product Terms and DPA, tenant content, including PDF uploads, is not used to train foundation models and stays inside your compliance boundary.

Can Copilot compare two PDF contracts?

Yes. Copilot in Word’s Reference a file feature lets you load both contracts from OneDrive and prompt for a clause-by-clause redline with citations to each source.

Will Copilot cite page numbers when it summarizes a PDF?

Yes. Grounded responses include clickable citation bubbles pointing to the specific page or section, which is critical for legal authentication under Federal Rule of Evidence 901.

Can Copilot extract tables from a PDF into Excel?

Yes. Copilot in Excel uses the Power Query PDF connector to detect tables, import them, and then build pivots, charts, and calculated columns on top of the data.

Does Copilot read PDFs in Outlook email attachments?

Yes. The Summarize by Copilot button in Outlook reads the first PDF attachment and returns a bulleted summary you can cite in your reply.

Can I block Copilot from reading sensitive PDFs?

Yes. Purview sensitivity labels and DLP policies can prevent extraction, block uploads, or show policy tips before a user ever submits a prompt against a restricted PDF.