A practical, six-part action plan for getting your SharePoint AI-safe.
TL;DR
- AI can read everything your users have access to. If your permissions are messy, AI’s answers will be too.
- Six things need to be in order: project site architecture, document libraries, retention, sensitivity labels, Entra ID groups, and an oversharing scan.
- You have two real choices: DIY the action plan (usually 3–6 months of part-time work), or get it assessed and roadmapped by experts.
- Engineering firms have specific exposure: draft tenders, reports, expert opinions, salary spreadsheets parked in the wrong place in 2019. AI doesn't know which is which.
Why "make sure your data is clean" deserves a better answer
If you've Googled "AI readiness" lately, you've probably noticed most articles land in the same place: "make sure your data is clean."
It's well-intentioned, but it skips the part you need: the specifics. A bit like telling someone planning a wedding to "make sure things are organised." True, but where do you start?
That's where this post comes in. We'll walk through the six things engineering firms specifically need to look at, what "good" looks like for each one, and the patterns we see almost every time. So you can move from "we feel behind" to "we know what to fix first."
Before we dive in, here's the most useful way to think about AI and Copilot.
AI isn't a search engine. It's a brilliant new grad with full network access.
It will read everything it's allowed to read, and then helpfully summarise it.
Including the draft tender you haven't submitted yet, the unrevised report from three years ago, and the salary spreadsheet that found its way into a project folder by accident.
The great news?
That's not AI misbehaving. It's just trusting permissions that were set when your firm was half the size.
Which means the fix is entirely within your control. And the action plan below is exactly how to do it.
The six-part AI readiness action plan
1. Project site architecture
What it is:
How your SharePoint is structured at the site level. Typically, one site per project, per client, per discipline, or some mix.
Why AI cares:
Site-level permissions are the broadest brushstroke in your security model. If a Project Engineer has access to a project site they shouldn't, AI will happily pull anything from inside it.
What good looks like for engineering firms:
- A consistent pattern (e.g. one site per project, named by project number)
- Clear ownership: every site has a named, current site owner, not an ex-employee
- Provisioning that goes through a template, not ad-hoc creation
- Project sites archived or locked when the project closes
What's usually broken:
- A mix of "old SharePoint" (department sites with everyone added), "new SharePoint" (modern team sites), and Teams-created sites with permissions nobody understands
- Sites owned by someone who left in 2022
- Project sites still active years after handover
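One way to make these checks concrete is a small script run over a site inventory export. The sketch below is illustrative only: the `Site` fields, the "P plus five digits" naming pattern, the user names, and the 90-day grace period are all assumptions, and you would need to adapt them to whatever your own inventory contains.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed convention: project sites are named "P" followed by a 5-digit project number.
SITE_NAME_PATTERN = re.compile(r"^P\d{5}$")


@dataclass
class Site:
    name: str
    owner: Optional[str]            # None means nobody is named as owner
    project_closed: Optional[date]  # None means the project is still open
    locked: bool                    # True once the site is archived or read-only


def audit_site(site, active_staff, today, grace_days=90):
    """Return a list of findings for one site; an empty list means it passes."""
    findings = []
    if not SITE_NAME_PATTERN.match(site.name):
        findings.append("name does not follow convention")
    if site.owner is None or site.owner not in active_staff:
        findings.append("no current owner")
    if site.project_closed is not None and not site.locked:
        if (today - site.project_closed).days > grace_days:
            findings.append("project closed but site still active")
    return findings


# Hypothetical inventory rows.
sites = [
    Site("P10234", "a.ng", None, False),
    Site("Structures Dept", "j.smith", None, False),
    Site("P09871", "j.smith", date(2022, 3, 1), False),
]
active_staff = {"a.ng", "r.patel"}

report = {s.name: audit_site(s, active_staff, today=date(2025, 1, 1)) for s in sites}
for name, findings in report.items():
    print(name, findings or "OK")
```

The point is less the script than the habit: each "what good looks like" bullet becomes a rule you can run again next quarter.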
2. Document libraries
What it is:
The actual folders inside each site where files live.
Why AI cares:
This is where the work product lives: drawings, specs, reports, models, tenders. AI indexes content at the file level, so a messy library structure makes AI’s answers less reliable and your governance harder.
What good looks like:
- A consistent library naming convention (Drawings, Specifications, Reports, Tenders, Correspondence)
- Metadata where it matters (discipline, document type, revision, status) rather than deeply nested folders
- A defined home for work in progress versus issued content so AI doesn’t surface draft expert opinions as if they were final
- Minimal unique permissions below site level so access is controlled consistently and remains understandable
What's usually broken:
- Folders six levels deep with names like "FINAL_v2_USE_THIS_ONE"
- No distinction between draft and issued documents
- Personal OneDrives being used as the actual document store
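To get a feel for how widespread the folder problem is, you can run a quick check over a list of file paths exported from a library. This is a rough sketch under stated assumptions: the depth threshold of 3 and the "ambiguous name" pattern are arbitrary choices for illustration, and the paths are made up.

```python
import re
from pathlib import PurePosixPath

MAX_FOLDER_DEPTH = 3  # assumed threshold; pick whatever your convention allows

# Assumed signs that status or version lives in a name instead of in metadata.
# The lookbehind stops "Rev12" from being read as "v12".
AMBIGUOUS_NAME = re.compile(r"final|use[_ ]this|(?<![a-z])v\d+", re.IGNORECASE)


def library_findings(path):
    """Return findings for one file path relative to the library root."""
    parts = PurePosixPath(path).parts
    folders = parts[:-1]  # everything except the file name
    findings = []
    if len(folders) > MAX_FOLDER_DEPTH:
        findings.append(f"nested {len(folders)} folders deep")
    if any(AMBIGUOUS_NAME.search(part) for part in parts):
        findings.append("status encoded in a name instead of metadata")
    return findings


for p in [
    "Drawings/S-101.dwg",
    "Reports/2019/Geotech/Drafts/Old/FINAL_v2_USE_THIS_ONE/report.docx",
]:
    print(p, library_findings(p) or "OK")
```

Anything the check flags is a candidate for a metadata column (status, revision) rather than another folder level.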
3. Retention policies
What it is:
Rules that retain content for a required period and, when appropriate, delete it or support lifecycle decisions such as archival.
Why AI cares:
AI reads old content as readily as new. Without retention, every superseded specification and every closed-project email is fair game.
What good looks like:
- A retention schedule mapped to your actual obligations (contract terms, insurance run-off, statute of limitations for professional liability)
- Different rules for different content types: a client email kept for 2 years isn't the same as a report kept for 10
- Defensible deletion of content past its retention date
What's usually broken:
- Nothing has been deleted since SharePoint was deployed
- Or retention has been set so aggressively that critical project records were deleted while a dispute was active
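Both failure modes come down to the same decision logic: how long each content type is kept, and what overrides deletion. The sketch below models that logic so you can sanity-check a schedule before configuring it. The 2-year and 10-year figures echo the example above; the content type names, the `on_hold` flag, and the function names are hypothetical. Real retention is configured in Microsoft Purview, not in a script.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical schedule: years to retain after the record is closed.
RETENTION_YEARS = {"client_email": 2, "report": 10}


@dataclass
class Record:
    content_type: str
    closed_on: date
    on_hold: bool = False  # e.g. an active dispute; a hold always wins


def add_years(d, years):
    """Add whole years to a date, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def disposition(record, today):
    """Decide what should happen to a record on a given day."""
    if record.on_hold:
        return "retain (legal hold)"
    years = RETENTION_YEARS.get(record.content_type)
    if years is None:
        return "review (no schedule entry)"
    if today >= add_years(record.closed_on, years):
        return "eligible for deletion"
    return "retain"


today = date(2025, 1, 1)
print(disposition(Record("client_email", date(2021, 6, 30)), today))
print(disposition(Record("report", date(2021, 6, 30)), today))
print(disposition(Record("client_email", date(2021, 6, 30), on_hold=True), today))
```

Checking the hold first is the design choice that matters: it is what prevents the "deleted while a dispute was active" scenario.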
4. Sensitivity labels
What it is:
Labels applied to documents that travel with the file (Public, Internal, Confidential, Highly Confidential) and trigger automatic protections.
Why AI cares:
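AI respects sensitivity labels where they carry protection. If Highly Confidential – Project Team Only applies encryption scoped to the project team, Copilot won’t surface that document's content to someone outside the team, regardless of where the file ended up. A label without encryption is a classification only and does not restrict access by itself. Sensitivity labels are one of the most important controls you have, but they do not replace the need to fix permissions and oversharing.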
What good looks like for engineering firms:
- A label set tailored to engineering work product, not Microsoft's generic example
- Auto-labelling rules for predictable patterns (tender responses, draft expert opinions, employee files)
- User training so the labels are applied to the work people are creating now
What's usually broken:
- Labels aren't deployed at all
- Or labels exist but nobody applies them, so they're decorative
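Before building auto-labelling rules, it can help to prototype them against a sample of real document titles and see what they would catch. The sketch below is a design aid, not Purview configuration: the label names come from the text above, while the patterns, the default label, and the "most restrictive match wins" rule are assumptions for illustration.

```python
import re

# Label order from least to most restrictive (names from the text above).
LABELS = ["Public", "Internal", "Confidential", "Highly Confidential"]

# Hypothetical rules: a pattern to look for in a title, and the label it suggests.
# Note that \b treats "_" as a word character, so "tender_response" would not match.
RULES = [
    (re.compile(r"\btender\b", re.IGNORECASE), "Confidential"),
    (
        re.compile(r"\bdraft\b.*\bexpert opinion\b|\bexpert opinion\b.*\bdraft\b", re.IGNORECASE),
        "Highly Confidential",
    ),
    (re.compile(r"\b(salary|payroll|employee file)\b", re.IGNORECASE), "Highly Confidential"),
]
DEFAULT_LABEL = "Internal"  # assumed fallback when no rule matches


def suggest_label(text):
    """Return the most restrictive label among matching rules, or the default."""
    matches = [label for pattern, label in RULES if pattern.search(text)]
    if not matches:
        return DEFAULT_LABEL
    return max(matches, key=LABELS.index)


for title in [
    "Tender response - Harbour Bridge",
    "Draft expert opinion on slab cracking",
    "Site meeting minutes",
    "Tender pricing incl. salary assumptions",
]:
    print(f"{title!r} -> {suggest_label(title)}")
```

Running a prototype like this over a few hundred real titles tends to show quickly which rules are too broad and which never fire.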
5. Entra ID groups
What it is:
Microsoft Entra ID (formerly Azure AD) is your identity layer. Groups in Entra are how you control who can access what.
Why AI cares:
Almost every permission in SharePoint ultimately resolves to "is this person in this group?" If your groups are stale, your permissions are stale.
What good looks like:
- Groups that match your org structure today, not three reorganisations ago
- Dynamic groups based on attributes (department, project allocation) where possible
- A regular access review: someone confirming each quarter that the right people are still in the right groups
- A clear off-boarding process that removes access on the day someone leaves, not at a future audit
What's usually broken:
- "All Staff" groups that contain ex-employees
- External collaborators (sub-consultants, surveyors, peer reviewers) added to project groups years ago and never removed
- Nested groups where nobody is sure of the actual effective access
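The nested-group problem is easier to reason about once you flatten it. The sketch below resolves a group's effective membership from an export of direct members, then flags anyone who isn't a current employee. The group names, user names, and the dictionary format are hypothetical; in practice the input would come from an Entra ID export.

```python
def effective_members(group, groups, _seen=None):
    """Flatten nested groups into the set of users who actually have access.

    `groups` maps a group name to its direct members, which may be users or
    other groups. `_seen` guards against circular nesting.
    """
    seen = set() if _seen is None else _seen
    if group in seen:
        return set()
    seen.add(group)

    members = set()
    for member in groups.get(group, []):
        if member in groups:  # the member is itself a group, so recurse
            members |= effective_members(member, groups, seen)
        else:
            members.add(member)
    return members


# Hypothetical export: group -> direct members.
groups = {
    "All Staff": ["Structures Team", "Civil Team", "x.former"],
    "Structures Team": ["a.ng", "r.patel"],
    "Civil Team": ["l.chen", "Structures Team"],
    "P10234 Members": ["Civil Team", "guest.surveyor"],
}
current_employees = {"a.ng", "r.patel", "l.chen"}

for name in ["All Staff", "P10234 Members"]:
    to_review = effective_members(name, groups) - current_employees
    print(name, "-> accounts to review:", sorted(to_review))
```

Everything in the "accounts to review" list is either an ex-employee or an external collaborator, which is exactly the population a quarterly access review should be confirming.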
6. Oversharing scan
What it is:
A targeted audit of where documents have been shared more broadly than they should be: "Anyone with the link" links, content shared with everyone in the firm, and specific guests still on the access list.
Why AI cares:
Oversharing is the most common reason we see AI surfacing the wrong thing. If a document is shared with Everyone, AI treats that as "this is fair game for everyone's queries."
What good looks like:
- A documented scan of every share above a defined threshold
- Remediation of the highest-risk items first (tenders, expert opinions, HR files, financials)
- A standing process (usually via Microsoft's SharePoint Advanced Management or a third-party tool) to flag new oversharing as it happens
What's usually broken:
- Nobody has ever looked
- Or, equally common, somebody looked once, generated a 4,000-line report, and shelved it because nobody knew where to start
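The shelved 4,000-line report is usually a triage problem, not a tooling problem. A simple scoring pass, like the sketch below, can turn a flat list of findings into a short worklist. The weights, category names, and sample findings are assumptions for illustration; the high-risk content types mirror the ones listed above.

```python
# Assumed weights: broader sharing and more sensitive content score higher.
SCOPE_WEIGHT = {"anyone_link": 3, "everyone": 2, "guest": 1}
CONTENT_WEIGHT = {"tender": 3, "expert_opinion": 3, "hr": 3, "financial": 3}
DEFAULT_CONTENT_WEIGHT = 1  # anything not in a high-risk category


def risk_score(finding):
    """Score one finding as sharing breadth multiplied by content sensitivity."""
    content_weight = CONTENT_WEIGHT.get(finding["content"], DEFAULT_CONTENT_WEIGHT)
    return SCOPE_WEIGHT[finding["scope"]] * content_weight


def triage(findings, top_n):
    """Return the top_n highest-risk findings, highest first."""
    return sorted(findings, key=risk_score, reverse=True)[:top_n]


# Hypothetical rows from an oversharing report.
findings = [
    {"path": "Drawings/S-101.pdf", "scope": "everyone", "content": "other"},
    {"path": "HR/salaries.xlsx", "scope": "everyone", "content": "hr"},
    {"path": "Tenders/bid.docx", "scope": "anyone_link", "content": "tender"},
    {"path": "Reports/peer-review.docx", "scope": "guest", "content": "expert_opinion"},
]

for f in triage(findings, top_n=2):
    print(risk_score(f), f["path"])
```

The exact weights matter less than agreeing on them up front, so the first week of remediation goes to the tender and the salary spreadsheet rather than to line 1 of the report.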
Bonus: Adjacent controls that strengthen Copilot readiness
If the six-part action plan is your core readiness baseline, these are the adjacent controls that make it more robust. They’re not a substitute for fixing SharePoint structure and permissions, but they do reduce the chance of sensitive content being surfaced, prompted, or processed in ways you didn’t intend.
- Sharing configuration: Review tenant and site-level sharing settings, default link types, and whether Anyone or broad internal links are still allowed where they shouldn’t be.
- Site access reviews and Data Access Governance reports: Use SharePoint Advanced Management reports to identify overshared sites, then push review and remediation to the site owners best placed to fix them.
- DLP for Copilot and generative AI: Extend your governance beyond stored content by using Microsoft Purview controls to reduce the risk of sensitive information being entered into prompts or returned in responses.
- Discoverability and search: Assume that if content is well indexed and a user has access, Copilot can make use of it. Search quality and security hygiene are now directly connected.
- File shares and non-SharePoint repositories: Be explicit about what is and isn’t in scope. Many firms still have critical content on file shares, project platforms, or line-of-business systems that Copilot won’t treat the same way.
In practice, these adjacent controls are what turn a one-off remediation exercise into an operating model. They help you not just clean things up once, but keep them clean enough that Copilot remains useful without becoming a governance surprise.
Two ways to get this done
Realistically, you have two paths.
Path A: DIY.
Everything above is doable in-house. Most engineering firms with a decent IT manager and a co-operative Information Manager can work through it over 3–6 months of part-time effort. The hardest bits are the sensitivity label design (it's deceptively easy to get wrong) and the oversharing remediation (where the tools generate more findings than humans can act on without a triage process).
Path B: Get it assessed.
The AI Readiness Discovery and Roadmap compresses the diagnostic and design work into a right-sized engagement, scaled to your firm rather than a fixed template, and hands you a prioritised roadmap your team executes afterwards.
Either path is legitimate. The wrong path is "switch Copilot on first and figure this out later."
A note on why WebVine wrote this
We're not engineering experts. We've never designed a retaining wall, and we don't pretend to. What we know deeply is Microsoft 365 governance (SharePoint architecture, sensitivity labels, DLP, retention, Entra ID) and how to apply that knowledge to the specific work product engineering firms create.
We chose to write this rather than gate it behind a form because the action plan itself is the value. If you take this list and work through it yourself, that's a win.
If you'd rather have it done for you, we built the AI Readiness Discovery and Roadmap for exactly that.
Two ways to get started:
- Take the 5-minute self-assessment: How Ready Is Your SharePoint for Copilot? A quick personalised read on where you sit against the action plan above.
- Book a 20-minute scoping call: We'll size the right AI Readiness Discovery and Roadmap for your firm.
FAQ
Will Copilot work without doing all this first?
Yes. That's the trap. Copilot will switch on, give people genuinely useful answers, and occasionally surface something it shouldn't. The damage is usually invisible until it isn't.
How long does the action plan take DIY?
For a 200–2,000 staff firm with an existing IT manager and a SharePoint environment that's been in use for several years: typically 3–6 months part-time, longer if sensitivity labels and DLP are new ground. The oversharing scan alone can take a month before remediation begins.
Do we need every action plan item in place before turning Copilot on?
No, but you need a defensible baseline. At minimum: oversharing scan complete and high-risk items remediated, sensitivity labels deployed for tender responses and expert opinions, and Entra ID groups reviewed for ex-employees and external collaborators. Everything else can run in parallel.
Will sensitivity labels slow our engineers down?
Done well, no. Auto-labelling does most of the work and users only label exceptions. Done badly, they'll be the most-complained-about feature you've ever rolled out. The design step matters more than the technology.
What about external collaborators: sub-consultants, surveyors, peer reviewers?
This is where the oversharing scan earns its keep. External access is legitimate and necessary; unmanaged external access is the problem. The goal is a known list of guests with documented project scope and an off-boarding trigger when the project closes.
Do we need to migrate off file shares first?
Ideally yes, because Copilot doesn't index a file share out of the box (it would need a connector). But that's a separate project. Most firms we work with are 80–95% on SharePoint already; we focus on the SharePoint readiness and flag the residual file-share content as a separate workstream.
Is this just for Microsoft 365 Copilot, or other Copilots too?
The action plan applies to any Microsoft 365 Copilot scenario: Chat, Word, Outlook, Excel, Teams. Role-specific Copilots (Copilot for Sales, Copilot for Service) layer on top of the same foundation, so the work isn't wasted if you adopt those later.
What does this cost if we get help?
The AI Readiness Discovery and Roadmap is right-sized to your firm. Scope and price depend on how many priority sites you have and how much depth you need, which we'll agree with you on a short scoping call.
About the Author – Chloe Dervin
Chloe Dervin is Managing Director of WebVine, a specialist consultancy that does one thing: Microsoft 365. She and her team work with mid-market firms across Australia and New Zealand on the unglamorous-but-critical foundations: SharePoint architecture, sensitivity labels, retention, Entra ID, and the governance that makes Copilot safe to switch on.
Chloe's view on AI readiness is the one running through this post: most firms aren't behind on AI, they're behind on the information hygiene that makes AI useful instead of dangerous.
WebVine's clients include environmental and engineering consultancies, so the patterns in this action plan aren't theoretical. They're what the team sees, and fixes, every week.
WebVine doesn't resell licences or hand work to a partner network. The team that scopes the work is the team that delivers it.