About the client
The challenge was structural: Italian labor law requires companies and their subcontractors to demonstrate ongoing compliance with CCNL (Contratto Collettivo Nazionale di Lavoro) — national collective bargaining agreements that govern minimum wages, social security contributions, and employment tiers. Verifying compliance manually, across dozens of contractors who each format their payrolls differently, was slow, error-prone, and unscalable.
The client approached WWG with a clear brief: build an AI-powered proof of concept that could read payroll PDFs from different companies, normalize their data regardless of field naming variations, and check each payroll against the applicable CCNL. If the PoC succeeded, it would form the foundation for a full multi-tenant SaaS platform.
Challenge
- Field Name Heterogeneity Across Payroll Documents. Italian payrolls are not standardized at a formatting level. A field labeled “Paga base mensile” by one company might appear as “Minimo contrattuale” or “Retribuzione base” in another. The AI had to identify, extract, and normalize these variations into a unified schema — without hard-coding rules that would break on new document templates.
- Compliance Verification Against Dynamic Regulations. CCNL agreements differ by industry sector and are periodically updated. The system needed to store structured CCNL data (wage minimums, contribution rates, employment level tiers) and compare extracted payroll fields against the correct agreement for each contractor — flagging discrepancies without false positives.
- Operator-in-the-Loop for Unknown Templates. When the AI encountered a payroll from a previously unseen company, it could not be expected to map fields automatically with sufficient confidence. The system needed a validation interface where a studio operator could review the extracted fields, confirm or correct the mapping, and teach the system for future encounters — creating a feedback loop between human validation and machine learning.
- Multi-Role Platform Architecture from Day One. The PoC was not a standalone tool — it had to be embedded within a platform that would ultimately serve four distinct user types (Admin, Professional Studio, Client Company, and Contractor) with different data access rights, workflows, and notification needs. Building for extensibility while delivering core AI functionality in parallel required disciplined architecture decisions throughout.
Solution
WWG structured delivery across four major epics, executed in parallel Agile sprints with regular client checkpoints. The solution combined a purpose-built AI extraction and compliance engine with a multi-role web platform.
AI Extraction Engine: OCR, ML, and NLP Pipeline
The core extraction layer used Amazon Textract (the AnalyzeDocument API) to parse payroll PDFs into structured key-value pairs and tabular data. A machine learning and NLP normalization layer then mapped extracted field names, regardless of company-specific terminology, onto a unified internal schema. The system maintained a growing dictionary of confirmed field mappings, updated each time a studio operator validated a new template, making future extraction progressively more accurate.
For new payroll templates not yet seen by the system, extracted fields were surfaced in a dedicated review UI accessible to Studio users. Operators could confirm correct mappings, override incorrect ones, and submit validated data to the database. This “first-encounter” review gate ensured data integrity without blocking workflow — subsequent uploads from the same company bypassed manual review entirely.
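The review gate described above can be illustrated with a minimal Python sketch. It assumes confirmed mappings are stored per company; the `MappingStore` class, the `process_upload` function, the status strings, and the `base_salary` schema key are hypothetical names chosen for illustration, not the platform's actual code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MappingStore:
    """Operator-confirmed mappings: company id -> {payroll label -> schema field}."""

    confirmed: dict[str, dict[str, str]] = field(default_factory=dict)

    def is_known(self, company_id: str) -> bool:
        return company_id in self.confirmed

    def confirm(self, company_id: str, mapping: dict[str, str]) -> None:
        # Called when a studio operator submits a reviewed mapping.
        self.confirmed.setdefault(company_id, {}).update(mapping)


def process_upload(
    store: MappingStore, company_id: str, extracted: dict[str, str]
) -> tuple[str, dict[str, str]]:
    """Route an extracted payroll: unseen companies go to manual review,
    known companies are normalized with their confirmed mapping."""
    if not store.is_known(company_id):
        return "needs_review", {}
    mapping = store.confirmed[company_id]
    normalized = {
        mapping[label]: value
        for label, value in extracted.items()
        if label in mapping
    }
    return "auto_normalized", normalized
```

In this sketch the first upload from a company returns `needs_review`; once an operator calls `confirm`, later uploads from that company are normalized without stopping at the gate.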
Compliance Engine: CCNL Verification
Once payroll data was normalized and validated, the compliance engine compared each field against the relevant CCNL agreement stored in the platform database. The system checked three critical dimensions:
- Minimum base salary (“Minimo” / Paga base mensile) against CCNL minimums by employment level
- INPS social security contribution rates against the mandatory CSC (Codice Statistico Contributivo) thresholds
- Employment tier classification against the contractor’s declared CCNL level
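As a rough illustration of these three checks, the sketch below compares one normalized payroll against one CCNL level. The class names, issue codes, and the rule that the contribution rate must not fall below a stored threshold are assumptions; the figures used in any example call are invented and are not real CCNL or INPS values.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CcnlLevel:
    level: str
    min_base_salary: Decimal   # monthly minimum for this level
    min_inps_rate: Decimal     # assumed threshold, expressed as a fraction


@dataclass(frozen=True)
class Payroll:
    declared_level: str
    base_salary: Decimal
    inps_rate: Decimal


def check_payroll(payroll: Payroll, levels: dict) -> list:
    """Return a list of issue codes; an empty list means no discrepancy found."""
    level = levels.get(payroll.declared_level)
    if level is None:
        # Tier check: the declared level must exist in the applicable agreement.
        return ["unknown_level"]
    issues = []
    if payroll.base_salary < level.min_base_salary:
        issues.append("base_salary_below_minimum")
    if payroll.inps_rate < level.min_inps_rate:
        issues.append("inps_rate_below_threshold")
    return issues
```

Using `Decimal` rather than floats keeps the salary comparison exact, which matters when a payroll sits right at the minimum.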
The compliance engine was embedded within a fully role-segmented platform. Each user type accessed a tailored interface:
- Administrators managed CCNL data, platform configuration, and user invitations
- Professional Studios reviewed payroll extractions, validated field mappings, managed contractor portfolios, and configured notification rules
- Client Companies managed their contractor relationships, monitored compliance status, and accessed audit reports
- Contractors uploaded payroll documents (PDF and Excel), viewed compliance results, and managed their own document checklists
An automated notification system alerted relevant parties to document irregularities, missing uploads, compliance failures, and pending reviews. Studio users could configure notification frequency and delivery channels (platform-native and email), ensuring that compliance gaps surfaced promptly rather than accumulating undetected.
Technology Stack
Amazon Textract
OCR and structured data extraction from payroll PDFs: key-value pairs and tabular data
Cloud Infrastructure (AWS)
Scalable compute for AI processing, storage, and environment management across PoC and production stages
testomat.io
Test case management for QA: all 15 AI compliance test cases executed and tracked to a 100% pass rate
Machine Learning / NLP
Field name normalization across heterogeneous payroll formats; operator feedback loop improves accuracy over time
Agile / Scrum (ClickUp + Linear)
Sprint-based delivery with task tracking, QA bug reporting, and client visibility throughout the project lifecycle
Figma
UX/UI design system: component library and user flows for all four platform roles
Key Technical Challenges
The most technically demanding aspect of the project was designing a normalization layer that could absorb previously unseen payroll formats without requiring human review on every upload. The solution combined an NLP-based similarity engine for initial field matching with a structured feedback mechanism: every operator validation decision was stored and used to update the system’s mapping confidence. Templates seen for the first time required manual review; subsequent uploads from the same company bypassed this gate entirely. This architecture made the platform progressively more accurate over time without requiring model retraining cycles.
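A minimal sketch of similarity-based matching with an operator feedback loop is shown below. It substitutes plain string similarity from Python's standard `difflib` for the project's NLP engine, whose internals are not described here; the function names, the 0.8 threshold, and the schema keys are assumptions.

```python
from difflib import SequenceMatcher


def suggest_field(label: str, known: dict, threshold: float = 0.8):
    """Return (schema_field, score) for the closest known label,
    or (None, score) when the best match is below the threshold."""
    cleaned = label.strip().lower()
    best_field, best_score = None, 0.0
    for known_label, schema_field in known.items():
        score = SequenceMatcher(None, cleaned, known_label).ratio()
        if score > best_score:
            best_field, best_score = schema_field, score
    if best_score >= threshold:
        return best_field, best_score
    return None, best_score


def record_validation(label: str, schema_field: str, known: dict) -> None:
    """Store an operator-confirmed mapping so the same label matches exactly next time."""
    known[label.strip().lower()] = schema_field
```

A label such as "Minimo contrattuale" shares little surface text with "Paga base mensile", so string similarity alone leaves it unmatched and it falls through to operator review. Once the operator's decision is recorded, the same label resolves directly, which is how a system of this shape can improve without retraining.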
Accurately modeling CCNL compliance rules was a domain challenge as much as a technical one. National collective agreements vary by industry sector and employment level (livello), and they are updated periodically. The platform required a database schema flexible enough to store CCNL parameters across sectors (wage minimums, INPS contribution thresholds, level classifications) while ensuring that compliance checks always applied the correct agreement version for each contractor. The Compliance_Algorithm_Analysis document informed the algorithm’s treatment of Italian payroll components, including IRPEF tax brackets, TFR accrual, seniority increments, and contribution base calculations.
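One way to make sure a check uses the agreement version in force for a given pay period is to store each version with an effective date and select the latest one that is not in the future. The sketch below assumes that model; `CcnlVersion`, `version_for`, and the field names are hypothetical, and the platform's actual schema is not described in this case study.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CcnlVersion:
    sector: str
    effective_from: date
    min_salary_by_level: dict  # level (livello) -> monthly minimum


def version_for(versions: list, sector: str, pay_period: date) -> CcnlVersion:
    """Pick the most recent version of a sector's agreement in force on pay_period."""
    candidates = sorted(
        (v for v in versions if v.sector == sector),
        key=lambda v: v.effective_from,
    )
    # Index of the first version that starts strictly after the pay period.
    idx = bisect_right([v.effective_from for v in candidates], pay_period)
    if idx == 0:
        raise LookupError(f"no {sector} agreement in force on {pay_period}")
    return candidates[idx - 1]
```

Keeping old versions rather than overwriting them means a payroll from an earlier period can still be checked against the minimums that applied at the time.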
Building a multi-tenant system with four distinct user types — each with different data visibility, workflow capabilities, and notification configurations — required careful permission architecture from the outset. The invitation-based onboarding chain (Admin → Studio → Client → Contractor) enforced relational trust while maintaining data isolation. Role-specific interfaces ensured each user type accessed only the tools and data relevant to their function, reducing cognitive load and minimizing the risk of privilege escalation.
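The invitation chain can be modeled as a small lookup table, with data visibility derived from who invited whom. The sketch below is an assumption about how such rules might be expressed: the `Role` enum, `can_invite`, and the rule that an account sees only itself and the accounts beneath it in the invitation tree are illustrative, not the platform's documented permission model.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    STUDIO = "studio"
    CLIENT = "client"
    CONTRACTOR = "contractor"


# Each role may invite only the next role in the chain.
NEXT_IN_CHAIN = {
    Role.ADMIN: Role.STUDIO,
    Role.STUDIO: Role.CLIENT,
    Role.CLIENT: Role.CONTRACTOR,
}


def can_invite(inviter: Role, invitee: Role) -> bool:
    return NEXT_IN_CHAIN.get(inviter) is invitee


def visible_accounts(account_id: str, invited_by: dict) -> set:
    """Accounts reachable downward through the invitation tree, including the account itself.
    `invited_by` maps each account id to the id of the account that invited it."""
    children = {}
    for child, parent in invited_by.items():
        children.setdefault(parent, []).append(child)
    seen, stack = set(), [account_id]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, []))
    return seen
```

Under these assumptions a studio sees its own clients and their contractors, but nothing belonging to another studio invited by the same administrator.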
Results
- PoC Delivered in 8 Weeks, On Schedule. The proof of concept was completed within the agreed timeline, validating the core AI hypothesis and providing a solid architectural foundation for the full platform build.
- 100% AI Test Pass Rate. All 15 QA test cases for the AI extraction and compliance functionality passed without failures, confirming the accuracy of the extraction engine and the reliability of the compliance checks across real payroll documents.
- Three Payroll Formats Successfully Normalized. The AI engine demonstrated the ability to read, extract, and normalize payroll data from three distinct company formats, each with different field naming conventions, mapping them accurately to the unified platform schema.
- Full Multi-Role Platform Shipped. Beyond the PoC scope, WWG delivered a complete SaaS platform with four distinct user roles, invitation-based onboarding, document management, compliance reporting, and a configurable notification system.
Conclusion
WWG approached this not as a generic automation project but as a domain-specific engineering challenge.
The 8-week PoC proved the technical feasibility of the approach. The subsequent platform build translated that proof into a production-ready, multi-tenant SaaS system that professional studios and their client networks can use today to manage contractor compliance at scale.