
Project OpenCoder: AI Independence Initiative: Difference between revisions

From MediawikiCIT
Justinaquino (talk | contribs)
Imported from gi7b wiki
 
Justinaquino (talk | contribs)
No edit summary
 
Line 1: Line 1:
= Project OpenCoder: Strategic AI Independence Initiative =
== '''PROJECT OPENCODER''' ==


'''Prepared for:''' Comfac Technology Group & Cornersteel Systems Corp.
=== Strategic AI Independence Initiative ===


'''Objective:''' Transition from renting intelligence ($1,200/yr/seat) to '''manufacturing intelligence''' using a self-hosted, open-source ecosystem that powers both software development and physical production.
==== Revised Plan v2.0 ====
Prepared for:


----
'''Comfac Technology Group & Cornersteel Systems Corp.'''


== Executive Summary ==


'''The Pivot:''' Moving beyond simple "Chatbot" replacements to establish the '''Comfac Matrix Model (CMM)''' Business Unit—a proprietary engine that automates not just software code, but the ''physical'' design and manufacturing processes (via FreeCAD/Python) for the group and partners.
= 1. Executive Summary =
'''The Pivot:''' Project OpenCoder establishes the '''Comfac Matrix Model (CMM) Business Unit'''—a proprietary engine that manufactures intelligence for software development, physical production (via FreeCAD/Python), and enterprise automation. Rather than renting AI at $1,200/yr/seat, Comfac will '''build, train, and deploy its own specialized models''', reducing per-seat cost to $200/year while creating a new revenue-generating division.


=== The Strategy ===
== 1.1 Strategic Pillars ==
'''Open Source Core:''' Release stable models publicly, cultivating a community of students and developers who become a distributed R&D department. Partner with schools to create a QA pipeline that continuously improves model quality.


# '''Open Source Core (The Stable Release):''' Release "Stable" models to the public, building a community of students and developers who learn the stack, effectively becoming a distributed R&D department.
'''Partner Premium:''' Reserve cutting-edge models—trained on latest data with faster iteration—for Comfac and paying partners (Cornersteel, Government agencies).
# '''Partner Premium (The Cutting Edge):''' Reserve "Cutting Edge" models—trained on latest data and capable of faster iteration—for Comfac and paying partners (e.g., Cornersteel, Government).
# '''Funding Mechanism:''' Partners receive '''Exclusive Access''', allowing their teams to be trained on and utilize "Cutting Edge" resources.


=== Financial Model ===
'''LoRA Package Registry:''' Introduce a structured LoRA Package Registry that packages domain expertise as composable, versionable LoRA modules. Each LoRA Package encapsulates PRD writing parameters, app architecture patterns, and validated QA benchmarks for a specific use case.


; Operational Expenditure (OpEx) : '''₱15M per year''' upon release of Open Source and Cutting Edge models
== 1.2 Financial Model ==
; Partner Commitment : '''₱5M/year in Partner Access fees''' (less than $100,000 USD per year)
{| class="wikitable"
; Partner ROI : Access to high-end '''unified memory AI PCs''' capable of running full-sized models, with training and customization for specific corporate requirements, plus highly optimized '''tiny to small AIs''' for departmental usage
|'''Item'''
; Overage Billing : Resources beyond initial commitment billed at '''₱5,000 per hour'''
|'''Detail'''
; Reinvestment : Revenue funds project growth: hiring, infrastructure, marketing, community training
|-
|'''Operational Expenditure'''
|₱15M per year upon release of Open Source and Cutting Edge models
|-
|'''Partner Commitment'''
|₱5M/year in Partner Access fees (less than $100,000 USD/year)
|-
|'''Partner ROI'''
|Unified memory AI PCs + full-sized models + optimized tiny-to-small AIs for departmental use
|-
|'''Overage Billing'''
|₱5,000 per hour beyond commitment
|-
|'''Target Per-Seat Cost'''
|$200/year (down from $1,200/year)
|} 
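The figures in the table can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 40 overage hours in the demo are an invented example, and revenue from sources other than full-fee partners is ignored.

```python
import math

# Figures quoted in the Financial Model table.
OPEX_PHP = 15_000_000          # annual operational expenditure
PARTNER_FEE_PHP = 5_000_000    # annual Partner Access fee
OVERAGE_PHP_PER_HOUR = 5_000   # billed beyond the commitment
CURRENT_SEAT_USD = 1_200       # rented AI, per seat per year
TARGET_SEAT_USD = 200          # self-hosted target, per seat per year


def partners_to_cover_opex(opex: int = OPEX_PHP, fee: int = PARTNER_FEE_PHP) -> int:
    """Smallest number of full-fee partners whose fees cover annual OpEx."""
    return math.ceil(opex / fee)


def seat_saving_usd(current: int = CURRENT_SEAT_USD, target: int = TARGET_SEAT_USD) -> int:
    """Absolute saving per seat per year."""
    return current - target


def seat_saving_percent(current: int = CURRENT_SEAT_USD, target: int = TARGET_SEAT_USD) -> float:
    """Saving per seat as a percentage of the current cost."""
    return 100 * (current - target) / current


def partner_invoice_php(overage_hours: float) -> float:
    """Annual partner invoice: base commitment plus hourly overage."""
    return PARTNER_FEE_PHP + overage_hours * OVERAGE_PHP_PER_HOUR


if __name__ == "__main__":
    print(partners_to_cover_opex())          # 3
    print(seat_saving_usd())                 # 1000
    print(f"{seat_saving_percent():.1f}")    # 83.3
    print(partner_invoice_php(40))           # 5200000
```

Under these figures, three partners at the full ₱5M commitment would match the ₱15M annual OpEx.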


'''Target Outcome:''' Reduce per-seat AI cost to '''$200/year''' while establishing a new revenue-generating Business Unit that creates intelligence for future Comfac robotics.
= 2. Platform Accounts & Development Infrastructure =
Before any model training or fine-tuning can begin, Comfac must establish accounts on key AI development platforms. These accounts provide access to model weights, training infrastructure, community resources, and hardware-specific toolchains.


----
== 2.1 Hugging Face Account Setup ==
Hugging Face is the central hub for open-weight model distribution, dataset hosting, and community collaboration. A Comfac organizational account is essential for the following reasons:


== The New Business Unit: CMM (Comfac Matrix Models) ==
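* Model Access: Download open-weight models such as the Qwen 2.5 Coder variants (3B, 7B, 32B), accepting the license on the model page where a repository is gated. License terms can differ between model sizes, so each variant should be checked before commercial use. Certain models require organizational verification for commercial use.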
* Dataset Hosting: Host proprietary training datasets (Frappe code patterns, FreeCAD scripts, internal SOPs) as private repositories. Version control training data alongside model checkpoints.
* Model Publishing: Publish stable CMM releases for community consumption. Track downloads, issues, and community contributions to stable models.
* Spaces & Inference: Use Hugging Face Spaces to host demo inference endpoints for partner evaluation and school QA testing.
* Training with AutoTrain / TRL: Access Hugging Face’s training libraries (TRL, PEFT, Unsloth integration) which are tightly coupled to the HF ecosystem.


'''Mission:''' "Dev, Design, and Manufacture."
=== Account Tiers ===
{| class="wikitable"
|'''Tier'''
|'''Cost'''
|'''Recommendation'''
|-
|Free (Individual)
|$0
|Sufficient for initial benchmarking (Stage 1)
|-
|Pro (Individual)
|$9/month
|For lead MMD/MMS researchers needing priority inference
|-
|Organization
|Free (public repos) / $20/user/month (private)
|Required for Stage 2+ with private dataset repos
|}


The CMM Unit's mandate is to enhance the ability to create the intelligence for the robotics and systems Comfac will build.
== 2.2 AMD Developer Account & ROCm Ecosystem ==
Since Comfac’s GPU strategy centers on AMD Radeon consumer cards, an AMD Developer account is critical for accessing ROCm (Radeon Open Compute) tooling:


=== Core Products (2026 - 2027 Strategy) ===
* ROCm SDK Access: ROCm is AMD’s open-source GPU compute platform—equivalent to NVIDIA’s CUDA. It provides PyTorch/ROCm builds, HIP compiler toolchains, and GPU profiling tools necessary for LoRA fine-tuning on Radeon hardware.
* Hardware Compatibility: The RX 7600 XT and RX 7800 XT require ROCm 6.x+ for stable PyTorch training. AMD Developer forums and early-access drivers often resolve compatibility issues weeks before public release.
* MI-Series Migration Path: As Comfac scales, AMD’s MI250/MI300 accelerators offer a professional upgrade path. Developer account holders get priority access to documentation, benchmarks, and partner pricing.
* Unified Memory APU Roadmaps: AMD’s Strix Point and subsequent APUs with shared CPU/GPU memory are central to Comfac’s next-gen strategy. Developer accounts provide early access to XDNA NPU SDKs.
* Bug Reporting & Community: Direct channels to AMD engineers for ROCm issues specific to consumer Radeon cards (which are less tested than MI-series for ML workloads).


These foundational products fuel revenue growth required to develop advanced Robotics and VLA models (Stage 3), which require 1-2 years to reach market readiness.
=== Setup Action Items ===


==== 1. The "Coder Matrix" (3B-16B Coding LoRA) ====
# Register AMD Developer account at developer.amd.com
# Install ROCm 6.x on Ubuntu 22.04/24.04 development machines
# Validate PyTorch-ROCm build against RX 7600 XT (gfx1102 target)
# Subscribe to AMD ROCm GitHub releases for driver/compatibility updates
# Join AMD Developer community forums—flag Comfac as AI inference + training use case for potential partnership outreach
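The validation step in item 3 could look something like the sketch below. It assumes a ROCm build of PyTorch is installed; the function name and report keys are hypothetical. On ROCm builds the GPU is exposed through the <code>torch.cuda</code> namespace, and <code>torch.version.hip</code> is set instead of being <code>None</code>.

```python
def rocm_report() -> dict:
    """Collect basic facts about the installed PyTorch build and the visible GPU."""
    report = {
        "torch_installed": False,
        "hip_version": None,
        "gpu_visible": False,
        "device_name": None,
        "arch": None,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed on this machine; return the empty report.
        return report

    report["torch_installed"] = True
    # A version string on ROCm builds, None on CPU-only or CUDA builds.
    report["hip_version"] = getattr(torch.version, "hip", None)

    if torch.cuda.is_available():
        report["gpu_visible"] = True
        report["device_name"] = torch.cuda.get_device_name(0)
        props = torch.cuda.get_device_properties(0)
        # gcnArchName is only present on ROCm builds, hence the getattr guard.
        report["arch"] = getattr(props, "gcnArchName", None)
    return report


if __name__ == "__main__":
    for key, value in rocm_report().items():
        print(f"{key}: {value}")
```

On a correctly configured RX 7600 XT machine, the reported architecture would be expected to mention the gfx1102 target named in the action items.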


'''The Concept:''' A fine-tuned coding assistant replicating '''"Claude Code Premium"''' capabilities without per-seat subscription cost.
= 3. The LoRA Package Registry =
A LoRA Package is a composable, versionable LoRA adapter module paired with its training data, evaluation benchmarks, and deployment configuration. The LoRA Package Registry is the organizational system that catalogs, versions, and chains these Packages for different use cases.


'''Immediate Impact (2026):''' Enables Comfac to rapidly turn around software solutions, directly aiding in capturing sales.
== 3.1 What is a LoRA Package? ==
Each LoRA Package consists of the following components packaged as a single versioned unit:
{| class="wikitable"
|'''Component'''
|'''Description'''
|-
|'''LoRA Weights'''
|The fine-tuned adapter weights for a specific domain (e.g., frappe-erpnext-v1.2)
|-
|'''Training Dataset'''
|Curated instruction/completion pairs, code samples, PRD templates, and architecture patterns
|-
|'''Eval Benchmarks'''
|Automated test cases that measure accuracy, code correctness, and domain knowledge retention
|-
|'''PRD Parameters'''
|Structured prompting templates that define how the model writes PRDs, specs, and documentation for this domain
|-
|'''Package Card'''
|Metadata: version, dependencies, compatible base models, author, QA status, deployment notes
|}
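As an illustration of the Package Card row, the metadata could be modeled as a small record. This is a sketch only: the field names, the <code>qa_status</code> values, and the author string are assumptions, not a schema defined by the plan.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PackageCard:
    """Hypothetical metadata record for one LoRA Package."""
    name: str                       # e.g. "frappe-erpnext"
    version: str                    # e.g. "1.2"
    base_models: list[str]          # compatible base models
    author: str
    qa_status: str = "dev"          # assumed values: "dev" or "stable"
    dependencies: list[str] = field(default_factory=list)
    deployment_notes: str = ""

    def tag(self) -> str:
        """Versioned identifier in the style used for LoRA weights."""
        return f"{self.name}-v{self.version}"

    def to_json(self) -> str:
        """Serialize the card for storage next to the weights."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    card = PackageCard(
        name="frappe-erpnext",
        version="1.2",
        base_models=["Qwen 2.5 Coder 3B Instruct"],
        author="CMM Unit",  # placeholder author
    )
    print(card.tag())       # frappe-erpnext-v1.2
    print(card.to_json())
```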


'''Key Beneficiaries & Applications:'''
== 3.2 Package Categories ==


* '''CSC Facilities:''' Rapid deployment of management services and custom integrations (2026)
=== A. Architecture Packages (App Framework Patterns) ===
* '''Frappe Ecosystem:''' Rapid creation of Apps and ERPNext customizations (2026)
These Packages encode deep understanding of specific application frameworks—not just syntax, but idiomatic patterns, project structure conventions, and deployment workflows.
* '''NextCloud:''' Custom App development for partner network (2027)
{| class="wikitable"
* '''TrueNAS:''' Creation of apps and middleware integrations (2027)
|'''Package Name'''
* '''"Steward":''' Philippine home-grown Home Assistant for facility control—tracking all solar panels in CGG with real-time energy usage visible up to every panel board
|'''Domain Coverage'''
* '''Secada:''' Enhanced locally made document system (similar to the open-source PaperlessNG) with improved image recognition, targeting progressively smaller GPUs until it runs on a sub-₱50,000 PC with an 8GB model. Used by all CGG departments.
|'''Training Sources'''
* '''Synopsis:''' Comprehensive OpenCLAW version that consolidates all messaging systems and lets custom tools organize and search messages and create reports. Used by CGG to remove the need for constant Viber checking, so that emails reach everyone across platforms from a single indexed source of truth.
|-
* '''Synx:''' Scheduling software for workers, manpower organization, work breakdown documents, and workload prediction. Used by schools for class scheduling and businesses for monitoring work schedules.
|'''frappe-erpnext'''
* '''Data Cleaning Tools:''' Secada and spreadsheet importation tools for Frappe app migration. Comfac Accounting migrated to ERPNext with tools automating data structuring, extraction, and encoding—processing 2-3 companies per month per specialist.
|ERPNext customization, DocType creation, server/client scripts, Print Formats, custom workflows, Frappe hooks
|Comfac ERPNext repos, Frappe Framework docs, community apps, Frappe School tutorials
|-
|'''frappe-app-dev'''
|Full Frappe app creation from bench init to deployment, API design, Jinja templating, bench commands
|Frappe app scaffolds, published Frappe apps on GitHub, internal Comfac Frappe apps
|-
|'''onlyoffice-plugin'''
|OnlyOffice plugin architecture, macro development, document builder SDK, editor customization
|OnlyOffice SDK docs, existing plugin repos, Comfac OnlyOffice configurations
|-
|'''nextcloud-app'''
|NextCloud app development, OCS API, DAV integration, notification system
|NextCloud developer docs, community app repos
|-
|'''freecad-python'''
|FreeCAD scripting, workbench creation, parametric modeling API, macro development
|FreeCAD Python docs, wiki examples, Comfac CAD scripts
|}
 
=== B. Product Packages (Internal Apps) ===
These Packages specialize the model for extending and maintaining Comfac’s proprietary applications:
{| class="wikitable"
|'''Product'''
|'''Package Focus'''
|'''Key Training Material'''
|-
|'''Synx'''
|Worker scheduling algorithms, class schedule optimization, workload prediction models, manpower WBS documents
|Synx codebase, scheduling domain papers, school timetabling patterns
|-
|'''Secada'''
|Document management workflows, OCR/image recognition pipelines, metadata extraction, compliance tagging
|Secada repos, PaperlessNG patterns, Philippine regulatory document formats
|-
|'''Steward'''
|Home/facility automation, solar panel monitoring, IoT device integration, energy dashboard UI, panel board tracking
|Steward codebase, Home Assistant integration patterns, energy monitoring APIs
|-
|'''Synopsis'''
|Multi-platform message consolidation, search/indexing of Viber/email/Telegram/Messenger, report generation from conversations
|Synopsis architecture docs, messaging API schemas, Comfac communication workflows
|}
 
=== C. Process Packages (Cross-Cutting Capabilities) ===
These Packages encode general engineering practices that apply across multiple architectures:
 
* PRD Writing Package: Structured product requirements generation following Comfac’s template parameters—user stories, acceptance criteria, technical constraints, and dependency mapping.
* Data Migration Package: ETL patterns for migrating data into ERPNext—spreadsheet cleanup, field mapping, validation rules.
* API Integration Package: RESTful and webhook patterns for connecting Comfac services to third-party systems.
* DevOps Package: Docker containerization, Nginx reverse proxy configuration, self-hosted deployment patterns for Comfac’s infrastructure.
 
== 3.3 Package Composition & Chaining ==
The real power of the LoRA Package Registry emerges when Packages are composed. For a given task, the system loads the base Qwen 2.5 Coder model, then applies one or more LoRA Packages in sequence:
 
'''Example:''' "Create a Frappe app for Synx’s new shift-swap feature" would chain: '''''frappe-app-dev''''' + '''''synx-product''''' + '''''prd-writing''''' Packages, giving the model deep knowledge of Frappe conventions, Synx’s specific domain, and structured requirements output.
 
Package composition is managed through a configuration file that specifies load order, weight blending ratios, and conflict resolution when Packages touch overlapping domains.
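A minimal sketch of what such a composition configuration might hold is shown below. Everything here is an assumption for illustration: the class and function names, the raw weights, the claimed-topic sets, and the tie-break rule (highest weight wins, earlier load order wins ties). Actual blending of adapter weights would be done by the adapter library and is not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PackageRef:
    """One entry in the load order, with its raw blending weight."""
    name: str
    weight: float


def normalize(chain: list[PackageRef]) -> list[PackageRef]:
    """Rescale weights so they sum to 1 while keeping the load order."""
    total = sum(p.weight for p in chain)
    if total <= 0:
        raise ValueError("blending weights must sum to a positive number")
    return [PackageRef(p.name, p.weight / total) for p in chain]


def resolve_conflict(
    chain: list[PackageRef], claims: dict[str, set[str]], topic: str
) -> Optional[str]:
    """Pick the Package that owns a topic claimed by more than one Package."""
    candidates = [p for p in chain if topic in claims.get(p.name, set())]
    if not candidates:
        return None
    # max() returns the first maximal element, so load order breaks ties.
    return max(candidates, key=lambda p: p.weight).name


if __name__ == "__main__":
    chain = normalize([
        PackageRef("frappe-app-dev", 2.0),
        PackageRef("synx-product", 1.0),
        PackageRef("prd-writing", 1.0),
    ])
    claims = {
        "frappe-app-dev": {"api-design", "hooks"},
        "synx-product": {"scheduling", "api-design"},
    }
    print([(p.name, p.weight) for p in chain])
    print(resolve_conflict(chain, claims, "api-design"))  # frappe-app-dev
```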
 
= 4. QA Team & School Partnership Program =
Quality assurance for AI model training is fundamentally different from traditional software QA. The QA team’s primary function is generating, validating, and curating training data—ensuring that every Package in the library meets accuracy and reliability benchmarks before deployment.
 
== 4.1 QA Team Structure ==
{| class="wikitable"
|'''Role'''
|'''Responsibility'''
|'''Output'''
|'''Count'''
|-
|'''QA Lead'''
|Defines evaluation criteria per Package, manages benchmark suites, approves Package promotions from dev to stable
|Package scorecards, release approvals, regression test suites
|1 (internal)
|-
|'''Data Curator'''
|Reviews and cleans training pairs from all sources (Git logs, school submissions, internal docs). Ensures format compliance.
|Validated datasets in HF-compatible format
|2 (internal)
|-
|'''School QA Cohort'''
|Student testers who run model outputs against real assignments, flag hallucinations, and submit correction pairs
|Error logs, correction datasets, usage reports
|10-20 (school partners)
|-
|'''Domain Expert Reviewers'''
|Internal engineers who validate model outputs for their specific domains (Frappe, FreeCAD, etc.)
|Domain accuracy scores, corrected code samples
|3-5 (internal, rotating)
|}


==== 2. The "FreeCAD Matrix" (CAD Automation) ====
== 4.2 School Partnership Model ==
Schools provide the distributed labor force for data generation and QA, receiving free AI tooling in return. This is the core of the Open Source Ecology strategy.


'''The Concept:''' Parametric models designed to Draw and Create 3D assets automatically.
=== The Exchange ===


'''Timeline:''' Full benefits by '''End of 2026''' as workflows migrate to FreeCAD.
* Schools receive: Free unlimited access to Stable model releases, curriculum integration support, exposure to production AI workflows, and certificates of participation.
* Comfac receives: Error logs, corrected outputs, new training pairs (student code + corrections), usage telemetry, and domain-specific test cases across diverse real-world scenarios.


'''Key Innovations:'''
=== QA Workflow for School Cohorts ===


* '''Path Procedural Design Workbench:''' Engineers draw a simple path; AI automates generation of pipes, ducts, cabling, including intersections, mounting brackets, and supports.
# Assignment Distribution: QA Lead distributes model-generated outputs to student cohorts along with evaluation rubrics.
* '''Civil Works Automation:''' Procedural room design and layout generation.
# Testing & Annotation: Students run the outputs in their own environments, annotate errors (hallucinated APIs, incorrect syntax, wrong framework patterns), and submit correction pairs.
* '''Output:''' Generates 3D Models, 2D Shop Drawings, and BOM (Bill of Materials) instantly.
# Data Curation: Internal Data Curators validate student submissions, clean formatting, and merge into training datasets on Hugging Face.
# Retraining Cycle: Updated datasets trigger LoRA retraining. New Package version is benchmarked against the previous version.
# Release: If benchmarks improve, the new Package version is promoted to Stable and redistributed to schools, completing the feedback loop.
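The release decision in the final step can be expressed as a simple gate. The rule below is one possible reading of "if benchmarks improve" and is an assumption: no shared benchmark may regress beyond a tolerance, and at least one must strictly improve. The benchmark names in the demo are invented.

```python
def should_promote(
    previous: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.0,
) -> bool:
    """Decide whether a new Package version may be promoted to Stable."""
    shared = previous.keys() & candidate.keys()
    if not shared:
        # Nothing comparable was measured, so do not promote.
        return False
    no_regression = all(candidate[k] >= previous[k] - tolerance for k in shared)
    improved = any(candidate[k] > previous[k] for k in shared)
    return no_regression and improved


if __name__ == "__main__":
    v1 = {"code_correctness": 0.70, "domain_recall": 0.80}
    v2 = {"code_correctness": 0.75, "domain_recall": 0.80}
    v3 = {"code_correctness": 0.78, "domain_recall": 0.72}
    print(should_promote(v1, v2))  # True
    print(should_promote(v1, v3))  # False
```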


'''The Proprietary Asset:''' Not just AI model weights, but the '''Technique''': The Order of Operations, Business Processes, and Internal Workflows that automate detailing of Fitouts, Engineering, and Container Customization (SBC).
== 4.3 Using Other Models to Train Qwen ==
A critical acceleration strategy is using larger, more capable models (Claude, GPT-4, DeepSeek-V3, Qwen 72B) to generate synthetic training data for the smaller Qwen 2.5 Coder 3B target model. This distillation pipeline has four uses:


----
* PRD Generation: Use Claude or GPT-4 to generate high-quality PRDs for Frappe apps, OnlyOffice plugins, and internal products. These become the gold-standard training pairs for the PRD Writing Package.
* Code Correction: Feed the 3B model’s incorrect outputs to a larger model for correction. Each prompt, together with its incorrect and corrected outputs, becomes one DPO (Direct Preference Optimization) training record, with the corrected output as the preferred response.
* Architecture Explanation: Use larger models to generate detailed explanations of Frappe/OnlyOffice/FreeCAD code patterns, which become instruction-tuning data for the smaller model.
* Benchmark Generation: Use multiple large models to generate diverse test cases and expected outputs for each Package’s evaluation suite.
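The code-correction item can be made concrete with a small record builder. The sketch assumes the prompt/chosen/rejected layout commonly used for preference datasets (for example by TRL's DPO trainer); the helper names and the sample pair are illustrative.

```python
import json


def make_dpo_record(prompt: str, incorrect: str, corrected: str) -> dict:
    """Turn one (incorrect, corrected) pair into a preference record."""
    if incorrect.strip() == corrected.strip():
        raise ValueError("a preference pair needs two different outputs")
    return {"prompt": prompt, "chosen": corrected, "rejected": incorrect}


def to_jsonl(records: list[dict]) -> str:
    """Serialize records one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    prompt = "Write a Frappe API method that returns the current user."
    # Small-model output: missing the decorator, so it is not callable over HTTP.
    incorrect = "def current_user():\n    return frappe.session.user\n"
    # Larger-model correction.
    corrected = (
        "import frappe\n\n"
        "@frappe.whitelist()\n"
        "def current_user():\n"
        "    return frappe.session.user\n"
    )
    print(to_jsonl([make_dpo_record(prompt, incorrect, corrected)]))
```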


== Technical Architecture (The Stack) ==
= 5. Qwen 2.5 Coder 3B Instruct: Training Strategy =
The target model—Qwen 2.5 Coder 3B Instruct—is chosen for its balance of capability and deployability on consumer AMD hardware. The training strategy involves layered LoRA specialization, where each layer adds domain expertise without degrading general coding ability.


=== Hardware (The Engine) ===
== 5.1 Training Phases ==


'''GPU Strategy:''' Consumer AMD Radeon cards and next-generation Unified Memory architectures (High Memory per Dollar).
=== Phase 1: Base Calibration ===
Establish baseline performance on general coding tasks. Benchmark against Claude Sonnet on standardized tests (HumanEval, MBPP, custom Frappe tasks). Document the "Weakness Vocabulary"—specific concepts, APIs, and patterns where the base model fails.
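One way to assemble the "Weakness Vocabulary" is to tag each benchmark case with the concepts it exercises and count the tags on failed cases. The result format, tag names, and threshold below are assumptions for illustration.

```python
from collections import Counter


def weakness_vocabulary(results: list[dict], min_failures: int = 2) -> list[tuple[str, int]]:
    """Return concept tags that appear on at least min_failures failed cases."""
    failures = Counter()
    for case in results:
        if not case["passed"]:
            failures.update(case["tags"])
    # most_common() sorts by failure count, highest first.
    return [(tag, n) for tag, n in failures.most_common() if n >= min_failures]


if __name__ == "__main__":
    results = [
        {"task": "doctype-create", "passed": False, "tags": ["frappe.hooks", "doctype"]},
        {"task": "print-format", "passed": False, "tags": ["jinja", "doctype"]},
        {"task": "general-python", "passed": True, "tags": ["python"]},
        {"task": "server-script", "passed": False, "tags": ["frappe.hooks"]},
    ]
    for tag, count in weakness_vocabulary(results):
        print(tag, count)
```

In this sample, "frappe.hooks" and "doctype" each fail twice and are reported, while "jinja" falls below the threshold.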


=== Phase 2: PRD Writing LoRA ===
The PRD Writing Package is trained first because it produces structured outputs that become training data for subsequent Packages. Training parameters:
{| class="wikitable"
{| class="wikitable"
! Level !! Hardware !! Price !! Purpose
|'''Parameter'''
|'''Specification'''
|-
|-
| Entry || Radeon RX 7600 XT (16GB) || ~₱22,000 || Good for Inference
|'''Output Format'''
|Structured markdown with sections: Overview, User Stories, Acceptance Criteria, Technical Constraints, API Contracts, Data Models, Dependencies, Risk Assessment
|-
|-
| Production || Radeon RX 7800 XT (16GB) || ~₱35,000 || Faster token/sec
|'''Architecture Awareness'''
|PRDs must reference target architecture (Frappe, OnlyOffice, NextCloud, etc.) and include framework-specific implementation notes
|-
|-
| Next-Gen AI PCs || AMD APUs with integrated TPU/NPU || — || Unified Memory sharing system RAM with AI models
|'''Scope Constraints'''
|}
|Model must generate PRDs scoped to single-sprint deliverables (~2 weeks), decomposing larger features into phased PRDs
|-
|'''Training Data Source'''
|100+ PRDs generated by Claude/GPT-4 for real Comfac features, reviewed and corrected by internal engineers
|-
|'''Evaluation Criteria'''
|Completeness score, feasibility rating (can a developer build from this PRD alone?), architecture alignment
|}  


'''Next-Gen Advantage:''' Unified Memory eliminates strict VRAM limitations, empowering deployment of massive, full-sized models natively on consumer-grade hardware.
=== Phase 3: Architecture-Specific LoRAs ===
With the PRD Package established, train architecture Packages using a combination of:


'''Infrastructure:'''
* Synthetic instruction pairs from larger models (Claude generates Frappe code, model learns patterns)
* '''The Solar Data Center (Cabuyao):''' Utilizing Cornersteel's solar capacity to run inference at near-zero OpEx.
* Real codebase extraction from Comfac Git repositories (code patterns, commit messages, PR descriptions)
* Documentation distillation from official framework docs (Frappe, OnlyOffice SDK, FreeCAD API)
* QA correction pairs from school cohorts and internal domain experts


=== Software (The Brain) ===
=== Phase 4: Product-Specific LoRAs ===
These are the most specialized Packages, trained on Comfac’s proprietary codebases. Access is restricted to Cutting Edge tier:


* Synx: Scheduling optimization, workforce management patterns, school timetabling algorithms
* Secada: Document processing pipelines, OCR integration, compliance metadata schemas
* Steward: IoT device control, solar monitoring dashboards, energy tracking data models
* Synopsis: Multi-platform message aggregation, search indexing, conversation-to-report generation
== 5.2 LoRA Training Configuration ==
{| class="wikitable"
{| class="wikitable"
! Component !! Technology
|'''Parameter'''
|'''3B Model'''
|'''32B Model (Future)'''
|-
|'''LoRA Rank (r)'''
|64
|128
|-
|'''LoRA Alpha'''
|128
|256
|-
|-
| '''Base Model''' || Qwen 2.5 Coder 3B/32B (The Engine)
|'''Target Modules'''
|q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
|Same
|-
|-
| '''"Matrix" LoRA''' || Custom adapter trained on: Cornersteel manufacturing SOPs, FreeCAD Python Documentation, Comfac Legacy Codebases (Laravel/Node)
|'''Training Framework'''
|Unsloth (4-bit quantized training)
|Unsloth or Axolotl
|-
|-
| '''Deployment''' || Ollama + OpenCode CLI (for Devs) / Custom WebUI (for CAD operators)
|'''Hardware Requirement'''
|}
|1x RX 7600 XT (16GB)
|Multi-GPU or MI250
|-
|'''Epochs per Package'''
|3-5 (with early stopping)
|2-3
|}
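The rank and alpha values in the table imply the same scaling factor for both model sizes, since standard LoRA scales the adapter update by alpha / r. The sketch below records the table's settings and derives that factor; the class name is hypothetical, and the 2048 x 2048 layer used for the parameter count is an illustrative shape, not a figure taken from the model. The keyword names returned by <code>as_peft_kwargs</code> match those accepted by PEFT's <code>LoraConfig</code>.

```python
from dataclasses import dataclass

TARGET_MODULES = (
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
)


@dataclass(frozen=True)
class LoraSettings:
    """LoRA hyperparameters from the training configuration table."""
    rank: int
    alpha: int
    target_modules: tuple = TARGET_MODULES

    @property
    def scaling(self) -> float:
        """Factor applied to the low-rank update (alpha / r)."""
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        """Trainable parameters for one adapted layer: A is r x d_in, B is d_out x r."""
        return self.rank * (d_in + d_out)

    def as_peft_kwargs(self) -> dict:
        """Keyword arguments in the naming used by peft.LoraConfig."""
        return {
            "r": self.rank,
            "lora_alpha": self.alpha,
            "target_modules": list(self.target_modules),
        }


if __name__ == "__main__":
    small = LoraSettings(rank=64, alpha=128)    # 3B column
    large = LoraSettings(rank=128, alpha=256)   # 32B column
    print(small.scaling, large.scaling)         # 2.0 2.0
    print(small.adapter_params(2048, 2048))     # 262144
```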


----
= 6. Key Architecture Deep Dives =


== Implementation Roadmap ==
== 6.1 Frappe ERPNext Customization & App Creation ==
This is Comfac’s highest-priority LoRA Package given its direct revenue impact. The Frappe ecosystem requires deep model understanding across multiple layers:


=== Stage 1: Benchmarking & Setup (Weeks 1-4) ===
=== ERPNext Customization Package Coverage ===


'''Goal:''' Prove Qwen 2.5 can handle Code + Basic CAD scripting.
* DocType creation and modification (fields, naming rules, permissions, workflows)
* Server Scripts (Python): Whitelisted methods, scheduled tasks, document events, custom API endpoints
* Client Scripts (JavaScript): Form manipulation, custom buttons, dynamic link filters, real-time updates
* Print Format Designer: Custom Jinja templates for invoices, reports, and compliance documents (including Philippine regulatory formats)
* Workflow automation: Multi-step approval chains, state transitions, email notifications
* Data migration: Importing from legacy systems, spreadsheet cleanup, field mapping validation


* '''Hardware:''' Purchase 1x RX 7600 XT (~₱22,000)
=== Frappe App Development Package Coverage ===
* '''Task 1 (Software):''' Benchmark Qwen against Claude Sonnet on a Laravel controller task
* '''Task 2 (Hardware/CAD):''' Benchmark Qwen on generating a simple FreeCAD script ("Draw a cube with a hole")
* '''Deliverable:''' Report identifying the "Weakness Vocabulary" (what the model fails to understand about specific workflows)


=== Stage 2: POC - The "Matrix" Prototype (Weeks 5-8) ===
* Full app scaffolding: bench new-app through to production deployment
* Frappe hooks system: app_include_js/css, doc_events, scheduler_events, override_whitelisted_methods
* REST API design patterns within Frappe’s framework constraints
* Web page and portal development using Frappe’s website module
* Integration patterns: Frappe’s OAuth, webhook, and background job systems
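For reference, the hooks named above are plain Python variables in an app's <code>hooks.py</code>. The fragment below is a sketch for a hypothetical app called <code>comfac_app</code>; every dotted path beginning with <code>comfac_app</code> is a placeholder for a function the app would have to define.

```python
# hooks.py fragment for a hypothetical Frappe app named "comfac_app".
app_name = "comfac_app"

# Assets injected into every Desk page.
app_include_js = "/assets/comfac_app/js/comfac_app.js"
app_include_css = "/assets/comfac_app/css/comfac_app.css"

# Run app code when a document event fires on a DocType.
doc_events = {
    "Sales Invoice": {
        "on_submit": "comfac_app.events.sales_invoice.on_submit",
    },
}

# Background jobs keyed by schedule.
scheduler_events = {
    "daily": ["comfac_app.tasks.send_daily_digest"],
}

# Replace a whitelisted framework method with an app-specific one.
override_whitelisted_methods = {
    "frappe.desk.doctype.event.event.get_events": "comfac_app.overrides.get_events",
}
```

Training pairs for the app-development Package would need to cover both this declaration layer and the functions the paths point to.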


'''Goal:''' Create first "Comfac Matrix Model" (LoRA).
== 6.2 OnlyOffice Plugin & Customization ==
OnlyOffice is the document editing backbone for Comfac’s self-hosted infrastructure. The model needs Package coverage across:


'''Training Data:'''
* Plugin architecture: manifest.json structure, plugin lifecycle, event hooks, UI panel creation
* Export Git logs from Comfac (Software patterns)
* Document Builder API: Programmatic document creation, template automation, batch processing
* Create dataset of "Text Description -> FreeCAD Script" pairs
* Macro development: JavaScript macros for spreadsheet automation (replacing Google Apps Script workflows)
* Fine-tune Qwen 2.5 3B using Unsloth
* Editor customization: Custom toolbars, context menus, and formatting presets for Philippine business documents
* RAG Setup: Ingest Cornersteel machine manuals and SOPs into Knowledge Base
* Currency/locale handling: Philippine peso formatting defaults, date formats, and number conventions


=== Stage 3: Internal Deployment & OJT Swarm (Month 3-6) ===
== 6.3 Internal App Modifications (Synx, Secada, Steward, Synopsis) ==
Each internal product has its own codebase, architecture decisions, and domain logic that the model must deeply understand. These are Cutting Edge-only Packages with restricted access.


'''Goal:''' The "Open Source Ecology" in action—developing ecosystem of specialized models and tools.
'''Synx''' (Scheduling): The model must understand constraint-satisfaction algorithms for timetabling, worker availability matrices, shift-swap logic, and the specific data models Synx uses for workload prediction. Training data includes historical scheduling decisions and their outcomes.


'''The OJT Army:''' 10-20 students/OJTs are given access to the '''Stable''' model release. The exchange: they get free, unlimited AI; we get their error logs and patches to improve the Cutting Edge model. If this succeeds, it can be rolled out to school partners, who will help improve the LoRA and create better models.
'''Secada''' (Document Management): Beyond basic CRUD, the model needs OCR pipeline understanding—how Secada processes scanned Philippine government documents, extracts compliance metadata, and routes documents through approval workflows. Image recognition for document classification is a key differentiator.


'''Crowdsourced R&D (Model Expansion):'''
'''Steward''' (Facility Automation): IoT integration patterns, MQTT/Zigbee protocol handling, solar panel monitoring data models (tracking every panel in CGG down to individual panel boards), and energy dashboard visualization logic.


# '''Visual Models:''' Facility inspection, defect recognition, site monitoring
'''Synopsis''' (Communication Hub): Multi-platform API integration (Viber Business, email IMAP/SMTP, Telegram Bot API, Messenger Platform), message deduplication across platforms, full-text search indexing, and AI-assisted conversation summarization for report generation.
# '''CAD Models (The "Matrix" Core):''' Hybrid of Coder and Vision Language Action (VLA) models trained to:
#* Control FreeCAD API
#* Interpret uploaded PDFs/images and "draw" in FreeCAD
#* Convert 2D sketches into 3D parametric models
#* Generate BOM and Spreadsheets automatically
#* Navigate various Workbenches (PartDesign, Arch, Draft)
# '''Vision Language Action (VLA) Models:''' Connecting visual input directly to digital execution
# '''Action Models (Robotics):''' Trained in '''Godot''' game engine to simulate physical movement, hardware interaction, kinematics before real-world deployment
# '''Audio Models:''' Voice-commanded site logging, machine diagnostics, hands-free interface
# '''Expert Models:''' Specialized knowledge bases as "Expert Systems" for niche domains (HVAC troubleshooting, Electrical load calculation)
# '''Engineering Design Models:''' Orchestrating CAD models, interpreting design outputs, organizing unstructured data, comparing designs against Building Codes (NBCP)


'''The Tooling Expansion (Software Ecosystem):'''
= 7. Core Products & Revenue Strategy (2026-2027) =


# '''Comfac Custom Workbenches:''' Proprietary FreeCAD environments
== 7.1 The Coder Matrix (3B-16B Coding LoRA) ==
# '''FreeCAD-Blender Bridge:''' Plugins automating import/export, preserving metadata bi-directionally
A fine-tuned coding assistant replicating Claude Code Premium capabilities without per-seat cost. Immediate 2026 impact: enables Comfac to rapidly turn around software solutions for CSC Facilities, Frappe ecosystem, and internal products.
# '''PandaPower App:''' FreeCAD/Blender integration for power system analysis and grid simulation
# '''Geometry Node Interoperability:''' Migrate geometry nodes between Blender and FreeCAD
# '''Godot "Tri-App" Workflow:''' Plugins for seamless transfer between '''FreeCAD (Design)''', '''Blender (Visualization)''', and '''Godot (Simulation)'''


'''The Hardware Cascade (Upgrade Cycle):'''
== 7.2 The FreeCAD Matrix (CAD Automation) ==
* '''Senior Staff Upgrade:''' 8-12GB VRAM → 16-20GB VRAM workstations
Parametric models for automated 3D asset creation. Full benefits by end of 2026 as workflows migrate to FreeCAD. Key innovations include Path Procedural Design Workbench (engineers draw paths, AI generates pipes/ducts/cabling with supports), Civil Works Automation (procedural room design), and instant generation of 3D models, 2D shop drawings, and BOMs.
* '''Junior Inheritance:''' 8-12GB hardware trickled down to Junior staff and Trainees


=== Stage 4: Commercialization & Partnerships (Month 6+) ===
== 7.3 The Marketplace ==


'''Goal:''' Revenue Generation.
* Rent spare compute from Cabuyao Solar Data Center to students and startups
* "Plug & Play" AI Rentals for schools—pre-configured inference endpoints
* LoRA Package Registry licensing—partners can purchase individual Packages rather than full platform access
 
= 8. Technical Architecture =
 
== 8.1 Hardware Strategy ==
{| class="wikitable"
|'''Level'''
|'''Hardware'''
|'''Price'''
|'''Purpose'''
|-
|'''Entry'''
|Radeon RX 7600 XT (16GB)
|~₱22,000
|Inference + LoRA training (3B models)
|-
|'''Production'''
|Radeon RX 7800 XT (16GB)
|~₱35,000
|Faster inference + larger model support
|-
|'''Next-Gen'''
|AMD APUs with integrated NPU
|TBD
|Unified Memory for full-sized model deployment
|}
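A rough way to relate the cards in the table to model sizes is to compute the memory taken by the weights alone. The sketch below uses nominal parameter counts (3B and 32B) and an assumed 80% usable fraction of VRAM to leave room for activations and the KV cache; both the fraction and the function names are illustrative assumptions.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_on_card(
    vram_gb: float,
    params_billions: float,
    bits_per_weight: int,
    usable_fraction: float = 0.8,
) -> bool:
    """True if the weights fit within the assumed usable share of VRAM."""
    return weight_memory_gb(params_billions, bits_per_weight) <= vram_gb * usable_fraction


if __name__ == "__main__":
    print(weight_memory_gb(3, 16))    # 6.0
    print(weight_memory_gb(3, 4))     # 1.5
    print(weight_memory_gb(32, 16))   # 64.0
    print(weight_memory_gb(32, 4))    # 16.0
    print(fits_on_card(16, 3, 16))    # True
    print(fits_on_card(16, 32, 4))    # False
```

By this estimate a 3B model fits comfortably on a 16GB card even at 16-bit precision, while a 32B model at 4-bit already consumes the full 16GB in weights alone, which is consistent with the multi-GPU or MI250 requirement listed for the 32B configuration.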


Infrastructure: The Solar Data Center (Cabuyao) utilizes Cornersteel’s solar capacity to run inference at near-zero OpEx.

== 8.2 Software Stack ==
{| class="wikitable"
|'''Component'''
|'''Technology'''
|-
|'''Base Model'''
|Qwen 2.5 Coder 3B / 32B (via Hugging Face)
|-
|'''LoRA Packages'''
|Custom adapters per domain, stored in LoRA Package Registry on Hugging Face (private org repos)
|-
|'''Training Framework'''
|Unsloth + ROCm (AMD), with TRL/PEFT for training orchestration
|-
|'''Deployment'''
|Ollama + OpenCode CLI (devs) / Custom WebUI (CAD operators / non-technical)
|-
|'''QA Pipeline'''
|Automated eval harness + school cohort testing + domain expert review
|}

'''Partner Offer:''' "Sustainable AI Integration" (₱5M/Year ROI)

For less than $100,000 USD per year, partners receive:
* '''Dedicated High-End Compute:''' Unified memory AI PCs with exclusive access to proprietary LoRA-trained models
* '''Bespoke Full-Size Training:''' Customization to partner's proprietary requirements
* '''Distilled Edge Solutions:''' "Tiny to small" AIs optimized for localized, niche usages


''Service Example:'' Cornersteel Facilities Management utilizing deployed "CMM Nodes" for Vision Model maintenance scanning, cost-auditing apps, and on-demand embedded system drafting.

'''The Marketplace:'''
* Rent out spare compute from Cabuyao Solar Data Center to students/startups
* "Plug & Play" AI Rentals for schools

----

= 9. Implementation Roadmap =

== Stage 1: Accounts, Benchmarking & Setup (Weeks 1-4) ==
# Set up Hugging Face organizational account; accept Qwen 2.5 model licenses
# Register AMD Developer account; install ROCm 6.x; validate PyTorch-ROCm on RX 7600 XT
# Purchase 1x RX 7600 XT (~₱22,000)
# Benchmark Qwen 2.5 3B against Claude Sonnet on: Laravel controller generation, Frappe DocType creation, basic FreeCAD scripting
# Deliverable: Weakness Vocabulary report + LoRA Package Registry v0.1 schema definition


== Stage 2: First Packages & QA Foundation (Weeks 5-12) ==
# Train PRD Writing Package using Claude/GPT-4 generated training data, reviewed by internal team
# Train frappe-erpnext Package using Comfac Git exports + Frappe documentation distillation
# Establish QA pipeline: automated eval harness, internal reviewer rotation, benchmark scoring
# RAG Setup: Ingest Cornersteel machine manuals, SOPs, and Frappe docs into knowledge base
# Begin OnlyOffice plugin Package data collection

== Stage 3: School Partnerships & Package Expansion (Months 3-6) ==
# Onboard first school partner cohort (10-20 students) with Stable model access
# Launch QA feedback loop: students test, annotate, submit corrections
# Train Product Packages: Synx, Secada, Steward, Synopsis (Cutting Edge only)
# Train freecad-python Package, begin FreeCAD Matrix prototype
# Hardware cascade: Senior staff upgrade to 16-20GB VRAM; juniors inherit 8-12GB units

== Stage 4: Commercialization & Revenue (Month 6+) ==
# Launch Partner Access program (₱5M/year tier)
# Deploy LoRA Package Registry marketplace for individual Package licensing
# Scale school partnerships to additional institutions
# Begin visual model development (facility inspection, defect recognition)
# Rent spare Cabuyao compute to students and startups

== Transition to Model Customization ==

'''The Evolution:''' Moving beyond simple LoRAs to building '''Cutting Edge Custom Models''' from scratch using '''DeepSeek''' and '''Qwen Open Weights'''.

'''Financial Shift:'''
* Model-building operational costs increase by '''₱3M - ₱4M per year'''
* ''Why:'' Deep customization requires higher compute density and data cleaning

'''Workforce Specialization:'''
; MM Scientists : Focused on architecture, training, and weights of AI models
; MM Designers : Focused on parametric logic, "Technique," and UX of AI-controlled tools

----


== Financial Impact & Efficiency Analysis ==

'''Current Baseline:'''
# '''Design & Engineering:''' ₱10M - ₱15M per year on Engineering and Design talent (~30 engineers)
# '''Facilities & Maintenance:''' ₱10M per year on 20 engineering and maintenance staff

=== A. Design Automation (The 10x Lever) ===

CMM equips 1 designer to generate production elements at '''10x the speed''' of the industry standard.

{| class="wikitable"
! Metric !! Traditional Workflow !! CMM Powered Workflow
|-
| '''Annual Talent Spend''' || ₱16M (30 Engineers) || ₱16M (Same Team, Supercharged)
|-
| '''Output Equivalent''' || ₱16M Value || '''₱50M Value''' (3.1x Realized Gain)
|-
| '''Production Speed''' || 1x (Manual Logistics, Redesign) || '''10x''' (Parametric Gen, Auto-Paperwork)
|-
| '''Cost Per Revision''' || High (Man-hours) || '''Near Zero''' (Compute-seconds)
|}

''Note: Conservative ~3x realized gain due to human communication bottleneck between production and construction teams.''

= 10. Revenue Model & Unit Economics =

== 10.1 No-SaaS Policy: The Open Source Support Model ==
Comfac does not operate a SaaS business. All Stable model weights and LoRA Packages are released as open source: anyone can download them, self-host them, and run them without paying Comfac a single peso. This is deliberate: it builds community, creates a distributed QA workforce, and establishes Comfac as the recognized authority on these models.

'''The revenue comes from the reality that Cutting Edge models are inherently unstable.''' They are experimental, frequently updated, may break between versions, and require expertise to deploy correctly. Organizations that want to run these models in production need support: configuration help, troubleshooting, version migration, hardware optimization, and domain-specific tuning. '''''That support is what they pay for.'''''

This follows the proven open-source business model used by Red Hat (RHEL), Canonical (Ubuntu Pro), Elastic, and others: the code is free, the expertise and stability guarantees cost money. Anyone can DIY it, but for non-technical organizations or those without specialized ML engineers, self-hosting Cutting Edge models without support is impractical.

== 10.2 Revenue Stream Breakdown ==
{| class="wikitable"
|'''Revenue Stream'''
|'''% of Revenue'''
|'''Description'''
|-
|'''Supported Seats'''
|'''~20%'''
|$200/year per supported seat. Each seat is a user or workstation running Cutting Edge models with Comfac support coverage. The fee reflects the expected support burden per seat: more users means more tickets, more version-migration assistance, more hardware troubleshooting. Organizations self-hosting Stable models owe nothing. Organizations wanting Cutting Edge models with guaranteed uptime and support response pay per seat.
|-
|'''R&D Access (Partner Tier)'''
|'''~40%'''
|Partners pay ₱5M/year (less than $100K USD) for direct access to Comfac’s R&D pipeline: pre-release Cutting Edge models, dedicated compute provisioning, bespoke fine-tuning on partner data, and priority engineering support. Partners are funding, and benefiting from, ongoing research, not licensing a hosted product. Models run on partner-owned or Comfac-provisioned hardware, never on a Comfac cloud.
|-
|'''Custom LoRA Packages'''
|'''~40%'''
|Project-based engagements to develop bespoke LoRA Packages for professionals and industrial clients. Comfac builds, trains, validates, and delivers a specialized tiny-to-small model customized for the client’s operational setting. The deliverable is the Package itself; the client owns and self-hosts it. Ongoing support is optional and billed separately.
|}
=== B. Facilities & Maintenance (The 2x Lever) ===

AI impacts the ''management'' layer (design, monitoring, scanning, vendor coordination) rather than physical execution.

{| class="wikitable"
! Metric !! Traditional Workflow !! CMM Powered Workflow
|-
| '''Annual Talent Spend''' || ₱10M (20 Staff) || ₱10M (Same Team, Optimized)
|-
| '''Output Equivalent''' || ₱10M Value || '''₱20M Value''' (2x Realized Gain)
|-
| '''Key Efficiency''' || Manual Inspection || '''Vision Model Scanning & Auto-Ticketing'''
|}

== 10.3 Supported Seat Economics ==
The $200/year per-seat price is not a software license; it is a support contract. The models are free. What organizations pay for is:
* Access to Cutting Edge model releases (not yet promoted to Stable; these are experimental, fast-moving, and may have breaking changes between versions)
* Deployment support: Comfac engineers help configure ROCm, Ollama, hardware optimization, and LoRA Package loading for the client’s specific environment
* Version migration: When Cutting Edge models update (which is frequent), supported seats get migration assistance: config changes, compatibility fixes, regression testing
* Incident response: When something breaks in production (and with Cutting Edge models, things will break), supported seats get priority troubleshooting
* Usage-based pricing rationale: More seats = more support requests = more Comfac engineer time. The per-seat fee is calibrated to the expected support load, not to software access

Compare: $200/year with Comfac support vs. $1,200/year for a commercial SaaS AI seat (Claude, Copilot Enterprise). Organizations that have in-house ML talent can run Stable models for free and never pay Comfac. Those who lack that expertise, or who need the latest Cutting Edge capabilities for competitive advantage, pay for support.

{| class="wikitable"
|'''Deployment Tier'''
|'''Supported Seats'''
|'''Annual Support Revenue'''
|-
|'''Internal (CGG Group)'''
|50–100 seats
|$10,000–$20,000
|-
|'''Partners (Cornersteel, Gov’t)'''
|100–300 seats
|$20,000–$60,000
|-
|'''Schools & Community'''
|Stable models: free / Cutting Edge: subsidized
|$0–minimal (QA value offsets)
|-
|'''Commercial Clients'''
|50–200 seats
|$10,000–$40,000
|}
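The revenue ranges in the Supported Seats table follow from seat count multiplied by the $200 support fee. A minimal sketch of that arithmetic is shown here; the dictionary keys and the function name are illustrative only, and the Schools & Community tier is omitted because the table gives it no seat range.

```python
SEAT_PRICE_USD = 200  # annual support fee per seat, as stated in this section

# (low, high) supported-seat counts per tier, copied from the table
SEAT_RANGES = {
    "Internal": (50, 100),
    "Partners": (100, 300),
    "Commercial Clients": (50, 200),
}


def support_revenue(seat_range, price=SEAT_PRICE_USD):
    """Return the (low, high) annual support revenue in USD for a seat range."""
    low, high = seat_range
    return low * price, high * price


revenue_by_tier = {tier: support_revenue(r) for tier, r in SEAT_RANGES.items()}
total_low = sum(low for low, _ in revenue_by_tier.values())
total_high = sum(high for _, high in revenue_by_tier.values())

for tier, (low, high) in revenue_by_tier.items():
    print(f"{tier}: ${low:,} to ${high:,}")
print(f"Total: ${total_low:,} to ${total_high:,}")
```

Summed over the three paying tiers, the table implies roughly $40,000 to $120,000 per year in seat support revenue.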


=== C. The Robotics Frontier (Future CapEx Reduction) ===

Ability to create LoRAs and modify models = ability to create '''Robotic Intelligence'''.

'''The Godot Division:''' Operating at '''~₱2M per year''', simulating specific work environments for CGG and partners.

'''Environment LoRAs:''' Simulations generate specialized LoRAs allowing robots to "know" a facility before physically entering.
* ''Phase 1:'' Navigation, scanning, object retrieval
* ''Phase 2:'' Toxic/hazardous tasks (bioreactor maintenance, sewage/pipe infrastructure)

'''Execution:''' Combined with Design Automation, enables Comfac to build its own robotics to further automate Cornersteel's manufacturing lines, moving from "Software Automation" to "Physical Automation."

== 10.4 R&D Access: Subscribing to Research, Not Software ==
The R&D Access tier (₱5M/year per partner) is the anchor revenue stream. Partners are not licensing a product; they are funding and receiving early access to Comfac’s continuous research output:
* Pre-release Cutting Edge models before they reach the general support tier
* Dedicated compute provisioned on partner-owned or Comfac-provisioned hardware (not a hosted cloud service)
* Bespoke fine-tuning on partner’s proprietary data: their SOPs, codebases, and domain knowledge become part of their private LoRA Packages
* Priority engineering support with direct access to CMM Scientists and Designers
* Early access to new model capabilities (vision, VLA, audio) as they reach usable state; partners help validate these in real environments

== 10.5 Custom LoRA Package Engagements ==
This stream represents the majority of projected use cases at scale. Industrial and professional clients need small, specialized models that run on modest hardware in operational settings, and they lack the ML expertise to build these themselves:
* Facilities management firms: Vision + expert models for predictive maintenance, inspection automation, compliance reporting
* Engineering consultancies: Specialized calculation models (HVAC sizing, electrical load, structural analysis) that run on sub-₱50K hardware
* Government agencies: Compliance document processing, automated form filling, regulatory cross-referencing models
* Manufacturing plants: Quality inspection models, SOP-aware troubleshooting assistants, production scheduling optimizers
* Schools and training centers: Customized tutoring models, curriculum-aligned coding assistants, assessment automation

The deliverable is the Package itself: the client receives trained LoRA weights, deployment configuration, and documentation. They own it and self-host it. Ongoing support is a separate, optional contract. Each engagement produces reusable Package templates that reduce Comfac’s development cost for similar future clients.
== 10.6 Profitability Model ==
{| class="wikitable"
|'''Item'''
|'''Detail'''
|-
|'''Gross Revenue Target (Year 2)'''
|₱20M–₱30M across all three streams
|-
|'''R&D Expenditure'''
|₱15M operational + ₱3–4M model-building = ₱18–19M/year
|-
|'''R&D Treatment'''
|Capitalized and depreciated over 3–5 years (model weights, training infrastructure, Package IP)
|-
|'''Target Equity/Profit Margin'''
|~15% after full R&D depreciation
|-
|'''Margin Expansion Path'''
|Custom Package templates compound—each new client in a similar vertical costs less to serve. Solar data center drives inference OpEx toward zero. Support burden per seat decreases as models mature from Cutting Edge to Stable.
|}


The 15% margin target is conservative and accounts for heavy reinvestment in the first 2–3 years. As the LoRA Package Registry matures, more models graduate from Cutting Edge to Stable (reducing support load), and template reuse increases, margins should expand to 20–25% by Year 3–4.

=== D. Organizational & Strategic Impact ===
* '''Retention & Work-Life Balance:''' Automating "grunt work" eliminates overtime and burnout → '''exceptionally high retention rates'''
* '''Flipped Bottlenecks:''' '''Sales becomes the new bottleneck''' as design throughput increases dramatically
* '''Design-Led Sales:''' Designers join Sales in lead generation with real-time generative tools
* '''Multidisciplinary Expansion:''' Absorb new requirements: '''Hydraulic & Sanitary (Water), Electrical (Power), Structural, Certifications'''
* '''The New Challenge:''' Constraint shifts to '''"Hiring Certified Professionals"''' to sign off on AI-generated work volume
* '''Unique Market Position:''' Only Philippine engineering firm balancing '''Engineering and IT''', manufacturing its own intelligence

----

= 11. Operational Efficiency Analysis =
== 11.1 Design Automation (The 10x Lever) ==
{| class="wikitable"
|'''Metric'''
|'''Traditional'''
|'''CMM Powered'''
|-
|'''Annual Talent Spend'''
|₱16M (30 Engineers)
|₱16M (Same Team, Supercharged)
|-
|'''Output Equivalent'''
|₱16M Value
|₱50M Value (3.1x Realized Gain)
|-
|'''Production Speed'''
|1x (Manual)
|10x (Parametric Gen + Auto-Paperwork)
|-
|'''Cost Per Revision'''
|High (Man-hours)
|Near Zero (Compute-seconds)
|}


== 11.2 Facilities & Maintenance (The 2x Lever) ==
{| class="wikitable"
|'''Metric'''
|'''Traditional'''
|'''CMM Powered'''
|-
|'''Annual Talent Spend'''
|₱10M (20 Staff)
|₱10M (Same Team, Optimized)
|-
|'''Output Equivalent'''
|₱10M Value
|₱20M Value (2x Realized Gain)
|-
|'''Key Efficiency'''
|Manual Inspection
|Vision Model Scanning & Auto-Ticketing
|}

== 11.3 Robotics Frontier (Future CapEx Reduction) ==
The Godot Division (~₱2M/year) simulates specific work environments for CGG and partners. Environment LoRAs allow robots to “know” a facility before physically entering. Phase 1: Navigation, scanning, object retrieval. Phase 2: Toxic/hazardous tasks.

= 12. Immediate Action Items =
# '''Strategic Briefing:''' Present revised Project OpenCoder roadmap to core leadership and technical leads
# '''Account Setup:''' Register Hugging Face organization account (comfac-cmm) and AMD Developer accounts for lead engineers
# '''QA Team Formation:''' Appoint QA Lead, recruit 2 Data Curators, identify first school partner for pilot cohort
# '''LoRA Package Registry Schema:''' Define v0.1 Package Card format, dataset structure, and evaluation harness specifications
# '''Hardware Procurement:''' Purchase initial RX 7600 XT; install ROCm and validate training pipeline
# '''Data Collection Sprint:''' Begin exporting Comfac Git repos, Frappe configurations, and internal documentation for training dataset v1
# '''Synthetic Data Generation:''' Commission first batch of PRDs and Frappe code samples from Claude/GPT-4 for training data
# '''Execution & Monitoring:''' Launch Stage 1 Benchmarking with weekly reporting cadence

= Appendix: The Matrix Model Assets =

== Reference Links ==
FreeCAD Matrix Models: <nowiki>https://sites.comfac.net/freecad.html</nowiki>

''Models designed to draw and create 3D parametric models.''

== Proprietary Definition ==
The “Secret Sauce” of the CMM Business Unit is not open-source code, but the Technique:
# Order of Operations: Specific sequence for manufacturing steps
# Business Processes: How design moves from prompt to physical Bill of Materials
# Internal Workflow: Automation of detailing for Fitout Construction, Engineering Schematics, Container Customization (SBC), and Open Source Ecology Production
# LoRA Package Registry: The composable, versioned collection of domain expertise LoRAs that encode Comfac’s competitive advantage in deployable model form

== Immediate Action Items ==
# '''Strategic Briefing:''' Present full "Project OpenCoder" roadmap to core leadership and technical leads
# '''Operational Planning & Resource Allocation:'''
#* Audit current engineering workloads
#* Select internal leads (MMDs) for Data, Coding, and Manufacturing/CAD tracks
#* Detail plan and assign roles before spending
# '''Hardware Determination:''' Finalize hardware outlay based on identified personnel (CAD-focused vs. Code-focused setups)
# '''Procurement:''' Purchase initial hardware batch (Pilot: AMD Radeon RX 7600 XT) once team is ready
# '''Execution & Monitoring:''' Launch Stage 1 Benchmarking with weekly reporting cadence

----

== Appendix: The Matrix Model Assets ==

=== Reference Links ===
* '''FreeCAD Matrix Models:''' [https://sites.comfac.net/freecad.html https://sites.comfac.net/freecad.html]
* ''Description:'' Models designed to Draw and Create 3D Parametric models

=== Proprietary Definition ===
The "Secret Sauce" of CMM Business Unit is not open-source code, but the '''Technique''':
# '''Order of Operations:''' Specific sequence for manufacturing steps
# '''Business Processes:''' How design moves from prompt to physical Bill of Materials
# '''Internal Workflow:''' Automation of detailing for:
#* Fitout Construction
#* Engineering Schematics
#* Container Customization (SBC - Smart Building Components)
#* Open Source Ecology Production

----

''Drafted by the Comfac AI Strategy Team''


[[Category:Research]]

Latest revision as of 10:26, 8 March 2026

PROJECT OPENCODER

Strategic AI Independence Initiative

Revised Plan v2.0

Prepared for:

Comfac Technology Group & Cornersteel Systems Corp.


1. Executive Summary

The Pivot: Project OpenCoder establishes the Comfac Matrix Model (CMM) Business Unit—a proprietary engine that manufactures intelligence for software development, physical production (via FreeCAD/Python), and enterprise automation. Rather than renting AI at $1,200/yr/seat, Comfac will build, train, and deploy its own specialized models, reducing per-seat cost to $200/year while creating a new revenue-generating division.

1.1 Strategic Pillars

Open Source Core: Release stable models publicly, cultivating a community of students and developers who become a distributed R&D department. Partner with schools to create a QA pipeline that continuously improves model quality.

Partner Premium: Reserve cutting-edge models—trained on latest data with faster iteration—for Comfac and paying partners (Cornersteel, Government agencies).

LoRA Package Registry: Introduce a structured LoRA Package Registry that packages domain expertise as composable, versionable LoRA modules. Each LoRA Package encapsulates PRD writing parameters, app architecture patterns, and validated QA benchmarks for a specific use case.

1.2 Financial Model

{| class="wikitable"
! Item !! Detail
|-
| Operational Expenditure || ₱15M per year upon release of Open Source and Cutting Edge models
|-
| Partner Commitment || ₱5M/year in Partner Access fees (less than $100,000 USD/year)
|-
| Partner ROI || Unified memory AI PCs + full-sized models + optimized tiny-to-small AIs for departmental use
|-
| Overage Billing || ₱5,000 per hour beyond commitment
|-
| Target Per-Seat Cost || $200/year (down from $1,200/year)
|}

2. Platform Accounts & Development Infrastructure

Before any model training or fine-tuning can begin, Comfac must establish accounts on key AI development platforms. These accounts provide access to model weights, training infrastructure, community resources, and hardware-specific toolchains.

2.1 Hugging Face Account Setup

Hugging Face is the central hub for open-weight model distribution, dataset hosting, and community collaboration. A Comfac organizational account is essential for the following reasons:

  • Model Access: Download gated models like Qwen 2.5 Coder variants (3B, 7B, 32B) that require license acceptance. Certain models require organizational verification for commercial use.
  • Dataset Hosting: Host proprietary training datasets (Frappe code patterns, FreeCAD scripts, internal SOPs) as private repositories. Version control training data alongside model checkpoints.
  • Model Publishing: Publish stable CMM releases for community consumption. Track downloads, issues, and community contributions to stable models.
  • Spaces & Inference: Use Hugging Face Spaces to host demo inference endpoints for partner evaluation and school QA testing.
  • Training with AutoTrain / TRL: Use Hugging Face’s training libraries (TRL, PEFT) and their Unsloth integration, all of which are tightly coupled to the HF ecosystem.

Account Tiers

{| class="wikitable"
! Tier !! Cost !! Recommendation
|-
| Free (Individual) || $0 || Sufficient for initial benchmarking (Stage 1)
|-
| Pro (Individual) || $9/month || For lead MMD/MMS researchers needing priority inference
|-
| Organization || Free (public repos) / $20/user (private) || Required for Stage 2+ with private dataset repos
|}

2.2 AMD Developer Account & ROCm Ecosystem

Since Comfac’s GPU strategy centers on AMD Radeon consumer cards, an AMD Developer account is critical for accessing ROCm (Radeon Open Compute) tooling:

  • ROCm SDK Access: ROCm is AMD’s open-source GPU compute platform—equivalent to NVIDIA’s CUDA. It provides PyTorch/ROCm builds, HIP compiler toolchains, and GPU profiling tools necessary for LoRA fine-tuning on Radeon hardware.
  • Hardware Compatibility: The RX 7600 XT and RX 7800 XT require ROCm 6.x+ for stable PyTorch training. AMD Developer forums and early-access drivers often resolve compatibility issues weeks before public release.
  • MI-Series Migration Path: As Comfac scales, AMD’s MI250/MI300 accelerators offer a professional upgrade path. Developer account holders get priority access to documentation, benchmarks, and partner pricing.
  • Unified Memory APU Roadmaps: AMD’s Strix Point and subsequent APUs with shared CPU/GPU memory are central to Comfac’s next-gen strategy. Developer accounts provide early access to XDNA NPU SDKs.
  • Bug Reporting & Community: Direct channels to AMD engineers for ROCm issues specific to consumer Radeon cards (which are less tested than MI-series for ML workloads).

Setup Action Items

  1. Register AMD Developer account at developer.amd.com
  2. Install ROCm 6.x on Ubuntu 22.04/24.04 development machines
  3. Validate PyTorch-ROCm build against RX 7600 XT (gfx1102 target)
  4. Subscribe to AMD ROCm GitHub releases for driver/compatibility updates
  5. Join AMD Developer community forums—flag Comfac as AI inference + training use case for potential partnership outreach
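Step 3 of the setup items can be checked with a few lines of PyTorch. The sketch below is a hedged example rather than an official AMD procedure: `rocm_report` is a hypothetical helper name, and it relies only on `torch.version.hip`, `torch.cuda.is_available()`, and `torch.cuda.get_device_name()`, which ROCm builds of PyTorch expose through the usual `torch.cuda` namespace.

```python
def rocm_report():
    """Collect basic facts about the local PyTorch build and the visible GPU."""
    report = {
        "torch_installed": False,
        "hip_version": None,
        "gpu_available": False,
        "device_name": None,
    }
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed on this machine

    report["torch_installed"] = True
    # torch.version.hip is a version string on ROCm builds and None otherwise
    report["hip_version"] = getattr(torch.version, "hip", None)
    # ROCm builds reuse the torch.cuda namespace for AMD GPUs
    report["gpu_available"] = torch.cuda.is_available()
    if report["gpu_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        # Run a small matmul to confirm kernels actually execute on the device
        x = torch.randn(256, 256, device="cuda")
        report["matmul_ok"] = bool(torch.isfinite(x @ x).all())
    return report


if __name__ == "__main__":
    for key, value in rocm_report().items():
        print(f"{key}: {value}")
```

On a correctly configured RX 7600 XT machine, `hip_version` should be non-empty and `gpu_available` should be true; an empty `hip_version` suggests that a CPU-only or CUDA build of PyTorch was installed instead.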

3. The LoRA Package Registry

A LoRA Package is a composable, versionable LoRA adapter module paired with its training data, evaluation benchmarks, and deployment configuration. The LoRA Package Registry is the organizational system that catalogs, versions, and chains these Packages for different use cases.

3.1 What is a LoRA Package?

Each LoRA Package consists of the following components packaged as a single versioned unit:

{| class="wikitable"
! Component !! Description
|-
| LoRA Weights || The fine-tuned adapter weights for a specific domain (e.g., frappe-erpnext-v1.2)
|-
| Training Dataset || Curated instruction/completion pairs, code samples, PRD templates, and architecture patterns
|-
| Eval Benchmarks || Automated test cases that measure accuracy, code correctness, and domain knowledge retention
|-
| PRD Parameters || Structured prompting templates that define how the model writes PRDs, specs, and documentation for this domain
|-
| Package Card || Metadata: version, dependencies, compatible base models, author, QA status, deployment notes
|}
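One possible shape for the Package Card metadata is sketched below as a Python dataclass. The field names mirror the metadata listed for the Package Card, but the class itself, the `package_id` property, and the example values (including the author string) are assumptions for illustration, not the v0.1 schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PackageCard:
    """Hypothetical Package Card record; not the official registry schema."""
    name: str
    version: str
    base_models: list
    author: str
    qa_status: str = "dev"  # assumed states: "dev" or "stable"
    dependencies: list = field(default_factory=list)
    deployment_notes: str = ""

    @property
    def package_id(self):
        # Follows the naming pattern of the example "frappe-erpnext-v1.2"
        return f"{self.name}-v{self.version}"


card = PackageCard(
    name="frappe-erpnext",
    version="1.2",
    base_models=["Qwen 2.5 Coder 3B"],
    author="Comfac CMM",  # placeholder author
)
print(card.package_id)
print(json.dumps(asdict(card), indent=2))
```

Serializing the card to JSON keeps it easy to store next to the adapter weights in a model repository and to diff between Package versions.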

3.2 Package Categories

A. Architecture Packages (App Framework Patterns)

These Packages encode deep understanding of specific application frameworks—not just syntax, but idiomatic patterns, project structure conventions, and deployment workflows.

{| class="wikitable"
! Package Name !! Domain Coverage !! Training Sources
|-
| frappe-erpnext || ERPNext customization, DocType creation, server/client scripts, Print Formats, custom workflows, Frappe hooks || Comfac ERPNext repos, Frappe Framework docs, community apps, Frappe School tutorials
|-
| frappe-app-dev || Full Frappe app creation from bench init to deployment, API design, Jinja templating, bench commands || Frappe app scaffolds, published Frappe apps on GitHub, internal Comfac Frappe apps
|-
| onlyoffice-plugin || OnlyOffice plugin architecture, macro development, document builder SDK, editor customization || OnlyOffice SDK docs, existing plugin repos, Comfac OnlyOffice configurations
|-
| nextcloud-app || NextCloud app development, OCS API, DAV integration, notification system || NextCloud developer docs, community app repos
|-
| freecad-python || FreeCAD scripting, workbench creation, parametric modeling API, macro development || FreeCAD Python docs, wiki examples, Comfac CAD scripts
|}

B. Product Packages (Internal Apps)

These Packages specialize the model for extending and maintaining Comfac’s proprietary applications:

{| class="wikitable"
! Product !! Package Focus !! Key Training Material
|-
| Synx || Worker scheduling algorithms, class schedule optimization, workload prediction models, manpower WBS documents || Synx codebase, scheduling domain papers, school timetabling patterns
|-
| Secada || Document management workflows, OCR/image recognition pipelines, metadata extraction, compliance tagging || Secada repos, PaperlessNG patterns, Philippine regulatory document formats
|-
| Steward || Home/facility automation, solar panel monitoring, IoT device integration, energy dashboard UI, panel board tracking || Steward codebase, Home Assistant integration patterns, energy monitoring APIs
|-
| Synopsis || Multi-platform message consolidation, search/indexing of Viber/email/Telegram/Messenger, report generation from conversations || Synopsis architecture docs, messaging API schemas, Comfac communication workflows
|}

C. Process Packages (Cross-Cutting Capabilities)

These Packages encode general engineering practices that apply across multiple architectures:

  • PRD Writing Package: Structured product requirements generation following Comfac’s template parameters—user stories, acceptance criteria, technical constraints, and dependency mapping.
  • Data Migration Package: ETL patterns for migrating data into ERPNext—spreadsheet cleanup, field mapping, validation rules.
  • API Integration Package: RESTful and webhook patterns for connecting Comfac services to third-party systems.
  • DevOps Package: Docker containerization, Nginx reverse proxy configuration, self-hosted deployment patterns for Comfac’s infrastructure.

3.3 Package Composition & Chaining

The real power of the LoRA Package Registry emerges when Packages are composed. For a given task, the system loads the base Qwen 2.5 Coder model, then applies one or more LoRA Packages in sequence:

Example: "Create a Frappe app for Synx’s new shift-swap feature" would chain: frappe-app-dev + synx-product + prd-writing Packages, giving the model deep knowledge of Frappe conventions, Synx’s specific domain, and structured requirements output.

Package composition is managed through a configuration file that specifies load order, weight blending ratios, and conflict resolution when Packages touch overlapping domains.
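A minimal sketch of such a composition file and its validation follows, using the shift-swap example. The key names (`packages`, `order`, `weight`), the weights themselves, and the choice to normalize blend ratios so they sum to 1 are all assumptions; conflict resolution between overlapping Packages is not modeled here.

```python
def resolve_chain(config):
    """Validate a composition config; return (ordered names, normalized weights)."""
    entries = sorted(config["packages"], key=lambda p: p["order"])
    names = [p["name"] for p in entries]
    if len(set(names)) != len(names):
        raise ValueError("duplicate Package in chain")
    weights = [float(p["weight"]) for p in entries]
    if any(w <= 0 for w in weights):
        raise ValueError("blend weights must be positive")
    total = sum(weights)
    return names, [w / total for w in weights]


# Hypothetical config for "Create a Frappe app for Synx's new shift-swap feature"
shift_swap = {
    "base_model": "Qwen 2.5 Coder 3B",
    "packages": [
        {"name": "prd-writing", "order": 3, "weight": 1.0},
        {"name": "frappe-app-dev", "order": 1, "weight": 2.0},
        {"name": "synx-product", "order": 2, "weight": 1.0},
    ],
}

names, weights = resolve_chain(shift_swap)
print(names)    # ['frappe-app-dev', 'synx-product', 'prd-writing']
print(weights)  # [0.5, 0.25, 0.25]
```

Keeping load order and blend ratios explicit in the file makes a chain reproducible: the same config should always yield the same adapter stack, whichever tool applies it.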

4. QA Team & School Partnership Program

Quality assurance for AI model training is fundamentally different from traditional software QA. The QA team’s primary function is generating, validating, and curating training data—ensuring that every Package in the library meets accuracy and reliability benchmarks before deployment.

4.1 QA Team Structure

{| class="wikitable"
! Role !! Responsibility !! Output !! Count
|-
| QA Lead || Defines evaluation criteria per Package, manages benchmark suites, approves Package promotions from dev to stable || Package scorecards, release approvals, regression test suites || 1 (internal)
|-
| Data Curator || Reviews and cleans training pairs from all sources (Git logs, school submissions, internal docs). Ensures format compliance. || Validated datasets in HF-compatible format || 2 (internal)
|-
| School QA Cohort || Student testers who run model outputs against real assignments, flag hallucinations, and submit correction pairs || Error logs, correction datasets, usage reports || 10-20 (school partners)
|-
| Domain Expert Reviewers || Internal engineers who validate model outputs for their specific domains (Frappe, FreeCAD, etc.) || Domain accuracy scores, corrected code samples || 3-5 (internal, rotating)
|}

4.2 School Partnership Model

Schools provide the distributed labor force for data generation and QA, receiving free AI tooling in return. This is the core of the Open Source Ecology strategy.

The Exchange

  • Schools receive: Free unlimited access to Stable model releases, curriculum integration support, exposure to production AI workflows, and certificates of participation.
  • Comfac receives: Error logs, corrected outputs, new training pairs (student code + corrections), usage telemetry, and domain-specific test cases across diverse real-world scenarios.

QA Workflow for School Cohorts

  1. Assignment Distribution: QA Lead distributes model-generated outputs to student cohorts along with evaluation rubrics.
  2. Testing & Annotation: Students run the outputs in their own environments, annotate errors (hallucinated APIs, incorrect syntax, wrong framework patterns), and submit correction pairs.
  3. Data Curation: Internal Data Curators validate student submissions, clean formatting, and merge into training datasets on Hugging Face.
  4. Retraining Cycle: Updated datasets trigger LoRA retraining. New Package version is benchmarked against the previous version.
  5. Release: If benchmarks improve, the new Package version is promoted to Stable and redistributed to schools, completing the feedback loop.
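Steps 4 and 5 amount to a promotion gate: a retrained Package is released only if it benchmarks better than its predecessor. One way to express that rule is sketched below. The exact criterion (no metric may regress and the mean score must rise) is an assumption, and the scores are made-up placeholders rather than measured results.

```python
def should_promote(previous, candidate, min_gain=0.0, tolerance=0.0):
    """Promote only if no shared metric regresses and the mean score improves."""
    shared = sorted(previous.keys() & candidate.keys())
    if not shared:
        raise ValueError("no shared benchmark metrics to compare")
    regressions = [m for m in shared if candidate[m] < previous[m] - tolerance]
    mean_prev = sum(previous[m] for m in shared) / len(shared)
    mean_cand = sum(candidate[m] for m in shared) / len(shared)
    return not regressions and (mean_cand - mean_prev) > min_gain


# Placeholder scores for two hypothetical Package versions
v1_2 = {"accuracy": 0.71, "code_correctness": 0.64, "domain_retention": 0.80}
v1_3 = {"accuracy": 0.74, "code_correctness": 0.69, "domain_retention": 0.80}
v1_4 = {"accuracy": 0.90, "code_correctness": 0.60, "domain_retention": 0.80}

print(should_promote(v1_2, v1_3))  # True: nothing regressed, mean improved
print(should_promote(v1_2, v1_4))  # False: code_correctness regressed
```

Blocking on any single regression is deliberately strict; the `tolerance` parameter lets the QA Lead allow small benchmark noise without loosening the rule entirely.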

4.3 Using Other Models to Train Qwen

A critical acceleration strategy is using larger, more capable models (Claude, GPT-4, DeepSeek-V3, Qwen 72B) to generate synthetic training data for the smaller Qwen 2.5 3B target model. This is the knowledge distillation pipeline:

  • PRD Generation: Use Claude or GPT-4 to generate high-quality PRDs for Frappe apps, OnlyOffice plugins, and internal products. These become the gold-standard training pairs for the PRD Writing Package.
  • Code Correction: Feed Qwen 3B’s incorrect outputs to a larger model for correction. The (incorrect, corrected) pairs become DPO (Direct Preference Optimization) training data.
  • Architecture Explanation: Use larger models to generate detailed explanations of Frappe/OnlyOffice/FreeCAD code patterns, which become instruction-tuning data for the smaller model.
  • Benchmark Generation: Use multiple large models to generate diverse test cases and expected outputs for each Package’s evaluation suite.
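The code-correction bullet above maps naturally onto preference data. The sketch below converts (incorrect, corrected) pairs into records with `prompt`, `chosen`, and `rejected` keys, the layout commonly used for DPO training (for example by TRL's `DPOTrainer`). The helper names and the sample pairs are hypothetical.

```python
import json


def to_dpo_record(prompt, small_model_output, corrected_output):
    """Map an (incorrect, corrected) pair to a preference record."""
    if small_model_output.strip() == corrected_output.strip():
        return None  # nothing was corrected, so there is no preference signal
    return {
        "prompt": prompt,
        "chosen": corrected_output,      # the larger model's correction
        "rejected": small_model_output,  # the small model's original attempt
    }


def write_jsonl(records, path):
    """Write one JSON object per line, a format most dataset loaders accept."""
    with open(path, "w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")


# Hypothetical samples: (prompt, small model output, corrected output)
samples = [
    ("Return the sum of a list.",
     "def total(xs): return sum(x)",
     "def total(xs): return sum(xs)"),
    ("Return 1.", "def one(): return 1", "def one(): return 1"),
]
records = [r for r in (to_dpo_record(*s) for s in samples) if r is not None]
print(len(records))  # 1, because the second sample needed no correction
```

Dropping pairs where the correction is identical to the original avoids training on examples that carry no preference signal.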

5. Qwen 2.5 Coder 3B Instruct: Training Strategy

The target model—Qwen 2.5 Coder 3B Instruct—is chosen for its balance of capability and deployability on consumer AMD hardware. The training strategy involves layered LoRA specialization, where each layer adds domain expertise without degrading general coding ability.

5.1 Training Phases

Phase 1: Base Calibration

Establish baseline performance on general coding tasks. Benchmark against Claude Sonnet on standardized tests (HumanEval, MBPP, custom Frappe tasks). Document the "Weakness Vocabulary"—specific concepts, APIs, and patterns where the base model fails.

Phase 2: PRD Writing LoRA

The PRD Writing Package is trained first because it produces structured outputs that become training data for subsequent Packages. Training parameters:

Parameter | Specification
Output Format | Structured markdown with sections: Overview, User Stories, Acceptance Criteria, Technical Constraints, API Contracts, Data Models, Dependencies, Risk Assessment
Architecture Awareness | PRDs must reference target architecture (Frappe, OnlyOffice, NextCloud, etc.) and include framework-specific implementation notes
Scope Constraints | Model must generate PRDs scoped to single-sprint deliverables (~2 weeks), decomposing larger features into phased PRDs
Training Data Source | 100+ PRDs generated by Claude/GPT-4 for real Comfac features, reviewed and corrected by internal engineers
Evaluation Criteria | Completeness score, feasibility rating (can a developer build from this PRD alone?), architecture alignment
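
The completeness score could be computed in many ways; one minimal approach is to check that each required section appears as a markdown heading. The scoring rule below (fraction of required sections present) is an assumption, not the plan's defined metric.

```python
import re

# Section names taken from the Output Format row above.
REQUIRED_SECTIONS = [
    "Overview", "User Stories", "Acceptance Criteria", "Technical Constraints",
    "API Contracts", "Data Models", "Dependencies", "Risk Assessment",
]


def completeness_score(prd_markdown: str) -> tuple[float, list[str]]:
    """Return (fraction of required sections present, missing section names).

    A section counts as present when a markdown heading line (one or more
    '#' characters followed by the name) matches it, ignoring case.
    """
    headings = {
        match.group(1).lower()
        for match in re.finditer(r"^#+[ \t]+(.+?)[ \t]*$", prd_markdown, re.MULTILINE)
    }
    missing = [s for s in REQUIRED_SECTIONS if s.lower() not in headings]
    score = 1 - len(missing) / len(REQUIRED_SECTIONS)
    return score, missing


# Hypothetical, deliberately incomplete PRD.
draft = "# Overview\nText.\n## User Stories\n- As a user...\n## Acceptance Criteria\n## Data Models\n"
score, missing = completeness_score(draft)
print(score, missing)
```

A structural check like this only confirms that sections exist; the feasibility and architecture-alignment criteria still need human or model-based review.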

Phase 3: Architecture-Specific LoRAs

With the PRD Package established, train architecture Packages using a combination of:

  • Synthetic instruction pairs from larger models (Claude generates Frappe code, model learns patterns)
  • Real codebase extraction from Comfac Git repositories (code patterns, commit messages, PR descriptions)
  • Documentation distillation from official framework docs (Frappe, OnlyOffice SDK, FreeCAD API)
  • QA correction pairs from school cohorts and internal domain experts

Phase 4: Product-Specific LoRAs

These are the most specialized Packages, trained on Comfac’s proprietary codebases. Access is restricted to the Cutting Edge tier:

  • Synx: Scheduling optimization, workforce management patterns, school timetabling algorithms
  • Secada: Document processing pipelines, OCR integration, compliance metadata schemas
  • Steward: IoT device control, solar monitoring dashboards, energy tracking data models
  • Synopsis: Multi-platform message aggregation, search indexing, conversation-to-report generation

5.2 LoRA Training Configuration

Parameter | 3B Model | 32B Model (Future)
LoRA Rank (r) | 64 | 128
LoRA Alpha | 128 | 256
Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | Same
Training Framework | Unsloth (4-bit quantized training) | Unsloth or Axolotl
Hardware Requirement | 1x RX 7600 XT (16GB) | Multi-GPU or MI250
Epochs per Package | 3-5 (with early stopping) | 2-3
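
To make the rank and alpha values concrete, the sketch below records both columns as plain Python objects and derives two quantities from them: the standard LoRA scaling factor (alpha / r) and the number of trainable parameters an adapter adds to a single linear layer. The `LoraSettings` class is hypothetical and is not part of Unsloth or PEFT; the layer width in the example is illustrative, not a statement about the Qwen architecture.

```python
from dataclasses import dataclass

TARGET_MODULES = ("q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj")


@dataclass(frozen=True)
class LoraSettings:
    rank: int
    alpha: int
    target_modules: tuple = TARGET_MODULES

    @property
    def scaling(self) -> float:
        """Standard LoRA scaling applied to the adapter update: alpha / r."""
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        """Trainable parameters LoRA adds to one d_in x d_out linear layer.

        LoRA learns A (rank x d_in) and B (d_out x rank), so the count is
        rank * (d_in + d_out).
        """
        return self.rank * (d_in + d_out)


SETTINGS_3B = LoraSettings(rank=64, alpha=128)
SETTINGS_32B = LoraSettings(rank=128, alpha=256)

print(SETTINGS_3B.scaling, SETTINGS_32B.scaling)
# Illustrative square layer of width 2048 (an assumed size).
print(SETTINGS_3B.adapter_params(2048, 2048))
```

Both columns keep alpha at twice the rank, so the scaling factor is the same for the 3B and 32B configurations even though the adapter capacity doubles.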

6. Key Architecture Deep Dives

6.1 Frappe ERPNext Customization & App Creation

This is Comfac’s highest-priority LoRA Package given its direct revenue impact. The Frappe ecosystem requires deep model understanding across multiple layers:

ERPNext Customization Package Coverage

  • DocType creation and modification (fields, naming rules, permissions, workflows)
  • Server Scripts (Python): Whitelisted methods, scheduled tasks, document events, custom API endpoints
  • Client Scripts (JavaScript): Form manipulation, custom buttons, dynamic link filters, real-time updates
  • Print Format Designer: Custom Jinja templates for invoices, reports, and compliance documents (including Philippine regulatory formats)
  • Workflow automation: Multi-step approval chains, state transitions, email notifications
  • Data migration: Importing from legacy systems, spreadsheet cleanup, field mapping validation
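
As an illustration of the field mapping validation in the data migration item, the following sketch checks a legacy-column to DocType-field mapping before any import runs. It uses plain Python rather than Frappe's import tooling, and the column and field names are hypothetical.

```python
def validate_field_mapping(mapping, source_columns, doctype_fields, required_fields):
    """Check a legacy-column -> DocType-field mapping before importing.

    Returns a list of human-readable problems; an empty list means no
    problems were found by these checks.
    """
    problems = []
    for column, field in mapping.items():
        if column not in source_columns:
            problems.append(f"mapped column '{column}' is not in the source sheet")
        if field not in doctype_fields:
            problems.append(f"target field '{field}' does not exist on the DocType")
    targets = list(mapping.values())
    for field in sorted(set(targets)):
        if targets.count(field) > 1:
            problems.append(f"target field '{field}' is mapped more than once")
    for field in required_fields:
        if field not in targets:
            problems.append(f"required field '{field}' has no source column")
    return problems


# Hypothetical legacy spreadsheet and DocType.
problems = validate_field_mapping(
    mapping={"Cust Name": "customer_name", "TIN": "tax_id", "Phone": "tax_id"},
    source_columns={"Cust Name", "TIN"},
    doctype_fields={"customer_name", "tax_id", "territory"},
    required_fields=["customer_name", "territory"],
)
for problem in problems:
    print(problem)
```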

Frappe App Development Package Coverage

  • Full app scaffolding: bench new-app through to production deployment
  • Frappe hooks system: app_include_js/css, doc_events, scheduler_events, override_whitelisted_methods
  • REST API design patterns within Frappe’s framework constraints
  • Web page and portal development using Frappe’s website module
  • Integration patterns: Frappe’s OAuth, webhook, and background job systems
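
For orientation, the hooks named above live in an app's `hooks.py` as ordinary Python assignments. The fragment below is a sketch for a hypothetical app called `comfac_demo`; every dotted path under `comfac_demo` is a placeholder, and only the hook names themselves come from the list above.

```python
# hooks.py for a hypothetical app named "comfac_demo".
app_name = "comfac_demo"
app_title = "Comfac Demo"

# Assets injected into every Desk page.
app_include_js = "/assets/comfac_demo/js/comfac_demo.js"
app_include_css = "/assets/comfac_demo/css/comfac_demo.css"

# Run a handler when a document event fires on a DocType.
doc_events = {
    "Sales Invoice": {
        "on_submit": "comfac_demo.events.sales_invoice.on_submit",
    },
}

# Periodic background jobs, keyed by frequency.
scheduler_events = {
    "daily": ["comfac_demo.tasks.send_daily_summary"],
}

# Replace a core whitelisted method with an app-specific implementation.
override_whitelisted_methods = {
    "frappe.desk.doctype.event.event.get_events": "comfac_demo.overrides.get_events",
}
```

Training examples for the Package would pair fragments like this with the handler modules they point to, since a hook entry is only useful when the referenced function exists.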

6.2 OnlyOffice Plugin & Customization

OnlyOffice is the document editing backbone for Comfac’s self-hosted infrastructure. The model needs Package coverage across:

  • Plugin architecture: manifest.json structure, plugin lifecycle, event hooks, UI panel creation
  • Document Builder API: Programmatic document creation, template automation, batch processing
  • Macro development: JavaScript macros for spreadsheet automation (replacing Google Apps Script workflows)
  • Editor customization: Custom toolbars, context menus, and formatting presets for Philippine business documents
  • Currency/locale handling: Philippine peso formatting defaults, date formats, and number conventions

6.3 Internal App Modifications (Synx, Secada, Steward, Synopsis)

Each internal product has its own codebase, architecture decisions, and domain logic that the model must deeply understand. These are Cutting Edge-only Packages with restricted access.

Synx (Scheduling): The model must understand constraint-satisfaction algorithms for timetabling, worker availability matrices, shift-swap logic, and the specific data models Synx uses for workload prediction. Training data includes historical scheduling decisions and their outcomes.

Secada (Document Management): Beyond basic CRUD, the model needs OCR pipeline understanding—how Secada processes scanned Philippine government documents, extracts compliance metadata, and routes documents through approval workflows. Image recognition for document classification is a key differentiator.

Steward (Facility Automation): IoT integration patterns, MQTT/Zigbee protocol handling, solar panel monitoring data models (tracking every panel in CGG down to individual panel boards), and energy dashboard visualization logic.

Synopsis (Communication Hub): Multi-platform API integration (Viber Business, email IMAP/SMTP, Telegram Bot API, Messenger Platform), message deduplication across platforms, full-text search indexing, and AI-assisted conversation summarization for report generation.
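
Message deduplication across platforms can be approached in several ways. The sketch below shows one simple option: fingerprinting each message by sender and normalized text. It assumes sender identities have already been unified across platforms, and the `Message` type is hypothetical rather than taken from the Synopsis codebase.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    platform: str  # e.g. "viber", "email", "telegram"
    sender: str    # sender identity, assumed already unified across platforms
    text: str


def fingerprint(message: Message) -> str:
    """Hash the sender plus the text with case and whitespace normalized."""
    normalized = re.sub(r"\s+", " ", message.text).strip().lower()
    payload = f"{message.sender.lower()}\n{normalized}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def deduplicate(messages):
    """Keep the first message for each fingerprint, preserving input order."""
    seen = set()
    unique = []
    for message in messages:
        key = fingerprint(message)
        if key not in seen:
            seen.add(key)
            unique.append(message)
    return unique


# Hypothetical inbox: the same notice sent on two platforms, plus a reply.
inbox = [
    Message("viber", "ana", "Site visit moved to  Friday."),
    Message("email", "Ana", "site visit moved to Friday."),
    Message("telegram", "ben", "Noted, thanks."),
]
print([m.platform for m in deduplicate(inbox)])
```

Exact fingerprints miss near-duplicates such as forwarded messages with added signatures; a production system would likely need fuzzy matching on top of this.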

7. Core Products & Revenue Strategy (2026-2027)

7.1 The Coder Matrix (3B-16B Coding LoRA)

A fine-tuned coding assistant intended to replicate Claude Code Premium capabilities without per-seat SaaS fees. Immediate 2026 impact: enables Comfac to rapidly turn around software solutions for CSC Facilities, the Frappe ecosystem, and internal products.

7.2 The FreeCAD Matrix (CAD Automation)

Parametric models for automated 3D asset creation. Full benefits are expected by end of 2026 as workflows migrate to FreeCAD. Key innovations include the Path Procedural Design Workbench (engineers draw paths; AI generates pipes, ducts, and cabling with supports), Civil Works Automation (procedural room design), and instant generation of 3D models, 2D shop drawings, and BOMs.
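
The path-to-BOM idea can be illustrated without FreeCAD. The sketch below takes a drawn path as a list of 3D points and derives a pipe length, support positions, and a rough bill of materials. It deliberately does not use the FreeCAD API; the fixed support spacing, the stock pipe length, and the choice to ignore fittings at bends are all simplifying assumptions.

```python
import math


def path_length(points):
    """Total length of a 3D polyline given as (x, y, z) tuples, in metres."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def support_positions(points, spacing):
    """Distances along the path (from its start) where supports are placed.

    One support at the start, then one every `spacing` metres, plus one at
    the end if the last regular support does not already land there.
    """
    total = path_length(points)
    count = int(total // spacing)
    positions = [i * spacing for i in range(count + 1)]
    if not math.isclose(positions[-1], total):
        positions.append(total)
    return positions


def bill_of_materials(points, spacing, stock_length):
    """Rough BOM: pipe sticks of `stock_length` metres and a support count."""
    total = path_length(points)
    return {
        "pipe_length_m": total,
        "pipe_sticks": math.ceil(total / stock_length),
        "supports": len(support_positions(points, spacing)),
    }


# Hypothetical L-shaped run: 3 m along x, then 4 m along y.
route = [(0, 0, 0), (3, 0, 0), (3, 4, 0)]
print(bill_of_materials(route, spacing=2, stock_length=6))
```

In a real workbench the same path would also drive the 3D geometry and shop drawings, so the BOM and the model stay consistent by construction.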

7.3 The Marketplace

  • Rent spare compute from Cabuyao Solar Data Center to students and startups
  • "Plug & Play" AI Rentals for schools—pre-configured inference endpoints
  • LoRA Package Registry licensing—partners can purchase individual Packages rather than full platform access

8. Technical Architecture

8.1 Hardware Strategy

Level | Hardware | Price | Purpose
Entry | Radeon RX 7600 XT (16GB) | ~₱22,000 | Inference + LoRA training (3B models)
Production | Radeon RX 7800 XT (16GB) | ~₱35,000 | Faster inference + larger model support
Next-Gen | AMD APUs with NPU/TPU | TBD | Unified Memory for full-sized model deployment

Infrastructure: The Solar Data Center (Cabuyao) utilizes Cornersteel’s solar capacity to run inference at near-zero OpEx.

8.2 Software Stack

Component | Technology
Base Model | Qwen 2.5 Coder 3B / 32B (via Hugging Face)
LoRA Packages | Custom adapters per domain, stored in LoRA Package Registry on Hugging Face (private org repos)
Training Framework | Unsloth + ROCm (AMD), with TRL/PEFT for training orchestration
Deployment | Ollama + OpenCode CLI (devs) / Custom WebUI (CAD operators / non-technical)
QA Pipeline | Automated eval harness + school cohort testing + domain expert review
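
On the deployment side, a custom WebUI or script would typically talk to a local Ollama server over its HTTP API. The sketch below builds a non-streaming request for the `/api/generate` endpoint on Ollama's default port; the model tag `comfac-frappe-erpnext` is a hypothetical name for a locally registered Package, and `generate` needs a running server to work.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request for Ollama."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Building the request does not touch the network.
request = build_generate_request(
    "comfac-frappe-erpnext",  # hypothetical local model tag
    "Create a DocType for tracking solar panel inspections.",
)
print(request.get_method(), request.full_url)
```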

9. Implementation Roadmap

Stage 1: Accounts, Benchmarking & Setup (Weeks 1-4)

  1. Set up Hugging Face organizational account; accept Qwen 2.5 model licenses
  2. Register AMD Developer account; install ROCm 6.x; validate PyTorch-ROCm on RX 7600 XT
  3. Purchase 1x RX 7600 XT (~₱22,000)
  4. Benchmark Qwen 2.5 3B against Claude Sonnet on: Laravel controller generation, Frappe DocType creation, basic FreeCAD scripting
  5. Deliverable: Weakness Vocabulary report + LoRA Package Registry v0.1 schema definition

Stage 2: First Packages & QA Foundation (Weeks 5-12)

  1. Train PRD Writing Package using Claude/GPT-4 generated training data, reviewed by internal team
  2. Train frappe-erpnext Package using Comfac Git exports + Frappe documentation distillation
  3. Establish QA pipeline: automated eval harness, internal reviewer rotation, benchmark scoring
  4. RAG Setup: Ingest Cornersteel machine manuals, SOPs, and Frappe docs into knowledge base
  5. Begin OnlyOffice plugin Package data collection

Stage 3: School Partnerships & Package Expansion (Months 3-6)

  1. Onboard first school partner cohort (10-20 students) with Stable model access
  2. Launch QA feedback loop: students test, annotate, submit corrections
  3. Train Product Packages: Synx, Secada, Steward, Synopsis (Cutting Edge only)
  4. Train freecad-python Package, begin FreeCAD Matrix prototype
  5. Hardware cascade: Senior staff upgrade to 16-20GB VRAM; juniors inherit 8-12GB units

Stage 4: Commercialization & Revenue (Month 6+)

  1. Launch Partner Access program (₱5M/year tier)
  2. Deploy LoRA Package Registry marketplace for individual Package licensing
  3. Scale school partnerships to additional institutions
  4. Begin visual model development (facility inspection, defect recognition)
  5. Rent spare Cabuyao compute to students and startups

10. Revenue Model & Unit Economics

10.1 No-SaaS Policy: The Open Source Support Model

Comfac does not operate a SaaS business. All Stable model weights and LoRA Packages are released as open source—anyone can download them, self-host them, and run them without paying Comfac a single peso. This is deliberate: it builds community, creates a distributed QA workforce, and establishes Comfac as the recognized authority on these models.

The revenue comes from the reality that Cutting Edge models are inherently unstable. They are experimental, frequently updated, may break between versions, and require expertise to deploy correctly. Organizations that want to run these models in production need support—configuration help, troubleshooting, version migration, hardware optimization, and domain-specific tuning. That support is what they pay for.

This follows the proven open-source business model used by Red Hat (RHEL), Canonical (Ubuntu Pro), Elastic, and others: the code is free, the expertise and stability guarantees cost money. Anyone can DIY it, but for non-technical organizations or those without specialized ML engineers, self-hosting Cutting Edge models without support is impractical.

10.2 Revenue Stream Breakdown

Revenue Stream | % of Revenue | Description
Supported Seats | ~20% | $200/year per supported seat. Each seat is a user or workstation running Cutting Edge models with Comfac support coverage. The fee reflects the expected support burden per seat: more users means more tickets, more version-migration assistance, more hardware troubleshooting. Organizations self-hosting Stable models owe nothing. Organizations wanting Cutting Edge models with guaranteed uptime and support response pay per seat.
R&D Access (Partner Tier) | ~40% | Partners pay ₱5M/year ($100K USD) for direct access to Comfac’s R&D pipeline: pre-release Cutting Edge models, dedicated compute provisioning, bespoke fine-tuning on partner data, and priority engineering support. Partners are funding (and benefiting from) ongoing research, not licensing a hosted product. Models run on partner-owned or Comfac-provisioned hardware, never on a Comfac cloud.
Custom LoRA Packages | ~40% | Project-based engagements to develop bespoke LoRA Packages for professionals and industrial clients. Comfac builds, trains, validates, and delivers a specialized tiny-to-small model customized for the client’s operational setting. The deliverable is the Package itself; the client owns and self-hosts it. Ongoing support is optional and billed separately.

10.3 Supported Seat Economics

The $200/year per-seat price is not a software license—it is a support contract. The models are free. What organizations pay for is:

  • Access to Cutting Edge model releases (not yet promoted to Stable—these are experimental, fast-moving, and may have breaking changes between versions)
  • Deployment support: Comfac engineers help configure ROCm, Ollama, hardware optimization, and LoRA Package loading for the client’s specific environment
  • Version migration: When Cutting Edge models update (which is frequent), supported seats get migration assistance—config changes, compatibility fixes, regression testing
  • Incident response: When something breaks in production—and with Cutting Edge models, things will break—supported seats get priority troubleshooting
  • Usage-based pricing rationale: More seats = more support requests = more Comfac engineer time. The per-seat fee is calibrated to the expected support load, not to software access

Compare: $200/year with Comfac support vs. $1,200/year for a commercial SaaS AI seat (Claude, Copilot Enterprise). Organizations that have in-house ML talent can run Stable models for free and never pay Comfac. Those who lack that expertise—or who need the latest Cutting Edge capabilities for competitive advantage—pay for support.

Deployment Tier | Supported Seats | Annual Support Revenue
Internal (CGG Group) | 50–100 seats | $10,000–$20,000
Partners (Cornersteel, Gov’t) | 100–300 seats | $20,000–$60,000
Schools & Community | Stable models: free / Cutting Edge: subsidized | $0–minimal (QA value offsets)
Commercial Clients | 50–200 seats | $10,000–$40,000
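
The revenue figures in this table are seat counts multiplied by the $200 annual fee. The short sketch below reproduces that arithmetic for the three paying tiers; the combined total it prints is simple addition across tiers and is not a figure stated in the plan.

```python
SEAT_PRICE_USD = 200  # annual support fee per seat

# (low, high) supported-seat estimates for the paying tiers.
TIERS = {
    "Internal (CGG Group)": (50, 100),
    "Partners (Cornersteel, Gov't)": (100, 300),
    "Commercial Clients": (50, 200),
}


def support_revenue(seats_low, seats_high, price=SEAT_PRICE_USD):
    """Annual support revenue range in USD for a seat-count range."""
    return seats_low * price, seats_high * price


for tier, (low, high) in TIERS.items():
    rev_low, rev_high = support_revenue(low, high)
    print(f"{tier}: ${rev_low:,} to ${rev_high:,}")

total_low = sum(support_revenue(low, high)[0] for low, high in TIERS.values())
total_high = sum(support_revenue(low, high)[1] for low, high in TIERS.values())
print(f"All paying tiers: ${total_low:,} to ${total_high:,}")
```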

10.4 R&D Access: Subscribing to Research, Not Software

The R&D Access tier (₱5M/year per partner) is the anchor revenue stream. Partners are not licensing a product—they are funding and receiving early access to Comfac’s continuous research output:

  • Pre-release Cutting Edge models before they reach the general support tier
  • Dedicated compute provisioned on partner-owned or Comfac-provisioned hardware (not a hosted cloud service)
  • Bespoke fine-tuning on partner’s proprietary data—their SOPs, codebases, and domain knowledge become part of their private LoRA Packages
  • Priority engineering support with direct access to CMM Scientists and Designers
  • Early access to new model capabilities (vision, VLA, audio) as they reach usable state—partners help validate these in real environments

10.5 Custom LoRA Package Engagements

This stream represents the majority of projected use cases at scale. Industrial and professional clients need small, specialized models that run on modest hardware in operational settings—and they lack the ML expertise to build these themselves:

  • Facilities management firms: Vision + expert models for predictive maintenance, inspection automation, compliance reporting
  • Engineering consultancies: Specialized calculation models (HVAC sizing, electrical load, structural analysis) that run on sub-₱50K hardware
  • Government agencies: Compliance document processing, automated form filling, regulatory cross-referencing models
  • Manufacturing plants: Quality inspection models, SOP-aware troubleshooting assistants, production scheduling optimizers
  • Schools and training centers: Customized tutoring models, curriculum-aligned coding assistants, assessment automation

The deliverable is the Package itself—the client receives trained LoRA weights, deployment configuration, and documentation. They own it and self-host it. Ongoing support is a separate, optional contract. Each engagement produces reusable Package templates that reduce Comfac’s development cost for similar future clients.

10.6 Profitability Model

Item | Detail
Gross Revenue Target (Year 2) | ₱20M–₱30M across all three streams
R&D Expenditure | ₱15M operational + ₱3–4M model-building = ₱18–19M/year
R&D Treatment | Capitalized and depreciated over 3–5 years (model weights, training infrastructure, Package IP)
Target Equity/Profit Margin | ~15% after full R&D depreciation
Margin Expansion Path | Custom Package templates compound: each new client in a similar vertical costs less to serve. Solar data center drives inference OpEx toward zero. Support burden per seat decreases as models mature from Cutting Edge to Stable.

The 15% margin target is conservative and accounts for heavy reinvestment in the first 2–3 years. As the LoRA Package Registry matures, as more models graduate from Cutting Edge to Stable (reducing support load), and as template reuse increases, margins should expand to 20–25% by Years 3–4.
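
To put rough numbers on the table above, the sketch below computes two things: the annual depreciation charge for a single year of R&D spend, and the profit implied by a 15% margin on the Year 2 revenue target. Straight-line depreciation is an assumption, since the plan gives only the 3–5 year period, and charges from overlapping years of spend are not modelled.

```python
M = 1_000_000  # pesos


def straight_line(amount_php: int, years: int) -> float:
    """Annual straight-line depreciation charge for one capitalized amount."""
    return amount_php / years


def profit_at_margin(revenue_php: int, margin_pct: int) -> float:
    """Profit implied by a percentage margin on a revenue figure."""
    return revenue_php * margin_pct / 100


# One year of R&D spend (₱18M to ₱19M) written off over 3 to 5 years.
lowest_charge = straight_line(18 * M, 5)   # smaller spend, slowest write-off
highest_charge = straight_line(19 * M, 3)  # larger spend, fastest write-off

# Profit implied by the ~15% target on the Year 2 revenue range.
profit_low = profit_at_margin(20 * M, 15)
profit_high = profit_at_margin(30 * M, 15)

print(f"Depreciation per year: ₱{lowest_charge:,.0f} to ₱{highest_charge:,.0f}")
print(f"Profit at 15% margin: ₱{profit_low:,.0f} to ₱{profit_high:,.0f}")
```

Under these assumptions, one year's R&D spend produces a depreciation charge of roughly ₱3.6M to ₱6.3M per year, and a 15% margin on ₱20M–₱30M corresponds to ₱3.0M–₱4.5M.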

11. Operational Efficiency Analysis

11.1 Design Automation (The 10x Lever)

Metric | Traditional | CMM Powered
Annual Talent Spend | ₱16M (30 Engineers) | ₱16M (Same Team, Supercharged)
Output Equivalent | ₱16M Value | ₱50M Value (3.1x Realized Gain)
Production Speed | 1x (Manual) | 10x (Parametric Gen + Auto-Paperwork)
Cost Per Revision | High (Man-hours) | Near Zero (Compute-seconds)

11.2 Facilities & Maintenance (The 2x Lever)

Metric | Traditional | CMM Powered
Annual Talent Spend | ₱10M (20 Staff) | ₱10M (Same Team, Optimized)
Output Equivalent | ₱10M Value | ₱20M Value (2x Realized Gain)
Key Efficiency | Manual Inspection | Vision Model Scanning & Auto-Ticketing

11.3 Robotics Frontier (Future CapEx Reduction)

The Godot Division (~₱2M/year) simulates specific work environments for CGG and partners. Environment LoRAs allow robots to “know” a facility before physically entering. Phase 1: Navigation, scanning, object retrieval. Phase 2: Toxic/hazardous tasks.

12. Immediate Action Items

  1. Strategic Briefing: Present revised Project OpenCoder roadmap to core leadership and technical leads
  2. Account Setup: Register Hugging Face organization account (comfac-cmm) and AMD Developer accounts for lead engineers
  3. QA Team Formation: Appoint QA Lead, recruit 2 Data Curators, identify first school partner for pilot cohort
  4. LoRA Package Registry Schema: Define v0.1 Package Card format, dataset structure, and evaluation harness specifications
  5. Hardware Procurement: Purchase initial RX 7600 XT; install ROCm and validate training pipeline
  6. Data Collection Sprint: Begin exporting Comfac Git repos, Frappe configurations, and internal documentation for training dataset v1
  7. Synthetic Data Generation: Commission first batch of PRDs and Frappe code samples from Claude/GPT-4 for training data
  8. Execution & Monitoring: Launch Stage 1 Benchmarking with weekly reporting cadence

Appendix: The Matrix Model Assets

FreeCAD Matrix Models: https://sites.comfac.net/freecad.html

Models designed to draw and create 3D parametric models.

Proprietary Definition

The “Secret Sauce” of the CMM Business Unit is not open-source code, but the Technique:

  1. Order of Operations: Specific sequence for manufacturing steps
  2. Business Processes: How design moves from prompt to physical Bill of Materials
  3. Internal Workflow: Automation of detailing for Fitout Construction, Engineering Schematics, Container Customization (SBC), and Open Source Ecology Production
  4. LoRA Package Registry: The composable, versioned collection of domain expertise LoRAs that encode Comfac’s competitive advantage in deployable model form