Project OpenCoder: AI Independence Initiative

PROJECT OPENCODER

Strategic AI Independence Initiative

Revised Plan v2.0

Prepared for:

Comfac Technology Group & Cornersteel Systems Corp.


1. Executive Summary

The Pivot: Project OpenCoder establishes the Comfac Matrix Model (CMM) Business Unit—a proprietary engine that manufactures intelligence for software development, physical production (via FreeCAD/Python), and enterprise automation. Rather than renting AI at $1,200/yr/seat, Comfac will build, train, and deploy its own specialized models, reducing per-seat cost to $200/year while creating a new revenue-generating division.

1.1 Strategic Pillars

Open Source Core: Release stable models publicly, cultivating a community of students and developers who become a distributed R&D department. Partner with schools to create a QA pipeline that continuously improves model quality.

Partner Premium: Reserve cutting-edge models, trained on the latest data with faster iteration, for Comfac and paying partners (Cornersteel, government agencies).

LoRA Package Registry: Introduce a structured LoRA Package Registry that packages domain expertise as composable, versionable LoRA modules. Each LoRA Package encapsulates PRD writing parameters, app architecture patterns, and validated QA benchmarks for a specific use case.

1.2 Financial Model

Item | Detail
Operational Expenditure | ₱15M per year upon release of Open Source and Cutting Edge models
Partner Commitment | ₱5M/year in Partner Access fees (less than $100,000 USD/year)
Partner ROI | Unified memory AI PCs + full-sized models + optimized tiny-to-small AIs for departmental use
Overage Billing | ₱5,000 per hour beyond commitment
Target Per-Seat Cost | $200/year (down from $1,200/year)

2. Platform Accounts & Development Infrastructure

Before any model training or fine-tuning can begin, Comfac must establish accounts on key AI development platforms. These accounts provide access to model weights, training infrastructure, community resources, and hardware-specific toolchains.

2.1 Hugging Face Account Setup

Hugging Face is the central hub for open-weight model distribution, dataset hosting, and community collaboration. A Comfac organizational account is essential for the following reasons:

  • Model Access: Download Qwen 2.5 Coder variants (3B, 7B, 32B) and accept any license or gating terms attached to them. License terms can differ by model size, so check each model card before commercial use. Certain models require organizational verification for commercial use.
  • Dataset Hosting: Host proprietary training datasets (Frappe code patterns, FreeCAD scripts, internal SOPs) as private repositories. Version control training data alongside model checkpoints.
  • Model Publishing: Publish stable CMM releases for community consumption. Track downloads, issues, and community contributions to stable models.
  • Spaces & Inference: Use Hugging Face Spaces to host demo inference endpoints for partner evaluation and school QA testing.
  • Training with AutoTrain / TRL: Access Hugging Face’s training libraries (TRL, PEFT) and the Unsloth integration, all of which are tightly coupled to the HF ecosystem.

Account Tiers

Tier | Cost | Recommendation
Free (Individual) | $0 | Sufficient for initial benchmarking (Stage 1)
Pro (Individual) | $9/month | For lead MMD/MMS researchers needing priority inference
Organization | Free (public repos) / $20/user (private) | Required for Stage 2+ with private dataset repos

2.2 AMD Developer Account & ROCm Ecosystem

Since Comfac’s GPU strategy centers on AMD Radeon consumer cards, an AMD Developer account is critical for accessing ROCm (Radeon Open Compute) tooling:

  • ROCm SDK Access: ROCm is AMD’s open-source GPU compute platform—equivalent to NVIDIA’s CUDA. It provides PyTorch/ROCm builds, HIP compiler toolchains, and GPU profiling tools necessary for LoRA fine-tuning on Radeon hardware.
  • Hardware Compatibility: The RX 7600 XT and RX 7800 XT require ROCm 6.x+ for stable PyTorch training. AMD Developer forums and early-access drivers often resolve compatibility issues weeks before public release.
  • MI-Series Migration Path: As Comfac scales, AMD’s MI250/MI300 accelerators offer a professional upgrade path. Developer account holders get priority access to documentation, benchmarks, and partner pricing.
  • Unified Memory APU Roadmaps: AMD’s Strix Point and subsequent APUs with shared CPU/GPU memory are central to Comfac’s next-gen strategy. Developer accounts provide early access to XDNA NPU SDKs.
  • Bug Reporting & Community: Direct channels to AMD engineers for ROCm issues specific to consumer Radeon cards (which are less tested than MI-series for ML workloads).

Setup Action Items

  1. Register AMD Developer account at developer.amd.com
  2. Install ROCm 6.x on Ubuntu 22.04/24.04 development machines
  3. Validate PyTorch-ROCm build against RX 7600 XT (gfx1102 target)
  4. Subscribe to AMD ROCm GitHub releases for driver/compatibility updates
  5. Join AMD Developer community forums—flag Comfac as AI inference + training use case for potential partnership outreach
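Action item 3 above can be approached with a small script. The following is a minimal sketch, assuming a PyTorch build with ROCm support has been installed; the function name `rocm_report` is hypothetical. On ROCm builds, PyTorch exposes AMD GPUs through the `torch.cuda` namespace and sets `torch.version.hip`. The `gcnArchName` property is read defensively because it may be absent on some builds.

```python
"""Report whether the installed PyTorch build can see a ROCm GPU."""


def rocm_report() -> dict:
    """Collect basic facts about the local PyTorch/ROCm setup."""
    report = {
        "torch_installed": False,
        "hip_version": None,
        "gpu_available": False,
        "device_name": None,
        "arch": None,
    }
    try:
        import torch  # imported lazily so the script also runs without PyTorch
    except ImportError:
        return report

    report["torch_installed"] = True
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    report["hip_version"] = getattr(torch.version, "hip", None)
    # ROCm devices are exposed through the torch.cuda namespace.
    report["gpu_available"] = bool(torch.cuda.is_available())
    if report["gpu_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        props = torch.cuda.get_device_properties(0)
        # Assumption to verify: an RX 7600 XT should report a gfx1102 target.
        report["arch"] = getattr(props, "gcnArchName", None)
    return report


if __name__ == "__main__":
    for key, value in rocm_report().items():
        print(f"{key}: {value}")
```

A `hip_version` of `None` with `torch_installed` set to `True` suggests that a CPU-only or CUDA build was installed instead of the ROCm build.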

3. The LoRA Package Registry

A LoRA Package is a composable, versionable LoRA adapter module paired with its training data, evaluation benchmarks, and deployment configuration. The LoRA Package Registry is the organizational system that catalogs, versions, and chains these Packages for different use cases.

3.1 What is a LoRA Package?

Each LoRA Package consists of the following components packaged as a single versioned unit:

Component | Description
LoRA Weights | The fine-tuned adapter weights for a specific domain (e.g., frappe-erpnext-v1.2)
Training Dataset | Curated instruction/completion pairs, code samples, PRD templates, and architecture patterns
Eval Benchmarks | Automated test cases that measure accuracy, code correctness, and domain knowledge retention
PRD Parameters | Structured prompting templates that define how the model writes PRDs, specs, and documentation for this domain
Package Card | Metadata: version, dependencies, compatible base models, author, QA status, deployment notes
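One way to picture the Package Card is as a small typed record that serializes to JSON. This is only a sketch: the class name, field names, and example values below are assumptions for illustration, since the v0.1 schema has not been defined yet.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PackageCard:
    """Metadata for one LoRA Package (field names are illustrative)."""

    name: str
    version: str
    base_models: List[str]
    author: str
    qa_status: str = "dev"  # hypothetical states: "dev" or "stable"
    dependencies: List[str] = field(default_factory=list)
    deployment_notes: str = ""

    def to_json(self) -> str:
        """Serialize the card with stable key order for clean diffs."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


card = PackageCard(
    name="frappe-erpnext",
    version="1.2",
    base_models=["Qwen2.5-Coder-3B-Instruct"],
    author="Comfac CMM",
    dependencies=["prd-writing"],  # hypothetical dependency
)
print(card.to_json())
```

Keeping the card as plain JSON means it can be versioned next to the adapter weights and read by any tool in the pipeline.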

3.2 Package Categories

A. Architecture Packages (App Framework Patterns)

These Packages encode deep understanding of specific application frameworks—not just syntax, but idiomatic patterns, project structure conventions, and deployment workflows.

Package Name | Domain Coverage | Training Sources
frappe-erpnext | ERPNext customization, DocType creation, server/client scripts, Print Formats, custom workflows, Frappe hooks | Comfac ERPNext repos, Frappe Framework docs, community apps, Frappe School tutorials
frappe-app-dev | Full Frappe app creation from bench init to deployment, API design, Jinja templating, bench commands | Frappe app scaffolds, published Frappe apps on GitHub, internal Comfac Frappe apps
onlyoffice-plugin | OnlyOffice plugin architecture, macro development, document builder SDK, editor customization | OnlyOffice SDK docs, existing plugin repos, Comfac OnlyOffice configurations
nextcloud-app | NextCloud app development, OCS API, DAV integration, notification system | NextCloud developer docs, community app repos
freecad-python | FreeCAD scripting, workbench creation, parametric modeling API, macro development | FreeCAD Python docs, wiki examples, Comfac CAD scripts

B. Product Packages (Internal Apps)

These Packages specialize the model for extending and maintaining Comfac’s proprietary applications:

Product | Package Focus | Key Training Material
Synx | Worker scheduling algorithms, class schedule optimization, workload prediction models, manpower WBS documents | Synx codebase, scheduling domain papers, school timetabling patterns
Secada | Document management workflows, OCR/image recognition pipelines, metadata extraction, compliance tagging | Secada repos, PaperlessNG patterns, Philippine regulatory document formats
Steward | Home/facility automation, solar panel monitoring, IoT device integration, energy dashboard UI, panel board tracking | Steward codebase, Home Assistant integration patterns, energy monitoring APIs
Synopsis | Multi-platform message consolidation, search/indexing of Viber/email/Telegram/Messenger, report generation from conversations | Synopsis architecture docs, messaging API schemas, Comfac communication workflows

C. Process Packages (Cross-Cutting Capabilities)

These Packages encode general engineering practices that apply across multiple architectures:

  • PRD Writing Package: Structured product requirements generation following Comfac’s template parameters—user stories, acceptance criteria, technical constraints, and dependency mapping.
  • Data Migration Package: ETL patterns for migrating data into ERPNext—spreadsheet cleanup, field mapping, validation rules.
  • API Integration Package: RESTful and webhook patterns for connecting Comfac services to third-party systems.
  • DevOps Package: Docker containerization, Nginx reverse proxy configuration, self-hosted deployment patterns for Comfac’s infrastructure.

3.3 Package Composition & Chaining

The real power of the LoRA Package Registry emerges when Packages are composed. For a given task, the system loads the base Qwen 2.5 Coder model, then applies one or more LoRA Packages in sequence:

Example: "Create a Frappe app for Synx’s new shift-swap feature" would chain: frappe-app-dev + synx-product + prd-writing Packages, giving the model deep knowledge of Frappe conventions, Synx’s specific domain, and structured requirements output.

Package composition is managed through a configuration file that specifies load order, weight blending ratios, and conflict resolution when Packages touch overlapping domains.
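The composition file described above could take a shape like the following. This is a sketch under stated assumptions: the keys (`order`, `weight`, `domains`), the blend values, and both helper functions are hypothetical, and the snippet only resolves load order, normalizes blend ratios, and reports overlapping domains. It does not merge any weights.

```python
from typing import Dict, List

# Hypothetical composition config for the shift-swap example above.
COMPOSITION = {
    "base_model": "Qwen2.5-Coder-3B-Instruct",
    "packages": [
        {"name": "synx-product", "order": 2, "weight": 0.3,
         "domains": ["scheduling", "frappe"]},
        {"name": "frappe-app-dev", "order": 1, "weight": 0.5,
         "domains": ["frappe"]},
        {"name": "prd-writing", "order": 3, "weight": 0.2,
         "domains": ["prd"]},
    ],
}


def load_plan(config: dict) -> List[dict]:
    """Return packages in load order with weights normalized to sum to 1."""
    packages = sorted(config["packages"], key=lambda p: p["order"])
    total = sum(p["weight"] for p in packages)
    if total <= 0:
        raise ValueError("blend weights must sum to a positive value")
    return [{**p, "weight": p["weight"] / total} for p in packages]


def find_overlaps(plan: List[dict]) -> Dict[str, List[str]]:
    """Map each domain claimed by more than one package to those packages."""
    owners: Dict[str, List[str]] = {}
    for package in plan:
        for domain in package["domains"]:
            owners.setdefault(domain, []).append(package["name"])
    return {d: names for d, names in owners.items() if len(names) > 1}


plan = load_plan(COMPOSITION)
print([p["name"] for p in plan])
print(find_overlaps(plan))
```

The overlap report is where a conflict-resolution policy would plug in. For the actual weight blending, a library routine such as PEFT's `add_weighted_adapter` is one candidate to evaluate.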

4. QA Team & School Partnership Program

Quality assurance for AI model training is fundamentally different from traditional software QA. The QA team’s primary function is generating, validating, and curating training data—ensuring that every Package in the library meets accuracy and reliability benchmarks before deployment.

4.1 QA Team Structure

Role | Responsibility | Output | Count
QA Lead | Defines evaluation criteria per Package, manages benchmark suites, approves Package promotions from dev to stable | Package scorecards, release approvals, regression test suites | 1 (internal)
Data Curator | Reviews and cleans training pairs from all sources (Git logs, school submissions, internal docs). Ensures format compliance. | Validated datasets in HF-compatible format | 2 (internal)
School QA Cohort | Student testers who run model outputs against real assignments, flag hallucinations, and submit correction pairs | Error logs, correction datasets, usage reports | 10-20 (school partners)
Domain Expert Reviewers | Internal engineers who validate model outputs for their specific domains (Frappe, FreeCAD, etc.) | Domain accuracy scores, corrected code samples | 3-5 (internal, rotating)

4.2 School Partnership Model

Schools provide the distributed labor force for data generation and QA, receiving free AI tooling in return. This is the core of the Open Source Ecology strategy.

The Exchange

  • Schools receive: Free unlimited access to Stable model releases, curriculum integration support, exposure to production AI workflows, and certificates of participation.
  • Comfac receives: Error logs, corrected outputs, new training pairs (student code + corrections), usage telemetry, and domain-specific test cases across diverse real-world scenarios.

QA Workflow for School Cohorts

  1. Assignment Distribution: QA Lead distributes model-generated outputs to student cohorts along with evaluation rubrics.
  2. Testing & Annotation: Students run the outputs in their own environments, annotate errors (hallucinated APIs, incorrect syntax, wrong framework patterns), and submit correction pairs.
  3. Data Curation: Internal Data Curators validate student submissions, clean formatting, and merge into training datasets on Hugging Face.
  4. Retraining Cycle: Updated datasets trigger LoRA retraining. New Package version is benchmarked against the previous version.
  5. Release: If benchmarks improve, the new Package version is promoted to Stable and redistributed to schools, completing the feedback loop.
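Steps 4 and 5 imply a promotion gate that compares benchmark scores between Package versions. A minimal sketch follows. The exact rule is an assumption (no benchmark may regress beyond a tolerance, and the mean must strictly improve), and the benchmark names and scores are illustrative values, not measured results.

```python
from typing import Dict


def should_promote(previous: Dict[str, float], candidate: Dict[str, float],
                   tolerance: float = 0.0) -> bool:
    """Return True when the candidate version beats the previous one.

    Assumed rule: no shared benchmark may drop by more than `tolerance`,
    and the mean score across shared benchmarks must strictly improve.
    """
    shared = sorted(set(previous) & set(candidate))
    if not shared:
        return False  # nothing comparable, so do not promote
    no_regression = all(candidate[k] >= previous[k] - tolerance for k in shared)
    mean_prev = sum(previous[k] for k in shared) / len(shared)
    mean_cand = sum(candidate[k] for k in shared) / len(shared)
    return no_regression and mean_cand > mean_prev


# Illustrative scores only.
v1_2 = {"frappe_doctype": 0.71, "server_script": 0.64}
v1_3 = {"frappe_doctype": 0.78, "server_script": 0.66}
print(should_promote(v1_2, v1_3))
```

Requiring no per-benchmark regression, rather than only a better average, guards against a new version that gains in one area by forgetting another.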

4.3 Using Other Models to Train Qwen

A critical acceleration strategy is using larger, more capable models (Claude, GPT-4, DeepSeek-V3, Qwen 72B) to generate synthetic training data for the smaller Qwen 2.5 3B target model. This is the knowledge distillation pipeline:

  • PRD Generation: Use Claude or GPT-4 to generate high-quality PRDs for Frappe apps, OnlyOffice plugins, and internal products. These become the gold-standard training pairs for the PRD Writing Package.
  • Code Correction: Feed Qwen 3B’s incorrect outputs to a larger model for correction. The (incorrect, corrected) pairs become DPO (Direct Preference Optimization) training data.
  • Architecture Explanation: Use larger models to generate detailed explanations of Frappe/OnlyOffice/FreeCAD code patterns, which become instruction-tuning data for the smaller model.
  • Benchmark Generation: Use multiple large models to generate diverse test cases and expected outputs for each Package’s evaluation suite.
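The code-correction bullet above can be made concrete with a small converter that turns (incorrect, corrected) pairs into preference records. This is a sketch: the input keys and the function name are hypothetical, and the `prompt`/`chosen`/`rejected` layout follows the preference format commonly used for DPO training (for example by TRL), which should be confirmed against the trainer version in use.

```python
import json
from typing import Iterable, List


def to_dpo_records(samples: Iterable[dict]) -> List[dict]:
    """Convert (prompt, incorrect, corrected) samples into preference records.

    Samples whose correction is identical to the original output carry no
    preference signal and are skipped.
    """
    records = []
    for sample in samples:
        rejected = sample["incorrect"].strip()
        chosen = sample["corrected"].strip()
        if chosen and chosen != rejected:
            records.append(
                {"prompt": sample["prompt"], "chosen": chosen, "rejected": rejected}
            )
    return records


samples = [
    {   # the small model invented an API; the larger model corrected it
        "prompt": "Fetch a Customer document by name in Frappe.",
        "incorrect": "doc = frappe.fetch('Customer', name)",
        "corrected": "doc = frappe.get_doc('Customer', name)",
    },
    {   # already correct, so it yields no preference pair
        "prompt": "Print hello.",
        "incorrect": "print('hello')",
        "corrected": "print('hello')",
    },
]
for record in to_dpo_records(samples):
    print(json.dumps(record))  # one JSON object per line (JSONL)
```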

5. Qwen 2.5 3B Coder Instruct: Training Strategy

The target model—Qwen 2.5 Coder 3B Instruct—is chosen for its balance of capability and deployability on consumer AMD hardware. The training strategy involves layered LoRA specialization, where each layer adds domain expertise without degrading general coding ability.

5.1 Training Phases

Phase 1: Base Calibration

Establish baseline performance on general coding tasks. Benchmark against Claude Sonnet on standardized tests (HumanEval, MBPP, custom Frappe tasks). Document the "Weakness Vocabulary"—specific concepts, APIs, and patterns where the base model fails.

Phase 2: PRD Writing LoRA

The PRD Writing Package is trained first because it produces structured outputs that become training data for subsequent Packages. Training parameters:

Parameter | Specification
Output Format | Structured markdown with sections: Overview, User Stories, Acceptance Criteria, Technical Constraints, API Contracts, Data Models, Dependencies, Risk Assessment
Architecture Awareness | PRDs must reference target architecture (Frappe, OnlyOffice, NextCloud, etc.) and include framework-specific implementation notes
Scope Constraints | Model must generate PRDs scoped to single-sprint deliverables (~2 weeks), decomposing larger features into phased PRDs
Training Data Source | 100+ PRDs generated by Claude/GPT-4 for real Comfac features, reviewed and corrected by internal engineers
Evaluation Criteria | Completeness score, feasibility rating (can a developer build from this PRD alone?), architecture alignment
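The completeness score listed under Evaluation Criteria could start as a simple structural check against the eight required sections. The sketch below assumes each section appears as a markdown heading with exactly the listed title; the scoring rule (fraction of sections present) and the sample draft are illustrative, not a definition of the final metric.

```python
import re

REQUIRED_SECTIONS = [
    "Overview", "User Stories", "Acceptance Criteria", "Technical Constraints",
    "API Contracts", "Data Models", "Dependencies", "Risk Assessment",
]


def completeness(prd_markdown: str) -> float:
    """Return the fraction of required sections present as markdown headings."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}\s+(.+?)\s*$", prd_markdown,
                                 flags=re.MULTILINE)
    }
    found = [s for s in REQUIRED_SECTIONS if s.lower() in headings]
    return len(found) / len(REQUIRED_SECTIONS)


draft = """# Overview
Shift-swap feature for Synx.
## User Stories
As a worker, I can request a swap.
## Acceptance Criteria
A swap needs manager approval.
## Dependencies
Frappe notifications.
"""
print(completeness(draft))  # 4 of the 8 required sections are present
```

A structural check like this only catches missing sections. The feasibility rating and architecture alignment criteria still need human or model-based review.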

Phase 3: Architecture-Specific LoRAs

With the PRD Package established, train architecture Packages using a combination of:

  • Synthetic instruction pairs from larger models (Claude generates Frappe code, model learns patterns)
  • Real codebase extraction from Comfac Git repositories (code patterns, commit messages, PR descriptions)
  • Documentation distillation from official framework docs (Frappe, OnlyOffice SDK, FreeCAD API)
  • QA correction pairs from school cohorts and internal domain experts

Phase 4: Product-Specific LoRAs

These are the most specialized Packages, trained on Comfac’s proprietary codebases. Access is restricted to the Cutting Edge tier:

  • Synx: Scheduling optimization, workforce management patterns, school timetabling algorithms
  • Secada: Document processing pipelines, OCR integration, compliance metadata schemas
  • Steward: IoT device control, solar monitoring dashboards, energy tracking data models
  • Synopsis: Multi-platform message aggregation, search indexing, conversation-to-report generation

5.2 LoRA Training Configuration

Parameter | 3B Model | 32B Model (Future)
LoRA Rank (r) | 64 | 128
LoRA Alpha | 128 | 256
Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | Same
Training Framework | Unsloth (4-bit quantized training) | Unsloth or Axolotl
Hardware Requirement | 1x RX 7600 XT (16GB) | Multi-GPU or MI250
Epochs per Package | 3-5 (with early stopping) | 2-3
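To get a feel for what rank 64 on all seven target modules means for the 3B model, the arithmetic below estimates the LoRA scaling factor and the number of trainable adapter parameters. The layer dimensions are assumptions about the Qwen 2.5 3B architecture and should be confirmed against the model's `config.json`. The scaling factor uses the standard LoRA convention of alpha divided by rank.

```python
# Assumed dimensions for Qwen 2.5 Coder 3B; confirm against config.json.
HIDDEN = 2048         # hidden_size
INTERMEDIATE = 11008  # intermediate_size of the MLP
KV_DIM = 256          # num_key_value_heads * head_dim (grouped-query attention)
LAYERS = 36           # num_hidden_layers

RANK, ALPHA = 64, 128  # values from the 3B column of the table above

# (input features, output features) for each targeted projection.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank: int, shapes: dict, layers: int) -> int:
    """Count adapter parameters: each module adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return layers * per_layer


scaling = ALPHA / RANK
trainable = lora_params(RANK, MODULE_SHAPES, LAYERS)
print(f"scaling factor alpha/r = {scaling}")
print(f"trainable LoRA parameters = {trainable:,}")
print(f"adapter size at 16-bit precision = {trainable * 2 / 1e6:.0f} MB")
```

Under these assumptions each Package's adapter holds roughly 120 million parameters, which matters for storage and download size once many Packages are published in the registry.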

6. Key Architecture Deep Dives

6.1 Frappe ERPNext Customization & App Creation

This is Comfac’s highest-priority LoRA Package given its direct revenue impact. The Frappe ecosystem requires deep model understanding across multiple layers:

ERPNext Customization Package Coverage

  • DocType creation and modification (fields, naming rules, permissions, workflows)
  • Server Scripts (Python): Whitelisted methods, scheduled tasks, document events, custom API endpoints
  • Client Scripts (JavaScript): Form manipulation, custom buttons, dynamic link filters, real-time updates
  • Print Format Designer: Custom Jinja templates for invoices, reports, and compliance documents (including Philippine regulatory formats)
  • Workflow automation: Multi-step approval chains, state transitions, email notifications
  • Data migration: Importing from legacy systems, spreadsheet cleanup, field mapping validation

Frappe App Development Package Coverage

  • Full app scaffolding: bench new-app through to production deployment
  • Frappe hooks system: app_include_js/css, doc_events, scheduler_events, override_whitelisted_methods
  • REST API design patterns within Frappe’s framework constraints
  • Web page and portal development using Frappe’s website module
  • Integration patterns: Frappe’s OAuth, webhook, and background job systems
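As an illustration of the hooks system the Package must learn, here is a sketch of a `hooks.py` for a hypothetical app named `shift_swap`. The hook names come from the list above; the app name, DocType, asset paths, and dotted handler paths are invented for the example and do not refer to existing Comfac code.

```python
# hooks.py for a hypothetical Frappe app named "shift_swap"
app_name = "shift_swap"
app_title = "Shift Swap"

# Assets injected into the Desk UI (paths are illustrative).
app_include_js = "/assets/shift_swap/js/shift_swap.js"
app_include_css = "/assets/shift_swap/css/shift_swap.css"

# Run a handler whenever a document of the given DocType changes.
doc_events = {
    "Shift Request": {
        "on_update": "shift_swap.events.notify_manager",
    }
}

# Background jobs run by the Frappe scheduler.
scheduler_events = {
    "daily": ["shift_swap.tasks.expire_stale_requests"],
}

# Replace a built-in whitelisted method with an app-specific one.
override_whitelisted_methods = {
    "frappe.desk.doctype.event.event.get_events": "shift_swap.api.get_events",
}
```

Training pairs built from files like this teach the model that hooks are declared as plain module-level variables that point to dotted Python paths, which is a frequent source of hallucinated syntax.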

6.2 OnlyOffice Plugin & Customization

OnlyOffice is the document editing backbone for Comfac’s self-hosted infrastructure. The model needs Package coverage across:

  • Plugin architecture: manifest.json structure, plugin lifecycle, event hooks, UI panel creation
  • Document Builder API: Programmatic document creation, template automation, batch processing
  • Macro development: JavaScript macros for spreadsheet automation (replacing Google Apps Script workflows)
  • Editor customization: Custom toolbars, context menus, and formatting presets for Philippine business documents
  • Currency/locale handling: Philippine peso formatting defaults, date formats, and number conventions

6.3 Internal App Modifications (Synx, Secada, Steward, Synopsis)

Each internal product has its own codebase, architecture decisions, and domain logic that the model must deeply understand. These are Cutting Edge-only Packages with restricted access.

Synx (Scheduling): The model must understand constraint-satisfaction algorithms for timetabling, worker availability matrices, shift-swap logic, and the specific data models Synx uses for workload prediction. Training data includes historical scheduling decisions and their outcomes.

Secada (Document Management): Beyond basic CRUD, the model needs OCR pipeline understanding—how Secada processes scanned Philippine government documents, extracts compliance metadata, and routes documents through approval workflows. Image recognition for document classification is a key differentiator.

Steward (Facility Automation): IoT integration patterns, MQTT/Zigbee protocol handling, solar panel monitoring data models (tracking every panel in CGG down to individual panel boards), and energy dashboard visualization logic.

Synopsis (Communication Hub): Multi-platform API integration (Viber Business, email IMAP/SMTP, Telegram Bot API, Messenger Platform), message deduplication across platforms, full-text search indexing, and AI-assisted conversation summarization for report generation.
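Message deduplication across platforms, mentioned for Synopsis above, can be illustrated with a content fingerprint. This is a minimal sketch, not Synopsis's actual method: it assumes sender identities have already been resolved to a common name across platforms, and the message fields (`sender`, `text`, `platform`) are hypothetical.

```python
import hashlib
import re
from typing import Iterable, List


def fingerprint(sender: str, text: str) -> str:
    """Hash the sender plus whitespace-normalized, lowercased text."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    payload = f"{sender.strip().lower()}|{normalized}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(messages: Iterable[dict]) -> List[dict]:
    """Keep the first occurrence of each (sender, text) fingerprint."""
    seen = set()
    unique = []
    for message in messages:
        key = fingerprint(message["sender"], message["text"])
        if key not in seen:
            seen.add(key)
            unique.append(message)
    return unique


inbox = [
    {"platform": "viber", "sender": "Ana", "text": "Site visit moved to 3 PM."},
    {"platform": "email", "sender": "ana", "text": "Site visit  moved to 3 pm. "},
    {"platform": "telegram", "sender": "Ben", "text": "Noted, thanks."},
]
for message in deduplicate(inbox):
    print(message["platform"], "-", message["text"])
```

Exact fingerprints only catch copies that differ in case or spacing. Forwarded or lightly edited messages would need a fuzzier similarity measure.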

7. Core Products & Revenue Strategy (2026-2027)

7.1 The Coder Matrix (3B-16B Coding LoRA)

A fine-tuned coding assistant replicating Claude Code Premium capabilities without per-seat cost. Immediate 2026 impact: enables Comfac to rapidly turn around software solutions for CSC Facilities, Frappe ecosystem, and internal products.

7.2 The FreeCAD Matrix (CAD Automation)

Parametric models for automated 3D asset creation. Full benefits are expected by the end of 2026 as workflows migrate to FreeCAD. Key innovations include the Path Procedural Design Workbench (engineers draw paths, AI generates pipes/ducts/cabling with supports), Civil Works Automation (procedural room design), and instant generation of 3D models, 2D shop drawings, and BOMs.

7.3 The Marketplace

  • Rent spare compute from Cabuyao Solar Data Center to students and startups
  • "Plug & Play" AI Rentals for schools—pre-configured inference endpoints
  • LoRA Package Registry licensing—partners can purchase individual Packages rather than full platform access

8. Technical Architecture

8.1 Hardware Strategy

Level | Hardware | Price | Purpose
Entry | Radeon RX 7600 XT (16GB) | ~₱22,000 | Inference + LoRA training (3B models)
Production | Radeon RX 7800 XT (16GB) | ~₱35,000 | Faster inference + larger model support
Next-Gen | AMD APUs with NPU/TPU | TBD | Unified Memory for full-sized model deployment

Infrastructure: The Solar Data Center (Cabuyao) utilizes Cornersteel’s solar capacity to run inference at near-zero OpEx.

8.2 Software Stack

Component | Technology
Base Model | Qwen 2.5 Coder 3B / 32B (via Hugging Face)
LoRA Packages | Custom adapters per domain, stored in LoRA Package Registry on Hugging Face (private org repos)
Training Framework | Unsloth + ROCm (AMD), with TRL/PEFT for training orchestration
Deployment | Ollama + OpenCode CLI (devs) / Custom WebUI (CAD operators / non-technical)
QA Pipeline | Automated eval harness + school cohort testing + domain expert review

9. Implementation Roadmap

Stage 1: Accounts, Benchmarking & Setup (Weeks 1-4)

  1. Set up Hugging Face organizational account; accept Qwen 2.5 model licenses
  2. Register AMD Developer account; install ROCm 6.x; validate PyTorch-ROCm on RX 7600 XT
  3. Purchase 1x RX 7600 XT (~₱22,000)
  4. Benchmark Qwen 2.5 3B against Claude Sonnet on: Laravel controller generation, Frappe DocType creation, basic FreeCAD scripting
  5. Deliverable: Weakness Vocabulary report + LoRA Package Registry v0.1 schema definition

Stage 2: First Packages & QA Foundation (Weeks 5-12)

  1. Train PRD Writing Package using Claude/GPT-4 generated training data, reviewed by internal team
  2. Train frappe-erpnext Package using Comfac Git exports + Frappe documentation distillation
  3. Establish QA pipeline: automated eval harness, internal reviewer rotation, benchmark scoring
  4. RAG Setup: Ingest Cornersteel machine manuals, SOPs, and Frappe docs into knowledge base
  5. Begin OnlyOffice plugin Package data collection

Stage 3: School Partnerships & Package Expansion (Months 3-6)

  1. Onboard first school partner cohort (10-20 students) with Stable model access
  2. Launch QA feedback loop: students test, annotate, submit corrections
  3. Train Product Packages: Synx, Secada, Steward, Synopsis (Cutting Edge only)
  4. Train freecad-python Package, begin FreeCAD Matrix prototype
  5. Hardware cascade: Senior staff upgrade to 16-20GB VRAM; juniors inherit 8-12GB units

Stage 4: Commercialization & Revenue (Month 6+)

  1. Launch Partner Access program (₱5M/year tier)
  2. Deploy LoRA Package Registry marketplace for individual Package licensing
  3. Scale school partnerships to additional institutions
  4. Begin visual model development (facility inspection, defect recognition)
  5. Rent spare Cabuyao compute to students and startups

10. Revenue Model & Unit Economics

10.1 No-SaaS Policy: The Open Source Support Model

Comfac does not operate a SaaS business. All Stable model weights and LoRA Packages are released as open source—anyone can download them, self-host them, and run them without paying Comfac a single peso. This is deliberate: it builds community, creates a distributed QA workforce, and establishes Comfac as the recognized authority on these models.

The revenue comes from the reality that Cutting Edge models are inherently unstable. They are experimental, frequently updated, may break between versions, and require expertise to deploy correctly. Organizations that want to run these models in production need support—configuration help, troubleshooting, version migration, hardware optimization, and domain-specific tuning. That support is what they pay for.

This follows the proven open-source business model used by Red Hat (RHEL), Canonical (Ubuntu Pro), Elastic, and others: the code is free, the expertise and stability guarantees cost money. Anyone can DIY it, but for non-technical organizations or those without specialized ML engineers, self-hosting Cutting Edge models without support is impractical.

10.2 Revenue Stream Breakdown

Revenue Stream | % of Revenue | Description
Supported Seats | ~20% | $200/year per supported seat. Each seat is a user or workstation running Cutting Edge models with Comfac support coverage. The fee reflects the expected support burden per seat: more users means more tickets, more version-migration assistance, and more hardware troubleshooting. Organizations self-hosting Stable models owe nothing. Organizations wanting Cutting Edge models with guaranteed uptime and support response pay per seat.
R&D Access (Partner Tier) | ~40% | Partners pay ₱5M/year (under $100K USD) for direct access to Comfac’s R&D pipeline: pre-release Cutting Edge models, dedicated compute provisioning, bespoke fine-tuning on partner data, and priority engineering support. Partners are funding, and benefiting from, ongoing research, not licensing a hosted product. Models run on partner-owned or Comfac-provisioned hardware, never on a Comfac cloud.
Custom LoRA Packages | ~40% | Project-based engagements to develop bespoke LoRA Packages for professionals and industrial clients. Comfac builds, trains, validates, and delivers a specialized tiny-to-small model customized for the client’s operational setting. The deliverable is the Package itself: the client owns and self-hosts it. Ongoing support is optional and billed separately.

10.3 Supported Seat Economics

The $200/year per-seat price is not a software license—it is a support contract. The models are free. What organizations pay for is:

  • Access to Cutting Edge model releases (not yet promoted to Stable—these are experimental, fast-moving, and may have breaking changes between versions)
  • Deployment support: Comfac engineers help configure ROCm, Ollama, hardware optimization, and LoRA Package loading for the client’s specific environment
  • Version migration: When Cutting Edge models update (which is frequent), supported seats get migration assistance—config changes, compatibility fixes, regression testing
  • Incident response: When something breaks in production—and with Cutting Edge models, things will break—supported seats get priority troubleshooting
  • Usage-based pricing rationale: More seats = more support requests = more Comfac engineer time. The per-seat fee is calibrated to the expected support load, not to software access

Compare: $200/year with Comfac support vs. $1,200/year for a commercial SaaS AI seat (Claude, Copilot Enterprise). Organizations that have in-house ML talent can run Stable models for free and never pay Comfac. Those who lack that expertise—or who need the latest Cutting Edge capabilities for competitive advantage—pay for support.

Deployment Tier | Supported Seats | Annual Support Revenue
Internal (CGG Group) | 50–100 seats | $10,000–$20,000
Partners (Cornersteel, Gov’t) | 100–300 seats | $20,000–$60,000
Schools & Community | Stable models: free / Cutting Edge: subsidized | $0–minimal (QA value offsets)
Commercial Clients | 50–200 seats | $10,000–$40,000
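The revenue ranges in the table follow directly from the seat counts and the $200/year price. The short calculation below reproduces them and adds the per-seat saving against the $1,200/year SaaS figure quoted above. Only the three tiers with fixed seat ranges are included, and the variable names are illustrative.

```python
SEAT_PRICE_USD = 200    # supported seat, per year
SAAS_PRICE_USD = 1200   # commercial SaaS seat, per year

# (minimum seats, maximum seats) per paid deployment tier.
TIERS = {
    "Internal (CGG Group)": (50, 100),
    "Partners (Cornersteel, Gov't)": (100, 300),
    "Commercial Clients": (50, 200),
}


def revenue_range(seats: tuple) -> tuple:
    """Return (low, high) annual support revenue in USD for a seat range."""
    low, high = seats
    return low * SEAT_PRICE_USD, high * SEAT_PRICE_USD


for tier, seats in TIERS.items():
    low, high = revenue_range(seats)
    print(f"{tier}: ${low:,} to ${high:,} per year")

total_low = sum(revenue_range(s)[0] for s in TIERS.values())
total_high = sum(revenue_range(s)[1] for s in TIERS.values())
saving = SAAS_PRICE_USD - SEAT_PRICE_USD

print(f"All paid tiers: ${total_low:,} to ${total_high:,} per year")
print(f"Saving per seat versus SaaS: ${saving:,} ({saving / SAAS_PRICE_USD:.0%})")
```

Across the three tiers this gives $40,000 to $120,000 per year in seat revenue, which is consistent with Supported Seats being the smallest of the three revenue streams.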

10.4 R&D Access: Subscribing to Research, Not Software

The R&D Access tier (₱5M/year per partner) is the anchor revenue stream. Partners are not licensing a product—they are funding and receiving early access to Comfac’s continuous research output:

  • Pre-release Cutting Edge models before they reach the general support tier
  • Dedicated compute provisioned on partner-owned or Comfac-provisioned hardware (not a hosted cloud service)
  • Bespoke fine-tuning on partner’s proprietary data—their SOPs, codebases, and domain knowledge become part of their private LoRA Packages
  • Priority engineering support with direct access to CMM Scientists and Designers
  • Early access to new model capabilities (vision, VLA, audio) as they reach usable state—partners help validate these in real environments

10.5 Custom LoRA Package Engagements

This stream represents the majority of projected use cases at scale. Industrial and professional clients need small, specialized models that run on modest hardware in operational settings—and they lack the ML expertise to build these themselves:

  • Facilities management firms: Vision + expert models for predictive maintenance, inspection automation, compliance reporting
  • Engineering consultancies: Specialized calculation models (HVAC sizing, electrical load, structural analysis) that run on sub-₱50K hardware
  • Government agencies: Compliance document processing, automated form filling, regulatory cross-referencing models
  • Manufacturing plants: Quality inspection models, SOP-aware troubleshooting assistants, production scheduling optimizers
  • Schools and training centers: Customized tutoring models, curriculum-aligned coding assistants, assessment automation

The deliverable is the Package itself—the client receives trained LoRA weights, deployment configuration, and documentation. They own it and self-host it. Ongoing support is a separate, optional contract. Each engagement produces reusable Package templates that reduce Comfac’s development cost for similar future clients.

10.6 Profitability Model

Item | Detail
Gross Revenue Target (Year 2) | ₱20M–₱30M across all three streams
R&D Expenditure | ₱15M operational + ₱3–4M model-building = ₱18–19M/year
R&D Treatment | Capitalized and depreciated over 3–5 years (model weights, training infrastructure, Package IP)
Target Equity/Profit Margin | ~15% after full R&D depreciation
Margin Expansion Path | Custom Package templates compound: each new client in a similar vertical costs less to serve. Solar data center drives inference OpEx toward zero. Support burden per seat decreases as models mature from Cutting Edge to Stable.

The 15% margin target is conservative and accounts for heavy reinvestment in the first 2–3 years. As the LoRA Package Registry matures, more models graduate from Cutting Edge to Stable (reducing support load), and template reuse increases, margins should expand to 20–25% by Year 3–4.

11. Operational Efficiency Analysis

11.1 Design Automation (The 10x Lever)

Metric | Traditional | CMM Powered
Annual Talent Spend | ₱16M (30 Engineers) | ₱16M (Same Team, Supercharged)
Output Equivalent | ₱16M Value | ₱50M Value (3.1x Realized Gain)
Production Speed | 1x (Manual) | 10x (Parametric Gen + Auto-Paperwork)
Cost Per Revision | High (Man-hours) | Near Zero (Compute-seconds)

11.2 Facilities & Maintenance (The 2x Lever)

Metric | Traditional | CMM Powered
Annual Talent Spend | ₱10M (20 Staff) | ₱10M (Same Team, Optimized)
Output Equivalent | ₱10M Value | ₱20M Value (2x Realized Gain)
Key Efficiency | Manual Inspection | Vision Model Scanning & Auto-Ticketing

11.3 Robotics Frontier (Future CapEx Reduction)

The Godot Division (~₱2M/year) simulates specific work environments for CGG and partners. Environment LoRAs allow robots to “know” a facility before physically entering. Phase 1: Navigation, scanning, object retrieval. Phase 2: Toxic/hazardous tasks.

12. Immediate Action Items

  1. Strategic Briefing: Present revised Project OpenCoder roadmap to core leadership and technical leads
  2. Account Setup: Register Hugging Face organization account (comfac-cmm) and AMD Developer accounts for lead engineers
  3. QA Team Formation: Appoint QA Lead, recruit 2 Data Curators, identify first school partner for pilot cohort
  4. LoRA Package Registry Schema: Define v0.1 Package Card format, dataset structure, and evaluation harness specifications
  5. Hardware Procurement: Purchase initial RX 7600 XT; install ROCm and validate training pipeline
  6. Data Collection Sprint: Begin exporting Comfac Git repos, Frappe configurations, and internal documentation for training dataset v1
  7. Synthetic Data Generation: Commission first batch of PRDs and Frappe code samples from Claude/GPT-4 for training data
  8. Execution & Monitoring: Launch Stage 1 Benchmarking with weekly reporting cadence

Appendix: The Matrix Model Assets

FreeCAD Matrix Models: https://sites.comfac.net/freecad.html

Models designed to draw and create 3D parametric models.

Proprietary Definition

The “Secret Sauce” of the CMM Business Unit is not open-source code, but the Technique:

  1. Order of Operations: Specific sequence for manufacturing steps
  2. Business Processes: How design moves from prompt to physical Bill of Materials
  3. Internal Workflow: Automation of detailing for Fitout Construction, Engineering Schematics, Container Customization (SBC), and Open Source Ecology Production
  4. LoRA Package Registry: The composable, versioned collection of domain expertise LoRAs that encode Comfac’s competitive advantage in deployable model form