#
Morfd
A composable img2img Stable Diffusion pipeline powered by LoRA, enabling authentic style transfer and creative flexibility for artists and developers.
#
Confidentiality Notice
This document and the information contained herein are confidential and intended solely for the use of the individual or entity to whom they are addressed. Unauthorized disclosure, copying, distribution, or reliance on the contents of this document is strictly prohibited. If you have received this document in error, please notify the sender immediately and delete all copies of the original message.
#
Overview
This project is a Temple Technology case study and proof of concept in end-to-end verifiable generative content. It explores atomic outputs, leaving the prompt, and the essence it captures, in the control of the creator: a web3-native system for building an economy of creators and their consumers.
In essence, Morfd is a selfie-remixing tool that performs true-to-style transfer, letting a user embody an artist’s style. The artist needs only to provide a small set of images to train a model, which is then placed in a predefined yet extensible pipeline, enabling mass inference for consumers.


#
Architecture
Morfd operates under an optimistic, asynchronous architecture to service both on-chain and off-chain applications. At its core, Morfd’s content is managed via morfd-cap, a content-addressable proxy that publishes and retrieves data via IPFS. By interfacing with both public and private IPFS nodes, Morfd ensures that creators’ training materials remain confidential (stored on private nodes) while still being verifiably referenced through their CIDs. Once content is properly addressed, Morfd can act in two main capacities: Training and Inference. During Training, Morfd ingests a collection of creator-provided assets, stored privately for protection; a morfd-agent then picks up the job to train a LoRA model. Upon completion, the LoRA model weights are hashed to form a verifiable proof of training (e.g., a commitment stored on-chain). During Inference, the same agent composes the public pipeline with the privately trained LoRA and input parameters to generate an image. If required, Morfd can incorporate cryptographic proofs or zero-knowledge techniques to confirm, without revealing private data, that a specific public pipeline was used in conjunction with a committed LoRA, thereby providing end-to-end integrity and trust in the final output.
#
Content Addressable Proxy Service
morfd-cap uses a content-addressable approach to cache data in cloud block storage before finalizing submissions to IPFS. During uploads, morfd-cap proxies the data to cloud storage and returns a precomputed CID prefix for immediate reference. Likewise, CIDs can be used to fetch data from morfd-cap, which maps them back to their underlying storage. This architecture provides a high-performance buffering layer between clients and IPFS and seamlessly interfaces with both public and private IPFS nodes.
- Submission Flow: Content is pre-hashed using CIDv1 and conditionally stored in IPFS, while in parallel being cached in a strongly consistent cloud block store.
- Retrieval Flow: Content is requested by CID and served from the cache; if it is not available there, the request falls back to IPFS.
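The two flows above can be sketched as a write-through cache in front of IPFS. The class and method names below are illustrative, not the actual morfd-cap API, and the CID computation is a toy stand-in for a real CIDv1:

```python
import hashlib

def content_id(data: bytes) -> str:
    """Toy stand-in for a CIDv1: a real implementation would use the
    multihash/multibase encodings from the multiformats spec."""
    return "bafy-" + hashlib.sha256(data).hexdigest()

class CapProxy:
    """Sketch of morfd-cap: write-through cache in front of IPFS."""
    def __init__(self, ipfs):
        self.cache = {}   # stands in for the cloud block store
        self.ipfs = ipfs  # stands in for an IPFS node

    def submit(self, data: bytes) -> str:
        cid = content_id(data)  # pre-hash before any network I/O
        self.cache[cid] = data  # strongly consistent cache write
        self.ipfs[cid] = data   # conditionally persist to IPFS
        return cid

    def retrieve(self, cid: str) -> bytes:
        if cid in self.cache:   # fast path: cloud block store
            return self.cache[cid]
        return self.ipfs[cid]   # fallback: IPFS
```

In a real deployment the cache write and the IPFS publish would happen in parallel, with the precomputed CID returned to the client immediately.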
#
Agents
morfd-agent is a container-based processing cluster designed to scale compute tasks horizontally across bare-metal hardware. Jobs sent to the cluster sit behind a global queue and load balancer that delegates tasks to free nodes. For efficiency, Knative will serve as the control plane, allowing scale-to-zero to save resources during downtime.
The morfd-agent will be orchestrated with Kubernetes running on CUDA-enabled hardware to cover the following roles:
- Training
- Inference
- Proof/Verification
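A minimal sketch of the queue-and-dispatch behaviour described above (illustrative Python; the real control plane would be Kubernetes with Knative handling node lifecycle and scale-to-zero):

```python
from collections import deque

class AgentCluster:
    """Sketch of morfd-agent dispatch: a global queue delegates jobs
    (training, inference, proof/verification) to free nodes.
    All names are illustrative, not the real scheduler API."""
    def __init__(self, node_count: int):
        self.free_nodes = deque(range(node_count))
        self.queue = deque()
        self.assignments = {}  # job -> node

    def submit(self, job: str):
        self.queue.append(job)
        self._dispatch()

    def _dispatch(self):
        # Delegate queued jobs to free nodes until one side is exhausted.
        while self.queue and self.free_nodes:
            node = self.free_nodes.popleft()
            self.assignments[self.queue.popleft()] = node

    def complete(self, job: str):
        self.free_nodes.append(self.assignments.pop(job))
        self._dispatch()  # pull the next queued job, if any
```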
#
Pipelines
Morfd is composable while removing unnecessary user input, leaving control with the flow itself; the flows are atomic in nature. Temple Technology will be responsible for the creation and optimisation of these flows, with the aim of accuracy: staying true to the artist’s style. These components are under the control of the creator and can be configured.
#
Morfd V1 Pipeline
Below is an outline of the first iteration of the Morfd pipeline: a LoRA-augmented Stable Diffusion pipeline that uses text embeddings and ControlNet. We also illustrate object-driven prompt augmentation via OpenCV, which improves accuracy and reduces hallucinations by adding contextual details to the prompt.
- Load/prepare an initial image: The pipeline ingests a source image and, if needed, converts it into feature maps (specifically a depth map) to guide the generation.
- Detect objects using OpenCV: An object-detection step identifies key figures or items in the image. The pipeline uses these detections to refine or augment the text prompt, resulting in more accurate outputs.
- Load the base model checkpoint: This step loads the Stable Diffusion components:
  - U-Net (the core diffusion network)
  - VAE (for encoding/decoding images into latent space)
  - CLIP text encoder (to transform user prompts into embeddings)
- Encode the positive prompt: A descriptive prompt, enriched with any object-detection insights, is converted into a text embedding that pushes the model toward desired features.
- Encode the negative prompt: A separate “negative” prompt encodes undesired features or styles, guiding the model away from irrelevant or unwanted elements.
- Apply LoRA: LoRA injects specialized style or concept weights into the model without retraining the entire network, allowing for targeted customization.
- Generate an initial latent space: The process typically begins with a noise or zero-filled latent tensor, which is much smaller than the final image dimensions.
- Integrate ControlNet: The feature maps derived from the initial image (e.g., depth, edges) feed into ControlNet. This conditioning ensures the output image respects the structure or composition of the source.
- Perform diffusion sampling: The pipeline refines the latent tensor in iterative steps (e.g., using Euler or DDIM samplers). Classifier-free guidance combines the positive and negative embeddings (and, if relevant, object-detection enhancements) to steer generation toward the intended look.
- Decode latent back to image: After the final denoising step, the VAE decodes the latent representation into a high-resolution RGB image.
- Save or return the result: The completed image can be saved to disk or passed back to the calling application.
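The diffusion-sampling step above relies on classifier-free guidance to combine the positive and negative embeddings. Numerically, each step blends the two noise predictions; a pure-Python sketch over a flattened latent (function name and shapes are illustrative):

```python
def cfg_step(uncond_pred, cond_pred, guidance_scale):
    """Classifier-free guidance: push the noise prediction away from
    the negative-prompt (unconditional) branch toward the
    positive-prompt branch. Operates element-wise on latent values."""
    return [u + guidance_scale * (c - u)
            for u, c in zip(uncond_pred, cond_pred)]

# The sampler (e.g., Euler or DDIM) would then use this guided noise
# prediction to update the latent before the next iteration.
```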
Object-Driven Prompt Augmentation
By detecting and labeling objects in the input, the pipeline can automatically insert more context into the text prompt. This leads to images that better align with real-world details and reduce the likelihood of hallucinated (unwanted) elements.
Below, you can see how refining the prompt with object-detection results changes the accuracy and detail of the outputs at each iteration:




By incorporating OpenCV’s object identification and LoRA-driven customization within Stable Diffusion, Morfd achieves improved coherence, fewer hallucinations, and better contextual accuracy in its final images.
Keep in mind this can be leveraged at the creator’s discretion.
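The augmentation step can be sketched independently of any particular detector. Here the OpenCV detection output is stubbed as a list of (label, confidence) pairs, and the helper name is hypothetical:

```python
def augment_prompt(base_prompt, detections, min_confidence=0.5):
    """Append confidently detected object labels to the prompt so the
    diffusion model is conditioned on what is actually in the image.
    `detections` is assumed to be [(label, confidence), ...]."""
    labels = sorted({label for label, conf in detections
                     if conf >= min_confidence})
    if not labels:
        return base_prompt
    return f"{base_prompt}, featuring {', '.join(labels)}"
```

Low-confidence detections are dropped so noisy labels do not steer the generation toward hallucinated content.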
Input Parameter Schema
```json
{
  "input": "string",         // CID
  "seed": "int64",           // Randomness
  "timestamp": "int32",      // UTC epoch timestamp in seconds
  "aspect_ratio": "float32"  // Target aspect ratio (width / height)
}
```
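A minimal validator for this schema might look like the following (illustrative Python; the positivity check on the aspect ratio is an assumption, not part of the schema):

```python
def validate_params(params: dict) -> bool:
    """Check an input-parameter dict against the Morfd schema above."""
    ar = params.get("aspect_ratio")
    return (isinstance(params.get("input"), str)        # CID string
            and isinstance(params.get("seed"), int)     # int64 randomness
            and isinstance(params.get("timestamp"), int)  # epoch seconds
            and isinstance(ar, float)                   # width / height
            and ar > 0.0)                               # assumed bound
```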
#
Contract Infrastructure V1 (Trusted)
The on-chain component of Morfd provides two main capabilities:
- On-Chain Verification: Verifies model ownership, authenticity, or other relevant metadata in a trust-minimized environment.
- Creator Monetization: Automates the distribution of inference fees (or “minting fees”) between creators and the broader Taste ecosystem.
In a typical transaction flow:
- A consumer uploads the input parameters via morfd-cap to make the data addressable.
- The consumer pays msg.value (the native chain currency) into the contract.
- The contract calculates the split using creatorFeeBasis and ecosystemFeeBasis.
- Proceeds are distributed:
  - A portion to the creator (via the rewards pool contract).
  - A portion to the ecosystem.
- An off-chain indexer picks up the event and proceeds with generation, using the event logs of the transaction. morfd-agent will utilise morfd-cap to pull the creator model and input data in for inference.
Trusted Model Hosting & Inference
- Temple Technology manages the trusted portion of model training and deployment. While the model itself may run off-chain or in a specialized environment, Morfd’s on-chain interface ensures users can verify that the model is legitimate.
- The primary revenue model is driven by inference calls, where users pay to access the model’s inference output.
Fee Splitting Logic
- Fees are generally paid in the native chain currency.
- Two parameters govern how fees are split:
- creatorFeeBasis – the basis points (or percentage) that go directly to rewards pools claimable by the creator of the model or NFT.
- ecosystemFeeBasis – the complementary basis points for the Temple/Taste ecosystem.
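Assuming the two basis parameters are complementary (summing to 10,000 basis points), the split reduces to integer arithmetic mirroring EVM math. This is a sketch, not the deployed contract logic; the rounding-dust assignment is an assumption:

```python
BASIS_POINTS = 10_000  # 100% expressed in basis points

def split_fee(msg_value: int, creator_fee_basis: int,
              ecosystem_fee_basis: int):
    """Split an inference/mint payment between creator and ecosystem.
    Integer division mirrors EVM arithmetic; any rounding dust is
    assigned to the ecosystem share in this sketch."""
    assert creator_fee_basis + ecosystem_fee_basis == BASIS_POINTS
    creator_share = msg_value * creator_fee_basis // BASIS_POINTS
    ecosystem_share = msg_value - creator_share
    return creator_share, ecosystem_share
```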
Reward Distribution in Taste Token
The system allows creators to claim their share in Taste tokens. Under the hood, reward fees are used to purchase Taste via an immediate swap. At each mint or inference call, the on-chain logic automatically converts the native currency into Taste (e.g., through a DEX Router or OTC within the Taste Ecosystem).
As a result, creators directly receive Taste tokens proportional to the fees they generate. This approach ensures a consistent experience across all deployed models and incentivizes creators to adopt and promote Morfd and Taste.
#
Concepts
#
Contract Model Identity
To address cross-chain provenance and attribution of both the model and its creator, a contract-based model identity can be established to provide a verifiable and immutable record within the Ethereum Virtual Machine (EVM). The key mechanism is the use of the CREATE2 opcode to generate a deterministic contract address that cryptographically encapsulates the essential components of the model—such as the weights and training data. This ensures reliable provenance and traceability throughout the model’s lifecycle.
Additionally, because CREATE2 allows deterministic contract deployment, the same contract can be replicated across multiple EVM-compatible blockchains while preserving the same address—an outcome directly tied to the creator’s attributes.
NOTE: While it is theoretically possible to have conflicting addresses on other chains, the likelihood of such a collision is extremely low. This is because CREATE2 derives the contract address from a combination of the deployer’s address, the contract’s initialization code, and a unique salt. As long as the salt and other parameters (e.g., model hash) are selected carefully, the probability of unintentionally replicating an existing address remains negligible.
salt = Keccak256(CID_model || creatorAddress || nonce)
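The salt and CREATE2 address derivation can be sketched in Python. Note that the stdlib ships NIST SHA3-256, which differs from Ethereum’s Keccak-256 in one padding byte, so it is used here only as a stand-in; encodings and lengths are illustrative:

```python
import hashlib

def keccak256(data: bytes) -> bytes:
    """Stand-in hash: stdlib hashlib only provides NIST SHA3-256.
    A real deployment would use an actual Keccak-256 implementation."""
    return hashlib.sha3_256(data).digest()

def model_salt(model_cid: str, creator_address: bytes, nonce: int) -> bytes:
    """salt = Keccak256(CID_model || creatorAddress || nonce)"""
    return keccak256(model_cid.encode() + creator_address
                     + nonce.to_bytes(8, "big"))

def create2_address(deployer: bytes, salt: bytes, init_code: bytes) -> bytes:
    """CREATE2 derivation per EIP-1014: last 20 bytes of
    keccak256(0xff ++ deployer ++ salt ++ keccak256(init_code))."""
    return keccak256(b"\xff" + deployer + salt + keccak256(init_code))[-20:]
```

Because every input is deterministic, redeploying with the same salt and init code yields the same address on any EVM chain.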
#
Proof Mechanism and Verification
Morfd’s architecture incorporates cryptographic commitments and optional zero-knowledge (ZK) proofs to ensure verifiable integrity of both training and inference steps. During training, a morfd-agent produces a LoRA model from the creator’s private dataset. Once trained, the agent computes a cryptographic hash of the LoRA weights. This hash serves as a commitment that is recorded on-chain or otherwise persisted, guaranteeing that the exact LoRA parameters cannot be changed without detection.
For inference, Morfd can operate in two modes:
- Optimistic Verification (Commitment-Only):
- The pipeline references the public model (e.g., Stable Diffusion) by a known hash.
- The private LoRA is referenced by the on-chain commitment generated during training.
- The input parameters (prompt, seed, etc.) and final outputs (images) are addressed via IPFS CIDs.
- By verifying these references (model hash, LoRA hash, input data CID) and reproducing the same inference steps, third parties gain confidence that the claimed LoRA and data produced the stated result—without the LoRA ever being revealed.
- Zero-Knowledge Verification (ZK Proofs):
- Morfd optionally leverages a ZK-capable agent that runs inference inside a ZK circuit, proving correctness of the pipeline (public model + private LoRA + provided input) without exposing LoRA details or intermediate computations.
- The resulting ZK proof attests that the final image was generated exactly as claimed, from the committed LoRA and public model.
- On-chain or off-chain verifiers can confirm the proof’s validity without trusting the agent or learning any private weights.
By combining the content-addressable workflow—where training data, model weights, and inference inputs are all stored as IPFS CIDs—with commitment schemes or ZK proofs, Morfd provides an auditable yet privacy-preserving solution. Creators retain control over their proprietary training materials, while end users (and smart contracts) can cryptographically verify that any resulting LoRA or inference output is authentic and corresponds to the expected pipeline.
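The commitment-only mode reduces to a hash check plus deterministic re-execution. A sketch, with the re-execution step stubbed out and the hash choice illustrative:

```python
import hashlib

def commit_weights(lora_weights: bytes) -> str:
    """Commitment to the trained LoRA: the hash recorded on-chain."""
    return hashlib.sha256(lora_weights).hexdigest()

def verify_output(lora_weights: bytes, onchain_commitment: str,
                  reproduce_inference) -> bool:
    """Optimistic verification: check that the LoRA matches its on-chain
    commitment, then that re-running the deterministic pipeline
    reproduces the claimed output. `reproduce_inference` stands in for
    re-executing the public pipeline with the committed LoRA."""
    if commit_weights(lora_weights) != onchain_commitment:
        return False  # tampered or wrong LoRA
    return reproduce_inference(lora_weights)
```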