🎨 CS x Design Convergence Project: Generative AI Pipeline & Workflow Archive

"Bridging Technical Logic with Aesthetic Sensibility"

This repository is a portfolio archive documenting how I built and optimized generative AI image pipelines. Produced as part of an interdisciplinary curriculum merging Computer Science and Design, it covers the end-to-end process: data collection, model fine-tuning, and the design of advanced inference workflows.


📋 1. Project Overview

The core objective of this project is to train a specific artistic style accurately and deploy it in a highly controllable workflow, going beyond simple prompt engineering. It aims to show both technical proficiency (model architecture, latent-space understanding) and artistic expression (style transfer).

  • Key Activities: Custom LoRA Training, Advanced ComfyUI Workflow Design, Automated Pipeline Scripting.
  • Tools Used: ComfyUI, OneTrainer, Stable Diffusion, Python, Hugging Face.
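
As a rough illustration of what "Automated Pipeline Scripting" can look like, the sketch below queues a workflow on a locally running ComfyUI server through its HTTP `/prompt` endpoint. It assumes the workflow was exported in ComfyUI's API format; the file name `workflow_api.json`, the helper names, and the default server address are placeholders chosen for this example, not files or settings from this repository.

```python
import json
import urllib.request
import uuid


def build_request(graph: dict, server: str = "http://127.0.0.1:8188") -> urllib.request.Request:
    """Wrap an API-format workflow graph in a POST request for ComfyUI's /prompt route."""
    payload = {"prompt": graph, "client_id": uuid.uuid4().hex}
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(path: str, server: str = "http://127.0.0.1:8188") -> dict:
    """Load a workflow file (assumed to be in API format) and submit it to the server."""
    with open(path, encoding="utf-8") as f:
        graph = json.load(f)
    with urllib.request.urlopen(build_request(graph, server)) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Hypothetical file name; requires a ComfyUI server listening on the default port.
    print(queue_workflow("workflow_api.json"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent.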

🧠 2. Model Training Methodology: Kirochy Style LoRA

To replicate the unique style of the illustrator Kirochy, I conducted LoRA (Low-Rank Adaptation) training with a rigorous data processing approach.
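
For context on why LoRA is practical for a single-artist style: instead of updating a full weight matrix of shape d_out × d_in, it trains two small matrices of shapes d_out × r and r × d_in and adds their product to the frozen weights. The sketch below only compares parameter counts; the layer size and rank are illustrative assumptions, not the settings used for this model.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full_params, lora_params) for a single linear layer."""
    full = d_out * d_in            # parameters in the frozen weight matrix
    lora = rank * (d_out + d_in)   # parameters in the two low-rank factors
    return full, lora


# Illustrative values only: a 768x768 projection with rank 16.
full, lora = lora_param_counts(d_out=768, d_in=768, rank=16)
print(f"full: {full}, lora: {lora}, share: {lora / full:.2%}")
```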

2.1 Data Acquisition & Preprocessing

  • Data Source: Aggregated reference illustrations from the artist's official portfolios (Instagram @kirochy_00, X).
  • Preprocessing: Used OneTrainer's bucketing to handle images of varying resolutions and aspect ratios. Tagged each image in detail to capture specific stylistic features (line art weight, color palettes, shading techniques).
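
The idea behind bucketing can be sketched in a few lines: each image is assigned to the training resolution whose aspect ratio is closest to its own, so portrait and landscape illustrations are not all forced into one square crop. This is a simplified illustration, not OneTrainer's actual implementation; the bucket list and file names are hypothetical.

```python
import math

# Hypothetical (width, height) buckets; a real trainer derives its own set.
BUCKETS = [(512, 1024), (640, 896), (768, 768), (896, 640), (1024, 512)]


def nearest_bucket(width: int, height: int, buckets=BUCKETS) -> tuple[int, int]:
    """Pick the bucket whose aspect ratio is closest to the image's (compared in log space)."""
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))


def group_by_bucket(sizes: dict) -> dict:
    """Map each bucket to the list of image names assigned to it."""
    groups = {}
    for name, (w, h) in sizes.items():
        groups.setdefault(nearest_bucket(w, h), []).append(name)
    return groups


# Made-up file names and sizes for illustration.
print(group_by_bucket({"a.png": (1200, 1600), "b.png": (1000, 1000), "c.png": (1920, 1080)}))
```

Comparing ratios in log space treats a 2:1 and a 1:2 image symmetrically, which a plain difference of ratios would not.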

2.2 Training Framework & Optimization

  • Engine: Trained using OneTrainer for precise parameter control.
  • Optimization: Iteratively adjusted the number of epochs and the learning rate to balance style fidelity against generalization, so that the model retains the artist's signature touch without overfitting to the training images.
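
One common way to make this trade-off measurable is to track a validation loss per epoch and keep the checkpoint just before the loss stops improving. The sketch below shows that selection rule only; it is an assumption about how such a comparison could be done, since the text above does not specify how checkpoints were evaluated, and the loss values are made up.

```python
def pick_checkpoint(val_losses: list[float], patience: int = 2) -> int:
    """Return the 1-based epoch with the lowest validation loss seen before
    the loss has failed to improve for `patience` consecutive epochs."""
    best_epoch, best_loss, stale = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss, stale = epoch, loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # treat further epochs as likely overfitting
    return best_epoch


# Made-up losses for illustration: improvement stalls after epoch 4.
losses = [0.142, 0.118, 0.109, 0.104, 0.107, 0.111, 0.103]
print(pick_checkpoint(losses))
```

For a style LoRA, a numeric loss is only a proxy; visually comparing sample images from the candidate checkpoints is still needed.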

⚙️ 3. Workflow Architecture: P2A (Photo to Anime) Pipeline

The p2a.ai.json file in this repository is an Img2Img workflow that converts real-world photos into Kirochy-style illustrations. To address the structural distortion common in style transfer, I built it as a multi-stage processing pipeline.
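
To make the Img2Img idea concrete, here is a minimal sketch of a bare-bones photo-to-illustration graph in ComfyUI's API format, built as a Python dictionary. It is far simpler than the workflow in p2a.ai.json and is not derived from it. It assumes the stock ComfyUI node names (`CheckpointLoaderSimple`, `LoraLoader`, `VAEEncode`, `KSampler`, and so on); all file names, the prompt, and the sampler settings are hypothetical.

```python
def build_img2img_graph(photo, checkpoint, lora, prompt, negative="", denoise=0.55, seed=0):
    """Build a minimal img2img graph; links are [source_node_id, output_index]."""
    return {
        "1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": checkpoint}},
        # Apply the style LoRA to both the diffusion model and the text encoder.
        "2": {"class_type": "LoraLoader", "inputs": {
            "model": ["1", 0], "clip": ["1", 1], "lora_name": lora,
            "strength_model": 1.0, "strength_clip": 1.0}},
        "3": {"class_type": "CLIPTextEncode", "inputs": {"text": prompt, "clip": ["2", 1]}},
        "4": {"class_type": "CLIPTextEncode", "inputs": {"text": negative, "clip": ["2", 1]}},
        # Encode the source photo into latent space instead of starting from noise.
        "5": {"class_type": "LoadImage", "inputs": {"image": photo}},
        "6": {"class_type": "VAEEncode", "inputs": {"pixels": ["5", 0], "vae": ["1", 2]}},
        # denoise < 1.0 keeps part of the photo's structure; lower means closer to the source.
        "7": {"class_type": "KSampler", "inputs": {
            "model": ["2", 0], "seed": seed, "steps": 25, "cfg": 7.0,
            "sampler_name": "euler", "scheduler": "normal",
            "positive": ["3", 0], "negative": ["4", 0],
            "latent_image": ["6", 0], "denoise": denoise}},
        "8": {"class_type": "VAEDecode", "inputs": {"samples": ["7", 0], "vae": ["1", 2]}},
        "9": {"class_type": "SaveImage", "inputs": {"images": ["8", 0], "filename_prefix": "p2a"}},
    }


def dangling_links(graph):
    """List (node_id, input_name) pairs whose link points to a node that does not exist."""
    bad = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and value[0] not in graph:
                bad.append((node_id, name))
    return bad


# Hypothetical file names for illustration.
graph = build_img2img_graph("photo.png", "base.safetensors", "style_lora.safetensors", "portrait")
print(dangling_links(graph))
```

The `denoise` value is the main lever in a plain Img2Img pass: it trades structural faithfulness to the photo against how strongly the trained style is applied.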

3.1 Technical Logic & Customization

This workflow is not a mere copy-paste; it is a custom-built architecture integrating various advanced techniques researched from diverse community workflows and technical documentation.

  1. ControlNet Integration (Structural Integrity):

    • Used ControlNet conditioning to preserve the pose and depth information of the source image, limiting the structural "hallucinations" often seen in generative models.
  2. SAM (Segment Anything Model) & SAG (Self-Attention Guidance):

    • Integrated SAM for precise object segmentation and SAG to guide sampling with the model's own self-attention maps. Together they help keep the subject cleanly separated from the background, enhancing the clarity of the illustration style.
  3. Automatic Detailer (Face & Hand Refinement):

    • Implemented a post-processing pipeline using Face and Hand Detailers. The workflow automatically detects and masks these complex regions, resampling them at higher resolutions to fix artifacts and ensure anatomical correctness.
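
The detect, crop, and resample step of a detailer can be sketched as simple geometry: expand the detected box with some padding, clamp it to the image, and compute how much the crop must be upscaled before it is re-sampled and pasted back. This is an illustration of the general idea under assumed parameters (`padding`, `target`), not the code of the detailer nodes used in the workflow.

```python
def detail_crop(bbox, image_size, padding=0.25, target=768):
    """Expand a detected (x0, y0, x1, y1) box, clamp it to the image,
    and return (crop_box, upscale_factor)."""
    x0, y0, x1, y1 = bbox
    width, height = image_size
    pad_x = (x1 - x0) * padding   # extra context on the left and right
    pad_y = (y1 - y0) * padding   # extra context above and below
    crop = (
        max(0, int(x0 - pad_x)),
        max(0, int(y0 - pad_y)),
        min(width, int(x1 + pad_x)),
        min(height, int(y1 + pad_y)),
    )
    longest = max(crop[2] - crop[0], crop[3] - crop[1])
    # Never downscale: small faces or hands are enlarged so the sampler has more pixels to work with.
    scale = max(1.0, target / longest)
    return crop, scale


# Made-up detection box for a face in a 1024x1024 image.
crop, scale = detail_crop((400, 200, 560, 360), (1024, 1024))
print(crop, scale)
```

The padding matters because the re-sampled region needs surrounding context to blend back into the image without visible seams.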

🖼️ 4. Results & Portfolio Showcase

The final outputs generated using this model and workflow are archived on Instagram. You can compare the reference inputs with the generated results to verify the technical quality.


⚠️ 5. Ethical Considerations & License

This project was conducted strictly for Academic Study and Research purposes.

⛔ Copyright & Usage Warning

  • Intellectual Property: The copyright and stylistic rights of the LoRA model belong entirely to the original artist, Kirochy (@kirochy_00).
  • Non-Commercial Use Only: Utilizing this model file or the workflows for any commercial purpose (sales, paid commissions, advertising, etc.) is strictly prohibited.
  • Legal Notice: Any commercial exploitation may result in legal consequences under copyright laws.

Scope of Permitted Use

  • ⭕ Allowed: Personal study, portfolio research, non-commercial fan art.
  • ❌ Prohibited: Commercial use, impersonation of the original artist, unauthorized redistribution for profit.

Author: Um Yunsang
Role: CS & Design Convergence Researcher / AI Engineer Candidate
