---
title: Autonomous Rendering And Imaging Agent (A.R.I.A.)
emoji: πŸš€
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: true
short_description: MCP's 1st Birthday - Hosted by Anthropic and Gradio Project
tags:
- mcp-in-action-track-creative
- agent-course
- stable-diffusion
- agentic-workflow
license: apache-2.0
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/68389182a9bdf98f9279e018/jr7YqXf2Q1rteSNJ7yIFB.png
---
# 🧬 A.R.I.A. (Autonomous Rendering & Imaging Agent)
> **"Autonomous reasoning for artistic execution."**
A.R.I.A. is an AI agent built for the **Track 2: MCP in Action (Creative Applications)** category. It demonstrates a complete agentic loop—**Reasoning, Planning, and Execution**—that turns raw visual signals into complex artistic outputs rather than treating generation as an opaque "black box."
---
## πŸ† Hackathon Submission Details
* **Track:** πŸ€– Track 2: MCP in Action
* **Category:** Creative Applications
* **Tag:** `mcp-in-action-track-creative`
* **Agent Vision:** To create a system that acts as a "Digital Art Director," breaking down a user's vague intent into specific technical steps (Foundation vs. Atmosphere) and executing them sequentially.
---
## 🧠 Agentic Workflow (The "Reasoning" Core)
This application is not a simple filter: it uses a structured **Chain-of-Thought (CoT)** engine to simulate an artist's decision-making process.
### 1. Reasoning (Analysis)
The agent first analyzes the **Input Signal** (User Image) and the **Semantic Intent** (Prompt). It determines the necessary stylistic divergence based on the "Concept Divergence" parameter.
> *Log Example:* `πŸ’­ THOUGHT: Analyzing input... Intent: 'Cyberpunk'. Strategy: Structure (Step 1) + Atmosphere (Step 2).`
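In code, the analysis step can be pictured as a small function that maps the intent and the Concept Divergence value onto per-pass denoising strengths. This is a hedged sketch: the function and field names (`analyze`, `pass1_strength`, and so on) are hypothetical, not A.R.I.A.'s actual internals.

```python
# Hypothetical sketch of the Reasoning step; all names are illustrative.
def analyze(intent: str, divergence: float) -> dict:
    """Map user intent and Concept Divergence onto a two-step strategy."""
    return {
        "intent": intent,
        # Higher divergence -> a more destructive first pass, while the
        # second pass stays gentle to preserve Step 1's structure.
        "pass1_strength": 0.5 + 0.4 * divergence,  # 0.5 .. 0.9
        "pass2_strength": 0.2 + 0.1 * divergence,  # 0.2 .. 0.3
        "steps": ["foundation", "atmosphere"],
    }

plan = analyze("Cyberpunk", divergence=0.8)
print(f"💭 THOUGHT: Analyzing input... Intent: '{plan['intent']}'. "
      "Strategy: Structure (Step 1) + Atmosphere (Step 2).")
```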
### 2. Planning (Tool Selection)
Based on the analysis, A.R.I.A. selects the appropriate tools from its registry (sketched below):
* **Foundation Tool:** Selected to rewrite the texture matrix (e.g., *Oil Painting* to break digital edges).
* **Atmosphere Tool:** Selected to inject lighting and mood (e.g., *Neon City* for color grading).
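A tool registry of this kind can be as simple as a dictionary mapping each style to a prompt fragment and a denoising strength. The entries below are assumptions for illustration, not the app's real registry:

```python
# Hypothetical tool registry; names and values are illustrative.
TOOL_REGISTRY = {
    "foundation": {
        "Oil Painting": {"prompt": "oil painting, visible brush strokes", "strength": 0.75},
        "Ink Sketch":   {"prompt": "ink sketch, heavy cross-hatching",    "strength": 0.80},
    },
    "atmosphere": {
        "Neon City":   {"prompt": "neon lighting, cyberpunk color grading", "strength": 0.30},
        "Golden Hour": {"prompt": "warm golden-hour light, soft haze",      "strength": 0.25},
    },
}

def select_tools(foundation: str, atmosphere: str) -> list[dict]:
    """Return the two-pass plan in execution order."""
    return [TOOL_REGISTRY["foundation"][foundation],
            TOOL_REGISTRY["atmosphere"][atmosphere]]
```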
### 3. Execution (Action)
The agent executes the plan in a two-pass pipeline (sketched below):
* **Pass 1 (High Denoising):** Destructive generation to establish the new style.
* **Pass 2 (Low Denoising):** Constructive generation to "glaze" the final output.
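With 🤗 Diffusers, the two passes reduce to calling the same img2img pipeline twice with different `strength` values. A minimal sketch assuming Stable Diffusion v1.5; the repo id, prompts, and strengths are placeholders, not the exact values the Space uses:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Repo id is the commonly used SD v1.5 checkpoint; substitute whichever
# checkpoint the Space actually loads.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32
)
pipe.enable_attention_slicing()  # trades speed for memory on CPU

source = Image.open("input.png").convert("RGB").resize((512, 512))

# Pass 1 (high denoising): destructive pass that establishes the new style.
foundation = pipe(prompt="oil painting of a lonely rover on Mars",
                  image=source, strength=0.75).images[0]

# Pass 2 (low denoising): constructive pass that "glazes" atmosphere on top.
final = pipe(prompt="neon lighting, cyberpunk color grading",
             image=foundation, strength=0.30).images[0]
final.save("output.png")
```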
---
## πŸ–ΌοΈ Three-Step Image Gallery
Each example below shows the **three-step pipeline**:
* **Step 1:** Foundation / Structural Style
* **Step 2:** Atmosphere / Lighting & Mood
* **Step 3:** Final Output
<div style="display: grid; grid-template-rows: repeat(3, auto); gap: 40px;">
<!-- Example 1 -->
<div style="background:#fdfdfd; padding:15px; border-radius:8px;">
<h3 style="text-align:center; margin-bottom:10px;">Example 1</h3>
<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 1: Foundation</strong><br>
<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 2: Atmosphere</strong><br>
<img src="assets/ex1-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 3: Final Output</strong><br>
<img src="assets/ex1-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
</div>
</div>
<hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
</div>
<!-- Example 2 -->
<div style="background:#f7f7f7; padding:15px; border-radius:8px;">
<h3 style="text-align:center; margin-bottom:10px;">Example 2</h3>
<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 1: Foundation</strong><br>
<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 2: Atmosphere</strong><br>
<img src="assets/ex2-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 3: Final Output</strong><br>
<img src="assets/ex2-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
</div>
</div>
<hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
</div>
<!-- Example 3 -->
<div style="background:#fdfdfd; padding:15px; border-radius:8px;">
<h3 style="text-align:center; margin-bottom:10px;">Example 3</h3>
<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 1: Foundation</strong><br>
<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 2: Atmosphere</strong><br>
<img src="assets/ex3-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 3: Final Output</strong><br>
<img src="assets/ex3-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
</div>
</div>
</div>
</div>
---
## πŸŽ₯ Video Demonstrations
<!-- YouTube Video Thumbnail -->
<div>
<h3>A.R.I.A. Overview (YouTube)</h3>
<a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">
<img src="https://img.youtube.com/vi/fZMkg6tRhEc/0.jpg" alt="A.R.I.A. Overview" style="width:560px; max-width:100%;">
</a>
<p>Watch the full video on YouTube: <a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">A.R.I.A. Overview</a></p>
</div>
### A.R.I.A. Video Demo
πŸŽ₯ [Watch ARIA.mp4 Demo](assets/ARIA.mp4)
*(Note: the MP4 must be downloaded to play; the GIF below gives a quick preview.)*
![ARIA Demo Preview](assets/ARIA.gif)
## πŸ’Ό LinkedIn Showcase
* [💼 LinkedIn Post](https://www.linkedin.com/posts/activity-7398136529942081536-9PY_?utm_source=share&utm_medium=member_desktop&rcm=ACoAADV81lIBHfqnWPcrqTwi8q3nrm4-wpvkldE)
---
## πŸ› οΈ Technical Stack
* **Frontend:** Gradio (SDK 6.0.1 per the Space config), with a streaming interface for real-time feedback
* **Backend:** Python / PyTorch / Diffusers
* **Model:** Stable Diffusion v1.5, optimized for CPU inference via attention slicing
* **UX:** Custom "Mission Control" CSS theme with typewriter-style logging that visualizes the agent's reasoning in real time (sketched below)
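The typewriter effect falls out of Gradio's generator support: any handler that `yield`s streams its partial output to the UI. A minimal sketch, with hypothetical log lines and component labels:

```python
import time
import gradio as gr

def run_agent(prompt: str):
    """Yield the growing log so Gradio streams it like a typewriter."""
    log = ""
    for line in (f"💭 THOUGHT: Analyzing input... Intent: '{prompt}'.",
                 "🛠️ PLAN: Structure (Step 1) + Atmosphere (Step 2).",
                 "⚡ EXECUTE: Pass 1 (high denoising)...",
                 "⚡ EXECUTE: Pass 2 (low denoising)... Done."):
        for ch in line + "\n":
            log += ch
            time.sleep(0.01)  # pacing for the typewriter feel
            yield log

demo = gr.Interface(fn=run_agent,
                    inputs=gr.Textbox(label="Intent"),
                    outputs=gr.Textbox(label="Chain-of-Thought Log"))

if __name__ == "__main__":
    demo.launch()
```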
---
## πŸš€ How to Use
1. **Upload** a source image (sketch, photo, or 3D render).
2. **Define Intent:** Type a prompt (e.g., "A lonely rover on Mars").
3. **Configure Agent:**
* *Foundation:* Choose the base texture style.
* *Atmosphere:* Choose the lighting/mood context.
4. **Initialize:** Click **Initialize Agent Sequence**.
5. **Observe:** Watch the **Chain-of-Thought Log** as the agent thinks, plans, and executes the visual transformation in real time.
6. **Compare:** Review multiple outputs in the **three-step gallery grid**.
---
## πŸ“¦ Local Installation
```bash
pip install gradio diffusers torch transformers scipy
python app.py
```