---
title: Autonomous Rendering And Imaging Agent (A.R.I.A.)
emoji: 🧬
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: true
short_description: MCP's 1st Birthday - Hosted by Anthropic and Gradio Project
tags:
- mcp-in-action-track-creative
- agent-course
- stable-diffusion
- agentic-workflow
license: apache-2.0
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/68389182a9bdf98f9279e018/jr7YqXf2Q1rteSNJ7yIFB.png
---
# 🧬 A.R.I.A. (Autonomous Rendering & Imaging Agent)
> **"Autonomous reasoning for artistic execution."**

A.R.I.A. is an AI Agent designed for the **Track 2: MCP in Action (Creative Applications)** category. It demonstrates a complete agentic loop of **Reasoning, Planning, and Execution**, transforming raw visual signals into complex artistic outputs without opaque "black box" generation.
---
## 🏆 Hackathon Submission Details
* **Track:** 🤖 Track 2: MCP in Action
* **Category:** Creative Applications
* **Tag:** `mcp-in-action-track-creative`
* **Agent Vision:** To create a system that acts as a "Digital Art Director," breaking down a user's vague intent into specific technical steps (Foundation vs. Atmosphere) and executing them sequentially.
---
## 🧠 Agentic Workflow (The "Reasoning" Core)
This application is not a simple filter. It utilizes a structured **Chain-of-Thought (CoT)** engine to simulate an artist's decision-making process.
### 1. Reasoning (Analysis)
The agent first analyzes the **Input Signal** (User Image) and the **Semantic Intent** (Prompt). It determines the necessary stylistic divergence based on the "Concept Divergence" parameter.
> *Log Example:* `🔍 THOUGHT: Analyzing input... Intent: 'Cyberpunk'. Strategy: Structure (Step 1) + Atmosphere (Step 2).`
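To make this step concrete, here is a minimal sketch of how such an analysis could be structured; the `Plan` dataclass, the `analyze` helper, and the 0.5 divergence threshold are illustrative assumptions rather than the app's actual code:

```python
from dataclasses import dataclass

@dataclass
class Plan:
    foundation: str    # structural style applied in Pass 1
    atmosphere: str    # lighting/mood style applied in Pass 2
    divergence: float  # "Concept Divergence": how far to depart from the input

def analyze(prompt: str, divergence: float) -> Plan:
    """Reasoning step: map the semantic intent onto a two-stage strategy."""
    # Illustrative heuristic: higher divergence -> a more destructive first pass.
    foundation = "Oil Painting" if divergence >= 0.5 else "Pencil Sketch"
    atmosphere = "Neon City" if "cyberpunk" in prompt.lower() else "Golden Hour"
    print(f"THOUGHT: Analyzing input... Intent: '{prompt}'. "
          "Strategy: Structure (Step 1) + Atmosphere (Step 2).")
    return Plan(foundation, atmosphere, divergence)
```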
### 2. Planning (Tool Selection)
Based on the analysis, A.R.I.A. selects the appropriate tools from its registry (a sketch of such a registry follows the list below):
* **Foundation Tool:** Selected to rewrite the texture matrix (e.g., *Oil Painting* to break digital edges).
* **Atmosphere Tool:** Selected to inject lighting and mood (e.g., *Neon City* for color grading).
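One way to picture the registry is a plain mapping from style names to prompt fragments and denoising strengths; the entries and values below are illustrative placeholders rather than A.R.I.A.'s real registry:

```python
# Hypothetical tool registry: each entry pairs a prompt fragment with the
# denoising strength of the img2img pass that applies it.
TOOL_REGISTRY = {
    "foundation": {
        "Oil Painting":  {"prompt": "thick oil painting, visible brush strokes", "strength": 0.75},
        "Pencil Sketch": {"prompt": "graphite pencil sketch, cross-hatching",    "strength": 0.70},
    },
    "atmosphere": {
        "Neon City":   {"prompt": "neon lighting, cyberpunk color grading", "strength": 0.35},
        "Golden Hour": {"prompt": "warm golden-hour light, soft shadows",   "strength": 0.30},
    },
}

def select_tools(foundation: str, atmosphere: str):
    """Planning step: resolve the chosen style names against the registry."""
    return TOOL_REGISTRY["foundation"][foundation], TOOL_REGISTRY["atmosphere"][atmosphere]
```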
### 3. Execution (Action)
The agent executes the plan in a multi-pass pipeline (sketched after the list below):
* **Pass 1 (High Denoising):** Destructive generation to establish the new style.
* **Pass 2 (Low Denoising):** Constructive generation to "glaze" the final output.
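A minimal sketch of such a two-pass img2img pipeline with Diffusers, assuming the `runwayml/stable-diffusion-v1-5` checkpoint and illustrative prompts and strength values (the app's actual parameters may differ):

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

# Assumed checkpoint; the Space uses Stable Diffusion v1.5 with attention slicing for CPU.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32
)
pipe.enable_attention_slicing()

init_image = Image.open("input.png").convert("RGB").resize((512, 512))

# Pass 1 (high denoising): destructive pass that rewrites texture and structure.
foundation = pipe(
    prompt="oil painting of a lonely rover on Mars, thick brush strokes",
    image=init_image, strength=0.75, guidance_scale=7.5, num_inference_steps=30,
).images[0]

# Pass 2 (low denoising): constructive "glaze" pass that injects lighting and mood.
final = pipe(
    prompt="neon city atmosphere, cinematic color grading",
    image=foundation, strength=0.35, guidance_scale=7.5, num_inference_steps=30,
).images[0]

final.save("output.png")
```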
---
## 🖼️ Three-Step Image Gallery
This gallery represents the **three-step pipeline workflow**:
* **Step 1:** Foundation / Structural Style
* **Step 2:** Atmosphere / Lighting & Mood
* **Step 3:** Final Output
<div style="display: grid; grid-template-rows: repeat(3, auto); gap: 40px;">
<!-- Example 1 -->
<div style="background:#fdfdfd; padding:15px; border-radius:8px;">
<h3 style="text-align:center; margin-bottom:10px;">Example 1</h3>
<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 1: Foundation</strong><br>
<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 2: Atmosphere</strong><br>
<img src="assets/ex1-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 3: Final Output</strong><br>
<img src="assets/ex1-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
</div>
</div>
<hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
</div>
<!-- Example 2 -->
<div style="background:#f7f7f7; padding:15px; border-radius:8px;">
<h3 style="text-align:center; margin-bottom:10px;">Example 2</h3>
<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 1: Foundation</strong><br>
<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 2: Atmosphere</strong><br>
<img src="assets/ex2-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 3: Final Output</strong><br>
<img src="assets/ex2-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
</div>
</div>
<hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
</div>
<!-- Example 3 -->
<div style="background:#fdfdfd; padding:15px; border-radius:8px;">
<h3 style="text-align:center; margin-bottom:10px;">Example 3</h3>
<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 1: Foundation</strong><br>
<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 2: Atmosphere</strong><br>
<img src="assets/ex3-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
</div>
<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
<strong>Step 3: Final Output</strong><br>
<img src="assets/ex3-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
</div>
</div>
</div>
</div>
---
## 🎥 Video Demonstrations
<!-- YouTube Video Thumbnail -->
<div>
<h3>A.R.I.A. Overview (YouTube)</h3>
<a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">
<img src="https://img.youtube.com/vi/fZMkg6tRhEc/0.jpg" alt="A.R.I.A. Overview" style="width:560px; max-width:100%;">
</a>
<p>Watch the full video on YouTube: <a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">A.R.I.A. Overview</a></p>
</div>
### A.R.I.A. Video Demo
🎥 [Watch ARIA.mp4 Demo](assets/ARIA.mp4)
*(Note: the video can only be viewed after downloading it.)*

## 💼 LinkedIn Showcase
<ul>
<li><a href="https://www.linkedin.com/posts/activity-7398136529942081536-9PY_?utm_source=share&utm_medium=member_desktop&rcm=ACoAADV81lIBHfqnWPcrqTwi8q3nrm4-wpvkldE" target="_blank">💼 LinkedIn Post</a></li>
</ul>
---
## 🛠️ Technical Stack
* **Frontend:** Gradio 6 (streaming interface for real-time feedback)
* **Backend:** Python / PyTorch / Diffusers
* **Model:** Stable Diffusion v1.5 (optimized for CPU inference via attention slicing)
* **UX:** Custom "Mission Control" CSS theme with typewriter-style logging that visualizes the agent's reasoning in real time (a minimal streaming sketch follows)
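A minimal sketch of that streaming behaviour, using a Gradio generator function to emit the log one line at a time (the log text and timing here are illustrative, not the app's actual output):

```python
import time
import gradio as gr

def agent_log(intent: str):
    """Generator: yield the chain-of-thought log incrementally for a typewriter feel."""
    lines = [
        f"THOUGHT: Analyzing input... Intent: '{intent}'.",
        "PLAN: Foundation pass (structure) -> Atmosphere pass (mood).",
        "EXECUTE: Running two-pass img2img pipeline...",
        "DONE: Final render ready.",
    ]
    log = ""
    for line in lines:
        log += line + "\n"
        time.sleep(0.5)  # pacing makes the agent's "thinking" visible
        yield log        # Gradio streams each yielded value to the output Textbox

demo = gr.Interface(
    fn=agent_log,
    inputs=gr.Textbox(label="Semantic Intent"),
    outputs=gr.Textbox(label="Chain-of-Thought Log", lines=8),
)

if __name__ == "__main__":
    demo.launch()
```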
---
## 🚀 How to Use
1. **Upload** a source image (sketch, photo, or 3D render).
2. **Define Intent:** Type a prompt (e.g., "A lonely rover on Mars").
3. **Configure Agent:**
* *Foundation:* Choose the base texture style.
* *Atmosphere:* Choose the lighting/mood context.
4. **Initialize:** Click **Initialize Agent Sequence**.
5. **Observe:** Watch the **Chain-of-Thought Log** as the agent thinks, plans, and executes the visual transformation in real-time.
6. **Compare:** Review multiple outputs in the **three-step gallery grid**.
---
## 📦 Local Installation
```bash
pip install gradio diffusers torch transformers scipy
python app.py
```