A.R.I.A-Autonomous_Rendering_and_Imaging_Agent / README.md

byte-vortex

Update README.md

44b4842 verified 15 days ago


	---
	title: Autonomous Rendering And Imaging Agent (A.R.I.A.)
	emoji: 🚀
	colorFrom: blue
	colorTo: green
	sdk: gradio
	sdk_version: 6.0.1
	app_file: app.py
	pinned: true
	short_description: MCP's 1st Birthday - Hosted by Anthropic and Gradio Project
	tags:
	- mcp-in-action-track-creative
	- agent-course
	- stable-diffusion
	- agentic-workflow
	license: apache-2.0
	thumbnail: >-
	https://cdn-uploads.huggingface.co/production/uploads/68389182a9bdf98f9279e018/jr7YqXf2Q1rteSNJ7yIFB.png
	---

	# 🧬 A.R.I.A. (Autonomous Rendering & Imaging Agent)

	> "Autonomous reasoning for artistic execution."

	A.R.I.A. is an AI agent built for the Track 2: MCP in Action (Creative Applications) category. It runs a complete agentic loop (Reasoning, Planning, and Execution) that turns a source image and a text prompt into a stylized output, logging each decision along the way instead of acting as an opaque "black box" generator.

	---

	## 🏆 Hackathon Submission Details

	* Track: 🤖 Track 2: MCP in Action
	* Category: Creative Applications
	* Tag: `mcp-in-action-track-creative`
	* Agent Vision: To create a system that acts as a "Digital Art Director," breaking down a user's vague intent into specific technical steps (Foundation vs. Atmosphere) and executing them sequentially.

	---

	## 🧠 Agentic Workflow (The "Reasoning" Core)

	This application is not a simple filter. It uses a structured Chain-of-Thought (CoT) engine to simulate an artist's decision-making process and writes each step to a visible log.

	### 1. Reasoning (Analysis)
	The agent first analyzes the Input Signal (User Image) and the Semantic Intent (Prompt). It determines the necessary stylistic divergence based on the "Concept Divergence" parameter.

	> Log Example: `💭 THOUGHT: Analyzing input... Intent: 'Cyberpunk'. Strategy: Structure (Step 1) + Atmosphere (Step 2).`
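A minimal sketch of how a thought line like the one above could be assembled. The names `Analysis`, `analyze`, and `thought_line`, and the 0.0 to 1.0 range assumed for "Concept Divergence", are illustrative and not taken from `app.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analysis:
    intent: str        # cleaned-up user prompt
    divergence: float  # hypothetical 0.0-1.0 "Concept Divergence" value
    strategy: str      # human-readable description of the planned passes


def analyze(prompt: str, divergence: float) -> Analysis:
    """Summarise the prompt and clamp the divergence setting (illustrative only)."""
    intent = prompt.strip() or "unspecified"
    divergence = min(max(divergence, 0.0), 1.0)
    strategy = "Structure (Step 1) + Atmosphere (Step 2)"
    return Analysis(intent, divergence, strategy)


def thought_line(analysis: Analysis) -> str:
    """Format the analysis in the style of the log example."""
    return (
        f"💭 THOUGHT: Analyzing input... Intent: '{analysis.intent}'. "
        f"Strategy: {analysis.strategy}."
    )


print(thought_line(analyze("Cyberpunk", 0.7)))
```

Keeping the analysis in a small immutable record makes it easy to pass the same values to both the log and the later planning step.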

	### 2. Planning (Tool Selection)
	Based on the analysis, A.R.I.A. selects the appropriate tools from its registry:

	* Foundation Tool: Selected to rewrite the texture matrix (e.g., Oil Painting to break digital edges).
	* Atmosphere Tool: Selected to inject lighting and mood (e.g., Neon City for color grading).
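One way to picture the tool registry is as two lookup tables that map a tool name to a prompt fragment. The sketch below is an assumption about the shape of such a registry: the prompt fragments, `PlannedStep`, and `plan` are hypothetical, and only the tool names "Oil Painting" and "Neon City" come from the text above.

```python
from typing import Dict, List, NamedTuple

# Hypothetical registries: tool name -> prompt fragment appended to the user's intent.
FOUNDATION_TOOLS: Dict[str, str] = {
    "Oil Painting": "thick oil paint, visible brush strokes",
}
ATMOSPHERE_TOOLS: Dict[str, str] = {
    "Neon City": "neon lighting, night city glow",
}


class PlannedStep(NamedTuple):
    stage: str   # "foundation" or "atmosphere"
    tool: str    # name of the selected tool
    prompt: str  # full prompt for this pass


def plan(intent: str, foundation: str, atmosphere: str) -> List[PlannedStep]:
    """Return the ordered steps for a run, rejecting tools that are not registered."""
    if foundation not in FOUNDATION_TOOLS:
        raise ValueError(f"unknown foundation tool: {foundation!r}")
    if atmosphere not in ATMOSPHERE_TOOLS:
        raise ValueError(f"unknown atmosphere tool: {atmosphere!r}")
    return [
        PlannedStep("foundation", foundation, f"{intent}, {FOUNDATION_TOOLS[foundation]}"),
        PlannedStep("atmosphere", atmosphere, f"{intent}, {ATMOSPHERE_TOOLS[atmosphere]}"),
    ]


for step in plan("A lonely rover on Mars", "Oil Painting", "Neon City"):
    print(step.stage, "->", step.tool)
```

Validating the tool names up front means a bad selection fails before any expensive generation starts.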

	### 3. Execution (Action)
	The agent executes the plan in a multi-pass pipeline:

	* Pass 1 (High Denoising): Destructive generation to establish the new style.
	* Pass 2 (Low Denoising): Constructive generation to "glaze" the final output.
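The two passes can be sketched as a loop that feeds each output into the next image-to-image call. The sketch is model-agnostic: `run_passes` and the `img2img` callable are hypothetical names, and the strengths 0.75 and 0.35 are illustrative guesses at "high" and "low" denoising, not values from `app.py`.

```python
from typing import Any, Callable, List, Sequence, Tuple

# An img2img backend takes (image, prompt, denoising strength) and returns a new image.
Img2Img = Callable[[Any, str, float], Any]


def run_passes(image: Any, passes: Sequence[Tuple[str, float]], img2img: Img2Img) -> List[Any]:
    """Apply each (prompt, strength) pass to the output of the previous one."""
    outputs: List[Any] = []
    current = image
    for prompt, strength in passes:
        if not 0.0 < strength <= 1.0:
            raise ValueError(f"strength must be in (0, 1], got {strength}")
        current = img2img(current, prompt, strength)
        outputs.append(current)
    return outputs


def stub_img2img(image: Any, prompt: str, strength: float) -> str:
    """Stand-in backend that records what was requested instead of rendering."""
    return f"{image}->[{prompt}@{strength}]"


results = run_passes(
    "src",
    [("oil", 0.75),    # Pass 1: high denoising, establishes the new style
     ("neon", 0.35)],  # Pass 2: low denoising, glazes the result
    stub_img2img,
)
print(results[-1])
```

With Diffusers, the backend could wrap a `StableDiffusionImg2ImgPipeline` instance `pipe`, for example `lambda img, p, s: pipe(prompt=p, image=img, strength=s).images[0]`.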

	---

	## 🖼️ Three-Step Image Gallery

	This gallery represents the three-step pipeline workflow:

	* Step 1: Foundation / Structural Style
	* Step 2: Atmosphere / Lighting & Mood
	* Step 3: Final Output

	<div style="display: grid; grid-template-rows: repeat(3, auto); gap: 40px;">

	<!-- Example 1 -->
	<div style="background:#fdfdfd; padding:15px; border-radius:8px;">
	<h3 style="text-align:center; margin-bottom:10px;">Example 1</h3>
	<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 1: Foundation</strong><br>
	<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
	</div>
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 2: Atmosphere</strong><br>
	<img src="assets/ex1-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
	</div>
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 3: Final Output</strong><br>
	<img src="assets/ex1-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
	</div>
	</div>
	<hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
	</div>

	<!-- Example 2 -->
	<div style="background:#f7f7f7; padding:15px; border-radius:8px;">
	<h3 style="text-align:center; margin-bottom:10px;">Example 2</h3>
	<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 1: Foundation</strong><br>
	<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
	</div>
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 2: Atmosphere</strong><br>
	<img src="assets/ex2-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
	</div>
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 3: Final Output</strong><br>
	<img src="assets/ex2-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
	</div>
	</div>
	<hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
	</div>

	<!-- Example 3 -->
	<div style="background:#fdfdfd; padding:15px; border-radius:8px;">
	<h3 style="text-align:center; margin-bottom:10px;">Example 3</h3>
	<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 1: Foundation</strong><br>
	<img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
	</div>
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 2: Atmosphere</strong><br>
	<img src="assets/ex3-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
	</div>
	<div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
	<strong>Step 3: Final Output</strong><br>
	<img src="assets/ex3-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
	</div>
	</div>
	</div>

	</div>

	---

	## 🎥 Video Demonstrations

	<!-- YouTube Video Thumbnail -->
	<div>
	<h3>A.R.I.A. Overview (YouTube)</h3>
	<a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">
	<img src="https://img.youtube.com/vi/fZMkg6tRhEc/0.jpg" alt="A.R.I.A. Overview" style="width:560px; max-width:100%;">
	</a>
	<p>Watch the full video on YouTube: <a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">A.R.I.A. Overview</a></p>
	</div>

	### A.R.I.A. Video Demo

	🎥 [Watch ARIA.mp4 Demo](assets/ARIA.mp4)
	(Note: the video plays only after it has been downloaded.)

	![ARIA Demo Preview](assets/ARIA.gif)

	## 💼 LinkedIn Showcase

	<ul>
	<li><a href="https://www.linkedin.com/posts/activity-7398136529942081536-9PY_?utm_source=share&utm_medium=member_desktop&rcm=ACoAADV81lIBHfqnWPcrqTwi8q3nrm4-wpvkldE" target="_blank">💼 LinkedIn Post</a></li>
	</ul>

	---

	## 🛠️ Technical Stack

	* Frontend: Gradio 6 (streaming interface for real-time feedback; the Space pins `sdk_version: 6.0.1`)
	* Backend: Python / PyTorch / Diffusers
	* Model: Stable Diffusion v1.5 (Optimized for CPU inference via Attention Slicing)
	* UX: Custom "Mission Control" CSS theme with Typewriter-style logging to visualize the Agent's thinking speed
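For reference, loading Stable Diffusion v1.5 for CPU inference with attention slicing could look like the sketch below. It assumes the image-to-image pipeline class from Diffusers and a Hub model id that may differ from the one `app.py` actually uses. `load_cpu_pipeline` is a hypothetical helper name.

```python
# Assumed Hub id for Stable Diffusion v1.5; app.py may load a different repository.
MODEL_ID = "stable-diffusion-v1-5/stable-diffusion-v1-5"


def load_cpu_pipeline(model_id: str = MODEL_ID):
    """Build an img2img pipeline configured for CPU inference."""
    # Imported lazily so that defining this helper does not require the heavy dependencies.
    import torch
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float32,  # half precision is poorly supported on CPU
    )
    pipe = pipe.to("cpu")
    # Compute attention in slices to lower peak memory, at some cost in speed.
    pipe.enable_attention_slicing()
    return pipe
```

Calling `load_cpu_pipeline()` downloads the model weights on first use, so it is best done once at startup rather than per request.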

	---

	## 🚀 How to Use

	1. Upload a source image (sketch, photo, or 3D render).
	2. Define Intent: Type a prompt (e.g., "A lonely rover on Mars").
	3. Configure Agent:
	   * Foundation: Choose the base texture style.
	   * Atmosphere: Choose the lighting/mood context.
	4. Initialize: Click Initialize Agent Sequence.
	5. Observe: Watch the Chain-of-Thought Log as the agent thinks, plans, and executes the visual transformation in real time.
	6. Compare: Review multiple outputs in the three-step gallery grid.
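The live log in step 5 relies on streaming: Gradio treats a generator function as a streaming handler and updates the output component on every `yield`. A minimal sketch of a typewriter-style log generator follows. The name `typewriter` and its parameters are hypothetical and not taken from `app.py`.

```python
import time
from typing import Iterator


def typewriter(line: str, history: str = "", delay: float = 0.0) -> Iterator[str]:
    """Yield the full log text repeatedly, revealing `line` one character at a time."""
    for end in range(1, len(line) + 1):
        if delay:
            time.sleep(delay)  # pacing between characters
        yield history + line[:end]


# Each yielded frame is the complete text the log box should display at that moment.
for frame in typewriter("PLAN", history="THOUGHT: done\n"):
    print(repr(frame))
```

Yielding the whole text each time, rather than only the new character, matches how a Gradio textbox output is replaced on each update.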

	---

	## 📦 Local Installation

	```bash
	pip install gradio diffusers torch transformers scipy
	python app.py