---
title: Autonomous Rendering And Imaging Agent (A.R.I.A.)
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: true
short_description: MCP's 1st Birthday - Hosted by Anthropic and Gradio Project
tags:
- mcp-in-action-track-creative
- agent-course
- stable-diffusion
- agentic-workflow
license: apache-2.0
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/68389182a9bdf98f9279e018/jr7YqXf2Q1rteSNJ7yIFB.png
---

# 🧬 A.R.I.A. (Autonomous Rendering & Imaging Agent)

> **"Autonomous reasoning for artistic execution."**

A.R.I.A. is an AI agent built for the **Track 2: MCP in Action (Creative Applications)** category. It demonstrates a complete agentic loop of **Reasoning, Planning, and Execution** that transforms raw visual signals into complex artistic outputs without opaque "black box" generation.

---

## 🏆 Hackathon Submission Details

* **Track:** 🤖 Track 2: MCP in Action
* **Category:** Creative Applications
* **Tag:** `mcp-in-action-track-creative`
* **Agent Vision:** To create a system that acts as a "Digital Art Director," breaking down a user's vague intent into specific technical steps (Foundation vs. Atmosphere) and executing them sequentially.

---

## 🧠 Agentic Workflow (The "Reasoning" Core)

This application is not a simple filter. It utilizes a structured **Chain-of-Thought (CoT)** engine to simulate an artist's decision-making process.

### 1. Reasoning (Analysis)
The agent first analyzes the **Input Signal** (User Image) and the **Semantic Intent** (Prompt). It determines the necessary stylistic divergence based on the "Concept Divergence" parameter.

> *Log Example:* `💭 THOUGHT: Analyzing input... Intent: 'Cyberpunk'. Strategy: Structure (Step 1) + Atmosphere (Step 2).`
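
As a rough illustration of the analysis step, the sketch below builds a log line in the same shape as the example above. The `Analysis` dataclass, the `analyze` and `format_thought` helpers, and the 0.0 to 1.0 range for "Concept Divergence" are assumptions made for this example, not the code in `app.py`.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    intent: str        # the user's prompt (Semantic Intent)
    divergence: float  # assumed 0.0-1.0 "Concept Divergence" setting
    strategy: str      # human-readable plan summary


def analyze(prompt: str, divergence: float) -> Analysis:
    """Combine the prompt and the divergence setting into a two-step strategy."""
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be between 0.0 and 1.0")
    return Analysis(
        intent=prompt.strip(),
        divergence=divergence,
        strategy="Structure (Step 1) + Atmosphere (Step 2)",
    )


def format_thought(analysis: Analysis) -> str:
    """Render the analysis as a THOUGHT log line."""
    return (
        f"💭 THOUGHT: Analyzing input... Intent: '{analysis.intent}'. "
        f"Strategy: {analysis.strategy}."
    )


print(format_thought(analyze("Cyberpunk", 0.6)))
```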

### 2. Planning (Tool Selection)
Based on the analysis, A.R.I.A. selects the appropriate tools from its registry:

* **Foundation Tool:** Selected to rewrite the texture matrix (e.g., *Oil Painting* to break digital edges).
* **Atmosphere Tool:** Selected to inject lighting and mood (e.g., *Neon City* for color grading).
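
One way to picture the tool registry is a dictionary keyed by style name, with each entry tagged by its role. This is a minimal sketch: `StyleTool`, `REGISTRY`, `plan`, and the prompt suffixes are hypothetical, and only the two tool names come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleTool:
    name: str
    role: str           # "foundation" or "atmosphere"
    prompt_suffix: str  # hypothetical text appended to the user's prompt


# Hypothetical registry; a real one would hold every selectable style.
REGISTRY = {
    "Oil Painting": StyleTool("Oil Painting", "foundation", "oil painting, visible brush strokes"),
    "Neon City": StyleTool("Neon City", "atmosphere", "neon lighting, night city glow"),
}


def plan(foundation: str, atmosphere: str) -> list[StyleTool]:
    """Return the ordered tool list: foundation first, atmosphere second."""
    tools = [REGISTRY[foundation], REGISTRY[atmosphere]]
    for tool, role in zip(tools, ("foundation", "atmosphere")):
        if tool.role != role:
            raise ValueError(f"{tool.name!r} is not a {role} tool")
    return tools


for step, tool in enumerate(plan("Oil Painting", "Neon City"), start=1):
    print(f"Step {step}: {tool.name} ({tool.role})")
```

Tagging each tool with a role lets the planner reject a plan that puts an atmosphere tool in the foundation slot before any generation starts.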

### 3. Execution (Action)
The agent executes the plan in a multi-pass pipeline:

* **Pass 1 (High Denoising):** Destructive generation to establish the new style.
* **Pass 2 (Low Denoising):** Constructive generation to "glaze" the final output.
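
The two passes can be sketched as two chained image-to-image calls, where the second pass takes the first pass's output as its input. The strength values and the `run_two_pass` helper are assumptions for illustration; the README only states "high" and "low" denoising. The `img2img` argument is any callable that accepts `prompt`, `image`, and `strength` keyword arguments and returns an image.

```python
from typing import Any, Callable

# Assumed values; the actual strengths used by the app are not documented here.
PASS_1_STRENGTH = 0.75  # high denoising: most of the source image is repainted
PASS_2_STRENGTH = 0.35  # low denoising: the Pass 1 result is mostly preserved


def run_two_pass(
    img2img: Callable[..., Any],
    image: Any,
    prompt: str,
    foundation: str,
    atmosphere: str,
) -> tuple[Any, Any]:
    """Run the foundation pass, then the atmosphere pass on its result."""
    step_1 = img2img(
        prompt=f"{prompt}, {foundation}", image=image, strength=PASS_1_STRENGTH
    )
    step_2 = img2img(
        prompt=f"{prompt}, {atmosphere}", image=step_1, strength=PASS_2_STRENGTH
    )
    return step_1, step_2
```

With Diffusers, a `StableDiffusionImg2ImgPipeline` instance `pipe` could be adapted to this shape with a small wrapper such as `lambda **kw: pipe(**kw).images[0]`, since the pipeline call accepts `prompt`, `image`, and `strength` and returns an object with an `images` list.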

---

## 🖼️ Three-Step Image Gallery

This gallery represents the **three-step pipeline workflow**:

* **Step 1:** Foundation / Structural Style
* **Step 2:** Atmosphere / Lighting & Mood
* **Step 3:** Final Output

<div style="display: grid; grid-template-rows: repeat(3, auto); gap: 40px;">

  <!-- Example 1 -->
  <div style="background:#fdfdfd; padding:15px; border-radius:8px;">
    <h3 style="text-align:center; margin-bottom:10px;">Example 1</h3>
    <div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 1: Foundation</strong><br>
        <img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
      </div>
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 2: Atmosphere</strong><br>
        <img src="assets/ex1-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
      </div>
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 3: Final Output</strong><br>
        <img src="assets/ex1-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
      </div>
    </div>
    <hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
  </div>

  <!-- Example 2 -->
  <div style="background:#f7f7f7; padding:15px; border-radius:8px;">
    <h3 style="text-align:center; margin-bottom:10px;">Example 2</h3>
    <div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 1: Foundation</strong><br>
        <img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
      </div>
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 2: Atmosphere</strong><br>
        <img src="assets/ex2-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
      </div>
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 3: Final Output</strong><br>
        <img src="assets/ex2-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
      </div>
    </div>
    <hr style="margin:20px 0; border:0; border-top:2px solid #ccc;">
  </div>

  <!-- Example 3 -->
  <div style="background:#fdfdfd; padding:15px; border-radius:8px;">
    <h3 style="text-align:center; margin-bottom:10px;">Example 3</h3>
    <div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; text-align:center;">
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 1: Foundation</strong><br>
        <img src="assets/rabbit2.png" alt="Step 1" style="max-width:100%; border-radius:6px;" />
      </div>
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 2: Atmosphere</strong><br>
        <img src="assets/ex3-step2.webp" alt="Step 2" style="max-width:100%; border-radius:6px;" />
      </div>
      <div style="background:#ffffff; padding:10px; border-radius:8px; box-shadow:0 1px 4px rgba(0,0,0,0.1);">
        <strong>Step 3: Final Output</strong><br>
        <img src="assets/ex3-step3.webp" alt="Step 3" style="max-width:100%; border-radius:6px;" />
      </div>
    </div>
  </div>

</div>

---

## 🎥 Video Demonstrations

<!-- YouTube Video Thumbnail -->
<div>
  <h3>A.R.I.A. Overview (YouTube)</h3>
  <a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">
    <img src="https://img.youtube.com/vi/fZMkg6tRhEc/0.jpg" alt="A.R.I.A. Overview" style="width:560px; max-width:100%;">
  </a>
  <p>Watch the full video on YouTube: <a href="https://www.youtube.com/watch?v=fZMkg6tRhEc" target="_blank">A.R.I.A. Overview</a></p>
</div>

### A.R.I.A. Video Demo

🎥 [Watch ARIA.mp4 Demo](assets/ARIA.mp4)  
*(Note: video can only be viewed after downloading)*  

![ARIA Demo Preview](assets/ARIA.gif)

## 💼 LinkedIn Showcase

<ul>
  <li><a href="https://www.linkedin.com/posts/activity-7398136529942081536-9PY_?utm_source=share&utm_medium=member_desktop&rcm=ACoAADV81lIBHfqnWPcrqTwi8q3nrm4-wpvkldE" target="_blank">💼 LinkedIn Post</a></li>
</ul>

---

## 🛠️ Technical Stack

* **Frontend:** Gradio 6 (`sdk_version: 6.0.1`; streaming interface for real-time feedback)
* **Backend:** Python / PyTorch / Diffusers
* **Model:** Stable Diffusion v1.5 (Optimized for CPU inference via Attention Slicing)
* **UX:** Custom "Mission Control" CSS theme with Typewriter-style logging to visualize the Agent's thinking speed
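
For reference, loading Stable Diffusion v1.5 for CPU inference with attention slicing might look like the sketch below. The `load_pipeline` helper and the default model repository id are assumptions; the Space may load a different checkpoint or pipeline class. Attention slicing computes attention in smaller chunks, which lowers peak memory at some cost in speed.

```python
def load_pipeline(model_id: str = "stable-diffusion-v1-5/stable-diffusion-v1-5"):
    """Load an img2img pipeline on CPU with attention slicing enabled.

    The default model_id is an assumed Hub repository for SD v1.5.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        model_id, torch_dtype=torch.float32  # float32 is the safe dtype on CPU
    )
    pipe = pipe.to("cpu")
    pipe.enable_attention_slicing()  # compute attention in slices to cut peak memory
    return pipe
```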

---

## 🚀 How to Use

1.  **Upload** a source image (sketch, photo, or 3D render).
2.  **Define Intent:** Type a prompt (e.g., "A lonely rover on Mars").
3.  **Configure Agent:**
    * *Foundation:* Choose the base texture style.
    * *Atmosphere:* Choose the lighting/mood context.
4.  **Initialize:** Click **Initialize Agent Sequence**.
5.  **Observe:** Watch the **Chain-of-Thought Log** as the agent thinks, plans, and executes the visual transformation in real time.
6.  **Compare:** Review multiple outputs in the **three-step gallery grid**.
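
The typewriter effect in the Chain-of-Thought Log can be approximated with a generator that yields the log text a few characters at a time. This is a sketch under assumptions: the `typewriter` helper and its `chunk` parameter are hypothetical and not taken from `app.py`.

```python
from typing import Iterator


def typewriter(lines: list[str], chunk: int = 4) -> Iterator[str]:
    """Yield the full log so far, growing by `chunk` characters per step."""
    done = ""  # lines that have already been fully "typed"
    for line in lines:
        for end in range(chunk, len(line) + chunk, chunk):
            yield done + line[:end]
        done += line + "\n"


for frame in typewriter(["THOUGHT: Analyzing input...", "PLAN: Oil Painting"], chunk=8):
    print(repr(frame))
```

Gradio event handlers can be generator functions, and each yielded value replaces the output component's current value, so a generator like this could drive a streaming textbox.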

---

## 📦 Local Installation

```bash
pip install gradio diffusers torch transformers scipy
python app.py