Commit daacf12 by Mandark-droid
1 parent: bf61933
Add user guide accordions to all leaderboard tabs
Added comprehensive 'README' accordions to all tabs in the
leaderboard screen to provide user guidance and feature explanations.
New accordions added:
- Leaderboard: How to Use the Leaderboard
- DrillDown: How to Use DrillDown (with column explanations)
- Trends: How to Read Trends (temporal analysis guide)
- 📥 Summary Card: How to Create Summary Cards (step-by-step)
- 🤖 AI Insights: About AI Insights (MCP and LLM-powered features)
All accordions:
- Collapsed by default (open=False) to save screen space
- Provide context-specific help and best practices
- Include tips, use cases, and feature explanations
- Match the style and structure of the Analytics tab accordion
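
The accordions described above share one shape: a `gr.Accordion(..., open=False)` wrapping a single `gr.Markdown` block with the same run of headings. A minimal sketch of how that repetition could be factored out follows. The names `GuideSection`, `render_guide`, and `add_guide_accordion` are hypothetical and do not appear in the commit; the Gradio module is passed in as a parameter so the rendering part runs without Gradio installed.

```python
from dataclasses import dataclass, field


@dataclass
class GuideSection:
    """One bold heading followed by a bullet list (hypothetical helper type)."""
    heading: str
    bullets: list[str] = field(default_factory=list)


def render_guide(title: str, intro: str, sections: list[GuideSection]) -> str:
    """Build the Markdown body used inside a guide accordion."""
    lines = [f"### {title}", "", "**What is this tab?**", intro]
    for section in sections:
        lines += ["", f"**{section.heading}**"]
        lines += [f"- {bullet}" for bullet in section.bullets]
    return "\n".join(lines)


def add_guide_accordion(gr, label: str, markdown: str):
    """Create a collapsed accordion holding the guide text.

    `gr` is the imported gradio module. This must be called inside an
    active `gr.Blocks()` context, as the accordions in the commit are.
    """
    with gr.Accordion(label, open=False):
        return gr.Markdown(markdown)
```

Whether such a helper is worth it depends on how often the guide text changes; inline strings, as in the commit, keep each tab's help next to its layout code.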
app.py CHANGED
@@ -1053,6 +1053,32 @@ with gr.Blocks(title="TraceMind-AI", theme=theme) as app:
 with gr.TabItem("Leaderboard"):
     gr.Markdown("*Styled leaderboard with inline filters*")
 
+    # User Guide Accordion
+    with gr.Accordion("How to Use the Leaderboard", open=False):
+        gr.Markdown("""
+        ### Interactive Leaderboard View
+
+        **What is this tab?**
+        The main leaderboard displays all evaluation runs in a styled HTML table with color-coded performance indicators.
+
+        **How to use it:**
+        - 🎨 **Visual Design**: Gradient cards with model logos and performance metrics
+        - **Filters**: Use agent type, provider, and sorting controls above
+        - **Sort Options**: Click "Sort By" to order by success rate, cost, duration, or tokens
+        - **Click to Drill Down**: Click any model card to view detailed run information
+        - 🎯 **Quick Comparison**: Select 2+ runs and click "Compare" button
+
+        **Performance Indicators:**
+        - 🟢 Green metrics = Excellent performance
+        - 🟡 Yellow metrics = Average performance
+        - 🔴 Red metrics = Needs improvement
+
+        **Tips:**
+        - Use sidebar filters to narrow down by model
+        - Apply inline filters for more granular control
+        - Switch to "DrillDown" tab for a raw table view
+        """)
+
     # Inline filters for styled leaderboard
     with gr.Row():
         with gr.Column(scale=1):
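
The Leaderboard guide text maps metric colors to performance bands but does not state the cut-offs. Purely as an illustration, the mapping could look like the sketch below. The function name `performance_tier` and the 80/50 thresholds are assumptions chosen for the example, not values taken from TraceMind-AI.

```python
def performance_tier(success_rate: float, good: float = 80.0, fair: float = 50.0) -> str:
    """Map a success rate in percent to a color band.

    The default thresholds are hypothetical; the commit does not define them.
    """
    if not 0.0 <= success_rate <= 100.0:
        raise ValueError(f"success_rate must be within 0-100, got {success_rate}")
    if success_rate >= good:
        return "green"   # excellent performance
    if success_rate >= fair:
        return "yellow"  # average performance
    return "red"         # needs improvement
```

Keeping the thresholds as parameters makes it easy to use different bands for metrics where lower is better, such as cost or duration, by inverting the comparison in a sibling function.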
@@ -1091,6 +1117,38 @@ with gr.Blocks(title="TraceMind-AI", theme=theme) as app:
 with gr.TabItem("DrillDown"):
     gr.Markdown("*Click any row to view detailed run information*")
 
+    # User Guide Accordion
+    with gr.Accordion("How to Use DrillDown", open=False):
+        gr.Markdown("""
+        ### Data Table View
+
+        **What is this tab?**
+        The DrillDown tab provides a raw, sortable table view of all evaluation runs with full details.
+
+        **How to use it:**
+        - **Table Format**: Clean, spreadsheet-like view of all runs
+        - **Filters**: Apply agent type, provider, and sorting controls
+        - 📥 **Export Ready**: Easy to copy/paste data for reports
+        - **Click Rows**: Click any row to navigate to detailed run view
+        - 🔢 **All Metrics**: Shows run ID, model, success rate, cost, duration, and more
+
+        **Columns Explained:**
+        - **Run ID**: Unique identifier for each evaluation
+        - **Model**: AI model that was evaluated
+        - **Agent Type**: tool (function calling), code (code execution), or both
+        - **Provider**: litellm (API models) or transformers (local models)
+        - **Success Rate**: Percentage of test cases passed
+        - **Tests**: Number of test cases executed
+        - **Duration**: Average execution time in milliseconds
+        - **Cost**: Total cost in USD for this run
+        - **Submitted By**: HuggingFace username of evaluator
+
+        **Tips:**
+        - Use this for detailed data analysis
+        - Combine with sidebar filters for focused views
+        - Sort by any column to find best/worst performers
+        """)
+
     # Inline filters for drilldown table
     with gr.Row():
         with gr.Column(scale=1):
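
The "Columns Explained" list in the DrillDown guide amounts to a row schema, and the last tip ("sort by any column") is a one-line operation on it. A sketch under stated assumptions: the class `RunRow`, its field names, and `sort_runs` are hypothetical, derived only from the column descriptions in the guide text, and say nothing about how app.py actually stores runs.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class RunRow:
    """One DrillDown row; field names are illustrative, not from app.py."""
    run_id: str          # unique identifier for the evaluation
    model: str           # model that was evaluated
    agent_type: str      # "tool", "code", or "both"
    provider: str        # "litellm" or "transformers"
    success_rate: float  # percentage of test cases passed
    tests: int           # number of test cases executed
    duration_ms: float   # average execution time in milliseconds
    cost_usd: float      # total cost of the run in USD
    submitted_by: str    # Hugging Face username of the evaluator


def sort_runs(rows: list[RunRow], column: str, descending: bool = True) -> list[RunRow]:
    """Return rows ordered by the named column without mutating the input."""
    return sorted(rows, key=attrgetter(column), reverse=descending)
```

For example, `sort_runs(rows, "success_rate")[0]` would give the best performer, and `sort_runs(rows, "cost_usd", descending=False)[0]` the cheapest run.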
@@ -1131,6 +1189,41 @@ with gr.Blocks(title="TraceMind-AI", theme=theme) as app:
     )
 
 with gr.TabItem("Trends"):
+    # User Guide Accordion
+    with gr.Accordion("How to Read Trends", open=False):
+        gr.Markdown("""
+        ### Temporal Performance Analysis
+
+        **What is this tab?**
+        The Trends tab visualizes how model performance evolves over time, helping you identify patterns and improvements.
+
+        **How to read it:**
+        - 📅 **X-axis**: Timeline showing when evaluations were run
+        - **Y-axis**: Performance metrics (success rate, cost, duration, etc.)
+        - **Line Charts**: Each line represents a different model
+        - 🎨 **Color Coding**: Different colors for different models
+        - **Interactive**: Hover over points to see exact values
+
+        **What to look for:**
+        - **Upward trends** = Model improvements over time
+        - **Downward trends** = Performance degradation (needs investigation)
+        - **Flat lines** = Consistent performance
+        - **Spikes** = Anomalies or special test conditions
+        - **Gaps** = Periods without evaluations
+
+        **Use cases:**
+        - Track model version improvements
+        - Identify when performance degraded
+        - Compare model evolution over time
+        - Spot patterns in cost or latency changes
+        - Validate optimization efforts
+
+        **Tips:**
+        - Use sidebar filters to focus on specific models
+        - Look for correlation between cost and accuracy
+        - Identify best time periods for each model
+        """)
+
     trends_plot = gr.Plot()
 
 with gr.TabItem("Analytics"):
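
The "What to look for" list in the Trends guide (upward, downward, flat, gaps) can be made concrete with a small amount of arithmetic. The sketch below is one possible reading, not the logic behind `trends_plot`: `classify_trend` and `find_gaps` are hypothetical names, and the slope tolerance and seven-day gap limit are assumed values.

```python
from datetime import date, timedelta


def classify_trend(values: list[float], tolerance: float = 0.5) -> str:
    """Label a series 'upward', 'downward', or 'flat' from its least-squares slope.

    Points are assumed to be evenly spaced and in chronological order.
    """
    n = len(values)
    if n < 2:
        return "flat"
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    slope = covariance / variance  # metric change per evaluation
    if slope > tolerance:
        return "upward"
    if slope < -tolerance:
        return "downward"
    return "flat"


def find_gaps(run_dates: list[date], max_gap: timedelta = timedelta(days=7)) -> list[tuple[date, date]]:
    """Return consecutive date pairs that are further apart than max_gap."""
    ordered = sorted(run_dates)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]
```

Note that "upward = improvement" holds for success rate only; for cost or duration an upward slope is a regression, so the label has to be interpreted per metric.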
@@ -1190,19 +1283,39 @@ with gr.Blocks(title="TraceMind-AI", theme=theme) as app:
     """, elem_id="viz-explanation")
 
 with gr.TabItem("📥 Summary Card"):
-[13 removed lines, old lines 1193-1205; content not captured in this extract]
+    # User Guide Accordion
+    with gr.Accordion("How to Create Summary Cards", open=False):
+        gr.Markdown("""
+        ### 📥 Downloadable Leaderboard Summary Card
+
+        **What is this tab?**
+        Generate professional, shareable summary cards with top performers and key statistics.
+        Perfect for presentations, reports, and sharing results with your team!
+
+        **How to use it:**
+        1. **Select Top N**: Use the slider to choose how many top models to include (1-5)
+        2. **Generate Preview**: Click "Generate Card Preview" to see the card
+        3. **Download**: Click "Download as PNG" to save as high-quality image
+        4. **Share**: Use the downloaded image in presentations, reports, or social media
+
+        **Card Features:**
+        - **Medal Indicators**: Gold, silver, bronze for top 3 performers
+        - **Key Metrics**: Success rate, cost, duration, and tokens per model
+        - **Aggregate Stats**: Overall leaderboard statistics at a glance
+        - 🎨 **TraceMind Branding**: Professional design with logo
+        - 📥 **High Quality**: PNG format suitable for presentations
+
+        **Best Practices:**
+        - Use 3-5 models for balanced card density
+        - Include metric context in your presentations
+        - Update cards regularly to reflect latest results
+        - Combine with detailed reports for stakeholders
+
+        **Tips:**
+        - Cards are automatically sized for readability
+        - All current sidebar filters are applied
+        - Cards update dynamically as data changes
+        """)
 
     with gr.Row():
         with gr.Column(scale=1):
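
The Summary Card guide describes a Top N selection (1 to 5 models) with gold, silver, and bronze medals for the first three. A minimal sketch of that selection step, assuming ranking by success rate; the function `top_performers`, the `MEDALS` tuple, and the input shape are hypothetical and not taken from the card generator in app.py.

```python
from typing import Optional

MEDALS = ("gold", "silver", "bronze")  # only the top 3 receive a medal


def top_performers(
    success_by_model: dict[str, float], top_n: int = 3
) -> list[tuple[str, float, Optional[str]]]:
    """Return (model, success_rate, medal) for the best top_n models.

    top_n is limited to 1-5 to mirror the slider range in the guide text.
    """
    if not 1 <= top_n <= 5:
        raise ValueError(f"top_n must be between 1 and 5, got {top_n}")
    ranked = sorted(success_by_model.items(), key=lambda item: item[1], reverse=True)
    return [
        (model, rate, MEDALS[position] if position < len(MEDALS) else None)
        for position, (model, rate) in enumerate(ranked[:top_n])
    ]
```

Ties keep their input order because `sorted` is stable; a real card would probably need an explicit tie-break such as lower cost.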
@@ -1223,6 +1336,43 @@ with gr.Blocks(title="TraceMind-AI", theme=theme) as app:
     card_preview = gr.HTML(label="Card Preview", value="<p style='text-align: center; color: #666; padding: 40px;'>Click 'Generate Card Preview' to see your summary card</p>")
 
 with gr.TabItem("🤖 AI Insights"):
+    # User Guide Accordion
+    with gr.Accordion("About AI Insights", open=False):
+        gr.Markdown("""
+        ### 🤖 LLM-Powered Leaderboard Analysis
+
+        **What is this tab?**
+        AI Insights provides intelligent, natural language analysis of your leaderboard data using advanced language models.
+        Get instant insights, trends, and recommendations powered by AI.
+
+        **How it works:**
+        - **Automatic Analysis**: AI analyzes all leaderboard data automatically
+        - **Streaming Responses**: Watch insights generate in real-time (Gradio 6)
+        - 🎯 **Smart Recommendations**: Get actionable advice for model selection
+        - **Trend Detection**: AI identifies patterns and anomalies
+        - 💡 **Context-Aware**: Insights adapt to current filters and data
+
+        **What insights you'll get:**
+        - **Top Performers**: Which models lead in accuracy, speed, cost
+        - **Trade-offs**: Cost vs accuracy, speed vs quality analysis
+        - **Recommendations**: Best model for different use cases
+        - **Trends**: Performance changes over time
+        - **Anomalies**: Unusual results that need attention
+        - **Optimization Tips**: How to improve evaluation strategies
+
+        **Powered by:**
+        - 🤖 **MCP Servers**: Model Context Protocol for intelligent data access
+        - 🧠 **Advanced LLMs**: Google Gemini 1.5 Pro for analysis
+        - 📡 **Real-time Streaming**: Gradio 6 for live response generation
+        - **Context Integration**: Understands your full leaderboard context
+
+        **Tips:**
+        - Click "Regenerate" for updated insights after data changes
+        - Insights respect your sidebar and inline filters
+        - Use insights to guide model selection decisions
+        - Share AI insights in team discussions
+        """)
+
     with gr.Row():
         regenerate_btn = gr.Button("Regenerate Insights (Streaming)", size="sm", variant="secondary")
         gr.Markdown("*Real-time AI analysis powered by Gradio 6 streaming*", elem_classes=["text-sm"])
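
The AI Insights guide mentions streaming responses. In Gradio, an event handler written as a generator streams each yielded value to its output component, so the usual pattern is to yield the accumulated text after every chunk. The sketch below shows only that accumulation step; `stream_insights` is a hypothetical name, and the chunk source stands in for whatever the MCP/LLM call actually returns, which this diff does not show.

```python
from typing import Iterable, Iterator


def stream_insights(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the text accumulated so far after each incoming chunk.

    Each yielded string replaces the previous one in the output component,
    which is why the full running text is yielded rather than the delta.
    """
    text = ""
    for chunk in chunks:
        text += chunk
        yield text
```

Wiring it up would look roughly like `regenerate_btn.click(fn=handler, outputs=insights_box)`, where `handler` wraps `stream_insights` around the model's chunk iterator and `insights_box` is an assumed output component not named in this hunk.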