rohit committed on
Commit b7b8e60 · 1 Parent(s): 8946f02
Files changed (2)
  1. api.md +221 -0
  2. pytest.ini +1 -5
api.md ADDED
@@ -0,0 +1,221 @@
1
+ # RAG Pipeline API Documentation
2
+
3
+ ## Overview
4
+ A FastAPI-based RAG (Retrieval-Augmented Generation) pipeline that integrates a GLM model via OpenRouter for intelligent tool calling.
5
+
6
+ ## Base URL
7
+ ```
8
+ http://localhost:8000
9
+ ```
10
+
11
+ ## Endpoints
12
+
13
+ ### `/chat` - Main Chat Endpoint
14
+ **Method:** `POST`
15
+ **Description:** Intelligent chat with RAG tool calling. GLM automatically determines when to use RAG vs. general conversation.
16
+
17
+ #### Request Body
18
+ ```json
19
+ {
20
+ "messages": [
21
+ {
22
+ "role": "user|assistant|system",
23
+ "content": "string"
24
+ }
25
+ ]
26
+ }
27
+ ```
28
+
29
+ #### Response Format
30
+ ```json
31
+ {
32
+ "response": "string",
33
+ "tool_calls": [
34
+ {
35
+ "name": "rag_qa",
36
+ "arguments": "{\"question\": \"string\", \"dataset\": \"string\"}"
37
+ }
38
+ ] | null
39
+ }
40
+ ```
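As a rough illustration of the request format above, the following Python sketch builds the JSON body for `POST /chat`. The helper name `build_chat_payload` and the role check are assumptions made for this example; the source only specifies the `messages` array with `role` and `content` fields.

```python
import json

# Roles listed in the request schema above.
ALLOWED_ROLES = {"user", "assistant", "system"}


def build_chat_payload(messages):
    """Validate (role, content) pairs and return the JSON body for POST /chat.

    Hypothetical helper: the API itself only documents the body shape.
    """
    body = {"messages": []}
    for role, content in messages:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        if not isinstance(content, str):
            raise TypeError("content must be a string")
        body["messages"].append({"role": role, "content": content})
    return json.dumps(body)


payload = build_chat_payload([("user", "hi")])
print(payload)
```

The resulting string can be passed as the `-d` argument of a `curl` call or as the body of any HTTP client request with `Content-Type: application/json`.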
41
+
42
+ #### Examples
43
+
44
+ **1. General Greeting (No RAG):**
45
+ ```bash
46
+ curl -X POST http://localhost:8000/chat \
47
+ -H "Content-Type: application/json" \
48
+ -d '{"messages":[{"role":"user","content":"hi"}]}'
49
+ ```
50
+
51
+ **Response:**
52
+ ```json
53
+ {
54
+ "response": "Hi! I'm Rohit's AI assistant. I can help you learn about his professional background, skills, and experience. What would you like to know about Rohit?",
55
+ "tool_calls": null
56
+ }
57
+ ```
58
+
59
+ **2. Portfolio Question (RAG Enabled):**
60
+ ```bash
61
+ curl -X POST http://localhost:8000/chat \
62
+ -H "Content-Type: application/json" \
63
+ -d '{"messages":[{"role":"user","content":"What is your current role?"}]}'
64
+ ```
65
+
66
+ **Response:**
67
+ ```json
68
+ {
69
+ "response": "Based on the portfolio information, Rohit is currently working as a Tech Lead at FleetEnable, where he leads UI development for a logistics SaaS product focused on drayage and freight management...",
70
+ "tool_calls": [
71
+ {
72
+ "name": "rag_qa",
73
+ "arguments": "{\"question\": \"What is your current role?\"}"
74
+ }
75
+ ]
76
+ }
77
+ ```
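Note that `arguments` is itself a JSON-encoded string, so a client has to decode it separately from the response body. A minimal sketch, assuming the response has already been parsed into a dict (the function name `extract_tool_calls` and the abbreviated sample text are illustrative only):

```python
import json


def extract_tool_calls(response_body):
    """Return (answer, calls); each call is a (name, decoded-arguments) pair."""
    calls = []
    # `tool_calls` is null when the model answered without using RAG.
    for call in response_body.get("tool_calls") or []:
        # `arguments` is a JSON string inside the JSON body: decode it again.
        calls.append((call["name"], json.loads(call["arguments"])))
    return response_body["response"], calls


# Abbreviated copy of the example response above.
sample = {
    "response": "Based on the portfolio information, ...",
    "tool_calls": [
        {"name": "rag_qa", "arguments": "{\"question\": \"What is your current role?\"}"}
    ],
}
answer, calls = extract_tool_calls(sample)
print(calls)
```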
78
+
79
+ ### `/health` - Health Check
80
+ **Method:** `GET`
81
+ **Description:** Check API and dataset loading status.
82
+
83
+ #### Response
84
+ ```json
85
+ {
86
+ "status": "healthy",
87
+ "datasets_loaded": 1,
88
+ "available_datasets": ["developer-portfolio"]
89
+ }
90
+ ```
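A client might use this payload as a readiness check before sending portfolio questions. The sketch below is one possible interpretation; the function name `is_ready` and the rule "healthy and the dataset is listed" are assumptions, not behaviour defined by the API.

```python
def is_ready(health, dataset="developer-portfolio"):
    """True when the /health payload reports a healthy API that lists `dataset`."""
    return (
        health.get("status") == "healthy"
        and dataset in health.get("available_datasets", [])
    )


# Payload copied from the example response above.
example = {
    "status": "healthy",
    "datasets_loaded": 1,
    "available_datasets": ["developer-portfolio"],
}
print(is_ready(example))
```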
91
+
92
+ ### `/datasets` - List Available Datasets
93
+ **Method:** `GET`
94
+ **Description:** Get the list of available datasets.
95
+
96
+ #### Response
97
+ ```json
98
+ {
99
+ "datasets": ["developer-portfolio"]
100
+ }
101
+ ```
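The dataset list can be used to validate a dataset name on the client side before asking a question. This is only a sketch: `check_dataset` is a hypothetical helper, and the message it returns imitates the "Dataset Not Found" text documented under Error Handling rather than reproducing the server's exact output.

```python
def check_dataset(name, available):
    """Return None if `name` is available, otherwise an explanatory message."""
    if name in available:
        return None
    return f"Dataset '{name}' not available. Available datasets: {available}"


available = ["developer-portfolio"]  # value from the /datasets example above
print(check_dataset("xyz", available))
```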
102
+
103
+ ## Features
104
+
105
+ ### 🧠 Intelligent Tool Calling
106
+ - **Automatic Detection:** GLM determines when questions need RAG vs. general conversation
107
+ - **Context-Aware:** Uses portfolio information for relevant questions
108
+ - **Natural Responses:** Synthesizes RAG results into conversational answers
109
+
110
+ ### 🎯 Third-Person AI Assistant
111
+ - **Portfolio Focus:** Responds about Rohit's experience (not "my" experience)
112
+ - **Professional Tone:** Maintains proper third-person references
113
+ - **Context Integration:** Combines multiple data points coherently
114
+
115
+ ### ⚡ Performance Optimizations
116
+ - **On-Demand Loading:** Datasets load only when RAG is needed
117
+ - **Clean Output:** No verbose ML logging for general conversations
118
+ - **Fast Responses:** Sub-second for greetings, ~20s for first RAG query
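On-demand loading can be pictured as a cached loader that only runs the first time a RAG question arrives. The sketch below uses `functools.lru_cache` for the cache; the function names, the `needs_rag` flag, and the placeholder dataset contents are assumptions for illustration and do not describe the actual implementation.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_dataset(name):
    """Load a dataset on first use and reuse the cached copy afterwards."""
    # Placeholder for the slow work (reading documents, building an index).
    return {"name": name, "documents": []}


def handle_message(text, needs_rag):
    """Only touch the dataset when the message actually needs RAG."""
    if not needs_rag:
        return "greeting path: no dataset touched"
    dataset = get_dataset("developer-portfolio")
    return f"rag path: using {dataset['name']}"


print(handle_message("hi", needs_rag=False))
print(handle_message("What is your current role?", needs_rag=True))
```

Under this design a greeting never pays the loading cost, and only the first RAG query does, which is consistent with the latency figures quoted above.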
119
+
120
+ ## Available Datasets
121
+
122
+ ### `developer-portfolio`
123
+ - **Content:** Work experience, skills, projects, achievements
124
+ - **Topics:** FleetEnable, Coditude, technologies, leadership
125
+ - **Size:** 19 documents with full metadata
126
+
127
+ ## Error Handling
128
+
129
+ ### Common Responses
130
+ - **Datasets Loading:** "RAG Pipeline is running but datasets are still loading..."
131
+ - **Dataset Not Found:** "Dataset 'xyz' not available. Available datasets: [...]"
132
+ - **API Errors:** HTTP 500 with error details
133
+
134
+ ### Status Codes
135
+ - `200` - Success
136
+ - `400` - Bad Request (invalid JSON, missing fields)
137
+ - `500` - Internal Server Error
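A client could map these outcomes to coarse actions as sketched below. The function name and the returned labels are hypothetical, and treating the "datasets are still loading" notice as part of a `200` response is an assumption, since the source does not say which status code carries it.

```python
def classify_response(status_code, body_text=""):
    """Map a /chat HTTP result to a coarse outcome a client can act on."""
    if status_code == 200:
        # Assumption: the loading notice arrives with a 200 status.
        if "datasets are still loading" in body_text:
            return "retry_later"
        return "ok"
    if status_code == 400:
        return "fix_request"  # invalid JSON or missing fields
    if status_code == 500:
        return "server_error"
    return "unexpected"


print(classify_response(400))
```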
138
+
139
+ ## Environment Variables
140
+
141
+ Create a `.env` file:
142
+ ```bash
143
+ OPENROUTER_API_KEY=sk-or-v1-your-key-here
144
+ PORT=8000
145
+ TOKENIZERS_PARALLELISM=false
146
+ ```
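For illustration, the variables above could be read in Python roughly as follows. This sketch assumes they are already present in the process environment (it does not parse the `.env` file itself); the function name `load_settings` and the defaults for `PORT` and `TOKENIZERS_PARALLELISM` are assumptions based on the sample values.

```python
import os


def load_settings(env=os.environ):
    """Read the three documented variables from a mapping such as os.environ."""
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {
        "api_key": api_key,
        # Assumed defaults, taken from the sample .env above.
        "port": int(env.get("PORT", "8000")),
        "tokenizers_parallelism": env.get("TOKENIZERS_PARALLELISM", "false"),
    }
```

Passing the mapping as a parameter keeps the function easy to test, for example `load_settings({"OPENROUTER_API_KEY": "sk-or-v1-your-key-here"})`.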
147
+
148
+ ## Development
149
+
150
+ ### Running Locally
151
+ ```bash
152
+ # Install dependencies
153
+ pip install -r requirements.txt
154
+
155
+ # Start server
156
+ python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
157
+
158
+ # Or use the start script
159
+ ./start.sh
160
+ ```
161
+
162
+ ### Testing
163
+ ```bash
164
+ # Health check
165
+ curl http://localhost:8000/health
166
+
167
+ # Chat test
168
+ curl -X POST http://localhost:8000/chat \
169
+ -H "Content-Type: application/json" \
170
+ -d '{"messages":[{"role":"user","content":"hi"}]}'
171
+ ```
172
+
173
+ ## Deployment
174
+
175
+ ### Docker
176
+ ```bash
177
+ # Build
178
+ docker build -t rag-pipeline .
179
+
180
+ # Run
181
+ docker run -p 8000:8000 rag-pipeline
182
+ ```
183
+
184
+ ### Hugging Face Spaces
185
+ 1. Push code to repository
186
+ 2. Connect Space to repository
187
+ 3. Set environment variables in Space settings
188
+ 4. The Space deploys automatically from the `main` branch
189
+
190
+ ## Architecture
191
+
192
+ ```
193
+ OpenRouter GLM-4.5-air (Parent AI)
194
+ ├── Tool Calling Logic
195
+ │   ├── Automatically detects RAG-worthy questions
196
+ │   └── Falls back to general knowledge
197
+ ├── RAG Tool Function
198
+ │   ├── Dataset selection (developer-portfolio)
199
+ │   ├── Document retrieval
200
+ │   └── Context formatting
201
+ └── Response Generation
202
+     ├── Tool results integration
203
+     └── Natural language responses
204
+ ```
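The flow in the diagram can be sketched as a small dispatch step: the model either replies directly or requests the `rag_qa` tool, whose output is then handed back for response generation. Everything below is a stand-in under stated assumptions: the toy document, the `TOOLS` registry, and `run_turn` are hypothetical, and the real service would call the OpenRouter model where this sketch takes a ready-made decision dict.

```python
import json


def rag_qa(question, dataset="developer-portfolio"):
    """Stand-in for the RAG tool: select dataset, retrieve, format context."""
    documents = {"developer-portfolio": ["Tech Lead at FleetEnable"]}  # toy data
    hits = documents.get(dataset, [])
    return "\n".join(f"- {doc}" for doc in hits)


TOOLS = {"rag_qa": rag_qa}  # hypothetical tool registry


def run_turn(model_decision):
    """Handle one model decision: a direct reply or a list of tool calls."""
    tool_calls = model_decision.get("tool_calls")
    if not tool_calls:
        return model_decision["response"]  # general-conversation fallback
    contexts = []
    for call in tool_calls:
        args = json.loads(call["arguments"])  # arguments are a JSON string
        contexts.append(TOOLS[call["name"]](**args))
    # The real service would send these results back to the model for synthesis.
    return "Context for the model:\n" + "\n".join(contexts)


decision = {
    "response": "",
    "tool_calls": [
        {"name": "rag_qa", "arguments": "{\"question\": \"What is your current role?\"}"}
    ],
}
print(run_turn(decision))
```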
205
+
206
+ ## Changelog
207
+
208
+ ### v2.0 - Current
209
+ - ✅ OpenRouter GLM integration with tool calling
210
+ - ✅ Intelligent RAG vs. conversation detection
211
+ - ✅ Third-person AI assistant for Rohit's portfolio
212
+ - ✅ On-demand dataset loading
213
+ - ✅ Removed `/answer` endpoint (use `/chat` only)
214
+ - ✅ Environment variable configuration
215
+ - ✅ Performance optimizations
216
+
217
+ ### v1.0 - Legacy
218
+ - Google Gemini integration
219
+ - Multiple endpoints (`/answer`, `/chat`)
220
+ - Background dataset loading
221
+ - First-person responses
pytest.ini CHANGED
@@ -3,8 +3,4 @@ testpaths = .
3
  python_files = test_*.py
4
  python_classes = Test*
5
  python_functions = test_*
6
- addopts = -v --tb=short
7
- markers =
8
- slow: marks tests as slow (deselect with '-m "not slow"')
9
- integration: marks tests as integration tests
10
- unit: marks tests as unit tests
6
+ addopts = -v --tb=short