AI Recommendation Engine

Repository: bridges-ai-engine-source-code/oim-ai-recommandation-engine-staging

The AI engine is the platform's core service. It is a FastAPI application that indexes opportunity documents into a Qdrant vector store and generates personalized, explainable recommendations using Google Gemini (accessed via OpenRouter).

Module Layout

Module            | Responsibility
main.py           | FastAPI application; exposes all REST endpoints (titled “RAG Vector Knowledge Base API”, version 1.0.0).
gemini_agent.py   | Orchestrates recommendation generation by combining vector search with generative AI (Gemini via OpenRouter).
qdrant_kb.py      | Interface to Qdrant for semantic vector search; handles Ollama bge-m3 embeddings (1024 dimensions) and chunking (max 350 tokens).
agent_memory.py   | PostgreSQL persistence layer with an optimized connection pool (v2.0).
requirements.txt  | Python dependencies.
Dockerfile        | Container build for deployment.

FastAPI Application (main.py)

The main entry point exposes REST endpoints for document upload and indexing, recommendation generation, and analytics. Key endpoints:

Endpoint                          | Purpose
POST /documents/upload            | Upload and index a single document
POST /documents/index-json-batch  | Index a batch of opportunities provided as JSON
POST /recommendations/get22       | Generate recommendations for a single user
POST /recommendations/get22-batch | Generate recommendations for a batch of users
GET /analytics/*                  | Various platform statistics

See API Reference for full request/response schemas.

Gemini Agent (gemini_agent.py)

The agent orchestrates recommendation generation by combining vector search and the generative model. It supports the following opportunity types:

  • PERMANENT_EMPLOYMENT — permanent job

  • TEMPORARY_EMPLOYMENT — temporary job

  • VOCATIONAL_TRAINING — professional training

  • ENTREPRENEURSHIP — Income Generating Activity (IGA)

  • INTERNSHIP — professional internship

For each request, the agent retrieves candidate opportunities from Qdrant using the beneficiary’s profile, then prompts Gemini to rank and explain the most relevant matches, returning a relevance_score (0.0–1.0) and a human-readable reason for each.
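The ranking step described above can be illustrated with a minimal sketch of how the model's answer might be validated before it is returned. This is not the code in gemini_agent.py: the function name parse_ranked_matches and the opportunity_id and opportunity_type keys are assumptions made for illustration. Only the five type values, relevance_score, and reason come from this document.

```python
import json
from enum import Enum


class OpportunityType(str, Enum):
    """The five opportunity types listed above."""
    PERMANENT_EMPLOYMENT = "PERMANENT_EMPLOYMENT"
    TEMPORARY_EMPLOYMENT = "TEMPORARY_EMPLOYMENT"
    VOCATIONAL_TRAINING = "VOCATIONAL_TRAINING"
    ENTREPRENEURSHIP = "ENTREPRENEURSHIP"
    INTERNSHIP = "INTERNSHIP"


def parse_ranked_matches(model_output: str) -> list[dict]:
    """Validate and sort a JSON list of matches (hypothetical response shape)."""
    valid_types = {t.value for t in OpportunityType}
    matches = []
    for item in json.loads(model_output):
        # Skip entries whose type is not one of the supported values.
        if item.get("opportunity_type") not in valid_types:
            continue
        # Clamp the score into the documented 0.0-1.0 range.
        score = min(1.0, max(0.0, float(item.get("relevance_score", 0.0))))
        matches.append({
            "opportunity_id": item.get("opportunity_id"),  # assumed key
            "opportunity_type": item["opportunity_type"],  # assumed key
            "relevance_score": score,
            "reason": str(item.get("reason", "")),
        })
    # Highest relevance first.
    return sorted(matches, key=lambda m: m["relevance_score"], reverse=True)
```

Clamping and filtering on the service side means a malformed model answer degrades to fewer results rather than to an out-of-range score reaching the caller.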

PostgreSQL Memory (agent_memory.py)

Manages persistence in PostgreSQL using an optimized connection pool (v2.0). Main tables consumed by the engine:

Table           | Content
user_profiles   | Beneficiary profiles (stored as JSONB)
recommendations | Recommendation history and interaction tracking
interactions    | Conversation history (reserved for future use)

The v2.0 improvements (connection pooling, TCP keepalive, automatic reconnection, configurable SSL, and a stated 2–5× performance gain) are described in Configuration.
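As a rough illustration of how TCP keepalive and configurable SSL are typically expressed for PostgreSQL, the sketch below builds a dictionary of standard libpq connection keywords. It is an assumption that the engine is configured this way: the function name pg_connection_kwargs and the keepalive values are illustrative only, and the actual settings are those documented in Configuration.

```python
import os


def pg_connection_kwargs(env=None) -> dict:
    """Build libpq-style connection keywords (illustrative values only)."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "localhost"),
        "port": int(env.get("PGPORT", "5432")),
        "dbname": env.get("PGDATABASE", "postgres"),
        "user": env.get("PGUSER", "postgres"),
        "password": env.get("PGPASSWORD", ""),
        # Configurable SSL: "disable", "prefer", "require", "verify-full", ...
        "sslmode": env.get("PGSSLMODE", "prefer"),
        # TCP keepalive: probe idle connections so dead ones are detected.
        "keepalives": 1,
        "keepalives_idle": 30,      # seconds of idleness before the first probe
        "keepalives_interval": 10,  # seconds between probes
        "keepalives_count": 5,      # failed probes before the link is dropped
    }
```

If the driver were psycopg2 (which this document does not state), these keywords could be passed through to a pool such as psycopg2.pool.ThreadedConnectionPool(minconn, maxconn, **kwargs).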

Qdrant Knowledge Base (qdrant_kb.py)

Provides the interface to Qdrant for semantic vector search. It uses Ollama bge-m3 embeddings (1024 dimensions) and chunks documents to a maximum of 350 tokens before indexing. Collection statistics are available via GET /collections/{name}/stats.
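The 350-token limit can be pictured with a minimal chunking sketch. It approximates tokens by splitting on whitespace, which is an assumption: the real module presumably counts tokens with the embedding model's own tokenizer, so its chunk boundaries will differ. The function name chunk_text is hypothetical.

```python
def chunk_text(text: str, max_tokens: int = 350) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated tokens."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be positive")
    tokens = text.split()
    # Walk the token list in fixed-size windows; the last chunk may be shorter.
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]
```

Each resulting chunk would then be embedded separately and stored as its own vector, so a long document produces several searchable entries.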

Indexing Workflow

  1. Prepare documents (Word, PDF, JSON).

  2. Upload via /documents/upload or /documents/index-json-batch.

  3. The engine automatically extracts text, classifies it (GREEN/YELLOW/RED), and vectorizes it.

  4. Verify with GET /collections/{name}/stats.

  5. Opportunities are now searchable.
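Steps 2 and 4 of the indexing workflow can be sketched with the standard library alone. The base URL, the collection name, and the shape of the JSON payload are assumptions made for illustration. The authoritative request schema is the one in API Reference.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical deployment address


def build_batch_index_request(opportunities: list[dict]) -> urllib.request.Request:
    """Prepare POST /documents/index-json-batch (payload shape is assumed)."""
    body = json.dumps(opportunities).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/documents/index-json-batch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_stats_request(collection: str) -> urllib.request.Request:
    """Prepare GET /collections/{name}/stats to verify the indexing result."""
    name = urllib.parse.quote(collection, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/collections/{name}/stats", method="GET"
    )
```

Either request can be sent with urllib.request.urlopen(request). Building the request separately from sending it keeps the sketch easy to inspect without a running service.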

Recommendation Workflow

  1. Retrieve the beneficiary profile.

  2. Format it as JSON according to the schema in API Reference.

  3. POST /recommendations/get22.

  4. Receive 5–10 ranked recommendations.

  5. Recommendations are auto-saved to the database.

  6. Display the results with their metadata.
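A client for steps 3 to 6 might look like the following sketch. The base URL, the "recommendations" response key, and the opportunity_id field are assumptions. Only the endpoint path, relevance_score, and reason are taken from this document, and the real schemas are in API Reference.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical deployment address


def build_recommendation_request(profile: dict) -> urllib.request.Request:
    """Prepare POST /recommendations/get22 for one beneficiary profile."""
    return urllib.request.Request(
        f"{BASE_URL}/recommendations/get22",
        data=json.dumps(profile).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_recommendations(profile: dict, timeout: float = 30.0) -> list[dict]:
    """Send the request and return the ranked list (response key is assumed)."""
    request = build_recommendation_request(profile)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("recommendations", [])


def format_for_display(recommendations: list[dict]) -> list[str]:
    """Render one line per recommendation, keeping the order received."""
    lines = []
    for rank, rec in enumerate(recommendations, start=1):
        score = float(rec.get("relevance_score", 0.0))
        identifier = rec.get("opportunity_id", "?")  # assumed field name
        lines.append(f"{rank}. {identifier} ({score:.2f}): {rec.get('reason', '')}")
    return lines
```

Because the engine saves recommendations itself (step 5), the client only needs to send the profile and render what comes back.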