Go Deeper
A complete AI platform that captures every question, validates every answer, and carries the project through to first delivery. Four-phase delivery:
1) Intake: A guided Streamlit wizard collects source systems, requirements, SLAs, stakeholders, DQ rules, and the PII strategy, all stored in Databricks Unity Catalog or PostgreSQL with a full audit trail.
2) Validate: AI agents assess completeness, flag gaps, and engage stakeholders before a single line of DDL is written.
3) Architect: Automated architecture design grounded in the captured requirements: source-to-target mappings, medallion layers, and partition strategies.
4) Generate: An 8-agent pipeline produces DDL scripts, pytest suites, SDLC artifacts, and documentation: a first delivery, ready for iteration.
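The four phases above can be sketched as an ordered pipeline with a validation gate, so no DDL is generated while gaps remain open. A minimal sketch; the phase functions and context keys are illustrative, not NiData's actual API:

```python
from typing import Callable, Dict, List

# Each phase takes the shared delivery context and returns an updated copy.
Phase = Callable[[Dict], Dict]

def run_delivery(phases: List[Phase], context: Dict) -> Dict:
    """Run phases in order; halt before generation if validation flagged gaps."""
    for phase in phases:
        context = phase(context)
        if context.get("gaps"):            # validation found open questions
            context["status"] = "blocked"  # no DDL is written until gaps close
            break
    else:
        context["status"] = "delivered"
    return context

# Illustrative stand-ins for the intake/validate/architect/generate phases.
intake    = lambda c: {**c, "requirements": ["slas", "pii_strategy"]}
validate  = lambda c: {**c, "gaps": [] if c.get("requirements") else ["missing intake"]}
architect = lambda c: {**c, "design": "medallion"}
generate  = lambda c: {**c, "artifacts": ["ddl", "pytest", "docs"]}

result = run_delivery([intake, validate, architect, generate], {})
```

The gate is the point of the sketch: a run with an empty intake stops at `status == "blocked"` before the generate phase ever executes.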
Stack Used
NiData is built on a dual-deployment architecture (Databricks and Docker) that shares a core foundation. The full tech stack, broken down by component:
**Core Languages & Frontend**
* **Python (3.8+):** The primary language used across the platform, driving everything from agent orchestration to the knowledge graph.
* **SQL:** Used for both Unity Catalog DDL (Delta Lake) and standard PostgreSQL schemas.
* **Streamlit:** Powers the main 9-step wizard web UI, the artifact viewer, and the admin panel for reference data management.
**AI Models & Orchestration Engine**
* **LLMs:** The default model is Llama 3.3 70B (hosted via Databricks Model Serving). The platform can also be configured to use OpenAI GPT-4 and Anthropic Claude via their APIs.
* **Agent Orchestration:** A Python-based, config-driven engine using LangGraph-style orchestration to manage the 8-agent legacy pipeline and the 7-agent sequential delivery pipeline.
* **Job Orchestration:** Can run on Databricks Workflows or Apache Airflow.
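The config-driven, LangGraph-style engine reduces to a simple idea: named agents that read and update a shared state in sequence. A minimal sketch; the agent names, state keys, and `Pipeline` class are illustrative assumptions, not the platform's real orchestration API:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Agent:
    name: str
    run: Callable[[Dict], Dict]  # takes shared state, returns its updates

@dataclass
class Pipeline:
    agents: List[Agent]
    state: Dict = field(default_factory=dict)

    def execute(self) -> Dict:
        """Run each agent in order, merging its updates into the shared state."""
        for agent in self.agents:
            self.state.update(agent.run(self.state))
            self.state.setdefault("trace", []).append(agent.name)
        return self.state

# Two toy agents standing in for e.g. the DDL and documentation agents.
ddl_agent  = Agent("ddl",  lambda s: {"ddl": "CREATE TABLE bronze.orders (...)"})
docs_agent = Agent("docs", lambda s: {"docs": f"Documents: {s['ddl']}"})

final = Pipeline([ddl_agent, docs_agent]).execute()
```

The `trace` key shows why sequential ordering matters here: the docs agent can only describe DDL that an earlier agent has already placed into the state.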
**Deployment Option A: Databricks (Enterprise Cloud-Native)**
* **Storage & Governance:** Unity Catalog (26-table schema) and Delta Lake.
* **Compute:** Databricks Runtime / Spark.
* **Model Registry & Feature Store:** MLflow on Databricks and Databricks Feature Store.
* **Security:** OIDC managed authentication and Databricks secrets.
**Deployment Option B: Docker (Standalone / Air-Gapped)**
* **Containerization:** Multi-service Docker Compose orchestration.
* **Database:** PostgreSQL (for platform-agnostic storage) connected via `psycopg2`.
* **Infrastructure as Code (IaC):** Terraform used to provision cost-optimized GCP spot instances.
* **Web Server:** Nginx acting as a reverse proxy for production profiles.
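For the PostgreSQL backend, connection details typically come from the container environment and are handed to `psycopg2.connect()` as a libpq keyword/value DSN. A minimal sketch; the `NIDATA_PG_*` variable names and defaults are illustrative assumptions:

```python
import os

def build_dsn(env: dict) -> str:
    """Build a libpq connection string suitable for psycopg2.connect().
    The NIDATA_PG_* environment variable names are illustrative."""
    parts = {
        "host": env.get("NIDATA_PG_HOST", "localhost"),
        "port": env.get("NIDATA_PG_PORT", "5432"),
        "dbname": env.get("NIDATA_PG_DB", "nidata"),
        "user": env.get("NIDATA_PG_USER", "nidata"),
        "password": env.get("NIDATA_PG_PASSWORD", ""),
    }
    return " ".join(f"{k}={v}" for k, v in parts.items() if v)

# With psycopg2 installed, the DSN passes straight through:
#   conn = psycopg2.connect(build_dsn(os.environ))
dsn = build_dsn({"NIDATA_PG_HOST": "db", "NIDATA_PG_PASSWORD": "s3cret"})
```

Keeping the DSN construction separate from the driver call makes the same configuration reusable across the Docker Compose services without touching application code.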
**Business Intelligence (BI) Integration**
* **Tableau Parsers & Connectors:** Parses TWB/TWBX files, integrates via the Tableau Server REST API, and connects directly to the internal Tableau PostgreSQL repository on port 8060.
* **Power BI Parsers & Connectors:** Extracts DAX measures and connects to the Power BI XMLA endpoint and Azure SQL.
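A TWBX file is a ZIP archive wrapping a `.twb` XML workbook, which is why parsing it needs nothing beyond the standard library. A minimal sketch, assuming a drastically simplified workbook schema (real `.twb` files carry far more structure than the `caption` attribute read here):

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def datasource_captions(twbx_bytes: bytes) -> list:
    """List datasource captions from a TWBX (a ZIP containing a .twb XML file)."""
    with zipfile.ZipFile(io.BytesIO(twbx_bytes)) as z:
        twb_name = next(n for n in z.namelist() if n.endswith(".twb"))
        root = ET.fromstring(z.read(twb_name))
    return [ds.get("caption") for ds in root.iter("datasource") if ds.get("caption")]

# Build a tiny in-memory TWBX purely for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("workbook.twb",
               "<workbook><datasources>"
               "<datasource caption='Sales'/><datasource caption='HR'/>"
               "</datasources></workbook>")
captions = datasource_captions(buf.getvalue())  # ['Sales', 'HR']
```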
**Knowledge Graph & Context Tools**
* **Business-Domain Graph:** Custom Python implementation (`agent_knowledge_graph.py` and `graph_query_layer.py`) with parallel internal indexing for lineage tracking and multi-hop impact analysis.
* **Code-Structure Graph:** `CodeGraphContext` (cgc) and `Kuzu` are used in the developer tooling to map function calls, class hierarchies, and module dependencies.
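Multi-hop impact analysis over a lineage graph boils down to a bounded breadth-first search along downstream edges. A minimal sketch; the adjacency-dict representation and table names are illustrative, not the internal structure of `agent_knowledge_graph.py`:

```python
from collections import deque

def impacted(downstream: dict, start: str, max_hops: int = 3) -> set:
    """Return every node reachable from `start` within `max_hops` lineage edges.
    `downstream` maps a table/column to the nodes that consume it."""
    seen, queue = set(), deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue
        for child in downstream.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, hops + 1))
    return seen

# Toy lineage: bronze feeds silver; silver feeds a gold table and a dashboard.
lineage = {
    "bronze.orders": ["silver.orders"],
    "silver.orders": ["gold.revenue", "dash.sales"],
}
hit = impacted(lineage, "bronze.orders")  # silver, gold, and dashboard nodes
```

A change to `bronze.orders` thus surfaces both the intermediate silver table and everything two hops downstream, which is the shape of question the impact-analysis layer answers.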
**CI/CD, Testing & Integrations**
* **CI/CD:** GitHub Actions.
* **Testing:** `pytest` is used both for testing the platform itself and for generating automated data quality and Gold-layer reconciliation tests as an artifact of the agent pipeline.
* **Notifications:** Microsoft Teams webhooks for automated stakeholder sign-offs and notifications.
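The pytest artifacts the pipeline emits are ordinary test functions asserting data quality thresholds. A minimal sketch of what one generated check might look like; the column name, sample rows, and null-rate threshold are illustrative assumptions:

```python
def null_rate(rows: list, column: str) -> float:
    """Fraction of rows where `column` is None."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(column) is None) / len(rows)

def test_customer_id_null_rate():
    # In a generated suite these rows would come from the target table.
    rows = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": None}]
    assert null_rate(rows, "customer_id") <= 0.5  # generated threshold

test_customer_id_null_rate()  # also discovered and run by pytest as-is
```

Because the output is plain pytest, the same generated checks run unchanged in GitHub Actions alongside the platform's own test suite.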