Automated Mechanisms to Discover AI/ML Models Across Environments
Organizations must implement automated mechanisms to continuously discover AI/ML models across
cloud, on-prem, SaaS, and runtime environments. Below are practical approaches used in real-world scenarios.
1. Cloud-Native Discovery
- Integrate with AWS SageMaker, Azure ML, GCP Vertex AI
- Use APIs to list models and endpoints
- Schedule periodic discovery scans
- Auto-tag assets with owner, environment, and risk
Tools: AWS Config, Azure Resource Graph, GCP Asset Inventory
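The listing-and-tagging loop above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes the AWS SageMaker ListModels API shape (`Models`, `ModelName`, `ModelArn`, `NextToken`), and the client is injected so that in production you would pass `boto3.client("sagemaker")`.

```python
def discover_sagemaker_models(sm_client):
    """Page through the SageMaker ListModels API and build tagged inventory records.

    sm_client is injected for testability; in production this would be
    boto3.client("sagemaker") (assumption: boto3 is available and configured).
    """
    models, token = [], None
    while True:
        kwargs = {"NextToken": token} if token else {}
        resp = sm_client.list_models(**kwargs)
        for m in resp.get("Models", []):
            # Auto-tag each discovered asset with source and environment.
            models.append({
                "name": m["ModelName"],
                "arn": m.get("ModelArn"),
                "source": "aws-sagemaker",
                "environment": "cloud",
            })
        token = resp.get("NextToken")
        if not token:
            return models
```

The same pattern (paginated list call plus a normalizing record builder) applies to Azure ML and Vertex AI; only the client and field names change.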
2. MLOps Platform Integration
- Integrate with MLflow, Kubeflow, Databricks
- Sync model registry and experiment tracking
- Track model versions and datasets
- Capture deployment status
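Registry sync can be sketched against the MLflow client API. This assumes the `search_registered_models()` call and the `RegisteredModel` / `ModelVersion` attributes (`name`, `latest_versions`, `version`, `current_stage`, `run_id`) exposed by `mlflow.tracking.MlflowClient`; the client is injected so the sketch has no hard MLflow dependency.

```python
def sync_model_registry(client):
    """Pull registered models and their latest versions into a flat inventory.

    In production, client = mlflow.tracking.MlflowClient(tracking_uri)
    (assumption: an MLflow registry is reachable).
    """
    inventory = []
    for rm in client.search_registered_models():
        for mv in rm.latest_versions:
            inventory.append({
                "model": rm.name,
                "version": mv.version,
                "stage": mv.current_stage,   # captures deployment status
                "run_id": mv.run_id,         # links back to experiment tracking
                "source": "mlflow",
            })
    return inventory
```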
3. Code Repository Scanning
- Scan GitHub, GitLab, Bitbucket repositories
- Detect model files (.pkl, .onnx, .h5)
- Identify ML libraries (TensorFlow, PyTorch)
- Detect model loading patterns
joblib.load("fraud_model.pkl")
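A repository scanner combining all three detections (model files, ML libraries, loading patterns) can be sketched with the standard library. The extension list and regexes below are illustrative assumptions, not an exhaustive ruleset.

```python
import os
import re

MODEL_EXTS = {".pkl", ".onnx", ".h5"}  # extend as needed (.pt, .joblib, ...)
LOAD_PATTERNS = re.compile(
    r"(joblib\.load|pickle\.load|torch\.load|load_model|InferenceSession)\s*\("
)
ML_IMPORTS = re.compile(r"^\s*(?:import|from)\s+(tensorflow|torch|sklearn|keras)\b", re.M)

def scan_repo(root):
    """Walk a checked-out repository and flag model artifacts and ML code."""
    findings = []
    for dirpath, _, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            ext = os.path.splitext(fname)[1].lower()
            if ext in MODEL_EXTS:
                findings.append({"path": path, "type": "model-file"})
            elif ext == ".py":
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        text = f.read()
                except OSError:
                    continue
                if LOAD_PATTERNS.search(text) or ML_IMPORTS.search(text):
                    findings.append({"path": path, "type": "ml-code"})
    return findings
```

In practice this runs against clones pulled via the GitHub/GitLab/Bitbucket APIs on a schedule, or as a pre-merge CI check.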
4. API & Endpoint Discovery
- Scan API gateways (Kong, Apigee, AWS API Gateway)
- Detect endpoints like /predict, /generate
- Monitor exposure of AI services
GenAI applications often exist only as APIs and are not formally registered.
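Flagging AI-style routes from a gateway export can be sketched as a path filter. The route dicts are assumed to come from the gateway's admin API (for Kong, GET /routes is one example source); the path regex is an illustrative assumption.

```python
import re

# Paths that typically indicate an inference or GenAI endpoint (assumed list).
AI_PATH = re.compile(r"/(predict|inference|generate|embeddings?|completions|chat)\b", re.I)

def flag_ai_routes(routes):
    """routes: iterable of {"name": ..., "path": ...} exported from an API gateway.

    Returns the subset whose path looks like an AI/ML endpoint, so unregistered
    GenAI services surface in the inventory.
    """
    return [r for r in routes if AI_PATH.search(r.get("path", ""))]
```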
5. Runtime / Infrastructure Scanning
- Monitor Kubernetes, Docker, and VM workloads
- Detect GPU usage patterns
- Identify ML serving frameworks (TensorFlow Serving, TorchServe)
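Runtime detection can be sketched over pod specs in the shape returned by `kubectl get pods -o json` (the Kubernetes Python client returns equivalent objects). The serving-image substrings are illustrative assumptions; `nvidia.com/gpu` is the standard Kubernetes resource name for NVIDIA GPUs.

```python
# Image substrings that indicate a model-serving container (assumed list).
SERVING_IMAGES = ("tensorflow/serving", "pytorch/torchserve", "tritonserver")

def flag_ml_workloads(pods):
    """Flag pods that run a known ML serving framework or request GPUs.

    pods: list of pod dicts in kubectl-JSON shape
    (metadata.name, spec.containers[].image / resources.limits).
    """
    findings = []
    for pod in pods:
        for c in pod["spec"]["containers"]:
            image = c.get("image", "")
            gpu = c.get("resources", {}).get("limits", {}).get("nvidia.com/gpu")
            if any(s in image for s in SERVING_IMAGES) or gpu:
                findings.append({
                    "pod": pod["metadata"]["name"],
                    "image": image,
                    "gpu": bool(gpu),
                })
    return findings
```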
6. SaaS & Shadow AI Discovery
- Monitor usage of ChatGPT, Copilot, Gemini
- Use CASB/SSE tools (Netskope, Defender for Cloud Apps)
- Detect unauthorized API key usage
Shadow AI is one of the highest-risk areas in AI security.
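Where no CASB is available, a first-pass shadow AI sweep over proxy or egress logs can be sketched as below. The domain list is an illustrative assumption; the key regex targets OpenAI-style secret keys, which begin with `sk-`.

```python
import re

# Known AI provider endpoints to watch for (assumed, non-exhaustive list).
AI_DOMAINS = ("api.openai.com", "api.anthropic.com",
              "generativelanguage.googleapis.com")
# OpenAI-style secret keys start with "sk-" followed by a long token.
KEY_PATTERN = re.compile(r"\bsk-[A-Za-z0-9_-]{20,}\b")

def scan_proxy_logs(lines):
    """Return 1-based line numbers where AI service traffic or leaked
    API keys appear, as candidates for shadow AI follow-up."""
    hits = []
    for i, line in enumerate(lines, 1):
        if any(d in line for d in AI_DOMAINS) or KEY_PATTERN.search(line):
            hits.append(i)
    return hits
```

A CASB/SSE tool does this at scale with TLS inspection; the sketch only shows the matching logic.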
7. Data Pipeline Inspection
- Scan Airflow, Spark, and ETL pipelines
- Detect inference steps like model.predict()
- Track feature engineering workflows
model.predict(data)
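Detecting inference steps like the call above can be done statically with Python's `ast` module, which avoids false positives from comments and strings that a plain grep would hit. The method-name set is an illustrative assumption.

```python
import ast

# Method names that typically indicate an inference step (assumed set).
INFERENCE_METHODS = {"predict", "predict_proba", "transform"}

def find_inference_calls(source):
    """Return (line, method) pairs for every x.predict(...)-style call
    in a pipeline or DAG source file."""
    calls = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in INFERENCE_METHODS):
            calls.append((node.lineno, node.func.attr))
    return calls
```

Pointing this at Airflow DAG files or Spark job sources surfaces models that never passed through a registry.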
8. CMDB & Asset Correlation
- Integrate with ServiceNow or asset management tools
- Link models with applications and datasets
- Build AI asset relationship graphs
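The relationship graph can be sketched with a simple adjacency structure; a production system would feed the same links into a CMDB such as ServiceNow or a graph database. All names in the usage example are hypothetical.

```python
from collections import defaultdict

class AIAssetGraph:
    """Minimal relationship graph linking models to applications and datasets."""

    def __init__(self):
        self.edges = defaultdict(set)

    def link(self, model, asset):
        """Record that a model depends on or serves the given asset."""
        self.edges[model].add(asset)

    def related(self, model):
        """All assets linked to a model, e.g. for blast-radius analysis."""
        return sorted(self.edges[model])
```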
Key Takeaway
AI discovery requires a combination of cloud scanning, code analysis, API monitoring, and runtime inspection.
The biggest risk comes from unknown or shadow AI systems; discovering them is critical.