# Architecture Analyze Agent

### What it does

The Architecture Analyze Agent automatically discovers, maps, and documents your entire software architecture—generating visual diagrams, API documentation, cost optimization recommendations, and operational runbooks.

Think of it as your automated principal architect that reverse-engineers your system and creates production-ready documentation.

**You'll get:**

* System architecture diagrams (15+ diagrams in 4 formats: Mermaid, PlantUML, D2, Draw\.io)
* Complete API documentation with endpoints and data flows
* Cost optimization analysis with savings projections
* Disaster recovery blueprints with RPO/RTO targets
* Production-ready code examples and operational runbooks

⏱️ **Analysis time:** 5-15 minutes depending on codebase size

## Sample Prompts

{% hint style="success" %}
**Examples**

#### **pre‑deploy‑review**

**Prompt:** “Analyze our service mesh and identify any single‑points‑of‑failure before deploying the new microservice.”

#### **auth‑route‑audit**

**Prompt:** “Verify all authentication routes and dataflows to ensure no unsecured endpoints exist.”

#### **high‑latency‑diagnosis**

**Prompt:** “Inspect the architecture for components contributing to the 200 ms latency spikes in the checkout flow.”

#### **scaling‑risk‑assessment**

**Prompt:** “Evaluate our current setup for risks when scaling from 100 to 10,000 concurrent users.”

#### **third‑party‑dependency‑map**

**Prompt:** “Generate a diagram showing external APIs and their expected request volumes for security review.”
{% endhint %}

### Why use it

**Instead of:**

* Spending 30-40 hours manually documenting architecture
* Reverse-engineering systems from outdated diagrams
* Guessing at cost optimization opportunities
* Creating DR plans from scratch

**You get:**

* Automated discovery of your entire tech stack
* Multi-format diagrams ready to edit
* $335K+ average annual cloud savings identified
* Production-ready monitoring and error handling code
* Week-by-week implementation roadmap

**Impact:**

* 30-40 hours saved per project on documentation
* 40-60% cloud cost reduction (average)
* 15-minute RPO / 4-hour RTO for disaster recovery

***

### What it analyzes

The agent performs deep analysis across multiple layers:

#### 1. Technology Stack Discovery

**Scans:** Languages, frameworks, cloud services\
**Finds:** Python versions, React/Node.js, Databricks, AWS/Azure/GCP\
**Example:** Detects "Python 3.11, FastAPI, Databricks Unity Catalog, Delta Lake"

#### 2. Component Mapping

**Analyzes:** Directory structures, config files\
**Finds:** Microservices, APIs, databases, storage layers\
**Example:** Maps package.json, Dockerfile, terraform files to system boundaries
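As a rough illustration of the idea (not the agent's actual implementation), mapping marker files to component types can be sketched like this — the `MARKERS` table and component labels here are hypothetical:

```python
from pathlib import Path

# Hypothetical marker-file heuristics, for illustration only
MARKERS = {
    "package.json": "Node.js service",
    "Dockerfile": "containerized component",
    "main.tf": "Terraform-managed infrastructure",
    "requirements.txt": "Python service",
}

def map_components(root: str) -> dict[str, list[str]]:
    """Walk a directory tree and bucket directories by the marker files they contain."""
    found: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name in MARKERS:
            found.setdefault(MARKERS[path.name], []).append(str(path.parent))
    return found
```

A directory containing both a `Dockerfile` and a `package.json` would show up under two component types, which is exactly the kind of overlap the agent resolves into system boundaries.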

#### 3. Data Flow Analysis

**Reads:** Source code, API calls, data pipelines\
**Finds:** Medallion architecture (Bronze/Silver/Gold), API endpoints, auth flows\
**Example:** Traces data from ingestion → transformation → consumption
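A toy sketch of the Bronze → Silver → Gold flow the agent traces — illustrative only, using plain Python dicts in place of Delta tables:

```python
# Bronze: raw events as ingested, possibly malformed
bronze = [
    {"user": "a", "amount": "10.5"},
    {"user": "b", "amount": "not-a-number"},  # bad record
    {"user": "a", "amount": "4.5"},
]

# Silver: cleaned and typed records (invalid rows dropped)
def to_silver(rows):
    out = []
    for row in rows:
        try:
            out.append({"user": row["user"], "amount": float(row["amount"])})
        except (KeyError, ValueError):
            continue  # a real pipeline would quarantine these for review
    return out

# Gold: business-level aggregate ready for consumption
def to_gold(rows):
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0.0) + row["amount"]
    return totals

silver = to_silver(bronze)   # 2 valid records
gold = to_gold(silver)       # {"a": 15.0}
```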

#### 4. Security Architecture

**Maps:** Defense-in-depth layers, access controls\
**Finds:** Network isolation, authentication, encryption, RBAC\
**Example:** Documents 8-layer security framework with Private Link setup

#### 5. Cost Optimization

**Analyzes:** Compute resources, storage, network\
**Finds:** Auto-termination gaps, spot instance opportunities, rightsizing needs\
**Example:** Identifies $335K+ annual savings in unused compute

#### 6. Disaster Recovery

**Plans:** Multi-region failover, backup strategies\
**Finds:** Recovery point objectives (RPO), recovery time objectives (RTO)\
**Example:** Creates active-passive setup with 15-min RPO / 4-hour RTO
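The RPO target above reduces to simple arithmetic: worst-case data loss is bounded by the backup (or replication) interval. A minimal sanity check, assuming interval-based backups:

```python
def meets_rpo(backup_interval_min: float, rpo_min: float) -> bool:
    """Worst-case data loss equals the time since the last backup,
    so the backup interval must not exceed the RPO target."""
    return backup_interval_min <= rpo_min

# 15-minute cross-region backups satisfy a 15-minute RPO...
assert meets_rpo(backup_interval_min=15, rpo_min=15)
# ...but hourly backups would not.
assert not meets_rpo(backup_interval_min=60, rpo_min=15)
```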

***

### How to use it

#### Basic analysis

Analyze your current directory:

```bash
opsera-devops-agent:architecture-analyze
```

or in natural language:

```
"Analyze the architecture of this project"
```

***

#### Specific analyses

**Full system with all diagram formats:**

```
"Analyze the entire system and generate diagrams in all formats"
```

**Focus on cost optimization:**

```
"Analyze this architecture and show me cost optimization opportunities"
```

**CI/CD architecture:**

```
"Generate a CI/CD architecture with pipeline recommendations"
```

**Security architecture only:**

```
"Map the security architecture with defense-in-depth layers"
```

**Specific directory:**

```
"Analyze only the ./src directory with API documentation"
```

***

### What you'll see

#### During analysis

```bash
🏗️ Architecture Analyze Agent Starting...

Phase 1/6: Environment discovery...
✓ Detected: Python 3.11, FastAPI, React 18
✓ Cloud services: Databricks, AWS S3, RDS
✓ Infrastructure: Terraform, Docker, Kubernetes

Phase 2/6: Component mapping...
✓ Found 8 microservices
✓ Identified 24 API endpoints
✓ Mapped 3 data layers (Bronze/Silver/Gold)

Phase 3/6: Logic extraction...
✓ Traced data flows across pipelines
✓ Documented authentication sequences
✓ Mapped API dependencies

Phase 4/6: Multi-layer diagramming...
✓ Generated system overview
✓ Created data flow diagrams
✓ Rendered security architecture
✓ Produced 15 diagrams in 4 formats

Phase 5/6: Strategic analysis...
✓ Identified $420K in potential savings
✓ Created week-by-week implementation plan
✓ Analyzed DR requirements (RPO/RTO)

Phase 6/6: Artifact delivery...
✓ Generated 5 comprehensive reports
✓ Created operational runbooks
✓ Produced production-ready code examples

📁 Reports saved to: /Users/opsera/architecture-docs/
```

***

### Reports generated

You'll get 5 comprehensive documentation files:

#### 1. Architecture Documentation

**File:** `architecture-documentation.md`

**Contains:**

* **Technology Stack Analysis:** Complete breakdown of backend, data layers, infrastructure
* **API Specification:** All discovered endpoints with request/response formats
* **Visual Blueprints:** 15+ diagrams including:
  * System overview
  * Data flow (Medallion architecture)
  * Component diagrams
  * Sequence diagrams
  * ER diagrams
* **Multi-Format Export:** Mermaid, PlantUML, D2, Draw\.io XML

**Example output:**

````markdown
## System Overview Diagram
```mermaid
graph TB
    API[FastAPI Gateway]
    DB[(PostgreSQL)]
    Cache[(Redis)]
    Queue[RabbitMQ]
    
    API --> DB
    API --> Cache
    API --> Queue
```

## Technology Stack
- **Backend:** Python 3.11 (FastAPI, Celery)
- **Frontend:** React 18 (TypeScript, Tailwind)
- **Data:** Databricks (Delta Lake, Unity Catalog)
- **Infrastructure:** AWS (EKS, RDS, S3)

## API Endpoints
- POST /api/v1/users - Create user
- GET /api/v1/users/{id} - Get user details
- POST /api/v1/pipelines/run - Execute data pipeline
````

***

#### 2. CI/CD Architecture

**File:** `cicd-pipeline-architecture.md`

**Contains:**

* **Pipeline Orchestration:** Complete GitHub Actions workflows (600+ lines)
* **Environment Strategy:** Dev → Staging → Prod promotion path
* **Quality Gates:** Automated testing, security scanning, manual approvals
* **Deployment Patterns:** Blue-green, canary, rolling updates

**Example output:**

```yaml
# GitHub Actions Pipeline
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run pytest
        run: pytest
      - name: Security scan (TruffleHog, Bandit)
        run: |
          trufflehog filesystem .
          bandit -r .
      - name: Code quality (SonarQube)
        run: sonar-scanner

  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to EKS
        run: kubectl apply -f k8s/
      - name: Run smoke tests
        run: pytest tests/smoke
      - name: Health check
        run: curl -fsS https://api.example.com/health
```
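Of the deployment patterns the report covers, a canary rollout is the simplest to sketch: traffic shifts to the new version in fixed steps. This helper is a hypothetical illustration, not part of the generated workflow:

```python
def canary_weights(step: int, total_steps: int) -> tuple[int, int]:
    """Traffic split (canary %, stable %) at a given rollout step."""
    canary = round(100 * step / total_steps)
    return canary, 100 - canary

# A four-step rollout: 25% -> 50% -> 75% -> 100% to the canary
for step in range(1, 5):
    print(canary_weights(step, 4))
```

In practice each step would be gated on the health checks and smoke tests shown in the workflow above before shifting more traffic.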

***

#### 3. Cost Optimization Analysis

**File:** `cost-optimization-analysis.md`

**Contains:**

* **Current Spend Analysis:** Breakdown by service, region, resource type
* **Optimization Opportunities:** Auto-termination, spot instances, rightsizing
* **ROI Projections:** 3-year savings forecast
* **Implementation Roadmap:** Week-by-week plan

**Example output:**

```
Cost Optimization Opportunities
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Current Annual Spend: $840,000
Projected Annual Savings: $504,000 (60%)
Payback Period: 3 months

Quick Wins (Week 1-2):
□ Enable auto-termination on Databricks clusters    [$120K/year]
□ Switch to spot instances for non-prod workloads   [$80K/year]
□ Rightsize RDS instances (t3.large → t3.medium)    [$24K/year]

Medium-term (Week 3-6):
□ Implement data lifecycle policies (S3 → Glacier)  [$60K/year]
□ Optimize SQL warehouse size and concurrency       [$100K/year]
□ Enable compute autoscaling                        [$120K/year]

3-Year ROI Projection:
Year 1: $504K savings - $15K implementation = $489K net
Year 2: $504K savings
Year 3: $504K savings
Total 3-Year Savings: $1,497,000
```
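The projection above is straightforward arithmetic; reproduced here as a quick check, with the figures taken directly from the sample report:

```python
# Annual savings from the sample report, by initiative
quick_wins = {"auto-termination": 120_000, "spot instances": 80_000, "rds rightsizing": 24_000}
medium_term = {"s3 lifecycle": 60_000, "sql warehouse": 100_000, "autoscaling": 120_000}

annual_savings = sum(quick_wins.values()) + sum(medium_term.values())
implementation_cost = 15_000  # one-time, incurred in year 1

year1_net = annual_savings - implementation_cost
three_year_total = year1_net + 2 * annual_savings

print(annual_savings)      # 504000
print(three_year_total)    # 1497000
```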

***

#### 4. Disaster Recovery Blueprint

**File:** `disaster-recovery-architecture.md`

**Contains:**

* **Resiliency Targets:** RPO (15 minutes), RTO (4 hours)
* **Failover Procedures:** Active-passive multi-region setup
* **Automated Scripts:** Failover automation code
* **Testing Procedures:** DR drill runbooks

**Example output:**

````
Disaster Recovery Plan
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Targets:
  RPO (Recovery Point Objective):  15 minutes
  RTO (Recovery Time Objective):   4 hours

Architecture:
  Primary Region:   us-east-1 (Active)
  DR Region:        us-west-2 (Passive)
  Replication:      Cross-region automated backups

Failover Steps:
1. Detect primary region failure (CloudWatch alarms)
2. Promote DR database replica to primary
3. Update Route 53 DNS to DR region
4. Validate application health checks
5. Notify stakeholders

Automated Failover Script:
```bash
#!/bin/bash
# Failover to DR region
aws rds promote-read-replica --db-instance dr-database
aws route53 change-resource-record-sets --change-batch file://failover.json
# Verify health
curl https://api.example.com/health
```

DR Testing Schedule:
- Monthly: Backup restore tests
- Quarterly: Full failover drills
- Annual: Complete DR simulation
````

***

#### 5. Production-Ready Code & Operations

**Files:** `production_ready_code_examples.py`, `operational-guide.md`

**Contains:**

* **Hardened Code:** Circuit breakers, exponential backoff, structured logging
* **Operational Runbooks:** Common tasks (scaling, troubleshooting, monitoring)
* **Security Hardening:** Network isolation, RBAC, encryption

**Example output:**

```python
# production_ready_code_examples.py

import time
import logging
from functools import wraps

# Circuit breaker pattern
class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_count = 0
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.last_failure_time = None
        
    def call(self, func):
        if self.failure_count >= self.failure_threshold:
            if time.time() - self.last_failure_time < self.timeout:
                raise Exception("Circuit breaker is OPEN")
            else:
                self.failure_count = 0  # Reset after timeout
        
        try:
            result = func()
            self.failure_count = 0  # Reset on success
            return result
        except Exception as e:
            self.failure_count += 1
            self.last_failure_time = time.time()
            raise

# Exponential backoff with retry
def retry_with_backoff(max_retries=3, base_delay=1):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_retries - 1:
                        raise
                    delay = base_delay * (2 ** attempt)
                    logging.warning(f"Retry {attempt + 1}/{max_retries} after {delay}s")
                    time.sleep(delay)
        return wrapper
    return decorator

# Structured JSON logging
import json
from datetime import datetime, timezone

class StructuredLogger:
    def log(self, level, message, **kwargs):
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": level,
            "message": message,
            **kwargs
        }
        print(json.dumps(log_entry))

logger = StructuredLogger()
logger.log("INFO", "Pipeline started", pipeline_id="abc123", user="admin")
```

***

### After analysis

#### 1. Review the architecture documentation

Open `architecture-documentation.md` to see:

* Complete system overview
* All discovered APIs and endpoints
* Visual diagrams in multiple formats
* Technology stack breakdown

***

#### 2. Implement cost optimizations

Follow the quick wins in `cost-optimization-analysis.md`:

* Enable auto-termination on compute clusters
* Switch non-prod workloads to spot instances
* Rightsize over-provisioned resources
* Implement data lifecycle policies

**Typical savings:** $300K-$500K annually

***

#### 3. Deploy CI/CD improvements

Use the workflows in `cicd-pipeline-architecture.md`:

* Copy GitHub Actions workflows to `.github/workflows/`
* Set up quality gates (testing, security scanning)
* Configure environment promotion (Dev → Staging → Prod)
* Enable automated deployments

***

#### 4. Harden production systems

Use `production_ready_code_examples.py`:

* Add circuit breakers to external API calls
* Implement exponential backoff with retry logic
* Switch to structured JSON logging
* Add comprehensive error handling

***

#### 5. Prepare disaster recovery

Follow the DR blueprint in `disaster-recovery-architecture.md`:

* Set up multi-region replication
* Configure automated failover scripts
* Schedule quarterly DR drills
* Document runbooks for emergencies

***

### Quality benchmarks

Use these standards to measure architectural quality:

| Metric                     | Target                   | Purpose                      |
| -------------------------- | ------------------------ | ---------------------------- |
| **Documentation Coverage** | 100%                     | All components documented    |
| **Diagram Accuracy**       | Current                  | Reflects actual system state |
| **Cost Efficiency**        | Auto-termination enabled | No wasted compute            |
| **DR Preparedness**        | RPO 15min / RTO 4hr      | Business continuity          |
| **Security Layers**        | 8-layer defense-in-depth | Comprehensive protection     |

**Architecture maturity levels:**

* ✅ **Production Ready:** All targets met, DR tested, costs optimized
* ⚠️ **Needs Hardening:** Documentation complete, DR planned, some cost waste
* ❌ **Early Stage:** Incomplete docs, no DR plan, high cost waste

***

### Common issues

**Analysis taking too long?**

* Start with specific directory: "Analyze only the ./api directory"
* Skip certain analyses: "Analyze architecture but skip cost optimization"
* Larger codebases may take 15-20 minutes

**Diagrams not rendering?**

* Copy Mermaid/PlantUML code to specialized viewers
* Use Draw\.io XML files for visual editing
* Check diagram syntax in generated markdown

**Missing components in diagrams?**

* Ensure all config files are present (package.json, Dockerfile, etc.)
* Check that services are running or have recent activity
* Verify cloud provider credentials for service discovery

**Cost analysis shows no savings?**

* Your infrastructure may already be optimized
* Run analysis on production environment for accurate data
* Check for auto-termination and spot instance usage

**Reports not generated?**

* Check write permissions in output directory
* Verify sufficient disk space
* Look for errors in analysis output

***

### Examples

**Quick architecture overview:**

```
"Analyze this project and show me the system architecture"
```

**Full analysis with all diagrams:**

```
"Generate a complete architecture analysis with all diagram formats and cost optimization"
```

**API documentation:**

```
"Analyze this codebase and document all API endpoints"
```

**Cost optimization focus:**

```
"Analyze my cloud architecture and identify cost savings opportunities"
```

**Security architecture:**

```
"Map the security architecture with defense-in-depth layers"
```

**DR planning:**

```
"Generate a disaster recovery plan with RPO and RTO targets"
```
