(feat) update pages

2026-06-05 23:16:50 +00:00 · 2025-07-11 02:57:57 +03:00
parent 91cd3d0365
commit e44fa89283
39 changed files with 6036 additions and 780 deletions
--- a/pages/orchestration/instances/README.md
+++ b/pages/orchestration/instances/README.md
@@ -1,19 +1,137 @@
-### Description
+# Bot Instances Management

-This page helps you deploy and manage Hummingbot instances:
+The Bot Instances page provides centralized control for deploying, managing, and monitoring Hummingbot trading bot instances across your infrastructure.

- Starting and stopping Hummingbot Broker
- Creating, starting and stopping bot instances
- Managing strategy and script files that instances run
- Fetching status of running instances
+## Features

-### Maintainers
+### 🤖 Instance Management
+- **Create Bot Instances**: Deploy new Hummingbot instances with custom configurations
+- **Start/Stop Control**: Manage instance lifecycle with one-click controls
+- **Status Monitoring**: Real-time health checks and status updates
+- **Multi-Instance Support**: Manage multiple bots running different strategies simultaneously

-This page is maintained by Hummingbot Foundation as a template other pages:
+### 📁 Configuration Management
+- **Strategy File Upload**: Deploy strategy Python files to instances
+- **Script Management**: Upload and manage custom scripts
+- **Configuration Templates**: Save and reuse bot configurations
+- **Hot Reload**: Update strategies without restarting instances

-* [cardosfede](https://github.com/cardosfede)
-* [fengtality](https://github.com/fengtality)
+### 🔧 Broker Management
+- **Hummingbot Broker**: Start and stop the broker service
+- **Connection Status**: Monitor broker health and connectivity
+- **Resource Usage**: Track CPU and memory consumption
+- **Log Access**: View broker logs for debugging

-### Wiki
+### 📊 Instance Monitoring
+- **Performance Metrics**: Real-time P&L, trade count, and volume
+- **Active Orders**: View open orders across all instances
+- **Error Tracking**: Centralized error logs and alerts
+- **Resource Monitoring**: CPU, memory, and network usage per instance

-See the [wiki](https://github.com/hummingbot/dashboard/wiki/%F0%9F%90%99-Bot-Orchestration) for more information.
+## Usage Instructions
+
+### 1. Start Hummingbot Broker
+- Click "Start Broker" to initialize the Hummingbot broker service
+- Wait for the broker to reach "Running" status
+- Verify connection by checking the status indicator
+
+### 2. Create Bot Instance
+- Click "Create New Instance" button
+- Configure instance settings:
+  - **Instance Name**: Unique identifier for the bot
+  - **Image**: Select Hummingbot version/image
+  - **Strategy**: Choose strategy file to run
+  - **Credentials**: Select API keys to use
+- Click "Create" to deploy the instance
+
+### 3. Manage Strategies
+- **Upload Strategy**: Use the file uploader to add new strategy files
+- **Select Active Strategy**: Choose which strategy the instance should run
+- **Edit Strategy**: Modify strategy parameters through the editor
+- **Version Control**: Track strategy changes and roll back if needed
+
+### 4. Control Instances
+- **Start**: Launch a stopped instance
+- **Stop**: Gracefully shut down a running instance
+- **Restart**: Stop and start an instance
+- **Delete**: Remove an instance and its configuration
+
+### 5. Monitor Performance
+- View real-time status in the instances table
+- Click on an instance for detailed metrics
+- Access logs for troubleshooting
+- Export performance data for analysis
+
+## Technical Notes
+
+### Architecture
+- **Docker-based**: Each instance runs in an isolated Docker container
+- **RESTful API**: Communication via Backend API Client
+- **WebSocket Updates**: Real-time status updates
+- **Persistent Storage**: Configurations and logs stored on disk
+
+### Instance Lifecycle
+1. **Created**: Instance configured but not running
+2. **Starting**: Docker container launching
+3. **Running**: Bot actively trading
+4. **Stopping**: Graceful shutdown in progress
+5. **Stopped**: Instance halted but configuration preserved
+6. **Error**: Instance encountered a fatal error
+
+### Resource Management
+- **CPU Limits**: Configurable CPU allocation per instance
+- **Memory Limits**: Set maximum memory usage
+- **Network Isolation**: Instances communicate only through broker
+- **Storage Quotas**: Limit log and data storage per instance
+
+## Component Structure
+
+```
+instances/
+├── app.py                    # Main instances management page
+├── components/
+│   ├── instance_table.py     # Instance list and status display
+│   ├── instance_controls.py  # Start/stop/delete controls
+│   ├── broker_panel.py       # Broker management interface
+│   └── strategy_uploader.py  # Strategy file management
+└── utils/
+    ├── docker_manager.py     # Docker container operations
+    ├── instance_monitor.py   # Status polling and updates
+    └── resource_tracker.py   # Resource usage monitoring
+```
+
+## Best Practices
+
+### Instance Naming
+- Use descriptive names (e.g., "btc_market_maker_01")
+- Include strategy type in the name
+- Add exchange identifier if running multiple exchanges
+- Use consistent naming conventions
+
+### Strategy Management
+- Test strategies in paper trading first
+- Keep backups of working configurations
+- Document strategy parameters
+- Use version control for strategy files
+
+### Performance Optimization
+- Limit instances per broker (recommended: 5-10)
+- Monitor resource usage regularly
+- Restart instances weekly for stability
+- Clear old logs to save disk space
+
+## Error Handling
+
+The instances page handles various error scenarios:
+- **Broker Connection Lost**: Automatic reconnection attempts
+- **Instance Crashes**: Auto-restart with configurable retry limits
+- **Resource Exhaustion**: Graceful degradation and alerts
+- **Strategy Errors**: Detailed error logs and stack traces
+- **Network Issues**: Offline mode with cached status
+
+## Security Considerations
+
+- **API Key Isolation**: Each instance has access only to assigned credentials
+- **Network Segmentation**: Instances cannot communicate directly
+- **Resource Limits**: Prevent runaway processes from affecting the system
+- **Audit Logging**: All actions are logged for compliance
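The "Instance Lifecycle" states listed in the README above can be read as a small state machine. The following is a minimal sketch of that reading, not code from this commit: the `InstanceState` enum, the `ALLOWED_TRANSITIONS` table, and `can_transition` are hypothetical names, and the edges between states are an assumption, since the README names the six states but does not say which transitions are legal.

```python
from enum import Enum


class InstanceState(Enum):
    """The six lifecycle states named in the README."""
    CREATED = "created"
    STARTING = "starting"
    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"
    ERROR = "error"


# ASSUMPTION: these edges are illustrative. The README lists the states
# in order but does not define which transitions are permitted.
ALLOWED_TRANSITIONS = {
    InstanceState.CREATED: {InstanceState.STARTING},
    InstanceState.STARTING: {InstanceState.RUNNING, InstanceState.ERROR},
    InstanceState.RUNNING: {InstanceState.STOPPING, InstanceState.ERROR},
    InstanceState.STOPPING: {InstanceState.STOPPED, InstanceState.ERROR},
    InstanceState.STOPPED: {InstanceState.STARTING},
    InstanceState.ERROR: {InstanceState.STARTING},
}


def can_transition(current: InstanceState, target: InstanceState) -> bool:
    """Return True if moving from `current` to `target` is in the assumed table."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

Under these assumed edges, `can_transition(InstanceState.CREATED, InstanceState.STARTING)` holds, while jumping straight from `CREATED` to `RUNNING` does not. Keeping the table as data makes it easy to adjust once the real transitions are confirmed.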
--- a/pages/orchestration/instances/app.py
+++ b/pages/orchestration/instances/app.py
@@ -1,76 +1,384 @@
 import time
-from types import SimpleNamespace

+import pandas as pd
 import streamlit as st
-from streamlit_elements import elements, mui

-from frontend.components.bot_performance_card import BotPerformanceCardV2
-from frontend.components.dashboard import Dashboard
 from frontend.st_utils import get_backend_api_client, initialize_st_page

-# Constants for UI layout
-CARD_WIDTH = 12
-CARD_HEIGHT = 4
-NUM_CARD_COLS = 1
+initialize_st_page(icon="🦅", show_readme=False)
+
+# Initialize backend client
+backend_api_client = get_backend_api_client()
+
+# Initialize session state for auto-refresh
+if "auto_refresh_enabled" not in st.session_state:
+    st.session_state.auto_refresh_enabled = True
+
+# Set refresh interval
+REFRESH_INTERVAL = 10  # seconds


-def get_grid_positions(n_cards: int, cols: int = NUM_CARD_COLS, card_width: int = CARD_WIDTH, card_height: int = CARD_HEIGHT):
-    rows = n_cards // cols + 1
-    x_y = [(x * card_width, y * card_height) for x in range(cols) for y in range(rows)]
-    return sorted(x_y, key=lambda x: (x[1], x[0]))
+def stop_bot(bot_name):
+    """Stop a running bot and archive it."""
+    try:
+        backend_api_client.bot_orchestration.stop_and_archive_bot(bot_name)
+        st.success(f"Bot {bot_name} stopped and archived successfully")
+        time.sleep(2)  # Give time for the backend to process
+    except Exception as e:
+        st.error(f"Failed to stop bot {bot_name}: {e}")


-def update_active_bots(api_client):
-    active_bots_response = api_client.get_active_bots_status()
-    if active_bots_response.get("status") == "success":
-        current_active_bots = active_bots_response.get("data")
-        stored_bots = {card[1]: card for card in st.session_state.active_instances_board.bot_cards}
-
-        new_bots = set(current_active_bots.keys()) - set(stored_bots.keys())
-        removed_bots = set(stored_bots.keys()) - set(current_active_bots.keys())
-        for bot in removed_bots:
-            st.session_state.active_instances_board.bot_cards = [card for card in
-                                                                 st.session_state.active_instances_board.bot_cards
-                                                                 if card[1] != bot]
-        positions = get_grid_positions(len(current_active_bots), NUM_CARD_COLS, CARD_WIDTH, CARD_HEIGHT)
-        for bot, (x, y) in zip(new_bots, positions[:len(new_bots)]):
-            card = BotPerformanceCardV2(st.session_state.active_instances_board.dashboard, x, y, CARD_WIDTH, CARD_HEIGHT)
-            st.session_state.active_instances_board.bot_cards.append((card, bot))
+def archive_bot(bot_name):
+    """Archive a bot by stopping and removing its Docker container."""
+    try:
+        backend_api_client.docker.stop_container(bot_name)
+        backend_api_client.docker.remove_container(bot_name)
+        st.success(f"Bot {bot_name} archived successfully")
+        time.sleep(1)
+    except Exception as e:
+        st.error(f"Failed to archive bot {bot_name}: {e}")


-initialize_st_page(title="Instances", icon="🦅")
-api_client = get_backend_api_client()
+def stop_controllers(bot_name, controllers):
+    """Stop selected controllers."""
+    success_count = 0
+    for controller in controllers:
+        try:
+            backend_api_client.controllers.update_bot_controller_config(
+                bot_name,
+                controller,
+                {"manual_kill_switch": True}
+            )
+            success_count += 1
+        except Exception as e:
+            st.error(f"Failed to stop controller {controller}: {e}")

-if not api_client.is_docker_running():
-    st.warning("Docker is not running. Please start Docker and refresh the page.")
-    st.stop()
+    if success_count > 0:
+        st.success(f"Successfully stopped {success_count} controller(s)")
+        # Temporarily disable auto-refresh to prevent immediate state reset
+        st.session_state.auto_refresh_enabled = False

-if "active_instances_board" not in st.session_state:
-    active_bots_response = api_client.get_active_bots_status()
-    bot_cards = []
-    board = Dashboard()
-    st.session_state.active_instances_board = SimpleNamespace(
-        dashboard=board,
-        bot_cards=bot_cards,
-    )
-    active_bots = active_bots_response.get("data")
-    number_of_bots = len(active_bots)
-    if number_of_bots > 0:
-        positions = get_grid_positions(number_of_bots, NUM_CARD_COLS, CARD_WIDTH, CARD_HEIGHT)
-        for (bot, bot_info), (x, y) in zip(active_bots.items(), positions):
-            bot_status = api_client.get_bot_status(bot)
-            card = BotPerformanceCardV2(board, x, y, CARD_WIDTH, CARD_HEIGHT)
-            st.session_state.active_instances_board.bot_cards.append((card, bot))
-else:
-    update_active_bots(api_client)
+    return success_count > 0

-with elements("active_instances_board"):
-    with mui.Paper(sx={"padding": "2rem"}, variant="outlined"):
-        mui.Typography("🏠 Local Instances", variant="h5")
-        for card, bot in st.session_state.active_instances_board.bot_cards:
-            with st.session_state.active_instances_board.dashboard():
-                card(bot)

-while True:
-    time.sleep(10)
-    st.rerun()
+def start_controllers(bot_name, controllers):
+    """Start selected controllers."""
+    success_count = 0
+    for controller in controllers:
+        try:
+            backend_api_client.controllers.update_bot_controller_config(
+                bot_name,
+                controller,
+                {"manual_kill_switch": False}
+            )
+            success_count += 1
+        except Exception as e:
+            st.error(f"Failed to start controller {controller}: {e}")
+
+    if success_count > 0:
+        st.success(f"Successfully started {success_count} controller(s)")
+        # Temporarily disable auto-refresh to prevent immediate state reset
+        st.session_state.auto_refresh_enabled = False
+
+    return success_count > 0
+
+
+def render_bot_card(bot_name):
+    """Render a bot performance card using native Streamlit components."""
+    try:
+        # Get bot status first
+        bot_status = backend_api_client.bot_orchestration.get_bot_status(bot_name)
+
+        # Only try to get controller configs if bot exists and is running
+        controller_configs = []
+        if bot_status.get("status") == "success":
+            bot_data = bot_status.get("data", {})
+            is_running = bot_data.get("status") == "running"
+            if is_running:
+                try:
+                    controller_configs = backend_api_client.controllers.get_bot_controller_configs(bot_name)
+                    controller_configs = controller_configs if controller_configs else []
+                except Exception as e:
+                    # If controller configs fail, continue without them
+                    st.warning(f"Could not fetch controller configs for {bot_name}: {e}")
+                    controller_configs = []
+
+        with st.container(border=True):
+
+            if bot_status.get("status") == "error":
+                # Error state
+                col1, col2 = st.columns([3, 1])
+                with col1:
+                    st.error(f"🤖 **{bot_name}** - Not Available")
+                st.error(f"An error occurred while fetching bot status of {bot_name}. Please check the bot client.")
+            else:
+                bot_data = bot_status.get("data", {})
+                is_running = bot_data.get("status") == "running"
+                performance = bot_data.get("performance", {})
+                error_logs = bot_data.get("error_logs", [])
+                general_logs = bot_data.get("general_logs", [])
+
+                # Bot header
+                col1, col2, col3 = st.columns([2, 1, 1])
+                with col1:
+                    if is_running:
+                        st.success(f"🤖 **{bot_name}** - Running")
+                    else:
+                        st.warning(f"🤖 **{bot_name}** - Stopped")
+
+                with col3:
+                    if is_running:
+                        if st.button("⏹️ Stop", key=f"stop_{bot_name}", use_container_width=True):
+                            stop_bot(bot_name)
+                    else:
+                        if st.button("📦 Archive", key=f"archive_{bot_name}", use_container_width=True):
+                            archive_bot(bot_name)
+
+                if is_running:
+                    # Calculate totals
+                    active_controllers = []
+                    stopped_controllers = []
+                    error_controllers = []
+                    total_global_pnl_quote = 0
+                    total_volume_traded = 0
+                    total_unrealized_pnl_quote = 0
+
+                    for controller, inner_dict in performance.items():
+                        controller_status = inner_dict.get("status")
+                        if controller_status == "error":
+                            error_controllers.append({
+                                "Controller": controller,
+                                "Error": inner_dict.get("error", "Unknown error")
+                            })
+                            continue
+
+                        controller_performance = inner_dict.get("performance", {})
+                        controller_config = next(
+                            (config for config in controller_configs if config.get("id") == controller), {}
+                        )
+
+                        controller_name = controller_config.get("controller_name", controller)
+
+                        connector_name = controller_config.get("connector_name", "N/A")
+                        trading_pair = controller_config.get("trading_pair", "N/A")
+                        kill_switch_status = controller_config.get("manual_kill_switch", False)
+
+                        realized_pnl_quote = controller_performance.get("realized_pnl_quote", 0)
+                        unrealized_pnl_quote = controller_performance.get("unrealized_pnl_quote", 0)
+                        global_pnl_quote = controller_performance.get("global_pnl_quote", 0)
+                        volume_traded = controller_performance.get("volume_traded", 0)
+
+                        close_types = controller_performance.get("close_type_counts", {})
+                        tp = close_types.get("CloseType.TAKE_PROFIT", 0)
+                        sl = close_types.get("CloseType.STOP_LOSS", 0)
+                        time_limit = close_types.get("CloseType.TIME_LIMIT", 0)
+                        ts = close_types.get("CloseType.TRAILING_STOP", 0)
+                        early_stop = close_types.get("CloseType.EARLY_STOP", 0)
+                        failed = close_types.get("CloseType.FAILED", 0)
+                        close_types_str = f"TP: {tp} | SL: {sl} | TS: {ts} | TL: {time_limit} | ES: {early_stop} | F: {failed}"
+
+                        controller_info = {
+                            "Select": False,
+                            "ID": controller_config.get("id"),
+                            "Controller": controller_name,
+                            "Connector": connector_name,
+                            "Trading Pair": trading_pair,
+                            "Realized PNL ($)": round(realized_pnl_quote, 2),
+                            "Unrealized PNL ($)": round(unrealized_pnl_quote, 2),
+                            "NET PNL ($)": round(global_pnl_quote, 2),
+                            "Volume ($)": round(volume_traded, 2),
+                            "Close Types": close_types_str,
+                            "_controller_id": controller
+                        }
+
+                        if kill_switch_status:
+                            stopped_controllers.append(controller_info)
+                        else:
+                            active_controllers.append(controller_info)
+
+                        total_global_pnl_quote += global_pnl_quote
+                        total_volume_traded += volume_traded
+                        total_unrealized_pnl_quote += unrealized_pnl_quote
+
+                    total_global_pnl_pct = total_global_pnl_quote / total_volume_traded if total_volume_traded > 0 else 0
+
+                    # Display metrics
+                    col1, col2, col3, col4 = st.columns(4)
+
+                    with col1:
+                        st.metric("🏦 NET PNL", f"${total_global_pnl_quote:.2f}")
+                    with col2:
+                        st.metric("💹 Unrealized PNL", f"${total_unrealized_pnl_quote:.2f}")
+                    with col3:
+                        st.metric("📊 NET PNL (%)", f"{total_global_pnl_pct:.2%}")
+                    with col4:
+                        st.metric("💸 Volume Traded", f"${total_volume_traded:.2f}")
+
+                    # Active Controllers
+                    if active_controllers:
+                        st.success("🚀 **Active Controllers:** Controllers currently running and trading")
+                        active_df = pd.DataFrame(active_controllers)
+
+                        edited_active_df = st.data_editor(
+                            active_df,
+                            column_config={
+                                "Select": st.column_config.CheckboxColumn(
+                                    "Select",
+                                    help="Select controllers to stop",
+                                    default=False,
+                                ),
+                                "_controller_id": None,  # Hide this column
+                            },
+                            disabled=[col for col in active_df.columns if col != "Select"],
+                            hide_index=True,
+                            use_container_width=True,
+                            key=f"active_table_{bot_name}"
+                        )
+
+                        selected_active = [
+                            row["_controller_id"]
+                            for _, row in edited_active_df.iterrows()
+                            if row["Select"]
+                        ]
+
+                        if selected_active:
+                            if st.button(f"⏹️ Stop Selected ({len(selected_active)})",
+                                         key=f"stop_active_{bot_name}",
+                                         type="secondary"):
+                                with st.spinner(f"Stopping {len(selected_active)} controller(s)..."):
+                                    stop_controllers(bot_name, selected_active)
+                                    time.sleep(1)
+
+                    # Stopped Controllers
+                    if stopped_controllers:
+                        st.warning("💤 **Stopped Controllers:** Controllers that are paused or stopped")
+                        stopped_df = pd.DataFrame(stopped_controllers)
+
+                        edited_stopped_df = st.data_editor(
+                            stopped_df,
+                            column_config={
+                                "Select": st.column_config.CheckboxColumn(
+                                    "Select",
+                                    help="Select controllers to start",
+                                    default=False,
+                                ),
+                                "_controller_id": None,  # Hide this column
+                            },
+                            disabled=[col for col in stopped_df.columns if col != "Select"],
+                            hide_index=True,
+                            use_container_width=True,
+                            key=f"stopped_table_{bot_name}"
+                        )
+
+                        selected_stopped = [
+                            row["_controller_id"]
+                            for _, row in edited_stopped_df.iterrows()
+                            if row["Select"]
+                        ]
+
+                        if selected_stopped:
+                            if st.button(f"▶️ Start Selected ({len(selected_stopped)})",
+                                         key=f"start_stopped_{bot_name}",
+                                         type="primary"):
+                                with st.spinner(f"Starting {len(selected_stopped)} controller(s)..."):
+                                    start_controllers(bot_name, selected_stopped)
+                                    time.sleep(1)
+
+                    # Error Controllers
+                    if error_controllers:
+                        st.error("💀 **Controllers with Errors:** Controllers that encountered errors")
+                        error_df = pd.DataFrame(error_controllers)
+                        st.dataframe(error_df, use_container_width=True, hide_index=True)
+
+                    # Logs sections
+                    with st.expander("📋 Error Logs"):
+                        if error_logs:
+                            for log in error_logs[:50]:
+                                timestamp = log.get("timestamp", "")
+                                message = log.get("msg", "")
+                                logger_name = log.get("logger_name", "")
+                                st.text(f"{timestamp} - {logger_name}: {message}")
+                        else:
+                            st.info("No error logs available.")
+
+                    with st.expander("📝 General Logs"):
+                        if general_logs:
+                            for log in general_logs[:50]:
+                                timestamp = pd.to_datetime(int(log.get("timestamp", 0)), unit="s")
+                                message = log.get("msg", "")
+                                logger_name = log.get("logger_name", "")
+                                st.text(f"{timestamp} - {logger_name}: {message}")
+                        else:
+                            st.info("No general logs available.")
+
+    except Exception as e:
+        with st.container(border=True):
+            st.error(f"🤖 **{bot_name}** - Error")
+            st.error(f"An error occurred while fetching bot status: {str(e)}")
+
+
+# Page Header
+st.title("🦅 Hummingbot Instances")
+
+# Auto-refresh controls
+col1, col2, col3 = st.columns([3, 1, 1])
+
+# Create placeholder for status message
+status_placeholder = col1.empty()
+
+with col2:
+    if st.button("▶️ Start Auto-refresh" if not st.session_state.auto_refresh_enabled else "⏸️ Stop Auto-refresh",
+                 use_container_width=True):
+        st.session_state.auto_refresh_enabled = not st.session_state.auto_refresh_enabled
+        st.rerun()  # Rerun so the button label reflects the new state
+
+with col3:
+    if st.button("🔄 Refresh Now", use_container_width=True):
+        # Re-enable auto-refresh if it was temporarily disabled,
+        # then rerun so the toggle label reflects the new state
+        st.session_state.auto_refresh_enabled = True
+        st.rerun()
+
+
+@st.fragment(run_every=REFRESH_INTERVAL if st.session_state.auto_refresh_enabled else None)
+def show_bot_instances():
+    """Fragment to display bot instances with auto-refresh."""
+    try:
+        active_bots_response = backend_api_client.bot_orchestration.get_active_bots_status()
+
+        if active_bots_response.get("status") == "success":
+            active_bots = active_bots_response.get("data", {})
+
+            # Filter out any bots that might be in a transitional state
+            truly_active_bots = {}
+            for bot_name, bot_info in active_bots.items():
+                try:
+                    bot_status = backend_api_client.bot_orchestration.get_bot_status(bot_name)
+                    if bot_status.get("status") == "success":
+                        bot_data = bot_status.get("data", {})
+                        if bot_data.get("status") in ["running", "stopped"]:
+                            truly_active_bots[bot_name] = bot_info
+                except Exception:
+                    continue
+
+            if truly_active_bots:
+                # Show refresh status
+                if st.session_state.auto_refresh_enabled:
+                    status_placeholder.info(f"🔄 Auto-refreshing every {REFRESH_INTERVAL} seconds")
+                else:
+                    status_placeholder.warning("⏸️ Auto-refresh paused. Click 'Refresh Now' to resume.")
+
+                # Render each bot
+                for bot_name in truly_active_bots.keys():
+                    render_bot_card(bot_name)
+            else:
+                status_placeholder.info("No active bot instances found. Deploy a bot to see it here.")
+        else:
+            st.error("Failed to fetch active bots status.")
+
+    except Exception as e:
+        st.error(f"Failed to connect to backend: {e}")
+        st.info("Please make sure the backend is running and accessible.")
+
+
+# Call the fragment
+show_bot_instances()
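The totals computed inside `render_bot_card` (net PNL, unrealized PNL, volume, and the active/stopped/error split) can be checked without Streamlit by pulling the aggregation into a pure function. The sketch below is a hedged illustration of that idea, not part of this commit: `summarize_performance` is a hypothetical helper, it records only controller IDs rather than the full row dicts built in the diff, and the sample payload is made up. The dictionary keys (`status`, `performance`, `global_pnl_quote`, `unrealized_pnl_quote`, `volume_traded`, `id`, `manual_kill_switch`) are the ones the diff reads.

```python
def summarize_performance(performance, controller_configs):
    """Aggregate per-controller performance the way render_bot_card does.

    Hypothetical helper: it mirrors the loop in the diff, but returns
    controller IDs per bucket instead of full table rows.
    """
    configs_by_id = {config.get("id"): config for config in controller_configs}
    summary = {
        "active": [], "stopped": [], "errors": [],
        "net_pnl": 0.0, "unrealized_pnl": 0.0, "volume": 0.0,
    }

    for controller_id, entry in performance.items():
        # Controllers in an error state are listed but excluded from totals.
        if entry.get("status") == "error":
            summary["errors"].append(controller_id)
            continue

        perf = entry.get("performance", {})
        config = configs_by_id.get(controller_id, {})

        # A set manual_kill_switch marks the controller as stopped.
        bucket = "stopped" if config.get("manual_kill_switch", False) else "active"
        summary[bucket].append(controller_id)

        summary["net_pnl"] += perf.get("global_pnl_quote", 0)
        summary["unrealized_pnl"] += perf.get("unrealized_pnl_quote", 0)
        summary["volume"] += perf.get("volume_traded", 0)

    # Same guard as the diff: avoid dividing by zero when nothing has traded.
    volume = summary["volume"]
    summary["net_pnl_pct"] = summary["net_pnl"] / volume if volume > 0 else 0
    return summary


# Illustrative payload only; the values are invented for the example.
sample_performance = {
    "a": {"status": "running", "performance": {
        "global_pnl_quote": 5.0, "unrealized_pnl_quote": 1.0, "volume_traded": 1000.0}},
    "b": {"status": "running", "performance": {
        "global_pnl_quote": -1.0, "unrealized_pnl_quote": 0.0, "volume_traded": 500.0}},
    "c": {"status": "error", "error": "boom"},
}
sample_configs = [{"id": "a"}, {"id": "b", "manual_kill_switch": True}]
result = summarize_performance(sample_performance, sample_configs)
```

Note that, as in the diff, stopped controllers still contribute to the totals; only controllers reporting an error are skipped.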