(feat) update pages

This commit is contained in:
cardosofede
2025-07-11 02:57:57 +03:00
parent 91cd3d0365
commit e44fa89283
39 changed files with 6036 additions and 780 deletions

View File

@@ -1,19 +1,137 @@
### Description
# Bot Instances Management
This page helps you deploy and manage Hummingbot instances:
The Bot Instances page provides centralized control for deploying, managing, and monitoring Hummingbot trading bot instances across your infrastructure.
- Starting and stopping Hummingbot Broker
- Creating, starting and stopping bot instances
- Managing strategy and script files that instances run
- Fetching status of running instances
## Features
### Maintainers
### 🤖 Instance Management
- **Create Bot Instances**: Deploy new Hummingbot instances with custom configurations
- **Start/Stop Control**: Manage instance lifecycle with one-click controls
- **Status Monitoring**: Real-time health checks and status updates
- **Multi-Instance Support**: Manage multiple bots running different strategies simultaneously
This page is maintained by Hummingbot Foundation as a template other pages:
### 📁 Configuration Management
- **Strategy File Upload**: Deploy strategy Python files to instances
- **Script Management**: Upload and manage custom scripts
- **Configuration Templates**: Save and reuse bot configurations
- **Hot Reload**: Update strategies without restarting instances
* [cardosfede](https://github.com/cardosfede)
* [fengtality](https://github.com/fengtality)
### 🔧 Broker Management
- **Hummingbot Broker**: Start and stop the broker service
- **Connection Status**: Monitor broker health and connectivity
- **Resource Usage**: Track CPU and memory consumption
- **Log Access**: View broker logs for debugging
### Wiki
### 📊 Instance Monitoring
- **Performance Metrics**: Real-time P&L, trade count, and volume
- **Active Orders**: View open orders across all instances
- **Error Tracking**: Centralized error logs and alerts
- **Resource Monitoring**: CPU, memory, and network usage per instance
See the [wiki](https://github.com/hummingbot/dashboard/wiki/%F0%9F%90%99-Bot-Orchestration) for more information.
## Usage Instructions
### 1. Start Hummingbot Broker
- Click "Start Broker" to initialize the Hummingbot broker service
- Wait for the broker to reach "Running" status
- Verify connection by checking the status indicator
### 2. Create Bot Instance
- Click "Create New Instance" button
- Configure instance settings:
- **Instance Name**: Unique identifier for the bot
- **Image**: Select Hummingbot version/image
- **Strategy**: Choose strategy file to run
- **Credentials**: Select API keys to use
- Click "Create" to deploy the instance
### 3. Manage Strategies
- **Upload Strategy**: Use the file uploader to add new strategy files
- **Select Active Strategy**: Choose which strategy the instance should run
- **Edit Strategy**: Modify strategy parameters through the editor
- **Version Control**: Track strategy changes and rollback if needed
### 4. Control Instances
- **Start**: Launch a stopped instance
- **Stop**: Gracefully shutdown a running instance
- **Restart**: Stop and start an instance
- **Delete**: Remove an instance and its configuration
### 5. Monitor Performance
- View real-time status in the instances table
- Click on an instance for detailed metrics
- Access logs for troubleshooting
- Export performance data for analysis
## Technical Notes
### Architecture
- **Docker-based**: Each instance runs in an isolated Docker container
- **RESTful API**: Communication via Backend API Client
- **WebSocket Updates**: Real-time status updates
- **Persistent Storage**: Configurations and logs stored on disk
### Instance Lifecycle
1. **Created**: Instance configured but not running
2. **Starting**: Docker container launching
3. **Running**: Bot actively trading
4. **Stopping**: Graceful shutdown in progress
5. **Stopped**: Instance halted but configuration preserved
6. **Error**: Instance encountered fatal error
### Resource Management
- **CPU Limits**: Configurable CPU allocation per instance
- **Memory Limits**: Set maximum memory usage
- **Network Isolation**: Instances communicate only through broker
- **Storage Quotas**: Limit log and data storage per instance
## Component Structure
```
instances/
├── app.py # Main instances management page
├── components/
│ ├── instance_table.py # Instance list and status display
│ ├── instance_controls.py # Start/stop/delete controls
│ ├── broker_panel.py # Broker management interface
│ └── strategy_uploader.py # Strategy file management
└── utils/
├── docker_manager.py # Docker container operations
├── instance_monitor.py # Status polling and updates
└── resource_tracker.py # Resource usage monitoring
```
## Best Practices
### Instance Naming
- Use descriptive names (e.g., "btc_market_maker_01")
- Include strategy type in the name
- Add exchange identifier if running multiple exchanges
- Use consistent naming conventions
### Strategy Management
- Test strategies in paper trading first
- Keep backups of working configurations
- Document strategy parameters
- Use version control for strategy files
### Performance Optimization
- Limit instances per broker (recommended: 5-10)
- Monitor resource usage regularly
- Restart instances weekly for stability
- Clear old logs to save disk space
## Error Handling
The instances page handles various error scenarios:
- **Broker Connection Lost**: Automatic reconnection attempts
- **Instance Crashes**: Auto-restart with configurable retry limits
- **Resource Exhaustion**: Graceful degradation and alerts
- **Strategy Errors**: Detailed error logs and stack traces
- **Network Issues**: Offline mode with cached status
## Security Considerations
- **API Key Isolation**: Each instance has access only to assigned credentials
- **Network Segmentation**: Instances cannot communicate directly
- **Resource Limits**: Prevent runaway processes from affecting system
- **Audit Logging**: All actions are logged for compliance

View File

@@ -1,76 +1,384 @@
import time
from types import SimpleNamespace
import pandas as pd
import streamlit as st
from streamlit_elements import elements, mui
from frontend.components.bot_performance_card import BotPerformanceCardV2
from frontend.components.dashboard import Dashboard
from frontend.st_utils import get_backend_api_client, initialize_st_page
# Constants for UI layout
CARD_WIDTH = 12
CARD_HEIGHT = 4
NUM_CARD_COLS = 1
initialize_st_page(icon="🦅", show_readme=False)
# Initialize backend client
backend_api_client = get_backend_api_client()
# Initialize session state for auto-refresh
if "auto_refresh_enabled" not in st.session_state:
st.session_state.auto_refresh_enabled = True
# Set refresh interval
REFRESH_INTERVAL = 10 # seconds
def get_grid_positions(n_cards: int, cols: int = NUM_CARD_COLS, card_width: int = CARD_WIDTH, card_height: int = CARD_HEIGHT):
rows = n_cards // cols + 1
x_y = [(x * card_width, y * card_height) for x in range(cols) for y in range(rows)]
return sorted(x_y, key=lambda x: (x[1], x[0]))
def stop_bot(bot_name):
"""Stop a running bot."""
try:
backend_api_client.bot_orchestration.stop_and_archive_bot(bot_name)
st.success(f"Bot {bot_name} stopped and archived successfully")
time.sleep(2) # Give time for the backend to process
except Exception as e:
st.error(f"Failed to stop bot {bot_name}: {e}")
def update_active_bots(api_client):
active_bots_response = api_client.get_active_bots_status()
if active_bots_response.get("status") == "success":
current_active_bots = active_bots_response.get("data")
stored_bots = {card[1]: card for card in st.session_state.active_instances_board.bot_cards}
new_bots = set(current_active_bots.keys()) - set(stored_bots.keys())
removed_bots = set(stored_bots.keys()) - set(current_active_bots.keys())
for bot in removed_bots:
st.session_state.active_instances_board.bot_cards = [card for card in
st.session_state.active_instances_board.bot_cards
if card[1] != bot]
positions = get_grid_positions(len(current_active_bots), NUM_CARD_COLS, CARD_WIDTH, CARD_HEIGHT)
for bot, (x, y) in zip(new_bots, positions[:len(new_bots)]):
card = BotPerformanceCardV2(st.session_state.active_instances_board.dashboard, x, y, CARD_WIDTH, CARD_HEIGHT)
st.session_state.active_instances_board.bot_cards.append((card, bot))
def archive_bot(bot_name):
"""Archive a stopped bot."""
try:
backend_api_client.docker.stop_container(bot_name)
backend_api_client.docker.remove_container(bot_name)
st.success(f"Bot {bot_name} archived successfully")
time.sleep(1)
except Exception as e:
st.error(f"Failed to archive bot {bot_name}: {e}")
initialize_st_page(title="Instances", icon="🦅")
api_client = get_backend_api_client()
def stop_controllers(bot_name, controllers):
"""Stop selected controllers."""
success_count = 0
for controller in controllers:
try:
backend_api_client.controllers.update_bot_controller_config(
bot_name,
controller,
{"manual_kill_switch": True}
)
success_count += 1
except Exception as e:
st.error(f"Failed to stop controller {controller}: {e}")
if not api_client.is_docker_running():
st.warning("Docker is not running. Please start Docker and refresh the page.")
st.stop()
if success_count > 0:
st.success(f"Successfully stopped {success_count} controller(s)")
# Temporarily disable auto-refresh to prevent immediate state reset
st.session_state.auto_refresh_enabled = False
if "active_instances_board" not in st.session_state:
active_bots_response = api_client.get_active_bots_status()
bot_cards = []
board = Dashboard()
st.session_state.active_instances_board = SimpleNamespace(
dashboard=board,
bot_cards=bot_cards,
)
active_bots = active_bots_response.get("data")
number_of_bots = len(active_bots)
if number_of_bots > 0:
positions = get_grid_positions(number_of_bots, NUM_CARD_COLS, CARD_WIDTH, CARD_HEIGHT)
for (bot, bot_info), (x, y) in zip(active_bots.items(), positions):
bot_status = api_client.get_bot_status(bot)
card = BotPerformanceCardV2(board, x, y, CARD_WIDTH, CARD_HEIGHT)
st.session_state.active_instances_board.bot_cards.append((card, bot))
else:
update_active_bots(api_client)
return success_count > 0
with elements("active_instances_board"):
with mui.Paper(sx={"padding": "2rem"}, variant="outlined"):
mui.Typography("🏠 Local Instances", variant="h5")
for card, bot in st.session_state.active_instances_board.bot_cards:
with st.session_state.active_instances_board.dashboard():
card(bot)
while True:
time.sleep(10)
st.rerun()
def start_controllers(bot_name, controllers):
"""Start selected controllers."""
success_count = 0
for controller in controllers:
try:
backend_api_client.controllers.update_bot_controller_config(
bot_name,
controller,
{"manual_kill_switch": False}
)
success_count += 1
except Exception as e:
st.error(f"Failed to start controller {controller}: {e}")
if success_count > 0:
st.success(f"Successfully started {success_count} controller(s)")
# Temporarily disable auto-refresh to prevent immediate state reset
st.session_state.auto_refresh_enabled = False
return success_count > 0
def render_bot_card(bot_name):
"""Render a bot performance card using native Streamlit components."""
try:
# Get bot status first
bot_status = backend_api_client.bot_orchestration.get_bot_status(bot_name)
# Only try to get controller configs if bot exists and is running
controller_configs = []
if bot_status.get("status") == "success":
bot_data = bot_status.get("data", {})
is_running = bot_data.get("status") == "running"
if is_running:
try:
controller_configs = backend_api_client.controllers.get_bot_controller_configs(bot_name)
controller_configs = controller_configs if controller_configs else []
except Exception as e:
# If controller configs fail, continue without them
st.warning(f"Could not fetch controller configs for {bot_name}: {e}")
controller_configs = []
with st.container(border=True):
if bot_status.get("status") == "error":
# Error state
col1, col2 = st.columns([3, 1])
with col1:
st.error(f"🤖 **{bot_name}** - Not Available")
st.error(f"An error occurred while fetching bot status of {bot_name}. Please check the bot client.")
else:
bot_data = bot_status.get("data", {})
is_running = bot_data.get("status") == "running"
performance = bot_data.get("performance", {})
error_logs = bot_data.get("error_logs", [])
general_logs = bot_data.get("general_logs", [])
# Bot header
col1, col2, col3 = st.columns([2, 1, 1])
with col1:
if is_running:
st.success(f"🤖 **{bot_name}** - Running")
else:
st.warning(f"🤖 **{bot_name}** - Stopped")
with col3:
if is_running:
if st.button("⏹️ Stop", key=f"stop_{bot_name}", use_container_width=True):
stop_bot(bot_name)
else:
if st.button("📦 Archive", key=f"archive_{bot_name}", use_container_width=True):
archive_bot(bot_name)
if is_running:
# Calculate totals
active_controllers = []
stopped_controllers = []
error_controllers = []
total_global_pnl_quote = 0
total_volume_traded = 0
total_unrealized_pnl_quote = 0
for controller, inner_dict in performance.items():
controller_status = inner_dict.get("status")
if controller_status == "error":
error_controllers.append({
"Controller": controller,
"Error": inner_dict.get("error", "Unknown error")
})
continue
controller_performance = inner_dict.get("performance", {})
controller_config = next(
(config for config in controller_configs if config.get("id") == controller), {}
)
controller_name = controller_config.get("controller_name", controller)
connector_name = controller_config.get("connector_name", "N/A")
trading_pair = controller_config.get("trading_pair", "N/A")
kill_switch_status = controller_config.get("manual_kill_switch", False)
realized_pnl_quote = controller_performance.get("realized_pnl_quote", 0)
unrealized_pnl_quote = controller_performance.get("unrealized_pnl_quote", 0)
global_pnl_quote = controller_performance.get("global_pnl_quote", 0)
volume_traded = controller_performance.get("volume_traded", 0)
close_types = controller_performance.get("close_type_counts", {})
tp = close_types.get("CloseType.TAKE_PROFIT", 0)
sl = close_types.get("CloseType.STOP_LOSS", 0)
time_limit = close_types.get("CloseType.TIME_LIMIT", 0)
ts = close_types.get("CloseType.TRAILING_STOP", 0)
refreshed = close_types.get("CloseType.EARLY_STOP", 0)
failed = close_types.get("CloseType.FAILED", 0)
close_types_str = f"TP: {tp} | SL: {sl} | TS: {ts} | TL: {time_limit} | ES: {refreshed} | F: {failed}"
controller_info = {
"Select": False,
"ID": controller_config.get("id"),
"Controller": controller_name,
"Connector": connector_name,
"Trading Pair": trading_pair,
"Realized PNL ($)": round(realized_pnl_quote, 2),
"Unrealized PNL ($)": round(unrealized_pnl_quote, 2),
"NET PNL ($)": round(global_pnl_quote, 2),
"Volume ($)": round(volume_traded, 2),
"Close Types": close_types_str,
"_controller_id": controller
}
if kill_switch_status:
stopped_controllers.append(controller_info)
else:
active_controllers.append(controller_info)
total_global_pnl_quote += global_pnl_quote
total_volume_traded += volume_traded
total_unrealized_pnl_quote += unrealized_pnl_quote
total_global_pnl_pct = total_global_pnl_quote / total_volume_traded if total_volume_traded > 0 else 0
# Display metrics
col1, col2, col3, col4 = st.columns(4)
with col1:
st.metric("🏦 NET PNL", f"${total_global_pnl_quote:.2f}")
with col2:
st.metric("💹 Unrealized PNL", f"${total_unrealized_pnl_quote:.2f}")
with col3:
st.metric("📊 NET PNL (%)", f"{total_global_pnl_pct:.2%}")
with col4:
st.metric("💸 Volume Traded", f"${total_volume_traded:.2f}")
# Active Controllers
if active_controllers:
st.success("🚀 **Active Controllers:** Controllers currently running and trading")
active_df = pd.DataFrame(active_controllers)
edited_active_df = st.data_editor(
active_df,
column_config={
"Select": st.column_config.CheckboxColumn(
"Select",
help="Select controllers to stop",
default=False,
),
"_controller_id": None, # Hide this column
},
disabled=[col for col in active_df.columns if col != "Select"],
hide_index=True,
use_container_width=True,
key=f"active_table_{bot_name}"
)
selected_active = [
row["_controller_id"]
for _, row in edited_active_df.iterrows()
if row["Select"]
]
if selected_active:
if st.button(f"⏹️ Stop Selected ({len(selected_active)})",
key=f"stop_active_{bot_name}",
type="secondary"):
with st.spinner(f"Stopping {len(selected_active)} controller(s)..."):
stop_controllers(bot_name, selected_active)
time.sleep(1)
# Stopped Controllers
if stopped_controllers:
st.warning("💤 **Stopped Controllers:** Controllers that are paused or stopped")
stopped_df = pd.DataFrame(stopped_controllers)
edited_stopped_df = st.data_editor(
stopped_df,
column_config={
"Select": st.column_config.CheckboxColumn(
"Select",
help="Select controllers to start",
default=False,
),
"_controller_id": None, # Hide this column
},
disabled=[col for col in stopped_df.columns if col != "Select"],
hide_index=True,
use_container_width=True,
key=f"stopped_table_{bot_name}"
)
selected_stopped = [
row["_controller_id"]
for _, row in edited_stopped_df.iterrows()
if row["Select"]
]
if selected_stopped:
if st.button(f"▶️ Start Selected ({len(selected_stopped)})",
key=f"start_stopped_{bot_name}",
type="primary"):
with st.spinner(f"Starting {len(selected_stopped)} controller(s)..."):
start_controllers(bot_name, selected_stopped)
time.sleep(1)
# Error Controllers
if error_controllers:
st.error("💀 **Controllers with Errors:** Controllers that encountered errors")
error_df = pd.DataFrame(error_controllers)
st.dataframe(error_df, use_container_width=True, hide_index=True)
# Logs sections
with st.expander("📋 Error Logs"):
if error_logs:
for log in error_logs[:50]:
timestamp = log.get("timestamp", "")
message = log.get("msg", "")
logger_name = log.get("logger_name", "")
st.text(f"{timestamp} - {logger_name}: {message}")
else:
st.info("No error logs available.")
with st.expander("📝 General Logs"):
if general_logs:
for log in general_logs[:50]:
timestamp = pd.to_datetime(int(log.get("timestamp", 0)), unit="s")
message = log.get("msg", "")
logger_name = log.get("logger_name", "")
st.text(f"{timestamp} - {logger_name}: {message}")
else:
st.info("No general logs available.")
except Exception as e:
with st.container(border=True):
st.error(f"🤖 **{bot_name}** - Error")
st.error(f"An error occurred while fetching bot status: {str(e)}")
# Page Header
st.title("🦅 Hummingbot Instances")
# Auto-refresh controls
col1, col2, col3 = st.columns([3, 1, 1])
# Create placeholder for status message
status_placeholder = col1.empty()
with col2:
if st.button("▶️ Start Auto-refresh" if not st.session_state.auto_refresh_enabled else "⏸️ Stop Auto-refresh",
use_container_width=True):
st.session_state.auto_refresh_enabled = not st.session_state.auto_refresh_enabled
with col3:
if st.button("🔄 Refresh Now", use_container_width=True):
# Re-enable auto-refresh if it was temporarily disabled
if not st.session_state.auto_refresh_enabled:
st.session_state.auto_refresh_enabled = True
pass
@st.fragment(run_every=REFRESH_INTERVAL if st.session_state.auto_refresh_enabled else None)
def show_bot_instances():
"""Fragment to display bot instances with auto-refresh."""
try:
active_bots_response = backend_api_client.bot_orchestration.get_active_bots_status()
if active_bots_response.get("status") == "success":
active_bots = active_bots_response.get("data", {})
# Filter out any bots that might be in transitional state
truly_active_bots = {}
for bot_name, bot_info in active_bots.items():
try:
bot_status = backend_api_client.bot_orchestration.get_bot_status(bot_name)
if bot_status.get("status") == "success":
bot_data = bot_status.get("data", {})
if bot_data.get("status") in ["running", "stopped"]:
truly_active_bots[bot_name] = bot_info
except Exception:
continue
if truly_active_bots:
# Show refresh status
if st.session_state.auto_refresh_enabled:
status_placeholder.info(f"🔄 Auto-refreshing every {REFRESH_INTERVAL} seconds")
else:
status_placeholder.warning("⏸️ Auto-refresh paused. Click 'Refresh Now' to resume.")
# Render each bot
for bot_name in truly_active_bots.keys():
render_bot_card(bot_name)
else:
status_placeholder.info("No active bot instances found. Deploy a bot to see it here.")
else:
st.error("Failed to fetch active bots status.")
except Exception as e:
st.error(f"Failed to connect to backend: {e}")
st.info("Please make sure the backend is running and accessible.")
# Call the fragment
show_bot_instances()