
UFM Enterprise Network Visualization and Control


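The UFM Enterprise platform combines the benefits of UFM Telemetry with enhanced network monitoring and management. It provides automated network discovery and provisioning, traffic monitoring, and congestion discovery. It also supports job scheduler provisioning and integrates with industry-leading job schedulers such as Slurm and Platform Load Sharing Facility (LSF), as well as with cloud and cluster managers.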


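Key features:


• Includes the capabilities of UFM Telemetry

• Automated network discovery and verification

• Secure cable management

• Congestion tracking to diagnose traffic bottlenecks

• Problem identification and resolution

• Global software updates

• Integration with Slurm and Platform LSF, with job scheduler provisioning

• Advanced reporting and a rich REST API

• Rich web-based GUI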



NVIDIA UNIFIED FABRIC MANAGER (UFM) PORTFOLIO
AI-Powered Cyber Intelligence and Analytics Platforms


Data centers host many users and applications and have become a competitive advantage for research organizations and manufacturing companies. Keeping the data center intact and healthy is critical: a data center shutdown can mean the loss of millions of dollars. What's more, malicious users often exploit data center access to misuse compute resources, for example by running prohibited applications, which raises operating costs.


NVIDIA® UFM® platforms revolutionize InfiniBand network management. By combining enhanced, real-time network telemetry with AI-powered cyber intelligence and analytics, the UFM platforms empower you to discover operation anomalies and predict network failures for preventive maintenance. UFM platforms comprise multiple levels of solutions and capabilities to suit your data center's needs and requirements. At the basic level, the UFM Telemetry platform provides network validation tools and monitors network performance and conditions. It captures rich real-time network telemetry, workload usage data, and system configuration, and streams them to a defined on-premises or cloud-based database for further analysis.


The mid-tier UFM Enterprise platform adds enhanced network monitoring, management, workload optimizations, and periodic configuration checks. In addition to all of the UFM Telemetry services, it provides network setup, connectivity validation, secure cable management, automated network discovery and provisioning, traffic monitoring, and congestion discovery. UFM Enterprise also enables job scheduler provisioning and integration with Slurm and Platform LSF, as well as network provisioning and integration with OpenStack, Azure Cloud, and VMware.


The enhanced UFM Cyber-AI platform includes all of the UFM Telemetry and UFM Enterprise services. The unique advantages of the Cyber-AI platform are based on capturing rich InfiniBand telemetry information over time and utilizing deep learning algorithms. The platform learns the data center’s “heartbeat,” operation mode, conditions, usage, and workload network signatures. It builds an enhanced database of telemetry information and discovers correlations between events. It detects performance degradations, usage and profile changes over time, and alerts to abnormal system and application behavior, and potential system failures. The Cyber-AI platform can also perform corrective actions.
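To make the idea of learning a baseline and alerting on deviations concrete, here is a minimal illustrative sketch. It is not the Cyber-AI implementation: the platform is described as using deep learning, whereas this example swaps in a simple trailing-window z-score test. The function name `find_anomalies`, its parameters, and the sample counter values are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing baseline.

    Hypothetical helper: each sample is compared with the mean of the
    previous `window` samples and flagged when it lies more than
    `threshold` standard deviations away.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up per-interval error counts for one port (illustrative data only).
port_error_counts = [2, 3, 2, 4, 3, 3, 2, 40, 3, 2]
print(find_anomalies(port_error_counts))
```

The trailing window plays the role of the learned "heartbeat" here: only the sample that breaks sharply from recent behavior is flagged, while ordinary fluctuations are ignored.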


In addition to detecting past and current events, the Cyber-AI platform can indicate future performance degradations or abnormal usage of the data center computing resources by translating and correlating changes in the data center heartbeat. Such changes and correlations trigger predictive analytics and initiate alerts that indicate abnormal system and application behavior, as well as potential system failures. System administrators can quickly detect and respond to such potential security threats and address upcoming failures efficiently, saving OPEX and maintaining end-user SLAs. Predictions improve over time as additional system data is collected.



[Figure: UFM Enterprise dashboard]




[Figure: Feature comparison of the UFM editions]


