Reliability and Availability Analysis of Data Center Thermal Management System Presented at CEEE Consortium





CALCE and CEEE researchers Amir Hossein Zabihi Tari and Dr. Diganta Das presented a reliability analysis of thermal management systems (TMS) for edge data centers at the recent CEEE Consortium meeting. Their work, conducted in collaboration with Dr. Andres Sarmiento and Prof. Michael Ohadi, demonstrated CALCE's expertise in translating thermal management system designs into actionable availability and reliability metrics for high-power computing infrastructure.


The team's analysis focused on a 2.4 kW liquid-cooled unit, combining physics-of-failure principles with reliability block diagram (RBD) modeling to check the thermal design against strict availability targets: a minimum of 95% availability, which corresponds to at most 438 hours of downtime per year (5% of 8,760 hours). By grouping components such as fasteners, seals, and structural elements into cold plate assemblies that are replaced as a unit, they showed how maintenance choices affect maintenance cost and downtime while the system still meets its required uptime.
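The link between the availability target and the downtime budget, and the way an RBD combines block availabilities, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block names and the MTBF/MTTR values are hypothetical placeholders, not figures from the CALCE study, and the blocks are assumed to fail and be repaired independently.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def annual_downtime_hours(availability: float) -> float:
    """Expected downtime per year implied by a steady-state availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def steady_state_availability(mtbf_h: float, mttr_h: float) -> float:
    """Availability of one repairable block: MTBF / (MTBF + MTTR)."""
    return mtbf_h / (mtbf_h + mttr_h)


def series_availability(avails: list[float]) -> float:
    """Series RBD: the system is up only if every block is up."""
    result = 1.0
    for a in avails:
        result *= a
    return result


def parallel_availability(avails: list[float]) -> float:
    """Parallel RBD: the system is down only if every block is down."""
    all_down = 1.0
    for a in avails:
        all_down *= 1.0 - a
    return 1.0 - all_down


if __name__ == "__main__":
    # The 95% target from the article maps to a 438-hour yearly budget.
    print(f"Downtime budget at 95%: {annual_downtime_hours(0.95):.0f} h/yr")

    # Hypothetical blocks (MTBF hours, MTTR hours); values are assumptions.
    blocks = {
        "pump": (20_000, 24),
        "cold_plate_assembly": (50_000, 48),
        "heat_exchanger": (100_000, 72),
    }
    avails = [steady_state_availability(m, r) for m, r in blocks.values()]
    system = series_availability(avails)
    print(f"Series system availability: {system:.5f}")
    print(f"Implied downtime: {annual_downtime_hours(system):.1f} h/yr")
    print(f"Meets 95% target: {system >= 0.95}")
```

Treating a grouped cold plate assembly as a single block, as in this sketch, mirrors the joint-replacement idea: one repair time covers the fasteners, seals, and structural elements together.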


Over the course of the study, this reliability framework will be scaled up to a 1.5 MW ISO-40 container system housing 12 racks (126 kW per rack, 16 servers each). The team's concurrent design approach integrates thermal performance, reliability modeling, degradation analysis (thermal interface material (TIM) pump-out, O-ring aging, microchannel clogging), and simulation-based design of experiments to identify the redundancy levels, maintenance strategies, and component groupings that best meet demanding edge data center requirements.

The presentation highlighted CALCE's role as an industry partner that turns complex reliability modeling into practical guidance on component selection, redundancy planning, and grouped maintenance strategies. Consortium members gained direct insight into how CALCE's physics-of-failure toolkit addresses the unique challenges of edge infrastructure, yielding systems that are not only thermally efficient but also robust, maintainable, and capable of meeting mission-critical uptime demands.

This work is conducted under the grant “Flexnode, Inc., Cooling System Reliability Evaluation,” Award # 25020573. The broader project is funded in part by the U.S. Department of Energy’s Advanced Research Projects Agency-Energy (ARPA-E).

For more information about the presentation, contact Dr. Diganta Das.




March 9, 2026

