Graphcore IPU Machine M2000 With Heatsink

This has to be one of the strangest stories of 2021 thus far. Graphcore finally submitted MLPerf Training v1.0 results. While some may have assumed that a dedicated AI accelerator would handily beat an NVIDIA GPU-derived architecture in terms of performance, or at least price-performance, that is not exactly what happened. While many read Graphcore's blog post and celebrated a victory, the details are quite a bit more nuanced.

The Context

As a bit of a backstory, this week when Cliff's MLPerf Training v1.0 results went live, I was actually testing a few NVIDIA HGX A100 8x GPU systems for a piece that will go live in the very near future, after our review of an Inspur HGX A100-based server. We have been working with OEMs to get their big GPU servers tested, the kinds that cannot be powered on standard 208V 30A North American data center circuits. This is all for a series we are running in July. At this point, the STH team and I have gotten to know the A100 platforms from PCIe cards, Redstone, and Delta parts, even ones that are not on the official spec sheets.

When I saw Cliff's initial draft, he flagged something for me to look at. I thought it was a bit harsh on Graphcore's performance, so that was toned down a bit. Then, after I was done with the hands-on session, I had a few folks highlight Graphcore's glowing analysis of its own results. For those who are unaware, I have been very excited about Graphcore for some time. Prior to the MLPerf results, I would have enthusiastically put them in the camp of top AI training chip contenders.

Analyzing Graphcore's Celebratory Analysis

In Graphcore's blog post about its MLPerf Training v1.0 results, it states: "For both ResNet-50 and BERT, it is clear that Graphcore systems deliver significantly better performance-per-$ than NVIDIA's offering. In the case of ResNet-50 training on the IPU-POD16, this is a factor of 1.6x, while the Graphcore advantage for BERT is 1.3x."

These statements are perhaps true, but they require looking at a specific, effectively irrelevant comparison. Since this is STH, we are going to share why, with the context that, of course, we are the site that actually does hands-on independent server reviews in our various data center facilities. This is one of those areas where our direct experience intersects questionable vendor marketing, so we call that out so that millions of STH readers can get to a better level of industry understanding.

First, let us get to the numbers. Here is Graphcore's blog post comparison:

Graphcore Blog MLPerf Training v1.0 ResNet-50 Comparison (Source: Graphcore Blog)

In the above case, this is the ResNet-50 benchmark. Graphcore highlights its 37.12 minutes to complete the benchmark, while the NVIDIA DGX A100 takes 28.77 minutes, which is result 1.0-1059. Here, Graphcore is claiming a 1.6x advantage over NVIDIA. Graphcore is using its "Closed" division result and did not provide an "Open" division result.

We whittled down the results to somewhat comparable systems, then ordered them based on the benchmark Graphcore is focusing on instead of the MLPerf ID. Specifically in this section, we are highlighting the Graphcore IPU-POD16 results, as those were the primary focus for comparison with NVIDIA's offerings in the Graphcore blog. With MLPerf Training v1.0, each system does not have to run every test, so most submissions were for one or two tests, not all eight. There are also many submissions for clusters, so we wanted to remove those as well. We tried to get a section of the results that is comparatively closest to the Graphcore IPU-POD16 figures.

Graphcore MLPerf Training v1.0 Closed Division Image Classification ResNet Results, Available and Preview

Looking at 1.0-1059, we can see this is using the SXM4-based A100s that are the 80GB 400W models. One may immediately look at the Inspur NF5488A5 result and see 27.48 minutes, still using AMD CPUs, but the 1.0-1035 result is using the 500W variant, so we should exclude it here. We will have a review of the Inspur NF5488A5 next week. The 500W parts have been offered by most OEMs for some time, but are not listed on NVIDIA's current A100 data sheet, which only lists 400W models. Here is what the 8x NVIDIA A100 80GB 500W nvidia-smi output looks like:

8x NVIDIA A100 80GB 500W nvidia-smi Output

We also have a piece using the 500W GPUs in a system from another OEM coming in July. That one is waiting on the YouTube video to be complete.
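To make the arithmetic behind the competing claims concrete, here is a minimal Python sketch of how a performance-per-dollar claim can invert a raw-speed result. The two training times are the MLPerf Training v1.0 ResNet-50 results quoted above; the system prices are hypothetical placeholders chosen only to illustrate the mechanics, since the quoted Graphcore claim does not state which prices it assumed.

```python
def perf_per_dollar(train_minutes: float, price_usd: float) -> float:
    """Throughput proxy (benchmark runs per minute) divided by system price."""
    return (1.0 / train_minutes) / price_usd

# MLPerf Training v1.0 ResNet-50 times from the results discussed above
ipu_pod16_minutes = 37.12  # Graphcore IPU-POD16
dgx_a100_minutes = 28.77   # NVIDIA DGX A100, result 1.0-1059

# HYPOTHETICAL prices, not vendor list prices, used only for illustration
ipu_pod16_price_usd = 150_000
dgx_a100_price_usd = 300_000

# On raw time, the DGX A100 finishes first
speed_ratio = ipu_pod16_minutes / dgx_a100_minutes  # ~1.29x faster for NVIDIA

# On price-normalized throughput, the cheaper system can come out ahead
value_ratio = (
    perf_per_dollar(ipu_pod16_minutes, ipu_pod16_price_usd)
    / perf_per_dollar(dgx_a100_minutes, dgx_a100_price_usd)
)

print(f"DGX A100 raw-speed advantage: {speed_ratio:.2f}x")
print(f"IPU-POD16 perf-per-$ advantage under these prices: {value_ratio:.2f}x")
```

The point of the sketch is that a vendor can lose on time-to-train yet still advertise a perf-per-$ win, because the ratio is entirely a function of the prices plugged in.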