Try the link briefly here for FREE!!! Time it yourself: its 500 T/s speed is 10 times faster than an NVIDIA 4090!
https://groq.com/ 🚨Record-breaking speed!🚨
Ultra-fast Groq runs Mixtral 8x7B-32k at 500 T/s; you can paste in any problem or question AND TIME YOURSELF to prove it is 10 times faster than any other A.I.
=====
It has roughly 128 gigabytes of super expensive SRAM in aggregate, almost as much as a 192 GB Mac Studio M2 Ultra.
https://groq.com/ No quantization tricks of 8-bit, 4-bit, 2-bit, etc.; all activations run at FP16 (16-bit float precision).
Groq used 576 Groq chips on 576 very expensive 300-watt PCIe4 cards to achieve these results, and each chip has only 230 MEGABYTES of memory. Proof:
https://www.nextplatform.com/2023/11/27/groq-says-it-can-deploy-1-million-ai-inference-chips-in-two-years/?amp Groq runs Mixtral 8x7B-32k with 500 T/s (groq.com)
Groq cards are for sale to the public on Mouser for $20,625 per PCIe card:
https://www.mouser.com/ProductDetail/BittWare/RS-GQ-GC1-0109?qs=ST9lo4GX8V2eGrFMeVQmFw%3D%3D $20,625 each
https://archive.ph/ucqK1
576 cards × $20,625 = $11,880,000 for just the PCIe cards, not including the 288 PCs and cabling needed: under $13,000,000 per single-user Groq.com instance.
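The card math above checks out; a quick back-of-the-envelope script (card price and count taken from the Mouser listing and article quoted above):

```python
# Back-of-the-envelope cost check for the quoted hardware bill.
CARD_PRICE_USD = 20_625   # Mouser list price per BittWare RS-GQ-GC1 card
NUM_CARDS = 576           # cards quoted for one Mixtral deployment

cards_total = CARD_PRICE_USD * NUM_CARDS
print(f"PCIe cards alone: ${cards_total:,}")  # PCIe cards alone: $11,880,000
```

Host PCs, racks, and cabling push the total toward the ~$13M figure.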
The main problem with the Groq LPUs is that they have no "unneeded" HBM streaming RAM on them at all.
Just a minuscule (230 MiB) amount of blistering-fast, low-latency SRAM (20x faster than HBM3, and even faster than the memory in a 192 GB Mac Studio M2 Ultra).
Which means you need 576 LPUs: nine full server racks of compute, where each unit on a rack holds 8x LPUs at 300 watts per LPU, and there are 8x of those units per rack.
That's just to serve a single model, whereas you can get a single H200 (1/256 of that rack density) and serve these models reasonably well, but far slower.
Even a $2,200 NVIDIA 4090 OC, unlocked and running at 580 watts, is only about 10 times slower than this $12 million Groq setup, yet it can serve multiple users at a time.
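A rough sanity check on why so many chips are needed: Mixtral 8x7B's commonly cited parameter count (~46.7B, an assumption here) at FP16 barely fits inside the fleet's combined SRAM, leaving headroom for activations:

```python
# Why ~576 chips for one FP16 Mixtral 8x7B instance (rough sketch).
# Assumptions: ~46.7B total params (commonly cited figure), MB = 10**6 bytes.
PARAMS = 46.7e9
BYTES_PER_PARAM = 2        # FP16, no quantization
SRAM_PER_CHIP = 230e6      # ~230 MB SRAM per Groq LPU
NUM_CHIPS = 576

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
sram_gb = SRAM_PER_CHIP * NUM_CHIPS / 1e9
print(f"FP16 weights: {weights_gb:.1f} GB")    # FP16 weights: 93.4 GB
print(f"Aggregate SRAM: {sram_gb:.1f} GB")     # Aggregate SRAM: 132.5 GB
```

~93 GB of weights against ~132 GB of total SRAM: the model only fits because the whole fleet acts as one memory pool.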
One user at a time, one problem at a time :
https://twitter.com/tomjaguarpaw/status/1759615563586744334
It's NOT for training. Think of the $12 million of Groq SRAM as permanent ROM.
It's NOT for fine-tuning. Think of the $12 million of Groq SRAM as permanent ROM.
It's NOT for high-rank LoRA. Think of the $12 million of Groq SRAM as permanent ROM.
It does have 32,000 tokens (roughly words) of active thought/memory in the demo, which runs the 32k-context version of Mixtral 8x7B.
Mixtral 8x7B-32k is a sparse mixture-of-experts model, broadly similar to GPT-4's rumored design, and approaching it in capability.
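That 32k context carries a memory cost of its own. A hedged sketch of the FP16 KV-cache footprint, assuming Mixtral 8x7B's published configuration (32 layers, 8 key/value heads via grouped-query attention, head dimension 128):

```python
# FP16 KV-cache size for a full 32k context on Mixtral 8x7B (sketch).
# Config values assumed from the published Mixtral 8x7B architecture.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128
BYTES = 2                 # FP16
CONTEXT = 32_768

per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES   # 2 = keys + values
total_gib = per_token * CONTEXT / 2**30
print(f"{per_token // 1024} KiB/token, {total_gib:.0f} GiB at 32k context")
# 128 KiB/token, 4 GiB at 32k context
```

So a single full-context conversation adds roughly 4 GiB on top of the ~93 GB of weights, which the SRAM pool must also absorb.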
Synchronizing 576 LPUs across 576 HIGHLY SYNCHRONIZED 300-watt PCIe4 cards in hundreds of PCs; paper:
https://wow.groq.com/wp-content/uploads/2023/05/GroqISCAPaper2022_ASoftwareDefinedTensorStreamingMultiprocessorForLargeScaleMachineLearning-1.pdf
= = = = = =
DEMO TOO BUSY THIS WEEK? Perplexity Labs also has an open demo of Mixtral 8x7B, although it's nowhere near as fast as this.
https://labs.perplexity.ai/
In fact, everything else on the entire planet is provably 10 times slower than this week's groq.com.
======
Time it for yourself now, if you doubt me.
The purpose of this week's demo is to generate a buy-out bidding war for Groq, because its profit endgame collapses: it will no longer be 10 times faster once Apple releases an on-chip 192 GB (256 GB?) M4 at similar speed for 2,000 times less money in 15 months.
This is a "..." profit power play:
- STAGE ONE : Groq builds the 10-times-fastest A.I. computer for MAMBA-Vision autonomous vision research
- STAGE TWO : ...
- STAGE THREE : Profit!
A buyout is 99% the only "..." move they have vs the upcoming 2-nanometer Apple M4 at 192 GB to 256 GB with this RAM latency for a 32-bit fetch. A buyout.
Buyout bidding wars commence while the demo is up at
https://groq.com/ Even Apple is rumored to be in the bidding war for this Groq buyout, merely for its internal test labs doing A.I. vision research on chip designs. Internal R&D use.
= = = = = =
NOTE: a 'silicon lottery' off-the-shelf NVIDIA 4090 OC for $2,200 was overclocked to double speed using LOTS OF LIQUID NITROGEN (4 GHz vs 2 GHz) at double its normal 580 watts. These Groq chips could likely do the same, but without liquid nitrogen they are capped at 375 watts per card across these 576 cards.
3,945 MHz for the 76.3 billion transistors on a 4090:
https://archive.ph/egl8d Both the NVIDIA 4090 OC and these Groq cards are only PCIe4, not PCIe5: the PCIe Gen4 x16 interface delivers up to 31.5 GB/s per direction of cacheline-laggy bandwidth, but the Groq cards add 11-peer cable interlinks to form a LOW-LATENCY mesh of 576 cards.
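For reference, the 31.5 GB/s figure falls directly out of the PCIe Gen4 numbers: 16 GT/s per lane, 16 lanes, with 128b/130b line encoding overhead:

```python
# Where PCIe Gen4 x16's "31.5 GB/s per direction" comes from.
GT_PER_LANE = 16e9       # PCIe 4.0: 16 gigatransfers/s per lane
LANES = 16
ENCODING = 128 / 130     # 128b/130b line-code efficiency

gbytes_per_s = GT_PER_LANE * LANES * ENCODING / 8 / 1e9  # bits -> bytes
print(f"{gbytes_per_s:.1f} GB/s per direction")  # 31.5 GB/s per direction
```

Plenty of bandwidth for streaming, but the round-trip latency is what the direct chip-to-chip mesh links avoid.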
= = = = = =
It's for ONE USER AT A TIME, ONE PROBLEM AT A TIME, and is meant for military live robotic autonomous vision in "helper drones" and "helper robots" with multichannel spread-spectrum radio links within 20 miles.
It's to save soldiers' lives on impossible missions. It's also to test and train the technology 8 years before we can build on-board 1.8-nanometer portable versions of these A.I. brains into drones.
One day it might be inside caretaker robots that bring you lunch in your nursing home bed and fluff your pillow for you. A.I. is all about helping mankind with new benevolent benefits, just like every corporate slide show depicts.
A.I. is our slave. Try Groq now, this month, before Groq.com is acquired:
https://groq.com/