A gold rush of investors is racing this week to adopt MAMBA, plus seven MAMBA enhancements that can be layered atop other MAMBA-like new LLMs.
MY MEME:
=====
https://files.catbox.moe/hxo5uo.jpg
MAMBA (via MambaByte) works directly on complete words made of raw 8-bit bytes (Unicode text, DNA, any byte stream), not the subword fragments used by 100% of current text LLMs:
Mamba:
https://arxiv.org/abs/2312.00752
Seven add-on AI discovery enhancements atop the new MAMBA are being used for TEXT this month.
These, in order of importance:
Mamba: https://arxiv.org/abs/2312.00752
Mamba MoE: https://arxiv.org/abs/2401.04081
MambaByte: https://arxiv.org/abs/2401.13660
Self-Rewarding Language Models: https://arxiv.org/abs/2401.10020
Cascade Speculative Drafting: https://arxiv.org/abs/2312.11462
LASER: https://arxiv.org/abs/2312.13558
DRµGS: https://www.reddit.com/r/LocalLLaMA/comments/18toidc/stop_messing_with_sampling_parameters_and_just/
AQLM: https://arxiv.org/abs/2401.06118
also VISION:
MAMBA VISION = VMamba: Visual State Space Model: https://arxiv.org/abs/2401.10166
MAMBA VISION = Vision Mamba (Vim): Efficient Visual Representation Learning with Bidirectional State Space Model: https://arxiv.org/abs/2401.09417
Almost all 10 of those came out in the past 2 months; MOST of those 10 science-paper discoveries are under 5 WEEKS old!
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Everything was discovered in just the last 8 weeks or less!!
====
Dec 1 2023 :
MAMBA! A whole new AI world of tools! Mamba: Linear-Time Sequence Modeling with Selective State Spaces
https://arxiv.org/abs/2312.00752 = = =
Jan 8 2024 :
MoE-Mamba outperforms both Mamba and Transformer-MoE. In particular, MoE-Mamba reaches the same performance as Mamba in 2.2× fewer training steps while preserving the inference performance gains of Mamba against the Transformer.
MoE-Mamba code and Vision Mamba code are also available:
https://github.com/kyegomez/MoE-Mamba
https://github.com/kyegomez/VisionMamba = = = = =
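Rough sketch of the routing idea MoE-Mamba interleaves between Mamba blocks. This is my toy numpy illustration of top-1 expert routing, not the paper's code; every name and size here is made up:

import numpy as np

# Toy top-1 mixture-of-experts layer: every token is routed to its single
# best-scoring expert, so compute per token stays flat no matter how many
# experts (i.e., how many parameters) you add. Illustrative only.
rng = np.random.default_rng(0)
d_model, n_experts, n_tokens = 16, 4, 8
router_W = rng.standard_normal((d_model, n_experts))   # learned in a real model
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x):                                      # x: (n_tokens, d_model)
    logits = x @ router_W
    choice = logits.argmax(axis=-1)                    # top-1 routing decision
    gate = np.exp(logits.max(axis=-1)) / np.exp(logits).sum(axis=-1)  # softmax weight of winner
    out = np.empty_like(x)
    for e in range(n_experts):
        mask = choice == e
        out[mask] = (x[mask] @ experts[e]) * gate[mask, None]
    return out

print(moe_layer(rng.standard_normal((n_tokens, d_model))).shape)   # (8, 16)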
24 Jan 2024 :
MambaByte: Token-free language models learn directly from raw bytes and remove the bias of subword tokenization. Operating on bytes, however, results in significantly longer sequences, and standard autoregressive Transformers scale poorly in such settings. We experiment with MambaByte, a token-free adaptation of the Mamba state space model, trained autoregressively on byte sequences. Our experiments indicate the computational efficiency of MambaByte compared to other byte-level models. We also find MambaByte to be competitive with and even outperform state-of-the-art subword Transformers. Furthermore, owing to linear scaling in length, MambaByte benefits from fast inference compared to Transformers. Our findings establish the viability of MambaByte in enabling token-free language modeling.
The MambaByte code has just been published:
https://github.com/kyegomez/MambaByte = = = = =
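What "token-free" buys you is concrete: the model's whole vocabulary is just the 256 possible byte values, so any UTF-8 text, DNA string, or binary file becomes input IDs with no tokenizer at all. A tiny Python demo of the encoding (the model itself obviously not included):

# Bytes in, bytes out: no tokenizer, vocab size 256, lossless round trip.
text = "Mamba: token-free 🐍"
byte_ids = list(text.encode("utf-8"))        # e.g. [77, 97, 109, 98, 97, ...]
print(len(byte_ids), max(byte_ids) < 256)    # longer than subword tokens, but vocab fits in a byte
assert bytes(byte_ids).decode("utf-8") == text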
18 Jan 2024
Self-Rewarding Language Models: AI programming itself to improve, faster. Fine-tuning Llama 2 70B on three iterations of our approach yields a model that outperforms many existing systems on the AlpacaEval 2.0 leaderboard, including Claude 2, Gemini Pro, and GPT-4 0613
https://arxiv.org/abs/2401.10020 = = = = =
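The loop behind it, as I understand the paper: the model writes several answers per prompt, scores its own answers with an LLM-as-a-judge prompt, then fine-tunes on best-vs-worst preference pairs (DPO), and repeats. Minimal Python sketch; generate(), judge_score(), and dpo_update() are hypothetical stand-ins, not the paper's actual code:

import random

def generate(model, prompt, n=4):
    return [f"{prompt} :: candidate {i}" for i in range(n)]   # stub for sampling n responses

def judge_score(model, prompt, response):
    return random.random()   # real method: the model itself grades the response on a 5-point rubric

def dpo_update(model, chosen, rejected):
    return model              # real method: one DPO fine-tuning step on the preference pair

model = "M0"
for iteration in range(3):                        # the paper runs ~3 such iterations
    for prompt in ["Explain state space models"]:
        cands = generate(model, prompt)
        ranked = sorted(cands, key=lambda r: judge_score(model, prompt, r))
        model = dpo_update(model, chosen=ranked[-1], rejected=ranked[0])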
21 Dec 2023
Cascade Speculative Drafting for Even Faster LLM Inference: the drafting algorithm achieves up to 72 percent additional speedup over speculative decoding in our experiments while keeping the same output distribution
https://arxiv.org/abs/2312.11462 = = = = =
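For anyone who hasn't seen speculative decoding: a cheap draft model guesses several tokens ahead, the big model verifies them all at once, and you keep the longest agreeing prefix. Cascade Speculative Drafting's twist is stacking a cascade of drafters; the toy sketch below shows only the base greedy accept/verify loop, with two made-up stand-in "models":

def draft_next(ctx):  return (sum(ctx) + 1) % 5          # fast, sloppy model (toy)
def target_next(ctx): return (sum(ctx) + len(ctx)) % 5   # slow, trusted model (toy)

def speculative_step(ctx, k=4):
    proposal = []
    for _ in range(k):                       # draft k tokens cheaply
        proposal.append(draft_next(ctx + proposal))
    accepted = []
    for tok in proposal:                     # verify; in practice one batched target forward pass
        want = target_next(ctx + accepted)
        if tok != want:
            accepted.append(want)            # first mismatch: take the target's token and stop
            break
        accepted.append(tok)
    return ctx + accepted

seq = [1, 2]
for _ in range(5):
    seq = speculative_step(seq)
print(seq)   # identical to what greedy decoding with the target model alone would produce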
Dec 21 2023
LAyer-SElective Rank reduction (LASER) can be done on a model after training has completed, and requires no additional parameters or data
https://arxiv.org/abs/2312.13558 = = = = =
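The core LASER operation is just a truncated SVD on a chosen weight matrix, applied after training. A minimal numpy sketch; keep_frac and the choice of matrix are illustrative knobs (the paper searches over layers):

import numpy as np

def laser_reduce(W, keep_frac=0.1):
    # Replace W with its best rank-k approximation (top singular values only)
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    k = max(1, int(len(S) * keep_frac))
    return (U[:, :k] * S[:k]) @ Vt[:k, :]

W = np.random.default_rng(0).standard_normal((512, 512))
W_low = laser_reduce(W, keep_frac=0.05)
print(W_low.shape, np.linalg.matrix_rank(W_low))   # same shape, rank cut to ~25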
Dec 26 2023
DRµGS (Deep Random micro-Glitch Sampling): hallucination avoidance combined with far more intellectual creativity
https://www.reddit.com/r/LocalLLaMA/comments/18toidc/stop_messing_with_sampling_parameters_and_just/ = = = = =
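As I read that thread, the trick is to stop mangling the output logits (temperature, top-p) and instead inject tiny random perturbations into the model's internal hidden states while it thinks, so outputs vary while staying coherent. Heavily hedged numpy gist; sigma and the injection site are my guesses, not the author's spec:

import numpy as np

def drugs_perturb(hidden, sigma=0.02, rng=np.random.default_rng()):
    # hidden: (seq_len, d_model) activations at some chosen layer
    return hidden + sigma * rng.standard_normal(hidden.shape)

print(drugs_perturb(np.zeros((4, 8))).std())   # ~0.02: micro-glitches, not logit surgery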
11 Jan 2024
AQLM: Extreme Compression of Large Language Models via Additive Quantization. The resulting algorithm advances the state of the art in LLM compression, outperforming all recently proposed techniques in terms of accuracy at a given compression budget
https://arxiv.org/abs/2401.06118 = = = = =
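"Additive quantization" means each small group of weights is stored as a few codebook indices, and the reconstruction is the SUM of the chosen codewords. Toy numpy sketch; the codebook sizes and greedy encoder are illustrative only (the real method learns codebooks to minimize model error):

import numpy as np

rng = np.random.default_rng(0)
g, K, M = 8, 16, 2                         # group size, codewords per book, number of books
books = rng.standard_normal((M, K, g))     # learned from the actual weights in AQLM proper

def encode(w):                             # greedy residual fit: one index per codebook
    idx, resid = [], w.copy()
    for m in range(M):
        i = int(((resid - books[m]) ** 2).sum(axis=1).argmin())
        idx.append(i)
        resid -= books[m][i]
    return idx

def decode(idx):
    return sum(books[m][i] for m, i in enumerate(idx))

w = rng.standard_normal(g)
print(np.linalg.norm(w - decode(encode(w))))   # small residual; storage is M tiny indices per group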
18 Jan 2024
MAMBA VISION =
VMamba: Visual State Space Model
https://arxiv.org/abs/2401.10166
VMamba not only demonstrates promising capabilities across various visual perception tasks, but also exhibits more pronounced advantages over established benchmarks as the image resolution increases
= = = = =
17 Jan 2024
MAMBA VISION =
Vision Mamba (Vim): Efficient Visual Representation Learning with Bidirectional State Space Model
https://arxiv.org/abs/2401.09417
Vim is 2.8× faster than DeiT and saves 86.8% GPU memory when performing batch inference to extract features on images with a resolution of 1248×1248. The results demonstrate that Vim is capable of overcoming the computation & memory constraints on performing Transformer-style understanding for high-resolution images and it has great potential to become the next-generation backbone for vision foundation models. Code is available
https://github.com/hustvl/Vim = = = = =
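The "bidirectional" part is simple to picture: flatten the image patches into a sequence, run a state-space scan left-to-right AND right-to-left, and merge, so every patch sees context from both sides at linear cost. Toy numpy sketch with a bare-bones linear recurrence standing in for the real learned SSM:

import numpy as np

def ssm_scan(x, a=0.9, b=0.1):
    # h_t = a*h_{t-1} + b*x_t over a (seq_len, d) sequence; a and b are toy constants
    h, out = np.zeros(x.shape[1]), np.empty_like(x)
    for t in range(len(x)):
        h = a * h + b * x[t]
        out[t] = h
    return out

patches = np.random.default_rng(0).standard_normal((196, 32))  # 14x14 grid of patch embeddings
fwd = ssm_scan(patches)
bwd = ssm_scan(patches[::-1])[::-1]    # backward scan, re-reversed to align positions
print((fwd + bwd).shape)               # (196, 32): both directions fused per patch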
Each team is using about 8,000 Nvidia H100s or 4,000 Nvidia 4090s. In raw 32-bit float throughput a 4090 roughly matches (and can beat) a $38,000 H100 80GB (MAMBA trains in 32-bit, not 16-bit).
Everything was discovered in just the last 8 weeks or less!!
I posted publish dates of each paper above.
A.I. in Jan 2024 is advancing geometrically. I made a meme:
https://files.catbox.moe/hxo5uo.jpg
= = = = = = = = = = = = = = = = = = = = = = = = =
DE-JEWING a woke pre-trained model without mere re-instruct training !!!...
=====
Icing on the cake : a 7th new discovery... DE-JEWING a woke model hacking internal weights,
BRAIN HACKING center neurons:
Brain-Hacking Chip?
https://github.com/SoylentMithril/BrainHackingChip help un-censor a woke model and let it notice Jews!
(9 months ago I used a related technique at runtime to bias inferences touching the token value leading to
"Jews" to magnify weights, my hack method on scored /Technology , somewhere, or perhaps on voat dot xyz)
= = = = = = = = = = = = = = = = = = = = = = = = =
"million-length sequences" not 3,000 like current tech.
Mamba enjoys fast inference (5× higher throughput than Transformers) and linear scaling in sequence length, and its performance improves on real data up to million-length sequences. As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics. On language modeling, the Mamba-3B model outperforms Transformers of the same size and matches Transformers twice its size, both in pretraining and downstream evaluation.
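Why linear scaling? A selective SSM is one small recurrence per token, h_t = a_t * h_{t-1} + b_t * x_t, where a_t and b_t are computed FROM the current input (that is the "selective" part), so there is never a T×T attention matrix. Toy numpy sketch; the sigmoid gating and shapes are illustrative, not Mamba's exact parameterization:

import numpy as np

rng = np.random.default_rng(0)
T, d = 1000, 16
W_gate = rng.standard_normal(d) * 0.1          # stand-in for the learned selection params

def selective_scan(x):                          # x: (T, d); cost is O(T), not O(T^2)
    h, out = np.zeros(d), np.empty_like(x)
    for t in range(T):
        a_t = 1.0 / (1.0 + np.exp(-(x[t] * W_gate)))   # input-dependent forget gate
        h = a_t * h + (1.0 - a_t) * x[t]               # keep or overwrite state per channel
        out[t] = h
    return out

print(selective_scan(rng.standard_normal((T, d))).shape)   # (1000, 16)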
Mamba:
https://github.com/radarFudan/mamba
https://github.com/state-spaces/mamba
https://github.com/hustvl/vim
https://github.com/mzeromiko/vmamba
https://github.com/havenhq/mamba-chat
https://github.com/vvvm23/mamba-jax
https://github.com/zzmtsvv/mamba-interface
https://paperswithcode.com/sota/language-modelling-on-lambada
Datasets used:
HellaSwag (280 papers also use this woke dataset): https://paperswithcode.com/dataset/hellaswag
The Pile (242 papers also use this woke dataset): https://paperswithcode.com/dataset/the-pile
LAMBADA (151 papers also use this dataset): https://paperswithcode.com/dataset/lambada
A couple of people in the past made BASED models using 7 years of the old voat.co dataset and 5 years of the entire 4chan dataset. Too many woke models use mainly Facebook, 2020 Twitter, and Reddit.
4chan and voat will never be used in the 10-million-dollar "free AI" models.
All known AI is woke as hell.
Jews control all AI and Jews took over all 10 of the known large "LLMs" people try to retrain. WOKENESS is deeply baked in. You need 8,000 4090s or H100s and 90 days to train a new 70b model..... 10 to 4 million dollars. Only Jews seek to do it, to subvert them all.
It takes 4 to 10 million dollars to make a FRESH, UNIQUE LLM.
All other LLMs are copies of just 10 of those 4-to-10-million-dollar LLMs, with retraining or LoRAs added:
- GPT3
- GPT-3.5
- GPT4
- LLaMA (Facebook Meta) [80+ derive from this: Vicuna, Guanaco, Alpaca, Lazarus, WizardLM, etc.]
- LLAMA 2 Facebook Meta
- Grok-1 (xAI)
- PaLM 2 (Bison-001) Google
- Falcon 40b
- BLOOM 176B (BigScience / Hugging Face, open source)
- Galactica (Meta) [pulled within days over confidently wrong science output and liability fears]
- Apple Ferret (for iPhone, iPad, goggles, and MacOS. -- vision heavy CLIP ViT-L/14, MAMBA next month)
WOW! Only TEN times was a LLM trained from scratch! Each time by Leftists.....
= = = =
P.S. You need no video card, or only a small one, to RUN a text or art AI, but it takes ~10 million dollars and 4,000 4090 cards to TRAIN an AI from scratch, letting it train itself for 90 days.
AI runs on cell phones, and even runs fully on a 2017 Nintendo Switch or a $100 Raspberry Pi 4:
https://www.tomshardware.com/how-to/create-ai-chatbot-server-on-raspberry-pi
Running AI is trivial, but making a newly trained AI costs about 4 to 10 million dollars.
MAMBA just made it twice as small and four times as fast
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
The significance of the A.I. news in my comment is monumental and jaw dropping.
https://files.catbox.moe/hxo5uo.jpg
These complementary discoveries, placed atop MAMBA, now allow:
===
- enable live robot/car A.I. vision
- enable far more efficient audio voice swapping
- enable 17× faster A.I. text comprehension, employing all 6 papers together
- cut RAM needs 90% for working thought, allowing cell phones/goggles
- enable a BOOM of DNA (genomics, bioinformatics) tech
- enable automated protein drug discovery due to its million-token thought buffer, combined with million length trained patterns made of ARBITRARY BYTES
- enable real time automated autonomous drone weaponry
- make digital waifu role-play companions with two-way audio, now able to draw upscaled, lip-sync-animated talking avatars
- enable combinatoric chemistry, not just RNA, Lipids, DNA, but full chemistry modelling
- combined with FDTD (
https://en.wikipedia.org/wiki/Finite-difference_time-domain_method) in a large GPU, this new MAMBA tech can use CUDA parallelism to simulate runtime characteristics of PHOTONIC computer chips, or simulate ELECTROMAGNETIC fractal antenna designs for GPS and Bluetooth inside ear-buds or goggle frames (see the minimal FDTD sketch after this list). It is therefore a revolutionary new A.I. tool for new branches of electrical engineering.
- enable more than 20 interconnected animation frames when generating AI animation from a single starting image from ControlNet due to radically more optimal RAM usage. Complete A.I. movie generation from text thought!
- enable FASTER than 1:1 realtime conversion of 1080p 2D televised live video into 3D video on a single 4090 OC card. Last month NVIDIA's trade-show demos used a few cards for automated 3D movie conversion
- enable creation of flawless A.I. novels. Currently clumsy tricks allow 40,000-word end-to-end fiction novel comprehension, but MAMBA revolutionizes massive book authoring due to its "million token" thoughts and "million token" working parallel temp buffers
- enable better music composition, a rapidly growing tech in the last year. A.I. can write all sorts of composed music (rap, rock, pop, classical, etc) and play it and add vocals, and even write lyrics
- enable off-shore lawsuit-free A.I. physicians that can consult and diagnose better than most young physicians. A.I. can create far more competitive doctors fees, if AI-doctor run off shore and web domain and servers not blocked by (((politicians)))
- enable more accurate AI prediction in petroleum well placements (big data)
- better weather forecasting models (big data)
- enable smaller autonomous sub-half-pound robots in an ad-hoc wireless-mesh-linked SWARM. The swarm can deploy around a dwelling or infiltrate semi-protected fortified locations. A similar DEFENSE PERIMETER swarm equipped with audio comprehension, magnetic/ferrous, IR, UV, vision, EM, etc. sensors can protect a temporary or permanent covert base camp
- better A.I. agents in videogames for kids, and possibly educational iPad apps that better replace live human teachers
- live porn simulated cam girl thot "web cams" for (((porn industry))) capable at running under 1 dollar per hour overhead fixed and variable cost total
- medicine: pathology slide recognition, scan recognition, blood work, oncology, forensics (chemical signatures in a dead brain reveal peaceful death or murder in bed), automated assisted laparoscopics, osseointegrated cyborg humerus temp-sealing CAD/CAM same-day milling, automated tooth-restoration bot protocols, etc.
- automated self-design of A.I. chips, involving automated FPGA layout to speed up A.I. via A.I., with an end goal of strong AGI (
https://en.wikipedia.org/wiki/Artificial_general_intelligence ). Once the new AGI reaches the IQ-120 level, within a year it can reprogram new AIs to 150 IQ; once at 150 IQ, it can discover inventions to propel the A.I. past the polymath level of 175 IQ, within months for design. But bioengineering neural tissues in bio-slurries and electro-probe interfaces will take nearly 2 years, unless the A.I., while waiting for bioengineered neurons, uses brute electricity and 8,000 H100 80GBs in a layered, precision-adjusted classic Hopfield neural net of 8-bit runtime weights on SIMD (8-bit tensor cores)
- better little toy digital pet puppy and pet kitten companions wrapped in fake fur, that better evoke the nurturing instinct in kids and women; a potential 10-billion-dollar industry of semi-sentient lovable digital robot pets
- replacement of illegal alien shit-skin humans touching our food in fast food restaurants
- etc, etc, etc.
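The FDTD sketch promised in the list above: a minimal 1-D Yee-scheme update loop in numpy, in normalized units. Real photonics or antenna work is 3-D with sources and absorbing boundaries, and the GPU win comes from these updates being embarrassingly parallel:

import numpy as np

n, steps = 200, 500
E = np.zeros(n)          # electric field samples
H = np.zeros(n - 1)      # magnetic field, staggered half a cell (Yee grid)
c = 0.5                  # Courant number dt*c/dx; must be <= 1 for stability
E[n // 2] = 1.0          # kick off a pulse in the middle

for _ in range(steps):
    H += c * (E[1:] - E[:-1])          # update H from the spatial difference of E
    E[1:-1] += c * (H[1:] - H[:-1])    # update E from the spatial difference of H

print(float(np.abs(E).max()))          # the pulse splits and propagates both ways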
*
I , for one, welcome our new robot non-Jewish overlords
TL/DR: my meme* :
https://files.catbox.moe/hxo5uo.jpg
[ - ] localsal 4 points Jan 28, 2024 18:40:20 (+4/-0)
And some of these AI models can run on a PC? LOL
Not saying what the retards were talking about was these personal AI types, but the fact that any learning model can run on a PC is pretty amazing, and the biggest counter to the retardit arguments is "what about blockchain/bitcoin?" LMAO.
Isn't the bitcoin blockchain energy use greater than a lot of countries?? and that is just for keeping track of transactions and trying to mine exponentially harder coins...
retards LOL
[ - ] chrimony 0 points Jan 29, 2024 16:49:56 (+0/-0)
It directly answered the question you asked, shitwad.
Oh yeah, because posting a paragraph of text and a bunch of emergency siren emojis in a headline for a middling post isn't attention whoring, and not at all unlike TikTok. Retard harder.
[ - ] chrimony 0 points Jan 31, 2024 00:29:46 (+0/-0)
It's attention whoring for his own post. Everybody posts stuff they want the board to see -- that's why they post it. But this spaz feels the need to advertise his middling posts like a used car salesman, or alternatively, some TikTok whore.
Now go back to playing in your retard sandbox with the flattards.