MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU

234 points | by chrsw 9 hours ago

43 comments