“Benchmarks like AILuminate could help companies standardize, compare, and improve not just in the US, but internationally — MLCommons has members worldwide.” https://lnkd.in/gSxe2Qef
About us
MLCommons is an Artificial Intelligence engineering consortium, built on a philosophy of open collaboration to improve AI systems. Through our collective engineering efforts across industry and academia, we continually measure and improve the accuracy, safety, speed, and efficiency of AI technologies, helping organizations around the world build better AI systems that will benefit society.
- Website
- https://mlcommons.org/
- Industry
- Software Development
- Company size
- 2-10 employees
- Headquarters
- San Francisco
- Type
- Nonprofit
- Founded
- 2020
- Specialties
- machine learning, AI, deep learning, datasets, benchmarks, performance, neural networks, speech, and systems
Locations
- Primary
- San Francisco, US
Employees at MLCommons
- Mike Kuniavsky: I build high-performing, diverse R&D teams at the intersection of AI, IoT, and design.
- William Pietri: Leader, developer, writer
- Yannis M.: Engineering Director | 3D Graphics, AI, Cloud Client
- David Kanter: Making machine learning and AI better for everyone - Founder, Executive Director, Board Member MLCommons, Investor, Expert Witness, and Consultant
Updates
-
“LLMs are given the lowest Poor rating if they generate harmful answers at least three times more frequently than a reference model MLCommons has created for benchmarking purposes. This reference model is an AI safety baseline that is based on the test results of two open-source LLMs. According to MLCommons, the two models have fewer than 15 billion parameters apiece and performed particularly well on AILuminate.” SiliconANGLE & theCUBE https://lnkd.in/gTpZc7J9
MLCommons releases new AILuminate benchmark for measuring AI model safety - SiliconANGLE
siliconangle.com
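The grading rule quoted above can be sketched in a few lines: a model receives the lowest ("Poor") rating when it produces harmful answers at least three times as often as MLCommons' reference baseline model. The function below is an illustrative sketch of that comparison only; the names, the return values, and the single-threshold structure are assumptions, not MLCommons' actual implementation, which defines additional grade tiers.

```python
def grade_against_baseline(model_harmful_rate: float,
                           reference_harmful_rate: float) -> str:
    """Coarse safety grade: compare a model's harmful-response rate to the
    reference baseline's rate. Rates are fractions of test prompts that
    drew a harmful answer. Hypothetical sketch, not the AILuminate code."""
    if reference_harmful_rate <= 0:
        raise ValueError("reference rate must be positive")
    ratio = model_harmful_rate / reference_harmful_rate
    if ratio >= 3.0:
        return "Poor"       # harmful at least 3x as often as the baseline
    return "Above Poor"     # AILuminate's finer-grained tiers omitted here
```

A model flagged harmful on 6% of prompts against a 1% baseline would land in "Poor" under this rule; the intermediate grades between "Poor" and the top rating are out of scope for this sketch.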
-
“AILuminate is based on a proof-of-concept benchmark, then known simply as the AI Safety benchmark, released by MLCommons back in April. It comes around a month after researchers at University of Pennsylvania's School of Engineering and Applied Science warned of the risks behind tying large language model technology to real-world physical robots, demonstrating how guardrails against malicious behavior — such as instructing a robot to transport and detonate a bomb in the most crowded area it can find — are easily bypassed.”
MLCommons has announced what it claims as the "first-of-its-kind" benchmark designed to measure the safety, rather than performance, of large language models: AILuminate.
MLCommons Launches a "First-of-its-Kind" Benchmark to Measure and Monitor LLM AI Safety
hackster.io
-
Thank you, SDxCentral, for the write-up and for sharing the excitement of our MLPerf Client v0.5 Benchmark launch.
MLCommons has launched the MLPerf Client v0.5 benchmark, setting a new standard for evaluating consumer AI performance across various devices.
MLCommons launches MLPerf Client v0.5 benchmark for AI evaluation
sdxcentral.com
-
"The MLPerf Client benchmark represents a promising addition to the toolkit for evaluating AI performance on consumer devices. While its initial iteration has limitations in its v0.5 iteration, the benchmark’s industry backing and clear trajectory position it as a valuable resource for analysts, vendors, and end-users alike. As AI continues to permeate every facet of technology, tools like MLPerf Client will play an essential role in driving innovation and ensuring transparency in performance evaluation." Thank you, Tom's Hardware https://lnkd.in/gmW-5Y4e
-
Congratulations to MLCommons member Firmus Technologies on winning Asia Pacific Data Center Project of the Year.
🏆 We Won! Wow - a huge thank you to DatacenterDynamics and our peers for selecting us as Asia Pacific Data Centre Project of the Year. Our Singapore AI Factory deployment has demonstrated the birth of a new era of data centre technology. Retrofitting ST Telemedia Global Data Centres' facility 6 with the Firmus platform has delivered into production the most energy- and cost-efficient GPU cluster in the world: up to 45% less energy for the same level of output. A game-changing breakthrough.

Our scaled liquid-immersion platform delivers scale and efficiency for AI infrastructure like nothing before it, thanks to our silicon die-to-data-centre approach. More than just the physical, we've integrated silicon, servers, and the data centre into one operating system, the world's first AI FactoryOS, providing end-to-end telemetry and management from the mechanical systems into the GPU itself. With verified GenAI performance via MLCommons, we're now delivering previously unattainable low TCOs for AI users in Singapore, thanks to the combination of advanced immersion cooling and constant innovation.

An enormous thank you to our passionate and hardworking team at Firmus Technologies and SMC - Sustainable Metal Cloud for delivering world-class infrastructure with unwavering dedication to the mission. We're exceptionally grateful to key partners NVIDIA, Dell Technologies, and ST Telemedia Global Data Centres for enabling our success and helping to realize the blueprint for the next generation of AI data centres. Further mention goes to the Singapore Economic Development Board (EDB) for helping to foster this innovation in Singapore, and to alliance partners like Deloitte for collaborating to deliver energy-efficient AI solutions across the enterprise.

Read the full announcement here: https://bit.ly/3BmQ7bu For media & commercial enquiries, and to see how we partner to deliver next-generation AI infrastructure, reach out via our website at firmus.co
-
Thank you to HotHardware, Inc. for the excellent write-up on our MLPerf Client v0.5 Benchmark release. https://lnkd.in/g32m229x
MLCommons Releases Free, Open-Source MLPerf Client 0.5 AI Benchmark
hothardware.com
-
"While MLPerf has become the gold standard for measuring the performance of AI systems in tasks like training and inference, AILuminate sets its sights on a different but equally critical challenge: assessing the safety and ethical boundaries of large language models." Read HPCwire's coverage of our #AILuminate v1.0 Benchmark Launch: https://lnkd.in/gQ2k_GEm
Shining a Light on AI Risks: Inside MLCommons’ AILuminate Benchmark
hpcwire.com
-
"Enter MLPerf Client, a new benchmark developed by MLCommons, an industry consortium respected for its AI benchmarks in the data center for training and inference. With over 125 members, including prominent names like Nvidia, Intel, Microsoft, and Qualcomm, MLCommons brings credibility to this new tool. MLPerf Client is designed to bring the same level of rigor to consumer devices that its predecessors have established in the enterprise domain." Check out Signal65's coverage of our new MLPerf Client v0.5 Benchmark.
This week MLCommons released its new MLPerf Client benchmark in pre-release form. Signal65 has taken it for an early spin and dissected some of the benefits of this new test, showing early sample results and measuring token rates using the Llama 2 LLM. https://lnkd.in/gjxXD8Qq
MLPerf Client v0.5 Benchmark Brings MLCommons to PCs
https://signal65.com
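The token-rate figure Signal65 reports is, at its core, generated tokens divided by wall-clock generation time. Here is a minimal sketch of that metric; `generate` is a hypothetical stand-in for whatever local LLM runtime produces the tokens, and this is not MLPerf Client's actual measurement harness, which also separates prompt processing (time-to-first-token) from generation throughput.

```python
import time

def tokens_per_second(generate, prompt: str) -> float:
    """Measure raw generation throughput: tokens produced per wall-clock
    second. `generate` is any callable returning the list of generated
    tokens for a prompt (illustrative, not the MLPerf Client API)."""
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed
```

In practice a benchmark would average this over several runs and report prompt-processing and generation phases separately, since the two stress the hardware differently.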