AI training costs are growing exponentially. IBM says quantum computing could be a solution.
Earlier this month, the Wall Street Journal reported that a third of nuclear power plants are in talks with tech companies to power their new data centers. Meanwhile, Goldman Sachs projected that AI is going to drive a 160% increase in power usage by data centers from now until 2030. That is going to take carbon dioxide emissions to more than double current levels. Each ChatGPT query is estimated to take at least 10 times as much energy as a Google search. The question is: will the exponentially growing cost of training AI models ultimately limit the potential of AI?
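Those reported figures translate into a rough sense of scale. A back-of-envelope sketch, where the absolute per-query numbers are commonly cited public estimates rather than values from this article:

```python
# Back-of-envelope sketch of the figures above. The absolute per-query numbers
# are commonly cited public estimates, not values reported in this article.
GOOGLE_SEARCH_WH = 0.3                      # assumed ~0.3 Wh per web search
CHATGPT_QUERY_WH = GOOGLE_SEARCH_WH * 10    # "at least 10 times as much energy"
POWER_GROWTH = 1.60                         # Goldman Sachs: +160% by 2030

queries_per_day = 1e9                       # purely illustrative query volume
extra_gwh_per_day = queries_per_day * (CHATGPT_QUERY_WH - GOOGLE_SEARCH_WH) / 1e9
print(f"Extra energy at that volume: ~{extra_gwh_per_day:.1f} GWh/day")
print(f"Projected data-center power usage in 2030: ~{1 + POWER_GROWTH:.1f}x today")
```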
VB Transform 2024 tackled the topic in a panel led by Hyunjun Park, co-founder and CEO of CATALOG. To talk about the scope of the problem and potential solutions, Park welcomed to the stage Dr. Jamie Garcia, director of quantum algorithms and partnerships at IBM; Paul Roberts, director of strategic accounts at AWS; and Kirk Bresniker, chief architect at Hewlett Packard Labs, as well as an HPE Fellow and VP.
Unsustainable resources and inequitable technology
“The 2030 touchdown is just far enough that we can make some course corrections, but it’s also real enough that we should be considering the ramifications of what we’re doing right now,” Bresniker said.
Somewhere between 2029 and 2031, he added, the cost of the resources needed to train a single model, one time, will surpass U.S. GDP, and it will surpass worldwide IT spending by 2030. We are headed for a hard ceiling, and now is when decisions must be made, not just because the cost will become impossible.
“Because inherent in the question of sustainability is also equity,” he explained. “If something is provably unsustainable, then it’s inherently inequitable. So as we look at pervasive and hopefully universal access to this incredible technology, we have to be looking into what we can do. What do we have to change? Is there something about this technology that needs to be dramatically altered in order for us to make it universally accessible?”
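Bresniker’s hard-ceiling argument is, at bottom, an exponential extrapolation. A minimal sketch of that arithmetic, where the 2024 starting cost and the annual growth rate are explicit assumptions for illustration, not figures quoted in the panel:

```python
import math

# Illustrative extrapolation of the "hard ceiling" argument. base_cost_usd and
# annual_growth are assumptions for this sketch, not figures from the panel.
def crossover_year(base_cost_usd: float, annual_growth: float,
                   ceiling_usd: float, start_year: int = 2024) -> int:
    """Return the first year the extrapolated training cost exceeds the ceiling."""
    years = math.log(ceiling_usd / base_cost_usd) / math.log(annual_growth)
    return start_year + math.ceil(years)

US_GDP_USD = 27e12            # roughly $27 trillion
WORLD_IT_SPEND_USD = 5e12     # roughly $5 trillion

# Example: assume a ~$1B frontier training run in 2024 growing ~10x per year.
print(crossover_year(1e9, 10, WORLD_IT_SPEND_USD))  # -> 2028
print(crossover_year(1e9, 10, US_GDP_USD))          # -> 2029
```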
The role of corporate responsibility
Some corporations are taking responsibility for this onrushing environmental disaster, as well as working to mitigate the impending financial disaster. On the carbon footprint side, AWS has been charting a course toward more responsible usage and sustainability, which today looks like implementing Nvidia’s recent liquid cooling solutions and more.
“We’re looking at both steel and concrete enhancements to lessen our carbon usage,” Roberts explained. “In addition to that, we’re looking at alternative fuels. Instead of just traditional diesel fuels in our generators, we’re looking at hydrotreated vegetable oil and other alternative sources there.”
They’re also pushing alternative chips. For example, AWS has released its own silicon, Trainium, which Roberts said can be many times more efficient than alternative options. And to mitigate the cost of inferencing, the company has announced Inferentia, which he says offers upwards of a 50% performance-per-watt improvement over existing options.
The company’s second-generation UltraCluster network, which supports training and pre-training, scales to about 20,000 GPUs and delivers roughly 10 petabits per second of network throughput on the same spine, with latency under 10 microseconds, a 25% reduction in overall latency. The end result: training more models much faster at a lower cost.
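A quick sanity check on what those cluster and chip figures imply per accelerator; the inputs are the numbers reported above, and the arithmetic is only illustrative:

```python
# Rough per-GPU implication of the reported UltraCluster figures; the inputs are
# the numbers cited above, and the division is only illustrative.
total_throughput_bps = 10e15     # ~10 petabits per second on the same spine
gpu_count = 20_000               # up to about 20,000 GPUs

per_gpu_gbps = total_throughput_bps / gpu_count / 1e9
print(f"Aggregate spine bandwidth per GPU: ~{per_gpu_gbps:.0f} Gbit/s")  # ~500

# "Upwards of a 50% performance-per-watt improvement" means roughly the same
# inference work for about a third less energy than the baseline:
print(f"Energy per unit of inference vs. baseline: ~{1 / 1.5:.0%}")      # ~67%
```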
Can quantum computing change the future?
Garcia’s work is centered on the ways quantum and AI interface with each other, and the takeaways have great promise. Quantum computing offers potential resource savings and speed benefits. Quantum machine learning can be used for AI in three ways, Garcia said: quantum models on classical data, quantum models on quantum data and classical models on quantum data.
“There have been different theoretical proofs in each of those different categories to show there’s an advantage to using quantum computers for tackling these types of areas,” Garcia said. “For example, if you have limited training data or very sparse data, or very interconnected data. One of the areas we’re thinking about that’s very promising in this space is thinking about healthcare and life sciences applications. Anything where you have something quantum mechanical in nature that you need to tackle.”
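To make the first of those categories concrete, here is a minimal, self-contained sketch of a “quantum model on classical data,” simulated in plain NumPy rather than a quantum SDK (it is not IBM’s tooling, and the single-rotation model and tiny dataset are invented for illustration): a classical feature is angle-encoded onto a simulated qubit, one trainable rotation plays the role of the variational model, and the Z expectation value is the prediction.

```python
import numpy as np

def ry(angle: float) -> np.ndarray:
    """Single-qubit Y-rotation gate."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def predict(x: float, theta: float) -> float:
    """Angle-encode x, apply the trainable rotation, return <Z> in [-1, 1]."""
    state = ry(theta) @ ry(x) @ np.array([1.0, 0.0])   # start from |0>
    probs = np.abs(state) ** 2
    return probs[0] - probs[1]                         # expectation of Pauli-Z

# Tiny synthetic dataset: label +1 for small angles, -1 for large ones.
xs = np.array([0.1, 0.4, 2.6, 2.9])
ys = np.array([1, 1, -1, -1])

# Coarse parameter search stands in for gradient-based training.
thetas = np.linspace(-np.pi, np.pi, 200)
losses = [np.mean((np.array([predict(x, t) for x in xs]) - ys) ** 2) for t in thetas]
best = thetas[int(np.argmin(losses))]
print("best theta:", round(best, 2),
      "predictions:", [round(predict(x, best), 2) for x in xs])
```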
IBM is actively researching the vast potential for quantum machine learning. It already has a large number of applications in life sciences, industrial applications, materials science and more. IBM researchers are also developing Watson Code Assist, which helps users unfamiliar with quantum computing take advantage of a quantum computer for their applications.
“We’re leveraging AI to assist with that and help people be able to optimize circuits, to be able to define their problem in a way that it makes sense for the quantum computer to be able to solve,” she explained.
The solution, she added, will be a combination of bits, neurons and qubits.
“It’s going to be CPUs, plus GPUs, plus QPUs working together and differentiating between the different pieces of the workflow,” she said. “We need to push the quantum technology to get to a point where we can run the circuits that we’re talking about, where we think we’re going to bring that sort of exponential speedup, polynomial speedup. But the potential of the algorithms is really promising for us.”
But before quantum can become the hero of the day, its infrastructure requirements remain a sticking point. That includes reducing power consumption further and improving component engineering.
“There’s a lot of physics research that needs to be done in order to be able to actualize the infrastructure requirements for quantum,” she explained. “For me, that’s the real challenge that I see to realize this vision of having all three working in concert together to solve problems in the most resource efficient manner.”
Choice and the hard ceiling
“More important than everything else is radical transparency, to afford decision-makers that deep understanding, all the way back through the supply chain, of the sustainability, the energy, the privacy and the security characteristics of all these technologies that we’re employing so we can understand the true cost,” Bresniker said. “That gives us the ability to calculate the true return on these investments. Right now we have deep subject matter experts all talking to the enterprise about adoption, but they’re not necessarily listing what the needs are to actually successfully and sustainably and equitably integrate these technologies.”
And part of that comes down to choice, Roberts said. The horse is out of the barn, and more and more organizations will be leveraging LLMs and gen AI. There’s an opportunity there to choose the performance characteristics that best fit the application, rather than indiscriminately eating up resources.
“From a sustainability and an energy perspective, you want to be thinking, what’s my use case that I’m trying to accomplish with that particular application and that model, and then what’s the silicon that I’m going to use to drive that inferencing?” he said.
You can also choose the host, and you can choose specific applications and specific tools that will abstract the underlying use case.
“The reason why that’s important is that that gives you choice, it gives you lots of control, and you can choose what is the most cost efficient and most optimal deployment for your application,” he said.
“If you throw in more data and more energy and more water and more people, this will be a bigger model, but is it actually better for the enterprise? That’s the real question around enterprise fitness,” Bresniker added. “We will hit a hard ceiling if we continue. As we begin that conversation, having that understanding and beginning to push back and say — I want some more transparency. I need to know where that data came from. How much energy is in that model? Is there another alternative? Maybe a couple of small models is better than one monolithic monoculture. Even before we get to the ceiling, we’ll deal with the monoculture.”