How to run an LLM on your laptop

MIT Technology Review’s How To series helps you get things done. 

Simon Willison has a plan for the end of the world. It’s a USB stick, onto which he has loaded a couple of his favorite open-weight LLMs—models that have been shared publicly by their creators and that can, in principle, be downloaded and run with local hardware. If human civilization should ever collapse, Willison plans to use all the knowledge encoded in their billions of parameters for help. “It’s like having a weird, condensed, faulty version of Wikipedia, so I can help reboot society with the help of my little USB stick,” he says.

But you don’t need to be planning for the end of the world to want to run an LLM on your own device. Willison, who writes a popular blog about local LLMs and software development, has plenty of compatriots: r/LocalLLaMA, a subreddit devoted to running LLMs on your own hardware, has half a million members.

For people who are concerned about privacy, want to break free from the control of the big LLM companies, or just enjoy tinkering, local models offer a compelling alternative to ChatGPT and its web-based peers.

The local LLM world used to have a high barrier to entry: In the early days, it was impossible to run anything useful without investing in pricey GPUs. But researchers have had so much success in shrinking down and speeding up models that anyone with a laptop, or even a smartphone, can now get in on the action. “A couple of years ago, I’d have said personal computers are not powerful enough to run the good models. You need a $50,000 server rack to run them,” Willison says. “And I kept on being proved wrong time and time again.”

Why you might want to download your own LLM

Getting into local models takes a bit more effort than, say, navigating to ChatGPT’s online interface. But the very accessibility of a tool like ChatGPT comes with a cost. “It’s the classic adage: If something’s free, you’re the product,” says Elizabeth Seger, the director of digital policy at Demos, a London-based think tank. 

OpenAI, which offers both paid and free tiers, trains its models on users’ chats by default. It’s not too difficult to opt out of this training, and it also used to be possible to remove your chat data from OpenAI’s systems entirely, until a recent legal decision in the New York Times’ ongoing lawsuit against OpenAI required the company to maintain all user conversations with ChatGPT.

Google, which has access to a wealth of data about its users, also trains its models on both free and paid users’ interactions with Gemini, and the only way to opt out of that training is to set your chat history to delete automatically—which means that you also lose access to your previous conversations. In general, Anthropic does not train its models using user conversations, but it will train on conversations that have been “flagged for Trust & Safety review.” 

Training may present particular privacy risks because of the ways that models internalize, and often recapitulate, their training data. Many people trust LLMs with deeply personal conversations—but if models are trained on that data, those conversations might not be nearly as private as users think, according to some experts.

“Some of your personal stories may be cooked into some of the models, and eventually be spit out in bits and bytes somewhere to other people,” says Giada Pistilli, principal ethicist at the company Hugging Face, which runs a huge library of freely downloadable LLMs and other AI resources.

For Pistilli, opting for local models as opposed to online chatbots has implications beyond privacy. “Technology means power,” she says. “And so who[ever] owns the technology also owns the power.” States, organizations, and even individuals might be motivated to disrupt the concentration of AI power in the hands of just a few companies by running their own local models.

Breaking away from the big AI companies also means having more control over your LLM experience. Online LLMs are constantly shifting under users’ feet: Back in April, ChatGPT suddenly started sucking up to users far more than it had previously, and just last week Grok started calling itself MechaHitler on X.

Providers tweak their models with little warning, and while those tweaks might sometimes improve model performance, they can also cause undesirable behaviors. Local LLMs may have their quirks, but at least they are consistent. The only person who can change your local model is you.

Of course, any model that can fit on a personal computer is going to be less powerful than the premier online offerings from the major AI companies. But there’s a benefit to working with weaker models—they can inoculate you against the more pernicious limitations of their larger peers. Small models may, for example, hallucinate more frequently and more obviously than Claude, GPT, and Gemini, and seeing those hallucinations can help you build up an awareness of how and when the larger models might also lie.

“Running local models is actually a really good exercise for developing that broader intuition for what these things can do,” Willison says.

How to get started

Local LLMs aren’t just for proficient coders. If you’re comfortable using your computer’s command-line interface, which allows you to browse files and run apps using text prompts, Ollama is a great option. Once you’ve installed the software, you can download and run any of the hundreds of models it offers with a single command.
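To give a sense of what working with Ollama looks like beyond the chat prompt, here is a minimal sketch of querying a locally running model from Python over Ollama’s local HTTP API (it listens on localhost:11434 by default). The model tag qwen3:8b is illustrative only; substitute whichever model you have pulled, and note that this assumes the non-streaming chat endpoint.

```python
# Minimal sketch: chat with a local Ollama model over its HTTP API.
# Assumes Ollama is installed and running, and that you have already
# pulled a model (e.g. with `ollama pull qwen3:8b`); the tag is illustrative.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def chat(model: str, prompt: str) -> str:
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # request one JSON response instead of a token stream
    }).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["message"]["content"]

if __name__ == "__main__":
    print(chat("qwen3:8b", "In one sentence, what is an open-weight LLM?"))
```

Nothing leaves your machine here: the request goes to a server running on your own laptop, which is the point of the exercise.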

If you don’t want to touch anything that even looks like code, you might opt for LM Studio, a user-friendly app that takes a lot of the guesswork out of running local LLMs. You can browse models from Hugging Face from right within the app, which provides plenty of information to help you make the right choice. Some popular and widely used models are tagged as “Staff Picks,” and every model is labeled according to whether it can be run entirely on your machine’s speedy GPU, needs to be shared between your GPU and slower CPU, or is too big to fit onto your device at all. Once you’ve chosen a model, you can download it, load it up, and start interacting with it using the app’s chat interface.

As you experiment with different models, you’ll start to get a feel for what your machine can handle. According to Willison, every billion model parameters require about one GB of RAM to run, and I found that approximation to be accurate: My own 16 GB laptop managed to run Alibaba’s Qwen3 14B as long as I quit almost every other app. If you run into issues with speed or usability, you can always go smaller—I got reasonable responses from Qwen3 8B as well.
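That rule of thumb is easy to sanity-check yourself. Below is a back-of-the-envelope sketch of the arithmetic, assuming a quantized model stored at roughly one byte per parameter; the bytes-per-parameter figure and the idea of adding a little headroom for context are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope RAM estimate for a quantized local model.
# Assumes roughly 1 byte per parameter (about 8-bit quantization);
# real usage adds some headroom for activations and the context window.
def estimate_ram_gb(params_billions: float, bytes_per_param: float = 1.0) -> float:
    return params_billions * bytes_per_param

for name, size in [("Llama 3.2 1B", 1), ("Qwen3 8B", 8), ("Qwen3 14B", 14)]:
    print(f"{name}: ~{estimate_ram_gb(size):.0f} GB of RAM")
```

Run against the models mentioned here, the estimate lines up with experience: a 14-billion-parameter model wants roughly 14 GB, which is why a 16 GB laptop can handle it only with nearly everything else closed.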

And if you go really small, you can even run models on your cell phone. My beat-up iPhone 12 was able to run Meta’s Llama 3.2 1B using an app called LLM Farm. It’s not a particularly good model—it very quickly goes off into bizarre tangents and hallucinates constantly—but trying to coax something so chaotic toward usability can be entertaining. If I’m ever on a plane sans Wi-Fi and desperate for a probably false answer to a trivia question, I now know where to look.

Some of the models that I was able to run on my laptop were effective enough that I can imagine using them in my journalistic work. And while I don’t think I’ll depend on phone-based models for anything anytime soon, I really did enjoy playing around with them. “I think most people probably don’t need to do this, and that’s fine,” Willison says. “But for the people who want to do this, it’s so much fun.”
