
Roadmap to Becoming a Data Scientist, Part 4: Advanced Machine Learning


Introduction

Data science is undoubtedly one of the most fascinating fields today. Following significant breakthroughs in machine learning about a decade ago, data science has surged in popularity within the tech community. Each year, we witness increasingly powerful tools that once seemed unimaginable. Innovations such as the Transformer architecture, ChatGPT, the Retrieval-Augmented Generation (RAG) framework, and state-of-the-art Computer Vision models — including GANs — have had a profound impact on our world.

However, with the abundance of tools and the ongoing hype surrounding AI, it can be overwhelming — especially for beginners — to determine which skills to prioritize when aiming for a career in data science. Moreover, this field is highly demanding, requiring substantial dedication and perseverance.

The first three parts of this series outlined the necessary skills to become a data scientist in three key areas: math, software engineering, and machine learning. While knowledge of classical Machine Learning and neural network algorithms is an excellent starting point for aspiring data specialists, there are still many important topics in machine learning that must be mastered to work on more advanced projects.

This article will focus solely on the advanced machine learning skills necessary to pursue a career in Data Science. Whether pursuing this path is a worthwhile choice based on your background and other factors will be discussed in a separate article.

The importance of learning the evolution of methods in machine learning

The section below provides information about the evolution of methods in natural language processing (NLP).

In contrast to previous articles in this series, I have decided to change the format in which I present the necessary skills for aspiring data scientists. Instead of directly listing specific competencies to develop and the motivation behind mastering them, I will briefly outline the most important approaches, presenting them in chronological order as they have been developed and used over the past decades in machine learning.

The reason is that I believe it is crucial to study these algorithms from the very beginning. In machine learning, many new methods are built upon older approaches, which is especially true for NLP and computer vision.

For example, jumping directly into the implementation details of modern large language models (LLMs) without any preliminary knowledge may make it very difficult for beginners to grasp the motivation and underlying ideas of specific mechanisms.

Given this, in the next two sections, I will highlight in bold the key concepts that should be studied.

# 04. NLP

Natural language processing (NLP) is a broad field that focuses on processing textual information. Machine learning algorithms cannot work directly with raw text, which is why text is usually preprocessed and converted into numerical vectors that are then fed into neural networks.

Before being converted into vectors, words undergo preprocessing, which includes simple techniques such as parsing, stemming, lemmatization, normalization, or removing stop words. After preprocessing, the resulting text is encoded into tokens. Tokens represent the smallest textual elements in a collection of documents. Generally, a token can be a part of a word, a sequence of symbols, or an individual symbol. Ultimately, tokens are converted into numerical vectors.
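As a rough illustration of these preprocessing steps, here is a minimal Python sketch; the stop-word list and the suffix-stripping "stemmer" are toy stand-ins for what real NLP libraries provide.

```python
# A minimal, illustrative preprocessing pipeline (toy stop words and stemmer).
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}

def preprocess(text: str) -> list[str]:
    # normalization: lowercase and keep only letters/digits
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # stop-word removal
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # crude stemming: strip common suffixes
    return [re.sub(r"(ing|ed|s)$", "", t) for t in tokens]

print(preprocess("The cats are sleeping in the garden"))
# ['cat', 'sleep', 'garden']
```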

NLP roadmap

The bag of words method is the most basic way to encode tokens, focusing on counting the frequency of tokens in each document. However, in practice, this is usually not sufficient, as it is also necessary to account for token importance, a concept introduced in the TF-IDF and BM25 methods. Although TF-IDF improves upon the naive counting approach of bag of words, researchers have since developed a completely new approach called embeddings.
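To make the difference concrete, here is a small sketch comparing bag of words with TF-IDF, assuming scikit-learn is available; the documents are toy examples.

```python
# Bag of words vs. TF-IDF, assuming scikit-learn is available.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

bow = CountVectorizer()    # raw token counts per document
tfidf = TfidfVectorizer()  # counts re-weighted by token importance

print(bow.fit_transform(docs).toarray())            # shape: (3 docs, vocabulary size)
print(tfidf.fit_transform(docs).toarray().round(2))
print(tfidf.get_feature_names_out())                # learned vocabulary
```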

Embeddings are numerical vectors whose components preserve the semantic meanings of words. Because of this, embeddings play a crucial role in NLP, enabling models to be trained on text data and to run inference on it. Additionally, embeddings can be used to compare text similarity, allowing for the retrieval of the most relevant documents from a collection.
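A common way to compare embeddings is cosine similarity. The sketch below uses hand-picked toy vectors; in practice, the embeddings would come from a pre-trained model.

```python
# Comparing texts via their embeddings with cosine similarity.
# The vectors below are toy values, not real model outputs.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

emb_cat = np.array([0.8, 0.1, 0.3])
emb_kitten = np.array([0.75, 0.2, 0.35])
emb_car = np.array([0.1, 0.9, 0.0])

print(cosine_similarity(emb_cat, emb_kitten))  # close to 1: similar meaning
print(cosine_similarity(emb_cat, emb_car))     # noticeably lower
```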

Embeddings can also be used to encode other unstructured data, including images, audio, and videos.

As a field, NLP has been evolving rapidly over the last 10–20 years to efficiently solve various text-related problems. Complex tasks like text translation and text generation were initially addressed using recurrent neural networks (RNNs), which introduced the concept of memory, allowing neural networks to capture and retain key contextual information in long documents.

Although RNN performance gradually improved, it remained suboptimal for certain tasks. Moreover, RNNs are relatively slow, and their sequential prediction process does not allow for parallelization during training and inference, making them less efficient.

These limitations eventually motivated the Transformer architecture, which relies on attention mechanisms instead of recurrence and can be decomposed into two separate modules: BERT and GPT. Both of these form the foundation of the state-of-the-art models used today to solve various NLP problems. Understanding their principles is valuable knowledge that will help learners advance further when studying or working with other large language models (LLMs).

Transformer architecture

When it comes to LLMs, I strongly recommend studying the evolution of at least the first three GPT models, as they have had a significant impact on the AI world we know today. In particular, I would like to highlight the concepts of zero-shot and few-shot learning, popularized by GPT-2 and GPT-3, which enable LLMs to solve text generation tasks without explicitly receiving any training examples for them.

Another important technique developed in recent years is retrieval-augmented generation (RAG). The main limitation of LLMs is that they are only aware of the context used during their training. As a result, they lack knowledge of any information beyond their training data.

A good example of this limitation is the first version of the ChatGPT model, which was trained on data up to the year 2022 and had no knowledge of events that occurred from 2023 onward.

To address this limitation, OpenAI researchers developed a RAG pipeline, which includes a constantly updated database containing new information from external sources. When ChatGPT is given a task that requires external knowledge, it queries the database to retrieve the most relevant context and integrates it into the final prompt sent to the machine learning model.

Example of a RAG pipeline

The retriever converts the input prompt into an embedding, which is then used to query a vector database. The database returns the most relevant context based on the similarity to the embedding. This retrieved context is then combined with the original prompt and passed to a generative model. The model processes both the initial prompt and the additional context to generate a more informed and contextually accurate response.
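Here is a minimal sketch of the retrieval step described above, where the "embeddings" are just normalized word-count vectors and the "vector database" is a NumPy matrix; a real pipeline would use a trained embedding model and a dedicated vector database.

```python
# A toy RAG retrieval step: embed documents, find the closest one to the
# prompt, and build an augmented prompt for the generative model.
import numpy as np

documents = [
    "team A won the 2024 championship final",
    "photosynthesis converts light into chemical energy",
    "the new GPU model was released in March 2025",
]

vocab = sorted({w for d in documents for w in d.lower().split()})

def embed(text: str) -> np.ndarray:
    counts = np.array([text.lower().split().count(w) for w in vocab], dtype=float)
    norm = np.linalg.norm(counts)
    return counts / norm if norm else counts

db = np.stack([embed(d) for d in documents])        # the "vector database"

def retrieve(prompt: str, k: int = 1) -> list[str]:
    scores = db @ embed(prompt)                     # cosine similarity
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

prompt = "who won the championship in 2024"
context = retrieve(prompt)[0]
augmented_prompt = f"Context: {context}\n\nQuestion: {prompt}"
# `augmented_prompt` is what would be passed to the generative model.
print(context)
```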

In the modern era, LLM development has led to models with millions or even billions of parameters. As a consequence, the overall size of these models may exceed the hardware limitations of standard computers or small portable devices, which come with many constraints.

This is where optimization techniques become particularly useful, allowing LLMs to be compressed without significantly compromising their performance. The most commonly used techniques today include distillation, quantization, and pruning.

The goal of distillation is to create a smaller model that can imitate a larger one. In practice, this means that if a large model makes a prediction, the smaller model is expected to produce a similar result.

Quantization is the process of reducing the memory required to store numerical values representing a model’s weights.

Pruning refers to discarding the least important weights of a model.
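To make quantization more concrete, here is a minimal sketch of post-training weight quantization from float32 to int8 with NumPy; real frameworks use more elaborate schemes, but the underlying idea is the same.

```python
# Post-training weight quantization sketch (float32 -> int8): store weights in
# fewer bits and keep a scale factor to approximately recover them.
import numpy as np

weights = np.random.randn(4, 4).astype(np.float32)     # original 32-bit weights

scale = np.abs(weights).max() / 127.0                   # symmetric quantization
q_weights = np.round(weights / scale).astype(np.int8)   # 4x less memory per value

dequantized = q_weights.astype(np.float32) * scale      # approximate reconstruction
print("max reconstruction error:", np.abs(weights - dequantized).max())
```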

Fine-tuning

Regardless of the area in which you wish to specialize, knowledge of fine-tuning is a must-have skill! Fine-tuning is a powerful concept that allows you to efficiently adapt a pre-trained model to a new task.

Fine-tuning is especially useful when working with very large models. For example, imagine you want to use BERT to perform semantic analysis on a specific dataset. While BERT is trained on general data, it might not fully understand the context of your dataset. At the same time, training BERT from scratch for your specific task would require a massive amount of resources.

Here is where fine-tuning comes in: it involves taking a pre-trained BERT (or another model) and freezing some of its layers (usually those at the beginning). As a result, BERT is retrained, but this time only on the new dataset provided. Since BERT updates only a subset of its weights and the new dataset is likely much smaller than the original one BERT was trained on, fine-tuning becomes a very efficient technique for adapting BERT’s rich knowledge to a specific domain.
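Here is a minimal fine-tuning sketch, assuming PyTorch and the Hugging Face transformers library; the checkpoint name, the number of frozen layers, and the toy batch are purely illustrative.

```python
# Fine-tuning sketch: load a pre-trained BERT, freeze its earlier layers,
# and update only the remaining layers plus the classification head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-uncased"  # assumed pre-trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Freeze the embeddings and the first 8 encoder layers.
for param in model.bert.embeddings.parameters():
    param.requires_grad = False
for layer in model.bert.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False

# A single illustrative training step on one toy batch.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)
batch = tokenizer(["great product", "terrible service"],
                  return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
```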

Fine-tuning is widely used not only in NLP but also across many other domains.

# 05. Computer vision

As the name suggests, computer vision (CV) involves analyzing images and videos using machine learning. The most common tasks include image classification, object detection, image segmentation, and generation.

Most CV algorithms are based on neural networks, so it is essential to understand how they work in detail. In particular, CV uses a special type of network called convolutional neural networks (CNNs). These are similar to fully connected networks, except that they typically begin with a set of specialized mathematical operations called convolutions.

Computer vision roadmap

In simple terms, convolutions act as filters, enabling the model to extract the most important features from an image, which are then passed to fully connected layers for further analysis.
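The sketch below shows this structure in PyTorch (an assumed framework choice): convolutional filters extract features, and fully connected layers turn them into class scores.

```python
# A minimal convolutional network: convolutions + pooling extract features,
# a fully connected layer produces class scores.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learnable filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x16 -> 8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = SmallCNN()
images = torch.randn(4, 3, 32, 32)   # a batch of 4 RGB images, 32x32 pixels
print(model(images).shape)           # torch.Size([4, 10])
```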

The next step is to study the most popular CNN architectures for classification tasks, such as AlexNet, VGG, Inception, and ResNet, most of which were popularized by the ImageNet competition.

Speaking of the object detection task, the YOLO algorithm is a clear winner. It is not necessary to study all of the dozens of versions of YOLO. In reality, going through the original paper of the first YOLO should be sufficient to understand how a relatively difficult problem like object detection is elegantly transformed into both classification and regression problems. This approach in YOLO also provides a nice intuition on how more complex CV tasks can be reformulated in simpler terms.
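As a rough illustration of this reformulation, the sketch below decodes a YOLO-style output tensor: each grid cell carries box coordinates (the regression part) and class scores (the classification part). The shapes follow the original YOLO paper, while the values are random.

```python
# Decoding one cell of a YOLO-style prediction tensor (7x7 grid, 2 boxes per
# cell, 20 classes); random values stand in for a real network output.
import numpy as np

S, B, C = 7, 2, 20
prediction = np.random.rand(S, S, B * 5 + C)   # network output for one image

cell = prediction[3, 4]                        # one grid cell
boxes = cell[: B * 5].reshape(B, 5)            # each row: x, y, w, h, confidence
class_probs = cell[B * 5:]                     # one score per class

best_box = boxes[np.argmax(boxes[:, 4])]       # keep the most confident box
best_class = int(np.argmax(class_probs))
print("box (x, y, w, h):", best_box[:4], "class id:", best_class)
```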

While there are many architectures for performing image segmentation, I would strongly recommend learning about UNet, which introduces an encoder-decoder architecture.

Finally, image generation is probably one of the most challenging tasks in CV. Personally, I consider it an optional topic for learners, as it involves many advanced concepts. Nevertheless, gaining a high-level intuition of how generative adversarial networks (GANs) function to generate images is a good way to broaden one’s horizons.

In some problems, the training data might not be enough to build a performant model. In such cases, the data augmentation technique is commonly used. It involves the artificial generation of training data from already existing data (images). By feeding the model more diverse data, it becomes capable of learning and recognizing more patterns.
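A typical augmentation pipeline for images, assuming torchvision and PIL are available; every pass through the pipeline produces a slightly different variant of the same image.

```python
# Data augmentation sketch: random flips, rotations, and color jitter applied
# to a (here randomly generated) image.
import numpy as np
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

image = Image.fromarray(np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8))
augmented_batch = [augment(image) for _ in range(4)]   # 4 augmented variants
print(augmented_batch[0].shape)                        # torch.Size([3, 64, 64])
```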

# 06. Other areas

It would be very hard to present detailed roadmaps for all existing machine learning domains in a single article. That is why, in this section, I would like to briefly list and explain some of the other most popular areas in data science worth exploring.

First of all, recommender systems (RecSys) have gained a lot of popularity in recent years. They are increasingly implemented in online shops, social networks, and streaming services. The key idea of most algorithms is to take a large initial matrix of all users and items and decompose it into a product of several matrices in a way that associates every user and every item with a high-dimensional embedding. This approach is very flexible, as it then allows different types of comparison operations on embeddings to find the most relevant items for a given user. Moreover, it is much faster to perform analysis on these smaller matrices than on the original one, which usually has huge dimensions.
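Here is a minimal sketch of this idea using truncated SVD in NumPy; production recommender systems typically rely on ALS, gradient-based factorization, or neural models instead.

```python
# Matrix factorization sketch: decompose a user-item matrix into user and item
# embeddings, then score all user-item pairs with a dot product.
import numpy as np

ratings = np.array([          # rows: users, columns: items (0 = not rated)
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

k = 2                                           # embedding dimension
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
user_emb = U[:, :k] * np.sqrt(s[:k])            # one embedding per user
item_emb = Vt[:k, :].T * np.sqrt(s[:k])         # one embedding per item

scores = user_emb @ item_emb.T                  # predicted affinity for all pairs
print("predicted score of user 0 for item 2:", round(scores[0, 2], 2))
```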

Matrix decomposition in recommender systems is one of the most commonly used methods

Ranking often goes hand in hand with RecSys. When a RecSys has identified a set of the most relevant items for the user, ranking algorithms are used to sort them to determine the order in which they will be shown or proposed to the user. A good example of their usage is search engines, which rank query results from top to bottom on a web page.

Closely related to ranking, there is also a matching problem that aims to optimally map objects from two sets, A and B, in a way that, on average, every object pair (a, b) is mapped “well” according to a matching criterion. A use case example might include distributing a group of students to different university disciplines, where the number of spots in each class is limited.
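For small instances, such matchings can be computed exactly with the Hungarian algorithm; the sketch below uses SciPy (an assumed dependency) and toy preference scores. Disciplines with more than one spot can be modeled by duplicating their columns.

```python
# Bipartite matching sketch: assign each student to one discipline so that
# the total preference score is maximized.
import numpy as np
from scipy.optimize import linear_sum_assignment

# preference[i, j] = how well student i fits discipline j (toy numbers)
preference = np.array([
    [9, 2, 5],
    [4, 8, 7],
    [6, 3, 9],
])

students, disciplines = linear_sum_assignment(preference, maximize=True)
for s, d in zip(students, disciplines):
    print(f"student {s} -> discipline {d} (preference {preference[s, d]})")
```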

Clustering is an unsupervised machine learning task whose objective is to split a dataset into several regions (clusters), with each dataset object belonging to one of these clusters. The splitting criteria can vary depending on the task. Clustering is useful because it allows for grouping similar objects together. Moreover, further analysis can be applied to treat objects in each cluster separately.
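A minimal clustering sketch with KMeans from scikit-learn (assumed available), applied to two synthetic groups of points.

```python
# KMeans clustering sketch: points are grouped based on their proximity.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(50, 2)),   # cluster around (0, 0)
    rng.normal(loc=[5, 5], scale=0.5, size=(50, 2)),   # cluster around (5, 5)
])

kmeans = KMeans(n_clusters=2, random_state=0).fit(points)
print(kmeans.labels_[:5], kmeans.labels_[-5:])   # cluster id of each point
print(kmeans.cluster_centers_)                   # learned cluster centers
```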

The goal of clustering is to group dataset objects (on the left) into several categories (on the right) based on their similarity.

Dimensionality reduction is another unsupervised problem, where the goal is to compress an input dataset. When the dimensionality of the dataset is large, it takes more time and resources for machine learning algorithms to analyze it. By identifying and removing noisy dataset features or those that do not provide much valuable information, the data analysis process becomes considerably easier.
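A short dimensionality reduction sketch with PCA from scikit-learn (assumed available): a 50-dimensional synthetic dataset is compressed into 5 components.

```python
# PCA sketch: compress a dataset while tracking how much variance is kept.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 50))          # 200 samples, 50 noisy features

pca = PCA(n_components=5)
compressed = pca.fit_transform(data)       # shape: (200, 5)

print(compressed.shape)
print(pca.explained_variance_ratio_.sum()) # fraction of variance kept
```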

Similarity search is an area that focuses on designing algorithms and data structures (indexes) to optimize searches in a large database of embeddings (vector database). More precisely, given an input embedding and a vector database, the goal is to approximately find the most similar embedding in the database relative to the input embedding.

The goal of similarity search is to approximately find the most similar embedding in a vector database relative to a query embedding.

The word “approximately” means that the search is not guaranteed to be 100% precise. Nevertheless, this is the main idea behind similarity search algorithms — sacrificing a bit of accuracy in exchange for significant gains in prediction speed or data compression.
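The sketch below shows the exact brute-force baseline that approximate indexes are designed to speed up: every database embedding is compared against the query.

```python
# Brute-force nearest-neighbor baseline: exact but slow for large databases.
# Approximate similarity search indexes trade a bit of accuracy for speed.
import numpy as np

rng = np.random.default_rng(0)
database = rng.normal(size=(10_000, 64))                 # 10k embeddings
database /= np.linalg.norm(database, axis=1, keepdims=True)

query = rng.normal(size=64)
query /= np.linalg.norm(query)

scores = database @ query                                # cosine similarities
top_k = np.argsort(scores)[::-1][:5]                     # 5 most similar ids
print(top_k, scores[top_k])
```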

Time series analysis involves studying the behavior of a target variable over time. This problem can be solved using classical tabular algorithms. However, the presence of time introduces new factors that cannot be captured by standard algorithms. For instance:

  • the target variable can have an overall trend, where in the long term its values increase or decrease (e.g., the average yearly temperature rising due to global warming).
  • the target variable can have a seasonality, which makes its values change depending on the period of the year (e.g., temperature is lower in winter and higher in summer).

Most time series models take both of these factors into account. In general, time series models are mainly used in financial, stock, or demographic analysis.
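A small sketch of separating trend and seasonality on a synthetic monthly series, assuming statsmodels is available.

```python
# Decomposing a synthetic time series into trend and seasonal components.
import numpy as np
from statsmodels.tsa.seasonal import seasonal_decompose

months = np.arange(120)                                   # 10 years, monthly
trend = 0.05 * months                                     # slow upward trend
seasonality = 3 * np.sin(2 * np.pi * months / 12)         # yearly cycle
noise = np.random.normal(scale=0.5, size=months.size)
series = trend + seasonality + noise

result = seasonal_decompose(series, model="additive", period=12)
print(result.trend[6:9])       # estimated trend (NaN at the series edges)
print(result.seasonal[:12])    # estimated repeating seasonal pattern
```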

Time series data is often decomposed into several components, which include trend and seasonality.

Another advanced area I would recommend exploring is reinforcement learning, which fundamentally changes the algorithm design compared to classical machine learning. In simple terms, its goal is to train an agent in an environment to make optimal decisions based on a reward system (also known as the “trial and error approach”). By taking an action, the agent receives a reward, which helps it understand whether the chosen action had a positive or negative effect. After that, the agent slightly adjusts its strategy, and the entire cycle repeats.
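To make the trial-and-error loop concrete, here is a minimal tabular Q-learning sketch on a toy corridor environment; it is only an illustration of the reward-driven update, not a production algorithm.

```python
# Tabular Q-learning on a toy corridor (states 0..4, reward +1 for reaching
# the rightmost state). The agent explores, receives rewards, and gradually
# adjusts its action-value estimates.
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != n_states - 1:                         # until terminal state
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))        # explore
        else:
            action = int(np.argmax(Q[state]))            # exploit
        next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # move the estimate toward reward + discounted future value
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(np.argmax(Q[:-1], axis=1))   # learned policy for non-terminal states: "go right"
```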

Reinforcement learning framework. Image adapted by the author. Source: Reinforcement Learning. An Introduction. Second Edition | Richard S. Sutton and Andrew G. Barto

Reinforcement learning is particularly popular in complex environments where classical algorithms are not capable of solving a problem. Given the complexity of reinforcement learning algorithms and the computational resources they require, this area is not yet fully mature, but it has high potential to gain even more popularity in the future.

Main applications of reinforcement learning

Currently, the most popular applications are:

  • Games. Existing approaches can design optimal game strategies and outperform humans. The most well-known examples are chess and Go.
  • Robotics. Advanced algorithms can be incorporated into robots to help them move, carry objects or complete routine tasks at home.
  • Autopilot. Reinforcement learning methods can be developed to automatically drive cars, control helicopters or drones.

Conclusion

This article was a logical continuation of the previous part and expanded the skill set needed to become a data scientist. While most of the mentioned topics require time to master, they can add significant value to your portfolio. This is especially true for the NLP and CV domains, which are in high demand today.

After reaching a high level of expertise in data science, it is still crucial to stay motivated and consistently push yourself to learn new topics and explore emerging algorithms.

Data science is a constantly evolving field, and in the coming years, we might witness the development of new state-of-the-art approaches that we could not have imagined in the past.

Resources

All images are by the author unless noted otherwise.
