
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities


Introduction: Can AI really distinguish dog breeds like human experts?

One day while taking a walk, I saw a fluffy white puppy and wondered: is that a Bichon Frise or a Maltese? No matter how closely I looked, they seemed almost identical. With Huskies and Alaskan Malamutes, or Shiba Inus and Akitas, I always found myself second-guessing. How do professional veterinarians and researchers spot the differences at a glance? What are they focusing on? 🤔

This question kept coming back to me while developing PawMatchAI. One day, while struggling to improve my model’s accuracy, I realized that when I recognize objects, I don’t process all details at once. Instead, I first notice the overall shape, then refine my focus on specific features. Could this “coarse-to-fine” processing be the key to how experts identify similar dog breeds so accurately?

Digging into the research, I came across cognitive science work confirming that human visual recognition relies on multi-level feature analysis. Experts don’t just memorize images; they analyze structured traits such as:

  • Overall body proportions (large vs. small dogs, square vs. elongated body shapes)
  • Head features (ear shape, muzzle length, eye spacing)
  • Fur texture and distribution (soft vs. curly vs. smooth, double vs. single coat)
  • Color and pattern (specific markings, pigment distribution)
  • Behavioral and postural features (tail posture, walking style)

This made me rethink traditional CNNs (Convolutional Neural Networks). While they are incredibly powerful at learning local features, they don’t explicitly separate key characteristics the way human experts do. Instead, these features are entangled within millions of parameters without clear interpretability.

So I designed the Morphological Feature Extractor, an approach that helps AI analyze breeds in structured layers, just as experts do. The architecture focuses specifically on body proportions, head shape, fur texture, tail structure, and color patterns, so the AI doesn’t just see objects, it understands them.

PawMatchAI is my personal project that can identify 124 dog breeds and provide breed comparisons and recommendations based on user preferences. If you’re interested, you can try it on HuggingFace Space or check out the complete code on GitHub: 

⚜️ HuggingFace: PawMatchAI

⚜️ GitHub: PawMatchAI

In this article, I’ll dive deeper into this biologically-inspired design and share how I turned simple everyday observations into a practical AI solution.


1. Human vision vs. machine vision: Two fundamentally different ways of perceiving the world

At first, I thought humans and AI recognized objects in a similar way. But after testing my model and looking into cognitive science, I realized something surprising: humans and AI actually process visual information in fundamentally different ways. This completely changed how I approached AI-based recognition.

🧠 Human vision: Structured and adaptive

The human visual system follows a highly structured yet flexible approach when recognizing objects:

1️⃣ Seeing the big picture first → Our brain first scans the overall shape and size of an object. This is why, just by looking at a dog’s silhouette, we can quickly tell whether it’s a large or small breed. Personally, this is always my first instinct when spotting a dog.

2️⃣ Focusing on key features → Next, our attention automatically shifts to the features that best differentiate one breed from another. While researching, I found that professional veterinarians often emphasize ear shape and muzzle length as primary indicators for breed identification. This helped me see how experts narrow a complex judgment down to a few decisive features.

3️⃣ Learning through experience → The more dogs we see, the more we refine our recognition process. Someone seeing a Samoyed for the first time might focus on its fluffy white fur, while an experienced dog enthusiast would immediately recognize its distinctive “Samoyed smile”, a unique upturned mouth shape.

🤖 How CNNs “see” the world

Convolutional Neural Networks (CNNs) follow a completely different recognition strategy:

  • A complex system that’s hard to interpret → CNNs do learn hierarchical patterns, from simple edges and textures up to high-level features, but all of this happens inside millions of parameters, making it hard to tell what the model is actually focusing on.
  • When AI confuses the background for the dog → One of the most frustrating problems I ran into was that my model kept misidentifying breeds based on their surroundings. For example, if a dog was in a snowy setting, it almost always guessed Siberian Husky, even if the breed was completely different.

2. Morphological Feature Extractor: Inspiration from cognitive science

2.1 Core design philosophy

Throughout the development of PawMatchAI, I’ve been trying to make the model identify similar-looking dog breeds as accurately as human experts can. However, my early attempts didn’t go as planned. At first, I thought training deeper CNNs with more parameters would improve performance. But no matter how powerful the model became, it still struggled with similar breeds, mistaking Bichon Frises for Maltese, or Siberian Huskies for Eskimo Dogs. That made me wonder: can AI really understand these subtle differences just by getting bigger and deeper?

Then I thought back to something I had noticed before: when humans recognize objects, we don’t process everything at once. We start with the overall shape, then gradually zoom in on the details. This got me thinking: what if CNNs could mimic this habit, starting with overall morphology and then focusing on detailed features? Would that improve recognition?

Based on this idea, I decided to stop simply making CNNs deeper and instead design a more structured model architecture, ultimately establishing three core design principles:

  1. Explicit morphological features: This made me reflect on my own question: What exactly are professionals looking at? It turns out that veterinarians and breed experts don’t just rely on instinct; they follow a clear set of criteria, focusing on specific traits. So instead of letting the model “guess” which parts matter, I designed it to learn directly from these expert-defined features, making its decision-making process closer to human cognition.
  2. Multi-scale parallel processing: This corresponds to my cognitive insight: humans don’t process visual information linearly but attend to features at different levels simultaneously. When we see a dog, we don’t need to complete our analysis of the overall outline before observing local details; rather, these processes happen concurrently. Therefore, I designed multiple parallel feature analyzers, each focusing on features at different scales, working together rather than sequentially.
  3. Relationships between features matter more than individual traits: I came to realize that looking at individual features alone often isn’t enough to determine a breed. The recognition process isn’t just about identifying separate traits; it’s about how they interact. For example, a dog with short hair and pointed ears could be a Doberman if it also has a slender body. But the same combination on a stocky, compact frame more likely indicates a Boston Terrier. Clearly, the way features relate to one another is often the key to distinguishing breeds.

2.2 Technical implementation of the five morphological feature analyzers

Each analyzer uses a different convolution kernel size and depth, matched to the type of feature it targets:

1️⃣ Body proportion analyzer

# Using large convolution kernels (7x7) to capture overall body features
# (each analyzer below is one entry of the extractor's analyzer dict; nn is torch.nn)
'body_proportion': nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=7, padding=3),
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU()
)

Initially, I tried even larger kernels but found they focused too much on the background. I eventually settled on 7×7 kernels to capture overall morphological features, much as canine experts first notice whether a dog is large, medium, or small, and whether its body is square or rectangular. For example, when telling apart similar small white breeds (like Bichon Frise vs. Maltese), body proportions are often the first distinguishing point.

2️⃣ Head feature analyzer

# Medium-sized kernels (5x5) are suitable for analyzing head structure
'head_features': nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=5, padding=2),
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU()
)

The head feature analyzer was the part I tested most extensively. The technical challenge was that the head contains multiple key identification points (ears, muzzle, eyes), and their relative positions are crucial for overall recognition. The final design, using 5×5 convolution kernels, allows the model to learn the relative positioning of these features while maintaining computational efficiency.

3️⃣ Tail feature analyzer

# Tail analyzer reuses the head structure: a 5x5 kernel followed by a 3x3 kernel
'tail_features': nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=5, padding=2),
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU()
)

Tails typically occupy only a small portion of an image and come in many forms. Tail shape is a key identifying feature for certain breeds, such as the Husky’s upward-curled tail and the Samoyed’s tail curled over its back. The final solution uses a structure similar to the head analyzer but adds more data augmentation during training (like random cropping and rotation), as sketched below.
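As a rough illustration, an augmentation pipeline along these lines can be built with torchvision; the specific parameter values here are illustrative, not the exact training configuration:

# Illustrative augmentation: random crops and small rotations expose the
# model to tails at varied positions and angles
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.RandomRotation(degrees=15),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])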

4️⃣ Fur feature analyzer

# Small kernels (3x3) are better for capturing fur texture
'fur_features': nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU()
)

Fur texture and length are critical for distinguishing visually similar breeds, and judging fur length requires a larger receptive field. Through experimentation, I found that stacking two 3×3 convolutional layers worked best: the stack covers an effective 5×5 receptive field with fewer parameters than a single 5×5 kernel, and the nonlinearity between the two layers helps capture fine texture differences.

5️⃣ Color pattern analyzer

# Color feature analyzer: analyzing color distribution
'color_pattern': nn.Sequential(
    # First layer: capturing basic color distribution
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU(),

    # Second layer: analyzing color patterns and markings
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU(),

    # Third layer: integrating color information
    nn.Conv2d(128, 128, kernel_size=1),
    nn.BatchNorm2d(128),
    nn.ReLU()
)

The color pattern analyzer has a more complex design than the other analyzers because it must capture not only the colors themselves but also how they are distributed. For example, German Shepherds and Rottweilers both have black and tan fur, but their distribution patterns differ. The three-layer design lets the model first capture basic colors, then analyze distribution patterns, and finally integrate this information through 1×1 convolutions.
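To make the snippets above concrete, here is a minimal sketch of how the five analyzers can be assembled and run in parallel. The nn.ModuleDict container, the simplified color branch, and the pooling step are illustrative choices, not the exact implementation:

import torch.nn as nn

class MorphologicalAnalyzers(nn.Module):
    def __init__(self):
        super().__init__()
        def block(k):
            # Two-conv analyzer whose first kernel has size k
            return nn.Sequential(
                nn.Conv2d(64, 128, kernel_size=k, padding=k // 2),
                nn.BatchNorm2d(128), nn.ReLU(),
                nn.Conv2d(128, 128, kernel_size=3, padding=1),
                nn.BatchNorm2d(128), nn.ReLU()
            )
        self.analyzers = nn.ModuleDict({
            'body_proportion': block(7),
            'head_features': block(5),
            'tail_features': block(5),
            'fur_features': block(3),
            'color_pattern': block(3)  # simplified; the real branch adds a 1x1 layer
        })
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):  # x: (B, 64, H, W) feature map
        # Run every analyzer on the same input; pool each output to (B, 128)
        return {name: self.pool(a(x)).flatten(1)
                for name, a in self.analyzers.items()}

The five pooled descriptors can then be stacked for the attention step described next.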


2.3 Feature interaction and integration mechanism: The key breakthrough

Having different analyzers for each feature is important, but making them interact with each other is the most crucial part:

# Feature attention mechanism: dynamically adjusting the importance of different features
self.feature_attention = nn.MultiheadAttention(
    embed_dim=128,
    num_heads=8,
    dropout=0.1,
    batch_first=True
)

# Feature relationship analyzer: analyzing connections between different morphological features
self.relation_analyzer = nn.Sequential(
    nn.Linear(128 * 5, 256),  # Combination of five morphological features
    nn.LayerNorm(256),
    nn.ReLU(),
    nn.Linear(256, 128),
    nn.LayerNorm(128),
    nn.ReLU()
)

# Feature integrator: intelligently combining all features
self.feature_integrator = nn.Sequential(
    nn.Linear(128 * 6, in_features),  # Five original features + one relationship feature
    nn.LayerNorm(in_features),
    nn.ReLU()
)

The multi-head attention mechanism is vital for identifying the most representative features of each breed. For example, short-haired breeds rely more on body type and head features for identification, while long-haired breeds depend more on fur texture and color.
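As a self-contained illustration of this mechanism, treating the five pooled 128-d descriptors as a sequence of five tokens lets nn.MultiheadAttention weigh them against each other; the token stacking here is an illustrative assumption about the data flow:

import torch
import torch.nn as nn

# Toy demo: self-attention across five 128-d morphological descriptors
attn = nn.MultiheadAttention(embed_dim=128, num_heads=8,
                             dropout=0.1, batch_first=True)
tokens = torch.randn(2, 5, 128)           # (batch=2, five features, 128-d)
attended, weights = attn(tokens, tokens, tokens)
print(attended.shape)  # torch.Size([2, 5, 128])
print(weights.shape)   # torch.Size([2, 5, 5]): how features attend to each other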


2.4 Feature Relationship Analyzer: Why feature relationships are so important

After weeks of frustration, I finally realized my model was missing a crucial element: when we humans identify something, we don’t just recall individual details. Our brains connect the dots, linking features to form a complete image. The relationships between features are just as important as the features themselves. A small dog with pointed ears and fluffy fur is likely a Pomeranian, but the same features on a large dog might indicate a Samoyed.

So I built the Feature Relationship Analyzer to embody this concept. Instead of processing each feature separately, I concatenated all five morphological features before passing them to the relation analyzer. This lets the model learn relationships between features, helping it distinguish breeds that look almost identical at first glance, especially in four key aspects:

  1. Body and head coordination → Shepherd breeds typically have wolf-like heads paired with slender bodies, while bulldog breeds have broad heads with muscular, stocky builds. The model learns these associations rather than processing head and body shapes separately.
  2. Fur and color joint distribution → Certain breeds have specific fur types often accompanied by unique colors. For example, Border Collies tend to have black and white bicolor fur, while Golden Retrievers typically have long golden fur. Recognizing these co-occurring features improves accuracy.
  3. Head and tail paired features → Pointed ears and curled tails are common in northern sled dog breeds (like Samoyeds and Huskies), while drooping ears and straight tails are more typical of hound and spaniel breeds.
  4. Body, fur, and color three-dimensional feature space → Some combinations are strong indicators of specific breeds. Large build, short hair, and black-and-tan coloration almost always point to a German Shepherd.

By focusing on how features interact rather than processing them separately, the Feature Relationship Analyzer bridges the gap between human intuition and AI-based recognition.


2.5 Residual connection: Keeping original information intact

At the end of the forward propagation function, there’s a key residual connection:

# Final integration with residual connection
integrated_features = self.feature_integrator(final_features)

return integrated_features + x  # Residual connection

This residual connection (+ x) serves a few important roles:

  • Preserving important details → Ensures that while focusing on morphological features, the model still retains key information from the original representation.
  • Helping deep models train better → In large architectures like ConvNeXtV2, residuals prevent gradients from vanishing, keeping learning stable.
  • Providing flexibility → If the original features are already useful, the model can “skip” certain transformations instead of forcing unnecessary changes.
  • Mimicking how the brain processes images → Just like our brains analyze objects and their locations at the same time, the model learns different perspectives in parallel.

In the model design, a similar concept was adopted, allowing different feature analyzers to operate simultaneously, each focusing on different morphological features (like body type, fur, ear shape, etc.). Through residual connections, these different information channels can complement each other, ensuring the model doesn’t miss critical information and thereby improving recognition accuracy.


2.6 Overall workflow

The complete feature processing flow is as follows (a condensed code sketch follows the list):

  1. Five morphological feature analyzers simultaneously process spatial features, each using different-sized convolution layers and focusing on different features
  2. The feature attention mechanism dynamically adjusts focus on different features
  3. The feature relationship analyzer captures correlations between features, truly understanding breed characteristics
  4. The feature integrator combines all information (five original features + one relationship feature)
  5. Residual connections ensure no original information is lost
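Condensed into code, the flow might look like the sketch below. The names self.space_transformer, self.pool, and self.analyzers are shorthand for the steps above rather than verbatim source, and the exact tensor reshaping is a reconstruction:

# Condensed sketch of the forward pass (shorthand, not verbatim source)
def forward(self, x):                      # x: (B, in_features) backbone features
    # Step 1: reshape compressed 1D features into a 2D map
    spatial = self.space_transformer(x)    # assumed shape: (B, 64, H, W)

    # Step 2: five analyzers in parallel, each pooled to a 128-d descriptor
    feats = [self.pool(a(spatial)).flatten(1) for a in self.analyzers.values()]

    # Step 3: attention re-weights the five descriptors against each other
    tokens = torch.stack(feats, dim=1)                      # (B, 5, 128)
    attended, _ = self.feature_attention(tokens, tokens, tokens)

    # Step 4: the relationship analyzer compresses feature interactions
    relation = self.relation_analyzer(attended.flatten(1))  # (B, 128)

    # Step 5: integrate five originals + relation, then add the residual
    final_features = torch.cat(feats + [relation], dim=1)   # (B, 128 * 6)
    integrated_features = self.feature_integrator(final_features)
    return integrated_features + x         # residual connection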

3. Architecture flow diagram: How the morphological feature extractor works

Looking at the diagram, we can see a clear distinction between two processing paths: on the left, a specialized morphological feature extraction process, and on the right, the traditional CNN-based recognition path.

Left path: Morphological feature processing

  1. Input feature tensor: This is the model’s input, carrying information from the CNN’s middle layers, similar to how humans first get a rough outline when viewing an image.
  2. The Feature Space Transformer reshapes compressed 1D features into a structured 2D representation, improving the model’s ability to capture spatial relationships. For example, when analyzing a dog’s ears, their features might be scattered in a 1D vector, making it harder for the model to recognize their connection. By mapping them into 2D space, this transformation brings related traits closer together, allowing the model to process them simultaneously, just as humans naturally do (a minimal sketch of this module appears after this list).
  3. 2D feature map: This is the transformed two-dimensional representation which, as mentioned above, now has more spatial structure and can be used for morphological analysis.
  4. At the heart of this system are five specialized Morphological Feature Analyzers, each designed to focus on a key aspect of dog breed identification:
    • Body Proportion Analyzer: Uses large convolution kernels (7×7) to capture overall shape and proportion relationships, which is the first step in preliminary classification
    • Head Feature Analyzer: Uses medium-sized convolution kernels (5×5) combined with smaller ones (3×3), focusing on head shape, ear position, muzzle length, and other key features
    • Tail Feature Analyzer: Similarly uses a combination of 5×5 and 3×3 convolution kernels to analyze tail shape, curl degree, and posture, which are often decisive features for distinguishing similar breeds
    • Fur Feature Analyzer: Uses consecutive small convolution kernels (3×3), specifically designed to capture subtle features such as fur texture, length, and density
    • Color Pattern Analyzer: Employs a multi-layered convolution architecture, including 1×1 convolutions for color integration, specifically analyzing color distribution patterns and specific markings
  5. Similar to how our eyes instinctively focus on the most distinguishing features when recognizing faces, the Feature Attention Mechanism dynamically adjusts its focus on key morphological traits, ensuring the model prioritizes the most relevant details for each breed.
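Here is a minimal sketch of the Feature Space Transformer idea mentioned in step 2; the projection size and the 8×8 grid are illustrative assumptions:

import torch.nn as nn

# Hypothetical Feature Space Transformer: project a flat feature vector
# into 64 channels over an 8x8 grid so the convolutional analyzers can
# exploit spatial relationships
class FeatureSpaceTransformer(nn.Module):
    def __init__(self, in_features, channels=64, grid=8):
        super().__init__()
        self.channels, self.grid = channels, grid
        self.project = nn.Sequential(
            nn.Linear(in_features, channels * grid * grid),
            nn.ReLU()
        )

    def forward(self, x):  # x: (B, in_features)
        out = self.project(x)
        return out.view(-1, self.channels, self.grid, self.grid)  # (B, 64, 8, 8)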

Right path: Standard CNN processing

  1. Original feature representation: The initial feature representation of the image.
  2. CNN backbone (ConvNeXtV2): Uses ConvNeXtV2 as the backbone network, extracting features through standard deep learning methods (a typical instantiation is sketched after this list).
  3. Classifier head: Transforms features into classification probabilities for 124 dog breeds.
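For reference, a backbone like this is typically instantiated through timm; the model variant below is a common choice, not necessarily the exact one used here:

import timm

# Typical backbone setup (illustrative): ConvNeXtV2 with a 124-class head
backbone = timm.create_model('convnextv2_base', pretrained=True,
                             num_classes=124)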

Integration path

  1. The Feature Relation Analyzer goes beyond isolated traits: it examines how different features interact, capturing relationships that define a breed’s unique appearance. For example, combinations like “head shape + tail posture + fur texture” might point to specific breeds.
  2. Feature integrator: Integrates morphological features and their relationship information to form a more comprehensive representation.
  3. Enhanced feature representation: The final feature representation, combining original features (through residual connections) and features obtained from morphological analysis.
  4. Finally, the model delivers its prediction, determining the breed based on a combination of original CNN features and morphological analysis.

4. Performance observations of the morphological feature extractor

After analyzing the entire model architecture, the most important question was: does it actually work? To verify the effectiveness of the Morphological Feature Extractor, I tested it on 30 photos of dog breeds that models typically confuse. The comparison shows a clear improvement: the baseline model correctly classified 23 of the 30 images (76.7%), while adding the Morphological Feature Extractor raised accuracy to 90% (27 of 30).

This improvement is not just reflected in numbers but also in how the model differentiates breeds. The heat maps below show which image regions the model focuses on before and after integrating the feature extractor.
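Heatmaps like these are commonly produced with Grad-CAM-style methods. For readers who want to try it, below is a minimal pure-PyTorch sketch; the hook placement and normalization are illustrative choices, not the exact visualization pipeline used for the figures here:

import torch
import torch.nn.functional as F

def grad_cam(model, image, target_layer, class_idx):
    # Minimal Grad-CAM: weight a layer's activations by the gradient of
    # the target class score, then average over channels
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: grads.append(go[0]))

    model.zero_grad()
    logits = model(image.unsqueeze(0))      # image: (3, H, W) tensor
    logits[0, class_idx].backward()
    h1.remove()
    h2.remove()

    a, g = acts[0], grads[0]                # both (1, C, h, w)
    weights = g.mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * a).sum(dim=1))  # (1, h, w)
    return cam / (cam.max() + 1e-8)         # normalized to [0, 1]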

4.1 Recognizing a Dachshund’s unique body proportions

Let’s start with a misclassification case. The heatmap below shows that without the Morphological Feature Extractor, the model incorrectly classified a Dachshund as a Golden Retriever.

  • Without morphological features, the model relied too heavily on color and fur texture rather than recognizing the dog’s overall structure. The heat map reveals that the model’s attention was scattered across not just the dog’s face but also background elements like the roof, which likely influenced the misclassification.
  • Since long-haired Dachshunds and Golden Retrievers share a similar coat color, the model was misled, focusing more on superficial similarities rather than distinguishing key features like body proportions and ear shape.

This shows a common issue with deep learning models: without proper guidance, they can focus on the wrong things and make mistakes. Here, the background distractions kept the model from noticing the Dachshund’s long body and short legs, the very traits that set it apart from a Golden Retriever.

However, after integrating the Morphological Feature Extractor, the model’s attention shifted significantly, as seen in the heatmap below:

Key observations from the Dachshund’s attention heatmap:

  • Background distractions were significantly reduced. The model learned to ignore environmental elements like grass and trees, focusing more on the dog’s structural features.
  • The model’s focus has shifted to the Dachshund’s facial features, particularly the eyes, nose, and mouth, key traits for breed recognition. Compared to before, attention is no longer scattered, resulting in a more stable and confident classification.

This confirms that the Morphological Feature Extractor helps the model filter out irrelevant background noise and focus on the defining facial traits of each breed, making its predictions more reliable.


4.2 Distinguishing Siberian Huskies from other northern breeds

For sled dogs, the impact of the Morphological Feature Extractor was even more pronounced. Below is a heatmap before the extractor was applied, where the model misclassified a Siberian Husky as an Eskimo Dog.

As seen in the heatmap, the model failed to focus on any distinguishing features, instead displaying a diffused, unfocused attention distribution. This suggests the model was uncertain about the defining traits of a Husky, leading to misclassification.

However, after incorporating the Morphological Feature Extractor, a critical transformation occurred:

This case, distinguishing Siberian Huskies from other northern breeds (like Alaskan Malamutes), is one that particularly impressed me. As you can see in the heatmap, the model’s attention is now highly concentrated on the Husky’s facial features.

What’s interesting is the yellow highlighted area around the eyes. The Husky’s iconic blue eyes and distinctive “mask” pattern are key features that distinguish it from other sled dogs. The model also notices the Husky’s distinctive ear shape, which is smaller and closer to the head than an Alaskan Malamute’s, forming a distinct triangular shape.

Most surprising to me was that despite the snow and red berries in the background (elements that might interfere with the baseline model), the improved model pays minimal attention to these distractions, focusing on the breed itself.


4.3 Summary of heatmap analysis

Through these heatmaps, we can clearly see how the Morphological Feature Extractor has changed the model’s “thinking process,” making it more similar to expert recognition abilities:

  1. Morphology takes priority over color: The model is no longer swayed by surface features (like fur color) but has learned to prioritize body type, head shape, and other features that experts use to distinguish similar breeds.
  2. Dynamic allocation of attention: The model demonstrates flexibility in feature prioritization: emphasizing body proportions for Dachshunds and facial markings for Huskies, similar to expert recognition processes.
  3. Enhanced interference resistance: The model has learned to ignore backgrounds and non-characteristic parts, maintaining focus on key morphological features even in noisy environments.

5. Potential applications and future improvements

Through this project, I’ve come to believe the concept of Morphological Feature Extractors won’t be limited to dog breed identification; it could apply to other domains that rely on recognizing fine-grained differences. However, what constitutes a ‘morphological feature’ varies by field, which makes direct transfer a challenge.

5.1 Applications in fine-grained visual classification

Inspired by biological classification principles, this approach is particularly useful for distinguishing objects with subtle differences. Some practical applications include:

  • Medical diagnosis: Tumor classification, dermatological analysis, and radiology (X-ray/CT scans), where doctors rely on shape, texture, and boundary features to differentiate conditions.
  • Plant and insect identification: Certain poisonous mushrooms closely resemble edible ones, requiring expert knowledge to differentiate based on morphology.
  • Industrial quality control: Detecting microscopic defects in manufactured products, such as shape errors in electronic components or surface scratches on metals.
  • Art and artifact authentication: Museums and auction houses often rely on texture patterns, carving details, and material analysis to distinguish genuine artifacts from forgeries, an area where AI can assist.

This methodology could also be applied to surveillance and forensic analysis, such as recognizing individuals through gait analysis, clothing details, or vehicle identification in criminal investigations.


5.2 Challenges and future improvements

While the Morphological Feature Extractor has demonstrated its effectiveness, there are several challenges and areas for improvement:

  • Feature selection flexibility: The current system relies on predefined feature sets. Future enhancements could incorporate adaptive feature selection, dynamically adjusting which features matter for each object type (e.g., ear shape for dogs, wing structure for birds); a rough sketch of this idea follows this list.
  • Computational efficiency: Although I initially expected the design to scale well, real-world deployment revealed increased computational complexity, which limits its use on mobile and embedded devices.
  • Integration with advanced architectures: Combining morphological analysis with models like Transformers or Self-Supervised Learning could enhance performance but introduces challenges in feature representation consistency.
  • Cross-domain adaptability: While effective for dog breed classification, applying this approach to new fields (e.g., medical imaging or plant identification) requires redefinition of morphological features.
  • Explainability and few-shot learning potential: The intuitive nature of morphological features may facilitate low-data learning scenarios. However, overcoming deep learning’s dependency on large labeled datasets remains a key challenge.
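To make the first point concrete, here is one hypothetical shape adaptive feature selection could take: a small gating network that predicts per-branch importance weights from a global image embedding, rather than relying on fixed, predefined feature sets. All names and dimensions below are illustrative assumptions, not the current implementation.

```python
# Hypothetical adaptive feature gate: learns how much each morphological
# branch (body, head, fur, tail, color, ...) should contribute per image.
import torch
import torch.nn as nn

class AdaptiveFeatureGate(nn.Module):
    def __init__(self, embed_dim: int, num_branches: int = 5):
        super().__init__()
        # Predicts one importance weight per morphological branch.
        self.gate = nn.Sequential(
            nn.Linear(embed_dim, num_branches),
            nn.Softmax(dim=-1),
        )

    def forward(self, global_embed: torch.Tensor, branch_feats: torch.Tensor):
        # global_embed: [B, embed_dim] summary of the whole image
        # branch_feats: [B, num_branches, embed_dim], one vector per branch
        weights = self.gate(global_embed)                    # [B, num_branches]
        fused = (weights.unsqueeze(-1) * branch_feats).sum(dim=1)
        return fused, weights   # weights remain inspectable for explanations
```

Because the gate outputs an explicit weight per morphological branch, it preserves the interpretability that motivated the original design: you can read off which traits drove each prediction.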

These challenges indicate areas where the approach can be refined, rather than fundamental flaws in its design.


Conclusion

This development process made me realize that the Morphological Feature Extractor isn’t just another machine learning technique; it’s a step toward making AI think more like humans. Instead of passively memorizing patterns, this approach helps AI focus on key features, much like experts do.

Beyond computer vision, this idea could influence AI’s ability to reason, make decisions, and interpret information more effectively. As AI evolves, we are not just improving models but shaping systems that learn in a more human-like way.

Thank you for reading. Developing PawMatchAI has given me valuable hands-on experience with AI visual systems and feature recognition, along with new perspectives on AI development. If you have any viewpoints or topics you’d like to discuss, I welcome the exchange. 🙌

References & data sources

Dataset sources

  • Stanford Dogs Dataset (Kaggle)
    • Originally sourced from Stanford Vision Lab – ImageNet Dogs
    • Citation:
      • Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, and Li Fei-Fei. Novel dataset for Fine-Grained Image Categorization. FGVC Workshop, CVPR, 2011.
  • Unsplash Images – Additional images of four breeds (Bichon Frise, Dachshund, Shiba Inu, Havanese) were sourced from Unsplash for dataset augmentation. 

Image attribution

  • All images, unless otherwise noted, are created by the author.

Disclaimer

The methods and approaches described in this article are based on my personal research and experimental findings. While the Morphological Feature Extractor has demonstrated improvements in specific scenarios, its performance may vary depending on datasets, implementation details, and training conditions.

This article is intended for educational and informational purposes only. Readers should conduct independent evaluations and adapt the approach based on their specific use cases. No guarantees are made regarding its effectiveness across all applications.
