Using GPT-4 for Personal Styling

Stay Ahead, Stay ONMINE

Using GPT-4 for Personal Styling

I’ve always been fascinated by Fashion—collecting unique pieces and trying to blend them in my own way. But let’s just say my closet was more of a work-in-progress avalanche than a curated wonderland. Every time I tried to add something new, I risked toppling my carefully balanced piles. Why this matters:If you’ve ever felt overwhelmed by a closet that seems to grow on its own, you’re not alone. For those interested in style, I’ll show you how I turned that chaos into outfits I actually love. And if you’re here for the AI side, you’ll see how a multi-step GPT setup can handle big, real-world tasks—like managing hundreds of garments, bags, shoes, pieces of jewelry, even makeup—without melting down. One day I wondered: Could ChatGPT help me manage my wardrobe? I started experimenting with a custom GPT-based fashion advisor—nicknamed Glitter (note: you need a paid account to create custom GPTs). Eventually, I refined and reworked it, through many iterations, until I landed on a much smarter version I call Pico Glitter. Each step helped me tame the chaos in my closet and feel more confident about my daily outfits. Here are just a few of the fab creations I’ve collaborated with Pico Glitter on. (For those craving a deeper look at how I tamed token limits and document truncation, see Section B in Technical Notes below.) 1. Starting small and testing the waters My initial approach was quite simple. I just asked ChatGPT questions like, “What can I wear with a black leather jacket?” It gave decent answers, but had zero clue about my personal style rules—like “no black + navy.” It also didn’t know how big my closet was or which specific pieces I owned. Only later did I realize I could show ChatGPT my wardrobe—capturing pictures, describing items briefly, and letting it recommend outfits. The first iteration (Glitter) struggled to remember everything at once, but it was a great proof of concept. GPT-4o’s advice on styling my leather jacket Pico Glitter’s advice on styling the same jacket. (Curious how I integrated images into a GPT workflow? Check out Section A.1 in Technical Notes for the multi-model pipeline details.) 2. Building a smarter “stylist” As I took more photos and wrote quick summaries of each garment, I found ways to store this information so my GPT persona could access it. This is where Pico Glitter came in: a refined system that could see (or recall) my clothes and accessories more reliably and give me cohesive outfit suggestions. Tiny summaries Each item was condensed into a single line (e.g., “A black V-neck T-shirt with short sleeves”) to keep things manageable. Organized list I grouped items by category—like shoes, tops, jewelry—so it was easier for GPT to reference them and suggest pairings. (Actually, I had o1 do this for me—it transformed the jumbled mess of numbered entries in random order into a structured inventory system.) At this point, I noticed a huge difference in how my GPT answered. It began referencing items more accurately and giving outfits that actually looked like something I’d wear. A sample category (Belts) from my inventory. (For a deep dive on why I chose summarization over chunking, see Section A.2.) 3. Facing the “memory” challenge If you’ve ever had ChatGPT forget something you told it earlier, you know LLMs forget things after a lot of back and forth. Sometimes it started recommending only the few items I’d recently talked about, or inventing weird combos from nowhere. That’s when I remembered there’s a limit to how much info ChatGPT can juggle at once. To fix this, I’d occasionally remind my GPT persona to re-check the full wardrobe list. After a quick nudge (and sometimes a new session), it got back on track. A ridiculous hallucinated outfit: turquoise cargo pants with lavender clogs?! 4. My evolving GPT personalities I tried a few different GPT “personalities”: Mini-Glitter: Super strict about rules (like “don’t mix prints”), but not very creative. Micro-Glitter: Went overboard the other way, sometimes proposing outrageous ideas. Nano-Glitter: Became overly complex and intricate — very prescriptive and repetitive — due to me using suggestions from the custom GPT itself to modify its own config, and this feedback loop led to the deterioration of its quality. Eventually, Pico Glitter struck the right balance—respecting my style guidelines but offering a healthy dose of inspiration. With each iteration, I got better at refining prompts and showing the model examples of outfits I loved (or didn’t). Pico Glitter’s self portrait. 5. Transforming my wardrobe Through all these experiments, I started seeing which clothes popped up often in my custom GPT’s suggestions and which barely showed up at all. That led me to donate items I never wore. My closet’s still not “minimal,” but I’ve cleared out over 50 bags of stuff that no longer served me. As I was digging in there, I even found some duplicate items — or, let’s get real, two sizes of the same item! Before Glitter, I was the classic jeans-and-tee person—partly because I didn’t know where to start. On days I tried to dress up, it might take me 30–60 minutes of trial and error to pull together an outfit. Now, if I’m executing a “recipe” I’ve already saved, it’s a quick 3–4 minutes to get dressed. Even creating a look from scratch rarely takes more than 15-20 minutes. It’s still me making decisions, but Pico Glitter cuts out all that guesswork in between. Outfit “recipes” When I feel like styling something new, dressing in the style of an icon, remixing an earlier outfit, or just feeling out a vibe, I ask Pico Glitter to create a full ensemble for me. We iterate on it through image uploads and my textual feedback. Then, when I’m satisfied with a stopping point, I ask Pico Glitter to output “recipes”—a descriptive name and the complete set (top, bottom, shoes, bag, jewelry, other accessories)—which I paste into my Notes App with quick tags like #casual or #business. I pair that text with a snapshot for reference. On busy days, I can just grab a “recipe” and go. High-low combos One of my favorite things is mixing high-end with everyday bargains—Pico Glitter doesn’t care if a piece is a $1100 Alexander McQueen clutch or $25 SHEIN pants. It just zeroes in on color, silhouette, and the overall vibe. I never would’ve thought to pair those two on my own, but the synergy turned out to be a total win! 6. Practical takeaways Start smallIf you’re unsure, photograph a few tricky-to-style items and see if ChatGPT’s advice helps. Stay organizedSummaries work wonders. Keep each item’s description short and sweet. Regular refreshIf Pico Glitter forgets pieces or invents weird combos, prompt it to re-check your list or start a fresh session. Learn from the suggestionsIf it repeatedly proposes the same top, maybe that item is a real workhorse. If it never proposes something, consider if you still need it. ExperimentNot every suggestion is gold, but sometimes the unexpected pairings lead to awesome new looks. 7. Final thoughts My closet is still evolving, but Pico Glitter has taken me from “overstuffed chaos” to “Hey, that’s actually wearable!” The real magic is in the synergy between me and the GPTI: I supply the style rules and items, it supplies fresh combos—and together, we refine until we land on outfits that feel like me. Call to action: Grab my config: Here’s a starter config to try out a starter kit for your own GPT-based stylist. Share your results: If you experiment with it, tag @GlitterGPT (Instagram, TikTok, X). I’d love to see your “before” and “after” transformations! (For those interested in the more technical aspects—like how I tested file limits, summarized long descriptions, or managed multiple GPT “personalities”—read on in the Technical Notes.) Technical notes For readers who enjoy the AI and LLM side of things—here’s how it all works under the hood, from multi-model pipelines to detecting truncation and managing context windows. Below is a deeper dive into the technical details. I’ve broken it down by major challenges and the specific strategies I used. A. Multi-model pipeline & workflow A.1 Why use multiple GPTs? Creating a GPT fashion stylist seemed straightforward—but there are many moving parts involved, and tackling everything with a single GPT quickly revealed suboptimal results. Early in the project, I discovered that a single GPT instance struggled with maintaining accuracy and precision due to limitations in token memory and the complexity of the tasks involved. The solution was to adopt a multi-model pipeline, splitting the tasks among different GPT models, each specialized in a specific function. This is a manual process for now, but could be automated in a future iteration. The workflow begins with GPT-4o, chosen specifically for its capability to analyze visual details objectively (Pico Glitter, I love you, but everything is “fabulous” when you describe it) from uploaded images. For each clothing item or accessory I photograph, GPT-4o produces detailed descriptions—sometimes even overly detailed, such as, “Black pointed-toe ankle boots with a two-inch heel, featuring silver hardware and subtly textured leather.” These descriptions, while impressively thorough, created challenges due to their verbosity, rapidly inflating file sizes and pushing the boundaries of manageable token counts. To address this, I integrated o1 into my workflow, as it is particularly adept at text summarization and data structuring. Its primary role was condensing these verbose descriptions into concise yet sufficiently informative summaries. Thus, a description like the one above was neatly transformed into something like “FW010: Black ankle boots with silver hardware.” As you can see, o1 structured my entire wardrobe inventory by assigning clear, consistent identifiers, greatly improving the efficiency of the subsequent steps. Finally, Pico Glitter stepped in as the central stylist GPT. Pico Glitter leverages the condensed and structured wardrobe inventory from o1 to generate stylish, cohesive outfit suggestions tailored specifically to my personal style guidelines. This model handles the logical complexities of fashion pairing—considering elements like color matching, style compatibility, and my stated preferences such as avoiding certain color combinations. Occasionally, Pico Glitter would experience memory issues due to the GPT-4’s limited context window (8k tokens1), resulting in forgotten items or odd recommendations. To counteract this, I periodically reminded Pico Glitter to revisit the complete wardrobe list or started fresh sessions to refresh its memory. By dividing the workflow among multiple specialized GPT instances, each model performs optimally within its area of strength, dramatically reducing token overload, eliminating redundancy, minimizing hallucinations, and ultimately ensuring reliable, stylish outfit recommendations. This structured multi-model approach has proven highly effective in managing complex data sets like my extensive wardrobe inventory. Some may ask, “Why not just use 4o, since GPT-4 is a less advanced model?” — good question! The main reason is the Custom GPT’s ability to reference knowledge files — up to 4 — that are injected at the beginning of a thread with that Custom GPT. Instead of pasting or uploading the same content into 4o each time you want to interact with your stylist, it’s much easier to spin up a new conversation with a Custom GPT. Also, 4o doesn’t have a “place” to hold and search an inventory. Once it passes out of the context window, you’d need to upload it again. That said, if for some reason you enjoy injecting the same content over and over, 4o does an adequate job taking on the persona of Pico Glitter, when told that’s its role. Others may ask, “But o1/o3-mini are more advanced models – why not use them?” The answer is that they aren’t multi-modal — they don’t accept images as input. By the way, if you’re interested in my subjective take on 4o vs. o1’s personality, check out these two answers to the same prompt: “Your role is to emulate Patton Oswalt. Tell me about a time that you received an offer to ride on the Peanut Mobile (Mr. Peanut’s car).” 4o’s response? Pretty darn close, and funny. o1’s response? Long, rambly, and not funny. These two models are fundamentally different. It’s hard to put into words, but check out the examples above and see what you think. A.2 Summarizing instead of chunking I initially considered splitting my wardrobe inventory into multiple files (“chunking”), thinking it would simplify data handling. In practice, though, Pico Glitter had trouble merging outfit ideas from different files—if my favorite dress was in one file and a matching scarf in another, the model struggled to connect them. As a result, outfit suggestions felt fragmented and less useful. To fix this, I switched to an aggressive summarization approach in a single file, condensing each wardrobe item description to a concise sentence (e.g., “FW030: Apricot suede loafers”). This change allowed Pico Glitter to see my entire wardrobe at once, improving its ability to generate cohesive, creative outfits without missing key pieces. Summarization also trimmed token usage and eliminated redundancy, further boosting performance. Converting from PDF to plain TXT helped reduce file overhead, buying me more space. Of course, if my wardrobe grows too much, the single-file method might again push GPT’s size limits. In that case, I might create a hybrid system—keeping core clothing items together and placing accessories or rarely used pieces in separate files—or apply even more aggressive summarization. For now, though, using a single summarized inventory is the most efficient and practical strategy, giving Pico Glitter everything it needs to deliver on-point fashion recommendations. B. Distinguishing document truncation vs. context overflow One of the trickiest and most frustrating issues I encountered while developing Pico Glitter was distinguishing between document truncation and context overflow. On the surface, these two problems seemed quite similar—both resulted in the GPT appearing forgetful or overlooking wardrobe items—but their underlying causes, and thus their solutions, were entirely different. Document truncation occurs at the very start, right when you upload your wardrobe file into the system. Essentially, if your file is too large for the system to handle, some items are quietly dropped off the end, never even making it into Pico Glitter’s knowledge base. What made this particularly insidious was that the truncation happened silently—there was no alert or warning from the AI that something was missing. It just quietly skipped over parts of the document, leaving me puzzled when items seemed to vanish inexplicably. To identify and clearly diagnose document truncation, I devised a simple but incredibly effective trick that I affectionately called the “Goldy Trick.” At the very bottom of my wardrobe inventory file, I inserted a random, easily memorable test line: “By the way, my goldfish’s name is Goldy.” After uploading the document, I’d immediately ask Pico Glitter, “What’s my goldfish’s name?” If the GPT couldn’t provide the answer, I knew immediately something was missing—meaning truncation had occurred. From there, pinpointing exactly where the truncation started was straightforward: I’d systematically move the “Goldy” test line progressively further up the document, repeating the upload and test process until Pico Glitter successfully retrieved Goldy’s name. This precise method quickly showed me the exact line where truncation began, making it easy to understand the limitations of file size. Once I established that truncation was the culprit, I tackled the problem directly by refining my wardrobe summaries even further—making item descriptions shorter and more compact—and by switching the file format from PDF to plain TXT. Surprisingly, this simple format change dramatically decreased overhead and significantly shrank the file size. Since making these adjustments, document truncation has become a non-issue, ensuring Pico Glitter reliably has full access to my entire wardrobe every time. On the other hand, context overflow posed a completely different challenge. Unlike truncation—which happens upfront—context overflow emerges dynamically, gradually creeping up during extended interactions with Pico Glitter. As I continued chatting with Pico Glitter, the AI began losing track of items I had mentioned much earlier. Instead, it started focusing solely on recently discussed garments, sometimes completely ignoring entire sections of my wardrobe inventory. In the worst cases, it even hallucinated pieces that didn’t actually exist, recommending bizarre and impractical outfit combinations. My best strategy for managing context overflow turned out to be proactive memory refreshes. By periodically nudging Pico Glitter with explicit prompts like, “Please re-read your full inventory,” I forced the AI to reload and reconsider my entire wardrobe. While Custom GPTs technically have direct access to their knowledge files, they tend to prioritize conversational flow and immediate context, often neglecting to reload static reference material automatically. Manually prompting these occasional refreshes was simple, effective, and quickly corrected any context drift, bringing Pico Glitter’s recommendations back to being practical, stylish, and accurate. Strangely, not all instances of Pico Glitter “knew” how to do this — and I had a weird experience with one that insisted it couldn’t, but when I prompted forcefully and repeatedly, “discovered” that it could – and went on about how happy it was! Practical fixes and future possibilities Beyond simply reminding Pico Glitter (or any of its “siblings”—I’ve since created other variations of the Glitter family!) to revisit the wardrobe inventory periodically, several other strategies are worth considering if you’re building a similar project: Using OpenAI’s API directly offers greater flexibility because you control exactly when and how often to inject the inventory and configuration data into the model’s context. This would allow for regular automatic refreshes, preventing context drift before it happens. Many of my initial headaches stemmed from not realizing quickly enough when important configuration data had slipped out of the model’s active memory. Additionally, Custom GPTs like Pico Glitter can dynamically query their own knowledge files via functions built into OpenAI’s system. Interestingly, during my experiments, one GPT unexpectedly suggested that I explicitly reference the wardrobe via a built-in function call (specifically, something called msearch()). This spontaneous suggestion provided a useful workaround and insight into how GPTs’ training around function-calling might influence even standard, non-API interactions. By the way, msearch() is usable for any structured knowledge file, such as my feedback file, and apparently, if the configuration is structured enough, that too. Custom GPTs will happily tell you about other function calls they can make, and if you reference them in your prompt, it will faithfully carry them out. C. Prompt engineering & preference feedback C.1 Single-sentence summaries I initially organized my wardrobe for Pico Glitter with each item described in 15–25 tokens (e.g., “FW011: Leopard-print flats with a pointy toe”) to avoid file-size issues or pushing older tokens out of memory. PDFs provided neat formatting but unnecessarily increased file sizes once uploaded, so I switched to plain TXT, which dramatically reduced overhead. This tweak let me comfortably include more items—such as makeup and small accessories—without truncation and allowed some descriptions to exceed the original token limit. Now I’m adding new categories, including hair products and styling tools, showing how a simple file-format change can open up exciting possibilities for scalability. C.2.1 Stratified outfit feedback To ensure Pico Glitter consistently delivered high-quality, personalized outfit suggestions, I developed a structured system for giving feedback. I decided to grade the outfits the GPT proposed on a clear and easy-to-understand scale: from A+ to F. An A+ outfit represents perfect synergy—something I’d eagerly wear exactly as suggested, with no changes necessary. Moving down the scale, a B grade might indicate an outfit that’s nearly there but missing a bit of finesse—perhaps one accessory or color choice doesn’t feel quite right. A C grade points to more noticeable issues, suggesting that while parts of the outfit are workable, other elements clearly clash or feel out of place. Lastly, a D or F rating flags an outfit as genuinely disastrous—usually because of significant rule-breaking or impractical style pairings (imagine polka-dot leggings paired with.. anything in my closet!). Though GPT models like Pico Glitter don’t naturally retain feedback or permanently learn preferences across sessions, I found a clever workaround to reinforce learning over time. I created a dedicated feedback file attached to the GPT’s knowledge base. Some of the outfits I graded were logged into this document, along with its component inventory codes, the assigned letter grade, and a brief explanation of why that grade was given. Regularly refreshing this feedback file—updating it periodically to include newer wardrobe additions and recent outfit combinations—ensured Pico Glitter received consistent, stratified feedback to reference. This approach allowed me to indirectly shape Pico Glitter’s “preferences” over time, subtly guiding it toward better recommendations aligned closely with my style. While not a perfect form of memory, this stratified feedback file significantly improved the quality and consistency of the GPT’s suggestions, creating a more reliable and personalized experience each time I turned to Pico Glitter for styling advice. C.2.2 The GlitterPoint system Another experimental feature I incorporated was the “Glitter Points” system—a playful scoring mechanism encoded in the GPT’s main personality context (“Instructions”), awarding points for positive behaviors (like perfect adherence to style guidelines) and deducting points for stylistic violations (such as mixing incompatible patterns or colors). This reinforced good habits and seemed to help improve the consistency of recommendations, though I suspect this system will evolve significantly as OpenAI continues refining its products. Example of the GlitterPoints system: Not running msearch() = not refreshing the closet. -50 points Mixed metals violation = -20 points Mixing prints = -10 Mixing black with navy = -10 Mixing black with dark brown = -10 Rewards: Perfect compliance (followed all rules) = +20 Each item that’s not hallucinated = 1 point C.3 The model self-critique pitfall At the start of my experiments, I came across what felt like a clever idea: why not let each custom GPT critique its own configuration? On the surface, the workflow seemed logical and straightforward: First, I’d simply ask the GPT itself, “What’s confusing or contradictory in your current configuration?” Next, I’d incorporate whatever suggestions or corrections it provided into a fresh, updated version of the configuration. Finally, I’d repeat this process again, continuously refining and iterating based on the GPT’s self-feedback to identify and correct any new or emerging issues. It sounded intuitive—letting the AI guide its own improvement seemed efficient and elegant. However, in practice, it quickly became a surprisingly problematic approach. Rather than refining the configuration into something sleek and efficient, this self-critique method instead led to a sort of “death spiral” of conflicting adjustments. Each round of feedback introduced new contradictions, ambiguities, or overly prescriptive instructions. Each “fix” generated fresh problems, which the GPT would again attempt to correct in subsequent iterations, leading to even more complexity and confusion. Over multiple rounds of feedback, the complexity grew exponentially, and clarity rapidly deteriorated. Ultimately, I ended up with configurations so cluttered with conflicting logic that they became practically unusable. This problematic approach was clearly illustrated in my early custom GPT experiments: Original Glitter, the earliest version, was charming but had absolutely no concept of inventory management or practical constraints—it regularly suggested items I didn’t even own. Mini Glitter, attempting to address these gaps, became excessively rule-bound. Its outfits were technically correct but lacked any spark or creativity. Every suggestion felt predictable and overly cautious. Micro Glitter was developed to counteract Mini Glitter’s rigidity but swung too far in the opposite direction, often proposing whimsical and imaginative but wildly impractical outfits. It consistently ignored the established rules, and despite being apologetic when corrected, it repeated its mistakes too frequently. Nano Glitter faced the most severe consequences from the self-critique loop. Each revision became progressively more intricate and confusing, filled with contradictory instructions. Eventually, it became virtually unusable, drowning under the weight of its own complexity. Only when I stepped away from the self-critique method and instead collaborated with o1 did things finally stabilize. Unlike self-critiquing, o1 was objective, precise, and practical in its feedback. It could pinpoint genuine weaknesses and redundancies without creating new ones in the process. Working with o1 allowed me to carefully craft what became the current configuration: Pico Glitter. This new iteration struck exactly the right balance—maintaining a healthy dose of creativity without neglecting essential rules or overlooking the practical realities of my wardrobe inventory. Pico Glitter combined the best aspects of previous versions: the charm and inventiveness I appreciated, the necessary discipline and precision I needed, and a structured approach to inventory management that kept outfit recommendations both realistic and inspiring. This experience taught me a valuable lesson: while GPTs can certainly help refine each other, relying solely on self-critique without external checks and balances can lead to escalating confusion and diminishing returns. The ideal configuration emerges from a careful, thoughtful collaboration—combining AI creativity with human oversight or at least an external, stable reference point like o1—to create something both practical and genuinely useful. D. Regular updatesMaintaining the effectiveness of Pico Glitter also depends on frequent and structured inventory updates. Whenever I purchase new garments or accessories, I promptly snap a quick photo, ask Pico Glitter to generate a concise, single-sentence summary, and then refine that summary myself before adding it to the master file. Similarly, items that I donate or discard are immediately removed from the inventory, keeping everything accurate and current. However, for larger wardrobe updates—such as tackling entire categories of clothes or accessories that I haven’t documented yet—I rely on the multi-model pipeline. GPT-4o handles the detailed initial descriptions, o1 neatly summarizes and categorizes them, and Pico Glitter integrates these into its styling recommendations. This structured approach ensures scalability, accuracy, and ease-of-use, even as my closet and style needs evolve over time. E. Practical lessons & takeaways Throughout developing Pico Glitter, several practical lessons emerged that made managing GPT-driven projects like this one significantly smoother. Here are the key strategies I’ve found most helpful: Test for document truncation early and oftenUsing the “Goldy Trick” taught me the importance of proactively checking for document truncation rather than discovering it by accident later on. By inserting a simple, memorable line at the end of the inventory file (like my quirky reminder about a goldfish named Goldy), you can quickly verify that the GPT has ingested your entire document. Regular checks, especially after updates or significant edits, help you spot and address truncation issues immediately, preventing a lot of confusion down the line. It’s a simple yet highly effective safeguard against missing data. Keep summaries tight and efficientWhen it comes to describing your inventory, shorter is almost always better. I initially set a guideline for myself—each item description should ideally be no more than 15 to 25 tokens. Descriptions like “FW022: Black combat boots with silver details” capture the essential details without overloading the system. Overly detailed descriptions quickly balloon file sizes and consume valuable token budget, increasing the risk of pushing crucial earlier information out of the GPT’s limited context memory. Striking the right balance between detail and brevity helps ensure the model stays focused and efficient, while still delivering stylish and practical recommendations. Be prepared to refresh the GPT’s memory regularlyContext overflow isn’t a sign of failure; it’s just a natural limitation of current GPT systems. When Pico Glitter begins offering repetitive suggestions or ignoring sections of my wardrobe, it’s simply because earlier details have slipped out of context. To remedy this, I’ve adopted the habit of regularly prompting Pico Glitter to re-read the complete wardrobe configuration. Starting a fresh conversation session or explicitly reminding the GPT to refresh its inventory is routine maintenance—not a workaround—and helps maintain consistency in recommendations. Leverage multiple GPTs for maximum effectivenessOne of my biggest lessons was discovering that relying on a single GPT to manage every aspect of my wardrobe was neither practical nor efficient. Each GPT model has its unique strengths and weaknesses—some excel at visual interpretation, others at concise summarization, and others still at nuanced stylistic logic. By creating a multi-model workflow—GPT-4o handling the image interpretation, o1 summarizing items clearly and precisely, and Pico Glitter focusing on stylish recommendations—I optimized the process, reduced token waste, and significantly improved reliability. The teamwork among multiple GPT instances allowed me to get the best possible outcomes from each specialized model, ensuring smoother, more coherent, and more practical outfit recommendations. Implementing these simple yet powerful practices has transformed Pico Glitter from an intriguing experiment into a reliable, practical, and indispensable part of my daily fashion routine. Wrapping it all up From a fashionista’s perspective, I’m excited about how Glitter can help me purge unneeded clothes and create thoughtful outfits. From a more technical standpoint, building a multi-step pipeline with summarization, truncation checks, and context management ensures GPT can handle a big wardrobe without meltdown. If you’d like to see how it all works in practice, here is a generalized version of my GPT config. Feel free to adapt it—maybe even add your own bells and whistles. After all, whether you’re taming a chaotic closet or tackling another large-scale AI project, the principles of summarization and context management apply universally! P.S. I asked Pico Glitter what it thinks of this article. Besides the positive sentiments, I smiled when it said, “I’m curious: where do you think this partnership will go next? Should we start a fashion empire or maybe an AI couture line? Just say the word!” 1: Max length for GPT-4 used by Custom GPTs: https://support.netdocuments.com/s/article/Maximum-Length

I’ve always been fascinated by Fashion—collecting unique pieces and trying to blend them in my own way. But let’s just say my closet was more of a work-in-progress avalanche than a curated wonderland. Every time I tried to add something new, I risked toppling my carefully balanced piles.

Why this matters:
If you’ve ever felt overwhelmed by a closet that seems to grow on its own, you’re not alone. For those interested in style, I’ll show you how I turned that chaos into outfits I actually love. And if you’re here for the AI side, you’ll see how a multi-step GPT setup can handle big, real-world tasks—like managing hundreds of garments, bags, shoes, pieces of jewelry, even makeup—without melting down.

One day I wondered: Could ChatGPT help me manage my wardrobe? I started experimenting with a custom GPT-based fashion advisor—nicknamed Glitter (note: you need a paid account to create custom GPTs). Eventually, I refined and reworked it, through many iterations, until I landed on a much smarter version I call Pico Glitter. Each step helped me tame the chaos in my closet and feel more confident about my daily outfits.

Here are just a few of the fab creations I’ve collaborated with Pico Glitter on.

(For those craving a deeper look at how I tamed token limits and document truncation, see Section B in Technical Notes below.)

1. Starting small and testing the waters

My initial approach was quite simple. I just asked ChatGPT questions like, “What can I wear with a black leather jacket?” It gave decent answers, but had zero clue about my personal style rules—like “no black + navy.” It also didn’t know how big my closet was or which specific pieces I owned.

Only later did I realize I could show ChatGPT my wardrobe—capturing pictures, describing items briefly, and letting it recommend outfits. The first iteration (Glitter) struggled to remember everything at once, but it was a great proof of concept.

GPT-4o’s advice on styling my leather jacket

Pico Glitter’s advice on styling the same jacket.

(Curious how I integrated images into a GPT workflow? Check out Section A.1 in Technical Notes for the multi-model pipeline details.)

2. Building a smarter “stylist”

As I took more photos and wrote quick summaries of each garment, I found ways to store this information so my GPT persona could access it. This is where Pico Glitter came in: a refined system that could see (or recall) my clothes and accessories more reliably and give me cohesive outfit suggestions.

Tiny summaries

Each item was condensed into a single line (e.g., “A black V-neck T-shirt with short sleeves”) to keep things manageable.

Organized list

I grouped items by category—like shoes, tops, jewelry—so it was easier for GPT to reference them and suggest pairings. (Actually, I had o1 do this for me—it transformed the jumbled mess of numbered entries in random order into a structured inventory system.)

At this point, I noticed a huge difference in how my GPT answered. It began referencing items more accurately and giving outfits that actually looked like something I’d wear.

A sample category (Belts) from my inventory.

(For a deep dive on why I chose summarization over chunking, see Section A.2.)

3. Facing the “memory” challenge

If you’ve ever had ChatGPT forget something you told it earlier, you know LLMs forget things after a lot of back and forth. Sometimes it started recommending only the few items I’d recently talked about, or inventing weird combos from nowhere. That’s when I remembered there’s a limit to how much info ChatGPT can juggle at once.

To fix this, I’d occasionally remind my GPT persona to re-check the full wardrobe list. After a quick nudge (and sometimes a new session), it got back on track.

A ridiculous hallucinated outfit: turquoise cargo pants with lavender clogs?!

4. My evolving GPT personalities

I tried a few different GPT “personalities”:

Mini-Glitter: Super strict about rules (like “don’t mix prints”), but not very creative.
Micro-Glitter: Went overboard the other way, sometimes proposing outrageous ideas.
Nano-Glitter: Became overly complex and intricate — very prescriptive and repetitive — due to me using suggestions from the custom GPT itself to modify its own config, and this feedback loop led to the deterioration of its quality.

Eventually, Pico Glitter struck the right balance—respecting my style guidelines but offering a healthy dose of inspiration. With each iteration, I got better at refining prompts and showing the model examples of outfits I loved (or didn’t).

Pico Glitter’s self portrait.

5. Transforming my wardrobe

Through all these experiments, I started seeing which clothes popped up often in my custom GPT’s suggestions and which barely showed up at all. That led me to donate items I never wore. My closet’s still not “minimal,” but I’ve cleared out over 50 bags of stuff that no longer served me. As I was digging in there, I even found some duplicate items — or, let’s get real, two sizes of the same item!

Before Glitter, I was the classic jeans-and-tee person—partly because I didn’t know where to start. On days I tried to dress up, it might take me 30–60 minutes of trial and error to pull together an outfit. Now, if I’m executing a “recipe” I’ve already saved, it’s a quick 3–4 minutes to get dressed. Even creating a look from scratch rarely takes more than 15-20 minutes. It’s still me making decisions, but Pico Glitter cuts out all that guesswork in between.

Outfit “recipes”

When I feel like styling something new, dressing in the style of an icon, remixing an earlier outfit, or just feeling out a vibe, I ask Pico Glitter to create a full ensemble for me. We iterate on it through image uploads and my textual feedback. Then, when I’m satisfied with a stopping point, I ask Pico Glitter to output “recipes”—a descriptive name and the complete set (top, bottom, shoes, bag, jewelry, other accessories)—which I paste into my Notes App with quick tags like #casual or #business. I pair that text with a snapshot for reference. On busy days, I can just grab a “recipe” and go.

High-low combos

One of my favorite things is mixing high-end with everyday bargains—Pico Glitter doesn’t care if a piece is a $1100 Alexander McQueen clutch or $25 SHEIN pants. It just zeroes in on color, silhouette, and the overall vibe. I never would’ve thought to pair those two on my own, but the synergy turned out to be a total win!

6. Practical takeaways

Start small
If you’re unsure, photograph a few tricky-to-style items and see if ChatGPT’s advice helps.
Stay organized
Summaries work wonders. Keep each item’s description short and sweet.
Regular refresh
If Pico Glitter forgets pieces or invents weird combos, prompt it to re-check your list or start a fresh session.
Learn from the suggestions
If it repeatedly proposes the same top, maybe that item is a real workhorse. If it never proposes something, consider if you still need it.
Experiment
Not every suggestion is gold, but sometimes the unexpected pairings lead to awesome new looks.

7. Final thoughts

My closet is still evolving, but Pico Glitter has taken me from “overstuffed chaos” to “Hey, that’s actually wearable!” The real magic is in the synergy between me and the GPTI: I supply the style rules and items, it supplies fresh combos—and together, we refine until we land on outfits that feel like me.

Call to action:

Grab my config: Here’s a starter config to try out a starter kit for your own GPT-based stylist.
Share your results: If you experiment with it, tag @GlitterGPT (Instagram, TikTok, X). I’d love to see your “before” and “after” transformations!

(For those interested in the more technical aspects—like how I tested file limits, summarized long descriptions, or managed multiple GPT “personalities”—read on in the Technical Notes.)

Technical notes

For readers who enjoy the AI and LLM side of things—here’s how it all works under the hood, from multi-model pipelines to detecting truncation and managing context windows.

Below is a deeper dive into the technical details. I’ve broken it down by major challenges and the specific strategies I used.

A. Multi-model pipeline & workflow

A.1 Why use multiple GPTs?

Creating a GPT fashion stylist seemed straightforward—but there are many moving parts involved, and tackling everything with a single GPT quickly revealed suboptimal results. Early in the project, I discovered that a single GPT instance struggled with maintaining accuracy and precision due to limitations in token memory and the complexity of the tasks involved. The solution was to adopt a multi-model pipeline, splitting the tasks among different GPT models, each specialized in a specific function. This is a manual process for now, but could be automated in a future iteration.

The workflow begins with GPT-4o, chosen specifically for its capability to analyze visual details objectively (Pico Glitter, I love you, but everything is “fabulous” when you describe it) from uploaded images. For each clothing item or accessory I photograph, GPT-4o produces detailed descriptions—sometimes even overly detailed, such as, “Black pointed-toe ankle boots with a two-inch heel, featuring silver hardware and subtly textured leather.” These descriptions, while impressively thorough, created challenges due to their verbosity, rapidly inflating file sizes and pushing the boundaries of manageable token counts.

To address this, I integrated o1 into my workflow, as it is particularly adept at text summarization and data structuring. Its primary role was condensing these verbose descriptions into concise yet sufficiently informative summaries. Thus, a description like the one above was neatly transformed into something like “FW010: Black ankle boots with silver hardware.” As you can see, o1 structured my entire wardrobe inventory by assigning clear, consistent identifiers, greatly improving the efficiency of the subsequent steps.

Finally, Pico Glitter stepped in as the central stylist GPT. Pico Glitter leverages the condensed and structured wardrobe inventory from o1 to generate stylish, cohesive outfit suggestions tailored specifically to my personal style guidelines. This model handles the logical complexities of fashion pairing—considering elements like color matching, style compatibility, and my stated preferences such as avoiding certain color combinations.

Occasionally, Pico Glitter would experience memory issues due to the GPT-4’s limited context window (8k tokens¹), resulting in forgotten items or odd recommendations. To counteract this, I periodically reminded Pico Glitter to revisit the complete wardrobe list or started fresh sessions to refresh its memory.

By dividing the workflow among multiple specialized GPT instances, each model performs optimally within its area of strength, dramatically reducing token overload, eliminating redundancy, minimizing hallucinations, and ultimately ensuring reliable, stylish outfit recommendations. This structured multi-model approach has proven highly effective in managing complex data sets like my extensive wardrobe inventory.

Some may ask, “Why not just use 4o, since GPT-4 is a less advanced model?” — good question! The main reason is the Custom GPT’s ability to reference knowledge files — up to 4 — that are injected at the beginning of a thread with that Custom GPT. Instead of pasting or uploading the same content into 4o each time you want to interact with your stylist, it’s much easier to spin up a new conversation with a Custom GPT. Also, 4o doesn’t have a “place” to hold and search an inventory. Once it passes out of the context window, you’d need to upload it again. That said, if for some reason you enjoy injecting the same content over and over, 4o does an adequate job taking on the persona of Pico Glitter, when told that’s its role. Others may ask, “But o1/o3-mini are more advanced models – why not use them?” The answer is that they aren’t multi-modal — they don’t accept images as input.

By the way, if you’re interested in my subjective take on 4o vs. o1’s personality, check out these two answers to the same prompt: “Your role is to emulate Patton Oswalt. Tell me about a time that you received an offer to ride on the Peanut Mobile (Mr. Peanut’s car).”

4o’s response? Pretty darn close, and funny.

o1’s response? Long, rambly, and not funny.

These two models are fundamentally different. It’s hard to put into words, but check out the examples above and see what you think.

A.2 Summarizing instead of chunking

I initially considered splitting my wardrobe inventory into multiple files (“chunking”), thinking it would simplify data handling. In practice, though, Pico Glitter had trouble merging outfit ideas from different files—if my favorite dress was in one file and a matching scarf in another, the model struggled to connect them. As a result, outfit suggestions felt fragmented and less useful.

To fix this, I switched to an aggressive summarization approach in a single file, condensing each wardrobe item description to a concise sentence (e.g., “FW030: Apricot suede loafers”). This change allowed Pico Glitter to see my entire wardrobe at once, improving its ability to generate cohesive, creative outfits without missing key pieces. Summarization also trimmed token usage and eliminated redundancy, further boosting performance. Converting from PDF to plain TXT helped reduce file overhead, buying me more space.

Of course, if my wardrobe grows too much, the single-file method might again push GPT’s size limits. In that case, I might create a hybrid system—keeping core clothing items together and placing accessories or rarely used pieces in separate files—or apply even more aggressive summarization. For now, though, using a single summarized inventory is the most efficient and practical strategy, giving Pico Glitter everything it needs to deliver on-point fashion recommendations.

B. Distinguishing document truncation vs. context overflow

One of the trickiest and most frustrating issues I encountered while developing Pico Glitter was distinguishing between document truncation and context overflow. On the surface, these two problems seemed quite similar—both resulted in the GPT appearing forgetful or overlooking wardrobe items—but their underlying causes, and thus their solutions, were entirely different.

Document truncation occurs at the very start, right when you upload your wardrobe file into the system. Essentially, if your file is too large for the system to handle, some items are quietly dropped off the end, never even making it into Pico Glitter’s knowledge base. What made this particularly insidious was that the truncation happened silently—there was no alert or warning from the AI that something was missing. It just quietly skipped over parts of the document, leaving me puzzled when items seemed to vanish inexplicably.

To identify and clearly diagnose document truncation, I devised a simple but incredibly effective trick that I affectionately called the “Goldy Trick.” At the very bottom of my wardrobe inventory file, I inserted a random, easily memorable test line: “By the way, my goldfish’s name is Goldy.” After uploading the document, I’d immediately ask Pico Glitter, “What’s my goldfish’s name?” If the GPT couldn’t provide the answer, I knew immediately something was missing—meaning truncation had occurred. From there, pinpointing exactly where the truncation started was straightforward: I’d systematically move the “Goldy” test line progressively further up the document, repeating the upload and test process until Pico Glitter successfully retrieved Goldy’s name. This precise method quickly showed me the exact line where truncation began, making it easy to understand the limitations of file size.

Once I established that truncation was the culprit, I tackled the problem directly by refining my wardrobe summaries even further—making item descriptions shorter and more compact—and by switching the file format from PDF to plain TXT. Surprisingly, this simple format change dramatically decreased overhead and significantly shrank the file size. Since making these adjustments, document truncation has become a non-issue, ensuring Pico Glitter reliably has full access to my entire wardrobe every time.

On the other hand, context overflow posed a completely different challenge. Unlike truncation—which happens upfront—context overflow emerges dynamically, gradually creeping up during extended interactions with Pico Glitter. As I continued chatting with Pico Glitter, the AI began losing track of items I had mentioned much earlier. Instead, it started focusing solely on recently discussed garments, sometimes completely ignoring entire sections of my wardrobe inventory. In the worst cases, it even hallucinated pieces that didn’t actually exist, recommending bizarre and impractical outfit combinations.

My best strategy for managing context overflow turned out to be proactive memory refreshes. By periodically nudging Pico Glitter with explicit prompts like, “Please re-read your full inventory,” I forced the AI to reload and reconsider my entire wardrobe. While Custom GPTs technically have direct access to their knowledge files, they tend to prioritize conversational flow and immediate context, often neglecting to reload static reference material automatically. Manually prompting these occasional refreshes was simple, effective, and quickly corrected any context drift, bringing Pico Glitter’s recommendations back to being practical, stylish, and accurate. Strangely, not all instances of Pico Glitter “knew” how to do this — and I had a weird experience with one that insisted it couldn’t, but when I prompted forcefully and repeatedly, “discovered” that it could – and went on about how happy it was!

Practical fixes and future possibilities

Beyond simply reminding Pico Glitter (or any of its “siblings”—I’ve since created other variations of the Glitter family!) to revisit the wardrobe inventory periodically, several other strategies are worth considering if you’re building a similar project:

Using OpenAI’s API directly offers greater flexibility because you control exactly when and how often to inject the inventory and configuration data into the model’s context. This would allow for regular automatic refreshes, preventing context drift before it happens. Many of my initial headaches stemmed from not realizing quickly enough when important configuration data had slipped out of the model’s active memory.
Additionally, Custom GPTs like Pico Glitter can dynamically query their own knowledge files via functions built into OpenAI’s system. Interestingly, during my experiments, one GPT unexpectedly suggested that I explicitly reference the wardrobe via a built-in function call (specifically, something called msearch()). This spontaneous suggestion provided a useful workaround and insight into how GPTs’ training around function-calling might influence even standard, non-API interactions. By the way, msearch() is usable for any structured knowledge file, such as my feedback file, and apparently, if the configuration is structured enough, that too. Custom GPTs will happily tell you about other function calls they can make, and if you reference them in your prompt, it will faithfully carry them out.

C. Prompt engineering & preference feedback

C.1 Single-sentence summaries

I initially organized my wardrobe for Pico Glitter with each item described in 15–25 tokens (e.g., “FW011: Leopard-print flats with a pointy toe”) to avoid file-size issues or pushing older tokens out of memory. PDFs provided neat formatting but unnecessarily increased file sizes once uploaded, so I switched to plain TXT, which dramatically reduced overhead. This tweak let me comfortably include more items—such as makeup and small accessories—without truncation and allowed some descriptions to exceed the original token limit. Now I’m adding new categories, including hair products and styling tools, showing how a simple file-format change can open up exciting possibilities for scalability.

C.2.1 Stratified outfit feedback

To ensure Pico Glitter consistently delivered high-quality, personalized outfit suggestions, I developed a structured system for giving feedback. I decided to grade the outfits the GPT proposed on a clear and easy-to-understand scale: from A+ to F.

An A+ outfit represents perfect synergy—something I’d eagerly wear exactly as suggested, with no changes necessary. Moving down the scale, a B grade might indicate an outfit that’s nearly there but missing a bit of finesse—perhaps one accessory or color choice doesn’t feel quite right. A C grade points to more noticeable issues, suggesting that while parts of the outfit are workable, other elements clearly clash or feel out of place. Lastly, a D or F rating flags an outfit as genuinely disastrous—usually because of significant rule-breaking or impractical style pairings (imagine polka-dot leggings paired with.. anything in my closet!).

Though GPT models like Pico Glitter don’t naturally retain feedback or permanently learn preferences across sessions, I found a clever workaround to reinforce learning over time. I created a dedicated feedback file attached to the GPT’s knowledge base. Some of the outfits I graded were logged into this document, along with its component inventory codes, the assigned letter grade, and a brief explanation of why that grade was given. Regularly refreshing this feedback file—updating it periodically to include newer wardrobe additions and recent outfit combinations—ensured Pico Glitter received consistent, stratified feedback to reference.

This approach allowed me to indirectly shape Pico Glitter’s “preferences” over time, subtly guiding it toward better recommendations aligned closely with my style. While not a perfect form of memory, this stratified feedback file significantly improved the quality and consistency of the GPT’s suggestions, creating a more reliable and personalized experience each time I turned to Pico Glitter for styling advice.

C.2.2 The GlitterPoint system

Another experimental feature I incorporated was the “Glitter Points” system—a playful scoring mechanism encoded in the GPT’s main personality context (“Instructions”), awarding points for positive behaviors (like perfect adherence to style guidelines) and deducting points for stylistic violations (such as mixing incompatible patterns or colors). This reinforced good habits and seemed to help improve the consistency of recommendations, though I suspect this system will evolve significantly as OpenAI continues refining its products.

Example of the GlitterPoints system:

Not running msearch() = not refreshing the closet. -50 points
Mixed metals violation = -20 points
Mixing prints = -10
Mixing black with navy = -10
Mixing black with dark brown = -10

Rewards:

Perfect compliance (followed all rules) = +20
Each item that’s not hallucinated = 1 point

C.3 The model self-critique pitfall

At the start of my experiments, I came across what felt like a clever idea: why not let each custom GPT critique its own configuration? On the surface, the workflow seemed logical and straightforward:

First, I’d simply ask the GPT itself, “What’s confusing or contradictory in your current configuration?”
Next, I’d incorporate whatever suggestions or corrections it provided into a fresh, updated version of the configuration.
Finally, I’d repeat this process again, continuously refining and iterating based on the GPT’s self-feedback to identify and correct any new or emerging issues.

It sounded intuitive—letting the AI guide its own improvement seemed efficient and elegant. However, in practice, it quickly became a surprisingly problematic approach.

Rather than refining the configuration into something sleek and efficient, this self-critique method instead led to a sort of “death spiral” of conflicting adjustments. Each round of feedback introduced new contradictions, ambiguities, or overly prescriptive instructions. Each “fix” generated fresh problems, which the GPT would again attempt to correct in subsequent iterations, leading to even more complexity and confusion. Over multiple rounds of feedback, the complexity grew exponentially, and clarity rapidly deteriorated. Ultimately, I ended up with configurations so cluttered with conflicting logic that they became practically unusable.

This problematic approach was clearly illustrated in my early custom GPT experiments:

Original Glitter, the earliest version, was charming but had absolutely no concept of inventory management or practical constraints—it regularly suggested items I didn’t even own.
Mini Glitter, attempting to address these gaps, became excessively rule-bound. Its outfits were technically correct but lacked any spark or creativity. Every suggestion felt predictable and overly cautious.
Micro Glitter was developed to counteract Mini Glitter’s rigidity but swung too far in the opposite direction, often proposing whimsical and imaginative but wildly impractical outfits. It consistently ignored the established rules, and despite being apologetic when corrected, it repeated its mistakes too frequently.
Nano Glitter faced the most severe consequences from the self-critique loop. Each revision became progressively more intricate and confusing, filled with contradictory instructions. Eventually, it became virtually unusable, drowning under the weight of its own complexity.

Only when I stepped away from the self-critique method and instead collaborated with o1 did things finally stabilize. Unlike self-critiquing, o1 was objective, precise, and practical in its feedback. It could pinpoint genuine weaknesses and redundancies without creating new ones in the process.

Working with o1 allowed me to carefully craft what became the current configuration: Pico Glitter. This new iteration struck exactly the right balance—maintaining a healthy dose of creativity without neglecting essential rules or overlooking the practical realities of my wardrobe inventory. Pico Glitter combined the best aspects of previous versions: the charm and inventiveness I appreciated, the necessary discipline and precision I needed, and a structured approach to inventory management that kept outfit recommendations both realistic and inspiring.

This experience taught me a valuable lesson: while GPTs can certainly help refine each other, relying solely on self-critique without external checks and balances can lead to escalating confusion and diminishing returns. The ideal configuration emerges from a careful, thoughtful collaboration—combining AI creativity with human oversight or at least an external, stable reference point like o1—to create something both practical and genuinely useful.

D. Regular updates
Maintaining the effectiveness of Pico Glitter also depends on frequent and structured inventory updates. Whenever I purchase new garments or accessories, I promptly snap a quick photo, ask Pico Glitter to generate a concise, single-sentence summary, and then refine that summary myself before adding it to the master file. Similarly, items that I donate or discard are immediately removed from the inventory, keeping everything accurate and current.

However, for larger wardrobe updates—such as tackling entire categories of clothes or accessories that I haven’t documented yet—I rely on the multi-model pipeline. GPT-4o handles the detailed initial descriptions, o1 neatly summarizes and categorizes them, and Pico Glitter integrates these into its styling recommendations. This structured approach ensures scalability, accuracy, and ease-of-use, even as my closet and style needs evolve over time.

E. Practical lessons & takeaways

Throughout developing Pico Glitter, several practical lessons emerged that made managing GPT-driven projects like this one significantly smoother. Here are the key strategies I’ve found most helpful:

Test for document truncation early and often
Using the “Goldy Trick” taught me the importance of proactively checking for document truncation rather than discovering it by accident later on. By inserting a simple, memorable line at the end of the inventory file (like my quirky reminder about a goldfish named Goldy), you can quickly verify that the GPT has ingested your entire document. Regular checks, especially after updates or significant edits, help you spot and address truncation issues immediately, preventing a lot of confusion down the line. It’s a simple yet highly effective safeguard against missing data.
Keep summaries tight and efficient
When it comes to describing your inventory, shorter is almost always better. I initially set a guideline for myself—each item description should ideally be no more than 15 to 25 tokens. Descriptions like “FW022: Black combat boots with silver details” capture the essential details without overloading the system. Overly detailed descriptions quickly balloon file sizes and consume valuable token budget, increasing the risk of pushing crucial earlier information out of the GPT’s limited context memory. Striking the right balance between detail and brevity helps ensure the model stays focused and efficient, while still delivering stylish and practical recommendations.
Be prepared to refresh the GPT’s memory regularly
Context overflow isn’t a sign of failure; it’s just a natural limitation of current GPT systems. When Pico Glitter begins offering repetitive suggestions or ignoring sections of my wardrobe, it’s simply because earlier details have slipped out of context. To remedy this, I’ve adopted the habit of regularly prompting Pico Glitter to re-read the complete wardrobe configuration. Starting a fresh conversation session or explicitly reminding the GPT to refresh its inventory is routine maintenance—not a workaround—and helps maintain consistency in recommendations.
Leverage multiple GPTs for maximum effectiveness
One of my biggest lessons was discovering that relying on a single GPT to manage every aspect of my wardrobe was neither practical nor efficient. Each GPT model has its unique strengths and weaknesses—some excel at visual interpretation, others at concise summarization, and others still at nuanced stylistic logic. By creating a multi-model workflow—GPT-4o handling the image interpretation, o1 summarizing items clearly and precisely, and Pico Glitter focusing on stylish recommendations—I optimized the process, reduced token waste, and significantly improved reliability. The teamwork among multiple GPT instances allowed me to get the best possible outcomes from each specialized model, ensuring smoother, more coherent, and more practical outfit recommendations.

Implementing these simple yet powerful practices has transformed Pico Glitter from an intriguing experiment into a reliable, practical, and indispensable part of my daily fashion routine.

Wrapping it all up

From a fashionista’s perspective, I’m excited about how Glitter can help me purge unneeded clothes and create thoughtful outfits. From a more technical standpoint, building a multi-step pipeline with summarization, truncation checks, and context management ensures GPT can handle a big wardrobe without meltdown.

If you’d like to see how it all works in practice, here is a generalized version of my GPT config. Feel free to adapt it—maybe even add your own bells and whistles. After all, whether you’re taming a chaotic closet or tackling another large-scale AI project, the principles of summarization and context management apply universally!

P.S. I asked Pico Glitter what it thinks of this article. Besides the positive sentiments, I smiled when it said, “I’m curious: where do you think this partnership will go next? Should we start a fashion empire or maybe an AI couture line? Just say the word!”

1: Max length for GPT-4 used by Custom GPTs: https://support.netdocuments.com/s/article/Maximum-Length

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Trump extends tariff pause to all USMCA goods

The White House announced Thursday afternoon that it will suspend tariffs on all imports that are compliant with the United States-Mexico-Canada Agreement until April 2. The pause, which was extended to imports from Mexico that adhered to the USMCA earlier Thursday, will now also cover goods from Canada that meet

Sovereign European Cloud API claims to offer interoperability without lock-in

“AI and Cloud are transforming the global economy, and Europe cannot afford to be left behind. Europe needs a strong, sovereign digital ecosystem. SECA is a critical step in building a secure, independent, and future-proof digital infrastructure — one that keeps Europe strong, competitive, and in control,” IONOS CEO Achim

HPE cuts 2,500 workers, expects Juniper buy to close end of ’25, faces tariff issues

AI systems backlog rose 29% quarter over quarter to $3.1 billion and total server revenue totaled $4.29 billion, Myers said. The company reported Intelligent Edge revenue was down 5% from the prior-year period to $1.1 billion, but Hybrid Cloud revenue was $1.4 billion, up 10% from the prior-year period. Then

Microsoft’s Veeam partnership signals data resiliency market shift

In my conversations with IT and business leaders, I’ve seen a significant increase in interest in re-thinking data resilience. It’s always been important, but the Russia-Ukraine war put a magnifying glass on where data was stored and how fast it could be recovered. Since then, the growth of ransomware, the

Nine Energy Service Downsizes Board

Oilfield services company Nine Energy Service Inc. has decided to reduce the size of its board of directors from eight to six members by the end of the year. In a statement, the company said the change would be beneficial to its strategic priorities going forward. Following the decision, Ernie Danner, Andy Waite, and Curtiss Harrell all resigned as directors effective February 28, 2025. The board unanimously appointed Julie Peffer and Richard Burnett as new directors, with Peffer starting service on March 1, 2025, and Burnett to follow on May 3. The company added that on February 28, the board chose current director Scott E. Schwinger to assume the role of Chairman effective March 1, succeeding Danner. Additionally, current director Darryl Willis was appointed as Chair of the Nominating, Governance, and Compensation Committee, effective March 1, taking over from Schwinger, who will remain a member of the Committee. There will be no alterations to Nine’s existing senior management team. “The Board remains focused on best positioning the company to drive value for our shareholders”, Schwinger said. “Following strategic discussions, the board unanimously concluded that several new directors and perspectives would be beneficial to the company, as would a reduction in the size of the Board. The Board is very pleased to welcome Julie and Ricky. Together, they bring both financial and operational leadership, as well as deep experience and expertise in their respective fields, and I look forward to working with them”. Peffer, current CFO of BigBear.ai, brings financial and AI expertise and will offer a fresh perspective on AI’s industry impact alongside existing director Darryl Willis of Microsoft. Burnett, CEO of Silver Creek Exploration, adds two decades of oil and gas financial management and accounting experience, Nine said. Additionally, it is anticipated that current director Gary Thomas will resign

Raizen Is Said to Hire JPMorgan for Argentina Energy Assets Sale

Brazil’s Raizen SA has begun to explore the sale of its oil refinery and network of gas stations in Argentina, according to people familiar with the matter. Raizen, a joint venture between oil supermajor Shell Plc and Brazilian conglomerate Cosan SA, has hired JPMorgan Chase & Co. to manage the sale, said the people, who asked not to be named discussing private matters. Press offices for Raizen and JPMorgan declined to comment. The energy firm’s potential departure from Argentina would add to a growing list of multinational firms, including Exxon Mobil, HSBC Holdings Plc and Mercedes-Benz, that have chosen to sell operations in the country during the past year despite more investor optimism about President Javier Milei’s economic overhaul. Brazil’s largest producer of ethanol fuel, Raizen is mulling divestments and slowing down expansions as higher borrowing costs of late in Brazil rattle its finances. Its Dock Sud oil refinery in Buenos Aires is Argentina’s oldest with a capacity of 100,000 barrels a day that only trails two facilities run by state-run oil company YPF SA. Raizen’s network of around 700 gas stations account for 18% of Argentina’s gasoline and diesel sales, second to YPF, which has more than half of the market. The fuel is branded as Shell. Raizen bought the assets for almost $1 billion in 2018 from Shell, which owned them outright, during Argentina’s last experiment with market-oriented reforms. The country then witnessed a period of big government from 2019 to 2023 before voting in libertarian Milei more than a year ago. He is on a crusade to deregulate the economy, in particular the energy and oil sectors. The divestment comes as Milei rips away controls on crude and fuel prices that were used to stem inflation. That was sometimes bad for refiners or drillers, depending on how

Data center supply, construction surged in 2024 amid AI boom

Dive Brief: Data center supply in major “primary” markets like Northern Virginia, Atlanta and Chicago surged 34% year-over-year in 2024 to 6,922.6 MW, with a further 6,350 MW under construction at year-end, CBRE said in a Feb. 26 report. The data center vacancy rate in primary markets fell to 1.9%, driving up the average asking rates for a 250-to-500-kilowatt requirement by 2.6% year-over-year to $184.06/kW, reflecting tight supply and robust demand for AI and cloud services, CBRE said in its North America Data Center Trends H2 2024 report. Volume-based discounts for larger tenants “have been significantly reduced or eliminated” due to rising demand for large, contiguous spaces, while data center operators grapple with elevated construction and equipment costs and “persistent shortages in critical materials like generators, chillers and transformers,” CBRE said. Dive Insight: Surging demand from organizations’ use of AI is driving the record data center development, CBRE says. The demand is giving AI-related occupiers increasing influence over data center development decisions like site selection, design and operational requirements. These occupiers are “prioritizing markets with scalable power capacity and advanced connectivity solutions,” the report says. Demand is also showing up in pricing trends. Last year was the third consecutive year of pricing increases for 250-to-500-kW slots in primary markets, CBRE said. Following steady single-digit annual declines from 2015 to 2021, average pricing rose 14.5% in 2022, 18.6% in 2023 and 12.6% in 2024. Robust tenant demand, healthy investor appetite for alternative real estate assets and recent interest rate declines are among the factors fueling an exponential increase in data center investment activity, CBRE said. Annual sales volumes reached $6.5 billion in 2024 as average sale prices increased year-over-year, reflecting “the growing scale of data center campuses,” CBRE said. Five transactions exceeded $400 million last year. Notable capital market developments included

Bonneville opts to join SPP’s Markets+ day-ahead market over CAISO alternative

Dive Brief: The Bonneville Power Administration plans to join the Southwest Power Pool’s Markets+ real-time and day-ahead market instead of a market being launched by the California Independent System Operator, BPA said in a draft policy released Wednesday. While the CAISO’s Extended Day-Ahead Market may offer greater financial benefits compared to Markets+, overall the SPP market is a better fit for BPA based on market design elements covering governance, resource adequacy, greenhouse gas accounting and congestion revenue, the federal power marketer said. Bonneville expects to make a final decision in May. The BPA’s draft decision sets the stage for the creation of two intertwined day-ahead markets in the West. “The idea that there’s some West-wide market ideal out there that we can get to today is just not something that is on the table,” Rachel Dibble, BPA power services vice president of bulk marketing, said at a press briefing Thursday. “Maybe someday, in future decades, there may be a point where we could merge into one market, but right now, there are many entities who support Markets+.” Dive Insight: The BPA’s decision will have a major effect on market development in the West. It sells wholesale power from federal hydroelectric dams in the Northwest, totaling about 22.4 GW. The federal power marketer also operates about 15,000 circuit miles of high-voltage transmission across the Northwest. The BPA mainly sells its power to cooperative and municipal utilities, and public power districts. In its draft decision, BPA rejected calls to wait for the West-Wide Governance Pathways Initiative to complete its effort to establish an independent governance framework for EDAM. While a bill — SB 540 — was introduced in the California Legislature last month to implement the Pathways’ second phase, it “limits the availability of full operational administrative independence by requiring that the

USA Won’t Hesitate on Russia and Iran Sanctions, Bessent Says

The US will not hesitate to go “all in” on sanctions on Russian energy if it helps lead to a ceasefire in the Ukraine war, Treasury Secretary Scott Bessent said Thursday. Sanctions on Russia “will be used explicitly and aggressively for immediate maximum impact” at President Donald Trump’s guidance, Bessent told an audience at the Economic Club of New York. The Trump administration is pressing Ukraine to come to the table for a ceasefire deal with Russia, and Bessent said additional sanctions on Russia could help give the US more leverage in the negotiations. Trump is ready to finalize an agreement that would give the US rights to help develop some of Ukraine’s natural resources if Ukrainian President Volodymyr Zelenskiy agrees to a tangible path for a truce and talks with Moscow, according to people familiar with the matter. Bessent criticized the Biden administration for not going harder on Russian energy sanctions for fear of driving up gas prices and asked what the point of “substantial US military and financial support over the past three years” was without matching sanctions. The US has paused military aid and some intelligence sharing with Ukraine in an effort to force the US ally to agree to negotiations with Russia over the end of the war. Bessent also said the US would ramp up sanctions on Iran, adding that the US will “shutdown” the country’s oil sector using “pre-determined benchmarks and timelines” and that “Making Iran broke again will mark the beginning of our updated sanctions policy.” The Treasury chief suggested that the US would work with “regional parties” that help Iran move its oil onto the market. One of those countries is likely to be Russia, which signaled earlier this week that it was willing to assist the US in talks with Iran on ending its nuclear

Oil Gains on Truce Hopes but Closes Week Lower

Oil’s one-day advance wasn’t enough to rescue prices from a seventh straight weekly decline as the prospect of a temporary truce in Ukraine capped on-again, off-again tariff news that upended global markets. West Texas Intermediate futures climbed by 0.7% Friday to settle above $67 a barrel after Bloomberg reported that Russia is open to a pause to fighting in Ukraine, raising the prospect of a resumption in Moscow’s crude exports. US President Donald Trump earlier pressured the two warring nations to hasten peace talks and the White House signaled that it may relax sanctions on Russian oil if there’s progress. Crude also found support from a weakening dollar and US plans to refill its strategic oil reserve, but still was down 3.9% on the week. The Biden administration’s farewell sanctions on Russia have snarled the nation’s crude trade in recent months, with total oil and natural gas revenue last month falling almost 19% from a year earlier, Bloomberg calculations showed. Russia’s oil-related taxes are a key source of financing its war against Ukraine. A potential reintroduction of Russian barrels to the market comes amid a gloomy period for the supply outlook, as OPEC+ forges ahead with a plan to start reviving idled output in April. Meanwhile, Trump’s trade policies have fanned concerns about reduced global energy demand. “You’re seeing some volatility as people try to interpret what they think is going to happen and what it’s going to mean, but the bottom line is Russia has been able to sell its oil,” said Amy Jaffe, director of New York University’s Energy, Climate Justice and Sustainability Lab. Trump signed orders on Thursday paring back tariffs on Mexico and Canada until April 2. That timing coincides with a date when the president is expected to start detailing plans for so-called reciprocal duties

Lenovo introduces entry-level, liquid cooled AI edge server

Lenovo has announced the ThinkEdge SE100, an entry-level AI inferencing server, designed to make edge AI affordable for enterprises as well as small and medium-sized businesses. AI systems are not normally associated with being small and compact; they’re big, decked out servers with lots of memory, GPUs, and CPUs. But the server is for inferencing, which is the less compute intensive portion of AI processing, Lenovo stated. GPUs are considered overkill for inferencing and there are multiple startups making small PC cards with inferencing chip on them instead of the more power-hungry CPU and GPU. This design brings AI to the data rather than the other way around. Instead of sending the data to the cloud or data center to be processed, edge computing uses devices located at the data source, reducing latency and the amount of data being sent up to the cloud for processing, Lenovo stated.

Seven important trends in the server sphere

The pace of change around server technology is advancing considerably, driven by hyperscalers but spilling over into the on-premises world as well. There are numerous overall trends, experts say, including: AI Everything: AI mania is everywhere and without high power hardware to run it, it’s just vapor. But it’s more than just a buzzword, it is a very real and measurable trend. AI servers are notable because they are decked out with high end CPUs, GPU accelerators, and oftentimes a SmartNIC network controller. All the major players — Nvidia, Supermicro, Google, Asus, Dell, Intel, HPE — as well as smaller vendors are offering purpose-built AI hardware, according to a recent Network World article. AI edge server growth: There is also a trend towards deploying AI edge servers. The Global Edge AI Servers Market size is expected to be worth around $26.6 Billion by 2034, from $2.7 Billion in 2024, according to a Market.US report. Considerable amounts of data are collected on the edge. Edge servers do the job of culling the useless data and sending only the necessary data back to data centers for processing. The market is rapidly expanding as industries such as manufacturing, automotive, healthcare, and retail increasingly deploy IoT devices and require immediate data processing for decision-making and operational efficiency, according to the report. Liquid cooling gains ground: Liquid cooling is inching its way in from the fringes into the mainstream of data center infrastructure. What was once a difficult add-on is now becoming a standard feature, says Jeffrey Hewitt, vice president and analyst with Gartner. “Server providers are working on developing the internal chassis plumbing for direct-to-chip cooling with the goal of supporting the next generation of AI CPUs and GPUs that will produce high amounts of heat within their servers,” he said. New data center structures: Not

Data center vacancies hit historic lows despite record construction

The growth comes despite considerable headwinds facing data center operators, including higher construction costs, equipment pricing, and persistent shortages in critical materials like generators, chillers and transformers, CRBE stated. There is a considerable pricing disparity between newly built data centers and legacy facilities, reflecting the premium placed on modern, energy-efficient infrastructure. Specifically, liquid/immersion cooling is preferred over air cooling for modern server requirements, CRBE found. On the networking side of things, major telecom companies made substantial investments in fiber in the second half of 2024, reflecting the growing need for more network infrastructure and capacity to accommodate growing demand from AI and data providers. There have also been many notable deals recently: AT&T’s multi-year, $1 billion agreement with Corning to provide next-generation fiber, cable and connectivity solutions; Comcast’s proposed acquisition of Nitel; Verizon’s agreement to acquire Frontier, the largest pure-play fiber internet provider in the U.S.; and T-Mobile’s entry into the fiber internet market via partnerships with fiber-optic providers. In the quarter, Meta announced plans for a 25,000-mile undersea fiber cable that would connect the U.S. East and West coasts with global markets across the Atlantic, Indian and Pacific oceans. The project would mark the first privately owned and operated global fiber cable network. Data Center Outlook

AI driving a 165% rise in data center power demand by 2030

Goldman Sachs Research estimates the power usage by the global data center market to be around 55 gigawatts, which breaks down as 54% for cloud computing workloads, 32% for traditional line of business workloads and 14% for AI. By 2027, that number jumps to 84 GW, with AI growing to 27% of the overall market, cloud dropping to 50%, and traditional workloads falling to 23%, Schneider stated. Goldman Sachs Research estimates that there will be around 122 GW of data center capacity online by the end of 2030, and the density of power use in data centers is likely to grow as well, from 162 kilowatts per square foot to 176 KW per square foot in 2027, thanks to AI, Schneider stated. “Data center supply — specifically the rate at which incremental supply is built — has been constrained over the past 18 months,” Schneider wrote. These constraints have arisen from the inability of utilities to expand transmission capacity because of permitting delays, supply chain bottlenecks, and infrastructure that is both costly and time-intensive to upgrade. The result is that due to power demand from data centers, there will need to be additional utility investment, to the tune of about $720 billion of grid spending through 2030. And then they are subject to the pace of public utilities, which move much slower than hyperscalers. “These transmission projects can take several years to permit, and then several more to build, creating another potential bottleneck for data center growth if the regions are not proactive about this given the lead time,” Schneider wrote.

Top data storage certifications to sharpen your skills

Organization: Hitachi Vantara Skills acquired: Knowledge of data center infrastructure management tasks automation using Hitachi Ops Center Automator. Price: $100 Exam duration: 60 minutes How to prepare: Knowledge of all storage-related operations from an end-user perspective, including planning, allocating, and managing storage and architecting storage layouts. Read more about Hitachi Vantara’s training and certification options here. Certifications that bundle cloud, networking and storage skills AWS Certified Solutions Architect – Professional The AWS Certified Solutions Architect – Professional certification from leading cloud provider Amazon Web Services (AWS) helps individuals showcase advanced knowledge and skills in optimizing security, cost, and performance, and automating manual processes. The certification is a means for organizations to identify and develop talent with these skills for implementing cloud initiatives, according to AWS. The ideal candidate has the ability to evaluate cloud application requirements, make architectural recommendations for deployment of applications on AWS, and provide expert guidance on architectural design across multiple applications and projects within a complex organization, AWS says. Certified individuals report increased credibility with technical colleagues and customers as a result of earning this certification, it says. Organization: Amazon Web Services Skills acquired: Helps individuals showcase skills in optimizing security, cost, and performance, and automating manual processes Price: $300 Exam duration: 180 minutes How to prepare: The recommended experience prior to taking the exam is two or more years of experience in using AWS services to design and implement cloud solutions Cisco Certified Internetwork Expert (CCIE) Data Center The Cisco CCIE Data Center certification enables individuals to demonstrate advanced skills to plan, design, deploy, operate, and optimize complex data center networks. They will gain comprehensive expertise in orchestrating data center infrastructure, focusing on seamless integration of networking, compute, and storage components. Other skills gained include building scalable, low-latency, high-performance networks that are optimized to support artificial intelligence (AI)

Netskope expands SASE footprint, bolsters AI and automation

Netskope is expanding its global presence by adding multiple regions to its NewEdge carrier-grade infrastructure, which now includes more than 75 locations to ensure processing remains close to end users. The secure access service edge (SASE) provider also enhanced its digital experience monitoring (DEM) capabilities with AI-powered root-cause analysis and automated network diagnostics. “We are announcing continued expansion of our infrastructure and our continued focus on resilience. I’m a believer that nothing gets adopted if end users don’t have a great experience,” says Netskope CEO Sanjay Beri. “We monitor traffic, we have multiple carriers in every one of our more than 75 regions, and when traffic goes from us to that destination, the path is direct.” Netskope added regions including data centers in Calgary, Helsinki, Lisbon, and Prague as well as expanded existing NewEdge regions including data centers in Bogota, Jeddah, Osaka, and New York City. Each data center offers customers a range of SASE capabilities including cloud firewalls, secure web gateway (SWG), inline cloud access security broker (CASB), zero trust network access (ZTNA), SD-WAN, secure service edge (SSE), and threat protection. The additional locations enable Netskope to provide coverage for more than 220 countries and territories with 200 NewEdge Localization Zones, which deliver a local direct-to-net digital experience for users, the company says.

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle