Developing taste
The GAN theory of picking great ideas
I’m obsessed with taste, i.e., the skill of telling great ideas from bad ones, and have written substantively about its downstream consequences for research idea development: Idea triage and the haunt filter, How to develop great ideas, and You and Your Research [1].
Now I am taking a different stab at the problem by focusing on taste itself and how to develop it. That framing already gives away what I now think of the widespread view that taste is an intangible skill that cannot be learned and simply appears, as if by the wave of a magic wand.
For a long time, this is what I believed, or hoped: that if I put in the work, I would have great taste. It felt like blind trust: blind in the sense that I had no evidence it would work out for me, though not entirely without evidence, since I had seen people with great taste.
This is not a bad thing per se: as in many cases of Dunning-Kruger, when you’re a novice, you don’t know what you don’t know, so the best thing you can do is find a mentor or guide who can show you the path and whom you can trust.
This still leaves the process implicit, presumably leaving a lot on the table.
The question I keep circling back to: can’t we do this better? Can’t we open up the black box of developing taste, so that we can learn and teach it more effectively?
People do develop taste somehow, which means there has to be a way to teach it, even if implicitly. On reflection, what I am doing now when I supervise students (and, looking back, what my supervisors and more senior colleagues have been doing) is teaching taste.
The systemic problem in research—among others—is that taste is not taught explicitly. You pick it up somehow, but no one seems to ask whether this is the most efficient or effective way.
What I will present here is a compilation of my observations, my experience, and my understanding of the latest research, with an additional, extremely pressing contextualization of what needs to change as genAI tools enter knowledge work.
To get to the bottom of it, we first need to discuss the different types of skills, how they can be learned, and how they affect one another.
Generative vs discriminative skills
We need to distinguish between two types of expertise: doing (generative) and judging (discriminative). Taste is a discriminative skill. This means we actually don’t need the generative part to develop taste, right? Yes and no.
To understand the nuance, let’s use an analogy from machine learning: the Generative Adversarial Network (Goodfellow et al. 2014). A GAN is two neural networks paired together: a generator trying to produce images that fool the discriminator into believing they’re real, and a discriminator trying to tell real from generated. Training becomes an arms race: the discriminator gets sharper at spotting fakes (i.e., developing taste), and the generator gets better at producing images harder to judge.
For argument’s sake, let’s imagine the extremes (assuming no external feedback):
No discriminative skill: you produce and ship, but there’s no feedback. This leads to slop (irrespective of whether you’re using AI) and stagnation, since you don’t have the signal to improve your generative skills.
Only discriminative skill: you can’t make, but you can recognize. You can coach, edit, judge, but probably only the result. You are not able to tell someone how to improve their work, only whether it’s great or not.
Both: The discriminator closes the feedback loop; thus the generator can improve. The catch: a feedback loop only works if you actually take the feedback.
The trap is having generative skill without the discriminative one.
You’ll mistake your output for good output — because your judgement is not calibrated and because you’re attached to your own ideas. The ease of producing something can also be mistaken for a sense of its quality, and genAI is skewing this picture.
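To make the extremes concrete, here is a minimal toy sketch in Python. It is not a GAN: the "generator" just proposes small variations, and the "discriminator" is an idealized judge with direct access to a hidden quality score. The names `hidden_quality` and `make_and_ship`, the target value 3.0, and the step size are all assumptions made up for illustration.

```python
def hidden_quality(x: float) -> float:
    """Ground-truth quality of an output (assumed); the best output is x = 3.0."""
    return -(x - 3.0) ** 2


def make_and_ship(rounds: int, use_discriminator: bool) -> float:
    """Run a toy make-and-judge loop and return the final output."""
    current = 0.0
    step = 0.5
    for i in range(rounds):
        # Generator: propose a variation, alternately above and below.
        candidate = current + step if i % 2 == 0 else current - step
        if use_discriminator:
            # Discriminator: keep the candidate only if it is judged better.
            if hidden_quality(candidate) > hidden_quality(current):
                current = candidate
        else:
            # No judgement: whatever was produced last gets shipped.
            current = candidate
    return current


with_taste = make_and_ship(rounds=20, use_discriminator=True)
without_taste = make_and_ship(rounds=20, use_discriminator=False)
print(with_taste, without_taste)  # prints: 3.0 0.0
```

Both runs produce the same number of candidates. Only the run with a judge in the loop converges on the best output; the other keeps producing and ends up where it started.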
You don’t always need to make to know what’s good
The observation that started putting cracks in my “taste always comes later” belief was hearing stories about people with exceptional taste who have no expertise whatsoever in making the stuff they judge. And judge they can. Rick Rubin, the producer behind many of this era’s music stars, is the canonical example, alongside literary critics, food critics, and many sports coaches.
It’s not black and white, though. Many sports coaches were competitive athletes first; they know what athletes should feel and what to pay attention to. Fred Duncan recently encouraged coaches in his newsletter to sprint themselves. To say the least, there is a confounder.
In research, the data might suggest the same: look at any group leader, PI, or professor, and you’ll find they were doing research first. But this is classic post hoc ergo propter hoc: just because it came first doesn’t make it the cause.
Turning to the research, the belief turns out to be in part a mirage. You can train discriminative skills across many settings, though, as always, the specifics of each study matter.
You can provide multiple solutions and ask people to judge them, and evaluative judgement turns out to be directly trainable, not a by-product of producing. Students sharpen their own competence by assessing peers (Tai et al. 2018; Ibarra-Sáiz et al. 2020); comparison shifts attention to higher-order features before anyone can produce expert work (Bouwer et al. 2018); and diagnostic skill trains directly, from spotting fractures in radiographs (Chen et al. 2017) to critical paper reading (Yuan et al. 2023). The question is old: Stratton & Brown (1972) already trained creative thinking by the judgment of solutions, not only their production. If discriminative skills can be learned in isolation, then the conventional generative-first timeline is only a convention. Instruction and feedback can substitute for generative reps, as long as the environment is kind enough that the signal is legible.
This doesn’t make generative skills useless, though. First, we should be clear about what we want discriminative skill for: only to tell bad results from great ones? Or to help others develop theirs? Telling someone that what they produced is bad isn’t the most helpful thing you can do. In reinforcement-learning terms it’s a sparse reward on the outcome: it doesn’t tell you how to get there. The fix is a process reward model: feedback throughout the process, not just on the outcome. A literary critic probably can’t give you that, because it requires generative skills too — or at least a deep familiarity with the process.
This is documented in the literature as the recognition-versus-repair gap [2]. People can fix an error once it’s pointed out, yet fail to spot it on their own (Hacker et al. 1994). Detecting a flaw is a different skill from correcting it (Grosse & Renkl 2007). Tellingly, students who evaluate AI solutions often catch that something is wrong but rarely reconstruct the right principle (Dickey et al. 2026). Judging the output can be trained rather cheaply and without generative skills; but guiding others through the process heavily benefits from having done the thing yourself.
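The difference between a sparse outcome reward and process-level feedback can be sketched in a few lines of Python. This is only an illustration: a real process reward model is a learned scorer, whereas here an exact match against a hypothetical reference process stands in for it. `REFERENCE`, `outcome_reward`, and `process_rewards` are names invented for this sketch.

```python
from typing import List

# Hypothetical reference process a mentor has in mind (an assumption).
REFERENCE = ["question", "hypothesis", "experiment", "analysis", "writeup"]


def outcome_reward(steps: List[str]) -> float:
    """Sparse signal: 1.0 only if the whole process matches, else 0.0."""
    return 1.0 if steps == REFERENCE else 0.0


def process_rewards(steps: List[str]) -> List[float]:
    """Dense signal: one score per step, so the weak step is visible."""
    return [1.0 if s == r else 0.0 for s, r in zip(steps, REFERENCE)]


# An attempt that skips the experiment and jumps straight to analysis.
attempt = ["question", "hypothesis", "analysis", "analysis", "writeup"]
sparse = outcome_reward(attempt)
dense = process_rewards(attempt)
weak_steps = [i for i, score in enumerate(dense) if score == 0.0]
print(sparse, dense, weak_steps)  # prints: 0.0 [1.0, 1.0, 0.0, 1.0, 1.0] [2]
```

The outcome reward only says "not great". The per-step scores point at the third step, which is the kind of feedback that usually requires knowing the process from the inside.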
How I learned and now teach taste
When I started my PhD, I had generative skills (I could write and run experiments) and almost no discriminative ones. So I produced. The writing was sloppy, and the figures didn’t make sense. Then came the feedback: from advisors, from reviewers, from collaborators. It often hurt a lot — but the struggle of leaning into feedback, especially when it hurts, is what taught me to do things better and to become more aware of my blind spots.
The feedback givers didn’t always tell me how to write better directly — sometimes they did, especially when I felt lost. What they most often did was give me an (implicit) objective function — this is what good looks like — and I had to figure out how to get there.
Once I had the objective function, I could run the eval loop myself. I’d write a paragraph, reread it, and ask: Does this match what good looks like? I learned to see my own work the way a critic would. That’s how you develop taste: you mimic someone else’s objective function of what counts as great. It becomes a meta-loop:
How to improve your taste
1. You copy someone else’s taste (it won’t be a perfect copy).
2. You follow that copycat taste to produce something.
3. You bring what you produced back to the same person and ask for feedback: a signal of how good your copy of their taste was.
4. Repeat.
This resembles the iterative, two-stage structure of many learning algorithms, such as Expectation-Maximization or GAN training: you alternate between estimating taste and improving your output against your current estimate of it.
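As a loose sketch of that alternation (neither actual Expectation-Maximization nor GAN training), the loop can be written as two updates per round. Everything here is a simplifying assumption: taste is collapsed into a single number, `MENTOR_TASTE` is hidden from the learner, `mentor_feedback` returns clean directional feedback, and `copy_rate` and `effort` are made-up parameters for how imperfectly you copy and how much you improve per round.

```python
from typing import Tuple

MENTOR_TASTE = 7.0  # hypothetical target, hidden from the learner


def mentor_feedback(output: float) -> float:
    """Directional feedback: how far, and in which direction, the output is off."""
    return MENTOR_TASTE - output


def apprenticeship(rounds: int, copy_rate: float = 0.5,
                   effort: float = 0.5) -> Tuple[float, float]:
    """Alternate between estimating the mentor's taste and improving the output."""
    estimate = 0.0  # the learner's copy of the mentor's taste
    output = 0.0    # what the learner currently produces
    for _ in range(rounds):
        # Stage 1: update the taste estimate from feedback on the current output.
        implied_target = output + mentor_feedback(output)
        estimate += copy_rate * (implied_target - estimate)
        # Stage 2: improve the output against the current estimate.
        output += effort * (estimate - output)
    return estimate, output


print(apprenticeship(rounds=1))  # prints: (3.5, 1.75)
estimate, output = apprenticeship(rounds=20)
```

After one round the copy is still far off; after twenty rounds both the estimate and the output sit close to the mentor's taste. Real feedback is noisier and rarely numeric, so treat this as a picture of the loop's shape, not a model of it.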
Importantly, step 2 doesn’t require that you produce the thing being judged. That’s how it usually gets done, but it doesn’t have to be that way!
Producing is often slow, so the feedback is sparse. If instead you could judge work that others (or an AI) generated, by the dozen, you’d develop taste far faster. That’s my hypothesis, based on my reading of the research and how this has worked for me.
The most common mistake I see in PhD students, my early self included, is refusing to talk to a supervisor until they have something in hand: a proof, experimental results. But what they bring will probably be low quality, precisely because they lack the taste to tell. Getting feedback earlier, and being willing to kill your darlings, leaves more room to execute on the great ideas and to build the taste in the first place. Otherwise this can become a vicious circle: without taste, the work stays weak; weak work makes you reluctant to show it; and without showing it, you get no feedback to build the taste.
You need dense feedback to improve your taste more efficiently. This requires exposing yourself to frequent failure.
One caveat: what your feedback giver considers good is not necessarily the gold standard now or in the future, so surround yourself wisely. But that’s how your taste gets built.
While it’s happening, the process feels magical and you can’t put your finger on how it actually works. Reflection helps a lot (many thanks to Bob Williamson for encouraging me to journal on the experience of becoming a scientist). But what cracked the code for me was doubling down on teaching it, as teaching forces you to make tacit knowledge explicit.
On a project I’m supervising with a student, I caught myself making explicit statements about what a good paper looks like (see Improving clarity in scientific papers). I’m generally against the unconditional acceptance of tradition, but sharing what’s expected from a paper — its structure, for instance — makes it easier for readers to navigate. As I argued in Science as a debate, the writing of a paper isn’t about science; it’s about convincing.
Why this matters more now
A common path to becoming a manager is going through the technical contributor stage first. You built things, you know what you want, you can tell when someone hands you great work.
With AI agents, you’re the manager, except the barrier to entry is so low that you can skip the contributor stage entirely. You can ship code, papers, designs before you’ve ever developed the discriminator that would tell you whether what you shipped is any good.
The widening gap between producing polished work and judging whether it’s sound is the problem Bearman et al. (2024) address under the heading of developing evaluative judgement for a time of generative AI. AI assistance can speed skill decay without your noticing, because assisted performance stays high while the underlying capability erodes (Macnamara et al. 2024); early-career researchers already cut verification under time pressure (Wang et al. 2024). The automation literature predicted the same out-of-the-loop failure decades ago (Endsley & Kiris 1995; Parasuraman et al. 1993).
What about the generative skills, then?
Losing, or not developing, some generative skill isn’t bad per se. This is where I think Cal Newport’s digital deskilling argument partially overshoots — at least if I understand it correctly. Is it inherently valuable to be able to write TikZ code? To write API calls without autocomplete?
In (at least) three cases, yes:
When that’s the only way to produce something you can judge;
When you need to know how to do it well because you want to teach it, and want to make the feedback dense;
When you get fulfilment out of doing the thing—even if it’s not the most efficient.
When none of these holds, and you don’t need or want the generative skill, it’s fine to focus on the discriminative side. So say you’ve reflected on the above and want to improve your taste — faster. What do you do?
How to develop taste
Don’t underestimate the passive act of observation from the outside. In writing and photography, one of the most frequent pieces of advice a novice gets is to go out, look at great works, and dissect why they work. And then you’re urged to go produce—and to accept that most of it will be bad.
Why don’t we do the same in research, systemically?
There are PIs out there who devote time to this, but not nearly enough. You could say producing research is costly, so you only get a few reps. But you can step outside that box, e.g., you could:
Generate ideas by the dozen and discuss them with others.
Write many tentative abstracts (assuming there’s a paper behind each), and discuss which would be great work. Pick those and execute.
Study the masters [3].
Always, always try to articulate why something does or doesn’t work — this is akin to teaching, and if you can teach it, all the better.
There is research supporting these claims (again, check the specifics): comparison, peer review, and exemplars train judgement directly (Tai et al. 2018; Bouwer et al. 2018), and concrete expert examples calibrate better than abstract criteria (Froese & Roelle 2022; Gyamfi et al. 2021) — i.e., a sharper version of “study the masters.”
What does a great research paper look like?
So let me demonstrate how this could work. I’ll take one of my early papers, show what makes it not a great research paper, then highlight everything that works — so you can hear what the eval loop sounds like from the inside.
Embrace the Gap
This was my first PhD paper, and I learned an enormous amount from it, though there are things I’d do differently. The science is interesting, the experiments well-designed, and the figures nice. But there is room for improvement.
What doesn’t work
The paper has multiple messages. It’s very dense and tries to do too many things at once (reviewers noted this too). We considered splitting it into two to promote one theoretical result (interesting on its own) into a separate paper. We dropped that, but maybe we shouldn’t have: the paper is simply too long, and I doubt many people went through all the lemmata to reach the main proof.
No spectrum of details in the technical core. The theoretical writing is too dense. Granted, there’s an intuitive Figure 1 and an informal description of why things work; still, I’d push the formal theorem and assumptions to the appendix and keep an informal version up front, knowing most readers don’t want to go that deep.
Less can be more. I wasn’t strict enough about cutting from the appendix.
What works
Experiment-theory match. The experimental design that tests our theory in controlled settings (the ingenious idea of Luigi Gresele) is beautiful: it closes the theory-practice gap that is usually left open.
Clean and unique design. The figures’ color scheme and typography are distinctive (nothing fancy, but not the default palette either). This adds nothing to the science, but it adds to the presentation.
Figure 1 carries the intuition; still somewhat heavy, but far lower cognitive load than the full theory.
Robust evaluation. The claims are decently robust to setup and assumption violations (you can always run more seeds, but as far as I know, we did everything we could think of).
So, can taste be taught?
I think it can: deliberately, and much faster than the way we currently train researchers. I want to test that. If there were a program to build research taste this way, would you join? Tell me in the comments.
Resources
Christopher Olah’s Research Taste Exercises
Taste for Makers by Paul Graham
You and Your Research by Richard Hamming
Daniel Pink’s take on taste from a recent commencement speech, with a hilarious story of potatoes.
---
[1] The seminal pieces that highly influenced my thinking are Taste for Makers by Paul Graham and You and Your Research by Richard Hamming.
[2] As Neil Gaiman put it in his rules of writing: “when people tell you something’s wrong or doesn’t work for them, they are almost always right. When they tell you exactly what they think is wrong and how to fix it, they are almost always wrong.”
[3] For writing, please don’t try to emulate most academics; learn from writers and journalists. For figures, graphic designers; for code, software engineers; and so on.



