An epidemiologist uses P2C (prompt-to-code) to analyze complex data.

A few years into practice, an epidemiologist I’ll call Elena found herself living a familiar contradiction. She had strong ideas, good data, and a clear sense of what “better” analysis could look like. She also had a calendar full of meetings, stakeholders waiting for answers, and a laptop full of half-finished scripts.

Elena did what many of us do when time gets tight. She narrowed the question until it fit the tools she could reliably execute. She simplified. She made peace with the basic model she learned in school. The analysis ran, the table exported, the slide deck went out.

But Elena knew what she was leaving on the table.

“I’m not avoiding better methods because I don’t understand them. I’m avoiding them because I can’t afford the implementation.”

Elena’s Daily Challenges

The barrier wasn’t the science. Elena could define outcomes, defend assumptions, and explain confounding. What kept pushing her toward the simplest analysis was the friction of code: repetitive reshaping, plotting, formatting, and the small errors that can swallow an afternoon.

The consequences showed up in predictable ways. Reviewer questions that required extra sensitivity analyses became hard to answer quickly. Messy datasets slowed momentum. “Standard approach” became a stand-in for “good enough to survive the deadline.”

The frustrating part was not what she knew. It was what she could deliver on time.

Using P2C

Elena tried something that felt, at first, almost like cheating. Instead of starting with a blank script, she started with a paragraph.

She wrote what she wanted in plain language, the way she would describe methods to a colleague: the dataset structure, unit of analysis, outcome definition, time scale, model family, covariates, robustness checks, and what the output should look like. She asked an AI assistant to draft the code. Then she ran it, broke it, corrected assumptions, and iterated.

That workflow is what I call P2C: prompt-to-code.

In this post, I use P2C to mean using a large language model to generate and refine software code from natural-language prompts. This is not qualitative coding. This is not outsourcing interpretation. It is a way to reduce the engineering drag that often prevents epidemiologists from using methods they already understand.

[Figure: flowchart of the P2C workflow]

Experimental Example: The First P2C Win

Elena’s first experiment was modest. She was working with quarterly surveillance data and wanted a clean descriptive pipeline. In the past, she would copy fragments from older projects and adapt them—sometimes successfully, sometimes not.

This time, she asked for a script that validated the data, checked missingness, generated a trend plot, and exported a formatted summary table.

The first draft was imperfect. A column name was wrong. A date format assumption didn’t match her data. A denominator mistake would have quietly distorted rates. Yet the difference was that she was no longer stuck at the starting line. She had something tangible to critique.

She responded with clear corrections: rates should be per 100,000; quarters are YYYY-Q; drop rows with missing population; treat negative counts as errors. Each iteration tightened the tool.
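The corrections Elena fed back can be sketched as a small validation-and-rate step. This is a minimal illustration, not her actual script; the column names (`quarter`, `cases`, `population`) and the `clean_surveillance` function are assumptions for the example:

```python
import pandas as pd

def clean_surveillance(df: pd.DataFrame) -> pd.DataFrame:
    """Apply Elena's rules: quarters are YYYY-Q, rows with missing
    population are dropped, negative counts are errors, and rates
    are expressed per 100,000."""
    # Quarters must look like "2024-Q1"
    bad = ~df["quarter"].str.fullmatch(r"\d{4}-Q[1-4]")
    if bad.any():
        raise ValueError(f"Malformed quarters: {df.loc[bad, 'quarter'].tolist()}")

    # Negative counts are treated as data errors, not values to keep
    if (df["cases"] < 0).any():
        raise ValueError("Negative case counts found")

    # Drop rows with missing population, then compute the rate
    df = df.dropna(subset=["population"]).copy()
    df["rate_per_100k"] = df["cases"] / df["population"] * 100_000
    return df
```

Each rule that would otherwise distort results silently is turned into a loud failure, which is exactly the kind of tightening each iteration produced.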

Within a short span, she had a reusable script that would have taken far longer to build from scratch.

“The model didn’t ‘do the analysis.’ It helped me get past the parts that usually trap me before the science even begins.”

What Changed for Elena

Once she had a dependable foundation, Elena stopped negotiating with the limitations of time and brittle scripts. She began asking questions she had been avoiding—not because they were too advanced, but because they were too expensive to implement. She revisited model choices she learned during training but rarely used under pressure. She added sensitivity checks and produced outputs that were easier to audit and defend.

P2C didn’t make her a better epidemiologist by giving her new theory. It made her a better epidemiologist by removing a bottleneck between theory and practice.

Best Practices for Implementation

P2C is powerful, but it only works responsibly if the generated code earns trust. A script that runs is not the same as a script that is correct.

Here’s what helped Elena most.

Prompting tips that produce usable code

When you write your prompt, include the essentials up front:

  1. Define the data schema clearly (file type, column names, coding conventions).

  2. State your unit of analysis and time index (person, facility, county; week/month/quarter; date formats).

  3. Write explicit definitions (outcome, exposure, exclusions, missingness rules).

  4. Specify outputs (tables, plots, file formats, labels, rounding rules).

  5. Require error checks (assertions like population > 0; counts ≥ 0; dates that parse to a valid, unambiguous format).
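The error checks in point 5 can be as simple as a fail-fast function run on every row. A minimal sketch, with hypothetical field names, using only the Python standard library:

```python
from datetime import date

def check_row(population: float, cases: int, report_date: str) -> None:
    """Fail loudly on impossible values instead of letting them
    silently distort rates downstream."""
    assert population > 0, f"population must be positive, got {population}"
    assert cases >= 0, f"counts cannot be negative, got {cases}"
    # ISO dates (YYYY-MM-DD) parse unambiguously; anything else raises
    date.fromisoformat(report_date)
```

Asking the model to include checks like these in its first draft costs one sentence in the prompt and saves the afternoon a silent error would have swallowed.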

Validation practices to avoid “polished-but-wrong”

A minimal set of checks that should become routine:

  • Run the pipeline on a tiny toy dataset where you know the correct answers.

  • Compare outputs against a trusted baseline (your prior scripts or a manual calculation).

  • Add sanity checks (rates within plausible ranges, totals match expectations).

  • Keep a reproducible environment (pin package versions; save session info).

  • Document assumptions in the script header like a mini methods section.

Responsible Use in Public Health Settings

P2C also raises practical governance issues. If you work with restricted or sensitive data, keep the workflow compliant: use de-identified or synthetic samples when developing and debugging. Keep code generation separate from data access when required by policy. Treat outputs as draft until reviewed.

Bottom Line

Many epidemiologists already know more than they routinely deploy. We understand the limitations of oversimplified models. We can explain why sensitivity analyses matter. But we stop at the edge of implementation because the engineering cost is high.

P2C offers a bridge across that edge.

Used responsibly, prompt-to-code helps turn “I learned this in school” into “I can run this today,” shifting time away from wrestling syntax and toward study design, validity, and interpretation.

Call to Action

If you’ve tried P2C workflows in public health—whether for code, writing, or project management—I’d like to hear what worked and what didn’t. Email me at info@aipublichealthupdate.com.

This post was drafted with assistance from OpenAI’s GPT-5.2 Thinking (ChatGPT). I provided the concept, structure, and final editorial decisions, and I reviewed and edited the final text for accuracy, clarity, and tone.