Boxplots in R: Manual, ChatGPT/AI, and Hybrid Approaches in the AI Era
If you’re a data analyst or scientist, your workflow in 2025 probably looks a lot different than it did just a few years ago. The buzz around AI coding assistants like GitHub Copilot and large language models like Gemini or ChatGPT isn’t just hype—it’s a seismic shift in how we translate ideas into code. This is especially true in data visualization, where the goal is to move from raw data to clear insights as efficiently as possible.
But this new power brings new questions. While AI promises to generate a complex plot with a single prompt, what do we sacrifice in the process? Are we giving up the precision, customizability, and deep understanding that comes from meticulously crafting a plot by hand? When should you trust an AI with your data visualization, and when is it better to roll up your sleeves and take full control?
In this post, we’ll compare three distinct workflows for creating a classic yet powerful visualization—the boxplot. Using the well-known mpg dataset in R, we will explore:
- Manual ggplot2 Boxplot in R (Reordered & Flipped): Using ggplot2 to build a plot from scratch, giving us ultimate control over every detail.
- The Pure AI Approach: Using a direct prompt to an AI model to generate the entire plot, testing its speed and capabilities.
- The Hybrid Approach: Leveraging AI for the initial “heavy lifting” and then manually refining the code to create a perfect, presentation-ready graphic.
By the end of this practical comparison, you’ll have a clear framework for deciding which approach—the artisan’s, the assistant’s, or the pragmatist’s—is best for your specific task, skill level, and goals.
Manual Boxplot in R with ggplot2 (The Artisan’s Way or the Old Way 🙂 )
This section establishes our “ground truth.” It’s the traditional method that guarantees full control and deepens your understanding of the plotting library. This is our quality benchmark against which the other methods will be compared.
1. Setup: Using the mpg Dataset in R
First, let’s set up our environment. We’ll be working in R and using the powerful ggplot2 library. Our data will be the mpg dataset, which is included with ggplot2 and contains fuel economy data for various car models.
Our specific goal is to create a boxplot that examines the distribution of city fuel efficiency (cty) across different vehicle classes (class). We want the final plot to be clear, sorted for easy comparison, and professionally labeled.
Let’s start by loading the library:
library(ggplot2)
theme_set(theme_bw(16))
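Before plotting, it can help to glance at the two columns we'll be working with. The snippet below is a small sketch for that purpose only; the name `class_counts` is ours and is not used elsewhere in the post.

```r
library(ggplot2)

# Peek at the two columns used throughout this post
head(mpg[, c("class", "cty")])

# How many cars fall into each vehicle class
class_counts <- table(mpg$class)
class_counts
```

Classes with only a handful of cars will produce boxes built from very few observations, which is worth keeping in mind when comparing them.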
2. Basic Boxplot in ggplot2
The “manual” approach means we build the plot iteratively. We start with the most basic version to get a feel for the data’s structure.
# Create a basic boxplot
# vehicle types vs city mileage
mpg |>
  ggplot(aes(x = class, y = cty)) +
  geom_boxplot()
This gives us a functional but unrefined boxplot. The vehicle classes are arranged alphabetically, not in a way that helps comparison, and the labels are just the raw variable names. This is our raw material.
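If you'd like to see the numbers behind each box, here is a rough sketch that computes the quartiles of `cty` per class with base R. The name `class_quartiles` is just an illustrative choice, and the values should line up with the box edges and middle lines in the plot, assuming the default quantile method.

```r
library(ggplot2)

# Lower quartile, median, and upper quartile of city mpg for each class
class_quartiles <- sapply(
  split(mpg$cty, mpg$class),
  quantile,
  probs = c(0.25, 0.5, 0.75)
)

# One column per vehicle class, one row per quartile
round(class_quartiles, 1)
```

The columns come out in alphabetical order, which is the same default ordering the basic plot uses on its axis.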
3. Refining the Boxplot: Reorder, Flip, and Style for a Presentation-Ready Plot
Now, we’ll apply our expertise to make this plot insightful and readable. This is where the artisan’s touch comes in:
- Reorder for Insight: We’ll reorder the vehicle classes based on their median fuel efficiency. This simple change makes it much easier to see which classes are more or less efficient.
- Improve Aesthetics: We’ll add color for visual appeal and highlight outliers.
- Add Clear Labels: We’ll provide a descriptive title and axis labels.
- Flip Coordinates: Since the x-axis labels are long and could overlap, we’ll flip the coordinates to make them horizontal and easy to read.
- Apply a Clean Theme: theme_minimal() provides a clean, professional look without a distracting background.
Here is the commented code for our final, manually-crafted plot:
# Create a polished, presentation-ready boxplot
mpg |>
  ggplot(aes(x = reorder(class, cty, FUN = median), y = cty)) +
  geom_boxplot(fill = "skyblue", color = "black", outlier.colour = "red") +
  coord_flip() + # Flip coordinates for readable labels
  labs(
    title = "City Fuel Efficiency by Vehicle Class",
    subtitle = "Higher MPG indicates better fuel efficiency",
    x = "Vehicle Class",
    y = "City Miles Per Gallon (MPG)"
  ) +
  theme_minimal()
This final plot is clear, insightful, and ready for a report.
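To see what reorder() is doing inside aes(), you can run it on its own. This is a minimal sketch; `class_medians` and `ordered_class` are names we made up for illustration.

```r
library(ggplot2)

# Median city mpg for each vehicle class, sorted from lowest to highest
class_medians <- tapply(mpg$cty, mpg$class, median)
sort(class_medians)

# reorder() returns a factor whose levels follow those medians,
# and ggplot2 draws discrete axes in level order
ordered_class <- reorder(mpg$class, mpg$cty, FUN = median)
levels(ordered_class)
```

The level order printed at the end is the order in which the classes appear along the axis of the polished plot.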
4. Pros and Cons of Manual Boxplots in ggplot2
- Pros: Total control over every pixel and label; the process deepens your understanding of the library’s grammar; quality is guaranteed to match your vision.
- Cons: Can be time-consuming; requires specific knowledge of functions and arguments (like reorder); has a steeper learning curve.
ChatGPT/AI-Generated Boxplot in R with ggplot2 (The Assistant’s Way)
Now, let’s switch hats from artisan to director. Instead of writing the code ourselves, we will instruct an AI assistant to write it for us. The quality of the output is directly tied to the quality of our prompt. A great prompt is specific, provides context, and clearly defines the desired outcome.
Our goal is to replicate the final, polished plot from the manual section. Therefore, our prompt must include all the key refinements we made.
1. The Prompt: How to Ask AI/ChatGPT for ggplot2 Code
We’ll provide a detailed prompt to a generative AI model (like Gemini, ChatGPT, or Copilot). Notice how we translate each of our manual refinements into a clear instruction.
Prompt:
Using the built-in `mpg` dataset in R with the `ggplot2` library, create a boxplot.
The plot should show the distribution of city fuel efficiency (`cty`) for each vehicle `class`.
- The y-axis should be `cty` and the x-axis should be `class`.
- Reorder the classes on the x-axis based on the median `cty` in ascending order.
- Flip the coordinates so the vehicle classes are on the vertical axis.
- Set the fill color of the boxplots to “skyblue”.
- Add the following labels: title should be “City Fuel Efficiency by Vehicle Class”, subtitle should be “Higher MPG indicates better fuel efficiency”, x-axis should be “Vehicle Class”, and y-axis should be “City Miles Per Gallon (MPG)”.
- Finally, apply `theme_minimal()`.
- Add which AI model created the code at the bottom right corner.
Please provide the complete, runnable R code.
2. AI/ChatGPT’s Response: R Code for Boxplot with ggplot2
After processing the prompt, ChatGPT (GPT-5) generated the following code, which closely matches what we wrote before. Other AI models would likely produce very similar code:
# ChatGPT generated R code for Boxplot
# Load library
library(ggplot2)
# Create the boxplot
ggplot(mpg, aes(x = reorder(class, cty, FUN = median), y = cty)) +
  geom_boxplot(fill = "skyblue") +
  coord_flip() +
  labs(
    title = "City Fuel Efficiency by Vehicle Class",
    subtitle = "Higher MPG indicates better fuel efficiency",
    x = "Vehicle Class",
    y = "City Miles Per Gallon (MPG)",
    caption = "Code generated by GPT-5"
  ) +
  theme_minimal() +
  theme(
    plot.caption = element_text(hjust = 1, face = "italic", size = 8) # bottom right
  )
And here is the boxplot produced by the ChatGPT-generated code. It matches our manual plot apart from the outlier color and the added caption.
3. Result Analysis: How Well Did AI Perform?
The result is impressive. In a matter of seconds, ChatGPT generated code that closely matches our final, manually crafted version. It correctly interpreted the need for reorder(), applied all the labels, flipped the coordinates, and used the correct theme. The only differences trace back to the prompt: we never asked for black outlines or red outliers, so the code leaves them at their defaults, and we did ask for the caption. For a clearly defined task, ChatGPT performed exceptionally well, saving a significant amount of time.
4. Pros and Cons of AI/ChatGPT-Generated Boxplots
- Pros: Incredible speed—reduces development time from minutes to seconds; highly accessible for those who don’t know the exact syntax; great for rapid prototyping and generating boilerplate code.
- Cons: Heavily dependent on prompt quality; can be a “black box” where you don’t fully understand the code generated; may produce code that is subtly incorrect or suboptimal; risks becoming a crutch that hinders learning.
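One way to guard against the “subtly incorrect” risk is to check the numbers behind an AI-generated plot rather than trusting how it looks. The sketch below uses ggplot2's layer_data() helper to pull out the computed boxplot statistics; the names `p_ai`, `box_stats`, and `medians_ascending` are ours, and this is only one possible check, not a complete review.

```r
library(ggplot2)

# The core of the AI-generated plot (labels and theme omitted for brevity)
p_ai <- ggplot(mpg, aes(x = reorder(class, cty, FUN = median), y = cty)) +
  geom_boxplot(fill = "skyblue") +
  coord_flip()

# layer_data() returns the values ggplot2 computed for a layer;
# for a boxplot, `middle` holds each box's median
box_stats <- layer_data(p_ai, 1)
box_stats <- box_stats[order(as.numeric(box_stats$x)), ]

# TRUE if the medians never decrease along the axis,
# which is what the prompt's "ascending order" instruction asked for
medians_ascending <- !is.unsorted(box_stats$middle)
medians_ascending
```

If this comes back FALSE, the reordering did not do what the prompt asked, even if the code ran without errors.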
Hybrid Workflow for Boxplots in R (The Pragmatist’s Way)
We’ve seen the total control of the manual method and the stunning speed of the AI approach. In practice, the most effective workflow often lies in the middle. The hybrid approach uses AI for what it does best—generating boilerplate code quickly—and reserves human expertise for what truly matters: refinement, nuance, and ensuring the final visualization tells the right story.
The workflow is a simple two-step process: Generate, then Refine.
Step 1: Generate a Boxplot with a Simple Prompt to ChatGPT
Instead of crafting a perfect, detailed prompt, we’ll give the AI a simpler, high-level request. Our goal isn’t the final product, but a solid foundation. This saves us from typing out the basic ggplot() structure.
Simple Prompt:
“Using R and the ggplot2 library, create a boxplot of the `cty` variable by `class` from the built-in `mpg` dataset.”
The AI will quickly return a functional, unrefined plot—very similar to the first draft we created manually.
# AI-generated starting point
library(ggplot2)
ggplot(mpg, aes(x = class, y = cty)) +
  geom_boxplot()
Step 2: Refine the Boxplot Code with Human Expertise
This code is correct, but it’s not insightful. Now, we take over from the AI. We read the generated code to ensure it’s correct, and then we apply our knowledge to enhance it. This is the exact same refinement process as the manual approach, but we’ve saved ourselves the initial setup time.
# We take the AI's code and add our manual refinements
ggplot(mpg, aes(x = reorder(class, cty, FUN = median), y = cty)) + # Our expertise: reorder for insight
  geom_boxplot(fill = "skyblue", color = "black", outlier.colour = "red") + # Our expertise: add aesthetics
  coord_flip() + # Our expertise: improve readability
  labs(
    title = "City Fuel Efficiency by Vehicle Class",
    subtitle = "Higher MPG indicates better fuel efficiency",
    x = "Vehicle Class",
    y = "City Miles Per Gallon (MPG)"
  ) + # Our expertise: add clear labels
  theme_minimal() # Our expertise: apply a clean theme
3. Analysis of the Result
This approach feels like a partnership. The AI acts as a junior developer, handling the initial, slightly tedious task of setting up the plot. We then step in as the senior developer, using our experience to elevate the basic draft into a polished final product. This leverages the strengths of both human and machine.
4. Pros and Cons of the Hybrid Approach
- Pros: Highly efficient, balancing speed and quality; maintains full creative control for the final output; encourages learning by forcing you to read and understand the AI’s code before making changes.
- Cons: Requires enough expertise to identify where the AI’s code is weak or can be improved; can be inefficient if the AI’s initial code is fundamentally wrong, forcing a restart.
Want to learn what makes a good prompt for ggplot2 boxplots in R? Check out 10 ChatGPT Prompts for ggplot2 Boxplots: Complete Guide with Working R Code.
Manual vs AI vs Hybrid: Which Boxplot Workflow Should You Choose?
We’ve explored three distinct paths to creating the same boxplot, each with its own philosophy and trade-offs. We went from the meticulous control of the Manual approach, to the incredible speed of the Pure AI method, to the balanced efficiency of the Hybrid workflow.
So, which one is best? The answer, of course, is: it depends on your goal.
A Quick Note: The Real-Time “Copilot” Workflow
Beyond the three distinct approaches we’ve detailed, there’s a fourth, more integrated method worth mentioning: using a real-time coding assistant like GitHub Copilot directly within your editor.
This isn’t about giving a prompt and waiting for code. Instead, it’s an interactive partnership while you type. The workflow looks like this:
- You begin writing your code manually, for instance, typing ggplot(mpg, aes(x = class, y = cty)) +.
- As you type, the AI anticipates your next step and offers a suggestion as greyed-out “ghost text” (e.g., it might suggest the entire geom_boxplot() line).
- You can press ‘Tab’ to accept the suggestion, modify it, or simply ignore it and continue typing your own code.
This “Copilot” method is essentially a supercharged version of the Manual and Hybrid approaches. It doesn’t break your coding flow by making you switch to a separate chat window. Instead, it acts as an intelligent autocomplete, reducing the cognitive load of remembering exact syntax and speeding up the entire process. It’s the artisan with a power tool, blending human intent with AI assistance seamlessly and in real-time.
Comparative Summary: Speed, Control, and Learning
To make the comparison clear, let’s summarize the three approaches across a few key criteria.
| Criterion | Manual Approach | Pure AI Approach | Hybrid Approach |
|---|---|---|---|
| Speed | Slowest | Fastest | Fast |
| Control & Quality | Highest | Variable (Depends on prompt) | High (Final control is yours) |
| Learning & Skill Growth | Best for deep learning | Lowest (Risk of dependency) | Good for learning by example |
| Required Expertise | High | Low (For basic plots) | Medium to High |
Recommendation: Choosing the Right Approach for You
- If you are a beginner… Start with the Manual Approach. There is no substitute for building foundational knowledge of ggplot2 syntax and the “grammar of graphics.” Use the Hybrid Approach on the side to see how an expert (or an AI) might solve the same problem, which can accelerate your learning.
- If you are an experienced analyst… The Hybrid Approach should be your default workflow. It respects your time by automating the tedious parts of coding while ensuring your expertise is applied to the final, crucial refinements. It’s the ultimate pragmatic choice.
- If you are rapid prototyping… Use the Pure AI Approach. When you just need to “see the data” quickly to check a hypothesis and the final polish doesn’t matter, nothing beats the speed of a direct prompt.
Final Thoughts: AI as Your Data Visualization Copilot
The rise of AI reshapes coding skills. By learning to be a skilled “director” of your AI assistant, you can spend less time wrestling with syntax and more time on what matters most: uncovering the stories hidden within your data.