How can you best use AI?
On the one hand, AI tools like ChatGPT can slash the time it takes to write an outline, compose code, or brainstorm ideas. On the other hand, they often produce lackluster results that our human brains could have bettered. How do you know when to use AI and when to rely on your own intellect?
That’s what academics are here to answer. Harvard Business School's Technology & Operations Management Unit recently released a working paper, based on a study of 758 consultants from Boston Consulting Group, on how best to use AI tools in the workplace.
The findings paint a layered picture of what work life with AI tools like ChatGPT could look like. In the pros category, these tools supercharged consultants, helping them produce higher-quality work in less time. They even leveled the playing field between high and low performers. But there were also some cons: consultants sometimes leaned too hard on the AI, accepting its inaccurate recommendations instead of trusting their own smarts.
Here are the details you need to know to perfect your own workflow.
How Harvard put office workers to the test
First, each of the consultants was placed into one of three groups. One group had access to GPT-4 but was left to its own devices on how to use it. A second group also had access to GPT-4 and was given an overview of prompt engineering, including instructional videos and strategies for using the tool effectively. A third group had to complete the tasks with no AI tools, like all of us poor souls way back in November 2022.
The positive results
The study authors first asked the consultants to go through a typical process, from ideation to product launch, for a fictional footwear company. The work included a bunch of different parts, like generating ideas for a new shoe targeting an underserved market, segmenting the market, describing a potential prototype, and coming up with a comprehensive launch plan. The tasks were deliberately set up to mimic the kind of work they'd usually be doing day-to-day at their jobs.
The results? AI made for some pretty impressive gains.
1. Consultants using AI tools produced higher quality results.
Independent human raters graded the quality of the assignments from each participant for each task. The assignments from consultants using the AI tools were consistently rated better: quality was about 40% higher than for those who didn't use them.
2. Consultants using AI tools were faster and more productive.
AI also improved the time to complete each task and the total number of tasks completed. Consultants who used the AI tools reduced the time they spent by a quarter and finished 12% more tasks, on average.
3. Below-average performers got the biggest boost from using AI tools.
The study measured the consultants' baseline performance by having them complete a task without AI. Based on those scores, the researchers divided the consultants into low- and high-performing groups. The quality of the low performers' work increased more than that of the high performers: 43% versus 17%.
4. Low performers + AI > High performers - AI.
Even though the quality of the work by low performers using AI remained below that of high performers using AI, the use of AI turned the tables in terms of average quality. The average quality of low performers using AI (5.79 out of 7) slightly exceeded that of the high-performing group without AI (5.20 out of 7). In other words, low performers using AI edged out high performers working on their own.
5. Training helps with both quality and speed.
Receiving training on how to use AI tools helped people boost both the quality of their results and the speed at which they completed assignments. Not only did they produce better-quality work faster than those without AI, they also topped the quality and performance scores of those who used AI but weren’t given any training.
The not-so-positive results
The second experiment was deliberately built around a task outside the AI tool’s capabilities (though that fact wasn’t made obvious to participants) to see whether it would trip up the consultants.
The study authors gave the consultants a spreadsheet of data and a set of interviews, then asked them to recommend which of a company's brands the CEO should focus on to drive revenue growth. Considered separately rather than together, the information sources pointed to conflicting conclusions, so ChatGPT was primed to give the wrong recommendation.
The experimenters then assessed whether the consultants would reach the wrong conclusion as a result of the AI tool guiding them down the wrong path.
The answer? Yes, yes they would.
1. Those who used AI tools were more likely to trust the (incorrect) output
Those not using AI got the answer right 85% of the time. Those using AI tools, on the other hand, were frequently wrong, reaching the correct conclusion only 60 to 70% of the time.
2. Those with AI tools were wrong, but dangerously convincing
Even though the actual recommendations were incorrect, the quality of the output was rated higher among those who used AI tools. In other words, the recommendations were confidently incorrect, and potentially persuasive despite being built on flawed assumptions. That’s a pretty dangerous finding.
3. Training led to even more trust in the (incorrect) output
Training turned out to be a negative in this scenario: those who got a primer on how to use the AI were actually less likely to get the recommendation right. It's possible the training gave them a false sense of security, making them overconfident in the AI tools' output.
4. The results were more cookie-cutter among those using AI tools
When assessed for variability, the ideas churned out by the AI-assisted crowd were a lot more similar to each other than those coming from consultants on their own. It's likely that multiple consultants cribbed from ChatGPT's playbook, ending up with similar answers.
Takeaways: How to get good at working with AI, according to this study
This experiment has a few important takeaways for anyone using AI tools.
Know what AI is good at—and what it’s not
When picking the tasks for the experiment, the researchers chose ones that were similar in difficulty for the consultants but differed in whether they fell inside or outside the AI's capabilities.
In the words of the researchers, the frontier of tasks that technology tools can do is "jagged.” Just because a task is hard for a human doesn't mean it's hard for the AI, and the opposite is also true. Even for skilled professionals, it can be hard to know whether you've pushed an AI tool beyond its capabilities.
Things that we might consider easy, like basic math (or hitting an accurate word count), can be really hard for LLMs, while other things, like generating ideas, might be hard for us but easy for the AI. As new AI tools become available, this "frontier" is constantly shifting; it could even move backward. Thinking about what the tool can and can’t do well should always be part of your AI workflow.
Use AI with a critical eye
The tools are black boxes: you don't know how the LLM came up with the result it’s giving you. That makes it even harder to tell when the tool is giving you the truth and when it's convincingly spouting incorrect information.
Not only that, but the capabilities are inconsistent. Right now, ChatGPT and similar tools are constantly being upgraded and refined. This all means that while it's tempting to trust the authoritative-sounding content it provides, you should do so at your own risk. Part of your process should always be to critically evaluate the information it's giving to you.
Know your AI working style
During the experiment, researchers closely analyzed the consultants who used the tools most effectively and named them after their methods. Centaurs delineated their work from the AI's, delegating certain tasks to it (the same way a centaur’s human half is clearly delineated from its horse half), while Cyborgs fully integrated the AI tool into their workflow and worked with it continuously (the same way a cyborg is a human integrated with a machine). To get your workflow right, it pays to figure out which style best fits your personality and your task.
Don't know which you are? Read about the two styles here.
Conclusion
AI tools like ChatGPT are game changers, no doubt about it. These tools can supercharge productivity, level the playing field, and crank out some high-quality work. But we need to remember that if we're not careful, we can get taken for a ride with confident but incorrect output.
So, as new tools keep morphing what's possible on the technological frontier, let's not forget: it's up to us to steer the ship, not just go along for the ride.