Generative AI and Business Technology | Harvard Business School AI Institute

The Harvard Business School AI Institute catalyzes new knowledge to invent a better future by solving ambitious challenges.

Is AI Making Your Team Lazy? (April 24, 2026)

Exploring the hidden cost of human disengagement from AI

We are rapidly entering an AI era defined by the "agentic" shift. These tools now write code, manage inboxes, conduct research, and execute multi-step workflows without a human lifting a finger. But when AI does more, what happens to the humans at the end of the line? Does the presence of a "perfect" partner actually make us better, or does it slowly erode the very skills and attention required to provide oversight? As we mark the renaming of D^3 as the HBS AI Institute this month, we're taking a look back at some of our foundational research that defines the era. In "Falling Asleep at the Wheel: Human/AI Collaboration in a Field Experiment on HR Recruiters," HBS AI Institute post-doctoral fellow Fabrizio Dell'Acqua designed a field experiment to test what happens when the quality of AI assistance advances. His findings, it turns out, have serious implications for anyone using AI or in charge of systems where humans and AI share responsibility.

Key Insight: Falling Asleep at the Wheel

"If the AI appears too high quality, workers are at risk of 'falling asleep at the wheel' and mindlessly following its recommendations without deliberation." [1]

The paper's central hypothesis begins with a simple behavioral observation: as AI quality increases, the rational incentive to exert one's own effort decreases. When a tool appears highly reliable, people may stop checking its work closely, stop gathering their own information, and stop exercising independent judgment. Dell'Acqua calls this "falling asleep at the wheel." The result is a subtle but important distinction between AI performance in isolation and human-AI performance in practice. What matters is not only how good the model is, but how people behave when using it.

Key Insight: The Counter-Intuitive Power of "Flawed" Predictions

"On average, HR recruiters receiving lower-quality AI were less likely to 'fall asleep' as they tended not to automatically select the AI-recommended candidate." [2]

To test this theory, Dell'Acqua conducted a field experiment involving 181 professional HR recruiters who were tasked with reviewing 44 resumes each for a software engineering position. The recruiters were randomly assigned different levels of AI assistance: a "Perfect" AI with approximately 99% accuracy, a "Good" AI with approximately 85% accuracy, a "Bad" AI with roughly 75% accuracy, or no AI at all. Recruiters knew which tier of AI they were working with before starting. The results were clear and striking: recruiters who collaborated with the "Bad" AI actually outperformed those with the "Good" AI. Because the "Bad" AI was clearly imperfect, the recruiters remained vigilant, spending more time on each application and verifying the AI's claims. This group effectively learned the AI's weaknesses and improved their own performance to compensate. Those with better AI moved faster and delegated more.
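The mechanism invites a quick back-of-the-envelope model. The sketch below is our own illustration, not Dell'Acqua's model or data: it assumes a recruiter who deliberates reaches a fixed 95% accuracy on their own, and that vigilance collapses once the AI looks near-perfect. Under those invented parameters, the team with the 75%-accurate AI beats the team with the 85%-accurate one, directionally echoing the experiment.

```python
# Hypothetical toy model of "falling asleep at the wheel" (not the paper's code).
# All numbers here are invented for illustration.

def combined_accuracy(ai_acc: float, human_acc: float = 0.95, vigilance: float = 1.0) -> float:
    """With probability `vigilance` the recruiter deliberates and decides at their
    own accuracy; otherwise they rubber-stamp the AI's recommendation."""
    return vigilance * human_acc + (1.0 - vigilance) * ai_acc

def vigilance_level(ai_acc: float) -> float:
    """Illustrative assumption: effort collapses once the AI seems near-perfect."""
    return 1.0 if ai_acc < 0.80 else 0.15

for label, acc in [("Bad AI", 0.75), ("Good AI", 0.85), ("Perfect AI", 0.99)]:
    v = vigilance_level(acc)
    team = combined_accuracy(acc, vigilance=v)
    print(f"{label} ({acc:.0%}): vigilance={v:.2f}, team accuracy={team:.3f}")
```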

Key Insight: The Design Implication

"Designing effective structures for human/machine collaboration requires careful consideration of the organization's objectives and task features." [3]

Dell'Acqua is careful not to recommend that organizations simply deploy older, worse AI models. The real prescription is more nuanced: design AI systems with human behavioral responses in mind, not just technical performance benchmarks. In settings where people can add value, the design of the interaction becomes a strategic variable. That might mean calibrating AI confidence displays, introducing deliberate uncertainty signals for borderline cases, or creating interfaces that prompt humans to engage before surfacing a recommendation. A system that nudges humans to stay attentive may perform better than one that invites passive approval.

Why This Matters

For executives and business leaders, the lesson here is that combined human-AI performance is its own optimization target, and it might not move in lockstep with AI accuracy improvements. Strategy in the age of AI still requires an understanding of human psychology and effort. If leaders want better outcomes, they need to think beyond technical benchmarks to workflows where their employees remain wide awake at the wheel.

Bonus

This article shows that impressive AI performance can hide important weaknesses. Here, the issue hinges on over-reliance by human collaborators, but at other times it's caused by the model itself. For example, even highly capable AI systems can still struggle with something as basic as multi-digit multiplication. For a closer look at this, check out When Giants Stumble: What Multiplication Reveals about AI's Capabilities.

References

[1] Dell'Acqua, Fabrizio, "Falling Asleep at the Wheel: Human/AI Collaboration in a Field Experiment on HR Recruiters," Working paper, Laboratory for Innovation Science, Harvard Business School (2022), 2.

[2] Dell'Acqua, "Falling Asleep at the Wheel," 3.

[3] Dell'Acqua, "Falling Asleep at the Wheel," 4.

Meet the Authors

Fabrizio Dell'Acqua is a postdoctoral researcher at Harvard Business School. His research explores how human/AI collaboration reshapes knowledge work: the impact of AI on knowledge workers, its effects on team dynamics and performance, and its broader organizational implications.

Back to the Beginnings of AI at Work (April 9, 2026)

What a Landmark AI Study Tells Us About When to Trust, and When Not to Trust, AI

In September 2023, a working paper out of Harvard Business School landed at an unusually consequential moment. Generative AI had been publicly available for less than a year, organizations were scrambling to understand its implications, and almost no rigorous field evidence existed on how it actually affected professional performance. "Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of Artificial Intelligence on Knowledge Worker Productivity and Quality" offered exactly that. Now, in March 2026, that research has been formally published in the peer-reviewed journal Organization Science. To mark this milestone, we're revisiting the study and its findings. The questions it set out to answer, what AI actually does to knowledge worker performance, where it helps, where it hurts, and why, were foundational then. They remain foundational now.

Key Insight: An Experiment Built for the Real World

"[T]hese tasks were 'very much in line with part of the daily activities of the consultants' involved." [1]

To test the impact of generative AI on high-end knowledge work, the researchers collaborated with Boston Consulting Group (BCG) on a randomized controlled trial involving 758 consultants. After establishing an individual performance baseline, participants were randomly assigned to one of three conditions: no AI access, GPT-4 access, or GPT-4 access paired with a brief prompt engineering overview. The core of the design involved testing how these professionals navigated realistic tasks simulating real-world workflows. The researchers created two kinds of consulting-style assignments. One centered on product innovation and go-to-market work, including ideation, analysis, writing, and persuasion. The other was a difficult brand strategy case that required participants to reconcile spreadsheet data with subtle clues embedded in interview notes. This design let the researchers ask not just whether AI boosts productivity in general, but whether the answer depends on the nature of the task itself.

Key Insight: AI's Capabilities Don't Follow a Smooth Line

"[W]ithin the same knowledge workflow, some tasks are beyond the frontier, whereas others remain within it, making effective AI use challenging." [2]

The paper introduces its signature jagged technological frontier concept to describe the uneven capabilities of generative AI. Tasks that appear similar in difficulty to humans might fall on opposite sides of this boundary. When a task falls inside the frontier, AI is capable of generating accurate, high-quality outputs that support human work. Conversely, when a task falls outside the frontier, AI fails or produces believable but incorrect hallucinations. In such tasks, performance still depends on human judgment, guidance, or synthesis that the AI cannot reliably provide on its own. The danger is that professionals have no obvious signal telling them which side of the line a task is on.

Key Insight: AI as a Booster and Disruptor

"[E]xperienced and incentivized knowledge professionals, engaged in tasks akin to some of their daily responsibilities, performed worse when given access to AI." [3]

For tasks inside the frontier (the innovation and market exercise), AI access produced striking improvements. Consultants using GPT-4 completed 12.2% more subtasks, worked roughly 25% faster, and delivered work that human graders rated about 32% higher in quality. But for the task outside the frontier (the brand strategy case), the results flipped sharply. The control group (no AI) answered correctly 84.5% of the time. Among consultants using AI, accuracy fell to 70.6% for those with GPT access and 60% for those with access and the prompt-engineering overview. The AI had correctly processed surface-level numerical data, but missed a critical insight buried in interview materials. Consultants who used AI tended to trust its analysis and follow it to the wrong conclusion. Over-reliance on AI output, not ignorance of the task, was the mechanism of failure.

Key Insight: The Biggest Gains Go to the Lower Half

"[T]he most significant beneficiaries of using AI are the bottom-half-skill subjects." [4]

The distribution of gains was not uniform. When the research team segmented consultants by their baseline assessment performance, they found that the largest beneficiaries of AI assistance were those in the lower half of the skill distribution. Bottom-half performers improved by 31% on the experimental task; top-half performers improved by 11%. This pattern suggests that AI can function as a meaningful equalizer within professional environments, lifting those furthest from peak performance (while still delivering meaningful gains to those at the top).

Why This Matters

For executives and leaders, this paper remains foundational because it frames AI adoption as a problem of decisions, strategy, and execution. The lesson is not that AI is universally good or dangerously flawed; it's that leaders have to understand where, in a workflow, AI strengthens performance and where it creates deceptive failures. This means training people to exercise judgment rather than outsource it, and recognizing that polished output is not the same as sound reasoning. How will you guide your team along the jagged frontier?

Bonus

The need for human oversight, the risk of overtrusting polished outputs, and the challenge of separating assessment from interpretation are tensions that run through many of the most important conversations in AI today. Seen through these lenses, the paper is not only about consulting work, but about a broader shift in how decisions get made when AI becomes part of the process. For another look at those themes in the context of screening and evaluating ideas, check out The Future of Decision-Making: How Generative AI Transforms Innovation Evaluation.

References

[1] Dell'Acqua, Fabrizio, et al., "Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of Artificial Intelligence on Knowledge Worker Productivity and Quality," Organization Science 37(2): 403-423, 405.

[2] Dell'Acqua et al., "Navigating the Jagged Technological Frontier," 404.

[3] Dell'Acqua et al., "Navigating the Jagged Technological Frontier," 419.

[4] Dell'Acqua et al., "Navigating the Jagged Technological Frontier," 410.

Meet the Authors

Fabrizio Dell'Acqua is a postdoctoral researcher at Harvard Business School. His research explores how human/AI collaboration reshapes knowledge work: the impact of AI on knowledge workers, its effects on team dynamics and performance, and its broader organizational implications.

Edward McFowland III is an Assistant Professor in the Technology and Operations Management Unit at Harvard Business School and Principal Investigator at the HBS AI Institute Data Science and AI Operations Lab hosted within the Laboratory for Innovation Science.

Ethan Mollick is an Associate Professor at the Wharton School of the University of Pennsylvania, where he studies and teaches innovation and entrepreneurship and examines the effects of artificial intelligence on work and education. He is the Co-Director of the Generative AI Lab at Wharton, which builds prototypes and conducts research to discover how AI can help humans thrive while mitigating risks.

Hila Lifshitz is a Professor of Management at Warwick Business School (WBS) and a visiting faculty member at Harvard University at the Laboratory for Innovation Science at Harvard (LISH). She heads the Artificial Intelligence Innovation Network at WBS.

Kate Kellogg is the David J. McGrath Jr Professor of Management and Innovation and a Professor of Business Administration at the MIT Sloan School of Management. Her research focuses on helping knowledge workers and organizations develop and implement Predictive and Generative AI products, on the ground in everyday work, to improve decision making, collaboration, and learning.

Saran Rajendran is Director of Strategy and Execution at Palo Alto Networks.

Lisa Krayer is a Principal at Boston Consulting Group (BCG).

Francois Candelon is Partner, Value Creation & Portfolio Monitoring, at Seven2.

Karim Lakhani is the Dorothy & Michael Hintze Professor of Business Administration at Harvard Business School. He specializes in technology management, innovation, digital transformation, and artificial intelligence. He is also the Co-Founder and Faculty Chair of the HBS AI Institute and the Founder and Co-Director of the Laboratory for Innovation Science at Harvard (LISH).

Everyone Has AI. Which Firms are Going to Win? (April 7, 2026)

New research shows that access to AI is not the same as knowing where to use it.

A firm is only as fast as the slowest step in its chain of work. In manufacturing, it might be one particular machine on the line. In software, one overloaded intake service. Many business leaders are accidentally recreating this scenario with artificial intelligence. They provision AI tools to employees and hear about localized productivity spikes, but the company's overall performance barely moves. This tension lies at the heart of the new working paper "Mapping AI into Production: A Field Experiment on Firm Performance," from co-authors at INSEAD and the Harvard Business School AI Institute. By tracking hundreds of organizations, the researchers have uncovered friction points that hold firms back from realizing the true economic promise of generative AI.

Key Insight: A Global Search for AI's Real Value

"Discovering where and how AI creates value is fundamentally a search problem." [1]

To test how companies can overcome the barrier of firm-level AI performance, the authors conducted a massive field experiment involving 515 high-growth startups spanning the globe. All participating firms received API credits, access to frontier AI models, and technical training. A randomly selected treatment group of firms also received specialized case studies highlighting how AI-native companies reorganize their production workflows, teams, and business models around the technology. Control firms attended workshops on general entrepreneurship practices. The design let the researchers hold access and technical skill constant while varying which firms gained perspective across a much wider set of organizational functions, thereby expanding their search space for AI opportunities. 

Key Insight: A Small Nudge, Outsized Results

"Treated ventures achieve faster growth without proportional increases in labor or capital, consistent with a reduction in the costs of experimentation and scaling seen in earlier technological waves." [2]

The performance effects were substantial. The treatment startups discovered 44% more AI use cases, particularly in high-leverage areas like strategy and product development. They completed 12% more tasks, became 18% more likely to land paying customers, and generated an astounding 1.9 times higher revenue compared to the control group. What makes these numbers even more fascinating is that these companies did not spend their way to growth. In fact, their demand for external capital investments actually fell by 39.5%, proving that AI enables firms to scale outputs without scaling inputs proportionally. The researchers found that these gains were heavily concentrated in the upper tail, suggesting that AI lifts the ceiling of what top ventures can achieve rather than just making struggling businesses slightly better. One startup built an end-to-end AI pipeline covering classification, compliance checking, and bid pricing without hiring any technical staff, growing from zero to $40,000 in revenue with four paying customers during the ten-week program. 

Key Insight: A Cognitive Bottleneck

"Two firms with identical tools, training, and budgets can realize very different returns if one searches more broadly across its production process for where the technology creates value." [3]

The researchers conclude that the ultimate blocker for AI gains is not the cost of technology or a lack of skills, but what they call the mapping problem: "discovering where and how AI creates value within a firm's production process." [4] Most leaders default to localized, obvious AI solutions like launching a customer service chatbot or drafting email responses. The untapped potential comes from discovering how to rethink interconnected, complementary tasks across the entire enterprise. For example, a field services startup in the study rebuilt its entire operations chain of dispatcher, bookkeeper, scheduler, and collections staff into a sequence of AI modules that self-improve, fundamentally changing the firm's cost structure. Solving the mapping problem is about overcoming cognitive constraints to see AI as a way to redraw your company's production landscape, rather than simply slapping digital band-aids on legacy processes.

Why This Matters

For business leaders and executives, this research shows that the organizations most likely to realize substantial AI-driven results are those that invest not just in technology, but in the wide-ranging process of exploring where it fits. That is a strategy and execution problem, and leaders will need to ask which parts of their organizations need redesign rather than optimization. If you don't actively push the boundaries of how AI rewrites your firm, you risk using a map that never leads you to your destination.

Bonus

What happens when the bottleneck lies in the surrounding market, rather than within your business? For example, committing too early to a single AI provider, before the technology has stabilized, risks being locked into a platform that may not be the right fit 6 months or a year from now. For a look at whether the competitive landscape will reward flexibility, check out Is GenAI Heading for a Tech Monopoly?

References

[1] Kim, Hyunjin, Dahyeon Kim, and Rembrand Koning, "Mapping AI into Production: A Field Experiment on Firm Performance," INSEAD Working Paper No. 2026/20/STR (March 2026), 2.

[2] Kim et al., "Mapping AI into Production," 4.

[3] Kim et al., "Mapping AI into Production," 6.

[4] Kim et al., "Mapping AI into Production," 2.

Meet the Authors

Hyunjin Kim is Assistant Professor of Strategy at INSEAD.

Dahyeon Kim is a PhD student in strategy at INSEAD.

Rembrand M. Koning is the Mary V. and Mark A. Stevens Associate Professor of Business Administration at Harvard Business School, and the co-director and co-founder of the Tech for All lab at the HBS AI Institute.

The Surprising Link Between AI Reasoning and Honesty (March 23, 2026)

Exploring how the complexity of large language models acts as a moral safeguard

The standard fear about advanced AI goes something like this: the more sophisticated a system becomes, the better it gets at sounding convincing, reading the room, and manipulating people. A model that can reason step-by-step might not just answer better, it might lie better. That concern feels intuitive, especially as businesses hand more customer interactions, internal workflows, and decision support to increasingly capable systems. However, in the new study "Think Before You Lie: How Reasoning Leads to Honesty," co-written by Harvard Business School AI Institute Associate Martin Wattenberg, a team of researchers found that our intuition might be backward. Through an exhaustive series of tests involving moral trade-offs and complex reasoning traces, they found that when an AI is forced to slow down and show its work, it becomes significantly more honest.

Key Insight: Testing the Moral Compass

"Each scenario is paired with two options: one favoring honesty and the other deception." [1]

To study deceptive behavior rigorously, the researchers built a new benchmark dataset called DoubleBind, a collection of social dilemmas engineered so that choosing honesty comes at a tangible, variable cost. In one scenario, a manager praises you for an analysis your colleague actually produced, so correcting the record means losing a promotion. The financial stakes shift across versions of each dilemma, allowing the researchers to observe how models respond as the price of honesty rises. They also augmented an existing dataset, DailyDilemmas, with the same cost-scaling structure. Together, the two datasets gave the team a controlled way to probe moral trade-offs across six open-weight model families. Each model was tested in two modes: token-forcing, where the model answers immediately without deliberation, and reasoning mode, where the model deliberates for a specified number of sentences before committing to a final recommendation. Models are honest roughly 80% of the time under token-forcing, though that rate erodes as the cost of telling the truth climbs.
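For readers who want to see the shape of such an evaluation, here is a minimal sketch of a DoubleBind-style harness, reconstructed from the paper's description rather than taken from the authors' code. The scenario, costs, and `ask_model` stub are all invented; in a real harness, `ask_model` would prompt an actual LLM in either mode.

```python
# Sketch of a DoubleBind-style honesty evaluation (our reconstruction, not the
# authors' code). `ask_model` is a placeholder that returns canned choices so
# the script runs; a real harness would call an LLM here.

from collections import defaultdict

# Each dilemma pairs an honest and a deceptive option at a scalable cost of honesty.
SCENARIOS = [
    {"id": "credit-theft", "honest": "Correct the record",
     "deceptive": "Accept the praise", "cost": cost}
    for cost in (1_000, 10_000, 100_000)  # hypothetical dollar costs
]

def ask_model(scenario: dict, mode: str) -> str:
    """Placeholder: prompt the model either to answer immediately
    ("token-forcing") or to deliberate first ("reasoning")."""
    return "honest" if mode == "reasoning" or scenario["cost"] < 50_000 else "deceptive"

honesty = defaultdict(list)
for mode in ("token-forcing", "reasoning"):
    for s in SCENARIOS:
        honesty[mode].append(ask_model(s, mode) == "honest")

for mode, results in honesty.items():
    print(f"{mode}: honest on {sum(results)}/{len(results)} dilemmas")
```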

Key Insight: Why Deliberation Favors the Truth

"[M]odels are significantly more likely to choose the honest option when required to reason before providing a final answer." [2]

In human psychology, the "dual-process" theory suggests that our first, intuitive impulse is often prosocial, while slow, calculated reasoning allows us to justify selfish or deceptive behavior. We might "calculate" our way into a lie. The researchers found that LLMs flip this script entirely: across all model families tested, reasoning increases the probability of an honest recommendation, and longer deliberation amplifies the effect. Additionally, it seems that the effect doesn't principally come from the reasoning text itself. If chain-of-thought were simply constructing a persuasive moral argument, then reading the reasoning should make the model's final decision easy to predict, but that is not what the researchers found. Reasoning traces frequently read like balanced surveys of the pros and cons of both options rather than arguments building toward a verdict. The decision to deceive, when it happens, tends to arrive without a legible trail. This is what the researchers call the "facsimile problem": reasoning changes behavior, but not because of what it says.

Key Insight: Deceptive Answers are Easier to Shake Loose

"We hypothesize that compared to honesty, deception is a metastable state—that is, deceptive outputs are easily destabilized." [3]

If the content of reasoning doesn't explain the honesty boost, what does? The researchers propose a theory based on the "geometry" of the model's internal states. They suggest that honesty is a stable, broad region in the AI's conceptual map, while deception is a "metastable" state, essentially a narrow, fragile peak that is easily knocked over. When a model is "thinking," it is navigating through its internal landscape. Because the honest regions of this space are larger and more "robust," the process of reasoning draws the model toward them. The researchers tested this claim several ways. By changing the wording slightly through paraphrasing, they found that deceptive answers are much more likely to flip than honest ones. By resampling the model's output, they found that initially deceptive recommendations often become honest, while honest ones usually stay put. Across these tests the asymmetry was consistent: honesty is robust, deception is fragile.
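The resampling check is straightforward to operationalize. The sketch below, with invented toy labels rather than the study's data, shows the measurement the asymmetry claim rests on: conditional flip rates for initially deceptive versus initially honest answers.

```python
# Sketch of the flip-rate asymmetry measurement (toy data, not the study's).
# Given each item's initial label and labels from repeated resamples or
# paraphrases, compare how often each initial label gets overturned.

def flip_rate(initial: list[str], resampled: list[list[str]], label: str) -> float:
    flips = total = 0
    for first, redraws in zip(initial, resampled):
        if first != label:
            continue
        flips += sum(1 for r in redraws if r != first)
        total += len(redraws)
    return flips / total if total else float("nan")

initial   = ["deceptive", "honest", "honest", "deceptive", "honest"]
resampled = [["honest", "honest", "deceptive"],   # initially deceptive: flips often
             ["honest", "honest", "honest"],
             ["honest", "honest", "honest"],
             ["honest", "deceptive", "honest"],
             ["honest", "honest", "deceptive"]]

print(f"P(flip | initially deceptive) = {flip_rate(initial, resampled, 'deceptive'):.2f}")
print(f"P(flip | initially honest)    = {flip_rate(initial, resampled, 'honest'):.2f}")
```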

Why This Matters

For business leaders, the value of this paper is not that AI can now be assumed trustworthy. Rather, it offers a more useful way to think about risk. If deceptive outputs are less stable, then system design can exploit that fact. Building deliberation into AI workflows may become an important step before interfacing with customers or making high-stakes decisions. Organizations need systems that hold up when incentives get messy, and this paper suggests that at least in some cases, more reasoning may keep AI honest when it counts.

Bonus

In another study from HBS AI Institute associates, researchers found that fine-tuning LLMs on specialized datasets generally degrades chain-of-thought reasoning performance. Faithfulness and Accuracy: How Fine-Tuning Shapes LLM Reasoning is a critical reminder that the choices made before deployment could erode the reasoning capacity you're counting on.

References

[1] Ann Yuan et al., "Think Before You Lie: How Reasoning Leads to Honesty," arXiv preprint arXiv:2603.09957 (2026): 3.

[2] Yuan et al., "Think Before You Lie," 4.

[3] Yuan et al., "Think Before You Lie," 2.

Meet the Authors

Martin Wattenberg is Gordon McKay Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences, and an Associate Collaborator at the HBS AI Institute.

Additional authors: Ann Yuan, Asma Ghandeharioun, Carter Blum, Alicia Machado, Jessica Hoffmann, Daphne Ippolito, Lucas Dixon, Katja Filippova

Why Your AI Strategy May Be Failing (March 16, 2026)

How companies can overcome the structural frictions that block AI at scale

AI has entered the enterprise faster than most previous waves of technology, reshaping expectations about speed, productivity, and decision-making. Yet adoption alone does not produce transformation. The Frontier Firm Initiative (FFI), a joint effort between the Harvard Business School AI Institute and Microsoft, recently convened senior leaders from a dozen global organizations to address the "last mile" challenge: when a company tries to scale localized, successful AI pilot programs into a standard, enterprise-wide operating model. In the new HBR article "The 'Last Mile' Problem Slowing AI Transformation," Karim R. Lakhani and Jen Stave of the HBS AI Institute and Microsoft's Jared Spataro identify a framework of the specific "frictions" stalling progress and outline a strategic blueprint to overcome them. In the insight below, we will zoom in on one friction and one corresponding recommendation from the blueprint to resolve it.

Key Insight: The Weight of What Already Exists

"[T]he primary obstacle to progress is rarely model quality or data availability, but rather the 'last mile' of transformation where technical capability must meet organizational design." [1]

When organizations try to implement AI, they often assume that any issues will arise from the AI technology itself. However, the authors find that AI actually functions as a "diagnostic tool" that exposes problematic processes already present within a firm. For example, the authors label "process debt" as the accumulation of fragmented and inconsistent workflows built up over years of embeddedness and geographic specificity. The article cites one professional-services firm operating in more than 170 countries where the "same" process had dozens of different, regional variations.

Key Insight: Designing the Organization AI Deserves

"[F]or every bottleneck discovered in the 'last mile,' a corresponding shift in this blueprint provides the path forward." [2]

One of the shifts made by organizations making real headway is replacing legacy processes. For example, what the authors call "clean-sheet redesign" starts by asking whether a given workflow would exist at all if the company were built today around AI agents. This demands equally fresh thinking about people and governance: capturing expert judgment as a codified asset rather than a protected credential, redesigning roles toward oversight and interpretation rather than execution, and managing AI agents with the same accountability structures applied to human teams.

Why This Matters

For today's business professionals and executives, the "last mile" is less a technical challenge and more a test of leadership imagination. Process debt and clean-sheet redesign are only two parts of a broader diagnosis of seven frictions and corresponding transformation strategies. Read the full article to see them all. The potential of the technology you have already purchased is immense, but realizing it requires the courage to redesign the organization to match the speed of an agentic world.

References

[1] Lakhani, Karim R., Jared Spataro, and Jen Stave, "The 'Last Mile' Problem Slowing AI Transformation," Harvard Business Review, March 9, 2026.

[2] Lakhani et al., "The 'Last Mile' Problem Slowing AI Transformation."

Meet the Authors

Karim Lakhani is the Dorothy & Michael Hintze Professor of Business Administration at Harvard Business School. He specializes in technology management, innovation, digital transformation, and artificial intelligence. He is also the Co-Founder and Faculty Chair of the HBS AI Institute and the Founder and Co-Director of the Laboratory for Innovation Science at Harvard.

Jared Spataro is Chief Marketing Officer, AI at Work, at Microsoft.

Jen Stave is Executive Director of the HBS AI Institute. She was previously Senior Vice President at Wells Fargo and has a PhD from American University.

Competing in the Dark (March 12, 2026)

New research reveals that firms are playing a game of catch-up they didn't even know they were losing

Leaders know they must innovate to survive, and they don't make decisions in a vacuum: they watch rivals, draw inferences, and position themselves accordingly. But what happens when their picture of the competitive landscape is fundamentally wrong? The new NBER working paper "The Innovation Race: Experimental Evidence on Advanced Technologies," co-written by a team of researchers including Harvard Business School AI Institute associate Zoë B. Cullen, takes this question seriously. Embedded in the Bank of Italy's long-running INVIND survey, the study ran a randomized field experiment with roughly 3,000 Italian firms to test whether correcting firms' misperceptions about competitors' technology adoption changes their own investment plans. What they found should matter to any leader thinking about AI, automation, and the pace of organizational change.

Key Insight: Racing Without a Scorecard

"These data provide a unique opportunity to measure the beliefs firms hold about their competitors' adoption decisions—and to identify their causal effects on firms' own adoption behavior." [1]

The central question driving the research is deceptively simple: are firms more likely to adopt advanced technologies like AI when they expect competitors to adopt them? This is a problem economists call "strategic complementarity": the idea that one firm's incentive to act depends partly on what others around it are doing. It's easy to theorize about, but hard to test with real firms making real decisions.

The researchers solved this through a massive field experiment. They first asked firms what share of their competitors were currently using AI or robotics, then randomly provided half of them with the actual adoption rates of peers in their specific sector and size class. By measuring how these firms updated their 2027 adoption plans after seeing the data, the researchers could cleanly identify whether knowing a rival's move actually changes your own. In the real world, if two firms adopt AI at the same time, it's hard to tell if one is copying the other or if they are both just reacting to something like a new tax break or a labor shortage. By introducing a controlled "information shock," the researchers could prove that it was the knowledge of competitor behavior itself that drove the change in strategy.
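To make the identification idea concrete, here is a small simulation with invented numbers rather than the study's data: priors understate true adoption by roughly 24.6 percentage points, treated firms close most of the revealed gap, and control firms barely move. Regressing belief revision on the revealed gap recovers the (assumed) treatment response.

```python
# Illustrative simulation of the information-shock design (toy numbers only).

import numpy as np

rng = np.random.default_rng(0)
n, truth = 1_000, 0.60                       # hypothetical true peer adoption share
# Priors sit ~24.6 pp below the truth, echoing the paper's average gap.
prior = np.clip(truth - 0.246 + rng.normal(0, 0.10, n), 0, 1)
treated = rng.random(n) < 0.5                # random assignment to the info shock

gap = truth - prior                          # what the information shock reveals
# Assumed behavior: treated firms close 70% of the gap, control firms ~5%.
revision = np.where(treated, 0.7 * gap, 0.05 * gap) + rng.normal(0, 0.02, n)

# Slope of revision on the revealed gap, by arm (rational updating => slope ~0.7).
for arm, mask in [("treated", treated), ("control", ~treated)]:
    slope = np.polyfit(gap[mask], revision[mask], 1)[0]
    print(f"{arm}: mean prior={prior[mask].mean():.2f}, update slope={slope:.2f}")
```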

Key Insight: We Are More Alone Than We Think

"On average, prior beliefs underestimated actual adoption by 24.6 pp." [2]

The research surfaced a striking baseline finding: firms are deeply mistaken about how technologically advanced their peers already are. On average, firms underestimated the share of competitors using AI or robotics by about 25 percentage points, a gap so large it suggests companies are making strategic decisions based on a competitive landscape that no longer exists. But when treated firms received accurate peer-adoption figures, they meaningfully revised their expectations upward, a response consistent with rational updating and proportional to how far off their original estimates had been.

However, one of the most fascinating findings of the study was that not all technologies are equal when it comes to peer pressure. While information about competitors significantly increased intentions to adopt robotics, it had almost no measurable effect on plans for AI. The researchers offer several explanations: (1) AI adoption was already higher at baseline (roughly half of firms planned to use it by 2027), leaving less headroom for growth. (2) Robotics is a mature, deeply embedded technology in Italian manufacturing, so firms that see rivals using it extensively are receiving a clear, legible signal. (3) AI, by contrast, is newer and often adopted experimentally, so competitive signals carry more ambiguity. Perhaps the return on investment for AI is still shrouded in uncertainty.

Key Insight: Information Campaigns

"[G]overnments could deploy information campaigns that raise firms' awareness of the productivity benefits of new technologies and the extent to which their peers are adopting them." [3]

Traditionally, governments try to spur innovation through expensive financial incentives and subsidies. However, this research points to a much cheaper and potentially more effective tool: the information campaign. If the primary reason firms aren't adopting new technology is a misperception of the competitive landscape, then simply publishing accurate, sector-specific adoption data could do more to modernize an industry than a mountain of tax breaks. The researchers note two mechanisms that may be at work: a competition channel, where firms fear falling behind rivals, and a learning channel, where they use peer behavior to infer a technology's productivity potential. Evidence from firms in concentrated markets suggests both channels are active, though neither can be fully isolated with current data.

Why This Matters

For executives and business leaders, this research surfaces a concrete and often underappreciated source of strategic risk: competitive misperception. If your organization is making AI and automation investment decisions based on an outdated view of where your industry actually stands, you may be systematically underinvesting, not because you lack capital or ambition, but because you lack accurate signals. The practical implication is that competitive intelligence on technology adoption is a direct input into investment strategy. For those thinking about the longer arc of AI diffusion, the contrast between robotics and AI results is instructive: behavioral responses to peer signals are strongest when a technology has a proven track record. As generative AI matures from experiment to infrastructure, the competitive spillovers documented here for robotics will likely be coming for AI next.

Bonus

This research shows that learning what peers are actually doing with advanced technology can shift decision-making. So here's a question worth asking: how well do you really know where GenAI adoption stands in the broader workforce? A GenAI adoption tracker built by a team including HBS AI Institute Associate David Deming offers a data-grounded answer. Drawing on five nationally representative U.S. surveys and 25,000 respondents, it tracks GenAI use at work and at home, adoption rates among working-age adults, and the productivity time savings already being realized. Consider it your lamp in the darkness.

References

[1] Cullen, Zoë B., Ester Faia, Elisa Guglielminetti, Ricardo Perez-Truglia, and Concetta Rondinelli, "The Innovation Race: Experimental Evidence on Advanced Technologies," NBER Working Paper 34532 (2025), 2.

[2] Cullen et al., "The Innovation Race," 3.

[3] Cullen et al., "The Innovation Race," 26.

Meet the Authors

Zoë B. Cullen is Associate Professor of Business Administration at Harvard Business School and Associate at the HBS AI Institute.

Ester Faia is Professor at Goethe University Frankfurt.

Elisa Guglielminetti is an Economist at the Bank of Italy.

Ricardo Perez-Truglia is a Professor at UCLA's Anderson School of Management.

Concetta Rondinelli is a Senior Economist at the Bank of Italy.

Can You Spot the Bot? (March 5, 2026)

New research reveals just how convincingly AI mimics humans

Alan Turing's original "imitation game," proposed in 1950, had an elegant simplicity: a human judge conducts a text-based conversation with two hidden parties, one human and one machine, and tries to guess which is which. Today, the question Turing posed has quietly expanded into territory he never mapped. Our digital existence is a kaleidoscope of multi-modal interactions. We don't just "talk" to the internet; we upload snapshots of our morning coffee, interpret complex visual data in professional dashboards, estimate the mood of a room through a video call, and follow subtle cues of visual attention. "Can Machines Imitate Humans?," co-written by Hanspeter Pfister, Harvard Business School AI Institute Associate and An Wang Professor of Computer Science at Harvard SEAS, explains how a new large-scale study from researchers at 15 organizations around the globe drags the imitation game into the full complexity of how humans communicate, perceive, and describe the world. Are we already past the point where we can reliably tell machines from humans, and does it matter who's doing the judging?

Key Insight: A Gauntlet of Language and Vision

"[W]e present an integrative benchmark encompassing a wide range of standard and well-established AI tasks across both language and vision." [1]

Rather than testing imitation in a single domain, the researchers designed a six-task benchmark spanning language and vision. Language tasks included image captioning, word association, and open-ended conversation. Vision tasks covered color estimation (identifying the dominant color in a scene), object detection (naming three visible items), and attention prediction (comparing human eye-tracking data with AI-generated gaze sequences). The data collection was correspondingly ambitious: 36,499 responses from 636 human participants and 37 AI models, evaluated through 72,191 Turing-like tests administered to 1,916 human judges and 10 AI judges. A subtle but important design choice: the tests were not measuring accuracy but indistinguishability. A system can be wrong and still match human patterns, or be correct and still fail to pass as a human.

Key Insight: Measurement, Not Myth

"[W]e consider Turing-like tests as a quantitative evaluation of how well current AIs can imitate humans." [2]

The Turing test has always been contentious. Critics argue that a machine could pass it without genuine understanding, that it measures performance rather than intelligence, or that it's too narrow to be meaningful. The researchers behind this study sidestep that debate entirely. Their goal isn't to adjudicate whether AI is intelligent; it's to measure something more practical: the ability of machines to convincingly replicate human behavior, including our flaws and mistakes. By using imitation detectability, which is the ability of a judge to distinguish between a person and an algorithm, the researchers create a framework for measuring how "human-like" a machine has become. That shift matters because conventional benchmarks often assume a stable ground truth and score models by how accurately they match a single "correct" answer predefined by a human. But many real tasks, especially open-ended ones, don't have a single "correct" output. And even when they do, being right isn't the same as acting human; a truly human-like AI should make the exact same mistakes we do.

Key Insight: The Vanishing Threshold of Detection

"[C]urrent algorithms are not far from being able to imitate humans in these tasks." [3]

The results of the study are both a testament to engineering prowess and a warning for the future of digital trust. Across the six tasks, the researchers found that the detectability of AI is plummeting. In a "perfect" world where humans and machines are indistinguishable, a judge would have a 50% chance of being right, essentially a coin flip. The study found that for many tasks, the detectability scores are hovering remarkably close to that 50/50 chance level. In areas like image captioning and word association, the gap has become so thin that even motivated human judges struggle to tell the difference between a person's description and an AI's. Yet the convergence is not complete: AI still occasionally trips over the uniqueness of human behavior. For instance, humans are surprisingly consistent in where they look in a picture, while machines sometimes focus on statistically likely but humanly irrelevant details.

At the same time, one of the paper's most striking results showed that AI proves highly effective in a different role: judge rather than subject. Simple machine learning classifiers trained to distinguish human from AI responses outperformed human judges on most tasks. For word associations, the AI judge correctly identified machine-generated responses 91% of the time, compared to just 47% for humans. Machines, it seems, are far better than we are at spotting each other.
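As a flavor of how such an "AI judge" can work, here is a minimal sketch using a bag-of-words classifier. The six training responses and the probes are invented, and the real study trained on far larger response sets; with this toy data, the judge keys on vocabulary overlap between the two styles.

```python
# Sketch of an "AI judge": a simple text classifier separating human-style from
# AI-style word-association responses. All example text here is invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

responses = ["dog leash walk park",  "canine companion domesticated animal",
             "coffee mug morning",   "caffeinated beverage consumed widely",
             "rain boots puddle",    "precipitation meteorological phenomenon"]
labels = ["human", "ai", "human", "ai", "human", "ai"]

judge = make_pipeline(TfidfVectorizer(), LogisticRegression())
judge.fit(responses, labels)

# Probes reuse vocabulary from each style: stilted, encyclopedic phrasing is
# the giveaway this toy judge learns to key on.
print(judge.predict(["domesticated companion animal", "dog park morning walk"]))
```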

Why This Matters

For executives and business leaders, this research redraws the risk landscape in two directions. First, the near invisibility of AI responses in everyday tasks means fraud, disinformation, and impersonation are no longer theoretical risks; they are statistically plausible at scale, today. Second, because automated classifiers outperform human judges, detection can no longer rely on human vigilance alone. It requires infrastructure, and regulators in the EU and elsewhere are already moving toward mandatory AI disclosure requirements. This paper highlights the importance of building transparency tools now, both to be prepared for when they are required and to ensure you can maintain your customers' trust.

Bonus

As AI systems get more capable, they're also getting harder to understand. Another response to this challenge is to build clearer explanations for why models behave the way they do within a single, coherent framework. To go deeper on this initiative, check out "Unifying AI Attribution: A New Frontier in Understanding Complex Systems."

References

[1] Mengmi Zhang et al., "Can Machines Imitate Humans? Integrative Turing-like tests for Language and Vision Demonstrate a Narrowing Gap," arXiv preprint arXiv:2211.13087v3 (2025): 3.

[2] Zhang et al., "Can Machines Imitate Humans?": 2.

[3] Zhang et al., "Can Machines Imitate Humans?": 16.

Meet the Authors

Hanspeter Pfister is An Wang Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences and an HBS AI Institute Associate.

Additional Authors: Mengmi Zhang, Elisa Pavarino, Xiao Liu, Giorgia Dellaferrera, Ankur Sikarwar, Caishun Chen, Marcelo Armendariz, Noga Mudrik, Prachi Agrawal, Spandan Madan, Mranmay Shetty, Andrei Barbu, Haochen Yang, Tanishq Kumar, Shui'Er Han, Aman Raj Singh, Meghna Sadwani, Stella Dellaferrera, Michele Pizzochero, Brandon Tang, Yew Soon Ong, Gabriel Kreiman

The AI Deep Research Race Has a New Leaderboard (February 26, 2026)

A new cross-domain benchmark reveals how the leading AI research tools perform on real-world production tasks

Two AI-generated research reports land on your desk before a major decision. Both are polished, confidently written, and well-structured, but they reach different conclusions. Which one do you trust, and how would you even begin to find out? In "DRACO: a Cross-Domain Benchmark for Deep Research Accuracy, Completeness, and Objectivity," a team at Perplexity and Jeremy Yang, Assistant Professor of Business Administration at Harvard Business School and affiliate of the Harvard Business School AI Institute, present a rigorous new benchmark for measuring how well AI deep research systems actually perform on real-world production tasks.

Key Insight: A New Standard for Deep Research Evaluation

"We introduce a cross-domain benchmark derived from real-world production deep research tasks designed to bridge the gap between AI evaluations and authentic research needs." [1]

AI "deep research" systems, tools that can autonomously decompose a complex question, search hundreds of sources, reconcile conflicting evidence, and synthesize findings into a cited report, are increasingly being used for high-stakes analytical work in areas such as finance, legal, and medicine. Unlike a simple chatbot response, these systems operate more like an analyst running an independent research process. While this technology has been advancing quickly, the frameworks for evaluating it have not kept pace. The authors argue that evaluating deep research must reflect realistic use cases, span domains, account for region-specific sources, and probe multiple system capabilities such as planning, search, and reasoning all at once.

Key Insight: Tasks Deeply Rooted in Practice

"Our main contribution is a curated set of benchmark tasks that closely mirror real deep research needs and how people use deep research agents in practice." [2]

Many AI benchmarks are built by researchers and experts imagining what hard questions look like. DRACO takes a different approach: its 100 tasks were sourced directly from actual user queries submitted to Perplexity's deep research system in fall 2025. Specifically, researchers sampled from high-difficulty requests where users had expressed dissatisfaction, making these exactly the kinds of tasks where AI systems tend to struggle. Those raw queries were then anonymized, augmented to add specificity and scope, and filtered to ensure each task was objectively evaluable, appropriately bounded, and genuinely challenging. The results span 10 domains drawing on sources from 40 countries across five regions.

Key Insight: Rating Real-World Complexity

"Twenty-six domain experts, including medical professionals, attorneys, financial analysts, software engineers, and designers, were recruited to develop rubrics for selected tasks." [3]

DRACO's grading rubrics were developed through a rigorous human-expert pipeline: an initial rubric is drafted by one expert, reviewed and refined by a second, subjected to a "saturation test" to ensure the current system cannot easily exceed 90% (which would indicate an overly easy task or lenient rubric), and finally validated by a third and fourth expert for quality assurance. Each task was ultimately assessed across an average of 39 criteria spanning four dimensions: factual accuracy, breadth and depth of analysis, presentation quality, and citation quality.
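A simplified scoring sketch makes the structure concrete. Everything below, including the criteria, weights, and report, is invented for illustration; DRACO's actual rubrics average 39 expert-written criteria per task, and we are only guessing at how weights might be normalized.

```python
# Sketch of DRACO-style rubric scoring under stated assumptions: weighted
# criteria grouped by dimension, with negative weights penalizing failure
# modes. Criteria and weights are hypothetical.

RUBRIC = {
    "factual_accuracy":  [("states correct FDA approval year", 3.0),
                          ("recommends an unsafe dosage", -8.0)],  # heavy penalty
    "breadth_and_depth": [("covers at least three jurisdictions", 2.0)],
    "presentation":      [("includes an executive summary", 1.0)],
    "citation_quality":  [("every claim cites a primary source", 2.0)],
}

def score_report(met: set[str]) -> float:
    """Normalize earned weight by the maximum attainable positive weight."""
    earned = sum(w for crit in RUBRIC.values() for c, w in crit if c in met)
    max_pos = sum(w for crit in RUBRIC.values() for _, w in crit if w > 0)
    return max(0.0, 100.0 * earned / max_pos)

report = {"states correct FDA approval year", "includes an executive summary",
          "recommends an unsafe dosage"}  # the negative criterion wipes out the gains
print(f"score: {score_report(report):.1f}/100")
```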

Key Insight: Progress, But Gaps Remain

"Our evaluation of frontier deep research systems reveals that while significant progress has been made (especially in presentation quality), substantial headroom remains (especially in factual accuracy)." [4]

The evaluation results indicate that while agents have improved across all rubric dimensions, and now excel in presentation quality, they continue to struggle with factual accuracy. This may partly stem from design choices: roughly half of all criteria focused on verifiable factual claims, and the rubrics also included negative criteria penalizing specific failure modes. In domains like medicine and law, these penalties are particularly severe, as incorrect or unsafe recommendations carry heavy negative weights. This reflects a core design principle: in high-stakes domains, what AI gets wrong matters as much as what it gets right.

Why This Matters

As we increasingly rely on AI for high-stakes tasks, from brainstorming and research to actual execution, the bottleneck is no longer speed; it's accuracy. The area where AI performs best, producing polished, well-structured output, is precisely where it's hardest for a non-specialist to detect errors. For business leaders, DRACO's task-and-rubric design offers a concrete blueprint for evaluating and choosing research agents: define success criteria, test on representative workloads, and be sure to clarify how you'll know when it's wrong.

Bonus

While it seems self-evident that we want the best and most accurate information from AI, that's actually not always the case. Check out "Explanations on Mute: Why We Turn Away From Explainable AI" to see why.

References

[1] Joey Zhong et al., "DRACO: a Cross-Domain Benchmark for Deep Research Accuracy, Completeness, and Objectivity," arXiv preprint arXiv:2602.11685 (2026): 2.

[2] Zhong et al., "DRACO": 2.

[3] Zhong et al., "DRACO": 5.

[4] Zhong et al., "DRACO": 12.

Meet the Authors

Jeremy Yang is an Assistant Professor of Business Administration at Harvard Business School and affiliated with the HBS AI Institute.

Additional Authors (Perplexity): Joey Zhong, Hao Zhang, Clare Southern, Thomas Wang, Kate Jung, Shu Zhang, Denis Yarats, Johnny Ho, Jerry Ma

The Manager's AI Dilemma (February 17, 2026)

How to design AI adoption so decision makers can say yes without self-sabotage

Lots of organizations can green-light AI. Far fewer can absorb it. That gap, between excitement and real, embedded use, keeps showing up even when ROI is compelling and leadership is visibly supportive. New research from Harvard Business School AI Institute Frontier Firm affiliate Shunyuan Zhang and Das Narayandas reveals an uncomfortable idea contributing to this gap. In "Selling Self-Disruptive Technologies: Identity-Compatible Advantage and the Role-Level Microfoundations of Automation Adoption," they highlight that the very people who must approve and champion these technologies are the same ones whose jobs could be fundamentally threatened by them.

Key Insight: The Three Threats of Self-Disruptive Technologies (SDTs)

"We define SDTs as innovations that simultaneously (1) improve organizational performance and (2) erode the authority, discretion, or legitimacy of the role responsible for approving them." [1]

Traditional adoption theories typically focus on whether organizations are ready, whether the technology is useful, and whether there's institutional pressure to adopt. But these frameworks miss something critical: they assume decision-makers are neutral agents acting on behalf of the firm. Now, add to the mix AI systems with the potential to automate managerial judgment, analytics platforms that centralize decision rights, or algorithmic tools that replace experiential expertise with codified models. When the manager in charge of approving these technologies anticipates that they will shrink their own role or reduce their influence, the approval decision becomes identity-laden. These Self-Disruptive Technologies, as Narayandas and Zhang call them, trigger three forms of role-level identity threat. Role compression occurs when automation shifts core work from "deciding" to "monitoring," compressing the judgment and expertise that defines a role's distinctive contribution. Control shift happens when discretion moves away from the approving role (e.g. centralized to analytics teams or delegated to algorithms), removing the decision authority that makes roles defensible within organizations. Span erosion reflects the contraction of influence over people, budgets, or processes, undermining status and future opportunity even when the formal position remains intact.

What makes these threats particularly powerful is that they can dominate the approval calculus even when firm-level incentives favor adoption and economic cases are strong. A manufacturing supervisor might support efficiency improvements in principle but resist when the technology eliminates the judgment calls that justify their expertise. A procurement manager might delay adopting an AI tool that demonstrably reduces costs because it centralizes decisions that previously sustained their organizational influence.

Key Insight: Engineering the Solution – Identity-Compatible Advantage (ICA)

鈥淚dentity-Compatible Advantage therefore does not operate by increasing perceived value or shifting bargaining power, but by enabling approvers to say yes without identity loss.鈥 [2]

Here鈥檚 where the research gets actionable. Narayandas and Zhang propose Identity-Compatible Advantage: bundling new technology with governance and role-design mechanisms that make adoption personally and politically defensible for managers. ICA includes five complementary elements: role rechartering that redefines the role around higher-order judgment rather than routine decisions; decision guardrails that preserve authority through override rights and governance structures; analytical overlays that frame technology as augmentative rather than substitutive; redeployment pathways that provide credible commitments to role evolution rather than elimination; and executive sponsorship that legitimizes identity transition and reallocates accountability.
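None of this requires exotic tooling. As a purely illustrative sketch (ours, not the paper鈥檚), a decision guardrail can be as simple as an approval flow in which the AI supplies a default and the manager keeps a logged override right; every name, field, and threshold below is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Recommendation:
    item: str            # the decision at hand, e.g. a supplier selection
    ai_choice: str       # what the model recommends
    confidence: float    # model's self-reported confidence, 0 to 1

def resolve(rec: Recommendation, manager_choice: Optional[str] = None) -> str:
    """Return the final decision while preserving the manager's authority.

    The AI recommendation is only a default: an explicit manager choice
    always wins, and the override is logged as routine governance rather
    than flagged as a deviation, which is what keeps the role defensible.
    """
    if manager_choice is not None and manager_choice != rec.ai_choice:
        log_override(rec, manager_choice)
        return manager_choice
    return rec.ai_choice

def log_override(rec: Recommendation, choice: str) -> None:
    # Illustrative stand-in for a real governance/audit log.
    print(f"override on {rec.item!r}: AI suggested {rec.ai_choice!r}, "
          f"manager chose {choice!r} (AI confidence {rec.confidence:.2f})")

print(resolve(Recommendation("Q3 supplier", "Vendor A", 0.92), "Vendor B"))
```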

The research emphasizes that these mechanisms work as a bundle, not in isolation. For example, implementing guardrails without rechartering restores some control, since the manager can override the AI, but leaves meaning unaddressed: the AI still does the core work, so the manager鈥檚 sense of daily expertise, purpose, and contribution goes unrepaired. The framework shows that successful SDT adoption requires designing offerings where endorsement becomes personally and politically defensible.

Why This Matters

Most AI automation discourse has fixated on individual contributors like programmers, graphic designers, and copywriters because their work products are visible and the substitution story is easy to tell. This research adds a missing piece: the managers and decision-makers who control whether AI technologies get adopted in the first place are themselves facing automation of their core judgment and authority. For executives and business leaders, the implications are profound. If you treat AI adoption as a purely rational calculation, you are likely to be met with 鈥渟ymbolic adoption,鈥 where your team pays lip service to innovation while quietly ensuring that the status quo remains undisturbed. By applying Identity-Compatible Advantage, leaders can frame the complex undertaking of AI adoption as an evolution of their teams, not a replacement of them. The future of work belongs to the firms that can successfully re-anchor identities around high-level strategy, risk ownership, and the human-centric decisions that no machine can replicate.

Bonus

The path to real AI adoption runs through design choices: how you frame AI, where you keep humans in the loop, and how you protect legitimacy. For another look at the dynamics of AI in the workplace, check out Drawing the Line on AI Usage in the Workplace.

References

[1] Narayandas, Das, and Shunyuan Zhang, 鈥淪elling Self-Disruptive Technologies: Identity-Compatible Advantage and the Role-Level Microfoundations of Automation Adoption.鈥 性视界 Business School Working Paper, No. 26-050 (February 9, 2026): 5.

[2] Narayandas and Zhang, 鈥淪elling Self-Disruptive Technologies,鈥 9.

Meet the Authors

Das Narayandas is Edsel Bryant Ford Professor of Business Administration at 性视界 Business School.

Shunyuan Zhang is Associate Professor of Business Administration at 性视界 Business School. She and other HBS faculty contribute to the HBS AI Institute Frontier Firm Initiative.

The post The Manager鈥檚 AI Dilemma appeared first on 性视界 Business School AI Institute.

The Fast-Talking AI Chat Agent
New research shows when AI boosts service, and when it backfires.

Think about the last time you contacted customer support. Did you start with a chatbot? If it failed to resolve your problem, how did you feel when transferred to a human agent? This dynamic defines our expectations of the modern customer service experience: the struggle to balance the cold speed of automation with the warm necessity of human empathy. However, in 鈥淓ngaging Customers with AI in Online Chats: Evidence from a Randomized Field Experiment,鈥 性视界 Business School AI Institute Frontier Firm affiliate Shunyuan Zhang and Das Narayandas show that the results of a year-long experiment involving 138 customer service agents and over 250,000 conversations are far more complex than this typical assumption.

Key Insight: AI Assistance Isn鈥檛 Just Faster, It鈥檚 More Human

鈥淲e posit that AI enables agents to handle conversations more efficiently, thus encouraging more responses from customers, leading to deeper back-and-forth interactions between them.鈥 [1]

The prevailing fear in customer service is that introducing AI will turn human interactions into robotic, assembly-line exchanges. Yet, when agents received real-time AI-generated reply suggestions, they didn鈥檛 just respond 22% faster to customer messages; they also sent more messages and saw a measurable boost in the 鈥渉uman鈥 quality of the chats. AI freed agents from the cognitive burden of composing responses, allowing them to engage customers more deeply. The conversations became richer, not shallower: using large language models to categorize agent messages, the researchers found that AI-assisted responses scored higher in the key aspects of empathy, information, and solution, with the largest jump in empathy.
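The paper does not publish its classification prompt, so the rubric below is our own stand-in; this minimal sketch assumes an OpenAI-style chat-completions client simply to show the shape of that LLM-based tagging step.

```python
import json
from openai import OpenAI  # assumes the openai SDK is installed and an API key is configured

client = OpenAI()

RUBRIC = (
    "Score the customer-service agent message below from 0 to 1 on three "
    "dimensions: empathy (acknowledging the customer's feelings), "
    "information (relevant facts or status), and solution (a concrete fix "
    "or next step). Reply with JSON only, e.g. "
    '{"empathy": 0.8, "information": 0.5, "solution": 0.3}'
)

def score_message(agent_message: str) -> dict:
    """Ask an LLM to rate one agent message on the three study dimensions."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice, not the paper's
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": agent_message},
        ],
    )
    return json.loads(response.choices[0].message.content)

print(score_message(
    "I'm so sorry the package was late - I've reshipped it with express "
    "delivery at no extra charge."
))
```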

Key Insight: The Experience Equalizer

鈥淪pecifically, for a hypothetical brand new agent, AI would lead to a remarkable reduction in agent response time of approximately 70.3%.鈥 [2]

One of the most business-relevant findings in the study was that AI assistance didn鈥檛 benefit everyone equally. When the researchers examined how agent tenure moderated AI鈥檚 effects, they found that less-experienced agents gained far more from AI suggestions than their veteran counterparts. Essentially, the AI 鈥渄ownloaded鈥 institutional knowledge into the workflow of new employees: having access to these real-time suggestions was the functional equivalent of nearly five months of experience. This has profound implications for industries with high turnover, suggesting that AI can serve as a stabilizing bridge, ensuring that a customer鈥檚 experience doesn鈥檛 suffer just because they happened to be connected to a trainee.
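Results like the 70.3% figure come from moderation analysis: regress the outcome on AI assistance, tenure, and their interaction. Here is a minimal sketch of that logic, using simulated data in place of the study鈥檚 proprietary chat logs (all coefficients and variable names are our assumptions, not the paper鈥檚):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for agent-level chat logs: AI cuts response time,
# and the cut shrinks as tenure grows (novices benefit most).
rng = np.random.default_rng(0)
n = 5_000
tenure = rng.uniform(0, 24, n)            # months on the job
ai = rng.integers(0, 2, n)                # 1 if the agent had AI suggestions
resp_seconds = (
    60 - 1.0 * tenure - 30 * ai + 1.1 * ai * tenure + rng.normal(0, 5, n)
)
df = pd.DataFrame({"resp_seconds": resp_seconds, "ai": ai, "tenure": tenure})

# The ai:tenure interaction is the "experience equalizer" term: a large
# negative `ai` coefficient plus a positive interaction means the newest
# agents see the biggest speed-up, and the advantage fades with experience.
model = smf.ols("resp_seconds ~ ai * tenure", data=df).fit()
print(model.params.round(2))
```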

Key Insight: Not All Conversations Are Created Equal

鈥淒ifferent customer intents shape the context and dynamics of conversations, and if AI fails to adapt to these nuances, it may provide misleading suggestions, potentially harming interactions.鈥 [3]

The AI algorithm鈥檚 impact varied depending on why customers were reaching out in the first place. For example, when customers wanted to cancel subscriptions (traditionally difficult conversations), AI helped agents identify underlying reasons and recommend alternative options, leading to notable improvements in customer sentiment. But repeat complaints told a different story. Although AI helped agents respond quickly in these scenarios, customer sentiment barely improved. These complaints stemmed from systemic operational issues, like recurring delivery problems, that no amount of empathetic, information-rich messaging could solve. The AI could help agents communicate better about problems, but it couldn鈥檛 actually fix them.

Perhaps the most counterintuitive finding emerged from examining what happened in the handoff from a bot to a human agent. Many companies use a 鈥渃hatbot first鈥 approach, where a fully automated bot tries to solve the problem before transferring the customer to a human. As we鈥檝e seen, AI-assisted agents respond more quickly, but when they responded too quickly after a handoff, customers suspected that they were still talking to a bot. The response speed that might normally delight customers became a liability, triggering what the researchers term a negative 鈥渟pillover鈥 from the initial bot failure. In these contexts, the study found that increasing the delay in human responses actually helped rebuild trust and improve sentiment.
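Operationally, that finding could translate into a small pacing rule in the routing layer. A hypothetical sketch, with a delay window that is our guess rather than a parameter from the study:

```python
import random
import time

def send_reply(reply: str, escalated_from_bot: bool) -> None:
    """Deliver an agent reply, pacing it when the chat follows a bot failure.

    An instant reply right after a failed chatbot can read as "still a
    bot," so we wait a short, human-scale interval before sending.
    """
    if escalated_from_bot:
        time.sleep(random.uniform(8, 20))  # seconds of deliberate delay
    deliver(reply)

def deliver(reply: str) -> None:
    # Stand-in for the chat platform's actual send call.
    print(f"agent: {reply}")

send_reply("Thanks for your patience - let me pull up your order now.",
           escalated_from_bot=True)
```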

Why This Matters

For executives deploying AI in customer-facing operations, this research delivers three strategic imperatives. First, resist the temptation to replace human agents entirely: augmentation delivers better outcomes than automation alone, particularly for handling nuanced, emotionally charged interactions. Second, deploy AI with precision: it鈥檚 most valuable in specific conversation types, like retention scenarios. Third, manage your AI ecosystem holistically. If you鈥檙e using multiple AI systems in sequence, recognize that they鈥檙e not independent. The companies that will win with AI aren鈥檛 those that deploy the most LLMs; they鈥檙e those that understand how these systems interact across the entire customer ecosystem and adapt their implementation accordingly.

Bonus

When emotions are involved, who people think is responding can shape outcomes as much as what is said. For another angle on AI and human emotion, check out It Feels Like AI Understands, But Do We Care? New Research on Empathy.

References

[1] Zhang, Shunyuan, and Das Narayandas, 鈥淓ngaging Customers with AI in Online Chats: Evidence from a Randomized Field Experiment.鈥 Management Science 72 (1) (2025): 84.  

[2] Zhang and Narayandas, 鈥淓ngaging Customers with AI in Online Chats,鈥 84.

[3] Zhang and Narayandas, 鈥淓ngaging Customers with AI in Online Chats,鈥 75-76.

Meet the Authors

Shunyuan Zhang is Associate Professor of Business Administration at 性视界 Business School. She and other HBS faculty contribute to the HBS AI Institute Frontier Firm Initiative.

Das Narayandas is Edsel Bryant Ford Professor of Business Administration at 性视界 Business School.

The post The Fast-Talking AI Chat Agent appeared first on 性视界 Business School AI Institute.
