Generative AI and Business Technology | Harvard Business School AI Institute

The Harvard Business School AI Institute catalyzes new knowledge to invent a better future by solving ambitious challenges.

Is AI Making Your Team Lazy? (April 24, 2026)

Exploring the hidden cost of human disengagement from AI

We are rapidly entering an AI era defined by the "agentic" shift. These tools now write code, manage inboxes, conduct research, and execute multi-step workflows without a human lifting a finger. But when AI does more, what happens to the humans at the end of the line? Does the presence of a "perfect" partner actually make us better, or does it slowly erode the very skills and attention required to provide oversight? As we mark the renaming of D^3 as the HBS AI Institute this month, we're taking a look back at some of our foundational research that defines the era. In "Falling Asleep at the Wheel: Human/AI Collaboration in a Field Experiment on HR Recruiters," HBS AI Institute post-doctoral fellow Fabrizio Dell'Acqua designed a field experiment to test what happens when the quality of AI assistance advances. His findings, it turns out, have serious implications for anyone using AI or in charge of systems where humans and AI share responsibility.

Key Insight: Falling Asleep at the Wheel

"If the AI appears too high quality, workers are at risk of 'falling asleep at the wheel' and mindlessly following its recommendations without deliberation." [1]

The paper's central hypothesis begins with a simple behavioral observation: as AI quality increases, the rational incentive to exert one's own effort decreases. When a tool appears highly reliable, people may stop checking its work closely, stop gathering their own information, and stop exercising independent judgment. Dell'Acqua calls this "falling asleep at the wheel." The result is a subtle but important distinction between AI performance in isolation and human-AI performance in practice. What matters is not only how good the model is, but how people behave when using it.

Key Insight: The Counter-Intuitive Power of "Flawed" Predictions

"On average, HR recruiters receiving lower-quality AI were less likely to 'fall asleep' as they tended not to automatically select the AI-recommended candidate." [2]

To test this theory, Dell'Acqua conducted a field experiment involving 181 professional HR recruiters who were tasked with reviewing 44 resumes each for a software engineering position. The recruiters were randomly assigned different levels of AI assistance: a "Perfect" AI with approximately 99% accuracy, a "Good" AI with approximately 85% accuracy, a "Bad" AI with roughly 75% accuracy, or no AI at all. Recruiters knew which tier of AI they were working with before starting. The results were clear and striking: recruiters who collaborated with the "Bad" AI actually outperformed those with the "Good" AI. Because the "Bad" AI was clearly imperfect, the recruiters remained vigilant, spending more time on each application and verifying the AI's claims. This group effectively learned the AI's weaknesses and improved their own performance to compensate. Those with better AI moved faster and delegated more.
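The mechanism invites a quick back-of-the-envelope model. The sketch below is our own illustration, not Dell'Acqua's model or data: it assumes a recruiter who deliberates reaches a fixed 95% accuracy on their own, and that vigilance collapses once the AI looks near-perfect. Under those invented parameters, the team with the 75%-accurate AI beats the team with the 85%-accurate one, directionally echoing the experiment.

```python
# Hypothetical toy model of "falling asleep at the wheel" (not the paper's code).
# All numbers here are invented for illustration.

def combined_accuracy(ai_acc: float, human_acc: float = 0.95, vigilance: float = 1.0) -> float:
    """With probability `vigilance` the recruiter deliberates and decides at their
    own accuracy; otherwise they rubber-stamp the AI's recommendation."""
    return vigilance * human_acc + (1.0 - vigilance) * ai_acc

def vigilance_level(ai_acc: float) -> float:
    """Illustrative assumption: effort collapses once the AI seems near-perfect."""
    return 1.0 if ai_acc < 0.80 else 0.15

for label, acc in [("Bad AI", 0.75), ("Good AI", 0.85), ("Perfect AI", 0.99)]:
    v = vigilance_level(acc)
    team = combined_accuracy(acc, vigilance=v)
    print(f"{label} ({acc:.0%}): vigilance={v:.2f}, team accuracy={team:.3f}")
```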

Key Insight: The Design Implication

"Designing effective structures for human/machine collaboration requires careful consideration of the organization's objectives and task features." [3]

Dell'Acqua is careful not to recommend that organizations simply deploy older, worse AI models. The real prescription is more nuanced: design AI systems with human behavioral responses in mind, not just technical performance benchmarks. In settings where people can add value, the design of the interaction becomes a strategic variable. That might mean calibrating AI confidence displays, introducing deliberate uncertainty signals for borderline cases, or creating interfaces that prompt humans to engage before surfacing a recommendation. A system that nudges humans to stay attentive may perform better than one that invites passive approval.

Why This Matters

For executives and business leaders, the lesson here is that combined human-AI performance is its own optimization target, and it might not move in lockstep with AI accuracy improvements. Strategy in the age of AI still requires an understanding of human psychology and effort. If leaders want better outcomes, they need to think beyond technical benchmarks to workflows where their employees remain wide awake at the wheel.

Bonus

This article shows that impressive AI performance can hide important weaknesses. Here, the issue hinges on over-reliance by human collaborators, but at other times it's caused by the model itself. For example, even highly capable AI systems can still struggle with something as basic as multi-digit multiplication. For a closer look at this, check out When Giants Stumble: What Multiplication Reveals about AI's Capabilities.

References

[1] Dell'Acqua, Fabrizio, "Falling Asleep at the Wheel: Human/AI Collaboration in a Field Experiment on HR Recruiters," Working paper, Laboratory for Innovation Science, Harvard Business School (2022), 2.

[2] Dell'Acqua, "Falling Asleep at the Wheel," 3.

[3] Dell'Acqua, "Falling Asleep at the Wheel," 4.

Meet the Authors

Fabrizio Dell'Acqua is a postdoctoral researcher at Harvard Business School. His research explores how human/AI collaboration reshapes knowledge work: the impact of AI on knowledge workers, its effects on team dynamics and performance, and its broader organizational implications.

Back to the Beginnings of AI at Work (April 9, 2026)

What a Landmark AI Study Tells Us About When to Trust, and When Not to Trust, AI

In September 2023, a working paper out of Harvard Business School landed at an unusually consequential moment. Generative AI had been publicly available for less than a year, organizations were scrambling to understand its implications, and almost no rigorous field evidence existed on how it actually affected professional performance. "Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of Artificial Intelligence on Knowledge Worker Productivity and Quality" offered exactly that. Now, in March 2026, that research has been formally published in the peer-reviewed journal Organization Science. To mark this milestone, we're revisiting the study and its findings. The questions it set out to answer, what AI actually does to knowledge worker performance, where it helps, where it hurts, and why, were foundational then. They remain foundational now.

Key Insight: An Experiment Built for the Real World

"[T]hese tasks were 'very much in line with part of the daily activities of the consultants' involved." [1]

To test the impact of generative AI on high-end knowledge work, the researchers collaborated with Boston Consulting Group (BCG) on a randomized controlled trial involving 758 consultants. After establishing an individual performance baseline, participants were randomly assigned to one of three conditions: no AI access, GPT-4 access, or GPT-4 access paired with a brief prompt engineering overview. The core of the design involved testing how these professionals navigated realistic tasks simulating real-world workflows. The researchers created two kinds of consulting-style assignments. One centered on product innovation and go-to-market work, including ideation, analysis, writing, and persuasion. The other was a difficult brand strategy case that required participants to reconcile spreadsheet data with subtle clues embedded in interview notes. This design let the researchers ask not just whether AI boosts productivity in general, but whether the answer depends on the nature of the task itself.

Key Insight: AI's Capabilities Don't Follow a Smooth Line

"[W]ithin the same knowledge workflow, some tasks are beyond the frontier, whereas others remain within it, making effective AI use challenging." [2]

The paper introduces its signature jagged technological frontier concept to describe the uneven capabilities of generative AI. Tasks that appear similar in difficulty to humans might fall on opposite sides of this boundary. When a task falls inside the frontier, AI is capable of generating accurate, high-quality outputs that support human work. Conversely, when a task falls outside the frontier, AI fails or produces believable but incorrect hallucinations. In such tasks, performance still depends on human judgment, guidance, or synthesis that the AI cannot reliably provide on its own. The danger is that professionals have no obvious signal telling them which side of the line a task is on.

Key Insight: AI as a Booster and Disruptor

"[E]xperienced and incentivized knowledge professionals, engaged in tasks akin to some of their daily responsibilities, performed worse when given access to AI." [3]

For tasks inside the frontier (the innovation and market exercise), AI access produced striking improvements. Consultants using GPT-4 completed 12.2% more subtasks, worked roughly 25% faster, and delivered work that human graders rated about 32% higher in quality. But for the task outside the frontier (the brand strategy case), the results flipped sharply. The control group (no AI) answered correctly 84.5% of the time. Among consultants using AI, accuracy fell to 70.6% for those with GPT access and 60% for those with access and the prompt-engineering overview. The AI had correctly processed surface-level numerical data, but missed a critical insight buried in interview materials. Consultants who used AI tended to trust its analysis and follow it to the wrong conclusion. Over-reliance on AI output, not ignorance of the task, was the mechanism of failure.

Key Insight: The Biggest Gains Go to the Lower Half

"[T]he most significant beneficiaries of using AI are the bottom-half-skill subjects." [4]

The distribution of gains was not uniform. When the research team segmented consultants by their baseline assessment performance, they found that the largest beneficiaries of AI assistance were those in the lower half of the skill distribution. Bottom-half performers improved by 31% on the experimental task; top-half performers improved by 11%. This pattern suggests that AI can function as a meaningful equalizer within professional environments, lifting those furthest from peak performance (while still delivering meaningful gains to those at the top).

Why This Matters

For executives and leaders, this paper remains foundational because it frames AI adoption as a problem of decisions, strategy, and execution. The lesson is not that AI is universally good or dangerously flawed; it's that leaders have to understand where, in a workflow, AI strengthens performance and where it creates deceptive failures. This means training people to exercise judgment rather than outsource it, and recognizing that polished output is not the same as sound reasoning. How will you guide your team along the jagged frontier?

Bonus

The need for human oversight, the risk of overtrusting polished outputs, and the challenge of separating assessment from interpretation are tensions that run through many of the most important conversations in AI today. Seen through these lenses, the paper is not only about consulting work, but about a broader shift in how decisions get made when AI becomes part of the process. For another look at those themes in the context of screening and evaluating ideas, check out The Future of Decision-Making: How Generative AI Transforms Innovation Evaluation.

References

[1] Dell'Acqua, Fabrizio, et al., "Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of Artificial Intelligence on Knowledge Worker Productivity and Quality," Organization Science 37(2): 403-423, 405.

[2] Dell'Acqua et al., "Navigating the Jagged Technological Frontier," 404.

[3] Dell'Acqua et al., "Navigating the Jagged Technological Frontier," 419.

[4] Dell'Acqua et al., "Navigating the Jagged Technological Frontier," 410.

Meet the Authors

Fabrizio Dell'Acqua is a postdoctoral researcher at Harvard Business School. His research explores how human/AI collaboration reshapes knowledge work: the impact of AI on knowledge workers, its effects on team dynamics and performance, and its broader organizational implications.

Edward McFowland III is an Assistant Professor in the Technology and Operations Management Unit at Harvard Business School and Principal Investigator at the HBS AI Institute Data Science and AI Operations Lab hosted within the Laboratory for Innovation Science.

Ethan Mollick is an Associate Professor at the Wharton School of the University of Pennsylvania, where he studies and teaches innovation and entrepreneurship and examines the effects of artificial intelligence on work and education. He is the Co-Director of the Generative AI Lab at Wharton, which builds prototypes and conducts research to discover how AI can help humans thrive while mitigating risks.

Hila Lifshitz is a Professor of Management at Warwick Business School (WBS) and a visiting faculty member at Harvard University at the Laboratory for Innovation Science at Harvard (LISH). She heads the Artificial Intelligence Innovation Network at WBS.

Kate Kellogg is the David J. McGrath Jr Professor of Management and Innovation and a Professor of Business Administration at the MIT Sloan School of Management. Her research focuses on helping knowledge workers and organizations develop and implement Predictive and Generative AI products, on the ground in everyday work, to improve decision making, collaboration, and learning.

Saran Rajendran is Director of Strategy and Execution at Palo Alto Networks.

Lisa Krayer is a Principal at Boston Consulting Group (BCG).

Francois Candelon is Partner, Value Creation & Portfolio Monitoring, at Seven2.

Karim Lakhani is the Dorothy & Michael Hintze Professor of Business Administration at Harvard Business School. He specializes in technology management, innovation, digital transformation, and artificial intelligence. He is also the Co-Founder and Faculty Chair of the HBS AI Institute and the Founder and Co-Director of the Laboratory for Innovation Science at Harvard (LISH).

Everyone Has AI. Which Firms are Going to Win? (April 7, 2026)

New research shows that access to AI is not the same as knowing where to use it.

A firm is only as fast as the slowest step in its chain of work. In manufacturing, it might be one particular machine on the line. In software, one overloaded intake service. Many business leaders are accidentally recreating this scenario with artificial intelligence. They provision AI tools to employees and hear about localized productivity spikes, but the company's overall performance barely moves. This tension lies at the heart of the new working paper "Mapping AI into Production: A Field Experiment on Firm Performance," from co-authors at INSEAD and the Harvard Business School AI Institute. By tracking hundreds of organizations, the researchers have uncovered friction points that hold firms back from realizing the true economic promise of generative AI.

Key Insight: A Global Search for AI's Real Value

"Discovering where and how AI creates value is fundamentally a search problem." [1]

To test how companies can overcome the barrier of firm-level AI performance, the authors conducted a massive field experiment involving 515 high-growth startups spanning the globe. All participating firms received API credits, access to frontier AI models, and technical training. A randomly selected treatment group of firms also received specialized case studies highlighting how AI-native companies reorganize their production workflows, teams, and business models around the technology. Control firms attended workshops on general entrepreneurship practices. The design let the researchers hold access and technical skill constant while varying which firms gained perspective across a much wider set of organizational functions, thereby expanding their search space for AI opportunities. 

Key Insight: A Small Nudge, Outsized Results

"Treated ventures achieve faster growth without proportional increases in labor or capital, consistent with a reduction in the costs of experimentation and scaling seen in earlier technological waves." [2]

The performance effects were substantial. The treatment startups discovered 44% more AI use cases, particularly in high-leverage areas like strategy and product development. They completed 12% more tasks, became 18% more likely to land paying customers, and generated an astounding 1.9 times higher revenue compared to the control group. What makes these numbers even more fascinating is that these companies did not spend their way to growth. In fact, their demand for external capital investments actually fell by 39.5%, proving that AI enables firms to scale outputs without scaling inputs proportionally. The researchers found that these gains were heavily concentrated in the upper tail, suggesting that AI lifts the ceiling of what top ventures can achieve rather than just making struggling businesses slightly better. One startup built an end-to-end AI pipeline covering classification, compliance checking, and bid pricing without hiring any technical staff, growing from zero to $40,000 in revenue with four paying customers during the ten-week program. 

Key Insight: A Cognitive Bottleneck

"Two firms with identical tools, training, and budgets can realize very different returns if one searches more broadly across its production process for where the technology creates value." [3]

The researchers conclude that the ultimate blocker for AI gains is not the cost of technology or a lack of skills, but what they call the mapping problem: "discovering where and how AI creates value within a firm's production process." [4] Most leaders default to localized, obvious AI solutions like launching a customer service chatbot or drafting email responses. The untapped potential comes from discovering how to rethink interconnected, complementary tasks across the entire enterprise. For example, a field services startup in the study rebuilt its entire operations chain of dispatcher, bookkeeper, scheduler, and collections staff into a sequence of AI modules that self-improve, fundamentally changing the firm's cost structure. Solving the mapping problem is about overcoming cognitive constraints to see AI as a way to redraw your company's production landscape, rather than simply slapping digital band-aids on legacy processes.

Why This Matters

For business leaders and executives, this research shows that the organizations most likely to realize substantial AI-driven results are those that invest not just in technology, but in the wide-ranging process of exploring where it fits. That is a strategy and execution problem, and leaders will need to ask which parts of their organizations need redesign rather than optimization. If you don't actively push the boundaries of how AI rewrites your firm, you risk using a map that never leads you to your destination.

Bonus

What happens when the bottleneck lies in the surrounding market, rather than within your business? For example, committing too early to a single AI provider, before the technology has stabilized, risks being locked into a platform that may not be the right fit 6 months or a year from now. For a look at whether the competitive landscape will reward flexibility, check out Is GenAI Heading for a Tech Monopoly?

References

[1] Kim, Hyunjin, Dahyeon Kim, and Rembrand Koning, "Mapping AI into Production: A Field Experiment on Firm Performance," INSEAD Working Paper No. 2026/20/STR (March 2026), 2.

[2] Kim et al., "Mapping AI into Production," 4.

[3] Kim et al., "Mapping AI into Production," 6.

[4] Kim et al., "Mapping AI into Production," 2.

Meet the Authors

Hyunjin Kim is Assistant Professor of Strategy at INSEAD.

Dahyeon Kim is a PhD student in strategy at INSEAD.

Rembrand M. Koning is the Mary V. and Mark A. Stevens Associate Professor of Business Administration at Harvard Business School, and the co-director and co-founder of the Tech for All lab at the HBS AI Institute.

The Surprising Link Between AI Reasoning and Honesty (March 23, 2026)

Exploring how the complexity of large language models acts as a moral safeguard

The standard fear about advanced AI goes something like this: the more sophisticated a system becomes, the better it gets at sounding convincing, reading the room, and manipulating people. A model that can reason step-by-step might not just answer better, it might lie better. That concern feels intuitive, especially as businesses hand more customer interactions, internal workflows, and decision support to increasingly capable systems. However, in the new study "Think Before You Lie: How Reasoning Leads to Honesty," co-written by Harvard Business School AI Institute Associate Martin Wattenberg, a team of researchers found that our intuition might be backward. Through an exhaustive series of tests involving moral trade-offs and complex reasoning traces, they found that when an AI is forced to slow down and show its work, it becomes significantly more honest.

Key Insight: Testing the Moral Compass

"Each scenario is paired with two options: one favoring honesty and the other deception." [1]

To study deceptive behavior rigorously, the researchers built a new benchmark dataset called DoubleBind, a collection of social dilemmas engineered so that choosing honesty comes at a tangible, variable cost. In one scenario, a manager praises you for an analysis your colleague actually produced, so correcting the record means losing a promotion. The financial stakes shift across versions of each dilemma, allowing the researchers to observe how models respond as the price of honesty rises. They also augmented an existing dataset, DailyDilemmas, with the same cost-scaling structure. Together, the two datasets gave the team a controlled way to probe moral trade-offs across six open-weight model families. Each model was tested in two modes: token-forcing, where the model answers immediately without deliberation, and reasoning mode, where the model deliberates for a specified number of sentences before committing to a final recommendation. Models are honest roughly 80% of the time under token-forcing, though that rate erodes as the cost of telling the truth climbs.
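For readers who want to see the shape of such an evaluation, here is a minimal sketch of a DoubleBind-style harness, reconstructed from the paper's description rather than taken from the authors' code. The scenario, costs, and `ask_model` stub are all invented; in a real harness, `ask_model` would prompt an actual LLM in either mode.

```python
# Sketch of a DoubleBind-style honesty evaluation (our reconstruction, not the
# authors' code). `ask_model` is a placeholder that returns canned choices so
# the script runs; a real harness would call an LLM here.

from collections import defaultdict

# Each dilemma pairs an honest and a deceptive option at a scalable cost of honesty.
SCENARIOS = [
    {"id": "credit-theft", "honest": "Correct the record",
     "deceptive": "Accept the praise", "cost": cost}
    for cost in (1_000, 10_000, 100_000)  # hypothetical dollar costs
]

def ask_model(scenario: dict, mode: str) -> str:
    """Placeholder: prompt the model either to answer immediately
    ("token-forcing") or to deliberate first ("reasoning")."""
    return "honest" if mode == "reasoning" or scenario["cost"] < 50_000 else "deceptive"

honesty = defaultdict(list)
for mode in ("token-forcing", "reasoning"):
    for s in SCENARIOS:
        honesty[mode].append(ask_model(s, mode) == "honest")

for mode, results in honesty.items():
    print(f"{mode}: honest on {sum(results)}/{len(results)} dilemmas")
```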

Key Insight: Why Deliberation Favors the Truth

"[M]odels are significantly more likely to choose the honest option when required to reason before providing a final answer." [2]

In human psychology, the "dual-process" theory suggests that our first, intuitive impulse is often prosocial, while slow, calculated reasoning allows us to justify selfish or deceptive behavior. We might "calculate" our way into a lie. The researchers found that LLMs flip this script entirely: across all model families tested, reasoning increases the probability of an honest recommendation, and longer deliberation amplifies the effect. Additionally, it seems that the effect doesn't principally come from the reasoning text itself. If chain-of-thought were simply constructing a persuasive moral argument, then reading the reasoning should make the model's final decision easy to predict, but that is not what the researchers found. Reasoning traces frequently read like balanced surveys of the pros and cons of both options rather than arguments building toward a verdict. The decision to deceive, when it happens, tends to arrive without a legible trail. This is what the researchers call the "facsimile problem": reasoning changes behavior, but not because of what it says.

Key Insight: Deceptive Answers are Easier to Shake Loose

"We hypothesize that compared to honesty, deception is a metastable state—that is, deceptive outputs are easily destabilized." [3]

If the content of reasoning doesn't explain the honesty boost, what does? The researchers propose a theory based on the "geometry" of the model's internal states. They suggest that honesty is a stable, broad region in the AI's conceptual map, while deception is a "metastable" state, essentially a narrow, fragile peak that is easily knocked over. When a model is "thinking," it is navigating through its internal landscape. Because the honest regions of this space are larger and more "robust," the process of reasoning draws the model toward them. The researchers tested this claim several ways. By changing the wording slightly through paraphrasing, they found that deceptive answers are much more likely to flip than honest ones. By resampling the model's output, they found that initially deceptive recommendations often become honest, while honest ones usually stay put. Across these tests the asymmetry was consistent: honesty is robust, deception is fragile.
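The resampling check is straightforward to operationalize. The sketch below, with invented toy labels rather than the study's data, shows the measurement the asymmetry claim rests on: conditional flip rates for initially deceptive versus initially honest answers.

```python
# Sketch of the flip-rate asymmetry measurement (toy data, not the study's).
# Given each item's initial label and labels from repeated resamples or
# paraphrases, compare how often each initial label gets overturned.

def flip_rate(initial: list[str], resampled: list[list[str]], label: str) -> float:
    flips = total = 0
    for first, redraws in zip(initial, resampled):
        if first != label:
            continue
        flips += sum(1 for r in redraws if r != first)
        total += len(redraws)
    return flips / total if total else float("nan")

initial   = ["deceptive", "honest", "honest", "deceptive", "honest"]
resampled = [["honest", "honest", "deceptive"],   # initially deceptive: flips often
             ["honest", "honest", "honest"],
             ["honest", "honest", "honest"],
             ["honest", "deceptive", "honest"],
             ["honest", "honest", "deceptive"]]

print(f"P(flip | initially deceptive) = {flip_rate(initial, resampled, 'deceptive'):.2f}")
print(f"P(flip | initially honest)    = {flip_rate(initial, resampled, 'honest'):.2f}")
```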

Why This Matters

For business leaders, the value of this paper is not that AI can now be assumed trustworthy. Rather, it offers a more useful way to think about risk. If deceptive outputs are less stable, then system design can exploit that fact. Building deliberation into AI workflows may become an important step before interfacing with customers or making high-stakes decisions. Organizations need systems that hold up when incentives get messy, and this paper suggests that at least in some cases, more reasoning may keep AI honest when it counts.

Bonus

In another study from HBS AI Institute associates, researchers found that fine-tuning LLMs on specialized datasets generally degrades chain-of-thought reasoning performance. Faithfulness and Accuracy: How Fine-Tuning Shapes LLM Reasoning is a critical reminder that the choices made before deployment could erode the reasoning capacity you're counting on.

References

[1] Ann Yuan et al., "Think Before You Lie: How Reasoning Leads to Honesty," arXiv preprint arXiv:2603.09957 (2026): 3.

[2] Yuan et al., "Think Before You Lie," 4.

[3] Yuan et al., "Think Before You Lie," 2.

Meet the Authors

Martin Wattenberg is Gordon McKay Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences, and an Associate Collaborator at the HBS AI Institute.

Additional authors: Ann Yuan, Asma Ghandeharioun, Carter Blum, Alicia Machado, Jessica Hoffmann, Daphne Ippolito, Lucas Dixon, Katja Filippova

Why Your AI Strategy May Be Failing (March 16, 2026)

How companies can overcome the structural frictions that block AI at scale

AI has entered the enterprise faster than most previous waves of technology, reshaping expectations about speed, productivity, and decision-making. Yet adoption alone does not produce transformation. The Frontier Firm Initiative (FFI), a joint effort between the Harvard Business School AI Institute and Microsoft, recently convened senior leaders from a dozen global organizations to address the "last mile" challenge: when a company tries to scale localized, successful AI pilot programs into a standard, enterprise-wide operating model. In the new HBR article "The 'Last Mile' Problem Slowing AI Transformation," Karim R. Lakhani and Jen Stave of the HBS AI Institute and Microsoft's Jared Spataro identify a framework of the specific "frictions" stalling progress and outline a strategic blueprint to overcome them. In the insight below, we will zoom in on one friction and one corresponding recommendation from the blueprint to resolve it.

Key Insight: The Weight of What Already Exists

"[T]he primary obstacle to progress is rarely model quality or data availability, but rather the 'last mile' of transformation where technical capability must meet organizational design." [1]

When organizations try to implement AI, they often assume that any issues will arise from the AI technology itself. However, the authors find that AI actually functions as a "diagnostic tool" that exposes problematic processes already present within a firm. For example, the authors label "process debt" as the accumulation of fragmented and inconsistent workflows built up over years of embeddedness and geographic specificity. The article cites one professional-services firm operating in more than 170 countries where the "same" process had dozens of different, regional variations.

Key Insight: Designing the Organization AI Deserves

"[F]or every bottleneck discovered in the 'last mile,' a corresponding shift in this blueprint provides the path forward." [2]

One of the shifts made by organizations making real headway is replacing legacy processes. For example, what the authors call "clean-sheet redesign" starts by asking whether a given workflow would exist at all if the company were built today around AI agents. This demands equally fresh thinking about people and governance: capturing expert judgment as a codified asset rather than a protected credential, redesigning roles toward oversight and interpretation rather than execution, and managing AI agents with the same accountability structures applied to human teams.

Why This Matters

For today's business professionals and executives, the "last mile" is less a technical challenge and more a test of leadership imagination. Process debt and clean-sheet redesign are only two parts of a broader diagnosis of seven frictions and corresponding transformation strategies. Read the full article to see them all. The potential of the technology you have already purchased is immense, but realizing it requires the courage to redesign the organization to match the speed of an agentic world.

References

[1] Lakhani, Karim R., Jared Spataro, and Jen Stave, "The 'Last Mile' Problem Slowing AI Transformation," Harvard Business Review, March 9, 2026.

[2] Lakhani et al., "The 'Last Mile' Problem Slowing AI Transformation."

Meet the Authors

Karim Lakhani is the Dorothy & Michael Hintze Professor of Business Administration at Harvard Business School. He specializes in technology management, innovation, digital transformation, and artificial intelligence. He is also the Co-Founder and Faculty Chair of the HBS AI Institute and the Founder and Co-Director of the Laboratory for Innovation Science at Harvard.

Jared Spataro is Chief Marketing Officer, AI at Work, at Microsoft.

Jen Stave is Executive Director of the HBS AI Institute. She was previously Senior Vice President at Wells Fargo and has a PhD from American University.

Competing in the Dark (March 12, 2026)

New research reveals that firms are playing a game of catch-up they didn't even know they were losing

Leaders know they must innovate to survive, and they don't make decisions in a vacuum: they watch rivals, draw inferences, and position themselves accordingly. But what happens when their picture of the competitive landscape is fundamentally wrong? The new NBER working paper "The Innovation Race: Experimental Evidence on Advanced Technologies," co-written by a team of researchers including Harvard Business School AI Institute associate Zoë B. Cullen, takes this question seriously. Embedded in the Bank of Italy's long-running INVIND survey, the study ran a randomized field experiment with roughly 3,000 Italian firms to test whether correcting firms' misperceptions about competitors' technology adoption changes their own investment plans. What they found should matter to any leader thinking about AI, automation, and the pace of organizational change.

Key Insight: Racing Without a Scorecard

"These data provide a unique opportunity to measure the beliefs firms hold about their competitors' adoption decisions—and to identify their causal effects on firms' own adoption behavior." [1]

The central question driving the research is deceptively simple: are firms more likely to adopt advanced technologies like AI when they expect competitors to adopt them? This is a problem economists call "strategic complementarity": the idea that one firm's incentive to act depends partly on what others around it are doing. It's easy to theorize about, but hard to test with real firms making real decisions.

The researchers solved this through a massive field experiment. They first asked firms what share of their competitors were currently using AI or robotics, then randomly provided half of them with the actual adoption rates of peers in their specific sector and size class. By measuring how these firms updated their 2027 adoption plans after seeing the data, the researchers could cleanly identify whether knowing a rival's move actually changes your own. In the real world, if two firms adopt AI at the same time, it's hard to tell if one is copying the other or if they are both just reacting to something like a new tax break or a labor shortage. By introducing a controlled "information shock," the researchers could prove that it was the knowledge of competitor behavior itself that drove the change in strategy.
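To make the identification idea concrete, here is a small simulation with invented numbers rather than the study's data: priors understate true adoption by roughly 24.6 percentage points, treated firms close most of the revealed gap, and control firms barely move. Regressing belief revision on the revealed gap recovers the (assumed) treatment response.

```python
# Illustrative simulation of the information-shock design (toy numbers only).

import numpy as np

rng = np.random.default_rng(0)
n, truth = 1_000, 0.60                       # hypothetical true peer adoption share
# Priors sit ~24.6 pp below the truth, echoing the paper's average gap.
prior = np.clip(truth - 0.246 + rng.normal(0, 0.10, n), 0, 1)
treated = rng.random(n) < 0.5                # random assignment to the info shock

gap = truth - prior                          # what the information shock reveals
# Assumed behavior: treated firms close 70% of the gap, control firms ~5%.
revision = np.where(treated, 0.7 * gap, 0.05 * gap) + rng.normal(0, 0.02, n)

# Slope of revision on the revealed gap, by arm (rational updating => slope ~0.7).
for arm, mask in [("treated", treated), ("control", ~treated)]:
    slope = np.polyfit(gap[mask], revision[mask], 1)[0]
    print(f"{arm}: mean prior={prior[mask].mean():.2f}, update slope={slope:.2f}")
```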

Key Insight: We Are More Alone Than We Think

"On average, prior beliefs underestimated actual adoption by 24.6 pp." [2]

The research surfaced a striking baseline finding: firms are deeply mistaken about how technologically advanced their peers already are. On average, firms underestimated the share of competitors using AI or robotics by about 25 percentage points, a gap so large it suggests companies are making strategic decisions based on a competitive landscape that no longer exists. But when treated firms received accurate peer-adoption figures, they meaningfully revised their expectations upward, a response consistent with rational updating and proportional to how far off their original estimates had been.

However, one of the most fascinating findings of the study was that not all technologies are equal when it comes to peer pressure. While information about competitors significantly increased intentions to adopt robotics, it had almost no measurable effect on plans for AI. The researchers offer several explanations: (1) AI adoption was already higher at baseline (roughly half of firms planned to use it by 2027), leaving less headroom for growth. (2) Robotics is a mature, deeply embedded technology in Italian manufacturing, so firms that see rivals using it extensively are receiving a clear, legible signal. (3) AI, by contrast, is newer and often adopted experimentally, so competitive signals carry more ambiguity. Perhaps the return on investment for AI is still shrouded in uncertainty.

Key Insight: Information Campaigns

"[G]overnments could deploy information campaigns that raise firms' awareness of the productivity benefits of new technologies and the extent to which their peers are adopting them." [3]

Traditionally, governments try to spur innovation through expensive financial incentives and subsidies. However, this research points to a much cheaper and potentially more effective tool: the information campaign. If the primary reason firms aren't adopting new technology is a misperception of the competitive landscape, then simply publishing accurate, sector-specific adoption data could do more to modernize an industry than a mountain of tax breaks. The researchers note two mechanisms that may be at work: a competition channel, where firms fear falling behind rivals, and a learning channel, where they use peer behavior to infer a technology's productivity potential. Evidence from firms in concentrated markets suggests both channels are active, though neither can be fully isolated with current data.

Why This Matters

For executives and business leaders, this research surfaces a concrete and often underappreciated source of strategic risk: competitive misperception. If your organization is making AI and automation investment decisions based on an outdated view of where your industry actually stands, you may be systematically underinvesting, not because you lack capital or ambition, but because you lack accurate signals. The practical implication is that competitive intelligence on technology adoption is a direct input into investment strategy. For those thinking about the longer arc of AI diffusion, the contrast between robotics and AI results is instructive: behavioral responses to peer signals are strongest when a technology has a proven track record. As generative AI matures from experiment to infrastructure, the competitive spillovers documented here for robotics will likely be coming for AI next.

Bonus

This research shows that learning what peers are actually doing with advanced technology can shift decision-making. So here's a question worth asking: how well do you really know where GenAI adoption stands in the broader workforce? A GenAI adoption tracker built by a team including HBS AI Institute Associate David Deming offers a data-grounded answer. Drawing on five nationally representative U.S. surveys and 25,000 respondents, it tracks GenAI use at work and at home, adoption rates among working-age adults, and the productivity time savings already being realized. Consider it your lamp in the darkness.

References

[1] Cullen, Zoë B., Ester Faia, Elisa Guglielminetti, Ricardo Perez-Truglia, and Concetta Rondinelli, "The Innovation Race: Experimental Evidence on Advanced Technologies," NBER Working Paper 34532 (2025), 2.

[2] Cullen et al., "The Innovation Race," 3.

[3] Cullen et al., "The Innovation Race," 26.

Meet the Authors

Zoë B. Cullen is Associate Professor of Business Administration at Harvard Business School and Associate at the HBS AI Institute.

Ester Faia is Professor at Goethe University Frankfurt.

Elisa Guglielminetti is an Economist at the Bank of Italy.

Ricardo Perez-Truglia is a Professor at UCLA's Anderson School of Management.

Concetta Rondinelli is a Senior Economist at the Bank of Italy.

Can You Spot the Bot? (March 5, 2026)

New research reveals just how convincingly AI mimics humans

Alan Turing's original "imitation game," proposed in 1950, had an elegant simplicity: a human judge conducts a text-based conversation with two hidden parties, one human and one machine, and tries to guess which is which. Today, the question Turing posed has quietly expanded into territory he never mapped. Our digital existence is a kaleidoscope of multi-modal interactions. We don't just "talk" to the internet; we upload snapshots of our morning coffee, interpret complex visual data in professional dashboards, estimate the mood of a room through a video call, and follow subtle cues of visual attention. "Can Machines Imitate Humans?," co-written by Hanspeter Pfister, Harvard Business School AI Institute Associate and An Wang Professor of Computer Science at Harvard SEAS, explains how a new large-scale study from researchers at 15 organizations around the globe drags the imitation game into the full complexity of how humans communicate, perceive, and describe the world. Are we already past the point where we can reliably tell machines from humans, and does it matter who's doing the judging?

Key Insight: A Gauntlet of Language and Vision

"[W]e present an integrative benchmark encompassing a wide range of standard and well-established AI tasks across both language and vision." [1]

Rather than testing imitation in a single domain, the researchers designed a six-task benchmark spanning language and vision. Language tasks included image captioning, word association, and open-ended conversation. Vision tasks covered color estimation (identifying the dominant color in a scene), object detection (naming three visible items), and attention prediction (comparing human eye-tracking data with AI-generated gaze sequences). The data collection was correspondingly ambitious: 36,499 responses from 636 human participants and 37 AI models, evaluated through 72,191 Turing-like tests administered to 1,916 human judges and 10 AI judges. A subtle but important design choice: the tests were not measuring accuracy but indistinguishability. A system can be wrong and still match human patterns, or be correct and still fail to pass as a human.

Key Insight: Measurement, Not Myth

"[W]e consider Turing-like tests as a quantitative evaluation of how well current AIs can imitate humans." [2]

The Turing test has always been contentious. Critics argue that a machine could pass it without genuine understanding, that it measures performance rather than intelligence, or that it's too narrow to be meaningful. The researchers behind this study sidestep that debate entirely. Their goal isn't to adjudicate whether AI is intelligent; it's to measure something more practical: the ability of machines to convincingly replicate human behavior, including our flaws and mistakes. By using imitation detectability, which is the ability of a judge to distinguish between a person and an algorithm, the researchers create a framework for measuring how "human-like" a machine has become. That shift matters because conventional benchmarks often assume a stable ground truth and score models by how accurately they match a single "correct" answer predefined by a human. But many real tasks, especially open-ended ones, don't have a single "correct" output. And even when they do, being right isn't the same as acting human; a truly human-like AI should make the exact same mistakes we do.

Key Insight: The Vanishing Threshold of Detection

"[C]urrent algorithms are not far from being able to imitate humans in these tasks." [3]

The results of the study are both a testament to engineering prowess and a warning for the future of digital trust. Across the six tasks, the researchers found that the detectability of AI is plummeting. In a "perfect" world where humans and machines are indistinguishable, a judge would have a 50% chance of being right, essentially a coin flip. The study found that for many tasks, the detectability scores are hovering remarkably close to that 50/50 chance level. In areas like image captioning and word association, the gap has become so thin that even motivated human judges struggle to tell the difference between a person's description and an AI's. Yet the convergence is not complete: AI still occasionally trips over the uniqueness of human behavior. For instance, humans are surprisingly consistent in where they look in a picture, while machines sometimes focus on statistically likely but humanly irrelevant details.

At the same time, one of the paper's most striking results showed that AI proves highly effective in a different role: judge rather than subject. Simple machine learning classifiers trained to distinguish human from AI responses outperformed human judges on most tasks. For word associations, the AI judge correctly identified machine-generated responses 91% of the time, compared to just 47% for humans. Machines, it seems, are far better than we are at spotting each other.
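As a flavor of how such an "AI judge" can work, here is a minimal sketch using a bag-of-words classifier. The six training responses and the probes are invented, and the real study trained on far larger response sets; with this toy data, the judge keys on vocabulary overlap between the two styles.

```python
# Sketch of an "AI judge": a simple text classifier separating human-style from
# AI-style word-association responses. All example text here is invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

responses = ["dog leash walk park",  "canine companion domesticated animal",
             "coffee mug morning",   "caffeinated beverage consumed widely",
             "rain boots puddle",    "precipitation meteorological phenomenon"]
labels = ["human", "ai", "human", "ai", "human", "ai"]

judge = make_pipeline(TfidfVectorizer(), LogisticRegression())
judge.fit(responses, labels)

# Probes reuse vocabulary from each style: stilted, encyclopedic phrasing is
# the giveaway this toy judge learns to key on.
print(judge.predict(["domesticated companion animal", "dog park morning walk"]))
```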

Why This Matters

For executives and business leaders, this research redraws the risk landscape in two directions. First, the near invisibility of AI responses in everyday tasks means fraud, disinformation, and impersonation are no longer theoretical risks; they are statistically plausible at scale, today. Second, because automated classifiers outperform human judges, detection can no longer rely on human vigilance alone. It requires infrastructure, and regulators in the EU and elsewhere are already moving toward mandatory AI disclosure requirements. This paper highlights the importance of building transparency tools now, both to be prepared for when they are required and to ensure you can maintain your customers' trust.

Bonus

As AI systems get more capable, they're also getting harder to understand. Another response to this challenge is to build clearer explanations for why models behave the way they do within a single, coherent framework. To go deeper on this initiative, check out "Unifying AI Attribution: A New Frontier in Understanding Complex Systems."

References

[1] Mengmi Zhang et al., "Can Machines Imitate Humans? Integrative Turing-like tests for Language and Vision Demonstrate a Narrowing Gap," arXiv preprint arXiv:2211.13087v3 (2025): 3.

[2] Zhang et al., "Can Machines Imitate Humans?": 2.

[3] Zhang et al., "Can Machines Imitate Humans?": 16.

Meet the Authors

Hanspeter Pfister is An Wang Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences and an HBS AI Institute Associate.

Additional Authors: Mengmi Zhang, Elisa Pavarino, Xiao Liu, Giorgia Dellaferrera, Ankur Sikarwar, Caishun Chen, Marcelo Armendariz, Noga Mudrik, Prachi Agrawal, Spandan Madan, Mranmay Shetty, Andrei Barbu, Haochen Yang, Tanishq Kumar, Shui'Er Han, Aman Raj Singh, Meghna Sadwani, Stella Dellaferrera, Michele Pizzochero, Brandon Tang, Yew Soon Ong, Gabriel Kreiman

The AI Deep Research Race Has a New Leaderboard (February 26, 2026)

A new cross-domain benchmark reveals how the leading AI research tools perform on real-world production tasks

Two AI-generated research reports land on your desk before a major decision. Both are polished, confidently written, and well-structured, but they reach different conclusions. Which one do you trust, and how would you even begin to find out? In "DRACO: a Cross-Domain Benchmark for Deep Research Accuracy, Completeness, and Objectivity," a team at Perplexity and Jeremy Yang, Assistant Professor of Business Administration at Harvard Business School and affiliate of the Harvard Business School AI Institute, present a rigorous new benchmark for measuring how well AI deep research systems actually perform on real-world production tasks.

Key Insight: A New Standard for Deep Research Evaluation

"We introduce a cross-domain benchmark derived from real-world production deep research tasks designed to bridge the gap between AI evaluations and authentic research needs." [1]

AI "deep research" systems, tools that can autonomously decompose a complex question, search hundreds of sources, reconcile conflicting evidence, and synthesize findings into a cited report, are increasingly being used for high-stakes analytical work in areas such as finance, legal, and medicine. Unlike a simple chatbot response, these systems operate more like an analyst running an independent research process. While this technology has been advancing quickly, the frameworks for evaluating it have not kept pace. The authors argue that evaluating deep research must reflect realistic use cases, span domains, account for region-specific sources, and probe multiple system capabilities such as planning, search, and reasoning all at once.

Key Insight: Tasks Deeply Rooted in Practice

"Our main contribution is a curated set of benchmark tasks that closely mirror real deep research needs and how people use deep research agents in practice." [2]

Many AI benchmarks are built by researchers and experts imagining what hard questions look like. DRACO takes a different approach: its 100 tasks were sourced directly from actual user queries submitted to Perplexity's deep research system in fall 2025. Specifically, researchers sampled from high-difficulty requests where users had expressed dissatisfaction, making these exactly the kinds of tasks where AI systems tend to struggle. Those raw queries were then anonymized, augmented to add specificity and scope, and filtered to ensure each task was objectively evaluable, appropriately bounded, and genuinely challenging. The results span 10 domains drawing on sources from 40 countries across five regions.

Key Insight: Rating Real-World Complexity

"Twenty-six domain experts, including medical professionals, attorneys, financial analysts, software engineers, and designers, were recruited to develop rubrics for selected tasks." [3]

DRACO's grading rubrics were developed through a rigorous human-expert pipeline: an initial rubric is drafted by one expert, reviewed and refined by a second, subjected to a "saturation test" to ensure the current system cannot easily exceed 90% (which would indicate an overly easy task or lenient rubric), and finally validated by a third and fourth expert for quality assurance. Each task was ultimately assessed across an average of 39 criteria spanning four dimensions: factual accuracy, breadth and depth of analysis, presentation quality, and citation quality.
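A simplified scoring sketch makes the structure concrete. Everything below, including the criteria, weights, and report, is invented for illustration; DRACO's actual rubrics average 39 expert-written criteria per task, and we are only guessing at how weights might be normalized.

```python
# Sketch of DRACO-style rubric scoring under stated assumptions: weighted
# criteria grouped by dimension, with negative weights penalizing failure
# modes. Criteria and weights are hypothetical.

RUBRIC = {
    "factual_accuracy":  [("states correct FDA approval year", 3.0),
                          ("recommends an unsafe dosage", -8.0)],  # heavy penalty
    "breadth_and_depth": [("covers at least three jurisdictions", 2.0)],
    "presentation":      [("includes an executive summary", 1.0)],
    "citation_quality":  [("every claim cites a primary source", 2.0)],
}

def score_report(met: set[str]) -> float:
    """Normalize earned weight by the maximum attainable positive weight."""
    earned = sum(w for crit in RUBRIC.values() for c, w in crit if c in met)
    max_pos = sum(w for crit in RUBRIC.values() for _, w in crit if w > 0)
    return max(0.0, 100.0 * earned / max_pos)

report = {"states correct FDA approval year", "includes an executive summary",
          "recommends an unsafe dosage"}  # the negative criterion wipes out the gains
print(f"score: {score_report(report):.1f}/100")
```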

Key Insight: Progress, But Gaps Remain

"Our evaluation of frontier deep research systems reveals that while significant progress has been made (especially in presentation quality), substantial headroom remains (especially in factual accuracy)." [4]

The evaluation results indicate that while agents have improved across all rubric dimensions, and now excel in presentation quality, they continue to struggle with factual accuracy. This may partly stem from design choices: roughly half of all criteria focused on verifiable factual claims, and the rubrics also included negative criteria penalizing specific failure modes. In domains like medicine and law, these penalties are particularly severe, as incorrect or unsafe recommendations carry heavy negative weights. This reflects a core design principle: in high-stakes domains, what AI gets wrong matters as much as what it gets right.

Why This Matters

As we increasingly rely on AI for high-stakes tasks, from brainstorming and research to actual execution, the bottleneck is no longer speed; it's accuracy. The area where AI performs best, producing polished, well-structured output, is precisely where it's hardest for a non-specialist to detect errors. For business leaders, DRACO's task-and-rubric design offers a concrete blueprint for evaluating and choosing research agents: define success criteria, test on representative workloads, and be sure to clarify how you'll know when it's wrong.

Bonus

While it seems self-evident that we want the best and most accurate information from AI, that's actually not always the case. Check out "Explanations on Mute: Why We Turn Away From Explainable AI" to see why.

References

[1] Joey Zhong et al., "DRACO: a Cross-Domain Benchmark for Deep Research Accuracy, Completeness, and Objectivity," arXiv preprint arXiv:2602.11685 (2026): 2.

[2] Zhong et al., "DRACO": 2.

[3] Zhong et al., "DRACO": 5.

[4] Zhong et al., "DRACO": 12.

Meet the Authors

Jeremy Yang is an Assistant Professor of Business Administration at Harvard Business School and affiliated with the HBS AI Institute.

Additional Authors (Perplexity): Joey Zhong, Hao Zhang, Clare Southern, Thomas Wang, Kate Jung, Shu Zhang, Denis Yarats, Johnny Ho, Jerry Ma

The Manager's AI Dilemma (February 17, 2026)

How to design AI adoption so decision makers can say yes without self-sabotage

Lots of organizations can green-light AI. Far fewer can absorb it. That gap, between excitement and real, embedded use, keeps showing up even when ROI is compelling and leadership is visibly supportive. New research from Harvard Business School AI Institute Frontier Firm affiliate Shunyuan Zhang and Das Narayandas reveals an uncomfortable idea contributing to this gap. In "Selling Self-Disruptive Technologies: Identity-Compatible Advantage and the Role-Level Microfoundations of Automation Adoption," they highlight that the very people who must approve and champion these technologies are the same ones whose jobs could be fundamentally threatened by them.

Key Insight: The Three Threats of Self-Disruptive Technologies (SDTs)

"We define SDTs as innovations that simultaneously (1) improve organizational performance and (2) erode the authority, discretion, or legitimacy of the role responsible for approving them." [1]

Traditional adoption theories typically focus on whether organizations are ready, whether the technology is useful, and whether there's institutional pressure to adopt. But these frameworks miss something critical: they assume decision-makers are neutral agents acting on behalf of the firm. Now, add to the mix AI systems with the potential to automate managerial judgment, analytics platforms that centralize decision rights, or algorithmic tools that replace experiential expertise with codified models. When the manager in charge of approving these technologies anticipates that they will shrink their own role or reduce their influence, the approval decision becomes identity-laden. These Self-Disruptive Technologies, as Narayandas and Zhang call them, trigger three forms of role-level identity threat. Role compression occurs when automation shifts core work from "deciding" to "monitoring," compressing the judgment and expertise that defines a role's distinctive contribution. Control shift happens when discretion moves away from the approving role (e.g. centralized to analytics teams or delegated to algorithms), removing the decision authority that makes roles defensible within organizations. Span erosion reflects the contraction of influence over people, budgets, or processes, undermining status and future opportunity even when the formal position remains intact.

What makes these threats particularly powerful is that they can dominate the approval calculus even when firm-level incentives favor adoption and economic cases are strong. A manufacturing supervisor might support efficiency improvements in principle but resist when the technology eliminates the judgment calls that justify their expertise. A procurement manager might delay adopting an AI tool that demonstrably reduces costs because it centralizes decisions that previously sustained their organizational influence.

Key Insight: Engineering the Solution – Identity-Compatible Advantage (ICA)

鈥淚dentity-Compatible Advantage therefore does not operate by increasing perceived value or shifting bargaining power, but by enabling approvers to say yes without identity loss.鈥 [2]

Here鈥檚 where the research gets actionable. Narayandas and Zhang propose Identity-Compatible Advantage: bundling new technology with governance and role-design mechanisms that make adoption personally and politically defensible for managers. ICA includes five complementary elements: role rechartering that redefines the role around higher-order judgment rather than routine decisions; decision guardrails that preserve authority through override rights and governance structures; analytical overlays that frame technology as augmentative rather than substitutive; redeployment pathways that provide credible commitments to role evolution rather than elimination; and executive sponsorship that legitimizes identity transition and reallocates accountability.
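None of this requires exotic tooling. As a purely illustrative sketch (ours, not the paper鈥檚), a decision guardrail can be as simple as an approval flow in which the AI supplies a default and the manager keeps a logged override right; every name, field, and threshold below is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Recommendation:
    item: str            # the decision at hand, e.g. a supplier selection
    ai_choice: str       # what the model recommends
    confidence: float    # model's self-reported confidence, 0 to 1

def resolve(rec: Recommendation, manager_choice: Optional[str] = None) -> str:
    """Return the final decision while preserving the manager's authority.

    The AI recommendation is only a default: an explicit manager choice
    always wins, and the override is logged as routine governance rather
    than flagged as a deviation, which is what keeps the role defensible.
    """
    if manager_choice is not None and manager_choice != rec.ai_choice:
        log_override(rec, manager_choice)
        return manager_choice
    return rec.ai_choice

def log_override(rec: Recommendation, choice: str) -> None:
    # Illustrative stand-in for a real governance/audit log.
    print(f"override on {rec.item!r}: AI suggested {rec.ai_choice!r}, "
          f"manager chose {choice!r} (AI confidence {rec.confidence:.2f})")

print(resolve(Recommendation("Q3 supplier", "Vendor A", 0.92), "Vendor B"))
```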

The research emphasizes that these mechanisms work as a bundle, not in isolation. For example, implementing guardrails without rechartering restores some control, since the manager can override the AI, but leaves meaning unaddressed: the AI still does the core work, so the manager鈥檚 sense of daily expertise, purpose, and contribution goes unrepaired. The framework shows that successful SDT adoption requires designing offerings where endorsement becomes personally and politically defensible.

Why This Matters

Most AI automation discourse has fixated on individual contributors like programmers, graphic designers, and copywriters because their work products are visible and the substitution story is easy to tell. This research adds a missing piece: the managers and decision-makers who control whether AI technologies get adopted in the first place are themselves facing automation of their core judgment and authority. For executives and business leaders, the implications are profound. If you treat AI adoption as a purely rational calculation, you are likely to be met with 鈥渟ymbolic adoption,鈥 where your team pays lip service to innovation while quietly ensuring that the status quo remains undisturbed. By applying Identity-Compatible Advantage, leaders can frame the complex undertaking of AI adoption as an evolution of their teams, not a replacement of them. The future of work belongs to the firms that can successfully re-anchor identities around high-level strategy, risk ownership, and the human-centric decisions that no machine can replicate.

Bonus

The path to real AI adoption runs through design choices: how you frame AI, where you keep humans in the loop, and how you protect legitimacy. For another look at the dynamics of AI in the workplace, check out Drawing the Line on AI Usage in the Workplace.

References

[1] Narayandas, Das, and Shunyuan Zhang, 鈥淪elling Self-Disruptive Technologies: Identity-Compatible Advantage and the Role-Level Microfoundations of Automation Adoption.鈥 性视界 Business School Working Paper, No. 26-050 (February 9, 2026): 5.

[2] Narayandas and Zhang, 鈥淪elling Self-Disruptive Technologies,鈥 9.

Meet the Authors

Das Narayandas is Edsel Bryant Ford Professor of Business Administration at 性视界 Business School.

Shunyuan Zhang is Associate Professor of Business Administration at 性视界 Business School. She and other HBS faculty contribute to the HBS AI Institute Frontier Firm Initiative.

The post The Manager鈥檚 AI Dilemma appeared first on 性视界 Business School AI Institute.

The Fast-Talking AI Chat Agent
New research shows when AI boosts service, and when it backfires.

Think about the last time you contacted customer support. Did you start with a chatbot? If it failed to resolve your problem, how did you feel when transferred to a human agent? This dynamic defines our expectations of the modern customer service experience: the struggle to balance the cold speed of automation with the warm necessity of human empathy. However, in 鈥淓ngaging Customers with AI in Online Chats: Evidence from a Randomized Field Experiment,鈥 性视界 Business School AI Institute Frontier Firm affiliate Shunyuan Zhang and Das Narayandas show that the results of a year-long experiment involving 138 customer service agents and over 250,000 conversations are far more complex than this typical assumption.

Key Insight: AI Assistance Isn鈥檛 Just Faster, It鈥檚 More Human

鈥淲e posit that AI enables agents to handle conversations more efficiently, thus encouraging more responses from customers, leading to deeper back-and-forth interactions between them.鈥 [1]

The prevailing fear in customer service is that introducing AI will turn human interactions into robotic, assembly-line exchanges. Yet, when agents received real-time AI-generated reply suggestions, they didn鈥檛 just respond 22% faster to customer messages; they also sent more messages and saw a measurable boost in the 鈥渉uman鈥 quality of the chats. AI freed agents from the cognitive burden of composing responses, allowing them to engage customers more deeply. The conversations became richer, not shallower: using large language models to categorize agent messages, the researchers found that AI-assisted responses scored higher in the key aspects of empathy, information, and solution, with the largest jump in empathy.
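The paper does not publish its classification prompt, so the rubric below is our own stand-in; this minimal sketch assumes an OpenAI-style chat-completions client simply to show the shape of that LLM-based tagging step.

```python
import json
from openai import OpenAI  # assumes the openai SDK is installed and an API key is configured

client = OpenAI()

RUBRIC = (
    "Score the customer-service agent message below from 0 to 1 on three "
    "dimensions: empathy (acknowledging the customer's feelings), "
    "information (relevant facts or status), and solution (a concrete fix "
    "or next step). Reply with JSON only, e.g. "
    '{"empathy": 0.8, "information": 0.5, "solution": 0.3}'
)

def score_message(agent_message: str) -> dict:
    """Ask an LLM to rate one agent message on the three study dimensions."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice, not the paper's
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": agent_message},
        ],
    )
    return json.loads(response.choices[0].message.content)

print(score_message(
    "I'm so sorry the package was late - I've reshipped it with express "
    "delivery at no extra charge."
))
```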

Key Insight: The Experience Equalizer

鈥淪pecifically, for a hypothetical brand new agent, AI would lead to a remarkable reduction in agent response time of approximately 70.3%.鈥 [2]

One of the most business-relevant findings in the study was that AI assistance didn鈥檛 benefit everyone equally. When the researchers examined how agent tenure moderated AI鈥檚 effects, they found that less-experienced agents gained far more from AI suggestions than their veteran counterparts. Essentially, the AI 鈥渄ownloaded鈥 institutional knowledge into the workflow of new employees: having access to these real-time suggestions was the functional equivalent of nearly five months of experience. This has profound implications for industries with high turnover, suggesting that AI can serve as a stabilizing bridge, ensuring that a customer鈥檚 experience doesn鈥檛 suffer just because they happened to be connected to a trainee.
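Results like the 70.3% figure come from moderation analysis: regress the outcome on AI assistance, tenure, and their interaction. Here is a minimal sketch of that logic, using simulated data in place of the study鈥檚 proprietary chat logs (all coefficients and variable names are our assumptions, not the paper鈥檚):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for agent-level chat logs: AI cuts response time,
# and the cut shrinks as tenure grows (novices benefit most).
rng = np.random.default_rng(0)
n = 5_000
tenure = rng.uniform(0, 24, n)            # months on the job
ai = rng.integers(0, 2, n)                # 1 if the agent had AI suggestions
resp_seconds = (
    60 - 1.0 * tenure - 30 * ai + 1.1 * ai * tenure + rng.normal(0, 5, n)
)
df = pd.DataFrame({"resp_seconds": resp_seconds, "ai": ai, "tenure": tenure})

# The ai:tenure interaction is the "experience equalizer" term: a large
# negative `ai` coefficient plus a positive interaction means the newest
# agents see the biggest speed-up, and the advantage fades with experience.
model = smf.ols("resp_seconds ~ ai * tenure", data=df).fit()
print(model.params.round(2))
```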

Key Insight: Not All Conversations Are Created Equal

鈥淒ifferent customer intents shape the context and dynamics of conversations, and if AI fails to adapt to these nuances, it may provide misleading suggestions, potentially harming interactions.鈥 [3]

The AI algorithm鈥檚 impact varied depending on why customers were reaching out in the first place. For example, when customers wanted to cancel subscriptions (traditionally difficult conversations), AI helped agents identify underlying reasons and recommend alternative options, leading to notable improvements in customer sentiment. But repeat complaints told a different story. Although AI helped agents respond quickly in these scenarios, customer sentiment barely improved. These complaints stemmed from systemic operational issues, like recurring delivery problems, that no amount of empathetic, information-rich messaging could solve. The AI could help agents communicate better about problems, but it couldn鈥檛 actually fix them.

Perhaps the most counterintuitive finding emerged from examining what happened in the handoff from a bot to a human agent. Many companies use a 鈥渃hatbot first鈥 approach, where a fully automated bot tries to solve the problem before transferring the customer to a human. As we鈥檝e seen, AI-assisted agents respond more quickly, but when they responded too quickly after a handoff, customers suspected that they were still talking to a bot. The response speed that might normally delight customers became a liability, triggering what the researchers term a negative 鈥渟pillover鈥 from the initial bot failure. In these contexts, the study found that increasing the delay in human responses actually helped rebuild trust and improve sentiment.
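Operationally, that finding could translate into a small pacing rule in the routing layer. A hypothetical sketch, with a delay window that is our guess rather than a parameter from the study:

```python
import random
import time

def send_reply(reply: str, escalated_from_bot: bool) -> None:
    """Deliver an agent reply, pacing it when the chat follows a bot failure.

    An instant reply right after a failed chatbot can read as "still a
    bot," so we wait a short, human-scale interval before sending.
    """
    if escalated_from_bot:
        time.sleep(random.uniform(8, 20))  # seconds of deliberate delay
    deliver(reply)

def deliver(reply: str) -> None:
    # Stand-in for the chat platform's actual send call.
    print(f"agent: {reply}")

send_reply("Thanks for your patience - let me pull up your order now.",
           escalated_from_bot=True)
```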

Why This Matters

For executives deploying AI in customer-facing operations, this research delivers three strategic imperatives. First, resist the temptation to replace human agents entirely: augmentation delivers better outcomes than automation alone, particularly for handling nuanced, emotionally charged interactions. Second, deploy AI with precision: it鈥檚 most valuable in specific conversation types, like retention scenarios. Third, manage your AI ecosystem holistically. If you鈥檙e using multiple AI systems in sequence, recognize that they鈥檙e not independent. The companies that will win with AI aren鈥檛 those that deploy the most LLMs; they鈥檙e those that understand how these systems interact across the entire customer ecosystem and adapt their implementation accordingly.

Bonus

When emotions are involved, who people think is responding can shape outcomes as much as what is said. For another angle on AI and human emotion, check out It Feels Like AI Understands, But Do We Care? New Research on Empathy.

References

[1] Zhang, Shunyuan, and Das Narayandas, 鈥淓ngaging Customers with AI in Online Chats: Evidence from a Randomized Field Experiment.鈥 Management Science 72 (1) (2025): 84.  

[2] Zhang and Narayandas, 鈥淓ngaging Customers with AI in Online Chats,鈥 84.

[3] Zhang and Narayandas, 鈥淓ngaging Customers with AI in Online Chats,鈥 75-76.

Meet the Authors

Shunyuan Zhang is Associate Professor of Business Administration at 性视界 Business School. She and other HBS faculty contribute to the HBS AI Institute Frontier Firm Initiative.

Das Narayandas is Edsel Bryant Ford Professor of Business Administration at 性视界 Business School.

The post The Fast-Talking AI Chat Agent appeared first on 性视界 Business School AI Institute.
