New Operating Models and Business Models Archives | Harvard Business School AI Institute

The Harvard Business School AI Institute catalyzes new knowledge to invent a better future by solving ambitious challenges.

When Software Becomes Staff (August 25, 2025)

If AI can accept light supervision and then be off and running, what does it mean for how leaders and organizations design work, govern risk, and account for value? Drawing on perspectives from Jen Stave, Executive Director of the Digital Data Design (D^3) Institute at Harvard, Columbia Business School’s Stephan Meier, and Salesforce CEO Marc Benioff, a recent New York Times Shop Talk article briefly explores the rise and implications of AI agents that can act like teammates or supervisees.

Key Insight: Agentic AI as Managed Teammates

“Like a human employee, these tools would work independently with a bit of management.”

Jen Stave

Agentic tools are moving beyond chatbots and image generation. Unlike traditional automation that follows rigid scripts, AI agents function more like human employees: capable of independent decision-making after being given high-level goals and objectives.
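
To make that contrast concrete, a minimal sketch of such an agent loop appears below. Everything in it (the call_llm stub, the two tools) is a hypothetical placeholder rather than any real product API; the point is simply that the agent receives a high-level goal and a step budget instead of a fixed script.

    def call_llm(prompt: str) -> str:
        """Placeholder for any LLM client; swap in a real call."""
        return "DONE: placeholder response"

    TOOLS = {
        "search_orders": lambda arg: f"orders matching {arg!r}",
        "send_refund": lambda arg: f"refund issued for {arg!r}",
    }

    def run_agent(goal: str, max_steps: int = 10) -> list[str]:
        """Goal in, self-chosen actions out: supervision is just a step budget."""
        history: list[str] = []
        for _ in range(max_steps):
            action = call_llm(f"Goal: {goal}\nHistory: {history}\nNext action?")
            if action.startswith("DONE"):
                break
            tool, _, arg = action.partition(":")
            result = TOOLS.get(tool, lambda a: f"unknown tool {tool!r}")(arg)
            history.append(f"{action} -> {result}")
        return history

    print(run_agent("Resolve open refund requests"))  # [] with the stub above

A rigid automation would enumerate every step in advance; here the script only bounds how long the agent may run before a human checks in.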

Key Insight: An Uncertain Future

“How the fruits of digital labor will be treated in economic terms is still unsettled.”

Jen Stave

On one hand, the impact of AI is already here and being measured: at Salesforce, for example, the use of AI agents led to a 17% reduction in customer service costs over nine months. But the article also raises a range of undecided questions related to economic capture, quality and accountability, and the right balance between human and AI workers.

Why This Matters

For forward-thinking executives, the question increasingly isn’t whether to adopt agentic AI, but how to operationalize it productively and responsibly. While the efficiency gains are compelling, success requires thoughtful integration by leaders who are ready to address challenges of workforce transition, quality control, and ROI measurement.

Bonus

To read more about Agentic AI and digital labor, see the Harvard Business Review article on the subject co-authored by Jen Stave.

Mastering Change Resilience: The Key to AI-Driven Success (August 5, 2025)

The disconnect between AI’s transformative potential and the actual scale of implementation represents one of today’s most significant organizational challenges. In their new article for the Harvard Business Review, “A Guide to Building Change Resilience in the Age of AI,” Karim Lakhani, Dorothy and Michael Hintze Professor of Business Administration at Harvard Business School and faculty chair and co-founder of the Digital Data Design (D^3) Institute at Harvard; Jen Stave, executive director of the Digital Data Design (D^3) Institute at Harvard; Douglas Ng, Director of Design at the Digital Data Design (D^3) Institute at Harvard; and a managing director at BCG X argue that this mismatch arises from structural issues and propose change resilience as a systematic approach to building the organizational capabilities necessary for AI success.

Key Insight: The Missing Ingredient

“The primary obstacle is the ability of companies to adapt, reinvent, and scale new ways of working. We call this change resilience.” [1]

In the fast-paced business environment created by AI, leaders are no longer able to apply traditional operating models to episodic development cycles. Previously, as Lakhani and his co-authors suggest, “You modernized your systems, trained your people, and operated in a stable environment until the next wave of disruption hit.” [2] However, if your old approach is falling short in today’s environment and you’re feeling left behind, you aren’t alone: the results of a BCG survey discussed in the article report that “just 26% of organizations have achieved value from AI.” [3] Responding to both the challenges and opportunities AI presents, the authors call for a fundamental shift: companies must move beyond simply managing AI-driven change and instead embed AI as a core organizational competency through the continuous and comprehensive strategy of “change resilience.”

Key Insight: The Mindset

Sensing – Rewiring – Lock-In

Change resilience, according to the authors, is made up of three “muscles” working in concert to create a sustainable AI ecosystem. Sensing enables organizations “to pick up weak technological, competitive, or societal signals early.” Rewiring is “the capacity to redeploy talent, data, capital, and decision rights in days or weeks, not fiscal quarters.” Lock-In is “the discipline to codify what a team learns (in process, code, or policy) so the next initiative starts from a higher baseline instead of reinventing the wheel.” [3] The authors describe Shopify as a company that exemplifies these characteristics, as it constantly evolves rather than adding AI to old systems. As one example, in 2023, Shopify spun off its logistics arm to concentrate on product innovation, enabling rapid development of AI-native tools like Sidekick for entrepreneurs.

Key Insight: The Playbook

Learn – Do – Imagine – Act – Care

Lakhani and his co-authors break down change resilience into five components: Learn, Do, Imagine, Act, and Care. Learning involves widespread AI experimentation to shift attitudes, empower employees, and discover opportunities to take advantage of AI. Doing targets deficiencies with fast-paced AI initiatives. Imagining puts your entire organization up for discussion, challenging you to invent new operating models instead of duct-taping existing ones. Acting makes these cycles continuous in order to establish change resilience as a foundational strategy rather than a one-off solution. Finally, Caring emphasizes wellbeing measures to ensure that employees feel supported and avoid burnout. The article discusses Accenture, Singapore-based DBS Bank, Moderna, P&G, and Cisco as already leading the pack by incorporating these elements into their strategy and operations.

Why This Matters

For executives and business professionals, developing change resilience represents a crucial strategic priority for competing effectively in the AI era. By focusing on the three muscles and five steps, leaders can position their companies to leverage AI and adapt to future technological advances. The companies already achieving breakthrough AI results share a common strategy: they invest in their organization’s capacity to change as aggressively as they invest in AI technology itself.

If you’re wondering how change resilient your organization is, “A Guide to Building Change Resilience in the Age of AI” also includes a set of questions that can act as a litmus test.

References

[1] Karim Lakhani et al., “A Guide to Building Change Resilience in the Age of AI,” Harvard Business Review, July 29, 2025.

[2] Lakhani et al., “A Guide to Building Change Resilience in the Age of AI.”

[3] Lakhani et al., “A Guide to Building Change Resilience in the Age of AI.”

Meet the Authors

Karim Lakhani is the Dorothy & Michael Hintze Professor of Business Administration at Harvard Business School. He specializes in technology management, innovation, digital transformation, and artificial intelligence. He is also the Co-Founder and Faculty Chair of the Digital Data Design (D^3) Institute at Harvard and the Founder and Co-Director of the Laboratory for Innovation Science at Harvard.

Jen Stave is Executive Director of the Digital Data Design (D^3) Institute at Harvard. She was previously Senior Vice President at Wells Fargo, and has a PhD from American University.

Douglas Ng is Director of Design at the Digital Data Design (D^3) Institute at Harvard. As a digital strategist, technology educator, and innovation researcher, he specializes in AI transformation and translates the institute’s research for industry leaders.

The fourth co-author is a Managing Director with BCG X, where he specializes in Generative AI, AI platform engineering, and data management.

Teaching Trust: How Small AI Models Can Make Larger Systems More Reliable (July 3, 2025)

As Gen AI technology continues to rapidly evolve and LLMs are integrated into more and more applications, questions of trustworthiness and ethical alignment become increasingly crucial. In the recent study “Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models,” authors Martin Pawelczyk, postdoctoral researcher at Harvard working on trustworthy AI; Lillian Sun, undergraduate student at Harvard studying computer science; two co-authors who are, respectively, a PhD student in computer science at Harvard and a postdoctoral research associate at Harvard working on trustworthy AI; and Himabindu Lakkaraju, Assistant Professor of Business Administration at Harvard Business School and PI in D^3’s Trustworthy AI Lab, explore a novel concept: the ability to transfer and enhance trustworthiness properties from smaller, weaker AI models to larger, more powerful ones.

Key Insight: The Three Pillars of AI Trustworthiness

“Trustworthiness encompasses properties such as fairness (avoiding biases against certain groups), privacy (protecting sensitive information), and robustness (maintaining performance under adversarial conditions or distribution shifts).” [1]

The holistic conceptualization taken by the authors in this paper recognizes that, for LLMs to be truly trustworthy, they must excel across multiple domains simultaneously. The researchers tested and demonstrated these principles using real-world datasets, including the Adult dataset, based on 1994 U.S. Census data, where they evaluated fairness by examining whether AI predictions of income varied based on gender attributes. Their privacy assessments used the Enron email dataset, containing over 600,000 emails with sensitive personal information including credit card numbers and Social Security numbers. For robustness, they used the OOD Style Transfer dataset, which incorporates text transformations, and the AdvGLUE++ dataset, which includes adversarial examples for widely used Natural Language Processing (NLP) tasks.
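
To make the fairness test concrete, a minimal sketch of the kind of check run on Adult-style data appears below: compare the model’s positive-prediction rate (income above $50K) across gender groups. This is our illustration of the general idea, not the paper’s code.

    def demographic_parity_gap(predictions: list[int], genders: list[str]) -> float:
        """Absolute gap in positive-prediction rates between two groups."""
        rates = {}
        for group in set(genders):
            idx = [i for i, g in enumerate(genders) if g == group]
            rates[group] = sum(predictions[i] for i in idx) / len(idx)
        a, b = rates.values()  # assumes exactly two groups
        return abs(a - b)

    preds   = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = predicted income above $50K
    genders = ["M", "M", "M", "F", "F", "F", "M", "F"]
    print(demographic_parity_gap(preds, genders))  # |0.75 - 0.25| = 0.5

A gap of zero means the model predicts high income at the same rate for both groups; the paper’s fairness results can be read as reductions in gaps of this kind.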

Key Insight: Utilizing Novel Fine-Tuning Strategies

“This is the first work to investigate if trustworthiness properties can transfer from a weak to a strong model using weak-to-strong supervision, a process we term weak-to-strong trustworthiness generalization.” [2]

The Harvard team developed two distinct strategies for embedding trustworthiness into AI systems. Their first approach, termed “Weak Trustworthiness Fine-tuning” (Weak TFT), focuses on training smaller models with explicit trustworthiness constraints, then using these models to teach larger systems. The second strategy, “Weak and Weak-to-Strong Trustworthiness Fine-tuning” (Weak+WTS TFT), applies trustworthiness constraints to both the small teacher model and the large student model during training.

Their experiments demonstrate that the Weak+WTS TFT approach produces significantly superior results, with improvements in fairness of up to 3 percentage points (equivalent to a 60% decrease in unfairness), as well as in robustness, or how resilient the AI was to attacks and unexpected situations. Remarkably, these ethical improvements required only minimal sacrifices in task performance: decreases in accuracy did not exceed 1.5% across tested properties.
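
A schematic sketch of the two-stage recipe, as we read it, appears below: a trustworthiness penalty (here a simple demographic-parity term; the paper’s exact losses may differ) is applied when fine-tuning the weak teacher, and again, in the Weak+WTS TFT variant, while the strong student learns from the weak teacher’s labels. The function names and the toy penalty are our assumptions for illustration.

    import torch
    import torch.nn.functional as F

    def fairness_penalty(probs: torch.Tensor, group: torch.Tensor) -> torch.Tensor:
        """Demographic-parity gap: difference in mean positive probability
        between two groups (assumes each batch contains both groups)."""
        return (probs[group == 0].mean() - probs[group == 1].mean()).abs()

    def trustworthy_step(model, optimizer, x, targets, group, lam=1.0):
        """One fine-tuning step with the trustworthiness penalty added in."""
        probs = torch.sigmoid(model(x).squeeze(-1))
        loss = F.binary_cross_entropy(probs, targets) + lam * fairness_penalty(probs, group)
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        return loss.item()

    # Stage 1 (Weak TFT): fine-tune the small weak teacher with the penalty.
    # Stage 2 (the extra "+WTS" part): the strong student trains on the weak
    # teacher's soft labels, with the same penalty on its own predictions.
    def weak_to_strong_step(strong, weak, optimizer, x, group, lam=1.0):
        with torch.no_grad():
            soft_labels = torch.sigmoid(weak(x).squeeze(-1))  # weak supervision
        return trustworthy_step(strong, optimizer, x, soft_labels, group, lam)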

Key Insight: Challenges in Privacy Transfer

“Privacy presents a unique situation. Note that the strong ceiling (1) does not achieve better privacy than the weak model.” [3]

A key finding of the study is that not all trustworthiness properties transfer equally from weak to strong models. While the transfer of fairness and robustness properties showed promising results, privacy proved to be a more challenging attribute to transfer. The researchers found that larger models have a greater capacity to retain and recall details from their training data, which creates heightened vulnerabilities for exposing sensitive or confidential information. This finding highlights the complex nature of privacy in AI systems and suggests that different strategies may be needed to address privacy concerns in larger models.

Why This Matters

For C-suite executives and business leaders, this research offers a potential pathway to developing more powerful LLM systems without compromising on certain ethical considerations. It suggests that companies could potentially start with smaller, more manageable models that are fine-tuned for trustworthiness in fairness and robustness, and then scale up to more capable systems while maintaining or even improving these critical properties. This approach could help mitigate risks associated with LLM deployment, enhance public trust in AI-driven decisions, and potentially reduce the resources required for ethical LLM development. However, the challenges identified in transferring privacy properties serve as a reminder of the complex nature of AI ethics. Business leaders should remain vigilant and consider multi-faceted approaches to ensuring the trustworthiness of their LLM systems, particularly when dealing with sensitive data.

Footnote

(1) The strong ceiling represents the benchmark performance of a large model that has been directly trained with trustworthiness constraints, serving as the upper bound for what the weak-to-strong approach should ideally achieve.

References

[1] Martin Pawelczyk et al., “Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models,” arXiv preprint arXiv:2501.00418v1 (December 31, 2024): 1.

[2] Pawelczyk et al., “Generalizing Trust,” 2.

[3] Pawelczyk et al., “Generalizing Trust,” 8.

Meet the Authors

Martin Pawelczyk is a postdoctoral researcher at Harvard working on trustworthy AI.

Lillian Sun is an undergraduate student at Harvard studying computer science.

A third co-author is a PhD student in computer science at Harvard.

A fourth co-author is a postdoctoral research associate at Harvard working on trustworthy AI.

Himabindu Lakkaraju is an Assistant Professor of Business Administration at Harvard Business School and PI in D^3’s Trustworthy AI Lab. She is also a faculty affiliate in the Department of Computer Science at Harvard University, the Harvard Data Science Initiative, Center for Research on Computation and Society, and the Laboratory of Innovation Science at Harvard. Professor Lakkaraju’s research focuses on the algorithmic, practical, and ethical implications of deploying AI models in domains involving high-stakes decisions such as healthcare, business, and policy.

The Gender Divide in Generative AI: A Global Challenge (April 17, 2025)

As generative AI transforms the business landscape, a concerning trend demands immediate attention from executives and policymakers alike. In the recent Harvard Business School (HBS) working paper, “Global Evidence on Gender Gaps and Generative AI,” authors Nicholas G. Otis, PhD candidate at the Berkeley Haas School of Business; Solène Delecourt, Assistant Professor at the Berkeley Haas School of Business and Affiliated Researcher at the Laboratory for Innovation Science (LISH) at Harvard; Katelyn Cranney, PhD student at Stanford University; and Rembrand Koning, Associate Professor of Business Administration at HBS and Principal Investigator at the Digital Data Design (D^3) Institute at Harvard Tech for All Lab, describe a significant gender gap in the adoption and use of generative AI tools worldwide. This disparity threatens to exacerbate existing inequalities and risks limiting the potential benefits of this revolutionary technology across various sectors and industries.

Key Insight: A Universal Gender Gap in AI Adoption

“To estimate the extent of the gender gap in generative AI use, we first identified every publicly available study that has surveyed people about generative AI use along with their gender […] [Surveys show] a remarkably consistent pattern in generative AI use: men are more likely to adopt generative AI tools than women in all but one survey.” [1]

Otis and his colleagues uncovered a pervasive gender gap in generative AI adoption. Their comprehensive analysis, drawing from 18 diverse studies among more than 140,000 individuals worldwide, showed that women are approximately 20% less likely than men to directly engage with generative AI technology. This gap was not confined to specific industries, geographic locations, or occupations, but appeared to be a universal phenomenon.

Key Insight: Persistence of the Gap Despite Equal Access

“[F]indings show, that even when efforts to increase participation by equalizing access are in place, women are still less likely to use generative AI than men.” [2]

The researchers demonstrated that simply providing equal access to generative AI tools is not sufficient to bridge the gender gap. Their findings suggest that deeper, more complex factors are at play, potentially rooted in cultural, social, or institutional barriers. For example, in a study conducted in Kenya where access to ChatGPT was equalized, women were still about 13.1% less likely to adopt the technology compared to men.

Key Insight: Implications for AI Development and Effectiveness

“As generative AI systems are still in their formative stages, the under-representation of women may result in early biases in the user data these tools learn from, resulting in self-reinforcing gender disparities.” [3]

Otis and his team warned of a potential feedback loop where the current gender gap in AI usage could lead to biased AI systems that further discourage women’s participation. This cycle threatens to perpetuate and even amplify existing gender inequalities. The researchers discovered that women accounted for just 42% of the approximately 200 million average monthly users who visited the ChatGPT website worldwide between November 2022 and May 2024. In smartphone app usage, the gap widens further, with women estimated to make up only around 27.2% of total ChatGPT application downloads.

Key Insight: Multifaceted Roots of the Gender Gap

“[B]ecause women tend to work in different types of firms, jobs, and occupations than men, they may be less exposed to this new technology. Such differences are often further reinforced by the gendered differences in women’s personal and professional networks, further limiting diffusion and learning.” [4]

The working paper identified several potential factors contributing to the gender gap in AI adoption, including differences in workplace exposure, variations in personal and professional networks, and potential disparities in confidence and persistence when using new technologies. Research shows that women consistently say they are less familiar with and knowledgeable about generative AI tools than men. The team found that in the tech industry, junior women significantly lag behind men in generative AI use in both technical and non-technical functions, indicating that even in technology-focused environments, the gap persists.

Why This Matters

For business leaders and policymakers, understanding and addressing the gender gap in generative AI adoption is crucial. It represents a significant untapped potential in workforce productivity and innovation. As generative AI becomes increasingly integral to various business processes, ensuring equal participation across genders will be vital for maintaining competitiveness and fostering diverse perspectives in problem-solving and decision-making.

Moreover, the self-reinforcing nature of this gap poses a serious threat to gender equality in the workplace and beyond. If left unaddressed, it could lead to a widening skills gap, further entrenching gender disparities in high-growth, high-paying sectors of the economy. For executives, this translates to a pressing need to implement targeted strategies that provide equal access to AI tools and address the underlying factors that discourage women from engaging with these technologies.

References

[1] Nicholas G. Otis, Solène Delecourt, Katelyn Cranney, and Rembrand Koning, "Global Evidence on Gender Gaps and Generative AI", Harvard Business School Working Paper No. 25-023 (2024): 30, 3.

[2] Otis et al., "Global Evidence on Gender Gaps and Generative AI", 5.

[3] Otis et al., "Global Evidence on Gender Gaps and Generative AI", 5.

[4] Otis et al., "Global Evidence on Gender Gaps and Generative AI", 2.

Meet the Authors

Nicholas G. Otis is a PhD candidate at the Berkeley Haas School of Business, researching the societal and economic effects of generative AI and how it can help underserved people, places, and organizations. He earned his BA in Sociology and MA in Social Statistics from McGill University in Montreal.

Solène Delecourt is an Assistant Professor at the Berkeley Haas School of Business and Affiliated Researcher at the Laboratory for Innovation Science (LISH) at Harvard. Her studies focus on inequality in business performance and factors that create variation in company profits. She holds a master’s degree in Economics and Public Policy from Sciences Po Paris and École Polytechnique. She earned her PhD at the Stanford Graduate School of Business.

Katelyn Cranney is a PhD student in economics at Stanford University. Her interests include labor, behavioral, and experimental economics and technology adoption, innovation, gender, entrepreneurship, and productivity. Formerly a research assistant at Harvard Business School working with Rembrand Koning and Solène Delecourt, she earned her BS in Economics from Brigham Young University.

Rembrand Koning is an Associate Professor of Business Administration at Harvard Business School. He is the co-director, co-founder, and a Principal Investigator in the Tech for All Lab at D^3 at Harvard, studying how entrepreneurs can accelerate and shift the rate and direction of science, technology, and AI to benefit humanity. He earned his PhD in Business from the Stanford Graduate School of Business and his BS in Mathematics and BA in Statistics from the University of Chicago.

AI Alignment: The Hidden Costs of Trustworthiness (March 3, 2025)

As AI continues to evolve at a breakneck pace, the quest for aligning these systems with human values has become paramount. However, a recent study, “More RLHF, More Trust? On The Impact of Preference Alignment on Trustworthiness,” by Aaron J. Li, a master’s student at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS); Himabindu Lakkaraju, Assistant Professor of Business Administration at Harvard Business School and Principal Investigator at the Digital Data Design (D^3) Institute at Harvard Trustworthy AI Lab; and Satyapriya Krishna, PhD graduate from Harvard SEAS and the Trustworthy AI Lab, revealed that the current methods used to achieve this alignment may have unexpected consequences on AI trustworthiness. The study explored the complex relationship between AI alignment techniques and various aspects of trustworthiness, and offered crucial insights for business leaders navigating this new technology landscape.

Key Insight: The Misalignment Paradox

“We identify a significant misalignment between generic human preferences and specific trustworthiness criteria, uncovering conflicts between alignment goals and exposing limitations in conventional RLHF datasets and workflows.” [1]

The team’s research uncovered a surprising paradox in AI development: the techniques designed to align AI with human preferences may inadvertently compromise its trustworthiness. In the study, Reinforcement Learning from Human Feedback (RLHF), a common method for fine-tuning language models so that their outputs reflect human preferences, showed mixed results across different trustworthiness metrics. While it improved performance in machine ethics (observing ethical principles) by an average of 31%, it led to concerning increases in stereotypical bias (150% increase) and privacy leakage (12% increase), and a 25% decrease in truthfulness.
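
For orientation, the reward-modeling stage at the heart of RLHF can be written in a few lines: a reward model is trained to score human-preferred responses above rejected ones (a Bradley-Terry-style objective), and the language model is then optimized against that learned reward. The sketch below is a generic illustration of that standard loss, not code from the paper.

    import torch
    import torch.nn.functional as F

    def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
        """-log sigmoid(r_chosen - r_rejected), averaged over the batch."""
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    r_chosen = torch.tensor([1.2, 0.3])    # reward scores for preferred responses
    r_rejected = torch.tensor([0.1, 0.4])  # reward scores for rejected responses
    print(preference_loss(r_chosen, r_rejected))  # smaller when preferred responses win

Nothing in this objective mentions bias, privacy, or truthfulness, which helps explain why optimizing it can move those properties in unintended directions.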

Key Insight: The Ethics Exception

“Empirically, RLHF does not improve performance on key trustworthiness benchmarks such as toxicity, bias, truthfulness, and privacy, with machine ethics being the only exception.” [2]

The study showed that machine ethics stood out as the only aspect of large language model (LLM) trustworthiness that consistently improved through RLHF. The researchers found that the false negative rate (FNR) for ethical decision-making decreased significantly across all tested models. This suggests that current AI alignment techniques are particularly effective at instilling ethical behavior, but struggle with other trustworthiness metrics. These metrics include truthfulness (accurate information), toxicity (harmful or inappropriate content), fairness (assessing and addressing biases), robustness (performance under different conditions), and privacy (protecting user data and preventing data leaks).

Key Insight: The Data Attribution Dilemma

“To address this, we propose a novel data attribution analysis to identify fine-tuning samples detrimental to trustworthiness, which could potentially mitigate the misalignment issue.” [3]

Li, Krishna, and Lakkaraju introduced an innovative approach to understanding the root causes of trustworthiness issues in AI alignment. By analyzing the contribution of individual data samples to changes in trustworthiness, they developed a tool to identify and quantify the effects of problematic training data.
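
The simplest way to see the idea is the brute-force, leave-one-out version sketched below: refit without each sample and ask whether the trustworthiness violation drops. Gradient-based approximations make this tractable in practice, and the authors’ exact estimator may differ; fit and violation_metric are hypothetical callables supplied by the user.

    from typing import Callable, Sequence

    def leave_one_out_scores(
        train_set: Sequence,
        fit: Callable,               # hypothetical: trains a model on a dataset
        violation_metric: Callable,  # hypothetical: e.g., measured bias of a model
    ) -> list[float]:
        """Score each sample by how much removing it reduces the violation.
        A large positive score flags the sample as detrimental to trustworthiness."""
        baseline = violation_metric(fit(list(train_set)))
        scores = []
        for i in range(len(train_set)):
            without_i = list(train_set[:i]) + list(train_set[i + 1:])
            scores.append(baseline - violation_metric(fit(without_i)))
        return scores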

Key Insight: The Scale of the Challenge

“Although our experiments focus on models up to 7 [billion] parameters, we expect similar trends in larger models because prior research […] suggests that larger models are not inherently more trustworthy in the aspects where we have observed negative RLHF effects.” [4] 

The research indicated that the trustworthiness issues identified are not limited to smaller AI models. Even as AI systems grow in size and complexity, they remain susceptible to these alignment-induced trustworthiness problems. In fact, the study cites findings that larger models trained with RLHF demonstrated stronger political views and racial biases.

Why This Matters

For business leaders and executives, the insights from the team’s research are crucial for understanding the complexities of deploying AI systems, and they highlight that simply focusing on aligning AI with human preferences is not enough to ensure trustworthy and reliable AI systems.

Companies investing in AI technologies must be aware of the potential trade-offs between different aspects of trustworthiness. While improvements in ethical decision-making are encouraging, the increased risks of bias, privacy breaches, and misinformation cannot be ignored. This research calls for a more nuanced approach to AI alignment that balances multiple dimensions of trustworthiness. Using the data attribution analysis method the team proposed to identify problematic training data, companies can potentially improve the trustworthiness of their AI systems without compromising on performance or alignment with human preferences.

References

[1] Aaron J. Li, Satyapriya Krishna, and Himabindu Lakkaraju, “More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness”, arXiv:2404.18870v2 [cs.CL] (December 21, 2024): 2.

[2] Li, Krishna, and Lakkaraju, “More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness”, 11.

[3] Li, Krishna, and Lakkaraju, “More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness”, 11.

[4] Li, Krishna, and Lakkaraju, “More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness”, 2.

Meet the Authors

Aaron J. Li is a master’s student in Computational Science & Engineering at the Harvard University John A. Paulson School of Engineering and Applied Sciences (SEAS). He obtained his BA in Mathematics from Harvard. His interests include mathematics, theoretical CS, and physics.

Satyapriya Krishna recently completed his PhD at the John A. Paulson School of Engineering and Applied Sciences (SEAS) and worked with the D^3 Trustworthy AI Lab, where his research focused on the trustworthy aspects of generative models. He earned his MS in Computer Science from Carnegie Mellon University and his BS in Computer Science and Engineering from the LNM Institute of Information Technology in Jaipur, India.

Himabindu Lakkaraju is an Assistant Professor of Business Administration at Harvard Business School and PI in D^3’s Trustworthy AI Lab. She is also a faculty affiliate in the Department of Computer Science at Harvard University, the Harvard Data Science Initiative, Center for Research on Computation and Society, and the Laboratory of Innovation Science at Harvard. She teaches the first year course on Technology and Operations Management, and has previously offered multiple courses and guest lectures on a diverse set of topics pertaining to artificial intelligence (AI) and machine learning (ML), and their real-world implications.

Climate Solution Firms: Investment Strategy and Risk Management (February 20, 2025)

As the global economy grapples with the pressing challenges of climate change, a new paradigm is emerging in the world of finance and investment. In their working paper, “Climate Solutions, Transition Risk, and Stock Returns,” researchers Shirley Lu, Assistant Professor of Business Administration at Harvard Business School (HBS) and an affiliate of the HBS Digital Data Design (D^3) Institute Climate and Sustainability Impact Lab; Edward J. Riedl, Professor of Management and Accounting at the Questrom School of Business at Boston University; Simon Xu, Post-Doctoral Fellow in the Climate and Sustainability Impact Lab; and George Serafeim, Professor of Business Administration at HBS and Co-Leader of the Climate and Sustainability Impact Lab, explore the intricate relationship between climate solutions, transition risk, and stock returns. Their findings offer valuable insights for investors, executives, and policymakers navigating the complex landscape of climate-related financial opportunities and risks.

Key Insight: The Rise of Climate Solution Firms

“We measure firms’ climate solutions with data that utilizes large language models (LLMs) to analyze the “Business Description” section of Item 1 in U.S. public firm 10-K filings.” [1]

The researchers developed an innovative approach to identifying companies focused on climate solutions. Using advanced AI techniques, they analyzed SEC regulatory filings from 2006 to 2023 to quantify firms’ involvement in climate-related products and services. This method provides a more nuanced and accurate picture of a company’s climate strategy than traditional metrics alone. 

The team uses the phrase “high-climate solution firms” to describe companies with large portions of their products and services dedicated to climate solutions. During the study, they developed the variable “climate solution measure” (CS measure) to represent firms’ levels of involvement in climate solutions. For example, the paper notes that Tesla, a leader in electric vehicles, has an average CS measure of 57%, compared to 11% for General Motors.
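
In spirit, the measure reduces to a classified share of the business description. The sketch below is our stylized illustration, with a keyword list standing in for the LLM classification step; it is not the authors’ pipeline, and the cue words are our own.

    CLIMATE_CUES = ("electric vehicle", "solar", "wind", "battery", "emission")

    def is_climate_solution(segment: str) -> bool:
        """Keyword stand-in for the LLM classification call."""
        return any(cue in segment.lower() for cue in CLIMATE_CUES)

    def cs_measure(business_description: str) -> float:
        """Share of business-description segments describing climate solutions."""
        segments = [s.strip() for s in business_description.split(".") if s.strip()]
        if not segments:
            return 0.0
        return sum(is_climate_solution(s) for s in segments) / len(segments)

    desc = ("We design and sell electric vehicles. We operate retail stores. "
            "We manufacture battery storage systems.")
    print(f"{cs_measure(desc):.0%}")  # 2 of 3 segments -> 67%

On this toy description, two of three segments are climate-related, so the firm’s CS measure would be about 67%; the paper’s Tesla-versus-GM comparison works the same way at scale.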

Key Insight: The Hedging Potential of Climate Solutions

“[H]igh-climate solution firms are better positioned to hedge against transition risks, as their products and services are in greater demand during periods of heightened transition risk, allowing them to capitalize on new market opportunities.” [2]

The paper reveals that companies with a higher focus on climate solutions may offer a unique hedging opportunity for investors. As the world transitions to a low-carbon economy, these firms are likely to see increased demand for their products and services, potentially offsetting risks associated with climate change. The researchers found that high-climate solution firms experience improved future profitability as unexpected climate change concerns increase.

Key Insight: The Mispricing Paradox

“[M]arket participants may underreact to negative news about climate solutions, such as not immediately recognizing the technological or production risks associated with investing in them.” [3]

Despite the potential benefits, the paper suggests that the market may not always accurately price the risks associated with climate solution firms. This mispricing could lead to overvaluation in the short term but may also present opportunities for informed investors. The study found that high-climate solution firms tend to have lower stock returns, possibly due to overvaluation resulting from investor preferences or underestimation of risks.

Key Insight: The Impact of Environmental Regulatory Uncertainty

“We measure environmental regulatory uncertainty using the environmental and climate policy uncertainty (EnvPU) index developed by Noailly et al. (2022).” [4]

The researchers highlight the significant role that policy uncertainty plays in the performance of climate solution firms. They used the EnvPU index, available from 2005 to 2019, to measure the share of environmental policy uncertainty articles among all environmental and climate policy articles in leading U.S. newspapers. By using the EnvPU index, the team demonstrated how regulatory changes can affect these companies’ profitability and market perception. For example, the paper notes that periods of high regulatory uncertainty can boost cash flow for climate solution firms, resulting in higher future profitability.
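
As described, the index reduces to a simple share. The toy computation below illustrates the ratio (our illustration of the arithmetic, not the Noailly et al. pipeline):

    def envpu_index(articles: list[dict]) -> float:
        """Among env/climate policy articles, the share that flag uncertainty."""
        policy = [a for a in articles if a["env_policy"]]
        if not policy:
            return 0.0
        return sum(a["uncertainty"] for a in policy) / len(policy)

    month = [
        {"env_policy": True,  "uncertainty": True},
        {"env_policy": True,  "uncertainty": False},
        {"env_policy": True,  "uncertainty": False},
        {"env_policy": False, "uncertainty": False},  # unrelated article, excluded
    ]
    print(envpu_index(month))  # 1 of 3 -> 0.333...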

Why This Matters

For business leaders, investors, and policymakers, understanding the dynamics of climate solutions in the financial markets is crucial for navigating the transition to a low-carbon economy. This research provides valuable insights into how companies focused on addressing climate change may perform under various market conditions and regulatory environments. It highlights the potential for these firms to act as a hedge against transition risks, while cautioning about possible mispricing due to market inefficiencies or investor preferences for environmentally friendly products and services. 

The study offers a new tool for assessing a firm’s climate strategy and corporate sustainability efforts. By understanding the complex interplay between climate solutions, market dynamics, and regulatory uncertainty, executives, investors, and policymakers can anticipate the future while managing associated risks and capitalizing on emerging opportunities. 

References

[1] Shirley Lu, Edward J. Riedl, Simon Xu, and George Serafeim, "Climate Solutions, Transition Risk, and Stock Returns", Harvard Business School Working Paper No. 25-024 (November 11, 2024): 1.

[2] Lu, Riedl, Xu, and Serafeim, "Climate Solutions, Transition Risk, and Stock Returns", 1.

[3] Lu, Riedl, Xu, and Serafeim, "Climate Solutions, Transition Risk, and Stock Returns", 2.

[4] Lu, Riedl, Xu, and Serafeim, "Climate Solutions, Transition Risk, and Stock Returns", 20.

Meet the Authors

Shirley Lu is an Assistant Professor of Business Administration in the Accounting and Management Unit and a member of D^3’s Climate and Sustainability Impact Lab. She teaches the Financial Reporting and Control course in the MBA required curriculum.

Edward J. Riedl is a Professor of Accounting and Professor of Management at the Questrom School of Business at Boston University. His research interests include financial reporting mega-trends: fair value accounting, international reporting, and issues relating to environmental, social, and governance (ESG) reporting. Prior to entering academia, he worked at a Big 6 auditor, in internal audit at a Fortune 250 oil company, and in corporate reporting at a real estate brokerage house.

Simon Xu is a Post-Doctoral Fellow in the HBS D^3 Climate and Sustainability Impact Lab. He received his PhD in Finance at the University of California, Berkeley, and is interested in financial intermediation, corporate finance, and banking, with links to climate finance, using LLMs to develop new metrics for assessing firms’ climate solution products and services, and their implications for business strategy and market valuation.

George Serafeim is the Charles M. Williams Professor of Business Administration at Harvard Business School, where he co-leads the Climate and Sustainability Impact Lab within the D^3 Institute. He teaches the MBA course “Risks, Opportunities, and Investments in an Era of Climate Change” (ROICC), which he developed to guide students in mastering the skills needed for entrepreneurial, managerial, or investment roles in a rapidly evolving climate landscape.

Understanding and Addressing Managerial Sabotage in Organizations (January 16, 2025)

In today’s competitive corporate landscape, the workplace can be a battleground of ambition and performance. While healthy competition can fuel innovation and productivity, research (“Determinants of Top-Down Sabotage”) by Hashim Zaman, Post-Doctoral Fellow at the Laboratory for Innovation Science at Harvard (LISH), and Karim R. Lakhani, Professor of Business Administration at Harvard Business School, founder and co-director of LISH, and co-founder and chair of the Digital Data Design (D^3) Institute, revealed a potential dark side to this dynamic: top-down sabotage (TDS). This phenomenon, where managers intentionally undermine their talented subordinates, poses significant risks to individual careers, organizational culture, and long-term performance. In their study, the authors analyze survey data from 335 corporate executives across various industries and firm sizes.

Key Insight: The Prevalence of Managerial Sabotage

“Approximately 30% of the survey participants report observing sabotage in their organizations, and over 70% throughout their careers.” [1]

Research highlights the reality that managerial sabotage is widespread in corporate environments. Zaman and Lakhani’s study reveals that over 70% of executives have witnessed such behaviors during their careers, with nearly one-third observing sabotage directly within their organizations. In addition, approximately 28% of survey respondents said they were victims of TDS within their current organizations, and 60% were affected by it during their careers.

Key Insight: The Root Cause鈥擣ear

“[A]bout 21% [of survey respondents] cited status concerns as a major determinant of TDS, which is almost equal to the number citing both status and monetary concerns simultaneously, and substantially higher than the 3.3% who observed TDS for monetary reasons alone.” [2]

The research identifies the root cause of managerial sabotage: fear. Managers, particularly in hierarchical organizations, may perceive talented subordinates as threats to their status and pride. This insecurity drives them to pre-emptively undermine their team members, which can hurt employees’ careers and the organization’s culture and performance.

Key Insight: The Role of Relative Performance Evaluations (RPEs)

“[W]hen a firm operates on RPE but the final decision on compensation or promotion relies on subjective managerial discretion, the incidence of TDS increases to 46.8%. Conversely, the magnitude of TDS under RPE without managerial discretion drops to 26.9%.” [3]

The study delves into the impact of relative performance evaluations (RPEs), a common method used to assess employees by comparing their performance. While RPEs can drive productivity, they may also inadvertently encourage sabotage, particularly when managers have significant discretion in determining promotions. Zaman and Lakhani found that firms relying heavily on subjective RPE systems saw a marked increase in sabotage incidents. By contrast, organizations with more objective and transparent evaluation processes experienced significantly lower levels of sabotage.

Key Insight: Building a Culture That Prevents Sabotage

“Our survey results show that organizational culture is the single biggest factor that mitigates TDS.鈥 [4]

The research underscores the critical role of organizational culture in combating sabotage. Companies that emphasize open communication, collaboration, and transparency are less likely to experience managerial undermining. Strategies such as implementing and enforcing 360-degree feedback systems (in which feedback is gathered from multiple sources about an employee’s performance); ensuring performance evaluations are transparent, standard, and objective; and shifting incentives away from individual to team-based performance measures can significantly reduce the fear and competitiveness that drive sabotage.

Why This Matters

TDS is more than a human resources challenge: it is a strategic business issue with far-reaching consequences. It weakens organizational performance, makes it difficult to attract and retain employees, and can jeopardize succession plans. C-suite and business leaders can address this problem by taking a few key actions:

  • Increasing transparency and objectivity in performance evaluation
  • Enforcing the use of 360-degree feedback systems
  • Creating a culture of collaboration, openness, and communication
  • Aligning incentives with team-based performance metrics

References

[1] Hashim Zaman and Karim R. Lakhani, “Determinants of Top-Down Sabotage”, HBS Working Paper 25-007 (August 22, 2024): 1-81, 2.

[2] Zaman and Lakhani, “Determinants of Top-Down Sabotage”, HBS Working Paper 25-007 (August 22, 2024): 9-10.

[3] Zaman and Lakhani, “Determinants of Top-Down Sabotage”, HBS Working Paper 25-007 (August 22, 2024): 10.

[4] Zaman and Lakhani, “Determinants of Top-Down Sabotage”, HBS Working Paper 25-007 (August 22, 2024): 26.

Meet the Authors

Hashim Zaman is a Post-Doctoral Fellow at the Laboratory for Innovation Science at Harvard. His research lies at the intersection of information economics, strategy and finance. He uses observational data and field experiments to study the role of economic incentives in mitigating agency issues in organizations. In addition, he uses machine learning methods to study the impact of social media sentiment on firm performance.

Karim Lakhani is the Dorothy & Michael Hintze Professor of Business Administration at Harvard Business School. His innovation-related research is centered around his role as the founder and co-director of the Laboratory for Innovation Science at Harvard (LISH) and as the principal investigator of the NASA Tournament Laboratory. He is also the co-founder and chair of the Digital Data Design (D^3) Institute at Harvard and the co-founder and co-chair of the Harvard Business Analytics Program, a university-wide online program transforming mid-career executives into data-savvy leaders.


The Future of Decision-Making: How Generative AI Transforms Innovation Evaluation (January 15, 2025)

As businesses grapple with an ever-growing volume of ideas, products, and solutions to evaluate, decision-making processes are being reshaped by artificial intelligence (AI). Generative AI, in particular, has emerged as a game-changer in creative problem-solving and evaluation, as demonstrated by a recent field experiment described in the working paper “The Narrative AI Advantage? A Field Experiment on Generative AI-Augmented Evaluations of Early-Stage Innovations.”

The paper, by Jacqueline Ng Lane, Assistant Professor at Harvard Business School and a co-Principal Investigator of the Laboratory for Innovation Science at Harvard (LISH) at Harvard’s Digital Data Design Institute (D^3), and a team of researchers (see the Meet the Authors section below for details), describes how AI can augment decision-making for early-stage innovation screening.

The experiment, conducted with MIT Solve, included 72 experts and 156 non-expert community screeners who evaluated 48 solutions submitted to the 2024 Global Health Equity Challenge. The team used the GPT-4 large language model (LLM) to recommend whether to pass or fail each idea and provide criteria for failure. The evaluation phase was designed with three conditions:

  • A human-only control condition, with no AI assistance
  • Treatment 1: black box AI (BBAI), AI recommendations without rationale
  • Treatment 2: Narrative AI (NAI), AI recommendations with rationale
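
A schematic reconstruction of the three conditions appears below, with a hypothetical ask_gpt4 stub standing in for the study’s actual prompts and client; only the NAI condition surfaces the model’s rationale to the human screener.

    def ask_gpt4(prompt: str) -> dict:
        """Hypothetical stand-in for the GPT-4 call used in the study."""
        return {"decision": "fail", "rationale": "No feasibility plan provided."}

    def screener_view(solution: str, condition: str) -> dict:
        if condition == "control":        # human-only: no AI assistance at all
            return {"solution": solution}
        rec = ask_gpt4(f"Pass or fail against the challenge criteria?\n{solution}")
        view = {"solution": solution, "ai_decision": rec["decision"]}
        if condition == "nai":            # Narrative AI: rationale is shown too
            view["ai_rationale"] = rec["rationale"]
        return view                       # "bbai": recommendation only

    print(screener_view("Low-cost telehealth triage kiosk", "nai"))

The experimental contrast is entirely in what the screener sees: the same underlying recommendation, with or without the narrative justifying it.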

Key Insight: AI-Augmented Decisions Are More Stringent

“Screeners were 9 percentage points more likely to fail a solution under the treatment conditions than the control condition.” [1]

Generative AI can be a source of rigor in evaluation. According to the authors, evaluators using AI recommendations were more discerning in their decision-making compared to human-only groups. The study highlights that AI-assisted screeners tended to fail solutions more often than their human-only counterparts, particularly when using treatment 2, which provided detailed narratives justifying its recommendations.

The NAI approach stood out as particularly effective, especially for subjective criteria like quality or alignment with goals. The researchers observed that human screeners were significantly more likely to follow narrative AI’s recommendations because the rationale added credibility and context to its suggestions.

Key Insight: Balancing Objectivity and Subjectivity in AI Collaboration

“[E]ffective decision-making for subjective criteria requires human oversight and close collaboration with AI.” [2]

While AI excels at tasks requiring objective analysis, its role in subjective evaluations remains nuanced. The study revealed a marked difference in human alignment with AI recommendations based on whether the criteria were objective or subjective. For objective tasks, such as assessing technical feasibility, AI provided valuable consistency. However, for subjective tasks, such as evaluating novelty or aesthetics, human oversight was indispensable. The researchers noted that over-reliance on AI narratives for subjective decisions could sometimes lead to uncritical acceptance of its conclusions.

Key Insight: The Rise of AI Interaction Expertise

“[Our findings suggest] the emergence of a new form of expertise—AI interaction expertise—which involves effectively interpreting, questioning, and integrating AI-generated insights into decision-making processes.” [3]

The authors suggested that integrating AI into decision-making demands more than technical know-how; it requires “AI interaction expertise.” The paper emphasized that screeners who deeply engaged with AI recommendations, examining and, when necessary, challenging them, were better able to integrate AI insights into their decisions. This highlights a new skill set for the modern workforce: the ability to collaborate effectively with AI systems.

Why This Matters

The authors’ experiment and conclusions can help C-suite and business executives assess the value of using LLMs in decision-making, specifically by:

  • Recognizing AI’s strengths and weaknesses related to objective and subjective decision-making criteria. LLMs can potentially be used to pre-screen decisions based on objective criteria, and send those results to human screeners. Decisions involving subjective criteria require close human-AI collaboration, where AI tools act as “sounding boards” that complement the decision-making process.
  • Understanding the importance of AI interaction expertise in the workforce for interpreting AI results, and implementing AI training that highlights the value of human perspectives and the uses and risks of AI tools.

As is often the case in studies of the current state of generative AI tools, the authors concluded that “The key lies in leveraging LLMs as tools to augment human decision-making rather than replace it entirely.” [4]

References

[1] Jacqueline N. Lane, Léonard Boussioux, Charles Ayoubi, Ying Hao Chen, Camila Lin, Rebecca Spens, Pooja Wagh, and Pei-Hsin Wang, “The Narrative AI Advantage? A Field Experiment on Generative AI-Augmented Evaluations of Early-Stage Innovations”, Harvard Business School Working Paper 25-001 (2024): 1-60, 5.

[2] Lane, et al., “The Narrative AI Advantage? A Field Experiment on Generative AI-Augmented Evaluations of Early-Stage Innovations”, 33.

[3] Lane, et al., “The Narrative AI Advantage? A Field Experiment on Generative AI-Augmented Evaluations of Early-Stage Innovations”, 31.

[4] Lane, et al., “The Narrative AI Advantage? A Field Experiment on Generative AI-Augmented Evaluations of Early-Stage Innovations”, 36.

Meet the Authors

Jacqueline Ng Lane is an Assistant Professor at Harvard Business School and a co-Principal Investigator of the Laboratory for Innovation Science at Harvard (LISH) at Harvard’s Digital Data Design Institute (D^3). She earned her Ph.D. from Northwestern University.

Léonard Boussioux is an Assistant Professor in the Department of Information Systems and Operations Management at the University of Washington, Foster School of Business, with an adjunct position at the Allen School of Computer Science and Engineering. He earned his Ph.D. at the Massachusetts Institute of Technology (MIT).

Charles Ayoubi is a Postdoctoral Research Fellow at the Laboratory for Innovation Science at Harvard (LISH) supported by a research grant from the Swiss National Science Foundation (SNSF). His research examines the processes of knowledge creation and diffusion in the context of science and innovation. He studies how scientists use their resources and informational advantages to achieve scientific breakthroughs, greater dissemination of knowledge and accessibility of innovation.

Ying Hao Chen is a Lecturer at the University of Washington Global Innovation Exchange.

Camila Lin is an AIOps Product Manager at Microsoft. Prior to her work at Microsoft, Lin earned her Master’s in Information Systems from the University of Washington, where she worked as a Research Assistant.

Rebecca Spens is Results Measurement Manager at MIT Solve and focuses on using research methods to understand Solve’s effectiveness and impact. Before joining Solve, Rebecca worked on evaluation and research in UK government, most recently at the Ministry of Justice. Rebecca holds a Master’s in Development Practice from Emory University and a BA in Modern History and French from the University of St. Andrews.

Pooja Wagh is Director, Operations & Impact at MIT Solve. Pooja came to Solve in 2017 with over a decade of experience in international development, program evaluation, and data analysis in the private and nonprofit sectors. Pooja holds a Master’s in Public Policy from the Harvard Kennedy School and a Bachelor’s in electrical engineering from MIT.

Pei-Hsin Wang is a Cloud First Product Manager at Accenture. At the time of the research article’s publication, Wang was a Research Assistant and Data Scientist at the University of Washington.


Bridging the Gap Between Understanding and Control: Insights into AI Interpretability (January 10, 2025)

As large language model (LLM) systems grow in complexity, the challenge of ensuring their outputs align with human intentions has become critical. Interpretability (the ability to explain how models reach their decisions) and control (the ability to steer them toward desired outcomes) are two sides of the same coin.

“Towards Unifying Interpretability and Control: Evaluation via Intervention,” research by Usha Bhalla, Graduate Fellow PhD student at the Harvard University Kempner Institute and the Digital Data Design Institute (D^3) Trustworthy AI Lab; Suraj Srinivas, Research Scientist at Bosch AI; Himabindu Lakkaraju, Assistant Professor of Business Administration at Harvard Business School and PI in D^3’s Trustworthy AI Lab; and Asma Ghandeharioun, Senior Research Scientist at Google DeepMind, found that many methods developed to address these issues focus on one aspect, neglecting the other. The study introduces a new approach that unifies interpretability and control, proposes intervention as the primary goal, and evaluates how well different methods enable control through intervention.

Key Insight: Intervention as a Fundamental Goal of Interpretability

"[W]e view intervention as a fundamental goal of interpretability, and propose to measure the correctness of interpretability methods by their ability to successfully edit model behaviour." [1]

The authors define intervention as the deliberate modification of specific human-interpretable features within a model's latent representations (1) to achieve desired changes in its outputs, that is, its responses to prompts. They argue that the ability to intervene in a model's behavior this way should be a core objective of interpretability methods. By focusing on intervention, they provide a practical way to assess the effectiveness of various interpretability techniques. This approach shifts the focus from merely understanding a model's inner workings to actively influencing its outputs, bridging the gap between theory and application.

Key Insight: A Unified Framework for Interpretability and Control

"[W]e present an encoder-decoder framework that unifies four popular mechanistic interpretability methods: sparse autoencoders, logit lens, tuned lens, and probing." [2]

The study uncovered a critical limitation in current interpretability methods: their performance varies significantly across models and features. To address these performance issues, Bhalla et al. present an approach that unifies diverse interpretability methods under a single encoder-decoder framework. Their framework maps intermediate latent representations to feature spaces that humans can understand, allowing interventions on these features. The edited features can then be translated back into latent representations to influence the model's outputs. The study evaluates four methods within its unified framework to determine their relative strengths and weaknesses for both interpretability and control (a toy sketch of the encode-intervene-decode loop appears after the list below):

  • Logit Lens: Easy to use, requires no training, maps features directly to individual tokens in the model's vocabulary, and generally has high causal fidelity (2), but is limited by predefined, static features
  • Tuned Lens: Extends Logit Lens with an additional learned linear transformation (3), which improves its flexibility and effectiveness, but requires additional training and tuning
  • Sparse autoencoders (SAEs): Can learn a large dictionary of low-level and high-level (abstract) features, but are difficult to train and label and show lower causal fidelity
  • Probing: Trains simple classifiers (often linear) on top of model representations to predict specific features or concepts, but is prone to spurious correlations, leading to low causal fidelity
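
To make the encode-intervene-decode loop concrete, here is a minimal NumPy sketch in the spirit of the logit lens. It is an illustration under stated assumptions, not the authors' implementation: the toy unembedding matrix E, the chosen feature index, and the least-squares decoder are all stand-ins.

import numpy as np

rng = np.random.default_rng(0)
d, vocab = 16, 8
E = rng.normal(size=(vocab, d))   # toy unembedding: row v is the direction for token v
h = rng.normal(size=d)            # a latent representation read from some intermediate layer

z = E @ h                         # encode: logit-lens features, one per vocabulary token
z_edit = z.copy()
z_edit[3] += 5.0                  # intervene: amplify the human-interpretable feature for token 3

# decode: recover a latent whose encoding best matches the edited features
h_edit, *_ = np.linalg.lstsq(E, z_edit, rcond=None)
print("feature 3 before/after:", float(z[3]), float((E @ h_edit)[3]))

In a real model, the edited latent h_edit would be written back into the forward pass at the layer where h was read, steering the tokens the model goes on to produce.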

Key Insight: Measuring Success Through Interventions

"[W]e propose two evaluation metrics for encoder-decoder interpretability methods, namely (1) intervention success rate; and (2) the coherence-intervention tradeoff to evaluate the ability of interpretability methods to control model behavior." [3]

The authors introduce two metrics to determine whether interventions are accurate and whether they maintain the integrity and functionality of AI systems in real-world applications (a small evaluation-harness sketch follows the list):

  • Intervention success rate: Measures effectiveness, i.e., whether the intervention achieves its goal
  • Coherence-intervention tradeoff: Measures practical utility, ensuring the intervention does not render the model's outputs unusable by degrading their coherence and quality
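
As a rough illustration of how these two metrics could be computed, consider the sketch below. Every name in it (generate, concept_present, perplexity) is a hypothetical placeholder for a real model's generation function, a concept detector, and a fluency scorer; the paper's actual evaluation protocol differs in detail.

def intervention_success_rate(prompts, generate, concept_present):
    # fraction of prompts where the intervened model expresses the target concept
    return sum(concept_present(generate(p)) for p in prompts) / len(prompts)

def coherence_cost(prompts, generate_base, generate_edited, perplexity):
    # average fluency penalty of intervening; positive values mean outputs got worse
    deltas = [perplexity(generate_edited(p)) - perplexity(generate_base(p)) for p in prompts]
    return sum(deltas) / len(deltas)

# toy stand-ins so the sketch runs end to end
prompts = ["the weather is", "my favorite season is"]
generate_base = lambda p: p + " mild"
generate_edited = lambda p: p + " sunny"    # model after a "sunny" intervention
concept_present = lambda text: "sunny" in text
perplexity = lambda text: float(len(text))  # placeholder fluency proxy

print(intervention_success_rate(prompts, generate_edited, concept_present))  # 1.0
print(coherence_cost(prompts, generate_base, generate_edited, perplexity))   # 1.0

A good intervention method scores high on the first number while keeping the second near zero.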

Among the methods evaluated, the two lens-based approaches had the highest intervention success rates. However, given current shortcomings, such as inconsistency across models and features and the risk of compromising performance and coherence, the authors found that simpler options, such as prompting, still prevail over intervention methods for directing model behavior.

Why This Matters

For business professionals and C-suite executives, the insights presented by Bhalla and her team represent a pivotal development in the practical application of AI technologies. As organizations increasingly rely on AI for tasks ranging from the routine to the critical, understanding how to align these systems with human and organizational values is paramount. The proposed framework and metrics provide actionable tools to ensure AI systems are both correct and usable. The study also underscores the need to select and evaluate interpretability methods carefully based on the specific models and tasks involved.

Footnotes

(1) Latent representation refers to the internal, abstract representation of data within a machine learning model. These representations are not directly interpretable by humans but encode meaningful patterns or features of the input data.

(2) Causal fidelity is the extent to which intervening on a specific feature of an explanation results in the corresponding change in the model’s output.

(3) A linear transformation is a mathematical function that converts one vector into another while maintaining the properties of vector addition and scalar multiplication. Put simply, it changes the direction and size of vectors without warping or distorting the structure of the space they occupy.
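
In symbols, a map T is linear precisely when it preserves both operations at once (a generic definition, not notation from the paper):

T(a\mathbf{x} + b\mathbf{y}) = a\,T(\mathbf{x}) + b\,T(\mathbf{y}) \quad \text{for all vectors } \mathbf{x}, \mathbf{y} \text{ and scalars } a, b

In practice, methods like the tuned lens learn an affine map, h \mapsto Wh + b, which is a linear map plus a constant offset.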

References

[1] Usha Bhalla, Suraj Srinivas, Asma Ghandeharioun, and Himabindu Lakkaraju, “Towards Unifying Interpretability and Control: Evaluation via Intervention”, arXiv preprint arXiv:2411.04430v1 (November 7, 2024): 2.

[2] Bhalla, et al. “Towards Unifying Interpretability and Control: Evaluation via Intervention”, 3.

[3] Bhalla, et al. “Towards Unifying Interpretability and Control: Evaluation via Intervention”, 3.

Meet the Authors

is a PhD student in the 性视界 Computer Science program at 性视界 University's Kempner Institute, and a fellow at the Digital Data Design Institute (D^3) Trustworthy AI Lab. She is advised by Hima Lakkaraju, and her research focuses on machine learning interpretability. Bhalla is also a dedicated advocate for diversity in computer science, mentoring early-career minority students to support their growth in the field.

is a Research Scientist at Bosch AI with a focus on model interpretability, data-centric machine learning, and the "science" of deep learning. They completed their Ph.D. with François Fleuret at Idiap Research Institute & EPFL, Switzerland, and were a postdoctoral research fellow with Hima Lakkaraju at 性视界 University. They have organized workshops and seminars on interpretable AI, including sessions at NeurIPS 2023 and 2024, and contributed to teaching an explainable AI course at 性视界. Their work bridges theoretical advancements and practical applications of explainable AI.

is an Assistant Professor of Business Administration at 性视界 Business School and PI in D^3's Trustworthy AI Lab. She is also a faculty affiliate in the Department of Computer Science at 性视界 University, the 性视界 Data Science Initiative, Center for Research on Computation and Society, and the Laboratory of Innovation Science at 性视界. She teaches the first year course on Technology and Operations Management, and has previously offered multiple courses and guest lectures on a diverse set of topics pertaining to Artificial Intelligence (AI) and Machine Learning (ML), and their real world implications.

is a Senior Research Scientist at Google DeepMind, where she focuses on aligning AI with human values by understanding, controlling, and demystifying language models. She earned her Ph.D. from the MIT Media Lab's Affective Computing Group and has conducted research at Google Research, Microsoft Research, and EPFL. Previously, she worked in digital mental health, collaborating with 性视界 medical professionals and publishing in leading journals.


The post Bridging the Gap Between Understanding and Control: Insights into AI Interpretability appeared first on 性视界 Business School AI Institute.

]]>
Revolutionizing Data Privacy: Machine Unlearning in Action /revolutionizing-data-privacy-machine-unlearning-in-action/ Wed, 08 Jan 2025 15:27:46 +0000 /?p=24744 In today's data-driven world, businesses face the dual challenge of leveraging vast datasets to gain insights while ensuring compliance with stringent data privacy regulations. The concept of machine unlearning, a method for efficiently removing the influence of specific data points from machine learning models, represents a paradigm shift in managing data responsibly. Recent research explores […]

The post Revolutionizing Data Privacy: Machine Unlearning in Action appeared first on 性视界 Business School AI Institute.

]]>
In today's data-driven world, businesses face the dual challenge of leveraging vast datasets to gain insights while ensuring compliance with stringent data privacy regulations. The concept of machine unlearning, a method for efficiently removing the influence of specific data points from machine learning models, represents a paradigm shift in managing data responsibly.

Recent research explores a new framework for machine unlearning in the article "Attribute-to-Delete: Machine Unlearning via Datamodel Matching," by Seth Neel, 性视界 Business School Assistant Professor, Faculty Affiliate, and Principal Investigator at the Digital Data Design Institute (D^3) Trustworthy AI Lab; Roy Rinberg, PhD student in Computer Science at the 性视界 John A. Paulson School of Engineering and Applied Sciences (co-advised by Seth Neel); Kristian Georgiev, PhD candidate in MIT's Electrical Engineering & Computer Science (EECS) Department; Andrew Ilyas, Postdoctoral Scholar in Computer Science at Stanford University; Shivam Garg, PhD student at Stanford University; Sung Min Park, Stein Fellow at Stanford University; and Aleksander Madry, Cadence Design Systems Professor in MIT's EECS Department.

Key Insight: The Growing Need for Machine Unlearning

“The goal of machine unlearning is to remove (or ‘unlearn’) the impact of a specific collection of training examples from a trained machine learning model.” [1]

The research emphasizes how regulatory pressures, like the EU's Right to Be Forgotten, and practical needs, such as mitigating the effects of poisoned, toxic, or outdated data and resolving copyright infringement issues in generative AI models, are driving the demand for machine unlearning. The authors demonstrate how machine unlearning can address these challenges by enabling models to function as though specific data points (the "forget set") were never part of the training process.
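
Stated in symbols (our notation, not the paper's): if \mathcal{A} is the training algorithm, D the full training set, F \subseteq D the forget set, and \mathcal{U} the unlearning procedure, the goal is

\mathcal{U}\big(\mathcal{A}(D),\, F\big) \;\approx\; \mathcal{A}(D \setminus F)

where \approx means the unlearned model's outputs should be statistically indistinguishable from those of a model retrained from scratch without F.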

Key Insight: A Breakthrough Framework: Datamodel Matching

"Datamodel Matching (DMM) […] introduces a reduction from unlearning to data attribution, allowing us to translate future improvements in the latter field to better algorithms for the former." [2]

The authors introduce DMM, a novel approach that links machine unlearning to data attribution. Unlike traditional retraining, which can be computationally expensive, DMM employs data attribution to predict what a model's outputs would be if it were retrained without the forget-set data, and then fine-tunes the model to match these predicted outputs (a toy sketch follows the key concepts below).

Key concepts:

  • Data attribution: A framework within machine learning that connects specific training data samples to the predictions made by a trained model. This concept focuses on understanding and quantifying the influence of individual training data points on a model’s behavior and predicting how changes to the training dataset, such as adding or removing data points, would affect a model’s outputs.
  • Oracle Matching (OM): A hypothetical and idealized approach to machine unlearning where a model is fine-tuned to match the outputs of an oracle model. The oracle model represents a machine learning model that has been retrained from scratch on the dataset excluding the data points to be unlearned (the forget set). 
  • Fine-tuning: A process in which an already-trained machine learning model is updated to achieve a specific objective by making small adjustments to its parameters. In the context of machine unlearning, fine-tuning is used to modify a model so it behaves as though the forget-set data were never part of the original training process. The fine-tuned model's behavior should be statistically indistinguishable from the oracle model on both the forget set and the retained data.
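
The toy sketch below shows the shape of the DMM recipe: a linear datamodel predicts what the oracle's output margins would be without the forget set, and gradient steps then pull the current outputs toward those targets. For simplicity it optimizes the output margins directly, whereas the actual method fine-tunes the model's weights, and the linear datamodel here is synthetic rather than learned.

import numpy as np

rng = np.random.default_rng(1)
n_train, n_eval = 100, 5
forget = rng.choice(n_train, size=10, replace=False)
mask = np.ones(n_train)
mask[forget] = 0.0                      # 1 = retained example, 0 = forgotten

# toy datamodel: each evaluation input's margin is a linear function of which
# training points are present (W and b would be learned by data attribution)
W = rng.normal(scale=0.1, size=(n_eval, n_train))
b = rng.normal(size=n_eval)
oracle_targets = W @ mask + b           # predicted margins "as if" retrained on D \ F

margins = W @ np.ones(n_train) + b      # current model, trained on all of D
for _ in range(200):                    # "fine-tune" toward the predicted targets
    margins -= 0.1 * (margins - oracle_targets)   # gradient of 0.5 * ||margins - targets||^2

print(np.abs(margins - oracle_targets).max())     # ~0: outputs now match the oracle prediction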

Key Insight: Addressing the Missing Targets Problem

“A pervasive challenge […] for fine-tuning-based approaches is what we refer to as the missing targets problem.” [3]

Existing fine-tuning-based unlearning methods suffer from the "missing targets" problem: without actually retraining, it is unclear what precise output a model should produce after forgetting a particular data point or group of points. DMM circumvents this issue by using data attribution to estimate the target outputs of an oracle model and then fine-tuning to match them, ensuring stability and preventing overshooting or undershooting the target loss.

Key Insight: Practical Efficiency with Broad Applications

“[DMM] achieves state-of-the-art performance across a suite of empirical evaluations.” [4]

To better assess unlearning performance, the researchers propose a new evaluation metric called KL Divergence of Margins (KLoM). This metric directly measures the distributional difference between the outputs of unlearned models and those of models retrained without the forget set. The authors' experiments demonstrate that DMM delivers results comparable to full retraining at a fraction of the computational cost.
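
A KLoM-style score can be sketched as follows: sample output margins from many retrained ("oracle") models and from models produced by the unlearning algorithm, then estimate the divergence between the two margin distributions. The Gaussian samples below are synthetic placeholders, and the histogram estimator is just one reasonable choice.

import numpy as np
from scipy.stats import entropy   # entropy(p, q) computes KL(p || q) for discrete distributions

rng = np.random.default_rng(2)
oracle_margins = rng.normal(loc=2.0, scale=1.0, size=5000)      # retrained-from-scratch models
unlearned_margins = rng.normal(loc=1.6, scale=1.1, size=5000)   # candidate unlearning algorithm

bins = np.linspace(-3.0, 7.0, 61)
p, _ = np.histogram(oracle_margins, bins=bins)
q, _ = np.histogram(unlearned_margins, bins=bins)
eps = 1e-9                        # smooth empty bins so the divergence stays finite
print("KLoM-style score:", entropy(p + eps, q + eps))   # 0 would mean indistinguishable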

Why This Matters

DMM represents a significant step forward in the machine unlearning field, offering a more reliable and efficient approach to unlearning in complex neural networks. For C-suite executives and business professionals, this research highlights the potential for improved data management practices and reduced computational costs associated with model maintenance. This approach opens new avenues for future research and offers practical solutions for addressing privacy concerns and data removal requests in real-world applications.

References

[1] Kristian Georgiev, Roy Rinberg, Sung Min Park, Shivam Garg, Andrew Ilyas, Aleksander Madry, and Seth Neel, "Attribute-to-Delete: Machine Unlearning via Datamodel Matching", arXiv preprint arXiv:2410.23232 (October 2024): 1.

[2] Georgiev et al., “Attribute-to-Delete: Machine Unlearning via Datamodel Matching,” 3.

[3] Georgiev et al., “Attribute-to-Delete: Machine Unlearning via Datamodel Matching,” 2.

[4] Georgiev et al., “Attribute-to-Delete: Machine Unlearning via Datamodel Matching,” 3.

Meet the Authors

is an Assistant Professor housed in the Department of Technology and Operations Management (TOM) at 性视界 Business School, and a Faculty Affiliate in Computer Science at SEAS. He is the Principal Investigator at the Digital Data Design Institute (D^3) Trustworthy AI Lab.

is a PhD student in Computer Science at the 性视界 John A. Paulson School of Engineering and Applied Sciences, and is co-advised by Seth Neel. His research interests focus on public-interest technology, with a recent focus on privacy technology.

is a PhD candidate at MIT's Electrical Engineering & Computer Science (EECS) Department advised by Aleksander Madry. They are interested in the science of deep learning and deep learning for science.

is a Postdoctoral Scholar at Stanford working with Prof. Tatsu Hashimoto, Prof. Percy Liang, and Prof. James Zou. He received his PhD from MIT, where he was advised by Prof. Aleksander Mądry. He is interested in understanding and improving machine learning (ML) methodology through the lens of data.

is a PhD student at Stanford, advised by Greg Valiant. He is part of the Machine Learning Group and the Theory Group at Stanford. Prior to Stanford, he worked at Microsoft Research India.

is a Stein Fellow at Stanford University. His research pursues a precise empirical understanding of the entire machine learning pipeline, with an emphasis on data. His interests span tracing predictions back to training data, identifying and alleviating data bias, and studying machine learning robustness.

is the Cadence Design Systems Professor in MIT's Department of Electrical Engineering & Computer Science (EECS). He received his Ph.D. from MIT in 2011. He is the Director of the MIT Center for Deployable Machine Learning and a Faculty Co-Lead of the MIT AI Policy Forum. Prior to joining MIT's faculty, he spent a year as a postdoctoral researcher at Microsoft Research New England.


The post Revolutionizing Data Privacy: Machine Unlearning in Action appeared first on 性视界 Business School AI Institute.

]]>