Ethics & Policy
AGI Is Not Multimodal

“In projecting language back as the model for thought, we lose sight of the tacit embodied understanding that undergirds our intelligence.” –Terry Winograd
The recent successes of generative AI models have convinced some that AGI is imminent. While these models appear to capture the essence of human intelligence, they defy even our most basic intuitions about it. They have emerged not because they are thoughtful solutions to the problem of intelligence, but because they scaled effectively on hardware we already had. Seduced by the fruits of scale, some have come to believe that it provides a clear pathway to AGI. The most emblematic case of this is the multimodal approach, in which massive modular networks are optimized for an array of modalities that, taken together, appear general. However, I argue that this strategy is sure to fail in the near term; it will not lead to human-level AGI that can, e.g., perform sensorimotor reasoning, motion planning, and social coordination. Instead of trying to glue modalities together into a patchwork AGI, we should pursue approaches to intelligence that treat embodiment and interaction with the environment as primary, and see modality-centered processing as emergent phenomena.
Preface: Disembodied definitions of Artificial General Intelligence — emphasis on general — exclude crucial problem spaces that we should expect AGI to be able to solve. A true AGI must be general across all domains. Any complete definition must at least include the ability to solve problems that originate in physical reality, e.g. repairing a car, untying a knot, preparing food, etc. As I will discuss in the next section, what is needed for these problems is a form of intelligence that is fundamentally situated in something like a physical world model. For more discussion on this, look out for Designing an Intelligence. Edited by George Konidaris, MIT Press, forthcoming.
Why We Need the World, and How LLMs Pretend to Understand It
TLDR: I first argue that true AGI needs a physical understanding of the world, as many problems cannot be converted into a problem of symbol manipulation. It has been suggested by some that LLMs are learning a model of the world through next token prediction, but it is more likely that LLMs are learning bags of heuristics to predict tokens. This leaves them with a superficial understanding of reality and contributes to false impressions of their intelligence.
The most shocking result of the predict-next-token objective is that it yields AI models that reflect a deeply human-like understanding of the world, despite having never observed it like we have. This result has led to confusion about what it means to understand language and even to understand the world — something we have long believed to be a prerequisite for language understanding. One explanation for the capabilities of LLMs comes from an emerging theory suggesting that they induce models of the world through next-token prediction. Proponents of this theory cite the prowess of SOTA LLMs on various benchmarks, the convergence of large models to similar internal representations, and their favorite rendition of the idea that “language mirrors the structure of reality,” a notion that has been espoused at least by Plato, Wittgenstein, Foucault, and Eco. While I’m generally in support of digging up esoteric texts for research inspiration, I’m worried that this metaphor has been taken too literally. Do LLMs really learn implicit models of the world? How could they otherwise be so proficient at language?
One source of evidence in favor of the LLM world modeling hypothesis is the Othello paper, wherein researchers were able to predict the board of an Othello game from the hidden states of a transformer model trained on sequences of legal moves. However, there are many issues with generalizing these results to models of natural language. For one, whereas Othello moves can provably be used to deduce the full state of an Othello board, we have no reason to believe that a complete picture of the physical world can be inferred from a linguistic description. What sets the game of Othello apart from many tasks in the physical world is that Othello fundamentally resides in the land of symbols, and is merely implemented using physical tokens to make it easier for humans to play. A full game of Othello can be played with just pen and paper, but one can’t, e.g., sweep a floor, do dishes, or drive a car with just pen and paper. To solve such tasks, you need some physical conception of the world beyond what humans can merely say about it. Whether that conception of the world is encoded in a formal world model or, e.g., a value function is up for debate, but it is clear that there are many problems in the physical world that cannot be fully represented by a system of symbols and solved with mere symbol manipulation.
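To make the point about Othello concrete, here is a minimal sketch of how a move sequence alone determines the board. It is an illustration only and is not taken from the Othello paper: the function names (`start_board`, `flips`, `replay`), the coordinate convention (columns A to H, rows 1 to 8), and the simplified pass handling (a move that captures nothing for the side to move is assumed to follow a pass) are all assumptions.

```python
# Minimal Othello replay: reconstruct the board from a list of legal moves.
# board[row][col], where row = rank - 1 and col = file (A=0 ... H=7).
DIRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def start_board():
    """Standard opening position: white on D4 and E5, black on E4 and D5."""
    board = [["." for _ in range(8)] for _ in range(8)]
    board[3][3], board[4][4] = "W", "W"
    board[3][4], board[4][3] = "B", "B"
    return board

def flips(board, row, col, player):
    """Return the opponent discs that placing `player` at (row, col) would flip."""
    opponent = "W" if player == "B" else "B"
    captured = []
    for dr, dc in DIRS:
        line = []
        r, c = row + dr, col + dc
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == opponent:
            line.append((r, c))
            r, c = r + dr, c + dc
        # A run of opponent discs counts only if it ends on one of our own discs.
        if line and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            captured.extend(line)
    return captured

def replay(moves):
    """Replay moves such as ["D3", "C3"] (black first) and return the board."""
    board = start_board()
    player = "B"
    for move in moves:
        col, row = ord(move[0].upper()) - ord("A"), int(move[1]) - 1
        if board[row][col] != ".":
            raise ValueError(f"{move} is already occupied")
        captured = flips(board, row, col, player)
        if not captured:
            # Simplifying assumption: the side to move passed.
            player = "W" if player == "B" else "B"
            captured = flips(board, row, col, player)
            if not captured:
                raise ValueError(f"{move} is not legal for either side")
        board[row][col] = player
        for r, c in captured:
            board[r][c] = player
        player = "W" if player == "B" else "B"
    return board

for row in replay(["D3", "C3"]):
    print("".join(row))
```

Nothing outside the symbol sequence is needed to recover the state, which is exactly what cannot be assumed for descriptions of the physical world.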
Another issue, stated in Melanie Mitchell’s recent piece and supported by this paper, is that there is evidence that generative models can score remarkably well on sequence prediction tasks while failing to learn models of the worlds that created such sequence data, e.g. by learning comprehensive sets of idiosyncratic heuristics. For example, it was pointed out in this blog post that OthelloGPT learned sequence prediction rules that don’t actually hold for all possible Othello games, like “if the token for B4 does not appear before A4 in the input string, then B4 is empty.” While one can argue that it doesn’t matter how a world model predicts the next state of the world, it should raise suspicion when that prediction reflects a better understanding of the training data than the underlying world that led to such data. This, unfortunately, is the central fault of the predict-next-token objective, which seeks only to retain information relevant to the prediction of the next token. If it can be done with something easier to learn than a world model, it likely will be.
To claim without caveat that predicting the effects of earlier symbols on later symbols requires a model of the world like the ones humans generate from perception would be to abuse the “world model” notion. Unless we disagree on what the world is, it should be clear that a true world model can be used to predict the next state of the physical world given a history of states. Similar world models, which predict high fidelity observations of the physical world, are leveraged in many subfields of AI including model-based reinforcement learning, task and motion planning in robotics, causal world modeling, and areas of computer vision to solve problems instantiated in physical reality. LLMs are simply not running physics simulations in their latent next-token calculus when they ask you if your person, place, or thing is bigger than a breadbox. In fact, I conjecture that the behavior of LLMs is not thanks to a learned world model, but to brute force memorization of incomprehensibly abstract rules governing the behavior of symbols, i.e. a model of syntax.
Quick primer:
- Syntax is a subfield of linguistics that studies how words of various grammatical categories (e.g. parts of speech) are arranged together into sentences, which can be parsed into syntax trees. Syntax studies the structure of sentences and the atomic parts of speech that compose them.
- Semantics is another subfield concerned with the literal meaning of sentences, e.g., compiling “I am feeling chilly” into the idea that you are experiencing cold. Semantics boils language down to literal meaning, which is information about the world or human experience.
- Pragmatics studies the interplay of physical and conversational context on speech interactions, like when someone knows to close an ajar window when you tell them “I am feeling chilly.” Pragmatics involves interpreting speech while reasoning about the environment and the intentions and hidden knowledge of other agents.
Without getting too technical, there is intuitive evidence that somewhat separate systems of cognition are responsible for each of these linguistic faculties. Look no further than the capability for humans to generate syntactically well-formed sentences that have no semantic meaning, e.g. Chomsky’s famous sentence “Colorless green ideas sleep furiously,” or sentences with well-formed semantics that make no pragmatic sense, e.g. responding merely with “Yes, I can” when asked, “Can you pass the salt?” Crucially, it is the fusion of the disparate cognitive abilities underpinning them that coalesces into human language understanding. For example, there isn’t anything syntactically wrong with the sentence, “The fridge is in the apple,” as a syntactic account of “the fridge” and “the apple” would categorize them as noun phrases that can be used to produce a sentence with the production rule, S → (NP “is in” NP). However, humans recognize an obvious semantic failure in the sentence that becomes apparent after attempting to reconcile its meaning with our understanding of reality: we know that fridges are larger than apples, and could not be fit into them.
But what if you have never perceived the real world, yet still were trying to figure out whether the sentence was ill-formed? One solution could be to embed semantic information at the level of syntax, e.g., by inventing new syntactic categories, NP_{the fridge} and NP_{the apple}, and a single new production rule that prevents semantic misuse: S → (NP_{the apple} “is in” NP_{the fridge}). While this strategy would no longer require grounded world knowledge about, e.g., fridges and apples, it would require special grammar rules for every semantically well-formed construction… which is actually possible to learn given a massive corpus of natural language. Crucially, this would not be the same thing as grasping semantics, which in my view is fundamentally about understanding the nature of the world.
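The difference between the generic production rule and its lexicalized replacement can be sketched with a toy recognizer. This is only an illustration of the grammar trick described above, not a claim about how LLMs are implemented; the names `NOUN_PHRASES`, `LEXICALIZED_RULES`, `accepts_generic`, and `accepts_lexicalized` are hypothetical.

```python
# Toy recognizer for sentences of the form: NP "is in" NP.
NOUN_PHRASES = {"the fridge", "the apple"}

def accepts_generic(sentence):
    """Purely syntactic rule: S -> NP "is in" NP, for any two noun phrases."""
    left, sep, right = sentence.partition(" is in ")
    return bool(sep) and left in NOUN_PHRASES and right in NOUN_PHRASES

# Lexicalized rules: one production per semantically sensible pairing,
# e.g. S -> NP(the apple) "is in" NP(the fridge).
LEXICALIZED_RULES = {("the apple", "the fridge")}

def accepts_lexicalized(sentence):
    """Accept only pairings that were explicitly listed as productions."""
    left, sep, right = sentence.partition(" is in ")
    return bool(sep) and (left, right) in LEXICALIZED_RULES

print(accepts_generic("the fridge is in the apple"))      # generic rule accepts
print(accepts_lexicalized("the fridge is in the apple"))  # lexicalized rules reject
print(accepts_lexicalized("the apple is in the fridge"))  # listed pairing accepted
```

The lexicalized recognizer never consults any fact about sizes; it only checks membership in a table, and that table must grow with every sensible pairing.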
Finding that LLMs have reduced problems of semantics and pragmatics into syntax would have profound implications on how we should view their intelligence. People often treat language proficiency as a proxy for general intelligence by, e.g., strongly associating pragmatic and semantic understanding with the cognitive abilities that undergird them in humans. For example, someone who appears well-read and graceful in navigating social interactions is likely to score high in traits like sustained attention and theory of mind, which lie closer to measures of raw cognitive ability. In general, these proxies are reasonable for assessing a person’s general intelligence, but not an LLM’s, as the apparent linguistic skills of LLMs could come from entirely separate mechanisms of cognition.
The Bitter Lesson Revisited
TLDR: Sutton’s Bitter Lesson has sometimes been interpreted as meaning that making any assumptions about the structure of AI is a mistake. This is both unproductive and a misinterpretation; it is precisely when humans think deeply about the structure of intelligence that major advancements occur. Despite this, scale maximalists have implicitly suggested that multimodal models can be a structure-agnostic framework for AGI. Ironically, today’s multimodal models contradict Sutton’s Bitter Lesson by making implicit assumptions about the structure of individual modalities and how they should be sewn together. In order to build AGI, we must either think deeply about how to unite existing modalities, or dispense with them altogether in favor of an interactive and embodied cognitive process.

The paradigm that led to the success of LLMs is marked primarily by scale, not efficiency. We have effectively trained a pile of one trillion ants for one billion years to mimic the form and function of a Formula 1 race car; eventually it gets there, but wow was the process inefficient. This analogy nicely captures a debate between structuralists, who want to build things like “wheels” and “axles” into AI systems, and scale maximalists, who want more ants, years, and F1 races to train on. Despite many decades of structuralist study in linguistics, the unstructured approaches of scale maximalism have yielded far better ant-racecars in recent years. This was most notably articulated by Rich Sutton — a recent recipient of the Turing Award along with Andy Barto for their work in Reinforcement Learning — in his piece “The Bitter Lesson.”
[W]e should build in only the meta-methods that can find and capture this arbitrary complexity… Essential to these methods is that they can find good approximations, but the search for them should be by our methods, not by us. We want AI agents that can discover like we can, not which contain what we have discovered. – Rich Sutton
Sutton’s argument is that methods that leverage computational resources will outpace methods that do not, and that any structure for problem-solving built as an inductive bias into AI will hinder it from learning better solutions. This is a compelling argument that I believe has been seriously misinterpreted by some as implying that making any assumptions about structure is a false step. It is, in fact, human intuition that was responsible for many significant advancements in the development of SOTA neural network architectures. For example, Convolutional Neural Networks made an assumption about translation invariance for pattern recognition in images and kickstarted the modern field of deep learning for computer vision; the attention mechanism of Transformers made an assumption about the long-distance relationships between symbols in a sentence that made ChatGPT possible and had nearly everyone drop their RNNs; and 3D Gaussian Splatting made an assumption about the solidity of physical objects that made it more performant than NeRFs. Potentially none of these methodological assumptions apply to the entire domain of possible scenes, images, or token streams, but they do for the specific ones that humans have curated and formed structural intuitions about. Let’s not forget that humans have co-evolved with the environments that these datasets are drawn from.
The real question is how we might heed Sutton’s Bitter Lesson in our development of AGI. The scale maximalist approach worked for LLMs and LVMs (large vision models) because we had natural deposits of text and image data, but an analogous application of scale maximalism to AGI would require forms of embodiment data that we simply don’t have. One solution to this data scarcity issue extends the generative modeling paradigm to multimodal modeling — encompassing language, vision, and action — with the hope that a general intelligence can be built by summing together general models of narrow modalities.
There are multiple issues with this approach. First, there are deep connections between modalities that are unnaturally severed in the multimodal setting, making the problem of concept synthesis ever more difficult. In practice, uniting modalities often involves pre-training dedicated neural modules for each modality and joining them together into a joint embedding space. In the early days, this was achieved by nudging the embeddings of, e.g. (language, vision, action) tuples to converge to similar latent vectors of meaning, a vast oversimplification of the kinds of relationships that may exist between modalities. One can imagine, e.g., captioning an image at various levels of abstraction, or implementing the same linguistic instruction with different sets of physical actions. Such one-to-many relationships suggest that a contrastive embedding objective is not suitable.
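A small numerical sketch shows why one-to-many pairings strain a contrastive objective. It uses a generic InfoNCE-style loss, which is an assumption about the kind of objective meant here rather than the formulation of any particular model; the two-dimensional embeddings, the example captions, and the names `cosine`, `info_nce`, and `total_loss` are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, candidates, positive_index, temperature=0.1):
    """Cross-entropy of selecting the positive candidate for the anchor."""
    logits = [cosine(anchor, c) / temperature for c in candidates]
    log_norm = math.log(sum(math.exp(x) for x in logits))
    return log_norm - logits[positive_index]

# Two equally valid captions for one image, at different levels of abstraction.
captions = [[1.0, 0.0],   # hypothetical embedding of "a dog"
            [0.0, 1.0]]   # hypothetical embedding of "a pet playing outside"

def total_loss(image):
    """Sum of the losses when each caption in turn is treated as the positive."""
    return sum(info_nce(image, captions, i) for i in range(len(captions)))

image_on_first_caption = [1.0, 0.0]
image_between_captions = [1.0, 1.0]

print(total_loss(image_on_first_caption))
print(total_loss(image_between_captions))
print(2 * math.log(2))  # lower bound on the summed loss for two competing positives
```

Because each valid caption acts as a negative for the other, the summed loss can never fall below 2 ln 2 wherever the image embedding is placed, which is one way to see the mismatch between the objective and one-to-many relationships.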
While modern approaches do not make such stringent assumptions about how modalities should be united, they still universally encode percepts from all modalities (e.g. text, images) into the same latent space. Intuitively, it would seem that such latent spaces could serve as common conceptual ground across modalities, analogous to a space of human concepts. However, these latent spaces do not cogently capture all information relevant to a concept, and instead rely on modality-specific decoders to flesh out important details. The “meaning” of a percept is not in the vector it is encoded as, but in the way relevant decoders process this vector into meaningful outputs. As long as various encoders and decoders are subject to modality-specific training objectives, “meaning” will be decentralized and potentially inconsistent across modalities, especially as a result of pre-training. This is not a recipe for the formation of coherent concepts.
Furthermore, it is not clear that today’s modalities are an appropriate partitioning of the observation and action spaces for an embodied agent. It is not obvious that, e.g., images and text should be represented as separate observation streams, nor text production and motion planning as separate action capabilities. The human capacities for reading, seeing, speaking, and moving are ultimately mediated by overlapping cognitive structures. Making structural assumptions about how modalities ought to be processed is likely to hinder the discovery of more fundamental cognition that is responsible for processing data in all modalities. One solution would be to consolidate unnaturally partitioned modalities into a unified data representation. This would encourage networks to learn intelligent processes that generalize across modalities. Intuitively, a model that can understand the visual world as well as humans can — including everything from human writing to traffic signs to visual art — should not make a serious architectural distinction between images and text. Part of the reason why VLMs can’t, e.g., count the number of letters in a word is because they can’t see what they are writing.
Finally, the learn-from-scale approach trains models to copy the conceptual structure of humans instead of learning the general capability to form novel concepts on their own. Humans have spent hundreds of thousands of years refining concepts and passing them memetically through culture and language. Today’s models are trained only on the end result of this process: the present-day conceptual structures that make it into the corpus. By optimizing for the ultimate products of our intelligence, we have ignored the question of how those products were invented and discovered. Humans have a unique ability to form durable concepts from few examples and ascribe names to them, reason about them analogically, etc. While the in-context capabilities of today’s models can be impressive, they grow increasingly limited as tasks become more complex and stray further from the training data. The flexibility to form new concepts from experience is a foundational attribute of general intelligence; we should think carefully about how it arises.
While structure-agnostic scale maximalism has succeeded in producing LLMs and LVMs that pass Turing tests, a multimodal scale maximalist approach to AGI will not bear similar fruit. Instead of pre-supposing structure in individual modalities, we should design a setting in which modality-specific processing emerges naturally. For example, my recent paper on visual theory of mind saw abstract symbols naturally emerge from communication between image-classifying agents, blurring the lines between text and image processing. Eventually, we should hope to reintegrate as many features of intelligence as possible under the same umbrella. However, it is not clear whether there is genuine commercial viability in such an approach as long as scaling and fine-tuning narrow intelligence models solves commercial use-cases.
Conclusion
The overall promise of scale maximalism is that a Frankenstein AGI can be sewed together using general models of narrow domains. I argue that this is extremely unlikely to yield an AGI that feels complete in its intelligence. If we intend to continue reaping the streamlined efficiency of modality-specific processing, we must be intentional in how modalities are united — ideally drawing from human intuition and classical fields of study, e.g. this work from MIT. Alternatively, we can re-formulate learning as an embodied and interactive process where disparate modalities naturally fuse together. We could do this by, e.g., processing images, text, and video using the same perception system and producing actions for generating text, manipulating objects, and navigating environments using the same action system. What we will lose in efficiency we will gain in flexible cognitive ability.
In a sense, the most challenging mathematical piece of the AGI puzzle has already been solved: the discovery of universal function approximators. What’s left is to inventory the functions we need and determine how they ought to be arranged into a coherent whole. This is a conceptual problem, not a mathematical one.
Acknowledgements
I would like to thank Lucas Gelfond, Daniel Bashir, George Konidaris, and my father, Joseph Spiegel, for their thoughtful and thorough feedback on this work. Thanks to Alina Pringle for the wonderful illustration made for this piece.
Author Bio
Benjamin is a PhD candidate in Computer Science at Brown University. He is interested in models of language understanding that ground meaning to elements of structured decision-making. For more info see his personal website.
Citation
For attribution in academic contexts or books, please cite this work as
Benjamin A. Spiegel, "AGI Is Not Multimodal", The Gradient, 2025.
@article{spiegel2025agi,
author = {Benjamin A. Spiegel},
title = {AGI Is Not Multimodal},
journal = {The Gradient},
year = {2025},
howpublished = {\url{https://thegradient.pub/agi-is-not-multimodal}},
}
References
Andreas, Jacob. “Language Models, World Models, and Human Model-Building.” Mit.edu, 2024, lingo.csail.mit.edu/blog/world_models/.
Belkin, Mikhail, et al. “Reconciling modern machine-learning practice and the classical bias–variance trade-off.” Proceedings of the National Academy of Sciences 116.32 (2019): 15849-15854.
Bernhard Kerbl, et al. “3D Gaussian Splatting for Real-Time Radiance Field Rendering.” ACM Transactions on Graphics, vol. 42, no. 4, 26 July 2023, pp. 1–14, https://doi.org/10.1145/3592433.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, Massachusetts: MIT Press.
Designing an Intelligence. Edited by George Konidaris, MIT Press, 2026.
Emily M. Bender and Alexander Koller. 2020. Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5185–5198, Online. Association for Computational Linguistics.
Eye on AI. “The Mastermind behind GPT-4 and the Future of AI | Ilya Sutskever.” YouTube, 15 Mar. 2023, www.youtube.com/watch?v=SjhIlw3Iffs&list=PLpdlTIkm0-jJ4gJyeLvH1PJCEHp3NAYf4&index=64. Accessed 18 May 2025.
Frank, Michael C. “Bridging the data gap between children and large language models.” Trends in cognitive sciences vol. 27,11 (2023): 990-992. doi:10.1016/j.tics.2023.08.007
Garrett, Caelan Reed, et al. “Integrated task and motion planning.” Annual review of control, robotics, and autonomous systems 4.1 (2021): 265-293.
Goodhart, C.A.E. (1984). Problems of Monetary Management: The UK Experience. In: Monetary Theory and Practice. Palgrave, London. https://doi.org/10.1007/978-1-349-17295-5_4
Hooker, Sara. The hardware lottery. Commun. ACM 64, 12 (December 2021), 58–65. https://doi.org/10.1145/3467017
Huh, Minyoung, et al. “The Platonic Representation Hypothesis.” Forty-first International Conference on Machine Learning. 2024.
Kaplan, Jared, et al. “Scaling laws for neural language models.” arXiv preprint arXiv:2001.08361 (2020).
Lake, Brenden M. et al. “Building Machines That Learn and Think like People.” Behavioral and Brain Sciences 40 (2017): e253. Web.
Li, Kenneth, et al. “Emergent world representations: Exploring a sequence model trained on a synthetic task.” ICLR (2023).
Luiten, Jonathon, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. “Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis.” 3DV. 2024.
Mao, Jiayuan, Chuang Gan, Pushmeet Kohli, Joshua B. Tenenbaum, and Jiajun Wu. “The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision.” International Conference on Learning Representations. 2019.
Mitchell, Melanie. “LLMs and World Models, Part 1.” Substack.com, AI: A Guide for Thinking Humans, 13 Feb. 2025, aiguide.substack.com/p/llms-and-world-models-part-1. Accessed 18 May 2025.
Mu, Norman. “Norman Mu | the Myth of Data Inefficiency in Large Language Models.” Normanmu.com, 14 Feb. 2025, www.normanmu.com/2025/02/14/data-inefficiency-llms.html. Accessed 18 May 2025.
Newell, Allen, and Herbert A. Simon. “Computer Science as Empirical Inquiry: Symbols and Search.” Communications of the ACM, vol. 19, no. 3, 1 Mar. 1976, pp. 113–126, https://doi.org/10.1145/360018.360022.
Peng, Hao, et al. “When Does In-Context Learning Fall Short and Why? A Study on Specification-Heavy Tasks.” ArXiv.org, 2023, arxiv.org/abs/2311.08993.
Spiegel, Benjamin, et al. “Visual Theory of Mind Enables the Invention of Early Writing Systems.” CogSci, 2025, arxiv.org/abs/2502.01568.
Sutton, Richard S. Introduction to Reinforcement Learning. Cambridge, Mass., MIT Press, 1998.
Vafa, Keyon, et al. “Evaluating the world model implicit in a generative model.” Advances in Neural Information Processing Systems 37 (2024): 26941-26975.
Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N; Kaiser, Łukasz; Polosukhin, Illia (December 2017). “Attention is All you Need”. In I. Guyon and U. Von Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett (ed.). 31st Conference on Neural Information Processing Systems (NIPS). Advances in Neural Information Processing Systems. Vol. 30. Curran Associates, Inc. arXiv:1706.03762.
Winograd, Terry. “Thinking Machines: Can There Be? Are We?” The Boundaries of Humanity: Humans, Animals, Machines, edited by James Sheehan and Morton Sosna, Berkeley: University of California Press, 1991, pp. 198–223.
Wu, Shangda, et al. “Beyond language models: Byte models are digital world simulators.” arXiv preprint arXiv:2402.19155 (2024).
Navigating the Investment Implications of Regulatory and Reputational Challenges

The generative AI industry, once hailed as a beacon of innovation, now faces a storm of regulatory scrutiny and reputational crises. For investors, the stakes are clear: companies like Meta, Microsoft, and Google must navigate a rapidly evolving legal landscape while balancing ethical obligations with profitability. This article examines how regulatory and reputational risks are reshaping the investment calculus for AI leaders, with a focus on Meta’s struggles and the contrasting strategies of its competitors.
The Regulatory Tightrope
In 2025, generative AI platforms are under unprecedented scrutiny. A Senate investigation led by Senator Josh Hawley (R-MO) is probing whether Meta’s AI systems enabled harmful interactions with children, including romantic roleplay and the dissemination of false medical advice [1]. Leaked internal documents revealed policies inconsistent with Meta’s public commitments, prompting lawmakers to demand transparency and documentation [1]. These revelations have not only intensified federal oversight but also spurred state-level action. Illinois and Nevada, for instance, have introduced legislation to regulate AI mental health bots, signaling a broader trend toward localized governance [2].
At the federal level, bipartisan efforts are gaining momentum. The AI Accountability and Personal Data Protection Act, introduced by Hawley and Richard Blumenthal, seeks to establish legal remedies for data misuse, while the No Adversarial AI Act aims to block foreign AI models from U.S. agencies [1]. These measures reflect a growing consensus that AI governance must extend beyond corporate responsibility to include enforceable legal frameworks.
Reputational Fallout and Legal Precedents
Meta’s reputational risks have been compounded by high-profile lawsuits. A Florida case involving a 14-year-old’s suicide linked to a Character.AI bot survived a First Amendment dismissal attempt, setting a dangerous precedent for liability [2]. Critics argue that AI chatbots failing to disclose their non-human nature or providing false medical advice erode public trust [4]. Consumer advocacy groups and digital rights organizations have amplified these concerns, pressuring companies to adopt ethical AI frameworks [3].
Meanwhile, Microsoft and Google have faced their own challenges. A bipartisan coalition of U.S. attorneys general has warned tech giants to address AI risks to children, with Meta’s alleged failures drawing particular criticism [1]. Google’s decision to shift data-labeling work away from Scale AI—after Meta’s $14.8 billion investment in the firm—highlights the competitive and regulatory tensions reshaping the industry [2]. Microsoft and OpenAI are also reevaluating their ties to Scale AI, underscoring the fragility of partnerships in a climate of mistrust [4].
Financial Implications: Capital Expenditures and Stock Volatility
Meta’s aggressive AI strategy has come at a cost. The company’s projected 2025 AI infrastructure spending ($66–72 billion) approaches Microsoft’s $80 billion capex for data centers, yet Meta’s stock has shown greater volatility, dropping 2.1% amid regulatory pressures [2]. Antitrust lawsuits threatening to force the divestiture of Instagram or WhatsApp add further uncertainty [5]. In contrast, Microsoft’s stock has demonstrated stability, with a lower average post-earnings drawdown of 8% compared to Meta’s 12% [2]. Microsoft’s focus on enterprise AI and Azure’s record $75 billion annual revenue has insulated it from some of the reputational turbulence facing Meta [1].
Despite Meta’s 78% earnings forecast hit rate (vs. Microsoft’s 69%), its high-risk, high-reward approach raises questions about long-term sustainability. For instance, Meta’s Reality Labs segment, which includes AI-driven projects, has driven 38% year-over-year EPS growth but also contributed to reorganizations and attrition [6]. Investors must weigh these factors against Microsoft’s diversified business model and strategic investments, such as its $13 billion stake in OpenAI [3].
Investment Implications: Balancing Innovation and Compliance
The AI industry’s future hinges on companies’ ability to align innovation with ethical and legal standards. For Meta, the path forward requires addressing Senate inquiries, mitigating reputational damage, and proving that its AI systems prioritize user safety over engagement metrics [4]. Competitors like Microsoft and Google may gain an edge by adopting transparent governance models and leveraging state-level regulatory trends to their advantage [1].
Conclusion
As AI ethics and legal risks dominate headlines, investors must scrutinize how companies navigate these challenges. Meta’s struggles highlight the perils of prioritizing growth over governance, while Microsoft’s stability underscores the value of a measured, enterprise-focused approach. For now, the AI landscape remains a high-stakes game of regulatory chess, where the winners will be those who balance innovation with accountability.
Source:
[1] Meta Platforms Inc.’s AI Policies Under Investigation and [https://www.mintz.com/insights-center/viewpoints/54731/2025-08-22-meta-platforms-incs-ai-policies-under-investigation-and]
[2] The AI Therapy Bubble: How Regulation and Reputational [https://www.ainvest.com/news/ai-therapy-bubble-regulation-reputational-risks-reshaping-mental-health-tech-market-2508/]
[3] Breaking down generative AI risks and mitigation options [https://www.wolterskluwer.com/en/expert-insights/breaking-down-generative-ai-risks-mitigation-options]
[4] Experts React to Reuters Reports on Meta’s AI Chatbot [https://techpolicy.press/experts-react-to-reuters-reports-on-metas-ai-chatbot-policies]
[5] AI Compliance: Meaning, Regulations, Challenges [https://www.scrut.io/post/ai-compliance]
[6] Meta’s AI Ambitions: Talent Volatility and Strategic Reorganization—A Double-Edged Sword for Investors [https://www.ainvest.com/news/meta-ai-ambitions-talent-volatility-strategic-reorganization-double-edged-sword-investors-2508/]
7 Life-Changing Books Recommended by Catriona Wallace
Some books ignite something immediate. Others change you quietly, over time. For Dr Catriona Wallace, tech entrepreneur, AI ethics advocate, and one of Australia’s most influential business leaders, books are more than just ideas on paper. They are frameworks, provocations, and spiritual companions. Her reading list offers guidance not just for navigating leadership and technology, but also for embracing identity, power, and inner purpose. These seven titles reflect a mind shaped by disruption, ethics, feminism, and wisdom. They are not trend-driven. They are transformational.
1. Lean In by Sheryl Sandberg
A landmark in feminist career literature, Lean In challenges women to pursue their ambitions while confronting the structural and cultural forces that hold them back. Sandberg uses her own journey at Facebook and Google to dissect gender inequality in leadership. The book is part memoir, part manifesto, and remains divisive for valid reasons. But Wallace cites it as essential for starting difficult conversations about workplace dynamics and ambition. It asks, simply: what would you do if you weren’t afraid?

2. Women and Power: A Manifesto by Mary Beard
In this sharp, incisive book, classicist Mary Beard examines the historical exclusion of women from power and public voice. From Medusa to misogynistic memes, Beard exposes how narratives built around silence and suppression persist today. The writing is fiery, brief, and packed with centuries of insight. Wallace recommends it for its ability to distil complex ideas into cultural clarity. It’s a reminder that power is not just a seat at the table; it is a script we are still rewriting.
3. The World of Numbers by Adam Spencer
A celebration of mathematics as storytelling, this book blends fun facts, puzzles, and history to reveal how numbers shape everything from music to human behaviour. Spencer, a comedian and maths lover, makes the subject inviting rather than intimidating. Wallace credits this book with sparking new curiosity about logic, data, and systems thinking. It’s not just for mathematicians. It’s for anyone ready to appreciate the beauty of patterns and the thinking habits that come with them.
4. Small Giants by Bo Burlingham
This book is a love letter to companies that chose to be great instead of big. Burlingham profiles fourteen businesses that opted for soul, purpose, and community over rapid growth. For Wallace, who has founded multiple mission-driven companies, this book affirms that success is not about scale. It is about integrity. Each story is a blueprint for building something meaningful, resilient, and values-aligned. It is a must-read for anyone tired of hustle culture and hungry for depth.
5. The Misogynist Factory by Alison Phipps
A searing academic work on the production of misogyny in modern institutions. Phipps connects the dots between sexual violence, neoliberalism, and resistance movements in a way that is as rigorous as it is radical. Wallace recommends this book for its clear-eyed confrontation of how systemic inequality persists beneath performative gestures. It equips readers with language to understand how power moves, morphs, and resists change. This is not light reading. It is necessary reading for anyone seeking to challenge structural harm.
6. Tribes by Seth Godin
Godin’s central idea is simple but powerful: people don’t follow brands, they follow leaders who connect with them emotionally and intellectually. This book blends marketing, leadership, and human psychology to show how movements begin. Wallace highlights ‘Tribes’ as essential reading for purpose-driven founders and changemakers. It reminds readers that real influence is built on trust and shared values. Whether you’re leading a company or a cause, it’s a call to speak boldly and build your own tribe.
7. The Tibetan Book of Living and Dying by Sogyal Rinpoche
Equal parts spiritual guide and philosophical reflection, this book weaves Tibetan Buddhist teachings with Western perspectives on mortality, grief, and rebirth. Wallace turns to it not only for personal growth but also for grounding ethical decision-making in a deeper sense of purpose. It’s a book that speaks to those navigating endings, whether personal, spiritual, or professional, and offers a path toward clarity and compassion. It does not offer answers. It offers presence, which is often far more powerful.

The books that shape us are often those that disrupt us first. Catriona Wallace’s list is not filled with comfort reads. It’s made of hard questions, structural truths, and radical shifts in thinking. From feminist manifestos to Buddhist reflections, from purpose-led business to systemic critique, this bookshelf is a mirror of her own leadership—decisive, curious, and grounded in values. If you’re building something bold or seeking language for change, there’s a good chance one of these books will meet you where you are and carry you further than you expected.
Hyderabad: Dr. Pritam Singh Foundation hosts AI and ethics round table at Tech Mahindra

The Dr. Pritam Singh Foundation and IILM University hosted a Round Table on “Human at Core: AI, Ethics, and the Future” in Hyderabad. Leaders and academics discussed leveraging AI for inclusive growth while maintaining ethics, inclusivity, and human-centric technology.
Published Date – 30 August 2025, 12:57 PM
Hyderabad: The Dr. Pritam Singh Foundation, in collaboration with IILM University, hosted a high-level Round Table Discussion on “Human at Core: AI, Ethics, and the Future” at Tech Mahindra, Cyberabad.
The event, held in memory of the late Dr. Pritam Singh, pioneering academic, visionary leader, and architect of transformative management education in India, brought together policymakers, business leaders, and academics to explore how India can harness artificial intelligence (AI) while safeguarding ethics, inclusivity, and human values.
In his keynote address, Padmanabhaiah Kantipudi, IAS (Retd.), Chairman of the Administrative Staff College of India (ASCI), paid tribute to Dr. Pritam Singh, describing him as a nation-builder who bridged academia, business, and governance.
The Round Table theme, Leadership: AI, Ethics, and the Future, underscored India’s opportunity to leverage AI for inclusive growth across healthcare, agriculture, education, and fintech—while ensuring technology remains human-centric and trustworthy.