{"id":324029,"date":"2026-04-24T21:32:15","date_gmt":"2026-04-24T16:02:15","guid":{"rendered":"https:\/\/ebiztoday.news\/?p=324029"},"modified":"2026-04-24T21:32:15","modified_gmt":"2026-04-24T16:02:15","slug":"teaching-ai-models-to-say-im-unsure-mit-news","status":"publish","type":"post","link":"https:\/\/ebiztoday.news\/index.php\/2026\/04\/24\/teaching-ai-models-to-say-im-unsure-mit-news\/","title":{"rendered":"Teaching AI models to say \u201cI\u2019m unsure\u201d | MIT News"},"content":{"rendered":"<div>\n<p dir=\"ltr\" id=\"docs-internal-guid-57729c6d-7fff-dea4-bd4a-1d5b0ebbff74\">Confidence is persuasive. In artificial intelligence systems, it is usually misleading.<\/p>\n<p dir=\"ltr\">Today&#8217;s most capable reasoning models share a trait with the loudest voice within the room: They deliver every answer with the identical unshakable certainty, whether or not they&#8217;re right or guessing. Researchers at MIT&#8217;s Computer Science and Artificial Intelligence Laboratory (CSAIL) have now traced that overconfidence to a particular flaw in how these models are trained, and developed a way that fixes it without giving up any accuracy.<\/p>\n<p dir=\"ltr\">The technique, called RLCR (Reinforcement Learning with Calibration Rewards), trains language models to provide calibrated confidence estimates alongside their answers. Along with coming up with a solution, the model thinks about its uncertainty in that answer, and outputs a confidence rating. In experiments across multiple benchmarks, RLCR reduced calibration error by as much as 90 percent while maintaining or improving accuracy, each on the tasks the model was trained on and on entirely recent ones it had never seen. The work will probably be presented on the International Conference on Learning Representations later this month.<\/p>\n<p dir=\"ltr\">The issue traces to a surprisingly easy source. The reinforcement learning (RL) methods behind recent breakthroughs in AI reasoning, including the training approach utilized in systems like OpenAI&#8217;s o1, reward models for getting the proper answer, and penalize them for getting it fallacious. Nothing in between. A model that arrives at the right answer through careful reasoning receives the identical reward as one which guesses accurately by probability. Over time, this trains models to confidently answer every query they&#8217;re asked, whether or not they have strong evidence or are effectively flipping a coin.<\/p>\n<p dir=\"ltr\">That overconfidence has consequences. When models are deployed in medicine, law, finance, or any setting where users make decisions based on AI outputs, a system that expresses high confidence no matter its actual certainty becomes unreliable in ways which might be difficult to detect from the skin. 
A model that claims &#8220;I&#8217;m 95 percent sure&#8221; when it is correct only half the time is more dangerous than one which simply gets the reply fallacious, because users haven&#8217;t any signal to hunt a second opinion.<\/p>\n<p dir=\"ltr\">&#8220;The usual training approach is straightforward and powerful, however it gives the model no incentive to specific uncertainty or say\u00a0I don\u2019t know,&#8221; says Mehul Damani, an MIT PhD student and co-lead writer on the\u00a0<a href=\"https:\/\/arxiv.org\/abs\/2507.16806\">paper.<\/a> &#8220;So the model naturally learns to guess when it&#8217;s unsure.&#8221;\u00a0<\/p>\n<p dir=\"ltr\">RLCR addresses this by adding a single term to the reward function: a Brier rating, a well-established measure that penalizes the gap between a model&#8217;s stated confidence and its actual accuracy. During training, models learn to reason about each the issue and their very own uncertainty, producing a solution and a confidence estimate together. Confidently fallacious answers are penalized. So are unnecessarily uncertain correct ones.<\/p>\n<p dir=\"ltr\">The mathematics backs it up: the team proved formally that any such reward structure guarantees models which might be each accurate and well-calibrated. They then tested the approach on a 7-billion-parameter model across a variety of question-answering and math benchmarks, including six datasets the model had never been trained on.<\/p>\n<p dir=\"ltr\">The outcomes showed a consistent pattern. Standard RL training actively degraded calibration in comparison with the bottom model, making models worse at estimating their very own uncertainty. RLCR reversed that effect, substantially improving calibration with no loss in accuracy. The tactic also outperformed post-hoc approaches, by which a separate classifier is trained to assign confidence scores after the actual fact. &#8220;What\u2019s striking is that abnormal RL training doesn&#8217;t just fail to assist calibration. It actively hurts it,&#8221; says Isha Puri, an MIT PhD student and co-lead writer. &#8220;The models turn into more capable and more overconfident at the identical time.&#8221;<\/p>\n<p dir=\"ltr\">The team also demonstrated that the arrogance estimates produced by RLCR are practically useful at inference time. When models generate multiple candidate answers, choosing the one with the best self-reported confidence, or weighting votes by confidence in a majority-voting scheme, improves each accuracy and calibration as compute scales.<\/p>\n<p dir=\"ltr\">A further finding suggests that the act of reasoning about uncertainty itself has value. The researchers trained classifiers on model outputs and located that including the model&#8217;s explicit uncertainty reasoning within the input improved the classifier&#8217;s performance, particularly for smaller models. The model&#8217;s self-reflective reasoning about what it does and doesn\u2019t know incorporates real information, not only decoration.<\/p>\n<p dir=\"ltr\">Along with Damani and Puri, other authors on the paper are Stewart Slocum, Idan Shenfeld, Leshem Choshen, and senior authors Jacob Andreas and Yoon Kim.<\/p>\n<\/p><\/div>\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Confidence is persuasive. In artificial intelligence systems, it is usually misleading. 
Today&#8217;s most capable reasoning models share a trait with the loudest voice within the room: They deliver every answer with the identical unshakable certainty, whether or not they&#8217;re right or guessing. Researchers at MIT&#8217;s Computer Science and Artificial Intelligence Laboratory (CSAIL) have now traced [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":324030,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[182,356,395,4554],"class_list":["post-324029","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","tag-mit","tag-models","tag-news","tag-teaching"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/posts\/324029","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/comments?post=324029"}],"version-history":[{"count":2,"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/posts\/324029\/revisions"}],"predecessor-version":[{"id":324032,"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/posts\/324029\/revisions\/324032"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/media\/324030"}],"wp:attachment":[{"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/media?parent=324029"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/categories?post=324029"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ebiztoday.news\/index.php\/wp-json\/wp\/v2\/tags?post=324029"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}