{"id":2473,"date":"2019-01-08T17:52:38","date_gmt":"2019-01-08T08:52:38","guid":{"rendered":"http:\/\/163.180.4.222\/lab\/?p=2473"},"modified":"2019-01-08T17:52:38","modified_gmt":"2019-01-08T08:52:38","slug":"unprovability-comes-to-machine-learning","status":"publish","type":"post","link":"https:\/\/biochemistry.khu.ac.kr\/lab\/?p=2473","title":{"rendered":"Unprovability comes to machine learning"},"content":{"rendered":"<h4>Scenarios have been discovered in which it is impossible to prove whether or not a machine-learning algorithm could solve a particular problem. This finding might have implications for both established and future learning algorithms.<\/h4>\n<div class=\"article__body serif cleared\">\n<p>During the twentieth century, discoveries in mathematical logic revolutionized our understanding of the very foundations of mathematics. In 1931, the logician Kurt G\u00f6del showed that, in any system of axioms that is expressive enough to model arithmetic, some true statements will be unprovable<sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR1\">1<\/a><\/sup>. 
And in the following decades, it was demonstrated that the continuum hypothesis \u2014 which states that no set of distinct objects has a size larger than that of the integers but smaller than that of the real numbers \u2014 can be neither proved nor refuted using the standard axioms of mathematics<sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR2\">2<\/a><\/sup><sup>\u2013<\/sup><sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR4\">4<\/a><\/sup>. Writing in\u00a0<i>Nature Machine Intelligence<\/i>,\u00a0<a href=\"https:\/\/doi.org\/10.1038\/s42256-018-0004-1\" data-track=\"click\" data-label=\"https:\/\/doi.org\/10.1038\/s42256-018-0004-1\" data-track-category=\"body text link\">Ben-David\u00a0<i>et al.<\/i><\/a><sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR5\">5<\/a><\/sup>\u00a0show that the field of machine learning, although seemingly distant from mathematical logic, shares this limitation. They identify a machine-learning problem whose fate depends on the continuum hypothesis, leaving its resolution forever beyond reach.<\/p>\n<p>Machine learning is concerned with the design and analysis of algorithms that can learn and improve their performance as they are exposed to data. 
The power of this idea is illustrated by the following example: although it seems hopelessly difficult to explicitly program a computer to determine what objects are in a picture, the Viola\u2013Jones machine-learning system can detect human faces in real time after being trained on a labelled sample of photographs<sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR6\">6<\/a><\/sup>. Today, we regularly interact with machine-learning algorithms, from virtual assistants on our phones to spam filters for our e-mail. But these modern real-world applications trace their origins to a subfield of machine learning that is concerned with the careful formalization and mathematical analysis of various machine-learning settings.<\/p>\n<p>The goal of learning a predictor (a mathematical function that can be used to make predictions) from a database of random examples was formalized in the aptly named probably approximately correct (PAC) learning model<sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR7\">7<\/a><\/sup>. In this model, the aim is to train the predictor to match some true function that labels the data. A different model, called online learning, has the learner making immediate predictions as data arrive \u2014 for example, capturing a trading system\u2019s task of executing transactions in an ever-changing market. And another model known as multi-armed bandits can simulate clinical trials, in which the medical outcomes that an experimenter observes depend on his or her own choices.<\/p>\n<p>These are only a few examples of the many models used in machine learning. 
In each case, the basic goal is to perform as well, or nearly as well, as the best predictor in a family of functions, such as neural networks or decision trees. For a given model and function family, if this goal can be achieved under some reasonable constraints, the family is said to be learnable in the model.<\/p>\n<p>Machine-learning theorists are typically able to transform questions about the learnability of a particular function family into problems that involve analysing various notions of dimension that measure some aspect of the family\u2019s complexity. For example, the appropriate notion for analysing PAC learning is known as the Vapnik\u2013Chervonenkis (VC) dimension<sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR8\">8<\/a><\/sup>, and, in general, results relating learnability to complexity are sometimes referred to as Occam\u2019s-razor theorems<sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR9\">9<\/a><\/sup>. These notions of dimension happen to be simple enough to leave no room for the spectre of unprovability to manifest itself. But Ben-David and colleagues show that machine learning cannot always escape this fate. They introduce a learning model called estimating the maximum (EMX), and go on to discover a family of functions whose learnability in EMX is unprovable in standard mathematics.<\/p>\n<p>Ben-David\u00a0<i>et al.<\/i>\u00a0describe an example EMX problem: targeting advertisements at the most frequent visitors to a website when it is not known in advance which visitors will visit the site. 
The authors formalize EMX as a question about a learner\u2019s ability to find a function, from a given family, whose expected value over a target distribution is as large as possible. EMX is actually quite similar to the PAC model, but the slightly different learning criterion surprisingly connects it to the continuum hypothesis and brings unprovability into the picture.<\/p>\n<p>The authors\u2019 proof involves a beautiful connection between machine learning and data compression that was first observed<sup><a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29#ref-CR10\">10<\/a><\/sup>\u00a0in the 1980s. The intuition is that, if a training sample labelled by a function from some family can always be compressed, the family must in some sense have low complexity, and therefore be learnable. Moreover, certain learning algorithms can be used to compress data. The authors introduce monotone compression \u2014 a variant of compression that they show to be appropriate for characterizing the learnability of particular function families in EMX.<\/p>\n<p>Ben-David and colleagues then prove that the ability to carry out a weak form of monotone compression is related to the size of certain infinite sets. The set that the authors ultimately use in their work is the unit interval, which is the set of real numbers between 0 and 1. Their results imply that the finite subsets of the unit interval have monotone-compression schemes, and therefore are learnable in EMX, if and only if the continuum hypothesis is true, which is known to be unprovable.<\/p>\n<p>Because EMX is a new model in machine learning, we do not yet know its usefulness for developing real-world algorithms. So these results might not turn out to have practical importance. But we do now know that we should be careful when introducing new models of learning. 
Moreover, we might need to look again at the many subtleties that can come up, even in established learning models.<\/p>\n<p>Machine learning has matured as a mathematical discipline and now joins the many subfields of mathematics that deal with the burden of unprovability and the unease that comes with it. Perhaps results such as this one will bring to the field of machine learning a healthy dose of humility, even as machine-learning algorithms continue to revolutionize the world around us.<\/p>\n<\/div>\n<p><span class=\"emphasis\">Nature<\/span>\u00a0<strong>565<\/strong>, 166-167 (2019)<\/p>\n<p>(Original article: click <a href=\"https:\/\/www.nature.com\/articles\/d41586-019-00012-4?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+nature%2Frss%2Fcurrent+%28Nature+-+Issue%29\">here<\/a>)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Scenarios have been discovered in which it is impossible to prove whether or not a machine-learning algorithm could solve a particular problem. 
This<a href=\"https:\/\/biochemistry.khu.ac.kr\/lab\/?p=2473\" class=\"more-link\">(more&#8230;)<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[32,35,29,30],"tags":[],"class_list":["post-2473","post","type-post","status-publish","format-standard","hentry","category-essays-on-science","category-lets-do-computer-science","category-lets-do-science","category-recent-science-news"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack-related-posts":[],"jetpack_sharing_enabled":false,"jetpack_shortlink":"https:\/\/wp.me\/p9Xo1j-DT","_links":{"self":[{"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=\/wp\/v2\/posts\/2473","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2473"}],"version-history":[{"count":1,"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=\/wp\/v2\/posts\/2473\/revisions"}],"predecessor-version":[{"id":2474,"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=\/wp\/v2\/posts\/2473\/revisions\/2474"}],"wp:attachment":[{"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2473"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2473"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/biochemistry.khu.ac.kr\/lab\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2473"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}