Mathematical Events in Turkey
29 April 2022, 18:00 | Bilkent University Analysis Seminars | Ali Devran Kara

The talk focuses on partially observed Markov decision processes (POMDPs). In POMDPs, the existence of optimal policies has in general been established by converting the original partially observed stochastic control problem into a fully observed one on the belief space, leading to a belief-MDP. However, computing an optimal policy for this fully observed model with classical methods is challenging even if the original system has finite state and action spaces, since the state space of the belief-MDP is always uncountable. We provide an approximation technique for POMDPs that uses a finite window of past information variables. We establish near optimality of finite window control policies in POMDPs under filter stability conditions and the assumption that the measurement and action sets are finite (and the state space is real vector valued). We also establish a rate of convergence result relating the finite window memory size to the approximation error bound; the rate of convergence is exponential under the filter stability conditions, where filter stability refers to the correction of an incorrectly initialized filter for a partially observed stochastic dynamical system (controlled or control-free) as measurements accumulate. Finally, we establish the convergence of the associated Q-learning algorithm for control policies that use such a finite history of past observations and control actions (by viewing the finite window as a 'state'), and we show near optimality of the resulting limit Q functions under the filter stability condition. While many experimental results exist for POMDPs, (i) near optimality with an explicit rate of convergence (in the memory size) and its relation to filter stability, and (ii) asymptotic convergence (to the approximate MDP value function) of such finite-memory Q-learning algorithms are, to our knowledge, new to the literature. Joint work with Serdar Yuksel (Queen's University).

NOTE: To request the event link, please send a message to goncha@fen.bilkent.edu.tr

Optimal Control | English | Zoom | botan | 26.04.2022
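As a rough illustration of the finite-memory idea described in the abstract, the sketch below runs standard Q-learning with the tuple of the last few observation-action pairs playing the role of the state; this is what "viewing the finite window as a 'state'" amounts to computationally. It is only a minimal sketch, not the algorithm analyzed in the talk: the environment interface (env.reset(), env.step()), the action set, and all hyperparameter values are hypothetical placeholders.

# Illustrative sketch (not the speakers' implementation): Q-learning over a
# finite window of past observations and actions. The environment interface
# and all parameter values below are hypothetical placeholders.
import random
from collections import defaultdict, deque

def finite_window_q_learning(env, actions, window=3, episodes=5000,
                             alpha=0.1, gamma=0.95, epsilon=0.1):
    """Q-learning where the 'state' is the tuple of the last `window`
    (observation, action) pairs, standing in for the unobserved state."""
    Q = defaultdict(float)  # maps (window_tuple, action) -> value estimate

    for _ in range(episodes):
        obs = env.reset()               # hypothetical: returns an observation
        hist = deque([(obs, None)], maxlen=window)  # finite memory window
        done = False
        while not done:
            key = tuple(hist)
            # epsilon-greedy action selection over the window-state
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda u: Q[(key, u)])
            next_obs, reward, done = env.step(a)    # hypothetical interface
            next_hist = deque(hist, maxlen=window)
            next_hist.append((next_obs, a))
            next_key = tuple(next_hist)
            # standard Q-learning update, with window tuples as states
            best_next = max(Q[(next_key, u)] for u in actions)
            Q[(key, a)] += alpha * (reward + gamma * best_next - Q[(key, a)])
            hist = next_hist
    return Q

The dictionary keyed by window tuples keeps the example self-contained; the talk's results concern when and how fast such finite-window policies and their limit Q functions approach optimality under filter stability, which this sketch does not attempt to demonstrate.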
You can easily announce, free of charge on turkmath.org, the events your academic unit or working group is holding in Turkey, as well as scholarships, prizes, and academic job openings you wish to advertise, or mathematicians you are hosting, with a simple data entry. Do not hesitate to get in touch to obtain the information needed to log in to the system, or to share your comments and suggestions. Click here for the list of contributors.
Özkan Değer ozkandeger@gmail.com
Organizing Committee of the 31st Journees Arithmetiques Conference
To help cover the website's expenses and keep it in service, please get in touch if you would like to make a donation, become a sponsor, or place an advertisement.