🤖 AI Summary
This monograph addresses the estimation and optimization of directed information (DI) and the computation of feedback capacity for finite-state channels with feedback and memory. To overcome the limitations of conventional approaches in large-alphabet settings, DI maximization is formulated as a Markov decision process, and Q-graph representations are integrated with reinforcement learning techniques to optimize the causally conditioned input distribution that determines the DI rate. By leveraging plug-in estimators, neural-network-based estimators, and value iteration algorithms, the presented framework enables either exact computation or close estimation of the feedback capacity of strongly connected unifilar finite-state channels, extending capacity computation and causal inference to channel models with large alphabets and complex memory structure.
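For reference, the directed information from an input sequence $X^n$ to an output sequence $Y^n$, in Massey's sense, is

$$
I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i \mid Y^{i-1}),
$$

which replaces the full input sequence $X^n$ in the chain-rule expansion of the mutual information, $I(X^n; Y^n) = \sum_{i=1}^{n} I(X^n; Y_i \mid Y^{i-1})$, with its causal prefix $X^i$.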
📝 Abstract
Directed information (DI) is an information measure that attempts to capture the directionality of information flow from one random process to another. It is closely related to other causal-influence measures, such as transfer entropy, Granger causality, and Pearl's causal framework. This monograph provides an overview of DI and its main application in information theory, namely, characterizing the capacity of channels with feedback and memory. We begin by reviewing the definitions of DI, its basic properties, and its relation to Shannon's mutual information. Next, we provide a survey of DI estimation techniques, ranging from classic plug-in estimators to modern neural-network-based estimators. Turning to the application of channel capacity estimation, we describe how such estimators can be used to numerically optimize the DI rate over a class of joint distributions on the input and output processes. A significant part of the monograph is devoted to techniques for computing the feedback capacity of finite-state channels (FSCs). The feedback capacity of a strongly connected FSC is the maximum of the DI rate from the channel input process to the output process, where the maximization is performed over the class of causally conditioned input distributions. When the FSC is also unifilar, i.e., the next state is a time-invariant function of the current state and the new input-output symbol pair, the feedback capacity equals the optimal average reward of an appropriately formulated Markov decision process (MDP). This MDP formulation has been exploited to develop several methods that compute exactly, or at least closely estimate, the feedback capacity of a unifilar FSC. This monograph describes these methods, starting with the value iteration algorithm and proceeding to Q-graph methods and reinforcement learning algorithms that can handle large input and output alphabets.
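To make the plug-in estimation step concrete, here is a minimal sketch under a finite-memory assumption: empirical conditional entropies with a truncated context of length $k$ stand in for the true ones, giving $\hat{H}(Y_i \mid Y_{i-k}^{i-1}) - \hat{H}(Y_i \mid Y_{i-k}^{i-1}, X_{i-k}^{i})$ as an estimate of the DI rate. The function names and the memory truncation are illustrative choices, not the monograph's specific implementation.

```python
import numpy as np
from collections import Counter

def cond_entropy(pairs):
    """Empirical conditional entropy H(S | C) in bits from (context, symbol) pairs."""
    n = len(pairs)
    joint = Counter(pairs)                 # counts of (context, symbol)
    marg = Counter(c for c, _ in pairs)    # counts of context alone
    return -sum(cnt / n * np.log2(cnt / marg[c]) for (c, _), cnt in joint.items())

def plugin_di_rate(x, y, k=1):
    """Plug-in estimate of the DI rate I(X -> Y), truncating memory to k symbols:
    H(Y_i | Y_{i-k}^{i-1}) - H(Y_i | Y_{i-k}^{i-1}, X_{i-k}^{i})."""
    ctx_y = [(tuple(y[i - k:i]), y[i]) for i in range(k, len(y))]
    ctx_xy = [((tuple(y[i - k:i]), tuple(x[i - k:i + 1])), y[i]) for i in range(k, len(y))]
    return cond_entropy(ctx_y) - cond_entropy(ctx_xy)

# Sanity check on a memoryless binary symmetric channel with crossover 0.1:
# for a uniform i.i.d. input, the DI rate equals 1 - h2(0.1), about 0.531 bits.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, 200_000)
y = x ^ (rng.random(x.size) < 0.1)
print(plugin_di_rate(x.tolist(), y.tolist(), k=1))  # ~ 0.53
```

Neural estimators, as surveyed in the monograph, replace these empirical counts with learned models, which is what makes large alphabets and long memory tractable.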
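The feedback-capacity maximization described above, written in the standard causal-conditioning notation, reads

$$
C_{\mathrm{FB}} = \lim_{n \to \infty} \frac{1}{n} \max_{P(x^n \| y^{n-1})} I(X^n \to Y^n),
\qquad
P(x^n \| y^{n-1}) = \prod_{i=1}^{n} P(x_i \mid x^{i-1}, y^{i-1}),
$$

so the optimization variable is exactly the causally conditioned input distribution, i.e., the (possibly randomized) feedback encoding strategy.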
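Finally, a sketch of the MDP machinery. Relative value iteration computes the optimal average reward of a finite MDP; in the unifilar-FSC formulation the MDP state is a belief over the channel state (quantized here to stay finite), the action indexes an input distribution, and the reward is a conditional-entropy term. The transition kernel `P` and reward `r` below are small hypothetical arrays used purely to illustrate the iteration, not a channel model.

```python
import numpy as np

n_states, n_actions = 4, 3
rng = np.random.default_rng(0)

# Hypothetical MDP: P[a, s, s'] transition probabilities, r[a, s] one-step rewards.
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
r = rng.random((n_actions, n_states))

def relative_value_iteration(P, r, tol=1e-9, max_iter=100_000):
    """Optimal average reward (gain) and a greedy policy via relative value iteration."""
    h = np.zeros(P.shape[1])              # relative value function
    for _ in range(max_iter):
        Q = r + P @ h                     # Bellman backup: Q[a, s]
        h_new = Q.max(axis=0)
        diff = h_new - h                  # the gain lies in [diff.min(), diff.max()];
        hi, lo = diff.max(), diff.min()   # a vanishing span certifies convergence
        h = h_new - h_new[0]              # normalize so h stays bounded
        if hi - lo < tol:
            break
    return 0.5 * (hi + lo), Q.argmax(axis=0)

gain, policy = relative_value_iteration(P, r)
print(f"optimal average reward ~ {gain:.6f}, greedy policy = {policy}")
```

In the actual capacity computation the belief state is continuous, which is precisely why the Q-graph discretization and the reinforcement learning methods described in the monograph become necessary once the input and output alphabets grow.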