
On a differentiable partial information decomposition for continuous random variables and applications in (artificial) neural networks

Kyle Poland, Anja Sturm, Aaron Gutknecht, Patricia Wollstadt, Michael Wibral, Abdullah Makkeh, "On a differentiable partial information decomposition for continuous random variables and applications in (artificial) neural networks", Bernstein Conference 2021, 2021.

Abstract

Understanding information mechanisms inside complex systems often poses intricate questions. In neural systems, information is often represented by an ensemble of agents. Knowledge of how information is distributed amongst these agents can lead to insights about how relevant information about a problem is spread over the available agents. These agents can, for instance, be neurons recorded during stimulation; one may imagine spike trains related to behavior in a classical center-out task. The question of what information arises when the composition of agents is varied is answered by partial information decomposition (PID), which models the agents as so-called source and target random variables. The PID framework decomposes the multivariate mutual information into information contributions such as shared, unique, and synergistic information. These categories represent distinct ways in which a collection of source variables can contribute information about a specific target variable, and can vastly enhance the understanding of neural systems. However, despite the conceptual generality of the framework, concrete proposals for PID quantities have so far mostly been defined for systems of purely discrete variables. While a quantification of PID in continuous settings for two or three source variables has recently been introduced, no ansatz has yet managed to cover more than three variables while at the same time allowing for general measure-theoretic variables, such as mixed discrete-continuous or purely continuous variables. In this work, we propose such an information quantity that is well-defined for any finite number and type of source and target variables. The proposed quantity is closely related to a recently developed local shared information quantity for discrete variables based on the idea of shared exclusions. Further, we prove that this new measure fulfills several desirable properties that are crucial for applicability in settings of interest to neuroscientists and physicists. We demonstrate that our measure (i) satisfies a set of PID axioms, (ii) is invariant under invertible transformations, ensuring independence of the experimental setup and of the measurement units of neural recordings, (iii) is differentiable, enabling gradient-descent methods for learning in neural networks, and (iv) admits a target chain rule, allowing multiple target neurons to be treated simultaneously, e.g., when investigating their cross-dependence on the neurons' past.
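To make the decomposition referred to above concrete, the following sketch gives the standard consistency equations for the two-source case (in the sense of Williams and Beer). The notation SI, UI, and CI for shared, unique, and synergistic (complementary) information is assumed for illustration and is not taken from the abstract; the authors' new measure for continuous variables is not reproduced here.

```latex
% Two-source PID consistency equations (illustrative sketch; notation assumed).
% The joint mutual information about the target T splits into four parts:
\begin{align}
  I(T; S_1, S_2) &= SI(T; S_1, S_2) + UI(T; S_1 \setminus S_2)
                    + UI(T; S_2 \setminus S_1) + CI(T; S_1, S_2), \\
  I(T; S_1)      &= SI(T; S_1, S_2) + UI(T; S_1 \setminus S_2), \\
  I(T; S_2)      &= SI(T; S_1, S_2) + UI(T; S_2 \setminus S_1).
\end{align}
```

Any PID measure for two sources, including one defined for continuous variables, is expected to satisfy these three constraints; a definition of the shared term SI then fixes the remaining three terms.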


