The Mathematical Foundations of Artificial Intelligence

Gani Çalışkan
Turk Telekom Bulut Teknolojileri
5 min read · May 5, 2022

Different definitions of AI

In some CS courses, artificial intelligence (AI), sometimes called machine intelligence, is defined as intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans. Some AI textbooks define the field as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. Colloquially, the term “artificial intelligence” is often used to describe machines (or computers) that mimic “cognitive” functions that humans associate with the human mind, such as “learning” and “problem solving”.

Russell, Stuart J.; Norvig, Peter (2009). Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River, New Jersey: Prentice Hall. ISBN 978-0-13-604259-4.

Functional Analysis

  • Establishes the domain of the model or “hypothesis”
  • Defines operations within the domain and transformations into adjacent domains
  • Provides measures of completeness: orthonormal function sets, vector projection
  • Simplifies to more tractable implementations: linear algebra, matrix arithmetic, Fourier series
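As a small illustration of completeness via orthonormal sets, a vector can be projected onto an orthonormal basis and reconstructed exactly from its projection coefficients (a minimal sketch in R², with an arbitrarily chosen rotated basis):

```python
import numpy as np

# An orthonormal basis of R^2 (a 45-degree rotation of the standard basis).
e1 = np.array([1.0, 1.0]) / np.sqrt(2)
e2 = np.array([1.0, -1.0]) / np.sqrt(2)

v = np.array([3.0, 4.0])

# Projection coefficients: c_i = <v, e_i>
c1, c2 = v @ e1, v @ e2

# Completeness: an orthonormal basis reconstructs v exactly.
reconstruction = c1 * e1 + c2 * e2
print(reconstruction)  # [3. 4.]
```

The same projection idea underlies Fourier series, where the basis functions are sines and cosines instead of vectors.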

Numerical Methods

Solutions to multivariate classification problems often require optimization routines:

  • Establishment of cost and gradient functions
  • Numerical search strategies
  • Linearization or determinization of stochastic processes
  • Application of heuristics and ontologies
  • Numerical integration and differentiation, required for ill-defined data or “complicated” regions
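A minimal sketch of two of these ideas together: a central-difference numerical derivative driving a gradient search on a hypothetical one-dimensional cost function (the cost, step size, and learning rate below are illustrative assumptions):

```python
# Hypothetical cost function: f(w) = (w - 3)^2, minimum at w = 3.
def cost(w):
    return (w - 3.0) ** 2

# Central-difference numerical gradient, useful when the analytic
# derivative is unavailable or the data are ill-defined.
def num_grad(f, w, h=1e-5):
    return (f(w + h) - f(w - h)) / (2 * h)

# Simple gradient-descent search strategy.
w = 0.0
for _ in range(200):
    w -= 0.1 * num_grad(cost, w)

print(round(w, 4))  # converges toward 3.0
```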

Probability Theory

  • Establishes performance bounds upon stochastic classifiers: Bayesian networks, Particle Filters, Markov Chains, Maximum Likelihood, Parameter Estimation, Statistical Analysis of Physical Parameters
  • Accommodates stochastic processes and multivariate data — employing measures such as the Mahalanobis distance and the Bregman divergence
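As a sketch, the Mahalanobis distance rescales a point’s offset from the mean by the covariance of the data, so the same Euclidean offset counts for less along a high-variance direction (the mean and covariance below are toy illustrative values):

```python
import numpy as np

# Mahalanobis distance of a point x from a distribution with mean mu
# and covariance S: d = sqrt((x - mu)^T S^{-1} (x - mu)).
def mahalanobis(x, mu, cov):
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

mu = np.array([0.0, 0.0])
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])  # larger variance along the first axis

# The same Euclidean offset is "closer" along the high-variance axis.
print(mahalanobis(np.array([2.0, 0.0]), mu, cov))  # 1.0
print(mahalanobis(np.array([0.0, 2.0]), mu, cov))  # 2.0
```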

Convolutional neural networks are a specialized type of artificial neural networks that use a mathematical operation called convolution in place of general matrix multiplication in at least one of their layers.[13] They are specifically designed to process pixel data and are used in image recognition and processing.
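A minimal sketch of the convolution operation over pixel data (implemented as cross-correlation, as most deep-learning libraries do; the image and kernel are arbitrary toy values):

```python
import numpy as np

# Minimal 2-D "valid" convolution: slide the kernel over the image
# and take the elementwise product-sum at each position.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])  # simple diagonal-difference filter

print(conv2d(image, kernel))
```

In a CNN layer this product-sum replaces the general matrix multiplication of a fully connected layer, with the kernel weights shared across all positions.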

Ian Goodfellow and Yoshua Bengio and Aaron Courville (2016). Deep Learning. MIT Press. p. 326.

Questions

  • Is the image sufficiently sampled to capture “high frequency” effects (Nyquist criterion)?
  • Does the discretization of the convolution function compromise the output?
  • How much data is lost when using max-pool compression?
  • Is the fidelity of the training data sufficient?
  • Would alternate approaches (DCT, for example) provide sufficient compression and maintain fidelity?
  • What would be the difference in compute resource requirements?
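The max-pool question above can be made concrete: a 2×2 max pool keeps only the largest value in each block and discards the other three (a toy sketch with arbitrary values):

```python
import numpy as np

# 2x2 max pooling: reshape into 2x2 blocks and take the max of each.
def max_pool_2x2(x):
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1.0, 2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0, 8.0],
              [9.0, 1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0, 7.0]])

pooled = max_pool_2x2(x)
print(pooled)                     # [[6. 8.] [9. 7.]]
print(1 - pooled.size / x.size)   # 0.75 of the values are discarded
```

Whether that 75% of discarded values carried useful signal is exactly the fidelity question posed above.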

NNs and Numerical Methods

A neural network is a network or circuit of biological neurons, or, in a modern sense, an artificial neural network, composed of artificial neurons or nodes.[1] Thus, a neural network is either a biological neural network, made up of biological neurons, or an artificial neural network, used for solving artificial intelligence (AI) problems. The connections of the biological neuron are modeled in artificial neural networks as weights between nodes. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed. This activity is referred to as a linear combination. Finally, an activation function controls the amplitude of the output. For example, an acceptable range of output is usually between 0 and 1, or it could be −1 and 1.
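The weighted sum and activation described above can be sketched as a single artificial neuron (the inputs, weights, and bias below are arbitrary illustrative values; a sigmoid is assumed as the activation):

```python
import math

# A single artificial neuron: a weighted sum of the inputs (a linear
# combination) followed by a sigmoid activation that bounds the
# output to (0, 1).
def neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation

# Positive weights are excitatory, negative weights inhibitory.
out = neuron(inputs=[1.0, 0.5], weights=[2.0, -1.0], bias=0.0)
print(round(out, 3))  # 0.818
```

Replacing the sigmoid with tanh would bound the output to (−1, 1) instead, matching the second range mentioned above.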

Hopfield, J. J. (1982). “Neural networks and physical systems with emergent collective computational abilities”. Proc. Natl. Acad. Sci. U.S.A. 79 (8): 2554–2558. Bibcode:1982PNAS…79.2554H. doi:10.1073/pnas.79.8.2554. PMC 346238. PMID 6953413.

Forward Propagation

Formula of forward propagation
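A minimal sketch of forward propagation, layer by layer: each layer applies a linear combination followed by an activation, a(l) = σ(W(l) a(l−1) + b(l)). The weights below are random illustrative values and a sigmoid activation is assumed:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward propagation: each dense layer computes
# a_out = activation(W @ a_in + b), fed into the next layer.
def forward(a, layers):
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((3, 2)), np.zeros(3)),   # 2 -> 3 units
          (rng.standard_normal((1, 3)), np.zeros(1))]   # 3 -> 1 unit

output = forward(np.array([0.5, -0.2]), layers)
print(output.shape)  # (1,)
```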

Back propagation: minimize the cost with respect to θ using the gradient derivatives

Searching for the minima

  • Classic optimization theory
  • Conjugate gradient
  • Simplex
  • Direct search
  • Stochastic Gradient
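The last strategy in the list, stochastic gradient descent, can be sketched on a toy least-squares fit: each step uses the gradient from a single random sample rather than the whole dataset (the data, learning rate, and iteration count are illustrative assumptions):

```python
import random

# Toy dataset generated by y = 2*x, so the true parameter is w = 2.
data = [(x, 2.0 * x) for x in range(1, 11)]

random.seed(0)
w = 0.0
lr = 0.001
for _ in range(2000):
    x, y = random.choice(data)          # one random sample per step
    grad = 2 * (w * x - y) * x          # d/dw of the sample loss (w*x - y)^2
    w -= lr * grad

print(round(w, 2))  # 2.0
```

The per-sample noise in the updates is also what produces the oscillatory behavior listed under the challenges below.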

Challenges

  • Well-behaved cost surfaces and locating global minima
  • Oscillatory behavior
  • Regularization
  • Convergence Rate

BBNs and Probability Theory

Bayesian belief networks (also known as belief networks, causal probabilistic networks, causal nets, graphical probability networks, probabilistic cause–effect models, and probabilistic influence diagrams) provide decision-support for a wide range of problems involving uncertainty and probabilistic reasoning. The underlying theory of BBNs is Bayesian probability theory and the notion of propagation.
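A minimal sketch of propagation on a two-node belief network, Rain → WetGrass (the structure and all probabilities below are made-up illustrative values):

```python
# Prior and conditional probabilities for the hypothetical network.
p_rain = 0.2
p_wet_given_rain = 0.9
p_wet_given_dry = 0.1

# Propagation: marginal probability of wet grass.
p_wet = p_wet_given_rain * p_rain + p_wet_given_dry * (1 - p_rain)

# Bayes' rule: updated belief about Rain after observing wet grass.
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet

print(round(p_wet, 2))             # 0.26
print(round(p_rain_given_wet, 3))  # 0.692
```

Real BBN libraries propagate such updates through arbitrarily large graphs, but the mechanism at each node is this same Bayesian update.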

An example of this theory is shown below.

Naïve Bayes Probability Condition
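The naïve Bayes condition — features assumed conditionally independent given the class, so P(y | x₁…xₙ) ∝ P(y) · Π P(xᵢ | y) — can be sketched with toy spam/ham numbers (all probabilities below are illustrative assumptions):

```python
# Toy priors and per-word likelihoods (purely illustrative numbers).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"offer": 0.7, "meeting": 0.1},
    "ham":  {"offer": 0.2, "meeting": 0.6},
}

# Naive Bayes: multiply the prior by each word likelihood, treating
# the words as conditionally independent given the class.
def posterior(words, cls):
    p = priors[cls]
    for w in words:
        p *= likelihoods[cls][w]
    return p

words = ["offer"]
scores = {c: posterior(words, c) for c in priors}
total = sum(scores.values())
print({c: round(p / total, 2) for c, p in scores.items()})  # {'spam': 0.7, 'ham': 0.3}
```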

Vapnik’s Learning Model

  • A generator of random vectors x ∈ ℝⁿ, drawn independently from a fixed but unknown probability distribution function F(x)
  • A supervisor who returns an output value y for every input vector x according to a conditional distribution function F(y|x), also fixed but unknown
  • A learning machine capable of implementing a set of functions f(x, α), α ∈ Λ, where Λ is a set of parameters
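Vapnik’s setting can be sketched in miniature: since F(x, y) itself is unknown, the learning machine chooses the α ∈ Λ that minimizes the empirical risk over the observed samples (the toy data, function class, and grid search below are illustrative assumptions):

```python
# Toy supervisor output: noisy samples of y ≈ 2x.
samples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

# The learning machine's set of functions f(x, alpha).
def f(x, alpha):
    return alpha * x

# Empirical risk: average squared loss over the observed samples.
def empirical_risk(alpha):
    return sum((y - f(x, alpha)) ** 2 for x, y in samples) / len(samples)

# Crude direct search over a finite parameter set Lambda.
Lambda = [i / 100 for i in range(0, 401)]
best = min(Lambda, key=empirical_risk)
print(best)  # near the least-squares optimum of about 2.04
```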

Thank you for reading my article about the mathematical background of AI. I hope you enjoyed it. All sources are listed below.

https://r1.ieee.org/maine/wp-content/uploads/sites/29/

https://www.researchgate.net/publication/229674541_Bayesian_Belief_Networks_BBNs
