This is an automated archive made by the Lemmit Bot.
The original was posted on /r/machinelearning by /u/masonw32 on 2024-11-10 03:37:47+00:00.
In machine learning we work with log probabilities a lot, typically maximizing the log probability. This makes sense from a numerical perspective, since adding is easier (and more stable) than multiplying, but I am also wondering whether there is a more fundamental meaning behind "log probability."
For instance, log probability appears throughout information theory, where negative log probability is "information" (surprisal). Can we view minimizing the negative log likelihood in information-theoretic terms? Is it maximizing or minimizing some measure of information?
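One concrete connection worth checking numerically: the expected negative log-likelihood of a model q under data drawn from a true distribution p is exactly the cross-entropy H(p, q) = H(p) + KL(p || q), so minimizing NLL minimizes the KL divergence from the true distribution to the model. A minimal sketch in plain Python, using two made-up discrete distributions (p and q here are illustrative, not from the post):

```python
import math

p = [0.5, 0.3, 0.2]   # hypothetical "true" data distribution
q = [0.4, 0.4, 0.2]   # hypothetical model distribution

# Expected negative log-likelihood of q under p (in nats): cross-entropy H(p, q)
cross_entropy = -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Entropy of p: irreducible part, independent of the model
entropy = -sum(pi * math.log(pi) for pi in p)

# KL divergence KL(p || q): the model-dependent part being minimized
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identity: H(p, q) = H(p) + KL(p || q)
print(f"H(p,q) = {cross_entropy:.4f}, H(p) = {entropy:.4f}, KL = {kl:.4f}")
```

Since H(p) is fixed by the data, the only part of the expected NLL the model can reduce is KL(p || q), which is zero exactly when q matches p.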