Peter Wang on How to Democratise AI
See also Ursula Martin's talk mentioned in Emily Riehl's Formalizing invisible mathematics: case studies from higher category theory.
I'm too old for this shit! At 7:12 there's a sign of what's really going on in LLMs. See Florentin Guth and Brice Ménard's paper On the universality of neural encodings in CNNs, and Harnessing the Universal Geometry of Embeddings by Rishi Jha, Collin Zhang, Vitaly Shmatikov and John X. Morris: "We introduce the first method for translating text embeddings from one vector space to another without any paired data, encoders, or predefined sets of matches. Our unsupervised approach translates any embedding to and from a universal latent representation (i.e., a universal semantic structure conjectured by the Platonic Representation Hypothesis). Our translations achieve high cosine similarity across model pairs with different architectures, parameter counts, and training datasets.
The ability to translate unknown embeddings into a different space while preserving their geometry has serious implications for the security of vector databases. An adversary with access only to embedding vectors can extract sensitive information about the underlying documents, sufficient for classification and attribute inference."
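To get a feel for why translating embeddings between spaces can preserve their geometry, here is a minimal sketch using toy data. Note this is a supervised baseline (orthogonal Procrustes on paired points), not the unsupervised vec2vec method from the paper; the random "models", dimensions, and noise level are all assumptions for illustration.

```python
# Sketch: two hypothetical "models" embed the same documents via different
# rotations of a shared latent space. A single orthogonal map (found here by
# Procrustes with paired data, unlike the paper's unsupervised method) then
# translates one space into the other with high cosine similarity.
import numpy as np

rng = np.random.default_rng(0)

# Toy "documents": 200 points in a shared 16-dimensional latent space.
latent = rng.normal(size=(200, 16))

# Each "model" applies its own random orthogonal transform, plus a little noise.
Qa = np.linalg.qr(rng.normal(size=(16, 16)))[0]
Qb = np.linalg.qr(rng.normal(size=(16, 16)))[0]
emb_a = latent @ Qa
emb_b = latent @ Qb + 0.01 * rng.normal(size=(200, 16))

# Orthogonal Procrustes: rotation W minimizing ||emb_a @ W - emb_b||_F.
u, _, vt = np.linalg.svd(emb_a.T @ emb_b)
W = u @ vt

translated = emb_a @ W
cos = np.sum(translated * emb_b, axis=1) / (
    np.linalg.norm(translated, axis=1) * np.linalg.norm(emb_b, axis=1))
print(round(float(cos.mean()), 3))
```

The mean cosine similarity comes out very close to 1.0, which is the geometric fact the attack in the paper exploits: if geometry survives translation, so does the information it encodes.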
It's producing an optimum encoding, the one Claude Shannon described in 1948 in Sections I.2 and I.3 of A Mathematical Theory of Communication, or it would be if there were proper control of the input. If you read E. T. Jaynes's book Probability Theory you see that this is a very interesting question from the point of view of Bayesian probability. This quote on thinking machines from Jaynes' book sums it up:
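Shannon's source-coding result from those sections can be made concrete in a few lines: the average length of an optimal prefix code lies within one bit of the source entropy H, and for dyadic probabilities it hits H exactly. The Huffman construction below is a standard illustration of such an optimal code (the particular symbol distribution is just an example, not anything from the linked talk):

```python
# Sketch of Shannon's bound: build a Huffman code for a small source and
# compare its average code length with the source entropy H.
import heapq
from math import log2

probs = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}

# Entropy H = -sum p log2 p.
H = -sum(p * log2(p) for p in probs.values())

# Huffman tree via a priority queue; the counter breaks probability ties
# so the dict payloads are never compared.
heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
heapq.heapify(heap)
counter = len(heap)
while len(heap) > 1:
    p1, _, c1 = heapq.heappop(heap)
    p2, _, c2 = heapq.heappop(heap)
    merged = {s: '0' + code for s, code in c1.items()}
    merged.update({s: '1' + code for s, code in c2.items()})
    heapq.heappush(heap, (p1 + p2, counter, merged))
    counter += 1
codes = heap[0][2]

avg_len = sum(probs[s] * len(codes[s]) for s in codes)
print(H, avg_len)  # both 1.75 bits: dyadic probabilities, so the code is exactly optimal
```

With messy or uncontrolled input the probabilities are unknown, which is exactly why "proper control of the input" matters before claiming the encoding is optimal.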
Models have practical uses of a quite different type. Many people are fond of saying, “They will never make a machine to replace the human mind—it does many things which no machine could ever do.” A beautiful answer to this was given by J. von Neumann in a talk on computers given in Princeton in 1948, which the writer was privileged to attend. In reply to the canonical question from the audience [“But of course, a mere machine can’t really think, can it?”], he said: “You insist that there is something a machine cannot do. If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that!”
In Jaynes' 1978 essay Where do we stand on Maximum Entropy? he wrote, about Shannon's notion of entropy as information:
We take a step in the direction of making sense out of this if we suppose that H measures, not the information of the sender, but the ignorance of the receiver, that is removed by receipt of the message. Indeed, many subsequent commentators appear to adopt this interpretation. Shannon, however, proceeds to use H to determine the channel capacity C required to transmit the message at the given rate. But whether a channel can or cannot transmit a message M in time T obviously depends only upon properties of the message and the channel --- and not at all on the prior ignorance of the receiver! So this interpretation will not work either.
From E.T. Jaynes "Where do we stand on Maximum Entropy?" (1978) [p 23]
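Jaynes's objection can be seen directly in Shannon's capacity formula for the simplest noisy channel. For a binary symmetric channel with flip probability f, the capacity is C = 1 - H(f) bits per use: a function of the channel's noise alone, with no term for the receiver's prior ignorance. A minimal sketch (the example flip probabilities are my own):

```python
# Capacity of a binary symmetric channel: C = 1 - H(f), where H is the
# binary entropy of the flip probability f. Nothing about the receiver's
# prior knowledge appears anywhere in the formula.
from math import log2

def binary_entropy(f: float) -> float:
    """H(f) = -f log2 f - (1 - f) log2 (1 - f), with H(0) = H(1) = 0."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * log2(f) - (1 - f) * log2(1 - f)

def bsc_capacity(f: float) -> float:
    """Capacity in bits per channel use, a property of the channel alone."""
    return 1.0 - binary_entropy(f)

print(bsc_capacity(0.0))   # noiseless channel: 1 bit per use
print(bsc_capacity(0.5))   # pure noise: capacity 0
```

Whether a message fits through the channel in time T is settled by C and the message's own statistics, which is Jaynes's point about where the "ignorance of the receiver" reading breaks down.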
Subscribe to FUTO.