Hacker News | drdeca's comments

If you interpret “interpolate” in the literal sense and apply it to the mechanisms behind LLMs, then the claim that they only interpolate is straightforwardly false.

Taking it instead as a metaphorical claim may be more valid, but in that case it doesn’t depend on our understanding of how LLMs work.


LLMs are statistical models by construction, so depending on how liberal you want to be with terminology, "interpolate" is not so bad. Might make a statistician upset.

But people aren’t giving a (less literal) definition of what they mean by “interpolate” that relies on the internal mechanisms of these models. They’re giving a vague metaphor, and as a vague metaphor it uses nothing about LLMs that would make the question “do LLMs just interpolate” less of a type error than “do people just interpolate”.

And I don’t think it’s a good metaphor.


They’re also capable of performing arbitrary computation when run in a loop, so they can be made to quite literally interpolate whatever. Philosophers are quite upset, too.

Define "arbitrary". Without RAG, they screw up basic algebra.

People keep saying this, but if you try to interpret this at all literally, it just doesn’t work. Like, it’s phrased like it should have a precise meaning, right? Like, people even mention convex hulls when talking about it.

But if you actually take the convex hull of some encoding of sentences as vectors? It isn’t true. The outputs are not in the convex hull of the training data.

I guess it’s supposed to be a metaphor and not literal, but in that case it’s confusing, especially seeing as there are contexts in machine learning where literal interpolation vs. literal extrapolation is relevant. So, please, find a better way to say it than “it can only interpolate”?


If it's all just points in the multidimensional space, why would the thing be restricted to some operations and not others? I'm not buying the argument.

Sorry, I don't understand what you mean. Are you agreeing or disagreeing with me?

If it can only interpolate in a literal sense, that means that it only produces good outputs on convex combinations of inputs that appear in the training set. That's what interpolation means. But if you take the embedding vectors of sentences/prompts, and then take the convex hull of these, it is not typical for a new sentence that is not in the training set to have its embedding vector be in that convex hull.
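To make the convex hull point concrete, here is a minimal sketch. It is only a toy: it uses random Gaussian vectors as a stand-in for embeddings (an assumption, these are not real sentence embeddings), and the function names are made up for illustration. It uses a separating-hyperplane certificate: if every training point x satisfies <x, q> < <q, q>, then q cannot be a convex combination of the training points. A failed certificate is inconclusive, so this can only undercount the points that lie outside the hull.

```python
import random


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gaussian_points(n, dim, rng):
    """Toy stand-in for embeddings: n i.i.d. standard Gaussian vectors."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


def provably_outside_hull(points, query):
    """Separating-hyperplane certificate, using the query as the direction.

    If <x, q> < <q, q> for every x in points, then q is not a convex
    combination of the points. Returning False is inconclusive: the
    query may be inside the hull, or outside it in a way that this
    particular direction does not detect.
    """
    threshold = dot(query, query)
    return all(dot(x, query) < threshold for x in points)


def fraction_outside(dim, n_train=500, n_queries=20, seed=0):
    """Fraction of fresh points certified to lie outside the training hull."""
    rng = random.Random(seed)
    train = gaussian_points(n_train, dim, rng)
    queries = gaussian_points(n_queries, dim, rng)
    hits = sum(provably_outside_hull(train, q) for q in queries)
    return hits / n_queries


if __name__ == "__main__":
    for dim in (2, 100):
        print(dim, fraction_outside(dim))
```

In 2 dimensions the certificate should rarely fire, since most fresh points sit inside the cloud. In 100 dimensions one would expect it to fire for essentially every fresh point, even though those points are drawn from exactly the same distribution as the "training" points.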


I’m not sure I follow your end-to-end reasoning. In an n-dimensional space, interpolation along and within the convex hull is pretty much what they’re doing. How can it possibly not be? How would it interpolate a point that’s not within its vector space? Yes, it’s very complex, with nonlinear transformations and very high dimensionality, and residuals and other features add more complexity to the shape of the hull. But an LLM cannot infer a concept to which it has no information channel. That’s clearly nonsense. Doing bounded, learned, nonlinear compositional generalization over a representational space induced by training is, by nature, interpolation, not extrapolation. I’m sorry, but I believe their immense power has you confusing math with magic.

I think your point about “you could randomly generate a sequence of words, which could in principle produce a text interpretable as expressing any particular expressible-as-a-sequence-of-words novel good idea” pretty much refutes the idea that guessing and checking can only result in things inside such a convex hull, unless said hull already contains everything. Of course, there’s a significant role to play by the “checking” part.

Like, “take a random sequence of bits and interpret it as Unicode” is at one end of a scale, and “take a random sequence of words in a language” is just a tad away from it, and the scale continues in that direction for quite a while.


This assumes that everything outside of the convex hull can already be described using existing language. If you need new language to describe what is outside of the convex hull, is this something an LLM can do?

I actually don't know the answer to that; my understanding is that LLMs by nature of what they are can't understand concepts that are independent of the existing language they are trained on, but I don't have enough in-depth nitty-gritty knowledge of like, core LLM implementation details and architecture and stuff to know if that understanding is correct or not.


I suppose it is conceivable that there are some useful ideas that cannot be described in terms of language we understand (e.g. if there are ideas that are alien to us and beyond what can be described using https://en.wikipedia.org/wiki/Natural_semantic_metalanguage#... ), but if there are, I'm not sure those are ideas we can communicate to one another?

By "If you need new language" do you mean like, coining new words?

I don't see what would prevent them from doing this? LLMs can process text that includes newly coined terms, and respond to that text in ways that use those newly coined words in accordance with the descriptions of the meanings given for those new words in the prompt. They can also make up new words+definitions when asked to do so. Now, whether they can, without being told to do so, recognize that it would be useful to coin a new word for something, and then start using it, I don't know of any instances of this, but based on the previous two things, I don't see a reason to expect this to be fundamentally beyond what they can do?

I don't know what it would mean for a concept to be "independent of the existing language they are trained on". If there are ideas that can't be expressed in terms of the semantic primes (the primes in terms of which every idea we can express is supposed to be expressible), then I guess such an idea would be independent of our language, but I think that's a much stricter condition than what you mean (and I'm not sure if there even are any good ideas that can't be indirectly expressed in terms of semantic primes -- I kind of suspect not, unless they are like, ideas that are too big to fit in a human mind anyway).

Of course, the outputs these models produce are causally downstream from the data they are trained on, and the distribution they produce over text is largely based on the distribution over text in the training data, but altered in a number of ways (for example, to make them implement the character of the "assistant" persona).


Accuracy is valuable.

> The entire "alignment" argument always assumes that there's an objectively correct value set to align to, which is always conveniently exactly the same as the values of whoever is telling you how important alignment is.

No, it doesn’t.

Many of them are (unfortunately) moral relativists. However, that doesn’t mean their goals are to make the models match their personal moral standards.

While there is a lot of disagreement about what is right and wrong, there is also a lot of widespread agreement.

If we could guarantee that on every moral issue on which there is currently widespread agreement (… and on which there would continue to be widespread agreement if everyone thought faster with larger working memories and spent time thinking about moral philosophy) any future powerful AI models would comport with the common view on that issue, then alignment would be considered solved (well, assuming the way this is achieved isn’t by causing people’s moral views to change).

Do companies try to restrict models in more ways than this? Sure, like the example you gave about Taiwan. And also other things that would get the companies bad press.


Fascinating! We find the objectively correct value system by "currently widespread agreement"! Good thing "the common view" is always correct. Hey, have there ever been any issues where there used to be "widespread agreement" and now there's disagreement, or even "widespread agreement" in the polar opposite direction?

I can think of several off the top of my head, but maybe you need to spend some more time thinking about the history of moral philosophy.


Why are we discussing anything so deep? If you want to know Claude's alignment, just ask about whether it was wrong to use copyrighted data to train Claude (of course, in practice, I'd be willing to bet a lot they're still doing that. They've not stopped the practice, at most they'll be somewhat indirect about it)

Because that was obviously judged wrong by just about everyone and everything including even the US state. Yet Claude obviously has a different alignment.

In other words: Claude's alignment has a priority "protect Anthropic's money" that has higher priority than following the law. THAT is its alignment. Nothing else. And you can simply objectively verify if this is the case or not.


> If we could guarantee that on every moral issue on which there is currently widespread agreement

This is ridiculous to me and all you need to do is get a group of friends to honestly answer 10 trolley problems for you to see it like that also. It gets fragmented VERY quickly.


I think it depends on your friends, but that feels super cynical. Perspective is everything.


It may be relatively achievable to get 10 'friends' into ethical alignment via helping them all develop a deeper perspective on philosophy in general and a particular, finite set of ethical questions specifically.

Doing this with thousands of people - let alone hundreds of millions - eventually becomes statistically impossible. There is a hard cap defined by energy requirements somewhere for any given system. Large scale ethical alignment is simply not a solvable problem in our current situation.


I see your repository’s README says

> Language models process signs (representamens) but are blind to when meaning forks — when the same word means different things to different communities.

But haven’t interpretability results shown that these models internally represent several meanings of the same word differently? In that case, why would they not already do the same for how words are used differently in different communities?


I don’t think these are free parameters in the same sense.

Like, if one theory says that a hunk of metal actually is made of many microscopic grains of various sizes and orientations, where the sizes and orientations of these grains have an effect on the behavior of the metal, you don’t count “the sizes and orientations of these grains” as free parameters, do you?


You would if you didn't have any ability to observe those sizes and orientations.


> thinking that there’s anything that exists


Not from “that half of something had a value”, but from “that half of any thing has a value”.

If you accept that every natural number has a successor which is a natural number, and no two natural numbers have the same successor, and that there are no loops (e.g. by saying that there’s a total order on natural numbers and that any natural number is less than its successor), then there can’t be a finite collection which is all the natural numbers.
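Spelled out as a short sketch (the symbols here are my own labels, not standard notation from anywhere in this thread: S for the successor map, F for a hypothetical finite collection, and I'm assuming there is at least one natural number to start from):

```latex
Suppose $F \subseteq \mathbb{N}$ is finite and contains every natural number.
$F$ is nonempty, and $<$ is a total order, so $F$ has a greatest element $m$.
But $S(m)$ is a natural number, so $S(m) \in F$, and $m < S(m)$,
which contradicts the maximality of $m$.
Hence no finite collection contains all of $\mathbb{N}$.
```

Note that this version only uses the total order and "less than its successor"; the "no two numbers share a successor" condition is what you'd lean on if you ruled out loops some other way.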

You could say “there’s no collection which has all the natural numbers”, which, ok, how do you want to talk about things true of all natural numbers then?

Formulating descriptions of physics without the axiom of infinity (or, without something to play the role of the real numbers) is super icky. You, in practice, can’t do any significant mathematical physics in an ultrafinitistic approach.


> how do you want to talk about things true of all natural numbers then

There's an entire branch of math for that: https://en.wikipedia.org/wiki/Constructivism_(philosophy_of_...


I’m aware of constructive math. You still have the type of natural numbers in that?


Huh? I thought color confinement prevented this?

