7 Comments
Kevin:

I agree, it is becoming clearer how the LLMs function as “fuzzy lookup systems”. Hallucination is still a problem, too - it does not seem able to admit that it can’t figure out a problem when it’s a little beyond its ability.

Brian Chau:

Well, o1 seems to be a much clearer lookup system! ... Albeit still a lookup system.

sprachzarathustra:

How does this relate to the thesis of the (I know, unpublished) Diminishing Returns in Machine Learning installment on algorithmic improvements? On one hand, o1's accuracy scales logarithmically with compute; on the other, test-time compute is a new type of scaling distinct from training compute. Will the next installment of Diminishing Returns still be published? I enjoyed the first one.

Brian Chau:

Well, I ended up not finding anything because there wasn't much algorithmic change at all! There was some introduction of mixture-of-experts, but that wasn't really a change with much long-term impact. Honestly, I expected more algorithmic improvement by now, so hopefully RL-type scaffolding is a path towards that.

Darij Grinberg:

Here is a very simple math problem that has so far stumped every AI I've tried (arguably I haven't tried many, as I don't have paid accounts). Can you try it out on o1?

===

I'm looking for an answer to an exercise left unproved in some notes on abstract algebra. It goes as follows:

(a) Let $R$ be a commutative ring, and $A$ and $B$ be two $n \times n$-matrices with entries in $R$. Prove that $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$. Here, $\operatorname{Tr}(C)$ denotes the trace of a matrix $C$, defined as the sum of its diagonal entries.

(b) Is the claim of (a) still true if $R$ is not commutative? Provide a proof or an explicit counterexample.

===

I always get mostly-but-not-entirely-complete proofs on part (a), and nonsense on part (b) (GPT reasonably guesses to work over the quaternions, but then comes up with matrices that only use integers, rendering the quaternions useless). This is precisely what I would have predicted: part (a) is in the training data or very close to it, while part (b) is much less popular, despite being very easy given the definitions and a basic understanding of noncommutative rings. And matrices with commutative entries are far more widespread than noncommutative ones, so the machine's muscle memory leans commutative. I'd be excited to see AI break this barrier.
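For what it's worth, the standard computation for (a) is $\operatorname{Tr}(AB) = \sum_{i,j} a_{ij} b_{ji} = \sum_{j,i} b_{ji} a_{ij} = \operatorname{Tr}(BA)$, which uses commutativity in the middle step. Once that fails, even $1 \times 1$ matrices give a counterexample: over the quaternions, take $A = (i)$ and $B = (j)$, so $\operatorname{Tr}(AB) = ij = k$ while $\operatorname{Tr}(BA) = ji = -k$. A minimal sketch in Python checking this (my own illustration, not from the notes; quaternions are hand-rolled as 4-tuples since nothing beyond the Hamilton product is needed):

```python
# Quaternions represented as tuples (a, b, c, d) meaning a + bi + cj + dk.

def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)

# For 1x1 matrices A = (i) and B = (j), the traces are just the products:
print(qmul(i, j))  # (0, 0, 0, 1)  = k
print(qmul(j, i))  # (0, 0, 0, -1) = -k, so Tr(AB) != Tr(BA)
```

So the counterexample really is as small as it gets; the only work is knowing one noncommutative ring.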

Abe:

It gets the right answer:

https://chatgpt.com/share/66e5d715-d584-800d-9b4a-64f451e8e876

Although it's still kind of an uncanny-valley right answer.

A mensch:

Great post. FYI, all the conversation links appear to be broken for me (I'm logged in, have a premium account, and am on Firefox; I also tried in a private window).
