Hey Cameron, love the depth of the article! I have a question for you regarding the retrieval and fine-tuning article: what are your thoughts on OpenAI releasing the ability for users to build custom GPTs? Does that do away with fine-tuning? Have you seen or tested their effectiveness? Thank you!
The functionality is really cool/useful, but I definitely don't think it does away with fine-tuning! Practitioners will still want to build smaller/customized models that they can host in-house and specialize over their own data. Depending on the use case, people might not be comfortable using centralized/proprietary models that they don't have control over.
Thank you for your answer!
How do you know there is no BEIR contamination, or data contamination more generally, in the synthetic data generated by GPT-4?
The fine-tuned LLM (in your last or second-to-last paper) that was trained on a mixture of synthetic and other data and ended up beating some BEIR benchmarks was surprising, and I'm wondering whether that is a fair benchmark.
Great point! This is definitely possible. We probably need to standardize the reporting of contamination metrics as a community to ensure that we are seeing actual performance benefits and not simply training on the test set. However, this is somewhat difficult when GPT-4's training dataset is unknown/proprietary (we can only estimate contamination by downloading a ton of data from the internet).
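For a rough idea of what I mean by a contamination metric, here's a minimal sketch of an n-gram overlap check over the data we can actually inspect (the 13-gram window and function names are just my own illustrative choices, not from any particular paper):

```python
# Minimal sketch of an n-gram overlap contamination check.
# Assumption: we only have the benchmark test set and the synthetic data we
# generated -- GPT-4's actual training data is unavailable, so this can only
# flag overlap with data we can inspect ourselves.

def ngrams(text: str, n: int = 13) -> set:
    """Return the set of word-level n-grams for a piece of text."""
    tokens = text.lower().split()
    return {tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(synthetic_docs: list, test_docs: list, n: int = 13) -> float:
    """Fraction of synthetic documents sharing at least one n-gram with the test set."""
    test_ngrams = set()
    for doc in test_docs:
        test_ngrams |= ngrams(doc, n)
    flagged = sum(1 for doc in synthetic_docs if ngrams(doc, n) & test_ngrams)
    return flagged / max(len(synthetic_docs), 1)

# Example usage (placeholder variable names):
# rate = contamination_rate(generated_queries, beir_test_passages)
# print(f"{rate:.1%} of synthetic examples overlap with the test set")
```

Reporting something like this alongside benchmark results would at least make the overlap (or lack of it) visible, even if it can't rule out contamination inside GPT-4's own pretraining data.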
> The first step in applying DocLLM is to pass a document through an optical character recognition (ORC) system.
Do you have any suggestions for a reliable open-source OCR system?
Btw, there is a typo in your article ("ORC" should be "OCR").
I typically use Tesseract, but I know its performance can lag behind certain proprietary solutions (e.g., the Azure OCR API). OCR systems have been improving rapidly over the last year, so I'm sure more open-source systems will be released soon.
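In case it's useful, here's a minimal sketch of calling Tesseract from Python via pytesseract (assumes the tesseract binary plus the pytesseract and Pillow packages are installed; the file path is a placeholder):

```python
# Minimal sketch: extract text from a scanned page with Tesseract.
# Requires the tesseract binary plus `pip install pytesseract pillow`.
import pytesseract
from PIL import Image

image = Image.open("scanned_page.png")  # placeholder path

# Plain text extraction
text = pytesseract.image_to_string(image)
print(text)

# Word-level bounding boxes, which is the kind of layout information
# a spatially-aware model like DocLLM expects as input
data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
for word, x, y, w, h in zip(
    data["text"], data["left"], data["top"], data["width"], data["height"]
):
    if word.strip():
        print(word, (x, y, w, h))
```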
Another notable article to keep me abreast of what's going on in the field. Thanks a lot, Cameron!
Of course! Glad you liked the article :)