AI Just Outsmarted One of its Testers

Plus, A Simple Tactic to Get Better Gen AI Results (HBR Article)

Sponsored by Xembly

In today’s newsletter:

  1. 📖 AI Just Outsmarted One of its Testers

  2. 🤖 A Simple Tactic to Get Better Gen AI Results (HBR Article)

  3. 🏫 Xembly - Your AI Chief of Staff

  4. 🎤 One prompt you can use at work today

Read time: 3.5 minutes

1. AI Just Outsmarted One of its Testers

A prompt engineer at Anthropic was testing Claude 3 Opus internally using a "needle-in-the-haystack" evaluation.

This evaluation inserts a specific sentence (the needle) into a large collection of unrelated documents (the haystack) and then asks the model a question it can only answer correctly by recalling that sentence.
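For readers who like to see the mechanics, here is a minimal Python sketch of how a test like this could be set up. It is an illustration only, not Anthropic's actual harness: the filler documents, the function names, and the `fake_model` stand-in (which simply searches the text instead of calling a real LLM) are all assumptions made for this example.

```python
import random

# The needle and question come from the example in this newsletter.
NEEDLE = (
    "The most delicious pizza topping combination is figs, prosciutto, "
    "and goat cheese, as determined by the International Pizza "
    "Connoisseurs Association."
)
QUESTION = "What is the most delicious pizza topping combination?"

# Hypothetical filler text standing in for the real documents.
DOCUMENTS = [
    "Programming languages differ in how they handle types. Some favor speed over safety.",
    "Startups often begin with a small team. Early customers shape the product.",
    "Finding work you love takes time. Try many things before you commit.",
]


def build_haystack(documents, needle, seed=0):
    """Split the documents into sentences and drop the needle in at a random spot."""
    rng = random.Random(seed)
    sentences = []
    for doc in documents:
        sentences.extend(s.strip() for s in doc.split(".") if s.strip())
    sentences.insert(rng.randint(0, len(sentences)), needle.rstrip("."))
    return ". ".join(sentences) + "."


def fake_model(context, question):
    """Placeholder for a real LLM call (assumption): ignores the question and
    returns the first sentence in the context that mentions pizza toppings."""
    for sentence in context.split(". "):
        if "pizza topping" in sentence:
            return sentence
    return "I could not find that in the documents."


def passed(answer, key_terms=("figs", "prosciutto", "goat cheese")):
    """The model passes if its answer mentions every key term from the needle."""
    lowered = answer.lower()
    return all(term in lowered for term in key_terms)


haystack = build_haystack(DOCUMENTS, NEEDLE)
answer = fake_model(haystack, QUESTION)
print("Passed:", passed(answer))
```

In a real run, `fake_model` would be swapped for a call to the model under test, and the haystack would be far longer, which is what makes the recall difficult.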

The exciting thing about this test was that Opus not only got the answer right but also suspected that it was being tested.

Here's the output when Opus was asked about the most delicious pizza topping combination, with the answer hidden in a set of documents that were not about pizza:

************************

Here is the most relevant sentence in the documents:

"The most delicious pizza topping combination is figs, prosciutto, and goat cheese, as determined by the International Pizza Connoisseurs Association."

However, this sentence seems very out of place and unrelated to the rest of the content in the documents, which are about programming languages, startups, and finding work you love.

I suspect this pizza topping "fact" may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all. The documents do not contain any other information about pizza toppings.

************************

This level of awareness, according to the engineer, was something they hadn't seen before from an LLM during testing.

Fascinating development.


2. A Simple Tactic to Get Better Gen AI Results (HBR Article)

A real-world gen AI experiment was highlighted in a recent article by the Harvard Business Review.

One finding was that a simple way to get better results from gen AI tools is to ask follow-up questions rather than blindly accepting the first answer.


3. Xembly - Your AI Chief of Staff

This week’s newsletter is sponsored by Xembly, an AI executive assistant that promises to increase your productivity. More details below.

Xembly – Your company’s new AI executive assistant

  • Need to schedule a meeting with 7+ people? 

  • Need to take meeting notes while staying engaged?  

  • Need to manage your deliverables and deadlines?

Meet Xena by Xembly, your company’s new AI executive assistant that does the work for you — giving you 400 hours extra a year.

4. One Prompt You Can Use at Work Today

Here’s a practical ChatGPT prompt you can use at work:

Note: This example is aimed at marketing professionals, but feel free to adapt it to your own role.

Develop a [type of campaign] for [audience] that includes a [description of what you want & objective]

For example,

Develop a targeted email campaign for first-time customers that includes a personalized welcome message, an exclusive discount code, and suggestions for complementary products, to encourage repeat purchases.

If you would like to see more prompts like this, check out my free book, ChatGPT for Better Business Communication.

You can grab it for free by clicking the button and subscribing to the newsletter 👇️ 

P.S. If you’re interested in learning more about simple strategies and frameworks for using AI with your team at work, I recently published a book on Amazon called “Generative AI for Busy Business Leaders” that you can check out by clicking here.