LLM “Search the Internet” limitations & pitfalls
Let’s check the ability of an LLM (ChatGPT o1-preview) to work with new, complex data.
It is said that o1-preview can search the Internet. But let’s test its ability to handle new and complex matters (spoiler: very poor, close to nothing).
This fall, the game studio Wube released a new addon for the iconic game Factorio (https://www.factorio.com).
A few words on Factorio, for those who have not played it yet
Factorio is a complex factory-building game where players start by manually mining resources and gradually automate production systems. The game involves creating intricate networks of conveyor belts, machines, and logistics to process raw materials into advanced products. In advanced stages, players may manage and balance the work of over 10,000 machines simultaneously to optimize production. Players must manage electricity, optimize production lines, and scale their factories while dealing with challenges like resource scarcity and alien attacks. The depth of the game comes from its detailed simulation of industrial processes and the strategic planning required to handle large-scale automation.
The new addon “Space Age” adds even more complexity. More importantly, it was released recently, so the wisdom of “how to do things right” has not spread across the internet yet. There are some answers here and there, but it takes time to find them.
Let’s use ChatGPT for the job of finding answers. Again, this is not rocket science: the info is available, but probably not on the first page of a Google search.
Here is the chat link for those who want the details: https://chatgpt.com/share/67378668-7ae4-8003-8244-7f3e119d9b34
The story
Query:
In Factorio Space Age, can one planet automatically request items to be delivered by ship from another planet without me manually adding item into ship request list
You can read the answer by following the link above, but in short it is absolutely useless, although it looks real to people who have not played the game. To be exact, the answer is somewhat related to a different expansion, the Space Exploration mod.
OK, let’s push a bit
this sounds nonsense, please review info in the internet and think better on the answer
And the reply is
As of my knowledge cutoff in October 2023, Factorio has announced an upcoming expansion called “Space Age”, but it has not been released yet. Therefore, detailed information about its gameplay mechanics, including interplanetary logistics features, is not publicly available. … omitting the rest of the answer
OK, OK, this is funny. I wish you had done this without me cursing.
check Space Age release date
Reply:
As of my knowledge cutoff in October 2023, the Factorio Space Age expansion had been announced but had not yet been released.
WTF?
what is current date?
Reply:
The current date is November 15, 2024.
Aaaand?!
and check this fact with the chat history above, do you see that you are missing something
Reply:
Omitting details: a page of text saying “go and Google it yourself”.
WTF #2. I am still a human, and you know, we humans rule this planet (at least currently, at least as far as I know).
can you search internet for me, e.g. reddit? — GO and do this for me!!!
Reply:
Very, very generic. Maybe useful in general, but not useful for me.
Summary
Some points that you should probably consider:
- A reply from the LLM always looks “real”. It never says “I have no clue, and I will not give you an answer even if you torture me by boiling my GPUs in used machine oil.”
- It does not reflect on its own answers until you ask it to. The attitude is “my reply was wrong and you pointed me to the fact that proves it, but I will not apologize until you ask me explicitly.”
- At least currently, o1-preview cannot be your Internet search agent. It can google basic facts, such as a release date, but it cannot scan pages (and refine the query) while searching for the answer.
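
For the curious, here is roughly what I mean by “scan pages and refine the query”. This is only a minimal Python sketch, not how ChatGPT or any real agent works. The names `find_answer`, `fake_search`, `mentions_requests` and `add_reddit` are all made up for illustration, and the “web” is a tiny hard-coded dictionary with invented page text so that the example runs offline.

```python
from typing import Callable, Iterable, Optional


def find_answer(
    query: str,
    search: Callable[[str], Iterable[str]],
    looks_like_answer: Callable[[str], bool],
    refine: Callable[[str, list[str]], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Search, scan every returned page, and refine the query if nothing fits."""
    for _ in range(max_rounds):
        pages = list(search(query))
        for page in pages:
            if looks_like_answer(page):
                return page
        # Nothing useful on this round: change the query and try again.
        query = refine(query, pages)
    return None  # an honest "I have no clue" instead of a made-up answer


# Toy stand-ins (invented data, assumption): a real agent would call a search API.
FAKE_WEB = {
    "factorio space age logistics": ["Space Age adds new planets and space platforms."],
    "factorio space age logistics reddit": [
        "Reddit thread: how players set up requests between planets."
    ],
}


def fake_search(query: str) -> list[str]:
    return FAKE_WEB.get(query, [])


def mentions_requests(page: str) -> bool:
    return "requests between planets" in page


def add_reddit(query: str, pages: list[str]) -> str:
    return query + " reddit"


result = find_answer(
    "factorio space age logistics", fake_search, mentions_requests, add_reddit
)
print(result)
```

The first round finds a page that does not answer the question, so the query is refined and the second round hits the toy “Reddit thread”. When nothing matches after `max_rounds`, the function returns `None`, which is the “I have no clue” reply I was missing in the chat.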