Though the title of the article suggests a light-hearted subject, it addresses what might be a necessary aspect of Large Language Model (LLM) AI. As a chartered accountant working with LLM AI (largely the free versions), aware of hallucinations and in the habit of checking results for veracity, the author would like to discuss its further limitations (if there are any).
The following is a copy of the prompts (questions) and the responses by Gemini (free version), which are self-explanatory. The responses have been edited somewhat for brevity.
One understands that Large Language Models are based on a voluminous amount of data made available to them, which forms their core. Is this a continuous process, or is there a cut-off date when it stops?
The process of building a large language model like me is not a single, continuous stream of data. Instead, it is a multi-stage process that involves a definite "knowledge cut-off" date.
The reason for this is the immense computational and financial cost of the process. It takes an enormous amount of time and energy to train a model on trillions of data points. It is not feasible to do this continuously.
While the core knowledge is fixed, models can be updated and improved in other ways.
However, the ability to perform live searches and retrieve new information allows me to overcome this limitation and provide answers on a wide range of current topics.
Is your answer limited to June 2024 knowledge, that being the cut-off date?
My knowledge is not limited to June 2024. My training data includes information from a vast range of sources, and I am continuously updated to provide the most current and accurate information possible.
In an earlier prompt, Gemini had indicated June 2024 as the cut-off date, which was later, in response to another prompt, revised as follows: "The most up-to-date information available indicates that my core knowledge cut-off is January 2025. Some specific models, like the gemini-2.5-flash-image-preview model, have a more recent cut-off of June 2025."
When will you upgrade your core knowledge? This is going to be a perennial problem as days pass.
You have pinpointed one of the biggest challenges in the field of large language models. The problem of a fixed knowledge cut-off is indeed "perennial", as you have called it, and it is something that researchers and developers are actively working to solve.
Retrieval-Augmented Generation (RAG): This is the current solution and what I use to provide you with up-to-date information. The approach is highly effective for many queries but, as you correctly pointed out, it has limitations. The quality of the answer depends on the quality of the retrieved information, and it can sometimes lead to less nuanced answers compared with what a fully trained model might provide. The field is moving towards more dynamic and efficient methods to solve this problem.
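For readers who would like to see the idea concretely, RAG can be pictured as a two-step recipe: retrieve relevant text from an external source, then ask the model to answer using that text. The short Python sketch below is only an illustration of that recipe; search_web and generate_answer are hypothetical placeholders and do not represent Gemini's actual internals.

# Minimal illustration of the Retrieval-Augmented Generation (RAG) idea.
# search_web and generate_answer are hypothetical stand-ins for a live
# search tool and a language model; they are not Gemini's actual code.

def search_web(query: str) -> list[str]:
    """Pretend retrieval step: return text snippets relevant to the query."""
    # A real system would call a search engine or a document index here.
    return ["Snippet 1 about the topic...", "Snippet 2 about the topic..."]

def generate_answer(prompt: str) -> str:
    """Pretend generation step: a language model would complete the prompt."""
    return "An answer grounded in the retrieved snippets."

def rag_answer(question: str) -> str:
    # 1. Retrieve fresh, external information (beyond the model's cut-off date).
    snippets = search_web(question)
    # 2. Augment the prompt with the retrieved context.
    prompt = "Answer using only the context below.\n\nContext:\n"
    prompt += "\n".join(snippets)
    prompt += f"\n\nQuestion: {question}"
    # 3. Generate: the answer is only as good as the retrieved snippets.
    return generate_answer(prompt)

print(rag_answer("What changed in the rules after June 2024?"))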
Does it mean the quality and reliability of answers are better when confined to pre-June 2024 topics than to post-June 2024 topics, since your core knowledge is of higher quality?
The short answer is: yes, for certain types of questions, the quality and reliability of my answer can be better when confined to topics within my core knowledge cut-off.
Here is a more detailed breakdown of why:
Pre-June 2024 topics (within my core knowledge): Deep and integrated knowledge: For these topics, my understanding is based on a vast amount of data, including books, articles and code, integrated into my core neural network. This allows me to draw connections, reason, and provide nuanced and comprehensive answers.
Reduced hallucination risk: The risk of a plausible-sounding but false fact (a "hallucination") is lower. I am not relying on a real-time, external source that could be biased, inaccurate, or just a single, non-authoritative page.
Contextual understanding: My core training gives me a deep understanding of the historical and contextual relationships between concepts. If you ask me about the causes of a historical event, I can pull from a wide range of interconnected information to give you a well-rounded answer.
Post-June 2024 topics (using real-time search): Reliance on external sources: For these topics, I am not answering from my internal knowledge. I am using an external tool, Google Search, to find the information.
The "pipeline" matters: The quality of the answer is heavily dependent on the quality of the search results I retrieve. If the information comes from a single, low-quality or biased website, my answer will reflect that; if it comes from multiple, authoritative sources, my answer will be more robust and reliable. The process of my reading, synthesising and summarising that information from search results introduces a potential for error that is not present when I am drawing on my internal, trained knowledge.
Potential for inaccuracies and gaps: There is a chance that I may not find all the relevant information, or that the information I find is incomplete. This can lead to a less comprehensive or even a partially incorrect answer. In some cases, if the information is too new or too sparse on the web, I may not be able to provide a detailed answer at all.
The trade-off: For topics within my core knowledge base, you are getting a more deeply integrated, synthesised and less hallucination-prone answer. For topics outside of that, you are getting an answer that is only as good as the live search results I am able to find and process.
The key takeaway is to always treat my answers, especially on new, rapidly changing or highly technical topics, as a starting point. It is always good practice to verify critical facts, regardless of when the information was published.
Conclusion (by the author)
For a similar query, Perplexity (free version) also answered June 2024 as the cut-off point.
The point of the article is to ensure that the user is fully aware of this additional dimension of LLMs. As for the author, AI has been a very useful tool, provided certain precautions are observed.
The author is a chartered accountant
Published on September 20, 2025