Disclaimer: The following text is a collection of unorganized thoughts, which I have tried to make general, accessible, and 'just' useful. It is deliberately incomplete. Experts hoping for something substantial may therefore end up disappointed. If not, I owe you a big "thank you" for your time in going through it :)
Over the past few years, Large Language Models (LLMs) - and Artificial Intelligence (AI) tools more broadly - have become household tools in workplaces around the world (see the McKinsey report), and I can surely tell the adoption trend is growing fast in Nepal. In a recent discussion with some of my colleagues from the natural language processing (NLP) research group, we revisited the potential risks associated with 'hallucinations' in LLMs, where responses from these 'intelligent learning models' appear plausible but are in fact compromised on factual correctness (potentially adulterated along the chain of thought), leading to (critical) unintended consequences.
Since this also relates to strategic interactions and strategic behaviors, something I am keenly interested in understanding, we discussed what we actually consider 'learning' to be, and whether 'intelligence' can be a commodity that any learning agent subscribes to - to adapt itself to the operational environment and autonomously undertake planning, decision-making, and optimization tasks. I must remark here that by no means am I equating learning with intelligence, nor with the emergent behavior (here, the reader might be interested in our earlier work on large-scale distributed learning systems and intelligence, and a summary blogpost).
While addressing some of these aforementioned problems would demand interdisciplinary knowledge at the intersection of Computer Science, Statistics, and Economics, what always comes missing (or late) in the story is a system-level perspective; for instance, regarding network architectures constrained by data availability and accessibility, compute, memory, and (wireless) connectivity, which fundamentally offer a substrate on which all of these concepts can be worked out. Future wireless communication systems - 6G and beyond, perhaps architectures that evolve classical communications toward quantum applications (see 1Q) - are convergent and aimed to be AI-native: envisioned to primarily operate over (and by) interconnected 'intelligence' - network entities (and functionalities) working as 'learning agents', taking different roles that allow acquisition and fusion of different modalities of information, and, through exploration (and improvement) of the underlying communication technologies, optimizing resources and operations for different complex tasks.
Here, I cherry-pick memory as one attribute to talk about, though I work more on understanding several aspects related to data, learning, and connectivity. Complementing what the authors discuss as "now" (see "Time, Simultaneity, and Causality in Wireless Networks with Sensing and Communications"), I write to unfold the notion of "past" - a parallel that can be drawn (with some liberty and a loose definition) to what is perceived as "memory": an algorithmic access to past experiences (and computations), in contrast to "now". One can easily argue, at least in modern times dominated by AI and Machine Learning (ML), that "storage is cheap, memory is expensive" - working memory is expensive!
Surely, you know this when someone says, "Do keep me (us) in your memory."
Memory is a key cognitive function - gifted by nature - yet beautifully agile and dynamic, and it does not operate the same way for all of us. For instance, if you have read "Thinking, Fast and Slow" by Daniel Kahneman, then the next time you meet someone you have seemingly met before, you would probably have both System 1 (fast, intuitive, and automatic) and System 2 (slow, deliberate, and analytical) engaged. Your mind would implicitly run several rounds of cognitive computation to match the person along the memory lane. In simple learning terms, you might be solving a classification problem through iterative interactions with memory. Beyond that, the thinking part here could be an abstraction of cognitive computation, what perhaps can be called reasoning: "intelligence with memory". As such, System 1 and System 2 also unfold two contrasting but complementary time scales for general cognitive operations (thinking).
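To make the "classification through iterative interactions with memory" idea concrete, here is a minimal sketch of my own (the names and embeddings are entirely made up, and real face recognition is far more involved): recognizing a familiar face as a nearest-neighbor lookup over a memory of past encounters, with a distance threshold standing in for the point where fast System 1 recall gives up and slower System 2 deliberation would take over.

```python
import numpy as np

# Hypothetical "memory" of past encounters: one embedding per person.
rng = np.random.default_rng(0)
memory = {name: rng.normal(size=8) for name in ["Asha", "Bikram", "Chandra"]}

def recall(query, memory, threshold=2.5):
    """Return the best-matching name, or None if nothing is close enough
    (the point where deliberate, System-2-style reasoning would kick in)."""
    best_name, best_dist = None, np.inf
    for name, emb in memory.items():
        dist = np.linalg.norm(query - emb)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None

# A noisy re-encounter with "Asha": the fast lookup still matches.
query = memory["Asha"] + rng.normal(scale=0.1, size=8)
print(recall(query, memory))
```

The threshold is the interesting design choice: too loose and the agent "recognizes" strangers; too tight and every encounter demands expensive deliberation.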
Such modes of thinking, which require different cognitive effort and computational abilities, cannot, however, be the only basis for defining intelligence; a modern exposition on intelligence, simply called smart decisions, tells us what leads to errors and how to attain specific goals with minimal errors, if not perfection (optimality). I strongly recommend readers check "A Collectivist, Economic Perspective on AI" by Michael I. Jordan. The primitives of intelligence go back to accumulating experiences, building knowledge, and (re)acting (reasoning) based on it: a new form of intelligence juggling the past, sneaking through the "memory lane", and making a prediction.
Intelligent systems exploit the aforementioned concepts of memory for reasoning to make informed decisions, and the related works (e.g., see [a], [b], [c], and [d]) are precise, in many ways, in defining and interpreting the value of "information" in communications. Real-world signals are 'analog' and 'multimodal'. Humans take snapshots of the world's information through their 'sensory memory' - some that last for a long period of time, some that fade over time without finding a proper registry in the mental memory map. From a system-level perspective, data delineates information signals, and how data value is perceived - as much as it depends on content, context, and timing aspects (the context window, e.g., [e]) - is related to its usage within the system and 'by' the algorithms (and applications). As positioned before, intelligence is therefore an interplay between data, compute, and learning/inference - captured within a common fabric of timing, a context window in the memory, which is sparse, dynamic, and agile - and algorithms (computations). This gives rise to first-order, second-order, and even differential thinking (reasoning in the sense of intelligent systems) to interpret information. "Memory" therefore becomes, and is, a crucial part of agentic workflows, and its active engagement underpins intelligent decision-making.
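The "context window in the memory, which is sparse, dynamic, and agile" can be sketched very simply. The class below is my own toy simplification (not any particular system's API): a bounded working memory that keeps only the most recent observations, so older context falls out of the window exactly the way sensory snapshots fade without a proper registry.

```python
from collections import deque

class ContextWindow:
    """Toy sketch of a bounded working memory: only the most recent
    `capacity` items survive; everything older is silently evicted."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # deque evicts old items itself

    def observe(self, item):
        self.buffer.append(item)

    def recall(self):
        return list(self.buffer)

window = ContextWindow(capacity=3)
for event in ["sunrise", "meeting", "lunch", "deadline", "sunset"]:
    window.observe(event)

print(window.recall())  # -> ['lunch', 'deadline', 'sunset']
```

Everything the agent can reason over is whatever happens to sit inside the window at decision time, which is why the timing of an observation matters as much as its content.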
Going back to LLMs, as you may have experienced already, most variants have a short-term memory, and exposure to high-frequency adversarial contexts (examples) is sufficient to break their 'knowledge consistency'; by manipulating their working memory, their intelligence through subsequent reasoning can be easily challenged.
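As a hypothetical illustration of that failure mode (not a real attack on any specific model), suppose an agent answers from its bounded working memory by simple majority vote over what it has recently been told. Then high-frequency repetition of an adversarial claim crowds the window and flips the answer, even though the original fact was never retracted:

```python
from collections import Counter, deque

# Bounded working memory of (question, claim) pairs; only the 5 most
# recent survive, mimicking a short context window.
memory = deque(maxlen=5)

def answer(question):
    """Answer by majority vote over claims currently in working memory."""
    votes = Counter(claim for q, claim in memory if q == question)
    return votes.most_common(1)[0][0] if votes else None

memory.append(("capital of Nepal", "Kathmandu"))
print(answer("capital of Nepal"))  # -> Kathmandu

for _ in range(4):  # adversarial repetition floods the window
    memory.append(("capital of Nepal", "Pokhara"))
print(answer("capital of Nepal"))  # -> Pokhara
```

The point is that 'knowledge consistency' here is only as robust as the memory policy: a pure recency-plus-frequency rule has no notion of trust, so whoever writes most often into the window wins.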
So what will you keep in the memory?
That's the question a learning agent will ask in the near future. :)