Generationship
34 MIN

Ep. #6, AI’s Cognitive Leap with Mark Wallace of Global Worldwide

about the episode

In episode 6 of Generationship, Rachel Chalmers speaks with Mark Wallace of Global Worldwide. The conversation kicks off with a deep dive on Mark’s philosophical stance that LLMs are actually thinking. Additionally, they explore the risks of using commercial LLMs, the difficulty of fine-tuning LLMs, and the crossroads of game infrastructure and generative AI.

Mark Wallace is a Principal Architect at Global Worldwide. Mark is an expert in distributed computing architectures, databases, observability, and cloud deployments with Kubernetes, with a deep foundation in physics/applied math. He was previously technical director at Sysdig.

transcript

Rachel Chalmers: Today I'm thrilled to welcome Mark Wallace to the show. Mark is a chief software architect, currently at games company Global Worldwide, and a technology innovator and team leader with a deep foundation in physics and applied math.

He's expert at architecting and operationally managing cloud application infrastructure at web scale, and he's a hands-on architect and code mentor who reviews designs and pull requests throughout the entire product system, including servers, infrastructure as code, mobile clients, data science and machine learning. Mark, thank you so much for being on the show.

Mark Wallace: Well, thank you. I'm really looking forward to our conversation. This is a great array of topics.

Rachel: We are also welcoming podcast dog Wiley, the guide dog in training. So if you hear some jingling, it's just his collar. Mark, you take a different philosophical stance from a lot of my guests. You believe these large language models are actually thinking, whatever thinking means. Can you say more about that?

Mark: Yeah. It was around the beginning of this year that I realized it was not useful for us to continue to move the goalposts with respect to AI, reserving the word thinking for certain cognitive tasks and then saying that whatever else a computer got good at was no longer thinking. That makes it hard for us to realistically view these models as tools, and you have to be realistic: in order to see their weaknesses, you have to acknowledge their strengths.

So saying they're not thinking, I feel like it's running away from them, because in fact they are doing really sophisticated cognitive tasks that in any other era of history would of course have counted as thinking.

Rachel: Yeah, and they've blown away the Turing Test, which was our milestone for so many years. As a woman in computing, I'm accustomed to moving goalposts, but this is a little ridiculous.

Mark: Yeah. Philosophically, I'm also aligned with the answer I just gave, that I believe that there really is cognition. Now, it might be a frozen mental state, it's not something that's alive, but it is a frozen mental state reflective of a lot of knowledge and a lot of wisdom, and it's incredibly error prone as any early technology is.

Rachel: I guess the only place I don't fully go with you on that is they don't sense, they don't have a connection to the world.

Mark: It's true, they live entirely in a world of words, so the way I think about them... It's almost like the Disney thing, the Sorcerer's Apprentice. It's like the magic book actually came alive and the books all started talking to you. Its personality is representative of a verbal corpus. They live entirely in the world of words, and they don't live at all; they don't even exist in time. It's just a snapshot of an idea that could occur from the corpus of words.

Rachel: But it is, and I always go back to Jaron Lanier's article in The New Yorker, it is a library that can actually interact with you and talk to you, and get a lot of things wrong. But it can hold up its end of a conversation, it can pass the Turing Test.

Mark: Right. And beyond holding up its end of a conversation, you can confront it with a difficult cognitive task that you would confront a colleague with, and in a lot of cases it'll do quite well. Of course, the corpus of knowledge available to it is so much greater than a person could ever have in their immediate recall that it's compellingly useful for that alone.

The fact that its ability at cognitive tasks is, in a lot of cases, comparable to what people can do with words is good. It lives in a way simpler world than we do.

When I say it, I'm just talking about LLMs in general. But in some ways an earthworm has a tougher job than an LLM. Here's where I'm going to throw in a bias: the idea that we would take a task that involves negotiation and safety and things like that and somehow delegate it to an AI is a little silly, because everything that's alive has graduated from the school of hard knocks over many generations, by definition.

So I wouldn't trust my personal safety with an AI, but that doesn't mean I can't have a really nice conversation with it, like if I was reading a book.

Rachel: I think to me it rhymes a lot with the child development literature I read when I was raising my kids. Babies are also in a very simple world, living in a world of constant input, trying to make sense of it, and growing and changing very rapidly. But we don't expect a baby to grow in isolation and develop judgment and nuance without a community around it. I worry that we regard artificial intelligence, whatever it is, as if it could raise itself, and we don't take on ourselves the responsibility of making sure that it observes our societal norms and values, what we value.

Mark: I agree. I've obviously worked really closely with AI for many years because it's always been important in the gaming industry and other industries that I've been in, and as an architect it's just a technology you always stay on top of. But particularly, lately, I've been an advisor to a company called Bamboo AI and they're working on an AI companion called Willow which is a really charming personality.

But in the process of doing that I've learned quite a bit about fine tuning and what it means and what it can mean. I think where I'm going with this, I'm trying to answer at least one question that you put out there, is that we're going through this engineering process of trying to train these things and form them. The base model is just the beginning, the fine tuning is just an enormous project that is still ahead of a lot of people.

But they don't grow by themselves, and the thing is it's really difficult for us to realize that any animal, anything with a little bit of cognitive skill, an insect that we encounter in real life, is highly adaptive and grows and changes. These things don't, they only change when we change them. I don't know if we have a technology to surmount that yet. There's a lot of tricks we can play to make them appear to grow and that does make them more useful to us.

Rachel: Yeah. Much to unpack there, we'll come back to that piece by piece. Mark, you're already way ahead of most people I know in incorporating gen AI in particular into your day to day work in game infrastructure. Can you talk about some of the things you're using ChatGPT for?

Mark: Yeah. I manage a group of SREs for keeping the game up and going. Global Worldwide is the games studio, and the name of the game that is currently live but also constantly in development like games are, is Kingdom Maker. I am the original technical architect of that game along with some other good friends that worked on it. But in addition to that, now I'm managing the SRE group.

I've managed SRE groups before, and I wouldn't say I'm necessarily an SRE, but it is striking how useful a tool ChatGPT is in that particular context, because SRE work is really information intensive and really cognitively intensive, usually without a very clear path ahead of you. If you're doing development work there's a mental discipline and a path ahead of you, and often in SRE work there is a professional discipline of keep the system up and don't take risks.

But the cognitive solutions for that aren't really obvious. So ChatGPT has actually been really helpful in situations like that. I can have a sophisticated architectural discussion with it about something that I need to do research on, but also need to check my analysis on and make sure that my line of reasoning holds up. Also, I can fish and see if it's got some line of reasoning that might be better.

Rachel: So what does a prompt look like? "Site's down, here are some of the symptoms. What might be causing this?"

Mark: Yeah. I usually start with prompts that are very rich and that I really try to make myself clear in words from the very first thing I say. Then I provide illustrations, usually in the form of code or specifications.

Rachel: So, technically, I guess what that's doing is it's really refining the attention window so that you're picking a piece of the vector space that's super pertinent.

Mark: Exactly. I figure my first interaction with it is my best chance to find-

Rachel: The right nodes?

Mark: Yeah, this little hypersurface of the gradient descent that we're going to do, that's where. For instance, in a prompt today I was talking with it about whether I should switch some of our Redis database code from using an older feature in Redis called Pub/Sub to a newer one called Streams.

That would be a pretty big undertaking, it's a lot of time for the development team, so I was just trying to explore the pros and cons of that versus some other things that I could do to enhance the reliability of what we have with Pub/Sub. So I started by giving it a paragraph on that, and then I followed up with, "And by the way, for some context, here's the YAML specification of some of my proxy code that interacts with Redis," so I threw in an HAProxy configuration.
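A minimal sketch of the trade-off Mark is weighing here, using the redis-py client; the channel and stream names are illustrative, not Kingdom Maker's actual setup:

```python
# Sketch only: contrasts Redis Pub/Sub with Redis Streams.
import redis

r = redis.Redis()  # assumes a local Redis instance

# Pub/Sub is fire-and-forget: if no subscriber is listening
# at the moment publish() runs, the message is lost.
r.publish("game-events", "player-42-joined")

# Streams persist messages, so consumers can read them later,
# re-read them, or process them in consumer groups with acks.
r.xadd("game-events-stream", {"event": "player-42-joined"})
entries = r.xread({"game-events-stream": "0"}, count=10)
print(entries)
```

That durability is the reliability gain being weighed against the cost of migrating the development team's code.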

That's really pretty challenging. That's the way that I would interact with a colleague who was really experienced and already knew me and was ready to just dive in.
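What that first rich prompt might look like in code, sketched against the OpenAI Python client; the model name, file path, and wording are assumptions for illustration:

```python
# Sketch of the "rich first prompt" pattern: a clear ask up front,
# followed by concrete artifacts (here, a proxy config) as context.
from openai import OpenAI

client = OpenAI()

config = open("haproxy.cfg").read()  # hypothetical config file

prompt = (
    "I'm considering migrating our Redis eventing from Pub/Sub to "
    "Streams for better delivery guarantees. Weigh the pros and cons "
    "against hardening the existing Pub/Sub setup instead.\n\n"
    "For context, here is the proxy configuration fronting Redis:\n"
    + config
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```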

Rachel: This is what fascinates me, when you talk about these interactions it's almost like you're talking about it as a peer programmer.

Mark: Yes. I do work with it as a peer programmer, but it's peer architecture. Now, one of the things is that I've tried to convince colleagues to do this and I'll say, "Hey, look, it can do this." It's surprising how few people want to actually do the cognitive work to raise the bar, to get... I'm talking about ChatGPT 4, to get it to do the amount of work for you that it can.

Most people are pretty reluctant to go that direction. The other thing is that invariably the conversation is a search for something useful and there's many prompts and responses. Generally the first prompt is the most in depth, but occasionally there'll be another long prompt in there. But primarily, after that it's steering and trying to find... like you said, it's all a vector space, trying to find that spot in the vector space where it's actually able to access the knowledge and, to be honest, the type of judgment that I want it to.

The other thing is it's actually really useful when it makes a mistake. If you get disappointed when it makes a mistake, it's like, "No, that's within the context of the prompt that you're doing." The mistake is actually a teachable moment, and that will help you get there faster. The thing is, all this stuff that I'm talking about, I've been a software architect, a technical architect in various areas for decades, I suppose.

Before that I was a technical manager. So all I have to do is switch on my architect and technical manager hat and just act like it knows what it's doing, and then what I'm giving it comes back to me with all of the knowledge of anything the model was ever trained on added.

Rachel: What's fascinating about your description of that is how much it's like my practice of working with startup founders and trying to get as much context as we can to begin the conversation and then exploring for a place where we can mutually solve a problem. That's one part that's really interesting, and that's obviously coming out of your technical management background.

The other part is that you're using the language model as an augmentation to your emotional reactions to things. When you talk about that disappointment, that's your embodied intuition about how the system should behave and what the answers should be. That's what I see the AI is missing, it doesn't have that physical, visceral reaction to a wrong answer.

Mark: Yeah. It doesn't know the difference between right and wrong the way we do.

Rachel: Right. It doesn't know anything, it just knows which words connect to other words.

Mark: Well, I think it's more than that. I think it's fair to say it really does know what words mean, and possibly in a deeper sense than we do, given that it's this very deep, multilayer model, if you just think about the mathematics of it. That's about as close as I could get to a proof that words mean something. To me, that's one of the most encouraging things about LLMs.

Not long after most people learn to read, they get this feeling that words are magical, and you think about the history of words and books, and these things go back thousands of years. LLMs confirm all of our most fanciful and positive suspicions about words: that, wow, they actually are magic, they actually do mean something.

Even by themselves, even if you leave us out, they actually mean something. So I love that aspect of working with the LLMs, because it's very reinforcing that language is not just this simple mechanism for people to communicate that's all relative, based on the situation. You can say, "Oh, there is an absolute meaning." It's helpful.

Rachel: It's a sophisticated model for understanding the world, just like mathematics is.

Mark: Exactly, exactly.

Rachel: Yes, and I think there comes that point for every avid reader where you've grown up in a world of books and then you read something like Samuel R. Delany and he points to a world beyond what can be described by words, and suddenly you're aware of the air on your skin and the scenery rushing past you in the bus.

You realize there's a huge amount of reality that can't be captured in words, and that's what I'm trying to get to with the physical intuition. We learn from our experience, from the times we survived and the times we were in peril, and that builds our embodied expertise and that's what we're bringing to this interaction.

Mark: Exactly. I feel like not just us as humans and the creators of LLMs, but any animal in the world has that connection to the world.

Rachel: Wiley the guide dog.

Mark: Right. Or things that are super far away from us in the evolutionary tree, like an octopus. The commonality between us and an octopus is amazing, separated by a half billion years of evolution.

Rachel: Convergent evolution with the eyes. Their eyes are completely different from ours, but they work the same way.

Mark: Right. And their brain, part of it is in their feet and part of it is near their eye.

Rachel: One of the most moving experiences of my life, not to derail, but was getting to the Monterey Bay Aquarium right when it opened, when the Pacific Giant Octopus was really interactive, and trailing my fingers along the glass and having the octopus trail his tentacles along the glass. That moment of communication was astonishing.

Mark: Right, it is stunning. Honestly, I think most of us that are in the AI business now didn't really expect this to happen during our lifetimes, and now it's like we're meeting the octopus. This only happens once, so it's quite a moment.

Rachel: Once in a generation, hence the name of the podcast. Yeah. It's amazing. A segue that isn't really a change of direction, how do you see these kinds of generative AI, language models being harnessed in games? Are we going to see much more intelligent non player characters? And what will that do to games as empathy engines?

Mark: Yeah. Well, you already see them being used in games, but I haven't seen a game where they don't fall flat yet. I think really the issue of personality and a word I like in this category is rapport, the concept of how do you actually build a rapport. Well, you build a rapport as soon as you have a conversation with someone, but rapport built over a series of days and years, it gets incredibly powerful.

Now, in a game you don't need that long term rapport. But you do need the type of rapport that can come in moments, and rapport is a very difficult cognitive task and it's not something that's just coded in, it's not something you get for free with an LLM.

The architectural work that I'm doing with Bamboo AI is very focused on what you can do with the tools that we have available now to build rapport. Now, our goal there is not to build a game. It's an AI companion, and potentially anything that's useful about an AI companion, a space everybody is exploring. With a game, I think before this gets really good in an entertainment environment, there will have to be models that can build rapport. That's not something that's built into ChatGPT right now, not really.

Rachel: That's fascinating. So what are the elements of rapport?

Mark: I think you have to have an authentic interest potentially in the person that you're talking to. You have to learn things about them.

Rachel: Yeah. I would think maintaining state in a conversation would be a really key piece.

Mark: It's helpful. With AI, rapport is a constraint problem that is difficult. There's solutions to it. Imagine you build a rapport with a pen pal, it takes a while, it's a lot of different exchanges. But after a while, you engage your imagination and if the other person is engaged in your imagination and if you're exchanging messages in a reasonable way you really can build a rapport. But it's a lot more effort than it is when you're actually in a conversation with someone and you get all the experience of being in the world.
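One way the state-keeping Rachel mentions can be approximated today is to persist a small profile of learned facts between sessions and prepend it to each conversation. A minimal sketch, assuming the OpenAI client; the storage file, model name, and prompt wording are illustrative, not Bamboo AI's actual design:

```python
# Sketch only: carry remembered facts about the user across sessions.
import json
from pathlib import Path
from openai import OpenAI

client = OpenAI()
MEMORY = Path("companion_memory.json")  # hypothetical store

def load_memory() -> list[str]:
    return json.loads(MEMORY.read_text()) if MEMORY.exists() else []

def remember(fact: str) -> None:
    facts = load_memory()
    facts.append(fact)
    MEMORY.write_text(json.dumps(facts))

def chat(user_msg: str) -> str:
    system = ("You are a long-term companion. Known facts about "
              "the user:\n" + "\n".join(load_memory()))
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user_msg},
        ],
    )
    return reply.choices[0].message.content
```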

Rachel: The rich feedback of body language and expression and gesture.

Mark: So I think that we know how to do it, but we also know that in the world of words alone it's not very easy. Honestly, I think probably only a few people touching AI are really thinking very seriously about that problem yet, because there are so many different problems to be solved. It's just that I think that one's important for companionability.

Rachel: I think that's actually going to be a super important application. Another literary precedent that comes up again and again is people talking about Neal Stephenson's The Diamond Age and the Young Lady's Illustrated Primer. I don't know if you've read it. But that idea of a very long term relationship with an entity that's trying to make you your best self, that's really compelling and attractive.

Mark: I love that book. Years ago I gave my copy of it to my friend Andrew, who's the chief technologist at Bamboo. Yeah, we talked about it a lot when discussing what we wanted to accomplish with things like rapport and companion AI. Our dream is really the book in The Diamond Age. That's what we want to get to. What's funny is the cognitive abilities are almost practically there. But the rapport is not, and it takes, I think, quite a bit of understanding to get there.

Rachel: Are you worried at all about the risks of using these very large, closed source, commercial LLMs where they're effectively black boxes and we don't know a lot about their inputs or their internal mechanism?

Mark: I gave that one some thought because you gave me the questions last night, and I have to say honestly, at least with respect to OpenAI, and I use their offerings more than others, I'm not too worried about it. Other models that I've worked with, LLaMA 2, I'm not worried about that one really either. I can imagine commercial offerings where I would not want to talk to it. Anything you commit to writing on the world wide web, you have to be careful about it.

Rachel: Exactly. The horse has left the stable at this point. It's already ingested all of our blogs, it knows our secrets. When you are peer programming, peer architecting with ChatGPT, how do you handle hallucinations? Do you have an opinion on the fine tuning to retrieval augmented generation spectrum?

Mark: Okay. Well, I'll separate those out. The hallucinations, of course you can make it hallucinate. But when I'm peer programming with it, part of my job is to keep it from hallucinating because I know that's completely not useful.

Rachel: A therapist?

Mark: Yeah. I want it to work with me in a useful way during a session, and a way to get it to hallucinate is to use language that encourages it to engage in wishful thinking because what we call hallucinations, it's fair to say it's mostly wishful thinking. It's like jumping to conclusions without understanding.

Rachel: It's what I did for the first half of my career. "Oh yeah, I can do that."

Mark: Yeah. It's what people do all the time. So like I said, software architect for decades, I know about hallucinations because it's been part of my job to talk with programmers who were having them all day long.

Rachel: "Yeah, we can build that in three months. No worries. Just give me three mythical man-months."

Mark: So honestly, I'm aware of them but they're not in any way an impediment. I know that for somebody who maybe didn't have the same set of expectations as me who wanted to view these as some immediate source of truth, it would be a problem. But you have to think of these as a cognitive tool, not something to follow.

Rachel: It's not an oracle, it's a pencil.

Mark: Well, it's like a talking pencil that's really charming. So I was also thinking about fine tuning versus prompt engineering. Prompt engineering is probably the low-class name.

Rachel: It's just what we were saying three months ago. We call it RAG now.

Mark: We call it RAG, okay. Got it. I come down pretty heavily on the side of fine tuning. Fine tuning is not easy. It's very time consuming, it is the deep dive.

Rachel: It's very energy intensive.

Mark: Right. And so if you're looking to get a result in days or even a few weeks, it's not the direction to go. But if you get the knack for what it is really doing, and you're willing to invest the time into it, that's where you will really be able to engage the cognitive capabilities of a model.

You use prompt engineering when you want to hit it over the head and make sure that it's not going to go the wrong direction. But if you want it to truly explore the solution space, what's in the prompt is interesting. I'm just going to do an analogy with human cognition here.

What's in the prompt is like the last thing you gave somebody to read and they haven't slept on it, they don't truly understand it, they haven't compared it to other things yet. It might have made a big impression and they might even parrot things back, but they won't be able to analyze it and really engage that information.

Of course, our entire educational establishment, the university, is really based on the fact that we know this is true. You can't just ask somebody to read the book; there has to be a few days before that knowledge really gets in there.
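For contrast, here is what the retrieval-augmented, "what's in the prompt" approach looks like in miniature; the document store, embedding model, and example texts are all illustrative assumptions:

```python
# Sketch only: retrieve the most relevant document, then stuff it
# into the prompt rather than fine-tuning the model on it.
import numpy as np
from openai import OpenAI

client = OpenAI()

docs = [
    "The game fans out events over Redis Pub/Sub.",
    "HAProxy fronts the Redis cluster for client connections.",
]

def embed(text: str) -> np.ndarray:
    out = client.embeddings.create(
        model="text-embedding-ada-002", input=text)
    return np.array(out.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "How are game events delivered?"
q = embed(query)
best = max(docs, key=lambda d: cosine(embed(d), q))

answer = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": f"Context: {best}\n\nQuestion: {query}"}],
)
print(answer.choices[0].message.content)
```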

Rachel: People always ask me why I studied English at university. This is why, it takes a while to understand what a book is actually saying.

Mark: Yeah. So the thing is, fine tuning is fundamentally the same algorithm that built the base model to begin with, so you're able to access every capability that's down there. I'm a huge fan, but you need to have patience. I feel like at the companies that are operating well in this space, like OpenAI, far more of their staff is dedicated to fine tuning than to base models or other forms of research now. It's probably a good set of decisions on their part to direct it that way.
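Mechanically, kicking off a fine-tune is a small amount of code; the long pole is preparing the training examples. A sketch against the OpenAI client, where the JSONL filename and base model are assumptions:

```python
# Sketch only: upload chat-formatted examples and start a fine-tune.
from openai import OpenAI

client = OpenAI()

# Training data: one {"messages": [...]} conversation per JSONL line.
training_file = client.files.create(
    file=open("companion_dialogues.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```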

Rachel: Yeah. It's interesting and a little challenging looking at it with investor eyes and realizing what a huge advantage the incumbents have in this space because they have access to GPUs and power. Do you think there are... I mean obviously you're involved with Bamboo so you do think there are opportunities for startups.

Mark: There are. I mean, at this point you do have to... the easy leveraged thing is to align yourself with a particular model and invest in that one. So a startup right now might say, "Okay. Do I want to work with OpenAI and start with a highly sophisticated set of fine tuning, which gives me a great starting point? Or do I want to do the heavy lift myself with, let's say, LLaMA 2?"

The evidence says there's a lot of fundamental capability there, and with probably many months of engineering work going into the fine tuning, you could probably get it there. I've spent a lot of time analyzing that from different angles, and I think what I'd say now is that if you want something you can get done in a few weeks, starting with an already fine tuned model is good. It does give incumbents a big advantage.

But there's nothing about the math of this that makes incumbency a huge long-term thing. To some extent, the companies that have an advantage here have it because they've done good work, and there's some goodwill involved in trying to get it out there, because all this stuff could be really broken and really bad if you're not careful.

Rachel: Mark, you're one of the most well informed people about this space that I talk to. What are some of your sources of data? What do you watch to keep abreast?

Mark: Wow. I'll occasionally watch a YouTube video from somebody who's an important researcher, and in terms of background for this, I'll always go back to Hugging Face, just to get the foundational stuff: what do I need to do to get this working? But it's such early days with this that it's easy to do things on your own, hands on.

Some of the opinions I'm talking about were developed basically by me doing things directly, or with colleagues who are doing them directly. It's been an exciting year, and we're just immersed in all this information. This idea of what do I watch, how do you do it, also rhymes with the issue of fine tuning versus prompt engineering: watching a YouTube video or reading an article is like, "Well, you just got that into your prompt." But actually doing this and living it for a while, you fine tune yourself, and that's better.

Rachel: Right. That's a good insight. I'm going to make you god emperor of the world for the next five years. Everything is going to go as you hope it will around AI. What does the world look like?

Mark: It's interesting to think about that, because a lot of my thoughts are about what type of impression this makes on the generation that's coming up now. The people who are in university now will be out in the world five years from now, and I've been working in the industry through a lot of different transitions, and one thing I've realized is that one of the most important things you should do is build a world, build an intellectual structure, for the people who are coming in.

I would like to see the generation of people entering their most productive phase in five years really deeply understand that AI is a tool. To be honest, I would like them all to have relatively equitable, equal access to it. I'd like it to not be something that's reserved for a few people. I'd like to see it being used as a tool not just for technical software developer types, but growing into a cognitive assistant for people in all sorts of fields, including bicycle repair and carpentry. It can help everywhere. I just mentioned those two because I like both of those things.

Rachel: Who doesn't like bike repair and carpentry?

Mark: Is that a good answer? Did I-

Rachel: It's a great answer, it's a great answer. The democratization piece is incredibly important to me. The industry has been talking so long about getting from 5 to 50 to 500 million programmers, this really feels like you could legitimately bring on half a billion people and have them create applications with this technology in a way that we've never really been able to do before.

Mark: It's true. In some ways, my job has gotten way, way easier in the last year since I started using ChatGPT 4 for a lot of cognitive tasks. It helps me to context switch really quickly. Now, being the type of person I am, I've just upped the ante and accelerated the rate of work.

Rachel: You've got, what? Two, three full time jobs now?

Mark: Well, I think I have one job but I'm advisor to a lot of people.

Rachel: We know what that means.

Mark: Yeah. I think my day to day productivity on the technical tasks that are sometimes frustrating and time consuming might have tripled or more just in the last year, because of getting stuck on something where you're like, "Oh god, I'm going to have to read a manual for four hours, and then I'm going to have to worry about whether there were errors in the manual.

Then I'm going to have to read the code in order to reconcile what the errors were in the documentation. Then I'm going to trial these things, I'm going to have to test it." Anything that can short circuit that process is just a godsend. Hopefully everybody is there in five years and just doing that, but honestly I think there's probably more to it than that. I think there's going to be a transformation in intellectual work which is really huge in foundational ways that I probably can't anticipate five years before they happen.

Rachel: I think for a long time we've glorified a certain amount of toil in intellectual work and we've given accolades to people who can write a dissertation or a white paper or a long form piece of intellectual reasoning. I think that's suddenly got massively devalued in really interesting ways, and that may give an advantage to people who can think quickly and can think on their feet and adapt very quickly. I don't know.

Mark: That's actually really interesting. What's so interesting is that this is not something fundamental to intellectual work, but it's a human behavioral thing that we use cognitive tasks as gatekeeping devices. Really forward thinking educators have started to realize that, even more over the last generation or so than before.

But there are still some true believers in gatekeeping, and you'll really see it in the interview cycles for a lot of big companies. The thing is, that gatekeeping, which was always really questionable in terms of its value, is now just demonstrably a waste of time.

Rachel: I think there are some things that do need to be gate-kept, but I think proving how smart you are is probably not one of them. If you had a colony ship, setting sail for the stars, heading to Proxima Centauri, what would you call it?

Mark: Well, naming things, one of the hard problems in computer science. Okay. I'm not going to fix on one name, but I think I'd name it after a tree most likely.

Rachel: I love trees.

Mark: If we were ever able to send something to another star system or to another planet, if you're able to make a tree grow some place it's like, "Wow." To me, trees are in charge of running the planet. We just haven't, as humans, recognized that. But we're starting to put it together that, hey, maybe it wouldn't rain at all without trees. So I'm thinking that trees is where I'd go with the naming.

Rachel: I'll never forget the first time I went to Muir Woods and met the old growth forest there and realized that we are mayflies to them. We appear and disappear in the blink of an eye, to a tree. We are so small and so young.

Mark: Yeah. They don't necessarily do what we think of as cognitive tasks, but they process a lot of information and fundamentally they're responsible for the planet functioning.

Rachel: I cannot think of a better name for a generation ship. Mark, thank you so much. This was a great conversation.

Mark: Thank you.