Generative AI in Energy: 101

In this first of a 3-part Solar Conversation series, Jon Bonanno and Kerim Baran interview MIT professor and Climate Change AI (CCAI) co-founder Priya Donti. During the conversation, they talk about:

      • Generative AI and machine learning: what they are, how they have evolved, the main approaches behind current AI and machine learning applications, and where they are already being used with high impact.
      • The potential of generative AI and machine learning in addressing climate challenges, and their most impactful applications today.
      • How these technologies can work together to tackle complex climate issues. Through examples, Priya showcases their transformative power and inspires others to explore their potential for driving sustainable change in the face of the climate crisis.

You can find this same Solar Conversation broken into chapters and fully transcribed below.

Panel Discussion on AI Solutions for Climate Issues in the Business Sector (5:59)

The Synergy between Machine Learning and Artificial Intelligence (5:59)

The Three Major Paradigms of Machine Learning (5:19)

Unsupervised Learning: Discovering Structure in Unlabeled Data Sets (4:09)

Reinforcement Learning for Optimizing Complex Systems through Interactive Agent-Based Training (5:39)

Exploring Reinforcement Learning in Transmission and Distribution Companies (5:49)

The Landscape of Machine Learning Paradigms, Models, and Applications (6:55)

Exploring the Potential of Machine Learning - Use Cases, Strengths, and Data-Driven Insights (3:01)

The Challenges and Biases in Machine Learning (9:58)

Highlights from the Conversation with Priya Donti (1:07)

The transcription of the video is below. 

Panel Discussion on AI Solutions for Climate Issues in the Business Sector

Kerim: Hi everyone, this is Kerim, Kerim Baran with SolarAcademy. I have with me today Jon Bonanno of Strategic Operating Partners, and we are hosting Priya Donti, a newly appointed computer science professor at MIT, focusing on AI solutions specifically for climate issues. Welcome, Priya. Good to have you here.

Priya: Thanks for having me. 

Jon: Well, so Kerim, I think we should start by just putting this question out there. As we always do with SolarAcademy, we always put out a question as to what the purpose of this is. And I think the purpose is that we have this new artificial intelligence toolbox, or process, that it is possible to apply to climate solutions, and that may or may not affect certain businesses in our beloved sector. And so, as part of SolarAcademy teaching others to be empowered, I think this is really an opportunity where people can learn about these tools or potential tools: how do they apply them, how might they affect their businesses? And, you know, really get off to a good start with this artificial intelligence investigation.

Kerim: That sounds like a good plan. And Priya, please remember that we launched SolarAcademy because, especially in the business world, we’ve observed that many professionals, especially customer-facing and partner-facing ones, find themselves repeating the same information again and again and again between meetings, between calls, between conferences. And we just wanted to capture it on video and audio so that it can be repeatedly shared in a more efficient way. So whatever you find yourself explaining a lot these days, especially in the last, you know, three to five months with ChatGPT coming out and, you know, all sorts of solutions popping up, it would be great to capture those snippets from you that you find yourself repeating in the real world.

Jon: All right, so let’s jump into the content. We are very grateful to have Priya Donti here, who, as Kerim mentioned, is a newly appointed professor at the Massachusetts Institute of Technology, better known as MIT, and we’re super excited to have her give us a little tutorial on some basic building blocks. And we plan to have more than one session here, so you’re not gonna run out of content. Priya, I’ll kick it off to you.

Priya: Yeah, so I guess to introduce myself a bit more, I wear two hats. I am the co-founder and executive director of Climate Change AI, which is a global nonprofit that really tries to bring together a community of, you know, innovators and practitioners to accelerate climate action using AI. And you’re gonna hear today about, you know, various ways what that actually means. In addition, as Jon mentioned, I’m an incoming assistant professor at MIT, and my research focuses on how you dynamically optimize power grids to foster the integration of, you know, variable renewables, and do this in a way that’s really cognizant of the data, the sensing that you’re seeing on the system, but at the same time respects the physics and constraints and all of that good stuff that you really need to satisfy in order to not break your power grid. So my work comes in this realm of what’s called physics-informed machine learning: how do you bridge machine learning and data-driven techniques with the physical and engineering knowledge we already have, to try to get the best of both worlds.

Jon: That is an incredibly good topic, and I encourage all of our viewers to also access the sessions we did with Ted Thomas around untangling the grid, because not only is this a physics and an energy problem, but it’s also a politics and policy problem. So we encourage people to go take a look at those sessions we did with Ted, and the upcoming sessions we’re going to do with some FERC chairpeople, coming up soon but not ready yet. So Priya, this is a very, very time-sensitive, on-point conversation. Please lead.

Priya: Yeah, absolutely. And I mean, I know that AI has come into the public consciousness, you know, in a huge way over recent weeks because of ChatGPT in particular and other similar models. But in some sense, actually, the story of AI, you know, is not new.

AI was invented back in the 1950s and actually started not with machine learning techniques, but with symbolic techniques: basically, can I write down a set of rules that describe the problem I’m trying to solve? For example, the game of chess. Can I just write down what the rules of chess are and find some algorithm that’s able to reason over those rules in order to do something?

So back in ’97, Deep Blue, for example, was an AI agent that actually beat, you know, the world’s greatest chess player at that game. But since then, especially in the last few years, machine learning has become kind of the predominant paradigm. And machine learning is, you know, the set of techniques that learn from data, large amounts of data, and machine learning has been around for a bit as well.

I mean, Siri in your smartphone, kind of in the last, you know, 10 years, is powered by machine learning. There were some advancements, you know, in the last five years around machine learning being used to achieve superhuman performance on games like Dota 2, or, you know, to help solve the 50-year-old protein folding problem, the AlphaFold innovation.

And then, of course, all of the things we’re seeing with GPT, Bard, and DALL-E. So maybe given that history, what I wanted to do is just dive in with some definitions: really just talk about what machine learning is and what the different types of machine learning are, and then maybe get into what that actually means for kind of energy systems and the clean economy.

The Synergy between Machine Learning and Artificial Intelligence: Bridging the Gap and Exploring Applications

Priya: So, I think that’s my cue to start sharing slides.

Jon: That it is, and we will have these slides linked below, so you’ll be able to access them as well. So do not worry.

Kerim: And Priya, will you also speak to, maybe not today but in a future session, the relationship between machine learning and AI, and, you know, where the break-off is and what the different use cases are for these different approaches?

Priya: Yeah, so I think that’s actually a great preemption for my first slide here. So, yeah. So what is machine learning, exactly? When we talk about machine learning, we do wanna talk about what artificial intelligence is first. So artificial intelligence really refers to any algorithm that allows a computer to perform a complex task.

Sometimes these are tasks that we associate with human intelligence, so speech or reasoning or vision, but sometimes they’re not, right? Machine learning and AI are used to predict, say, solar power output. That’s not canonically something a human would do. So really it’s just any complex task.

As I mentioned, initially when AI started, it really was about just writing down a formal set of rules and figuring out some way to reason over those sets of rules. But today, we’ve seen a huge resurgence of machine learning. And machine learning is a subset of artificial intelligence, and it specifically refers to techniques that automatically extract patterns from data, and usually large amounts of data.

And so, because machine learning has become such a dominant paradigm within AI, you’ll actually hear these terms used interchangeably quite often, and I’ll do that often as well, kind of interchangeably use AI and machine learning. But AI is actually a bit of a broader concept that encompasses symbolic techniques, you know, other kinds of techniques like physics-informed techniques, things like that.

Whereas machine learning is specifically a kind of automatic, data-driven analysis.

Kerim: Can I ask you a layman’s question? So, for example, think of a big insurance company with all the data that they have access to: all sorts of things that they insure and things that go wrong, and the analysis of that data set to figure out how to price their premiums and how to, you know, create their products.

I assume that would be using more of the machine learning subset of the technology; I guess we wouldn’t call that artificial intelligence. But when I ask a question to ChatGPT and get, you know, a page of text as an answer, I mean, it could be analyzing a key character from a play, and, you know, is that called artificial intelligence?

Priya: Yeah, so I’d say both of those things are machine learning and, as a result, both of them are also artificial intelligence, in the sense that even GPT, and I’m sure we’ll get into this more later on, all it’s doing is analyzing some structure in its underlying data and parroting that back out.

So there’s actually this famous paper, Stochastic Parrots, which describes these kinds of models as basically models that really try to understand or analyze or learn some kind of structure or pattern in the underlying data. And then with some random probability distribution, basically just spits something out that looks kind of like that.

And so even though it seems more intelligent on the other side, it’s actually not inherently any more intelligent than just doing large-scale analysis of your insurance data. It’s actually something very similar under the hood.

Kerim: Got it. They are kinda making it look like it’s different, like it could be different every time you ask the question. I guess there’s some sort of variable in there that causes it to come out slightly differently every time.

Priya: Yeah, and that’s just the “stochastic” in stochastic parrots, basically. As you put in certain kinds of data, there is some kind of randomness in terms of what it spits out. Yeah.

Kerim: That makes sense.

Jon: But Kerim, you’ve raised a very good, climate-solutions-focused example, which is insurance, because with increases in extreme weather, you’re gonna have new underwriting requirements.

Kerim: But that takes me to the parroting word that Priya just mentioned, because each extreme event is more extreme than the previous one, at least it seems so in our lifetime, right?

Jon: So looking exclusively at historic data is not gonna work. So you have to take the next step, which is predictive.

Kerim: Right? And how… 

Priya: Yes, exactly.

Kerim: …that gonna be? So, are those coming out accurate? I guess that is the question.

Priya: Exactly. And that is the real question. And this is where this kind of, you know, area of physics-informed or knowledge-informed machine learning comes into play. So, for example, if you are an insurance company that’s trying to predict what climate risk will look like in the future, one thing you could do is just try to learn from your current weather data using a machine learning model and predict that out. But as you’re observing, because the data you’re learning from is fundamentally different from the situation going forward, you’re not gonna get that.

But what people have done is things like: say I have a climate model that has my physics written down, but there are some parameters in there where I don’t know what the exact values should be. So can I actually use data to inform what the parameters of my climate model are? And that climate model, it’s the physics that will continue to persist.

That climate model can actually generalize and make predictions we can trust in the future. So I’m increasingly a big fan of these, you know, domain-informed, physics-informed, knowledge-informed approaches, as opposed to just scraping a bunch of random data from the internet and hoping something emerges from that.

Priya: Yeah.

Kerim: Sounds good. All right. Please continue.
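To make the parameter-calibration idea above concrete, here is a minimal Python sketch with a purely hypothetical physical model: the functional form is fixed by the (toy) physics, and observed data is used only to estimate its unknown parameters, so extrapolation stays tied to the physics rather than to the training data alone.

```python
# Minimal sketch of "knowledge-informed" fitting: the functional form is fixed
# by (toy) physics, and data is used only to estimate its unknown parameters.
# The exponential-relaxation model here is purely illustrative.
import numpy as np
from scipy.optimize import curve_fit

def physical_model(t, k, c):
    # Assumed physical form: exponential relaxation toward a baseline c.
    return c + (1.0 - c) * np.exp(-k * t)

# Synthetic "observations" standing in for measured data.
rng = np.random.default_rng(0)
t_obs = np.linspace(0, 10, 50)
y_obs = physical_model(t_obs, k=0.7, c=0.2) + rng.normal(0, 0.02, t_obs.size)

# Fit only the parameters (k, c); the structure of the model stays fixed,
# so extrapolation beyond the observed range remains physically plausible.
(k_hat, c_hat), _ = curve_fit(physical_model, t_obs, y_obs, p0=[1.0, 0.0])
print(f"estimated k={k_hat:.2f}, c={c_hat:.2f}")
print("extrapolated value at t=20:", physical_model(20.0, k_hat, c_hat))
```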

The Three Major Paradigms of Machine Learning

Priya: Yeah. So, yeah, and I did wanna get into, and I think some of our discussion has already started here, kind of what we mean when we say machine learning. The core concept is, again, very simple: automatically analyzing large amounts of data. But this actually plays out in a couple of different ways.

And so there are kind of three major paradigms for machine learning. So supervised learning, unsupervised learning, and reinforcement learning. And I wanna talk about each of those in turn. So supervised learning is a situation where you have a fixed data set that has labels or answers on it, and you’re trying to learn some relationship between your raw data and the answers that you have labeled on it.

So a really, hopefully easy-to-grasp example there is if you have a ton of images, each of which is labeled dog or cat. You can imagine a machine learning algorithm trying to figure out: okay, I’m taking this input, the raw pixels of the image, I’m analyzing something about it, I’m making some kind of guess.

And at the other side, I’m being told, am I right or am I wrong? Is it a dog or is it a cat? And then, through kind of iterative feedback, continuing to try to make those guesses and updating its parameters based on the signal on the other side as to whether it was right or wrong, the machine learning algorithm gradually learns this mapping from an image to your label of dog or cat.

One way to think about kind of supervised learning tasks is, these are situations where a human could do exactly what the machine learning model is doing and probably better. I trust myself.

Kerim: Like Amazon’s Mechanical Turk, for instance.

Priya: Yeah. Or, or like me, right? Like, I mean, I trust myself in some sense to look at an image and tell you if it’s a dog or a cat, except maybe in edge cases where there are some species that look kind of similar.

I can pretty confidently tell you I’ll be able to do that. Your machine learning algorithm may be wrong every now and then, but it can do it at much greater scale. I can’t label a hundred million images for you; a machine learning algorithm can. And so supervised learning is really where you want an algorithm to do something at scale that a human could do, but where you’re okay with it being done slightly worse, because of the scale.
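To make that supervised learning loop concrete, here is a minimal Python sketch using scikit-learn. Synthetic numeric features stand in for the raw image pixels, and 0/1 labels stand in for the human-provided dog/cat annotations; the data and model choice are illustrative assumptions, not anything from the conversation.

```python
# Minimal supervised-learning sketch: labeled examples in, a learned
# input-to-label mapping out. Synthetic numeric features stand in for
# image pixels, and 0/1 labels stand in for "cat"/"dog".
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))                   # raw inputs ("pixels")
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # human-provided labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = LogisticRegression()
model.fit(X_train, y_train)                      # learn the mapping from labeled data

# The trained model now labels inputs it has never seen, at scale.
print("held-out accuracy:", model.score(X_test, y_test))
```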

Jon: But in all cases, supervised learning always includes a human or does not include a human?

Priya: So it involves a human to actually create your data set. So you do have to make sure that, yeah, you have the ability to have a clean data set where you have already established what the labels are. And once your machine learning algorithm learns from that, it can now take an image that is similar to what it’s already seen, but is not exactly what it’s already seen, and map it there. And so this really, you know, brings up really important questions about, well, how do you get a massively labeled data set? Who is able to label that? Where does the label come from? Under what conditions is the label coming, and what’s the pay around that, and where does that…

Kerim: And where does that intellectual property reside? Like, all that labor is capital investment. So who owns that, too?

Priya: Exactly. And these are kind of, you know, important and open questions. I mean, this comes up with GPT, which is not an instance of supervised learning, and I’ll get to that. But even just in other settings where you have large amounts of data that are being analyzed as input to a machine learning model.

Kind of how do you deal with the IP behind those images, especially depending on, you know, whether the outputs that come out on the other side are really deeply informed by the data that went in. So these are all really important, and not open in the sense that, you know, there are people who have studied these questions, but I think it’s high time that we bring those theories into practice at this point.

Jon: Would you say the internet at large, being basically servers worldwide with information at the edge, is in most cases made up of labeled data sets or not labeled data sets?

Priya: I would say that’s mostly not labeled data sets.

Jon: So a labeled data set example would be something like satellite images captured by X company that say, you know, this is a rooftop that has X amount of irradiation on it, it faces southwest, blah, blah, blah, this triangulation, et cetera. That would be a labeled data set, but it would also be considered a closed data set, right?

Priya: Yeah, so that would be a labeled data set, and a labeled data set doesn’t, you know, inherently have to be closed or open. This is where sort of data-sharing policies and things come into play. So in the academic community, there are actually some common, what we call benchmark, data sets that are open for the community to use. But then additionally, again, it does take a lot of resources, both, you know, equipment resources and human resources and expertise, to really get these data sets together.

And it does mean that entities that collect them are often those with the most power and finances in society. And then they also often have an incentive to keep them proprietary because they wanna generate value from them in some way that others can’t. So a lot of stuff to do there with incentives and how we really open up the landscape.

Unsupervised Learning: Discovering Structure in Unlabeled Data Sets

Priya: Yeah. Yeah. But then, getting at this point of, you know, is the data online labeled or not? It’s not, and in some sense, the vast majority of data we have in the world is not labeled. And so there’s actually another branch of machine learning called unsupervised learning, which really deals with: in a case where you don’t have answers about your data set, is there still some way that you can learn some underlying structure in your data set?

And use your knowledge of that structure to do something. So one example is clustering. So if you have a bunch of images and you don’t know what they are, you could still try to figure out some machine learning algorithm that sorts images into buckets based on which images are the most similar to each other.

And then maybe a human can go later on and say, oh, that’s the cat cluster and that’s the dog cluster, or something like that. But the algorithm doesn’t know that. It’s not able to assign semantic meaning or anything like that to that cluster. It just knows that those things are similar. And even though it’s not shown on this kind of base image, which I don’t have an attribution for because it actually pops up everywhere on the internet without attribution.

So apologies to whoever the original creator is. But generative AI is a type of unsupervised learning, where basically you’re asking your model to take in a data set, in the case of GPT, the internet, and learn something about the structure. So what does a coherent sentence look like? And then, when prompted, spit out something that fits what the model thinks a coherent sentence looks like. And I’m using the word coherent, you know, specifically: it’s not a truthful sentence, it’s not necessarily a quality sentence, it’s a sentence that matches the linguistic patterns that we see on the internet. So GPT is basically doing next-word prediction. It’s saying, I have this data set of the internet; if I see a certain set of words, I know from my data what word is usually next. So let me just figure out, when I see a particular sentence, what word is next, and try to generate something like that for you. But again, there’s nothing inherently encoded in there about kind of truth or knowledge. It is really parroting what it thinks it sees in the underlying data.
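A deliberately tiny illustration of that next-word-prediction, stochastic-parrot idea: a bigram model, vastly simpler than GPT, that only records which word tends to follow which in its made-up training text and then samples from those counts.

```python
# Toy "stochastic parrot": a bigram model that records which word tends to
# follow which (repeats encode frequency), then samples from those counts.
# It encodes no notion of truth, only the patterns in its (tiny) training text.
import random
from collections import defaultdict

corpus = ("the grid needs storage . the grid needs flexibility . "
          "storage helps the grid . flexibility helps the grid .").split()

next_words = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    next_words[current].append(nxt)              # record observed continuations

def generate(start, length=8, seed=None):
    random.seed(seed)
    word, out = start, [start]
    for _ in range(length):
        word = random.choice(next_words[word])   # sample a plausible next word
        out.append(word)
    return " ".join(out)

print(generate("the", seed=1))   # fluent-looking but purely pattern-driven output
```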

Jon: You’ve mentioned images a couple of times, so I assume video and still images, but also words. And I assume that between supervised and unsupervised learning, they both apply equally to an image or a video or words. Is that right?

Priya: Yeah, so the general kind of paradigm, you know, applies to any kind of input data, be that images or words or physics data or anything like that.

Jon: Numbers or something, or

Kerim: Math, mathematical statements and equations.

Priya: Yeah, so in principle you can use any of those things; in practice, some of those problems are harder than others. Getting a machine learning model to output the correct answers to mathematical questions is harder to do from raw data than other things, in part because we’re less forgiving if an output is wrong. If an image in the machine learning output is a little bit blurry, we tend to be like, okay, but it still looks pretty real. If a math problem is a little bit wrong, it’s wrong. So, yeah, there’s an extent to which our evaluation of these things also affects how we feel about the inaccuracies of the output.

Jon: Yeah. You’ve underwritten the home in Florida just a slight bit off, and your insurance policy is underwater, literally.

Priya: Exactly. So, yeah, this general question of thinking about what is a sort of, critical setting where we really need to trust the output versus one where we’re okay with it being a little bit wrong. These are very different sets of settings and the way we should think about using machine learning in those settings is truly different.

Reinforcement Learning for Optimizing Complex Systems through Interactive Agent-Based Training

Priya: Yeah. So maybe jumping into the third branch of machine learning, or a third branch, because, you know, there are things that fall outside of this taxonomy. But both supervised learning and unsupervised learning, in some sense or the other, look at a fixed data set and learn something about it.

There’s another category called reinforcement learning, which basically involves training a machine learning model that’s interacting with some kind of quote-unquote environment. So this could be like a physics engine, this could be a power grid. This could be like a game engine for a multiplayer online game.

But what your model is doing is it’s trying to take some actions on that environment. So in the case of a game, what should the player do? The model might be trying to learn some kind of policy for how to take actions for that player. Or if you are trying to learn how a battery should charge or discharge on your power grid, then the action might be, you know, how much should I charge or discharge?

And what the machine learning model does is it tries an action. It gets feedback from the environment about how good that action was. So, you know, in the case of the battery, how much money did I make? Did I destabilize the grid? And then, based on that, it updates its underlying parameters, the ones that enable it to generate its recommendation for what action you should take.

And it just does that over and over again: tries to take some kind of action, gets some feedback, and repeats. And so that’s reinforcement learning, and in a lot of cases where we’re talking about optimizing a heating or cooling system or optimizing a power grid or something like that, something where you’re actually trying to think about what actions we should be taking in an automated way, you’re often talking about reinforcement learning or something similar.
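Here is a toy sketch of that try-an-action, get-feedback, update loop: a tabular Q-learning agent deciding when a hypothetical battery should charge or discharge against a made-up price environment. The prices, state space, and reward are all illustrative assumptions; real grid applications need the simulation and safety considerations discussed later in the conversation.

```python
# Toy reinforcement-learning loop: a battery agent repeatedly tries
# charge/discharge actions on a made-up daily price "environment",
# gets a reward (profit), and updates a Q-table. Purely illustrative.
import random

prices = [0.05, 0.05, 0.30, 0.30]        # hypothetical: cheap early, expensive later
ACTIONS = ["charge", "discharge", "idle"]
q = {(t, soc, a): 0.0 for t in range(4) for soc in (0, 1) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 1.0, 0.2

def step(t, soc, action):
    # Environment: reward is money made or spent; state of charge is 0 or 1.
    if action == "charge" and soc == 0:
        return -prices[t], 1
    if action == "discharge" and soc == 1:
        return prices[t], 0
    return 0.0, soc

random.seed(0)
for _ in range(5000):                     # many episodes of trial and feedback
    soc = 0
    for t in range(4):
        if random.random() < epsilon:
            a = random.choice(ACTIONS)                        # explore
        else:
            a = max(ACTIONS, key=lambda x: q[(t, soc, x)])    # exploit current guess
        reward, soc_next = step(t, soc, a)
        future = 0.0 if t == 3 else max(q[(t + 1, soc_next, x)] for x in ACTIONS)
        q[(t, soc, a)] += alpha * (reward + gamma * future - q[(t, soc, a)])
        soc = soc_next

best = {(t, s): max(ACTIONS, key=lambda x: q[(t, s, x)]) for t in range(4) for s in (0, 1)}
print(best)   # tends toward: charge while cheap, discharge once prices rise
```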

Jon: That’s incredibly helpful. When you look at the triangulation of just energy storage assets, for example, you’ve given the example, but also the state of charge of the battery itself, the operation of that battery in the most optimal manner so that you’re not degrading it over the period of time that you care about financially. But how does it respond to the grid? What was your outcome financially, all that. I mean, it’s just like an enormously complex set of information and to have a reinforcement learning, what would you call it? Tool? Agent? How, how do people characterize these in actual practice?

Priya: Yeah. So you’d call it a reinforcement learning agent, exactly.

Jon: Okay. 

Priya: The tool also works, depending on sort of whether it’s the whole solution or a part of it.

Jon: Okay.

Priya: Yeah. And so I think these can be incredibly helpful. There are some caveats. One is that you often need either a real system you can test on, or a simulated environment that is, you know, realistic enough that you have some confidence that if your agent is learning to do something on your simulated system, it actually tells you something about how the agent would behave on your real system.

Jon: And how long do you think that takes? Let’s say it’s a qualified team with real assets and they’re gonna do a parallel practice. You know, they’ll have a practice system that’s really not connected to anything it could hurt. And then you have a real system that you’re just running, and some person is sitting there going, okay, on, off, on, you know, doing it manually. What would be the timing? Is this like two weeks? Is it 15 years? When do you think people can get enough confidence with this reinforcement learning agent to let it go?

Priya: Yeah, I mean, I think it can be a matter of years and in some sense it has to be in the sense that we do just really need to be dynamically optimizing power grids to be responsive to, you know, higher renewables integration.

But there’s work to be done if we really wanna realize that timescale. So these realistic simulators, for example, don’t exist. And there are efforts to build, you know, quote unquote digital twins of the overall power grid, which is, you know, really a real-time snapshot of what your power grid is doing.

But I would say that’s a solution for reinforcement learning algorithms that have already advanced far enough along the readiness pipeline that you’re in that last stage of testing. We don’t even have simpler simulation environments that are not pulling in all your real-time data, but where a power system operator, for example, has sat down and agreed: this is reflecting how my grid would actually work.

You know, what kinds of anomalies or intricacies I would actually see in practice. We need even just that basic simulation infrastructure, along with agreed-upon metrics, so that if your method succeeds on the simulation infrastructure, you know what success means and you know what’s next in order to test it out.

And the other thing is, reinforcement learning methods, you know, they’ve achieved really powerful performance in the last years, but they can fail, and potentially catastrophically, in unexpected ways. And so, bringing back this theme of, you know, physics-informed or domain-informed learning, there’s some work, including my own, that really tries to bring together concepts from robust control and control theory, where you can actually prove something about what your controller is doing, with reinforcement learning. So that you’re not just depending on seeing in real life how your algorithm performs and kind of crossing your fingers that it won’t break anything, or hoping, making your best guess, but really: can you actually provably say something, based on the mathematics of the system, about what kinds of failures, you know, won’t happen?

Exploring the Application and Challenges of Reinforcement Learning in Transmission and Distribution Companies

Jon: I’ve got two questions here, which is, you have direct experience at a major transmission and distribution company. I’m not gonna name their name, but, how insightful are they on this topic of reinforcement learning? Because this would seem like a very amazing tool for someone that owns transmission and distribution assets.

Kerim: Can I add one more thing to that? Before they get to the reinforcement learning, or deploying the reinforcement learning agent, how long do they have to do supervised or unsupervised learning? Do they even do unsupervised learning in the context of a power utility company?

Priya: It’s a great question. So I would say that, you know, it’s not that you go from supervised to unsupervised to reinforcement learning; it’s not exactly a progression. They’re paradigms that solve different problems. So if you want to do something like predict how much solar power output you’re gonna get from your solar panel based on the weather and, you know, time of day and all of that…

Then that’s pretty, you know, nicely framed as a supervised learning problem, where you have some information about your historical weather, historical inputs, and then your solar power output, and you can just map between them. Yeah. But then the question is, if I have a prediction, what do I do with it?

What action do I take? And that is not something that, you know, supervised learning or unsupervised learning canonically would solve. That is just canonically a reinforcement learning problem, where you’re trying to learn some kind of action. So this algorithm, do you bring on the…

Kerim: Do you bring on the peaker plant or not?

Priya: Exactly, yeah. And these algorithms do kind of, you know, interplay with each other. But they don’t have to; in some sense, I could produce a manual prediction of solar power output and it would still be a reasonable input to my reinforcement learning algorithm in order to do something automated.

So yeah, they solve slightly different problems. And Jon, to your question about, you know, the readiness, I guess, of different system operators: because these things solve different problems, and because the situation around the availability of data and simulators is very different for them, I would say that many power system operators actually already widely use machine learning for what I would call situational awareness problems. So, I want to know what my solar power output will look like; I want to know what the state of my system looks like. Because "I want to know…" leads somebody to be able to take action, in a way that the system operator views as being trustworthy, based on that information.

Whereas reinforcement learning is very much about having the machine learning algorithm itself take some kind of action. And there’s, you know, understandably more hesitance around that, because you really can break your power grid if you do something wrong. And so one kind of publicly available challenge is actually by RTE France. They run a public challenge called Learning to Run a Power Network, or L2RPN. They’ve actually created some basic simulation environments that try to reflect some of the realities that the French system operator is seeing on their power grid, and basically present that as a challenge to the machine learning community to say, can you come up with a reinforcement learning agent that, either, you know, automatically, or in some of their challenges in a human-in-the-loop way, is able to come up with good actions that we can actually take on our grid?

Jon: At 80% nuclear, it’s an easier problem to solve.

Priya: Gotta start somewhere.

Jon: Totally agree. But this simulation requirement, like these environments, reminds me a lot of the late nineties, where, you know, mobile phones were just becoming very, very popular, and everyone was trying to do apps on these mobile phones. But of course, the mobile phone companies and the carriers didn’t want you to play with live ammunition. They didn’t want you on the real devices right outta the gate. Now, it’s a very different problem if your phone crashes than if the power grid crashes, so I can understand the level of safety and reliability is very, very different. But it seems like there’s a place in this new world for power simulation environments where people can test their tools safely, and it can raise the awareness and trust of these transmission and distribution owners. So there’s probably a whole business in just making simulation elements or environments, is that right?

Priya: Yeah, absolutely. And I think there’s a lot of utility and value, utility in the sense of usefulness as opposed to power utilities, in making these kinds of simulation environments available. But it’s the classic problem that comes up when you look at the startup landscape: everybody wants to build the things that create profit and value on top of the infrastructure that already exists. Often you have some kind of, you know, public goods issue when it comes to actually building the infrastructure that everybody can build off of. And so I think clever business models can help with this, but I also think this is a place where public funding and public-private partnerships can really play a huge role in kickstarting it.

Jon: So the Department of Energy, universities, these kinds of places are the place to incubate these kinds of environments to test this stuff. You think that’s the best place?

Priya: Yeah, and I think, again, in collaboration with startups and others who would be the users on the other side. I think you don’t know if a system is useful until you actually try to use it, and knowing that sooner rather than later, even in the early stages of development of these kinds of systems, is incredibly important.

Jon: Fantastic. 


The Landscape of Machine Learning Paradigms, Models, and Applications

Priya: Yeah. So maybe zooming back out, getting at some of the questions you asked earlier, Jon, about, you know, images and videos and texts and all of these things, right? You know, how do they play into these different paradigms? So I wanted to actually slice the kind of machine learning world in a couple of other ways.

So I talked about this, you know, what kinds of use cases or what kinds of data or environment do you have? And that’s the supervised versus unsupervised versus reinforcement learning paradigm. But there are also different kinds of machine learning models. So for each of these paradigms, there are different models that you’re using to somehow map between your data and whatever output you want.

And these models can be used within supervised, unsupervised, or reinforcement learning, but you’re basically trying to figure out what the actual engine is that’s mapping between my inputs and outputs. And so one that often comes up, and I think is very present in people’s minds, is artificial neural networks, also known as neural networks. And when you hear the word deep learning, it just means a machine learning paradigm that’s using artificial neural networks. They’re called artificial neural networks because they’re loosely inspired by the neural networks in our actual brains, I will say very loosely, and there’s sort of genuine debate about how real that connection is. But if you hear neural networks or artificial neural networks or deep learning, they’re roughly referring to the same thing.

Jon: At the least, it’s good marketing.

Priya: Yes, it’s great marketing, but also, there are lots of others. So, you know, deep learning is not the only type of model, or, you know, neural networks are not the only type of model. There are things like support vector machines and decision trees and vision models, and so forth. And I’m not gonna get into the details there, but what’s important to know is that if you wanna use machine learning, you don’t have to use deep learning.

There are often other models you can use, and some of these are really simple to implement. There are open libraries in Python, for example scikit-learn, where you can, in just a couple of lines of code, spin up a support vector machine or a decision tree or something like that. So some of these earlier models are more accessible than one might think.

And in a lot of practical settings, your underlying data is imperfect or noisy, right? I’ve talked about all of these situations where you need a perfect data set to do something, but in a lot of situations where data is imperfect or noisy or something like that, we have often seen these simple models do quite well, whereas the more complicated models are maybe more brittle to these kinds of things, or otherwise just need additional research before they’re ready for those real settings.
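For a sense of what "a couple of lines" looks like in practice, here is a minimal scikit-learn sketch that spins up a decision tree on a built-in toy dataset; the iris data is just a stand-in for whatever tabular data you actually have.

```python
# "A couple of lines" with scikit-learn: a decision tree on a built-in
# toy dataset (stand-in for whatever tabular data you actually have).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(clf.predict(X[:5]))        # predictions for the first five rows
```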

Kerim: Are most of these open source right now, or are there private companies building such models as well?

Priya: Yeah, so the frameworks to create the models are largely open source. Scikit-learn I mentioned, but also TensorFlow, PyTorch, and JAX are popular frameworks for coding up your own, you know, machine learning or deep learning algorithms. The models themselves: when a company creates a model and, you know, trains it for some kind of purpose, is the model itself often public? Often no. In the research world, yes. But, I mean, for example, you know, GPT is not a publicly available model. And this has all sorts of implications for who can actually build on it and who is able to audit it, to really understand what’s going on under the hood and what needs to be improved from a public perspective.

Kerim:  Right.

Jon: You mentioned something that was important there. So most of the model work and the way you’re implementing this stuff is done through the Python programming language. So if someone was, say, at a technical school right now, listening to this and saying, hey, this sounds super interesting, how do I begin? Maybe learning Python is an entry point to being productive here. Is that right?

Priya: Absolutely. So Python, and some people are starting to prefer, you know, Julia, another open-source language, but Python is a great kind of entry point into this. And again, I think it’s one of those things where the Python language itself is open and these frameworks on top of it are open. It really is a situation where a lot of people can just get going with these kinds of tools.

Jon: Okay. Good. Thank you. 

Priya: Yeah. And then, yeah, I guess the last, you know, taxonomy I wanted to provide here is that machine learning models can take in different kinds of data. So you can have, again, supervised, unsupervised, or reinforcement learning, so different settings, and you can have a different model in those settings that’s mapping from the data to the outputs. But then, what does the data itself look like? And there are a couple of terms you might hear in the world, and what these terms are doing is really describing what the data at hand is. So if you hear the term computer vision, it’s referring to a situation where your input or your output is images or videos. If you hear natural language processing, it means that your input or output is, you know, text or speech or natural language of some sort. And if you hear robotics, it just basically means that you’re trying to enable some kind of embodied agent to move around or do something in the real world, and so your inputs and outputs are often physical data or sensory data of some other sort.

Jon: Is there a special data case for math or for some sort of arithmetic, or that type of situation?

Priya: It’s a great question. I haven’t heard sort of a keyword for it. That doesn’t mean one doesn’t exist, but there isn’t, as far as I know, like mathematical processing as a particular use case. 

Jon: Got it.

Priya: And I think partially this is historical, just what data sets exist and have been, you know, readily centered as benchmarks in the machine learning community. And yeah.

Jon: Yeah, well, a company that I’ve worked with is doing computer vision around transmission and distribution assets, and they capture helicopter videos of these remote assets, and drone videos and pictures, etc. And then only a tiny, tiny percentage of it is analyzed by humans, because it’s just overwhelming, and we have clearly not done a fantastic job at upkeep and operation and maintenance of this infrastructure. And so having the ability to, as you say, 15 of these images you could probably do, but, you know, 15 billion would be hard. This might be a great application set for it.

Priya: Absolutely. Yeah. And this is one where, yeah, that use case of just predicting outages in transmission and distribution infrastructure, doing vegetation management. Buzz is doing it, EPRI is doing it, and there are a bunch of others. And this is, I think, just a really incredible use case for machine learning.

Exploring the Potential of Machine Learning – Use Cases, Strengths, and Data-Driven Insights

Kerim: What are some other use cases? 

Priya: Yeah, so I’m happy to jump into that part if you’d like. Actually…

Jon: No, no, no. Let’s hold off.

Kerim: That’s a good heading there.

Jon: All right. All right. But let’s not jump in immediately, because the next session, I think, will be the “here’s a practical example of applying some of this stuff.” So…

Priya: Yeah, absolutely. And I think if we have time today, I’ll try to give a high-level, quick taxonomy teaser to kind of provide a groundwork for that. But before that, let’s definitely dive into the machine learning strengths and weaknesses. All right. And I think we’ve talked about a lot of these.

So strengths: you know, machine learning is good at performing tasks at scale. So again, scaling human intuition in order to do something at scale, maybe at slightly lower quality than a human would’ve done it, but at scale. Another one is optimizing complex systems, and this is related to this idea of reinforcement learning that I mentioned before.

Really taking in data and other information about your system and being able to provide some notion of, you know, what actions you should take, and really learning that in a nuanced way. Another one is creating derived data. So taking, for example, large amounts of satellite imagery and really trying to extract from that where the solar panels in the image are, or taking large amounts of text documents and really extracting from them what the policy commonalities across different policy texts are, which enables us to understand those patterns to make new policy, or analyzing large amounts of patent data to understand what the patterns are. So really taking large and raw amounts of data and trying to distill it in some way into an actionable insight.

And then another is, you know, dealing with different kinds of data sources. So, for example, forecasting solar power output, which is an example I brought up a few times. It’s something that we’ve been doing forever, right? First using rule-based techniques where you basically write down: is it a holiday, what day is it?

And, you know, I guess a holiday doesn’t matter for solar power output as much as it does for, say, power demand. But you get the idea: you write down some kinds of rules and you predict these things. Or you can use physical numerical weather prediction models to come up with what the solar irradiation is gonna look like somewhere, and what that means for solar power output.

But what machine learning’s really good at is taking in multiple types of data and really learning to analyze patterns among them. So you could have a machine learning algorithm that takes in historical weather data, but also takes in, you know, the outputs of a numerical weather prediction model.

And it also takes in images of clouds moving over your solar panels, and really analyzes all of that together in order to provide some kind of prediction. So this idea of just dealing with many different types of data sources and finding really powerful correlations between them is something machine learning’s good at.
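A rough sketch of that multi-source idea: synthetic stand-ins for a historical weather measurement, a numerical-weather-prediction output, and an image-derived cloud index are combined as features for a single regression model. The features, data, and model here are illustrative assumptions, not a real forecasting pipeline.

```python
# Sketch of combining multiple data sources for solar forecasting:
# historical weather, a numerical-weather-prediction output, and a
# cloud-cover index derived from sky images, all feeding one model.
# Data here is synthetic; real features and pipelines would differ.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(7)
n = 2000
temp = rng.normal(20, 5, n)           # historical weather measurement
nwp_irr = rng.uniform(0, 1000, n)     # irradiance from an NWP model
cloud_idx = rng.uniform(0, 1, n)      # index extracted from sky images

X = np.column_stack([temp, nwp_irr, cloud_idx])
y = 0.8 * nwp_irr * (1 - 0.6 * cloud_idx) + rng.normal(0, 20, n)   # toy "true" output

model = GradientBoostingRegressor().fit(X[:1500], y[:1500])
print("R^2 on held-out data:", round(model.score(X[1500:], y[1500:]), 3))
```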

The Challenges and Biases in Machine Learning – Understanding Data Demands, Quality Issues, and Ethical Considerations

Priya: All right. So what’s it bad at or what are its limitations? So one, as we’ve talked about, very data-hungry. So, you often are in a situation where you’re needing to label or collate a huge data set and that often means either you are needing to put a lot of financial and human resources into that, or you’re settling for lower-quality data.

And if you’re settling for lower-quality data, it’s important to understand this principle of garbage in and garbage out. Since machine learning is trying to analyze patterns in its data, if the patterns and the data are garbage, it’s going to output garbage.

Jon: Wait a minute, let me pause there for a second. So you’re saying that if the data is unlabeled, then we’re moving over to the unsupervised side, and we’re just gonna get lower-quality outputs because of that. But then some folks are saying, I’d like to improve the quality of the data. By doing what? You said something earlier, at the beginning of what we were discussing here, that in some cases, and maybe I imagined this word, there are substandard conditions where people are doing certain types of work. What are you finding in the world of artificial intelligence? Are people in, like, sweatshops in some place, just sitting there for 16 hours a day saying, this is a dog, this is a dog, this is a dog, this is a cat, this is a dog?

Priya: Yeah, so two things I’ll bring up there. So, unsupervised data doesn’t necessarily mean it’s worse data; it’s a different task. What is worse data is sort of if you don’t know what your model is seeing. So if you’re scraping the entirety of the internet, there’s hate speech on there. There’s all sorts of stuff in there that your model is seeing.

And that’s very different from if you’re curating an unsupervised data set, not because you know the mapping between the data and answers, but just because you know what is in your data, whether there is hate speech in there; that can be a form of ensuring quality. And labor is an issue. I think, you know, Mechanical Turk is often used to label data sets, and there are of course various ethical considerations with that.

Priya: And then when it comes to, you know, GPT…

Jon: Open that up a little bit further. Mechanical Turk, explain, go a bit further with that. Explain that…

Priya: Because, yeah, so Amazon Mechanical Turk is basically a system where you can farm out really quick tasks to a bunch of people, and they get paid a small amount of money for each task that they complete. And it’s a way for people to potentially make some marginal additional income doing things like data labeling, sort of in the same way you sometimes get incentivized to, I don’t know, provide reviews on Google or something like that. But in a situation where people are depending on that for their livelihood, maybe because they’re having trouble finding other work, it doesn’t necessarily pay enough for that. And it also, of course, takes away time from things like finding other work. So people are just not paid enough for this, basically, even though it’s providing a ton of value.

Jon: Thank you. Sorry for going too far on that…

Priya: No, absolutely. And then with GPT and OpenAI, there were some breaking stories about the fact that there was a lot of data labeling and content labeling going on in, I believe, Kenya, where workers were not necessarily being paid even what locally would be an acceptable wage.

So work on data is extremely undervalued, in the sense that the algorithmic work is often viewed as flashier. But work on data is just essential for making sure any of this algorithmic work actually means anything.

Jon: Okay. Thank you.

Priya: Yeah. All right. The other thing is that machine learning can be biased, and in a couple of ways. So one is in terms of the patterns in the underlying data. So, for example, a common use case of machine learning, or one that is worked on in the climate change space, is that you would analyze data on buildings, so sensor data or something like that, in order to understand which buildings are more likely to be successful candidates for a retrofit.

That’s great, right? It’s a great way to try to pinpoint your retrofits so that you’re doing them in places where it counts. But it’s also worth noting that buildings data in the US, for example, reflects the histories of redlining and racial discrimination in the US housing sector, where different parts of the building stock have been invested in differently over the years.

And so those biases, even though it’s data on buildings, this engineering data, right, it reflects human histories and social and political histories. Those are there when you analyze that data, and your machine learning algorithm is going to pick up on those patterns and replicate them, because its output is fundamentally following the patterns that are already in the data. So it’s extremely important to be cognizant of just what your data is really encoding, and whether that is something you actually want reflected in the output. But it’s not just the data; it’s also about what we choose to use machine learning for.

So what use cases, for example: there’s a lot more machine learning right now in precision agriculture than there is for smallholder farmers. That doesn’t necessarily mean that the use cases are differentially impactful or anything like that. It’s just a question of, you know, who decided to do it and who had the resources to do it.

Or, I mean, in a non-climate example, when it comes to policing, machine learning has been used in areas like predictive policing, where the goal is to predict who will commit a crime or where a crime will be committed based on past data. And it’s not just that the data is biased, because what you’re often seeing is data on arrests rather than crime.

And arrests are correlated with where we choose to dispatch police resources in the first place. But it’s also the case that the kinds of crimes that are predicted, it’s not often financial crime on Wall Street, nor is it abuses by police officers themselves. It’s often, you know, the crime that is being predicted is crime by populations with less power in society.

So we really need to think about the ways in which machine learning is an accelerator of the systems it’s used in. What systems are we using it in? That’s not a neutral choice. We’re often making very powerful choices about where machine learning is actually helping us accelerate something in society.

Jon: This is such a great point, because I think that over the last, say, 24 to 36 months, a lot of companies and organization leaders have made a concerted effort in hiring less obvious candidates for engineering work in the clean economy, or sales, marketing, whatever it might be. And it sort of immediately surfaces the fact that STEM learning in elementary school in less privileged zip codes is really quite poor. And so you don’t find a lot of mechanical, electrical, or computer science engineers coming out of these types of neighborhoods or communities. And so there are just fewer candidates available, and so you have to be incredibly intentional about your outreach, but in some cases the pool is just not very deep.

And so you’re saying, well, okay, candidate to candidate, how do I do this, you know, and do it for the best of my stakeholders and my organization? Your point here is really great, which is that the data itself could be corrupt before you get it. And so, like, our system around education in STEM is already 10 years behind, so how are we gonna fix that today? It’s really hard.

Priya: Yeah, and it requires a combination of investing in the systemic solution, right? Making sure that we are investing in, for example, early education, but also, of course, being aware of biases that might be preventing us from even getting the pool of candidates that’s out there, the way we specify various criteria, you know, are we requiring a degree for something that doesn’t actually need a degree from a skills standpoint. And this kind of thinking, right, this intuition about how you make sure that you’re hiring in a way that’s equitable, that’s cognizant of the biases in a broader systemic sense and in your own process, that kind of thinking applies directly when you’re thinking about machine learning workflows. It’s in some sense no different.

You wanna think about what system the algorithm is operating in, what specific choices I’m making when designing it, and what specific choices or things are encoded in the data it’s getting, and really think about that holistically.

Jon: It’s such a good point.

Priya: Yeah. All right. And the last two things here are maybe obvious, but I think worth saying anyway, or things we’ve covered. So one is that machine learning assumes that patterns are persistent. So if it’s seeing a data set, it assumes that data set is the world. And if you try to get it to predict for a different world that is not reflected in the data set, it’s not gonna do it. And so this is exactly the example from before: if you’re trying to predict long-term weather, weather under a changing climate, your historical weather data isn’t gonna be distributed the same as your future weather data, and you need to do other things to deal with that.
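A tiny illustration of that "patterns are persistent" assumption breaking down: a model fit on one synthetic historical regime is evaluated on data from a shifted regime it never saw, and its accuracy collapses.

```python
# Illustration of the "patterns persist" assumption breaking down:
# a model fit on historical-regime data degrades when the underlying
# relationship shifts (synthetic data, purely illustrative).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
x_hist = rng.uniform(0, 1, 500).reshape(-1, 1)
y_hist = 2.0 * x_hist.ravel() + rng.normal(0, 0.1, 500)      # historical regime

model = LinearRegression().fit(x_hist, y_hist)

# "Future" data generated by a shifted relationship the model never saw.
x_new = rng.uniform(0, 1, 500).reshape(-1, 1)
y_new = 2.0 * x_new.ravel() + 1.5 + rng.normal(0, 0.1, 500)  # regime shift

print("score on historical regime:", round(model.score(x_hist, y_hist), 3))
print("score on shifted regime:   ", round(model.score(x_new, y_new), 3))
```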

And then, fundamentally, machine learning is statistics. And with statistics, we all know that phrase: finding correlation is not finding causation. And machine learning is similar. You’re finding how things are related to each other or correlated with each other in the data. In order to find kind of causal explainability, you often have to do analysis that is not just basically data crunching or number crunching.

Highlights from the Conversation with Priya Donti 

Jon: Wow. This is a great start for beginning to understand this incredibly important and complex topic. I really want to thank now-Professor Priya Donti. Last time I saw her here in Sonoma County, it was not professor, I don’t believe, so now congratulations are in order. But this has been a fantastic SolarAcademy / Suncast episode of learning, just the beginning of what I hope will be a long and ongoing series about this subject of artificial intelligence and climate solutions. We’re also gonna be dissecting certain company examples of how people are using these tools effectively. I am super grateful to have you on as a guest, and we look forward to having you again very soon. Kerim, you wanna sign off?

Kerim: Thank you very much, Priya. This was very useful, and I really look forward to going into these use cases in different sectors, in the climate sector as well as many other use cases.

Priya: Fantastic. Looking forward to it.