If you are in VetMed, you've likely heard the buzz around artificial intelligence (AI) and its potential to revolutionize the field. But what exactly is AI, and how can it be applied in veterinary medicine? In this series, I (Gary) will be providing an AI crash course tailored specifically for veterinarians, helping you understand what's currently possible and what's on the horizon.
Understanding the Hype
The short answer to all the hype in AI right now? Transformers. To understand how we got here, let's break down the key concepts:
Defining AI: At its core, AI is applied mathematics, implemented through computer science and trained on large amounts of data. It's all about solving problems using math, enabled by computing power and vast datasets.
The AI Stack: Picture the modern AI stack as a set of nested layers, each building upon the last (see the sketch after this list).
Artificial Intelligence: Solving problems using symbolic logic or flowcharts.
Machine Learning: Making predictions and classifications based on patterns in training data.
Deep Learning: Using neural architectures to let the computer identify patterns.
Generative AI: AI that can produce open-ended outputs.
Large Language Models: Processing and generating language without guaranteeing understanding.
Transformers: A specific neural-network architecture used in (but not exclusive to) large language models.
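To make the two outer layers concrete, here is a minimal Python sketch contrasting a hand-written symbolic rule with a learned model. The fever cutoff and training data are invented for illustration, not clinical guidance:

```python
# A minimal sketch contrasting the two outer layers of the stack.
# The fever cutoff and training data are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# Artificial intelligence (outer layer): a hand-written symbolic rule,
# i.e., a flowchart expressed in code.
def is_febrile(temp_f: float) -> bool:
    return temp_f > 103.0  # hypothetical canine fever cutoff

# Machine learning (next layer in): the rule is learned from labeled
# examples instead of being written by hand.
temps = [[100.5], [101.0], [103.5], [104.2], [102.0], [105.0]]
febrile = [0, 0, 1, 1, 0, 1]  # toy labels: 1 = febrile
model = DecisionTreeClassifier().fit(temps, febrile)

print(is_febrile(104.0))         # rule we wrote: True
print(model.predict([[104.0]]))  # rule the model inferred: [1]
```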
Why Now? OpenAI's Bet on Transformers: The current AI boom can be largely attributed to OpenAI's substantial investment in the transformer architecture, introduced in Google's 2017 paper "Attention Is All You Need." After seven years and millions spent on training, their model proved incredibly powerful, sparking a wave of interest and investment in AI.
The Potential for Veterinary Medicine
So, what does this mean for veterinarians? As AI continues to advance, we can expect to see numerous applications in our field, such as:
Improved diagnostic tools
Automated record-keeping and documentation
Personalized treatment plans
Predictive models for disease outbreaks
By understanding the foundations of AI, we can better evaluate and implement these tools in our practices, ultimately leading to better patient outcomes and more efficient workflows.
Video Deep Dive
Feel free to watch the original video presentation and leave questions as comments on the video. I'll make sure to answer in subsequent videos.
Stay Tuned
In the next part of this series, we'll dive deeper into the technical aspects of neural networks and transformers, giving you a solid grasp of how these technologies work. Armed with this knowledge, you'll be well-equipped to navigate the rapidly evolving landscape of AI in veterinary medicine.
Full Video
Full Transcript Below
1
00:00:00,662 --> 00:00:05,821
Hey everyone, my name is Gary Peters. I am co-founder and CEO of PupPilot,
2
00:00:05,881 --> 00:00:10,746
an AI co-pilot for veterinarians. Today I'm not actually going to be talking about my
3
00:00:10,867 --> 00:00:12,969
company and what we're doing there,
4
00:00:13,149 --> 00:00:16,453
but I am going to be kind of giving more of an overall AI
5
00:00:16,473 --> 00:00:19,285
crash course for veterinarians.
6
00:00:19,286 --> 00:00:21,948
I've been on a lot of the forums and I've listened to a lot
7
00:00:21,928 --> 00:00:26,312
of people talk about AI and I think there's a lot of excitement around
8
00:00:26,313 --> 00:00:29,756
what AI can do and sometimes there's a little bit of pessimism around what
9
00:00:29,776 --> 00:00:33,499
AI can do and really what I want to do is just set a
10
00:00:33,519 --> 00:00:38,164
really solid foundation for where AI is at so that anyone
11
00:00:38,184 --> 00:00:40,756
even if you don't have a background in AI can have a really good
12
00:00:40,757 --> 00:00:43,558
understanding of currently what's possible,
13
00:00:43,638 --> 00:00:48,323
currently what's not possible, and just being able to kind of like weigh those
14
00:00:48,663 --> 00:00:52,007
upcoming features and tools and everything out for yourself.
15
00:00:52,067 --> 00:00:54,750
That's my ultimate goal: that by the end of this course
16
00:00:54,770 --> 00:00:58,293
you'll be able to do that. So I'm adapting this from a lecture
17
00:00:58,313 --> 00:01:01,808
that I gave at Fetch called Top Things to Know about AI
18
00:01:01,828 --> 00:01:06,072
in Vet Med, and I'm gonna break that down into kind of a few
19
00:01:06,073 --> 00:01:08,554
different parts here. So one,
20
00:01:08,534 --> 00:01:12,258
I'm just gonna kind of give an understanding of the hype in AI.
21
00:01:12,338 --> 00:01:15,381
You know, there's a reason for all this hype that we're seeing in AI
22
00:01:15,401 --> 00:01:18,684
here, and I'm gonna do that by first kind of just giving a brief
23
00:01:18,704 --> 00:01:21,457
primer on Gen AI in Vet Med,
24
00:01:21,477 --> 00:01:23,499
and then I'm gonna go a bit under the hood. And,
25
00:01:23,820 --> 00:01:26,783
we're gonna get a bit technical, but it's not gonna be very math heavy.
26
00:01:26,784 --> 00:01:31,327
Uh, it's really more just to kind of set an intuitive understanding of what's
27
00:01:31,328 --> 00:01:33,369
happening in AI. And then,
28
00:01:33,549 --> 00:01:36,372
the top use cases in Gen AI in Vet Med,
29
00:01:36,373 --> 00:01:41,047
because once you have that good intuitive understanding all of the use cases
30
00:01:41,048 --> 00:01:43,449
will make sense, and, conversely,
31
00:01:43,450 --> 00:01:47,814
in the last video, um, it'll make sense why certain things are not possible,
32
00:01:47,954 --> 00:01:49,996
right? And, um,
33
00:01:50,016 --> 00:01:52,979
my goal is that,
34
00:01:53,359 --> 00:01:55,361
as the technology adapts,
35
00:01:55,682 --> 00:01:58,284
this can be kind of like an open line of communication where,
36
00:01:58,285 --> 00:02:02,799
um, we can be kind of like sharing more of what's happening in
37
00:02:02,879 --> 00:02:07,603
AI and showing what is now increasingly possible and still what is not possible.
38
00:02:08,404 --> 00:02:12,088
So, a little bit of background on myself, just to kind of help validate
39
00:02:12,089 --> 00:02:14,711
if you've never met me or never talked to me why I would be
40
00:02:14,731 --> 00:02:19,275
in a position to be able to talk about this, um,
41
00:02:19,276 --> 00:02:23,409
when I was actually at Vanderbilt, I was a math and engineering major there,
42
00:02:23,410 --> 00:02:26,673
um, they didn't have a name for it at the time,
43
00:02:26,693 --> 00:02:29,195
it was called, um, computational,
44
00:02:29,196 --> 00:02:33,740
uh, or scientific computing and that was kind
45
00:02:33,720 --> 00:02:37,243
of like the precursor to a more like formal AI degree but I,
46
00:02:37,323 --> 00:02:40,316
I really enjoyed my time while I was there. I ended up joining a
47
00:02:40,296 --> 00:02:44,680
startup, uh, afterwards called Sizzle and what we did,
48
00:02:44,660 --> 00:02:49,065
it was basically a natural language processing startup and so what that means is
49
00:02:49,085 --> 00:02:53,870
that we built algorithms to read through large bodies of text
50
00:02:53,990 --> 00:02:56,212
and then take actions based on those texts.
51
00:02:56,213 --> 00:02:59,225
So that startup, for example, it would read through
52
00:02:59,226 --> 00:03:03,269
job descriptions, identify keywords in those job descriptions and then match them with a
53
00:03:03,270 --> 00:03:05,571
very large video template library.
54
00:03:05,572 --> 00:03:10,256
So that was like, we would find things in job descriptions about Miami
55
00:03:10,316 --> 00:03:15,061
and we would find, you know, pictures and videos of beaches and other things
56
00:03:15,062 --> 00:03:17,583
that would be like a very simple example of like,
57
00:03:17,584 --> 00:03:22,215
you know, what that would be doing. Now,
58
00:03:22,216 --> 00:03:25,475
my work there kind of like leveraged into what I did next,
59
00:03:25,615 --> 00:03:27,695
which was my graduate work at Carnegie Mellon,
60
00:03:27,755 --> 00:03:32,035
where I had a focus in artificial intelligence. I really enjoyed my time there
61
00:03:32,036 --> 00:03:36,295
and it was a really great and exciting time to be there.
62
00:03:36,715 --> 00:03:41,206
2017 is when the very famous paper from Google came out,
63
00:03:41,207 --> 00:03:45,912
which is "Attention Is All You Need," and it
64
00:03:45,932 --> 00:03:50,516
was kind of like
65
00:03:50,677 --> 00:03:54,421
the foundational work; all the AI stuff that is out and popular right
66
00:03:54,422 --> 00:03:56,923
now was really set during that period of time.
67
00:03:57,244 --> 00:04:00,056
So it was a great time to be in school. While I was there,
68
00:04:00,176 --> 00:04:04,801
I ended up winning a competition that's kind of like a multi-university competition across
69
00:04:04,802 --> 00:04:07,043
like Stanford, Carnegie Mellon, MIT,
70
00:04:07,044 --> 00:04:10,947
and some other places, and that actually brought me over to Stanford.
71
00:04:11,027 --> 00:04:13,810
I did not get a degree from Stanford, but I started working with a
72
00:04:13,811 --> 00:04:18,434
PhD group of students there on some applied artificial
73
00:04:18,435 --> 00:04:22,428
intelligence work. After getting accepted into the Venture Studio program there,
74
00:04:22,749 --> 00:04:24,771
we ended up starting a company called Uncapped,
75
00:04:24,871 --> 00:04:28,715
which ended up being accepted into Stanford's StartX program,
76
00:04:28,996 --> 00:04:32,059
which is a way that Stanford supports the local students,
77
00:04:33,280 --> 00:04:37,125
and provides resources and funding, and same with a program through Carnegie Mellon.
78
00:04:39,265 --> 00:04:41,617
We focused initially on default rate prediction,
79
00:04:42,018 --> 00:04:47,083
basically we were using artificial intelligence to figure out if somebody would
80
00:04:47,084 --> 00:04:51,688
default on their home loans, and the idea was to
81
00:04:51,928 --> 00:04:56,132
help reach out to people who would normally not be able to afford a
82
00:04:56,232 --> 00:04:59,275
home, or maybe more of that like borderline area.
83
00:04:59,276 --> 00:05:01,848
That was kind of like where a lot of AI research was at the
84
00:05:01,849 --> 00:05:06,793
time, was kind of being able to like identify uhm, individuals that normally,
85
00:05:06,794 --> 00:05:09,595
traditionally wouldn't be able to get into a home. Uh, that was,
86
00:05:09,696 --> 00:05:11,938
that was a great experience that ultimately got acquired,
87
00:05:12,018 --> 00:05:14,580
which in the tech world is usually a good thing. So,
88
00:05:14,581 --> 00:05:17,663
uh, after that I ended up being CEO of Quohome,
89
00:05:17,683 --> 00:05:20,175
which was uh, another type of mortgage company,
90
00:05:20,395 --> 00:05:24,735
mortgage tech company, and I built a different type of AI there.
91
00:05:25,615 --> 00:05:28,235
Uhm, uh, AI back in 2017,
92
00:05:28,395 --> 00:05:31,435
2018 was much more about prediction, and the AI,
93
00:05:31,436 --> 00:05:35,175
the type of AI systems we were working on at Quohome were more, uhm,
94
00:05:36,015 --> 00:05:40,266
around determining if someone could get into a home,
95
00:05:40,306 --> 00:05:43,649
and if they couldn't, giving them kind of these automated,
96
00:05:43,650 --> 00:05:46,632
uh, step-by-steps of how to get into a home.
97
00:05:46,633 --> 00:05:48,895
So we'd automatically be able to tell anyone,
98
00:05:48,896 --> 00:05:51,137
hey, you can't get into a home, but if you do X, Y, Z,
99
00:05:51,157 --> 00:05:53,259
you can. Uhm, and that was really cool,
100
00:05:53,260 --> 00:05:55,341
just kind of trying to open up, uhm,
101
00:05:55,361 --> 00:05:59,676
home buyership to more people. Uhm, that startup ended up getting acquired.
102
00:05:59,677 --> 00:06:02,698
And, uhm, now working on PupPilot,
103
00:06:02,699 --> 00:06:05,481
uh, I was introduced to the vet world through my wife,
104
00:06:05,621 --> 00:06:09,405
who's a veterinarian, and we noticed that she had many,
105
00:06:09,605 --> 00:06:11,808
many late nights in the office, and,
106
00:06:11,809 --> 00:06:15,752
uh, it was related to SOAP notes, and so that was initially what got us
107
00:06:15,772 --> 00:06:19,265
in the door, uh, generating one of these automated scripts.
108
00:06:19,266 --> 00:06:21,507
And, uhm, and we've,
109
00:06:21,607 --> 00:06:25,311
we've developed a lot more in AI, like our peer-reviewed chatbot and more,
110
00:06:25,331 --> 00:06:27,934
but as I said, this is not like a sales pitch,
111
00:06:27,935 --> 00:06:29,936
uh,
112
00:06:29,956 --> 00:06:33,500
lecture, so that's the last I'm going to really be talking about PupPilot.
113
00:06:33,820 --> 00:06:38,164
So, diving into just kind of understanding the hype in AI,
114
00:06:38,204 --> 00:06:42,297
right? So I want to break this down.
115
00:06:42,577 --> 00:06:45,180
So, the ultimate answer is Transformers.
116
00:06:45,200 --> 00:06:48,603
Like, what's all this hype in AI right now? Short answer is Transformers are
117
00:06:48,584 --> 00:06:51,166
the hype. So, how did we get here?
118
00:06:51,167 --> 00:06:53,369
Uhm, first,
119
00:06:53,589 --> 00:06:55,811
in order to figure that out,
120
00:06:55,931 --> 00:06:57,974
I think it needs to be very clear, kind of like,
121
00:06:58,054 --> 00:07:01,347
what is AI? It's a term thrown around a lot,
122
00:07:01,728 --> 00:07:04,290
and I think it's misused a lot. It
123
00:07:04,330 --> 00:07:08,354
is a broad term, but I think understanding what AI typically means
124
00:07:08,394 --> 00:07:11,277
is important. Next, I want to understand,
125
00:07:11,317 --> 00:07:14,520
really, kind of like, why now? Why did Transformers come out now,
126
00:07:14,540 --> 00:07:18,004
essentially? Then, we're going to be covering neural networks,
127
00:07:18,024 --> 00:07:21,016
which is really, kind of like, the foundation of Transformers,
128
00:07:21,096 --> 00:07:25,200
and then, fourth, I really want to deep dive into the Transformer,
129
00:07:25,240 --> 00:07:28,203
right, because this is ultimately the reason why we have so much hype in
130
00:07:28,204 --> 00:07:32,928
AI. So, the way that I'm going to start by defining AI is with
131
00:07:33,029 --> 00:07:35,791
math. At the end of the day, uhm,
132
00:07:35,811 --> 00:07:39,115
you can really think of, uh, artificial intelligence as,
133
00:07:39,116 --> 00:07:41,267
uh, applied mathematics. Like,
134
00:07:41,287 --> 00:07:44,651
if you go into university, before they started calling them,
135
00:07:44,652 --> 00:07:47,013
uhm, you know, uh,
136
00:07:47,033 --> 00:07:49,215
AI degrees, many, many of them started as,
137
00:07:49,235 --> 00:07:51,478
like, applied math majors, or PhDs,
138
00:07:51,598 --> 00:07:54,801
or, uh, graduate students, which is the branch that I came from.
139
00:07:55,302 --> 00:07:58,265
So, that's kind of, like, the foundational element of it.
140
00:07:58,285 --> 00:08:00,917
Then, the way that you are implementing it,
141
00:08:00,918 --> 00:08:03,780
right, how are you applying it, is effectively through a computer.
142
00:08:03,800 --> 00:08:06,502
So, computer science is integral
143
00:08:06,662 --> 00:08:09,145
to the equation here. And then, finally,
144
00:08:09,485 --> 00:08:12,408
there is a, uh, data component,
145
00:08:12,409 --> 00:08:15,191
uhm, and a very large data component,
146
00:08:15,211 --> 00:08:19,055
if you will. And so, AI really sits at the center of these three
147
00:08:19,075 --> 00:08:21,227
things. And the way I like to think about it is,
148
00:08:21,247 --> 00:08:25,792
it's really solving problems with math enabled by computer science and
149
00:08:25,812 --> 00:08:30,476
data. Uh, so, not to go under the hood in the second
150
00:08:30,477 --> 00:08:33,119
part of the lecture, but just like the first part here.
151
00:08:33,339 --> 00:08:35,341
This is not too deep under the hood, if you will.
152
00:08:35,342 --> 00:08:39,695
Uhm, this is a common,
153
00:08:39,696 --> 00:08:44,400
uh, graph, to kind of show the many layers inside of
154
00:08:44,620 --> 00:08:48,024
AI. Uh, and you could kind of think of this as almost like the
155
00:08:48,064 --> 00:08:50,586
nesting dolls of artificial intelligence.
156
00:08:50,927 --> 00:08:53,269
Now, I would be remiss not to say,
157
00:08:53,489 --> 00:08:55,611
there are many more layers to this,
158
00:08:55,812 --> 00:08:57,834
and there's many more complexities to this.
159
00:08:58,374 --> 00:09:02,268
And I'm just trying to kind of give like a high-level overview,
160
00:09:02,628 --> 00:09:05,711
and as I have said before, this is really for kind of like the
161
00:09:05,792 --> 00:09:08,735
veterinarian, or the vet staff,
162
00:09:09,095 --> 00:09:12,498
who's looking to better understand artificial intelligence.
163
00:09:12,859 --> 00:09:16,723
I'm not trying to kind of create an actual AI course here.
164
00:09:16,823 --> 00:09:19,314
So, this is just kind of like a good, high-level
165
00:09:19,315 --> 00:09:22,317
framework to kind of think about the modern AI stack.
166
00:09:22,337 --> 00:09:24,559
So, we're going to go through these one by one.
167
00:09:24,560 --> 00:09:27,222
And also, just so that people aren't,
168
00:09:27,322 --> 00:09:30,045
you know, waiting on some sort of big reveal of what this is:
169
00:09:30,125 --> 00:09:32,968
GPT. If you've heard of ChatGPT,
170
00:09:32,988 --> 00:09:37,793
of course, the GPT in ChatGPT stands for generative
171
00:09:37,853 --> 00:09:40,847
pre-trained transformer. You'll understand all of that by the end of this,
172
00:09:41,207 --> 00:09:44,430
but that transformer bit is kind of at the heart of it there,
173
00:09:44,450 --> 00:09:46,733
right in that white circle down there,
174
00:09:46,753 --> 00:09:49,355
right? So, we're pretty deep into the AI layers,
175
00:09:49,375 --> 00:09:53,820
if you will. So, um, all right,
176
00:09:54,301 --> 00:09:57,384
if AI is solving problems using math,
177
00:09:57,404 --> 00:09:59,736
computer science, data, um,
178
00:09:59,756 --> 00:10:03,900
what does that look like? Well, it could be as broad stroked as just
179
00:10:03,920 --> 00:10:05,942
kind of solving, uh,
180
00:10:05,962 --> 00:10:10,186
problems using symbolic logic. Or, to maybe, like, ground this a little bit more,
181
00:10:10,166 --> 00:10:12,408
I'm going to pull in a medical example.
182
00:10:12,468 --> 00:10:14,530
This chart may feel a little overwhelming,
183
00:10:14,570 --> 00:10:16,572
but I'll help ground this:
184
00:10:16,573 --> 00:10:19,585
uh, it's a heart attack prevention risk analysis.
185
00:10:19,586 --> 00:10:24,550
So, we're doing basically steps one and two to figure out how at risk
186
00:10:24,610 --> 00:10:28,975
someone is for a heart attack. And this is basically a flowchart,
187
00:10:29,055 --> 00:10:31,317
and there's lots of these inside of the world of medicine.
188
00:10:31,838 --> 00:10:34,560
And this flowchart was initially,
189
00:10:34,561 --> 00:10:39,255
uh, developed based off of large amounts of data that
190
00:10:39,256 --> 00:10:41,417
practitioners reviewed, uh,
191
00:10:41,437 --> 00:10:44,761
trying to determine how risky someone was for,
192
00:10:44,941 --> 00:10:47,643
or how much at risk someone was for a heart attack.
193
00:10:48,264 --> 00:10:50,987
And it may not feel like AI,
194
00:10:51,047 --> 00:10:55,792
but effectively, if this system is coded into
195
00:10:55,793 --> 00:10:59,245
a computer and you are kind of putting the inputs in.
196
00:10:59,246 --> 00:11:02,388
And right, like this person's this age, and they had this blood pressure,
197
00:11:02,389 --> 00:11:06,212
and yadda yadda. Uhm, if the computer makes the decision,
198
00:11:06,192 --> 00:11:08,334
this is a type of AI, right?
199
00:11:08,335 --> 00:11:12,839
Because basically, the computer is acting as
200
00:11:12,879 --> 00:11:15,582
the intelligence for you here. You're putting in the inputs,
201
00:11:15,662 --> 00:11:18,345
and it's giving you an output, telling you what to do,
202
00:11:18,365 --> 00:11:21,397
basically. And in the most broad sense,
203
00:11:21,517 --> 00:11:24,640
this is what artificial intelligence is,
204
00:11:25,461 --> 00:11:27,804
and that may sound very strange.
205
00:11:27,805 --> 00:11:31,427
It may feel a little overly broad, and I'd argue that it like,
206
00:11:31,808 --> 00:11:34,070
in most applications, it is a little overbroad.
207
00:11:34,071 --> 00:11:36,652
There's kind of like a running joke, you can call almost anything done by
208
00:11:36,672 --> 00:11:39,946
a computer AI. But in many ways, it is,
209
00:11:40,066 --> 00:11:43,309
you know, and especially if you kind of like, with the grand purview of
210
00:11:43,329 --> 00:11:47,173
time, looking back a hundred years, it's pretty incredible that we're relying on these
211
00:11:47,213 --> 00:11:49,455
machines to be able to give these types of answers.
212
00:11:49,696 --> 00:11:52,619
So, I take it all with a grain of salt.
213
00:11:52,679 --> 00:11:56,282
But there's a rhyme and a reason to when people use the word AI
214
00:11:56,322 --> 00:11:59,455
almost everywhere. You kind of can.
215
00:11:59,456 --> 00:12:01,537
Um, it's not inappropriate.
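To make that concrete, here's a minimal sketch of that flowchart-as-code idea. The risk factors and thresholds below are made up for illustration; this is not a real clinical tool:

```python
# A minimal sketch of "AI as a coded flowchart" (symbolic logic).
# The factors and thresholds are hypothetical, not a real clinical
# tool; they just mirror the risk-flowchart idea described above.

def heart_attack_risk(age: int, systolic_bp: int, smoker: bool) -> str:
    """Walk fixed, hand-written if/else rules and return a risk bucket."""
    risk_factors = 0
    if age >= 55:
        risk_factors += 1
    if systolic_bp >= 140:
        risk_factors += 1
    if smoker:
        risk_factors += 1

    # The "intelligence" here is entirely branching logic a human wrote:
    # you put the inputs in, the computer hands you the decision.
    if risk_factors >= 2:
        return "high risk"
    if risk_factors == 1:
        return "moderate risk"
    return "low risk"

print(heart_attack_risk(age=62, systolic_bp=150, smoker=False))  # high risk
```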
216
00:12:01,538 --> 00:12:06,763
Um, now, I want to dive into the next
217
00:12:06,803 --> 00:12:10,266
layer, which is usually what people are starting to talk about when we talk
218
00:12:10,267 --> 00:12:12,689
about artificial intelligence, um,
219
00:12:12,709 --> 00:12:14,731
which is in the world of machine learning.
220
00:12:14,732 --> 00:12:17,513
And what we're trying to do is we're trying to make predictions,
221
00:12:17,553 --> 00:12:19,625
we're trying to do classifications.
222
00:12:19,626 --> 00:12:23,870
And we're sometimes implementing that through something like ensemble learning.
223
00:12:23,871 --> 00:12:26,532
Now, um,
224
00:12:26,552 --> 00:12:31,417
the main driver for all of this is we're telling the computers
225
00:12:31,898 --> 00:12:36,563
a pattern that we had found from our training data or from
226
00:12:36,564 --> 00:12:38,565
a training algorithm.
227
00:12:38,566 --> 00:12:42,497
And I really want to drive this point home,
228
00:12:42,617 --> 00:12:47,342
which is we've discovered something effectively through the training data
229
00:12:47,522 --> 00:12:53,088
or we've kind of iterated many times in this training algorithm to identify
230
00:12:53,068 --> 00:12:57,712
a pattern and that pattern is something that we ultimately
231
00:12:58,093 --> 00:13:01,227
put into kind of the final AI model,
232
00:13:01,407 --> 00:13:05,591
right? That's what we're kind of implementing.
233
00:13:06,012 --> 00:13:09,595
So in many ways, we are telling the computer the pattern.
234
00:13:09,776 --> 00:13:13,639
We've gone and researched it, we can kind of understand what the pattern
235
00:13:13,700 --> 00:13:15,802
is and now we're putting that into the computer.
236
00:13:16,322 --> 00:13:20,176
And this is used in many, many areas, one of which is being
237
00:13:20,196 --> 00:13:23,299
able to predict a sepsis diagnosis.
238
00:13:23,880 --> 00:13:25,922
And so the way that you're going to be doing that is you're going
239
00:13:25,942 --> 00:13:28,484
to be taking in large amounts of training data,
240
00:13:29,265 --> 00:13:33,229
and you're going to be running these training algorithms over that training data,
241
00:13:33,509 --> 00:13:37,153
and you're going to be looking for patterns that you can effectively verify through
242
00:13:37,173 --> 00:13:41,627
statistics. And then you're going to implement that into an AI model,
243
00:13:41,647 --> 00:13:44,811
which will then make that prediction for you going forward.
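Here's a minimal sketch of that train-verify-predict loop using scikit-learn. The vitals, labels, and the rule generating them are synthetic stand-ins, not a validated sepsis model:

```python
# A minimal sketch of the workflow above: train on labeled historical
# data, check the learned pattern on held-out data, then predict going
# forward. Features, labels, and data are synthetic stand-ins only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical training data: [heart_rate, temperature_c, wbc_count]
X = rng.normal(loc=[110, 39.0, 15.0], scale=[20, 1.0, 5.0], size=(500, 3))
# Synthetic labels (1 = sepsis), generated by a made-up rule for demo
y = ((X[:, 0] > 120) & (X[:, 1] > 39.5)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Learn the pattern" from the training data
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))

# The deployed model then makes the prediction for new patients
new_patient = [[130, 40.1, 18.0]]
print("sepsis risk probability:", model.predict_proba(new_patient)[0, 1])
```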
244
00:13:44,812 --> 00:13:49,195
Now, the beauty of most machine learning models is the reason they're called machine
245
00:13:49,215 --> 00:13:51,798
learning is because as you get more data,
246
00:13:51,858 --> 00:13:54,981
as that model is active in making those predictions,
247
00:13:57,243 --> 00:13:58,985
you take some of that data in, and you continue to train and refine
248
00:13:58,965 --> 00:14:02,278
your model over time. And the way you develop these models is so that
249
00:14:02,358 --> 00:14:04,700
as you get more data in, it takes that in,
250
00:14:04,701 --> 00:14:09,505
you're able to further evaluate your training algorithm and your overall
251
00:14:09,585 --> 00:14:12,388
algorithms that you're using inside of your final state model,
252
00:14:12,528 --> 00:14:15,792
and just continue to iterate over and over and over again,
253
00:14:16,052 --> 00:14:19,055
which is why these models get better with larger amounts of data and more
254
00:14:19,075 --> 00:14:21,627
time. So, at the end of the day,
255
00:14:21,607 --> 00:14:24,270
a lot of this is just kind of like advanced statistics,
256
00:14:24,271 --> 00:14:27,773
um, that's, that's kind of like an oversimplification,
257
00:14:27,793 --> 00:14:30,316
but it's, I think it's one that really can hold true,
258
00:14:30,796 --> 00:14:34,440
and also I should caution, like, this is not the only way to solve
259
00:14:34,700 --> 00:14:38,404
like sepsis prediction, I'm just kind of identifying like a use case,
260
00:14:38,405 --> 00:14:41,137
uh, or a, like a methodology.
261
00:14:41,918 --> 00:14:44,720
Now, the next thing I'm going to kind of get into here is what's
262
00:14:44,741 --> 00:14:47,083
called deep learning. Now,
263
00:14:47,084 --> 00:14:49,485
deep learning uses a neural architecture,
264
00:14:50,426 --> 00:14:53,970
and, uh, or is derived initially from a neural architecture,
265
00:14:54,831 --> 00:14:57,994
and this, I want to kind of set in
266
00:14:58,114 --> 00:15:01,445
contrast with machine learning. Deep learning is where we're letting the computer
267
00:15:01,446 --> 00:15:06,345
choose the pattern, and a
268
00:15:06,346 --> 00:15:10,905
common example of this is basically having a computer
269
00:15:10,906 --> 00:15:13,205
identify,
270
00:15:13,206 --> 00:15:15,685
uh, whether a cell is a tumor cell or not.
271
00:15:15,686 --> 00:15:19,275
Now, really trying to drive at this important
272
00:15:19,276 --> 00:15:21,758
distinction. Previously in the machine learning models,
273
00:15:21,898 --> 00:15:25,121
I'm picking kind of like the features.
274
00:15:25,122 --> 00:15:27,924
I'm picking what the computer is identifying as the pattern.
275
00:15:28,664 --> 00:15:33,249
Here, in this model, I'm giving the computer lots
276
00:15:33,269 --> 00:15:37,653
of training data, which has been labeled normal or tumor,
277
00:15:38,234 --> 00:15:42,368
and I'm telling the computer if it's gotten the answer right after
278
00:15:42,388 --> 00:15:46,973
it takes a random guess, and I've built in a system so
279
00:15:46,953 --> 00:15:51,197
that it kind of self-regulates so that it can learn from its mistakes,
280
00:15:51,497 --> 00:15:55,161
but I'm not ultimately identifying the pattern,
281
00:15:55,421 --> 00:15:57,523
right? Like, I'm giving it a very high level,
282
00:15:57,824 --> 00:16:01,037
saying like, here's, here's the first tumor, here's not tumor,
283
00:16:01,417 --> 00:16:05,982
but I'm not going that extra layer deeper and saying this is the
284
00:16:06,042 --> 00:16:08,464
thing that defines a tumor, right?
285
00:16:08,544 --> 00:16:11,667
I'm not saying, oh, well, when you see cells that look like
286
00:16:11,708 --> 00:16:13,950
this, this is really what defines a tumor.
287
00:16:14,691 --> 00:16:17,253
I'm just giving it this labeled data set of normal,
288
00:16:17,313 --> 00:16:21,948
not normal. And I'm letting it figure out why something
289
00:16:21,949 --> 00:16:25,792
is a tumor. And if that doesn't mean a whole heck of a lot
290
00:16:25,812 --> 00:16:28,334
right now, in my second video,
291
00:16:28,394 --> 00:16:33,379
I'm going to really deep dive on that specific element of like the
292
00:16:33,980 --> 00:16:36,602
importance of, uh,
293
00:16:36,622 --> 00:16:38,724
the computer defining
294
00:16:38,705 --> 00:16:41,077
the pattern.
295
00:16:41,417 --> 00:16:45,421
In many ways, this is how come we sometimes call an AI a black
296
00:16:45,521 --> 00:16:48,845
box. Because, in deep learning,
297
00:16:49,225 --> 00:16:53,810
we don't necessarily care about
298
00:16:54,070 --> 00:16:56,072
why the computer came out with the answer,
299
00:16:56,073 --> 00:16:59,575
we're more concerned with just the computer coming out with the right answer.
300
00:16:59,576 --> 00:17:02,148
And that has a lot of implications.
301
00:17:02,149 --> 00:17:06,672
So, everything kind of nested into this circle is going
302
00:17:06,692 --> 00:17:10,516
to have a bit of that uh sting to it,
303
00:17:10,517 --> 00:17:13,780
if you will, right? Like, as we go deeper in these circles,
304
00:17:14,240 --> 00:17:18,704
everything in deep learning will kind of have a bit of this black boxy-ness
305
00:17:18,985 --> 00:17:22,959
to its nature to it because we're going to be predominantly utilizing these neural
306
00:17:23,019 --> 00:17:26,202
architectures and even if it's a variant of it,
307
00:17:26,262 --> 00:17:29,605
it's still going to have a lot of what we sometimes call hidden
308
00:17:29,625 --> 00:17:33,890
layers. So, while we're going down this route,
309
00:17:34,250 --> 00:17:37,493
we can have a really good understanding of,
310
00:17:37,513 --> 00:17:40,787
if we give this input, I feel very confident I will
311
00:17:40,788 --> 00:17:45,752
get this output, but it doesn't necessarily explain why we're getting that output.
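Here's a minimal sketch of that contrast: we hand a small neural network labeled examples and let it find its own pattern in hidden layers we never inspect. The per-cell measurements below are synthetic, not real imaging features:

```python
# A minimal sketch of the deep-learning contrast: labeled examples in,
# hidden layers we never inspect, a prediction out. The per-cell
# "measurements" below are synthetic, not real imaging features.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Hypothetical features per cell: [size, irregularity]
normal = rng.normal([10.0, 0.2], [2.0, 0.1], size=(200, 2))
tumor = rng.normal([16.0, 0.6], [3.0, 0.2], size=(200, 2))
X = np.vstack([normal, tumor])
y = np.array([0] * 200 + [1] * 200)  # 0 = normal, 1 = tumor

# Two hidden layers: the "black box" part. We never tell the network
# what defines a tumor, only whether each labeled example is one.
net = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000,
                    random_state=1).fit(X, y)

print(net.predict([[17.0, 0.7]]))  # likely [1]: flagged as tumor
```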
312
00:17:45,832 --> 00:17:48,134
And that's, that's going to be really important as we kind of go forward
313
00:17:48,114 --> 00:17:52,939
here. So, AI that can give open-ended outputs
314
00:17:53,439 --> 00:17:55,882
is many times considered generative AI.
315
00:17:55,942 --> 00:18:00,476
So, an example of this that isn't related to language is taking
316
00:18:00,456 --> 00:18:03,539
a bunch of 2D imaging and making it 3D.
317
00:18:03,720 --> 00:18:06,362
And I think that's gonna be a really cool application inside the world of
318
00:18:06,502 --> 00:18:10,867
medicine. Within the world of language models,
319
00:18:11,127 --> 00:18:13,870
we're inputting language and outputting language.
320
00:18:13,910 --> 00:18:16,052
And I want to add that little asterisk there, right?
321
00:18:16,072 --> 00:18:19,575
Like there's no guaranteeing on the understanding.
322
00:18:19,576 --> 00:18:24,270
Language models traditionally were derived to sound
323
00:18:24,290 --> 00:18:26,993
good. That's how they work.
324
00:18:26,994 --> 00:18:30,636
We've kind of entered this new era where we're trying to make them accurate.
325
00:18:30,637 --> 00:18:33,219
But the goal was that,
326
00:18:34,300 --> 00:18:36,642
I would almost kind of call it like the soap opera,
327
00:18:36,622 --> 00:18:40,656
like the doctor soap opera. Like, it doesn't really matter if they're saying anything
328
00:18:40,657 --> 00:18:42,899
that makes, like that actually makes sense.
329
00:18:43,139 --> 00:18:45,421
It's, it just, it has to sound like it makes sense.
330
00:18:45,501 --> 00:18:48,965
Or it's kind of like if you watch like Star Trek and they're talking
331
00:18:48,966 --> 00:18:51,187
about like flux capacitors. It's like, you know,
332
00:18:51,528 --> 00:18:55,351
it matters more that it like sounds right than it being right.
333
00:18:55,452 --> 00:18:58,234
Like the physics doesn't actually have to add up, it just has to sound
334
00:18:58,235 --> 00:19:02,789
like it does. And foreshadowing what we're going to get into
335
00:19:02,869 --> 00:19:07,854
later, this is how come certain applications of AI are so dangerous,
336
00:19:08,194 --> 00:19:10,356
because its goal is to sound right.
337
00:19:10,396 --> 00:19:13,259
It's not to be right. So you have to really be kind of
338
00:19:13,279 --> 00:19:16,142
like careful. So, alright,
339
00:19:16,262 --> 00:19:19,235
taking a step back and more thinking of just, like, an application
340
00:19:19,236 --> 00:19:23,980
of this in the world of medicine. A very common application is trying
341
00:19:24,000 --> 00:19:26,402
to structure unstructured data.
342
00:19:26,403 --> 00:19:29,846
So, I think this is a medspaCy or spaCy example.
343
00:19:30,266 --> 00:19:33,369
We've used this a lot. And it's, uhm,
344
00:19:33,389 --> 00:19:36,552
there's a lot of different tools that are related to
345
00:19:36,553 --> 00:19:40,411
language models that are not built on the Transformer architecture
346
00:19:40,412 --> 00:19:44,990
that allow you to kind of classify elements of data.
347
00:19:44,991 --> 00:19:47,533
And when something like this shows up,
348
00:19:47,593 --> 00:19:50,616
it's very helpful to have these classifications.
349
00:19:50,617 --> 00:19:54,040
Because a computer, prior to all these labels showing up,
350
00:19:54,041 --> 00:19:56,582
doesn't really know how to, doesn't know what to do with this sentence.
351
00:19:57,103 --> 00:19:59,345
I could build some triggers in, uhm,
352
00:19:59,346 --> 00:20:02,478
later so that, if a medication is identified,
353
00:20:02,479 --> 00:20:04,801
for example, something else happens.
354
00:20:04,802 --> 00:20:09,105
Like maybe, maybe I want to build like an, a medication inventory management system
355
00:20:09,505 --> 00:20:12,408
based off of conversations that are taking place.
356
00:20:12,709 --> 00:20:15,471
Well, I need to be able to identify medications being stated,
357
00:20:15,491 --> 00:20:17,774
right? Because I could say like, if no medication stated,
358
00:20:17,794 --> 00:20:19,926
not important. But if medication is indicated,
359
00:20:20,026 --> 00:20:22,769
important. And so this is kind of like where I'd be using that.
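Here's a minimal sketch of that medication trigger using spaCy-style entity recognition. The MEDICATION label assumes a clinical pipeline along the lines of medspaCy; stock spaCy models emit generic labels, so treat this as illustrative rather than a drop-in pipeline:

```python
# A minimal sketch of structuring unstructured text with entity
# recognition. The MEDICATION label assumes a clinical model (e.g. a
# medspaCy pipeline); stock spaCy models emit generic labels instead.
import spacy

nlp = spacy.load("en_core_web_sm")  # swap in a clinical model here

note = "Bella was prescribed carprofen 75 mg twice daily for arthritis."
doc = nlp(note)

# Each entity gets a label the rest of the system can act on
for ent in doc.ents:
    print(ent.text, "->", ent.label_)

# The trigger idea from above: only act when a medication shows up
meds = [ent.text for ent in doc.ents if ent.label_ == "MEDICATION"]
if meds:
    print("update inventory for:", meds)  # hypothetical downstream step
```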
360
00:20:22,770 --> 00:20:25,211
Alright, so finally,
361
00:20:25,251 --> 00:20:27,253
we've gotten to transformers. So,
362
00:20:27,653 --> 00:20:32,518
kind of like looking backwards, right? So transformers are very clearly
363
00:20:33,579 --> 00:20:35,982
in the world of large language models. And I would,
364
00:20:36,142 --> 00:20:38,725
I would say the graph is not perfect because,
365
00:20:38,726 --> 00:20:43,279
uh, transformers are not exclusive to the world of large
366
00:20:43,280 --> 00:20:45,882
language models. Uh, they can be used in other things.
367
00:20:45,883 --> 00:20:48,925
So I'm just gonna give that preface here. This is not a perfect visual.
368
00:20:48,926 --> 00:20:53,529
Um, so transformers though can be
369
00:20:53,549 --> 00:20:56,472
used in large language models, and specifically
370
00:20:56,572 --> 00:20:58,795
in this generative AI world, right?
371
00:20:58,815 --> 00:21:01,948
So, like, where we're generating this kind of input-output,
372
00:21:02,248 --> 00:21:06,332
right? Where it's not fixed. Where fixed was more like I'm giving a prediction
373
00:21:06,352 --> 00:21:08,835
or a classification, right? It could be a bit more than that.
374
00:21:09,395 --> 00:21:12,218
And we're inside of this world of deep learning where we're gonna be using
375
00:21:12,219 --> 00:21:16,402
this neural architecture where the middle part of it may be hidden.
376
00:21:16,482 --> 00:21:18,925
We're much more concerned with how accurate this
377
00:21:18,926 --> 00:21:21,517
system is, not its explainability,
378
00:21:21,858 --> 00:21:26,542
right? But we're still inside of this world of machine learning where we're
379
00:21:26,663 --> 00:21:31,287
learning constantly from the data and we're applying these algorithms and kind of all
380
00:21:31,327 --> 00:21:34,771
scoped within AI where we're taking,
381
00:21:34,971 --> 00:21:39,355
you know, big picture, where we're implementing mathematical algorithms
382
00:21:39,356 --> 00:21:42,128
using a computer, trained on data.
383
00:21:42,288 --> 00:21:46,632
That's kind of like where we are right here, right? So as I've stated,
384
00:21:46,873 --> 00:21:51,257
GPT is that generative pre-trained transformer.
385
00:21:51,317 --> 00:21:55,401
So what we're what we're talking about here is that it's generative,
386
00:21:55,402 --> 00:21:58,044
right? It's inside of that world of AI,
387
00:21:58,084 --> 00:22:01,496
generative AI. It's not just a prediction.
388
00:22:01,476 --> 00:22:04,639
It's not like a single prediction. It's actually generating content for us.
389
00:22:04,980 --> 00:22:09,985
And it's pre-trained in that like we have gone out and run a lot
390
00:22:10,045 --> 00:22:12,427
of training data over this, right?
391
00:22:12,908 --> 00:22:17,893
And we've trained the model so that it's actually smart and intelligent and accurate,
392
00:22:17,933 --> 00:22:21,185
right? Because we're not concerned about that explainability.
393
00:22:21,186 --> 00:22:24,065
We're concerned about the accuracy. So we've put a lot of time,
394
00:22:24,066 --> 00:22:27,125
energy, and effort into training it to get it pre-trained.
395
00:22:27,665 --> 00:22:30,665
And it's a transformer, and I'll get into that in the next video,
396
00:22:30,845 --> 00:22:33,505
but, like, why a transformer, right? We're building up to that.
397
00:22:33,845 --> 00:22:36,125
And GPT, right?
398
00:22:36,385 --> 00:22:39,125
This is a specific type of transformer, right? So just,
399
00:22:39,126 --> 00:22:42,015
just to be clear on this.
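For a hands-on feel, here's a minimal sketch that runs GPT-2, a small open generative pre-trained transformer, through Hugging Face's transformers library (assuming transformers and torch are installed):

```python
# A minimal sketch of a generative pre-trained transformer in action,
# using the small open GPT-2 model via Hugging Face's pipeline API.
# (Assumes `pip install transformers torch`.)
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The patient presented with lethargy and"
result = generator(prompt, max_new_tokens=20, num_return_sequences=1)
print(result[0]["generated_text"])

# Note: the continuation is generated to *sound* plausible, which is
# exactly the "sounds right vs. is right" caution raised earlier.
```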
400
00:22:43,315 --> 00:22:47,135
Alright. So with this, I feel like we have a good understanding of what
401
00:22:47,136 --> 00:22:49,935
is meant by AI. This is the chart I would want you all
402
00:22:49,936 --> 00:22:54,115
to think of. Umm. There are other spaces that I have not touched.
403
00:22:54,415 --> 00:22:58,995
It's a deep, deep topic. You know, people spend, like, PhDs taking a slice
404
00:22:58,996 --> 00:23:01,407
out of this, not, like, covering the whole thing.
405
00:23:01,408 --> 00:23:03,689
Umm.
406
00:23:03,709 --> 00:23:06,312
So. I want to get into why now.
407
00:23:06,332 --> 00:23:10,476
Which is the other real kind of important aspect inside of this gen AI
408
00:23:10,516 --> 00:23:15,501
system. So, OpenAI is really kind of like the primary
409
00:23:15,742 --> 00:23:19,185
one that I believe should be credited for everything that's kind of
410
00:23:19,235 --> 00:23:23,775
happening here. So Google's research and development team has been
411
00:23:23,776 --> 00:23:28,455
coming out with AI papers for probably decades but like especially
412
00:23:28,456 --> 00:23:31,035
over the past like 10, 15 years.
413
00:23:31,535 --> 00:23:35,195
They were kind of like the predominant place and they wanted to be the
414
00:23:35,196 --> 00:23:37,595
predominant place where AI research was done.
415
00:23:38,615 --> 00:23:43,225
And so, all of OpenAI's success is actually based off of
416
00:23:43,226 --> 00:23:46,225
a paper that Google published in 2017.
417
00:23:46,885 --> 00:23:50,785
Umm. Now, OpenAI was founded prior to 2017.
418
00:23:51,065 --> 00:23:54,885
I believe it was founded in 2015. And it was founded when Sam Altman
419
00:23:54,886 --> 00:23:59,245
went with, I believe, Elon Musk to Palo Alto and tried to
420
00:23:59,246 --> 00:24:02,879
convince, uh, Google engineers, AI engineers,
421
00:24:03,159 --> 00:24:07,724
to leave Google and start a company. And umm no one was really interested
422
00:24:07,725 --> 00:24:12,709
except for, umm, one individual,
423
00:24:12,729 --> 00:24:17,433
and he left with Sam Altman, and they bet
424
00:24:17,654 --> 00:24:21,107
very strongly on, uhh, the transformer.
425
00:24:21,527 --> 00:24:24,410
And, uhh,
426
00:24:24,411 --> 00:24:27,633
they did try a few other things; it was not like they just
427
00:24:27,634 --> 00:24:30,676
went transformer zero to one and that was that.
428
00:24:30,677 --> 00:24:32,779
Umm,
429
00:24:32,799 --> 00:24:36,522
they experimented with a number of other types of AI that were really
430
00:24:36,523 --> 00:24:41,117
cutting edge. But, umm, the one that they really focused
431
00:24:41,137 --> 00:24:45,842
in on and really put a substantial bet into much more so than anyone,
432
00:24:45,843 --> 00:24:49,005
right? Cause this goes back to the training part, right? Like, training costs money.
433
00:24:49,225 --> 00:24:51,487
You're spending time, energy, effort,
434
00:24:51,488 --> 00:24:55,171
but more importantly computing power to train these computers.
435
00:24:55,511 --> 00:24:58,134
And at very large scales that becomes very,
436
00:24:58,154 --> 00:25:01,828
very expensive. And so they made a very large bet with their time,
437
00:25:01,928 --> 00:25:05,591
energy, effort, money on this transformer model.
438
00:25:05,872 --> 00:25:10,176
And they kept doubling down and doubling down. And they spent millions on the
439
00:25:10,177 --> 00:25:15,081
training. And after seven years uh with no revenue they became
440
00:25:15,082 --> 00:25:19,775
an overnight success. Where this model was able to be proven out as incredibly
441
00:25:19,776 --> 00:25:24,795
powerful. And just for like context like I remember being a graduate student
442
00:25:24,796 --> 00:25:29,255
in 2017 and this paper being talked about, and there being, like, some
443
00:25:29,256 --> 00:25:32,555
buzz around it. But there's like a hundred papers around all the time.
444
00:25:32,695 --> 00:25:35,035
It was like a running joke that every week there's a new cutting edge
445
00:25:35,036 --> 00:25:38,055
paper in AI technology.
446
00:25:38,535 --> 00:25:40,685
It's so hard to kind of see this through the noise.
447
00:25:40,905 --> 00:25:43,165
I really do want to give credit where credit's due for being able to
448
00:25:43,166 --> 00:25:45,325
bet so heavily on this and be right. Right?
449
00:25:45,565 --> 00:25:48,425
Like, that takes quite the vision. Umm.
450
00:25:49,005 --> 00:25:53,765
And in addition to that, like, the early results of these tools, I
451
00:25:53,766 --> 00:25:57,845
remember seeing some of the results of some of these large language models, 2018
452
00:25:57,846 --> 00:26:02,597
specifically. I did not see them going
453
00:26:02,598 --> 00:26:06,461
anywhere, like, personally. I was very kind of unimpressed.
454
00:26:07,082 --> 00:26:10,806
I think there were many other interesting aspects of AI which is kind of
455
00:26:10,807 --> 00:26:15,471
where I had focused originally. Umm uh and this
456
00:26:15,491 --> 00:26:18,955
is this is something that I would have never guessed.
457
00:26:19,225 --> 00:26:24,065
Until 2022 when I started seeing this stuff launch and come out I
458
00:26:24,066 --> 00:26:28,505
was like wow they really they really found the winning ticket with this architecture.
459
00:26:29,265 --> 00:26:34,065
Umm, I would imagine, for those, like, in the medical world,
460
00:26:34,066 --> 00:26:37,025
just to kind of help understand maybe how this feels.
461
00:26:37,205 --> 00:26:39,586
It probably feels very similar to like drug discovery.
462
00:26:39,587 --> 00:26:44,100
Where there's all these different ideas and they could work this could work this
463
00:26:44,080 --> 00:26:47,964
could work you just really don't know until you try it and it takes
464
00:26:47,965 --> 00:26:49,986
time, energy, money to do it right.
465
00:26:50,566 --> 00:26:55,391
So no surprise that they raised ten billion dollars from Microsoft.
466
00:26:55,392 --> 00:26:59,095
Umm and they're worth close to a hundred billion dollars now.
467
00:26:59,115 --> 00:27:03,628
Yeah. Umm but I think the bigger thing
468
00:27:04,209 --> 00:27:09,154
is a rising tide lifts all ships so they've
469
00:27:09,174 --> 00:27:13,118
raised that amount of money and the world is going to be dumping in
470
00:27:13,679 --> 00:27:18,264
much, much more money. These lines, for reference, this is the most up
471
00:27:18,284 --> 00:27:20,435
to date I could get. But, like, this is kind of old.
472
00:27:20,775 --> 00:27:22,795
This is back 2022 effectively.
473
00:27:23,375 --> 00:27:25,455
This has taken off even further.
474
00:27:25,935 --> 00:27:30,835
Like I think there's specific VC funds that have invested basically the
475
00:27:30,836 --> 00:27:33,235
amount that the entire U.S.
476
00:27:33,575 --> 00:27:37,615
did at this point. It's the amount of money that has gone into
477
00:27:37,616 --> 00:27:42,368
AI over the past two years is astronomical.
478
00:27:42,929 --> 00:27:47,794
And think about it: money going into AI is
479
00:27:47,795 --> 00:27:52,078
like money going into drug discovery. It's like, you're not just gonna keep
480
00:27:52,079 --> 00:27:56,603
getting one better drug, right? Like, the whole field
481
00:27:56,643 --> 00:27:59,255
is going to kind of lift with it, because
482
00:27:59,256 --> 00:28:03,860
when the money goes in, there's more experimentation that can happen,
483
00:28:03,880 --> 00:28:09,005
and there's been a backlog of papers
484
00:28:09,305 --> 00:28:12,468
and research done. And so there's just kind of like an infinite number of
485
00:28:12,488 --> 00:28:14,831
things to test and review and research.
486
00:28:15,431 --> 00:28:18,655
And that's really where we're at. It's just there's not enough AI engineers to
487
00:28:18,656 --> 00:28:20,987
go around. And so they're training, you know,
488
00:28:21,027 --> 00:28:24,991
training the models to become AI engineers to help with the problem.
489
00:28:25,031 --> 00:28:29,936
So umm. Anywho, this is the last slide I have for this part.
490
00:28:29,937 --> 00:28:34,581
Um, next we're going to be diving into neural networks and
491
00:28:34,582 --> 00:28:38,164
really diving into the transformer and going under the hood and really kind of
492
00:28:38,165 --> 00:28:41,061
giving a nice foundation for understanding all of that.
493
00:28:41,322 --> 00:28:45,215
So I hope that this was helpful and, umm, thanks for listening.