See all John Savill's Technical Training transcripts on Youtube


AZ-305 Designing Microsoft Azure Infrastructure Solutions Study Cram - Over 100,000 views

3 hours 38 minutes 34 seconds

Speaker 1

00:00:00 - 00:00:25

Hey everyone, in this video I want to provide an AZ-305 study cram. I want to look at what is the path to actually get to the Azure Architect Solutions Expert Certification, what to expect in the exam, and then cover at a high level all of the different knowledge you will actually need. Something you could watch just before taking it. As always please like, subscribe, comment and share.

Speaker 1

00:00:25 - 00:00:32

A huge amount of work goes into these videos, especially this one, so please do hit that bell icon. So our focus is all about

Speaker 2

00:00:34 - 00:00:34

AZ-305.

Speaker 1

00:00:37 - 00:00:41

So that's really that final exam towards that architect

Speaker 2

00:00:45 - 00:00:47

certification. Now this is part of

Speaker 1

00:00:47 - 00:00:59

the new path to actually get there and to get the architect cert we need 2 exams. So the AZ-305 is the second exam but what we do first is the

Speaker 2

00:01:01 - 00:01:10

AZ-104. So that is kind of the infrastructure administration exam. So those 2 things together,

Speaker 1

00:01:10 - 00:01:28

if I've done the Azure Administrator Associate, i.e. AZ-104, and then I take AZ-305, I get that architect certification. I already have a full playlist of prep for the AZ-104 and my assumption for this video is

Speaker 2

00:01:28 - 00:01:38

you have watched that already. You have gone through all of that study. There's a lot of videos in there, including my masterclass videos, and then I've got a study cram.

Speaker 1

00:01:38 - 00:02:03

So I'm assuming you've done all of that. So you take the AZ-104 admin, then you do additional study more around the architecture components for the AZ-305. And this isn't particularly that much more complex. It's not focused on how you do the things, it's focused on which components do I need to use. Your starting point, as always, should really be the MS Learn.

Speaker 1

00:02:04 - 00:02:30

So right now it is in beta. So if you go to the AZ-305 page, it kind of talks through that path that I just talked about. I.e., hey look, take that Azure Administrator Associate cert, then pass this 305. You can go and schedule the exam. It talks about the key areas, the skills measured, and then how to prepare.

Speaker 1

00:02:30 - 00:03:00

So go through this free learning path. Now I'm going to break this down based on its learning paths so you can kind of refresh. Notice it finishes with the, hey, Microsoft Azure Well-Architected Framework. There's no rocket science to that; it really just builds on some of the key components we think about with architecture. But I will frame this study cram in the same way as those Learn modules just to help out.

Speaker 1

00:03:01 - 00:03:19

I do have a playlist for this AZ-305. Again, it doesn't have the same content as the 104. I'm expecting you've done that already. These are additional things that maybe go into more detail, maybe more architecture related, that go to the AZ-305. A key point.

Speaker 1

00:03:20 - 00:03:32

Yes, Microsoft are changing the path to get to architect, but if you have it already, this does not apply to you. If you have it already, you just go through that annual renewal.

Speaker 2

00:03:32 - 00:03:44

You don't have to take 305 at all. If you've got your architect expert, you're done. You're just gonna do that annual renewal. This is only if you don't have it yet. Now I did sit the AZ-305

Speaker 1

00:03:45 - 00:03:56

just because I was trying to get an understanding of what is the exam like. I did not need to take it because I already have the Architect Cert. So again, you don't have to take this. I took it just so I knew what to expect, so

Speaker 2

00:03:56 - 00:04:00

I could help create this exam cram for all of you.

Speaker 1

00:04:01 - 00:04:25

Now, you got 2 hours to take the exam. I had 61 questions. Now, out of those, I think I had 3 case studies. Each case study had 4 to 5 questions. Now, remember, a case study, typically it's, hey, the scenario, current state, and then requirements, business, technical, and that detail is going to vary, but I did find them quite clear.

Speaker 1

00:04:25 - 00:04:38

It wasn't like the network certification, where they were all intertwined and you were jumping about 50 different pages. The questions for the case studies really refer to a particular page of the case study. So go through the case study first, have

Speaker 2

00:04:38 - 00:04:57

a quick browse through, look for key details, then read the question, and it's typically gonna direct you to a particular page of the case study materials. Hey, app 1 has these technical requirements, how would you meet them? And then a list of options. So, oh okay, I need to go and look at app one's technical requirements.

Speaker 1

00:04:58 - 00:05:03

In addition, there were just regular questions. And these are just, hey,

Speaker 2

00:05:03 - 00:05:20

there's a list of components, which of these would solve that problem. Maybe it's, hey, you're using these components, how many instances of that component would you need? I.e. you need to know its limits, what its boundaries are, maybe it's subscription, maybe it's region, so you need to say, how many do I need of these?

Speaker 1

00:05:21 - 00:05:29

You do have the ones where it's a problem statement. They give you, hey, we're trying to solve this, then they give you a solution,

Speaker 2

00:05:29 - 00:05:57

and you have to say, would this meet the requirement or not? Once you answer, you can't go back, because they're gonna give you that same problem statement again, 2 or 3 times, with different answers, and you have to say, would this meet it? So you'll see a new one might say, oh no, that was wrong, so you cannot go back for those. They're not in order either; they don't get better as they go through. It might be that the correct answer was first, and they get worse. So just look at the problem statement, look at, hey, the solution

Speaker 1

00:05:57 - 00:06:08

they're offering, think all the different scenarios through, does this meet it or not? Multiple ones might meet it, maybe none of them meet it, but you can't go back, so just think about this.

Speaker 2

00:06:09 - 00:06:15

It took me 50 minutes. Now I rushed through, because again, I didn't care, particularly about the exam,

Speaker 1

00:06:15 - 00:06:35

but I did answer everything, and I thought it was a fairly simple exam. There was nothing super complex about it. At the end of the day, realise there's no trick questions in here. Azure as a solution is designed to be usable by the public. They're not gonna name things in

Speaker 2

00:06:35 - 00:06:37

a tricky way. They're not gonna hide things to try

Speaker 1

00:06:37 - 00:06:42

and catch you out. So if ever you're stuck, try and eliminate the obviously wrong things

Speaker 2

00:06:43 - 00:06:58

and then just think, what is the most logical way to solve this? If I was gonna solve this, how would I architect it? That's probably gonna be the right answer. So generally there's a couple of answers you can just eliminate. How would I architect this?

Speaker 2

00:06:58 - 00:07:05

Is it cheese? Definitely not cheese. It's not a hot dog. Oh, it's a virtual network. Okay, it's probably a virtual network.

Speaker 1

00:07:05 - 00:07:13

So you can eliminate probably a couple of obviously wrong things, and just go for the 1 you want. So again, don't panic with these things,

Speaker 2

00:07:13 - 00:07:15

it's an exam. I'll do

Speaker 1

00:07:15 - 00:07:26

some things at the end, but just take your time, relax, and yeah. So let's get down to the review of the particular areas. Now, if you

Speaker 2

00:07:26 - 00:07:41

saw my 104 study cram, you know the board started to fail because the new Whiteboard app has scale limitations. So I'm going to create a new board for each of the key sections, but I'll try and bring them all together at the end for 1 download.

Speaker 1

00:07:41 - 00:07:45

So what we're thinking about here is the whole designing identity,

Speaker 2

00:07:47 - 00:07:52

governance, and monitoring. So we'll look at that first

Speaker 1

00:07:52 - 00:08:02

from a review perspective. If I think about Azure management, there are really 4 key constructs we ever think about.

Speaker 2

00:08:02 - 00:08:08

Now the first thing we have is Azure AD itself. So Azure Active Directory, we

Speaker 1

00:08:08 - 00:08:15

have a particular tenant that is the identity provider for the environment. Now, in terms

Speaker 2

00:08:15 - 00:08:37

of management constructs, we have management groups. So remember, there's always a root management group directly under the Azure AD tenant, but then I can have a hierarchy of these. So I can have this whole hierarchy of management groups. So we have the whole idea of management groups. And then that can be

Speaker 1

00:08:37 - 00:08:44

up to 6 levels deep, not including the root or the subscriptions. And we can use those for various different things.

Speaker 2

00:08:45 - 00:09:02

So at management groups, I can apply things like policy, i.e. what you can do. I can have things like role-based access control: who. I can have things like budgets: how much?

Speaker 2

00:09:02 - 00:09:10

So I have those things, those constructs. Ultimately then I get a subscription. So I'm gonna have some subscription,

Speaker 1

00:09:13 - 00:09:23

which is really that idea of a logical container; it's a unit of management. Now I have those same options for the subscription as well.

Speaker 2

00:09:23 - 00:09:39

For the subscription as well, hey, I can apply policy, role-based access control, and budget. And all of these things inherit down. So if I was to put a policy, for example, at the root or some high level management group, it would get inherited down to

Speaker 1

00:09:39 - 00:09:41

the child management groups to

Speaker 2

00:09:41 - 00:10:01

the subscriptions. Within the subscription, I can create 1 or more resource groups, which again, I can apply policy, RBAC, budget, and I can have multiple resource groups in a subscription. I can have other resource groups. I can

Speaker 1

00:10:01 - 00:10:23

have lots of resource groups. A resource group is not a boundary of communication. I could have resources in here using resources in other resource groups. So it's not a boundary of any kind of communication. When I think about architecting these constructs, realize resource groups typically I'm gonna put things in together that have

Speaker 2

00:10:23 - 00:10:39

a common life cycle. They're gonna get created together, they're gonna get deleted together, they're gonna run together. So I think common life cycle. Typically I'm gonna give people the same sets of permissions to all the things in a resource group.

Speaker 1

00:10:40 - 00:10:45

How many subscriptions do I have? It varies. Realize that subscription

Speaker 2

00:10:45 - 00:11:01

is a boundary for certain types of things. A virtual network, for example, lives within a certain subscription. I might have a core subscription for core services, like my ExpressRoute connections, my domain controllers. Subscriptions do have limits,

Speaker 1

00:11:02 - 00:11:08

so maybe I have to have multiple subscriptions because of some limit I'm hitting. But I don't want to have too many subscriptions.

Speaker 2

00:11:08 - 00:11:17

So as part of my design, I think about, okay, what's the right number based on the various requirements? Realize resource groups as well. I have

Speaker 1

00:11:17 - 00:11:18

a lot of the same capabilities.

Speaker 2

00:11:19 - 00:11:24

So if I can, hey, I'd rather use a resource group. When I create a resource group, it

Speaker 1

00:11:24 - 00:11:29

does get created in a region. You pick a region, but that region is

Speaker 2

00:11:29 - 00:11:56

just where the metadata for that resource group is stored. It does not impact the resources I can put in it. So if I create a resource group in East US, I could put resources in West US inside that resource group. So if I was trying to limit, hey, where can you create resources, putting a policy that only targets where resource groups can get created wouldn't do enough. I need to make sure, hey, I'm limiting all types of resource.

Speaker 2

00:11:56 - 00:12:05

I could apply it to the resource group because it gets inherited down, but I wouldn't be targeting where can resource groups get created. I'll be targeting where can resources get created.
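As a rough sketch of the kind of guardrail being described here, an "allowed locations" style Azure Policy definition looks something like the following. This is an illustrative sketch rather than the exact built-in policy; note that "mode": "Indexed" evaluates resource types that support location and tags, which matches the point above about targeting where resources, not resource groups, get created.

```json
{
  "properties": {
    "displayName": "Allowed locations (sketch)",
    "mode": "Indexed",
    "parameters": {
      "listOfAllowedLocations": {
        "type": "Array",
        "metadata": { "displayName": "Allowed locations", "strongType": "location" }
      }
    },
    "policyRule": {
      "if": {
        "not": { "field": "location", "in": "[parameters('listOfAllowedLocations')]" }
      },
      "then": { "effect": "deny" }
    }
  }
}
```

Assigned at a management group or subscription, this gets inherited down to everything beneath that scope.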

Speaker 1

00:12:05 - 00:12:06

So let's say

Speaker 2

00:12:06 - 00:12:20

if I wanted to limit it to a certain place. So a resource group lives in a region, but that's just its metadata. It has no impact on what can actually be inside it. A resource can be in one and only one resource group.

Speaker 1

00:12:20 - 00:12:30

I can't nest resource groups, it is a flat structure. So those are kind of the key constructs we think about. We're gonna come back

Speaker 2

00:12:30 - 00:12:42

to these, but really we think about: policy is kind of what I can do, RBAC is who can do it, budget is really how much.

Speaker 1

00:12:43 - 00:12:48

So think about those constructs. If we see questions like, hey, we need to control

Speaker 2

00:12:49 - 00:13:03

where you can create things. We need to control only this type of something. Well, that's gonna be a policy. And then what level depends on, hey, what is the scope they're asking you to actually do those things for? Policies are basically guardrails.

Speaker 2

00:13:04 - 00:13:16

I can group policies into initiatives and then apply the initiative to 1 of these scopes. Subscription, management group, resource group etc. Again it gets inherited down.
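The grouping of policies into an initiative can be sketched as a policy set definition. This is a hedged sketch; the display name and the two policy definition IDs are placeholders for whichever built-in or custom policies you want to bundle.

```json
{
  "properties": {
    "displayName": "Baseline guardrails (sketch)",
    "policyDefinitions": [
      { "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/<allowed-locations-id>" },
      { "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/<allowed-vm-skus-id>" }
    ]
  }
}
```

The initiative is then assigned once at a management group, subscription, or resource group, and compliance is tracked against the set as a whole.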

Speaker 1

00:13:20 - 00:13:36

Also we do have resource tags. So resource tags, again, it's just metadata. It's some key value I can assign. It can be really useful, for example, to track maybe certain aspects of my management structure. It might be maybe a cost center.

Speaker 1

00:13:36 - 00:14:07

It might be a creation date I put on things. They are not inherited. So if I create a tag at the resource group, it is not inherited by the resources inside it, But we can use policy to accomplish that. So if we jump over for a quick second, if we go and actually look at policy, if I just search for policy, and again it's really doing 2 things. So policy I can use to enforce, to actually control.
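The tag-inheritance approach described here, copying a tag from the resource group down onto resources, can be sketched as a "modify" effect policy. This is a hedged sketch of the idea rather than the exact built-in definition; the role definition ID is a placeholder for the role the policy's managed identity uses when remediating (typically Contributor).

```json
{
  "properties": {
    "displayName": "Inherit a tag from the resource group (sketch)",
    "mode": "Indexed",
    "parameters": {
      "tagName": { "type": "String" }
    },
    "policyRule": {
      "if": {
        "allOf": [
          { "field": "[concat('tags[', parameters('tagName'), ']')]", "exists": "false" },
          { "value": "[resourceGroup().tags[parameters('tagName')]]", "notEquals": "" }
        ]
      },
      "then": {
        "effect": "modify",
        "details": {
          "roleDefinitionIds": [
            "/providers/Microsoft.Authorization/roleDefinitions/<contributor-role-id>"
          ],
          "operations": [
            {
              "operation": "add",
              "field": "[concat('tags[', parameters('tagName'), ']')]",
              "value": "[resourceGroup().tags[parameters('tagName')]]"
            }
          ]
        }
      }
    }
  }
}
```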

Speaker 1

00:14:07 - 00:14:26

It's the guardrails to say what you can do. But I can also use it for compliance checking. I can go back and look, oh, okay, based on my initiatives, for example, how in line am I with all of these things? And so we have the definitions. So we have individual policies.

Speaker 1

00:14:27 - 00:14:56

And here, if I search for, let's say, tag, notice I have options here; inherit a tag from the resource group is an example here. So what this would do is, hey, if I don't set tags on the individual resource, and that's a requirement, hey, I could use Azure Policy to copy it from the resource group onto the resources. And then instead of assigning individual policies, I can create initiatives, and initiatives

Speaker 2

00:14:58 - 00:15:01

group multiple policies together. So you

Speaker 1

00:15:01 - 00:15:22

can see there's a lot of policies in some of these. That makes it easier to assign them, and obviously it makes it easier as well to actually track that compliance, because generally I care about multiple policies together achieving some desired state. When I think about the role-based access control,

Speaker 2

00:15:22 - 00:15:28

so remember we have these different scopes, management groups, subscriptions, resource groups, policy can apply to any of those.

Speaker 1

00:15:29 - 00:15:33

If I think role-based access control, This is about the idea of

Speaker 2

00:15:33 - 00:15:41

I have some identity. So there is some identity. Remember that's going to be living in my Azure AD.

Speaker 1

00:15:41 - 00:15:43

And I have some scope.

Speaker 2

00:15:44 - 00:15:55

Now again, those scopes could be kind of that management group, subscription, resource group. It can be an individual resource. That's not something we commonly do. It's very hard from a management perspective.

Speaker 1

00:15:56 - 00:16:15

And then we have roles. Now a role is really just a set of defined actions. And what we do is we give an identity, a certain role at a certain scope. So that is a role assignment.

Speaker 2

00:16:17 - 00:16:39

When we think about these roles, they're really divided into the control plane, i.e. ARM, the Azure Resource Manager. But also now we start to see some of them have roles at the data plane. So not going through the ARM API, actually maybe talking to blob storage or queues or maybe SQL. So there's other types of roles available.

Speaker 1

00:16:40 - 00:16:42

If we jump over and look, just

Speaker 2

00:16:42 - 00:16:43

to kind of clarify that,

Speaker 1

00:16:43 - 00:16:44

an easy thing to look at

Speaker 2

00:16:44 - 00:16:53

is a storage account. So if I quickly jump over to a storage account, it doesn't matter which 1. If I look at access control, so if I look at the roles,

Speaker 1

00:16:54 - 00:16:57

we can see, hey, let's say there's a,

Speaker 2

00:16:58 - 00:17:01

let's just expand these titles out so I can see them.

Speaker 1

00:17:02 - 00:17:22

I might say, okay, storage account contributor. So if we look at the storage account contributor role, notice there's Actions and Data Actions. It divides them up. So Actions, there's all these different things it can do. So there's a lot of

Speaker 2

00:17:22 - 00:17:25

the things here, spread across different resource providers.

Speaker 1

00:17:26 - 00:17:48

It has all these capabilities. Data Actions, it has none. So this is not a role that would give me any access directly to the data itself using the role. But if I go and look at a different type of role, where it's like full access to storage blob containers. So this role is a storage blob data owner.

Speaker 1

00:17:48 - 00:17:53

It still has some Azure Resource Manager actions, but now it also has data actions.

Speaker 2

00:17:54 - 00:18:13

It can actually do read blob, write blob. So this is how I could, using Azure AD, also get access to the underlying data stored in that resource. And that's really what we want to try and get to as much as possible. That's kind of a key direction for a lot of these different things.
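The Actions versus DataActions split shown in the portal can be sketched as a custom role definition. This is an illustrative sketch, not a built-in role; the subscription ID is a placeholder, and the blob action strings are the ones the Storage resource provider exposes for data-plane read and write.

```json
{
  "Name": "Blob Reader-Writer (sketch)",
  "IsCustom": true,
  "Description": "Read and write blob data without broad control-plane rights.",
  "Actions": [
    "Microsoft.Storage/storageAccounts/read"
  ],
  "NotActions": [],
  "DataActions": [
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write"
  ],
  "NotDataActions": [],
  "AssignableScopes": [ "/subscriptions/<subscription-id>" ]
}
```

Assigning an identity this role at some scope is the role assignment: identity plus role plus scope, exactly as described above.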

Speaker 1

00:18:16 - 00:18:21

RBAC is cumulative. So if I had certain role assigned maybe at

Speaker 2

00:18:21 - 00:18:32

the management group, and then another role at the subscription, another role at the resource group, I get the sum of those permissions. There is no deny assignment outside of something called a blueprint

Speaker 1

00:18:33 - 00:18:34

or I can also have something called

Speaker 2

00:18:34 - 00:18:45

a managed application. Today they are the only places I can do an explicit deny. So I start off with no permissions and then I get permissions added to me by the roles that are granted to me.

Speaker 1

00:18:46 - 00:18:54

Now ordinarily those roles are just applied to us all of the time. We don't really like that. A good architecture practice is about just in time.

Speaker 2

00:18:54 - 00:19:16

I get the permissions as I need them. So a good way to do that is we actually think about privileged identity management. So that is basically giving it to me just in time. I can go and request, hey, I need this role. Maybe I have to go for a strong authentication like MFA, then I get it for a duration of time.

Speaker 2

00:19:17 - 00:19:32

That is an Azure AD Premium P2 feature. So that's about getting something only when I need it. Another big challenge is, well who has roles? Who is in this group? Who has access to this application?

Speaker 2

00:19:32 - 00:19:35

So another common thing you're gonna see is access reviews.

Speaker 1

00:19:37 - 00:19:43

So access reviews, they're very flexible. Again, that is a P2 feature.

Speaker 2

00:19:44 - 00:19:59

And that is actually part of the identity governance. And that does multiple things. Again, it can be about who has certain roles, who is in certain groups,

Speaker 1

00:20:00 - 00:20:07

who has access to certain applications. And I can do that in such a way as I can delegate those checks to certain people or

Speaker 2

00:20:07 - 00:20:28

it can even be a self-review. Hey, validate you still need this. So that is a very powerful feature in terms of, hey, What is a good way to check someone still has access? What feature would I use? Hey, well, if I'm trying to validate someone still needs this role or group membership or application access, hey, access reviews.

Speaker 2

00:20:28 - 00:20:48

Well, that's part of Azure AD Premium P2, part of the identity governance. So that's a powerful capability over there. In terms of that idea of deploying resources and being able to put down role-based access control and policies and creating those resource groups, today

Speaker 1

00:20:49 - 00:20:52

1 of the big technologies you'll actually use is something called

Speaker 2

00:20:52 - 00:20:57

an Azure Blueprint. So Azure Blueprint is this construct

Speaker 1

00:20:59 - 00:21:13

that basically consists of artifacts. So 1 of the artifacts it supports is things like resource groups. So hey I want to create a new resource group. It also supports ARM JSON templates.

Speaker 2

00:21:13 - 00:21:28

So that's an infrastructure as code way to create resources. It supports things like role-based access control. So okay, in this blueprint, I'm gonna create a resource group, deploy some resources, then I wanna set these particular permissions.

Speaker 1

00:21:29 - 00:21:39

It can also apply policy. So they are the 4 key constructs we support in Azure Blueprints. Now that gets stored at a certain level.

Speaker 2

00:21:39 - 00:21:48

I could store that in, for example, a management group. I could store it in a subscription and it's then available to anything underneath that

Speaker 1

00:21:48 - 00:22:02

level I store it. And then what I do is I apply it. So if I apply it, well then those things I describe, those artifacts will be stamped down. Now that application could be done in

Speaker 2

00:22:02 - 00:22:09

different modes. There's like a don't lock, i.e. create it, but then people could change it if they wanted to.

Speaker 1

00:22:10 - 00:22:17

They could say, hey, do not delete. So they can't delete the things that the Blueprint stamps down,

Speaker 2

00:22:17 - 00:22:35

but they could modify them, or it can get stamped down as read only. You can't modify this configuration at all. Now again, these are all at the Azure Resource Manager level. If it was a storage account, for example, I could still perform data operations inside there. So it's not going to stop me from actually doing that.

Speaker 2

00:22:35 - 00:22:37

So that's obviously an important point now.
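The lock behavior described here is chosen when the blueprint is assigned. As a hedged sketch of what an assignment body looks like (the management group name, blueprint name, and version are placeholders), the lock mode values roughly correspond to the three behaviors above: None (don't lock), AllResourcesDoNotDelete, and AllResourcesReadOnly.

```json
{
  "identity": { "type": "SystemAssigned" },
  "location": "eastus",
  "properties": {
    "blueprintId": "/providers/Microsoft.Management/managementGroups/<mg-name>/providers/Microsoft.Blueprint/blueprints/<blueprint-name>/versions/<version>",
    "locks": { "mode": "AllResourcesDoNotDelete" }
  }
}
```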

Speaker 1

00:22:39 - 00:22:43

Okay, so those are those kind of key things as

Speaker 2

00:22:43 - 00:22:45

I think about those constructs.

Speaker 1

00:22:46 - 00:22:53

So let's go back now to the Azure AD. Now that Azure Active Directory you have, most likely

Speaker 2

00:22:53 - 00:23:24

what you're actually doing is you are replicating that. So today you have Active Directory Domain Services. Normally we just call it Active Directory, but it's actually Active Directory Domain Services, and you replicate that. So hey, if I have a user in here, that user gets replicated and gets created over here as a synchronized user. And the way we do that is Azure AD Connect.

Speaker 2

00:23:25 - 00:23:35

There's also an Azure AD Connect Cloud Sync, where the engine, instead of running on premises, actually runs in the cloud. We just have some lightweight connectors there.

Speaker 1

00:23:36 - 00:23:43

As part of this, what we also want to recommend is we want to send the password hash of the hash.

Speaker 2

00:23:44 - 00:23:56

By having that hash of the hash replicated, it lets us do things like look for leaked credentials. Because now Azure AD knows what the password hash of the hash is, it hashes it again, it's not just a regular hash.

Speaker 1

00:23:56 - 00:24:00

On the dark web, when it's scanning, it can now go and find

Speaker 2

00:24:00 - 00:24:12

those things. So things like Azure AD Identity Protection would now be able to stop things like a breach replay. We'll say, hey look, we found your password out on the dark web, we should change this. We should make them do an MFA. That's a P2 feature.

Speaker 1

00:24:14 - 00:24:18

If I want a nice experience for the end user, yes we can synchronize those things but

Speaker 2

00:24:18 - 00:24:24

then we want to also turn on things like seamless sign-on. So seamless sign-on

Speaker 1

00:24:25 - 00:24:35

is, hey, if I'm sitting at a machine that's on a network that can talk to a domain controller, I can just go and access the Azure AD without having to do anything else. It's just gonna give me a very

Speaker 2

00:24:36 - 00:24:53

easy, smooth interaction with that. And the whole point of this is Active Directory talks things like Kerberos and NTLM and LDAP. They're not good for the cloud. So Azure AD talks cloud: OAuth2, OpenID Connect, SAML, WS-Fed.

Speaker 1

00:24:53 - 00:25:09

And then what would happen is you have a whole bunch of different cloud applications trusting this to be the identity provider of those. So we have all these things trusted, including Azure. Azure subscriptions will trust a certain Azure AD tenant.

Speaker 2

00:25:11 - 00:25:41

There is something called Azure AD domain services. So if you had a question, hey, we need to use Kerberos or NTLM or today LDAP in an Azure subscription, and you don't have a regular domain, which technology would solve this? Well, Azure AD domain services creates a managed AD in a particular virtual network for you. So that would be a way to enable that. So that's great for users from maybe my AD, I can create cloud accounts as well, natively.

Speaker 2

00:25:42 - 00:26:11

But what if there are other companies I'm working with? What about the idea that, okay, there's maybe a partner that I'm working with, they have their own Azure AD tenant, or maybe they have Microsoft accounts or Gmail accounts, maybe they have their own SAML, maybe they have something else, and I want to be able to kind of email them a one-time passcode. Well, I can add them using B2B, business to business,

Speaker 1

00:26:11 - 00:26:22

and then I get a little stub account, basically a guest, using B2B. So B2B lets them use their existing credentials. So the authentication

Speaker 2

00:26:25 - 00:26:46

is still happening here, i.e. the password, that initial proving of who you are, happens at whatever their source identity provider is. What happens on my side is the authorization, i.e. are you allowed to actually do this? That would include things like conditional access, which we'll get to in a second.
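Creating that B2B guest stub account is typically done through an invitation. As a rough sketch, at the time of writing this can be a POST to Microsoft Graph's /v1.0/invitations endpoint with a body like the one below; the email address and redirect URL shown are illustrative.

```json
{
  "invitedUserEmailAddress": "partner@example.com",
  "inviteRedirectUrl": "https://myapps.microsoft.com",
  "sendInvitationMessage": true,
  "invitedUserMessageInfo": {
    "customizedMessageBody": "Please accept this invitation to collaborate."
  }
}
```

The guest then redeems the invitation with their existing credentials, so authentication stays with their home identity provider while authorization happens in your tenant.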

Speaker 2

00:26:46 - 00:26:48

So if I add partners, and I want to

Speaker 1

00:26:48 - 00:27:17

be able to give them permission and access to things that trust my Azure AD, most of the time the answer would be, hey, I want to add them as a guest to enable that interaction. And again, there's many different types of things you can do with this. If we quickly jump over and look at my Azure AD for a second, what we'll see, if I look at my users, first you can see, is it a directory synced user or not? I.e. It came from Active Directory, and

Speaker 2

00:27:17 - 00:27:18

I have 1 that will say yes.

Speaker 1

00:27:19 - 00:27:28

But then I have all these guests. So I've guests, this one's mail. So that's like it's going to use a one-time passcode. It's going to email them a passcode. I have Facebook.

Speaker 1

00:27:30 - 00:27:57

I have ones coming also, I have Gmail. So these different options, I even have someone using text, that's not a guest though. So I have these different ways people can carry on and use their existing accounts when they're actually going to use this. Now I have a deeper dive video on a whole bunch of these things. Now 1 of the cool things you can do with these is, if I jump over,

Speaker 2

00:27:57 - 00:28:15

and we look at external identities. Well, firstly, you can configure who are the identity providers. You can see I've added Google and Facebook, and I've enabled one-time passcode. I've enabled basically everything, but you might go and enable them if you hadn't. You could go and add a custom SAML or WS-Fed.

Speaker 1

00:28:17 - 00:28:35

The other cool thing you can do is I can add user flows. So as part of a user flow, I could enable those guests to be able to do a self sign up. I could also have particular steps I want them to go through. I could whitelist certain roles. I can then add them to groups and roles.

Speaker 1

00:28:35 - 00:29:02

These guests, whatever path I take, are just identities. I can add them to groups, I can give them access to roles. I have really most of those same sets of capabilities. So that's the the B2B option there. Now alternatively maybe I as a company I've created some fantastic application I'm super proud of.

Speaker 2

00:29:02 - 00:29:05

So I've created my awesome app, and

Speaker 1

00:29:05 - 00:29:06

I want to make

Speaker 2

00:29:06 - 00:29:14

it available to my customers. I do not want to add customers to my corporate Azure AD. That's a terrible idea.

Speaker 1

00:29:15 - 00:29:31

So what we do is there's a separate type of Azure AD today. So what we can do is we can create an Azure AD B2C, business to consumer instance. What that lets me do is these customers, my app would now trust that for

Speaker 2

00:29:31 - 00:29:57

its identity provider. Customers could either create local accounts in the B2C or it supports a huge range of different types of social accounts, more than the regular Azure AD today. Things like Weibo and Twitter, and the customers have a choice. Hey, I can create an account or I wanna bring my existing social account to use with that application. Today, B2C has this fantastic ability to like customize every pixel.

Speaker 2

00:29:58 - 00:30:18

I can hide the B2C URL with things like Azure Front Door. There's 100 custom attributes, a whole onboarding flow. It's a really powerful solution. So if I have an app I want to make available to my customers, B2C. I don't want to put that in my corporate Azure AD tenant.

Speaker 2

00:30:19 - 00:30:26

I mentioned conditional access and the fact that authorization, even with those B2B happens at my Azure AD.

Speaker 1

00:30:27 - 00:30:29

So a really powerful capability

Speaker 2

00:30:29 - 00:30:58

when I think about Azure AD, is actually this whole idea of conditional access. And I can think about that conditional access is that authorization layer. So no matter what way I'm coming in, I'm going to pass through that conditional access. Now that's a feature of Azure AD Premium P1 or above. It's also bundled with other types of license.

Speaker 1

00:30:59 - 00:31:02

And the whole point of conditional access is I can specify

Speaker 2

00:31:02 - 00:31:18

a whole bunch of conditions. This could be, hey, you have to be in maybe a certain group. It's pretty easy to just see this, so if I jump over and we go to our security, and we go to conditional access, and I just create a

Speaker 1

00:31:18 - 00:31:32

new 1 for a second. Well, I'll give it a name, but I can assign it. So I could assign it to certain users, to certain groups. I could assign it to certain roles. So I have a lot of flexibility in how I actually assign this.

Speaker 1

00:31:32 - 00:31:48

I can target everything. I could target particular cloud applications. So here we'd see all my custom applications. We'd see a lot of ones that are built in, even Azure itself. So if I, for example, look at, I think it's Microsoft

Speaker 2

00:31:51 - 00:32:04

Azure Management. Microsoft Azure Management is Azure Management itself or anything that goes through the portal. So, hey, if I want to control something in Azure itself, I could use that app to target that.

Speaker 1

00:32:05 - 00:32:36

And then I have conditions and there's a whole set of conditions. User risk, sign-in risk, that comes as part of Azure AD Premium P2's identity protection feature. I can target particular platforms, I can target particular locations that I define, device state, is it healthy according to things like Intune for example. And then, do I give access or not? Notice I could block access, or maybe I grant access but I require things like MFA.

Speaker 2

00:32:37 - 00:32:49

I require it to be marked as compliant, maybe by something like Intune. I require it to be hybrid joined. So I have a whole list of different options around there and there's kind of session controls as well.

Speaker 1

00:32:50 - 00:32:56

And if I kind of go back from those locations, so location is either a

Speaker 2

00:32:56 - 00:33:32

public IP range I set, or it can even be GPS coordinates now, or it can just be based on IP ranges for particular countries. So I can create locations I want to target with my conditional access. But the whole point of this is it gives me the ability to set, based on these certain conditions, this has to be met in order to let me do this. And a big 1 is, oh, I'm gonna make you do things like MFA. So if I had a B2B user, the MFA would still happen at my tenant.

Speaker 2

00:33:32 - 00:33:35

That's kind of a big point about that.
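This part of the demo is all in the portal, but for reference, conditional access policies are managed through Microsoft Graph rather than ARM, so there's no dedicated az command group for them. A hedged sketch of listing them from the CLI via `az rest` (assumes the signed-in account has the Policy.Read.All Graph permission):

```shell
# List conditional access policies via Microsoft Graph.
# There is no dedicated "az ad conditional-access" command; az rest can call
# Graph directly, provided the signed-in identity has Policy.Read.All.
az rest --method get \
  --url "https://graph.microsoft.com/v1.0/identity/conditionalAccess/policies" \
  --query "value[].{name:displayName, state:state}"
```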

Speaker 1

00:33:36 - 00:33:38

There are things like identity protection, I just talked about that.

Speaker 2

00:33:39 - 00:33:44

So identity protection is about looking at an individual session's risk,

Speaker 1

00:33:44 - 00:33:47

an individual logon, or the user in general. Looks at

Speaker 2

00:33:47 - 00:33:59

things like impossible travel, it's using an IP address that's been linked to malware, part of some password spray attack, it's an anonymous IP, generally nothing good is coming from an anonymous IP.

Speaker 1

00:34:01 - 00:34:04

And it's building an overall risk status for

Speaker 2

00:34:04 - 00:34:41

the user that A, can give me warnings and reports, it can trigger certain actions, but also I can build that into things like the conditional access. So identity protection, which is a P2 feature, I can leverage as part of my conditional access. And again, it has its own checks, its own sets of actions I can actually drive from there. Talked about users, users in my company, users in partners, customers.

Speaker 1

00:34:42 - 00:34:48

But obviously there's another big type of user. If I think about an application,

Speaker 2

00:34:50 - 00:34:54

so I have for example my Azure subscription

Speaker 1

00:34:56 - 00:35:01

and I want to create an application. So applications often need to authenticate.

Speaker 2

00:35:02 - 00:35:04

So if I create my application,

Speaker 1

00:35:05 - 00:35:27

well my app might want to be able to access some type of other resource somewhere. Ideally, I don't want to store secrets. Now one way I can do this is I just create a service principal. So a service principal is some account; for example, the app gets an account. It could use a secret, i.e.

Speaker 1

00:35:27 - 00:35:28

password, or it could use a

Speaker 2

00:35:28 - 00:35:45

certificate, but then it has to handle that some way. It has to store that in some way. I've got a whole deep dive video on app registrations, which is how an app registration creates that service principal, and then how I can leverage that. A better option

Speaker 1

00:35:45 - 00:35:56

is rather than having to try and work out how to store that secret or that certificate, if it's an Azure resource, and there's a huge number of resources that support this, but imagine I'm resource 1. This could be

Speaker 2

00:35:56 - 00:36:00

a VM, this could be an AKS environment, it doesn't matter.

Speaker 1

00:36:00 - 00:36:21

What we can actually do is we can say, hey, I want to turn on managed identity. Now, the default is a system assigned. And what that means is now there's an identity, a service principal, but it's just managed for me automatically. Only that particular resource, R1, can use it. No one else can ask for it.

Speaker 1

00:36:21 - 00:36:23

So with a system assigned,

Speaker 2

00:36:26 - 00:36:32

the lifecycle is one-to-one. That resource is that managed identity.

Speaker 1

00:36:32 - 00:36:51

When that gets deleted, this managed identity goes away. But now for resources, I could say, hey, R1, you're a contributor. So now without this resource having to store any kind of credential, it can get access to that resource. Fantastic. That's phenomenal if I've got a resource.

Speaker 1

00:36:51 - 00:36:55

What about if I have something like a scale set, something behind a load balancer,

Speaker 2

00:36:55 - 00:37:01

so I've got a bunch of resources. So I've got resource 2, 3, 4. I

Speaker 1

00:37:01 - 00:37:04

could use a system assigned managed identity again.

Speaker 2

00:37:04 - 00:37:28

And I'd have to basically duplicate the permissions 3 times or 4 times or 10 times. Or what I can actually do is create something called a user assigned managed identity. Now this time it has a separate life cycle. Let's just call this user assigned MI1.

Speaker 1

00:37:28 - 00:37:41

So it has its own resource. And what I do is I grant all 3 of these the permission to use UAMI1. And then I can give that UAMI1

Speaker 2

00:37:43 - 00:37:45

permissions. I'll give you contributor.

Speaker 1

00:37:47 - 00:37:54

So the key point there is if it's system assigned, only that 1 resource can use it, and I give it permission to things.

Speaker 2

00:37:55 - 00:38:03

With a user assigned managed identity, it is a separate lifecycle, so this can be kind of 1 to N. If I delete the resources, that doesn't go away.

Speaker 1

00:38:04 - 00:38:15

But now I can give that identity permission to things and then let multiple resources that need the same sets of permissions use that single managed identity to really cut down on management.

Speaker 2

00:38:16 - 00:38:30

So if I saw a question, hey, you have these 10 virtual machines that all need the same set of permissions. What's the minimum number of identities I could use? Well, I could use one managed identity if it's user assigned. So that would kind of help me out there.
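The video draws this on the whiteboard; a rough CLI sketch of the same contrast, with entirely made-up names (my-rg, vm1, uami1, the subscription ID), might look like this:

```shell
# Classic approach: an app registration/service principal with a secret the
# app then has to store somewhere.
az ad sp create-for-rbac \
  --name "my-sample-app" \
  --role "Contributor" \
  --scopes "/subscriptions/<sub-id>/resourceGroups/my-rg"

# Managed identity approach: no secret to store.
# System assigned -- lifecycle tied one-to-one to the VM:
az vm identity assign --resource-group my-rg --name vm1

# User assigned -- its own lifecycle, shared by several resources:
az identity create --resource-group my-rg --name uami1
for vm in vm1 vm2 vm3; do
  az vm identity assign --resource-group my-rg --name "$vm" \
    --identities "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/uami1"
done

# Grant the role once to the shared identity, instead of once per VM:
az role assignment create \
  --assignee-object-id "$(az identity show -g my-rg -n uami1 --query principalId -o tsv)" \
  --assignee-principal-type ServicePrincipal \
  --role "Contributor" \
  --scope "/subscriptions/<sub-id>/resourceGroups/my-rg"
```

These are configuration commands against a live subscription, so treat them as an illustration of the shape, not a copy-paste script.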

Speaker 1

00:38:31 - 00:38:40

Now, ideally I'm gonna use managed identity. If all things are equal, I will just use a managed identity and then give

Speaker 2

00:38:40 - 00:38:52

the managed identity the permissions on the resource. Remember we had those data plane roles, but maybe there's some things we can't. There's some resource that just doesn't work. I still need a secret. I need a shared access signature.

Speaker 2

00:38:53 - 00:39:23

Where do we store those things? So the best way to store those things in Azure is we have Azure Key Vault. So Azure Key Vault is all about the idea of storing secrets, some piece of data that I can write and get back; keys, which I can import or generate but can't extract back out, though I can perform cryptographic operations within the key vault; and certificates.

Speaker 2

00:39:25 - 00:39:27

It can handle the lifecycle and help with the distribution.

Speaker 1

00:39:28 - 00:39:43

And this has full role-based access control. So what I would do here, if I had some secret, well, I would store the secret in my key vault, and then I would give the identity permission to it.

Speaker 2

00:39:43 - 00:40:12

So maybe here, I could say, oh, I'll give user-assigned managed identity 1 the get permission. So it can get that secret and use it. So now this application, again, doesn't have to store anything. It's gonna authenticate to Azure AD as its user-assigned managed identity, use that identity to get access to the secret, and then use that secret to talk to whatever else is required. There are 2 models

Speaker 1

00:40:13 - 00:40:20

for Azure Key Vault permissions. So, Key Vault actually started out with its own model.

Speaker 2

00:40:22 - 00:40:26

It didn't really integrate with Azure AD, so it had its own access policies.

Speaker 1

00:40:28 - 00:40:34

And notice here what I would do is I would create an access policy for a certain identity and

Speaker 2

00:40:34 - 00:40:43

then I would give permissions for the type of resource. I couldn't be granular. I couldn't give permission just to a certain secret or a certain key.

Speaker 1

00:40:43 - 00:40:58

I would get the permissions for all secrets or keys of that type. Now notice how granular the permissions are. Get, list, set, delete, recover, backup. So obviously to get a secret, I just need get. That's the only permission I need.

Speaker 1

00:40:58 - 00:41:11

If I want to be able to enumerate through them, well I need list as well. If I want to change it, well, I need set. Obviously, delete, et cetera, et cetera. There's different permissions, but I can be super granular. But it applies to everything of that type in the vault.

Speaker 1

00:41:12 - 00:41:31

The other option, and this is newer, is you can change the access policy to now use Azure role-based access control. And that actually allows me, notice there's no access policies here, that actually allows me now at an individual kind of secret level

Speaker 2

00:41:34 - 00:41:52

to now set permissions. So it's actually permissions to operate on the data plane. So a role like Key Vault Reader or Key Vault Secrets User can actually read just that one secret and not the others. So that's a more granular option available to me.
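The demo shows this in the portal; as a hedged CLI sketch of the two permission models (vault and principal names are placeholders):

```shell
# Access-policy model: permissions per object type, vault-wide.
# "get list" applies to ALL secrets in the vault, not one secret.
az keyvault set-policy --name my-kv \
  --object-id "<principal-object-id>" \
  --secret-permissions get list

# Azure RBAC model: switch the vault over, then assign data-plane roles,
# optionally scoped all the way down to a single secret.
az keyvault update --name my-kv --resource-group my-rg \
  --enable-rbac-authorization true

az role assignment create \
  --assignee "<principal-object-id>" \
  --role "Key Vault Secrets User" \
  --scope "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.KeyVault/vaults/my-kv/secrets/sql-password"
```

Again, these run against a live subscription, so they're an illustration of the shape rather than a ready-made script.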

Speaker 1

00:41:53 - 00:42:00

Now some services will kind of abstract that away from me. So obviously yes I could absolutely write and I

Speaker 2

00:42:00 - 00:42:02

could use the Azure APIs to go and get to that.

Speaker 1

00:42:03 - 00:42:04

But if, for example, I

Speaker 2

00:42:04 - 00:42:21

was using Azure Kubernetes Service, AKS, well, Azure Kubernetes Service actually has an Azure Key Vault CSI driver. And what that basically does is I can expose certain secrets as if they're part of the file system.

Speaker 1

00:42:21 - 00:42:26

So the app itself doesn't need to care about Key Vault, it's just interacting with a file.

Speaker 2

00:42:26 - 00:42:43

Or, for example, App Service. Well, App Service has the concept of kind of environment variables that I can use. So I have these application settings, and an application setting can be a reference to a particular secret. So again, I'm not

Speaker 1

00:42:43 - 00:42:45

having to do anything special in my app to worry about that.

Speaker 2

00:42:47 - 00:42:49

I can just expose it as an application setting.
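A sketch of that App Service pattern with the CLI, using made-up vault, app, and secret names (the Key Vault reference syntax in the app setting is the real App Service convention):

```shell
# Store the secret once in Key Vault...
az keyvault secret set --vault-name my-kv --name sql-password --value "<value>"

# ...then reference it from an App Service application setting. The app just
# reads an ordinary environment variable; App Service resolves the reference.
az webapp config appsettings set \
  --resource-group my-rg --name my-webapp \
  --settings SQL_PASSWORD="@Microsoft.KeyVault(SecretUri=https://my-kv.vault.azure.net/secrets/sql-password/)"
```

For the reference to resolve, the web app's managed identity still needs get permission on that secret, exactly as described above.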

Speaker 1

00:42:51 - 00:42:52

This is obviously super super important.

Speaker 2

00:42:53 - 00:43:17

It's got those secrets and things stored within it. So one of the things built in is these replicate. So this does replicate to kind of a paired region. And if something happens to our primary region, then the paired version would become available. But it would become available in a read-only

Speaker 1

00:43:19 - 00:43:19

mode.

Speaker 2

00:43:21 - 00:43:41

So I could get, I could list, I couldn't delete things, I couldn't modify the values. So it would become read-only. So if it does fail over, that protection is built in, the vault would still be available, but it will go into this read-only mode, so I can't perform any sort of change operations.

Speaker 1

00:43:43 - 00:43:44

So those are kind of some of

Speaker 2

00:43:44 - 00:43:55

the key constructs we think about from that identity governance perspective. The next layer we get into is really about the monitoring.

Speaker 1

00:43:56 - 00:44:05

So if I think for a second about monitoring. Monitoring is key in a

Speaker 2

00:44:05 - 00:44:25

lot of things, and we'll actually come back to this. If I was doing migrations, I'd need to understand the type of usage currently on a system to make sure I'm really migrating the right thing. But I can get data coming from many, many different sources when I think about monitoring. So if we think, well, there's always Azure AD. There's always Azure AD at kind of the top.

Speaker 1

00:44:25 - 00:44:32

So Azure AD has many, many types of logs. There's obviously things like the audit logs.

Speaker 2

00:44:32 - 00:44:36

So seeing what's actually happening in the system. There's sign-in logs.

Speaker 1

00:44:38 - 00:44:55

And there are many others as well that we can actually go and see. If you go and look at Azure Active Directory, you'll see there's a whole bunch of different types of logs that I can actually get. Then the Azure subscription itself. So the Azure subscription has an activity log.

Speaker 2

00:44:59 - 00:45:22

So that activity log lets me actually go and see, oh, something has been created at the ARM level, the Azure Resource Manager level. It lets me go and see other things about, the object is modified at the ARM level again. And we can change aspects of this. We can send it to other places through diagnostic settings, which we'll get to in a second.

Speaker 1

00:45:22 - 00:45:30

Then I have the resources themselves. Now the resources themselves have different types of output. A lot of them have metrics.

Speaker 2

00:45:33 - 00:45:40

Now by default they just go to an Azure Monitor time series database. Many of them also have logs. But I'm

Speaker 1

00:45:40 - 00:45:49

gonna put that in a square bracket because they do not exist by default. You have to configure them to go somewhere before it will actually go and create those logs.

Speaker 2

00:45:50 - 00:46:05

Then there are many other types as well. There's things you can do inside the operating system, there are things I can do inside applications, I can do operating systems on-premises, there are insight capabilities, there's a whole bunch of different things I can turn on.

Speaker 1

00:46:06 - 00:46:24

But a key point for all of these different types of resources that can generate this data, there are different places I can actually send them. So this is what's been generated. Well, I can send it to something called Log Analytics Workspace. So 1 option is Log

Speaker 2

00:46:26 - 00:46:30

Analytics. I can keep that for up to 2 years. So it's kind of

Speaker 1

00:46:30 - 00:46:42

got a maximum duration of 2 years. That's really powerful that it's not just the storage, it has a whole Kusto query language so I can run queries against it. And many solutions sit on top

Speaker 2

00:46:42 - 00:46:46

of this to give me additional value added on top of here.
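Those Kusto queries can also be run from the CLI. A small sketch, assuming a workspace GUID you'd substitute in, querying the built-in AzureActivity table:

```shell
# Run a Kusto (KQL) query against a Log Analytics workspace.
# The workspace is identified by its customer (workspace) GUID.
az monitor log-analytics query \
  --workspace "<workspace-guid>" \
  --analytics-query 'AzureActivity
    | where TimeGenerated > ago(1d)
    | summarize count() by OperationNameValue
    | top 5 by count_'
```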

Speaker 1

00:46:46 - 00:46:48

I can also send it to an event hub.

Speaker 2

00:46:50 - 00:46:59

So an event hub is kind of a publish-subscribe. It's really useful if I had some kind of maybe a third-party SIEM that I wanted to

Speaker 1

00:46:59 - 00:47:04

be able to send things to, or I just want to trigger something else. I could even maybe have something

Speaker 2

00:47:04 - 00:47:32

like an Azure function hanging off of this, maybe via event grid in the middle, say, hey, when something gets created, I want to go and run this serverless thing. Or I could even do a storage account. A storage account is useful because it's cheap retention. It's not super useful to do anything with. When I send to a storage account, I can actually pick a retention, how many days it will actually keep those files.

Speaker 1

00:47:32 - 00:47:34

And the way I configure all

Speaker 2

00:47:34 - 00:47:37

of these is I have diagnostic settings.

Speaker 1

00:47:42 - 00:47:44

For nearly all of these, I can configure those

Speaker 2

00:47:44 - 00:48:01

and that's where I can say, hey, I wanna send it to here or here or here, and I want to keep it for this amount of time. So I have all of those available to actually drive that. If we jump over really quick, and let's just pick, let's do that here.

Speaker 1

00:48:04 - 00:48:05

Let's do SQL.

Speaker 2

00:48:09 - 00:48:15

I'm just picking a resource. So we have these diagnostic settings. And what

Speaker 1

00:48:15 - 00:48:34

I can do here is I can add, and this is where you can see, well look, there's all these different types of logs that I could send and the metrics I can send as well. I want to send it to a Log Analytics workspace, and I would pick which 1. I want to send it to a storage account. If I send it to a storage account, well then we pick a number of days. That's the retention for the storage account.

Speaker 1

00:48:35 - 00:48:48

That is not the retention for anything else. Log Analytics is not asking me that. Log Analytics has its own retention configuration up to that maximum 2 years. That retention is only if I'm sending it to a storage account.

Speaker 2

00:48:48 - 00:48:56

I can send it to an event hub. And some of them now we even have partner solutions that typically hook in via things like event hub.
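The portal demo above can be expressed as one diagnostic setting from the CLI. A sketch with placeholder resource IDs, fanning a resource's logs and metrics out to all three destination types at once:

```shell
# One diagnostic setting, three destinations: Log Analytics, storage, event hub.
az monitor diagnostic-settings create \
  --name "send-everywhere" \
  --resource "<resource-id>" \
  --workspace "<log-analytics-workspace-id>" \
  --storage-account "<storage-account-id>" \
  --event-hub-rule "<event-hub-authorization-rule-id>" \
  --logs '[{"categoryGroup": "allLogs", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]'
```

In practice you'd usually pick only the destinations you need; this just shows that they all hang off the same setting.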

Speaker 1

00:48:57 - 00:49:02

So for nearly every type of resource, we will see those same options.

Speaker 2

00:49:02 - 00:49:29

If I looked at my Azure AD and I looked at things like my sign-in logs, we have these export data settings and here, hey look, I see those same options. If I was gonna go and look at my subscription and I looked at my activity log, if I can remember where it is, I thought

Speaker 1

00:49:29 - 00:49:30

it was near the top. There we go,

Speaker 2

00:49:30 - 00:49:36

it is at the top. Activity log. I have diagnostic settings and I can do the same things.

Speaker 1

00:49:36 - 00:49:51

So the key point is we have really the same options across nearly all types of resource. They all use these diagnostic settings to send to these different types of solutions. So hey, Log Analytics, I wanna do rich analysis.

Speaker 2

00:49:53 - 00:50:06

I'm building other solutions on top of it. Has that 2 year maximum. I pay for the data that's ingested and the data stored once it's past a certain age. Hey, I wanna send it to some third party system or I wanna trigger something. Oh, maybe I'll use Event Hub.

Speaker 2

00:50:06 - 00:50:23

Hey, I wanna store it as cheaply as possible for long-term retention. Oh, a storage account would be good for that. Another thing I can do is we can create this idea of alert rules. So I can create alert rules.

Speaker 1

00:50:28 - 00:50:38

Now alert rules can actually trigger off a number of different things. I can trigger off, for example, things like the activity log. I can trigger off metrics.

Speaker 2

00:50:38 - 00:50:43

I can trigger off of logs and metrics from log analytics as well. So I could look for, hey

Speaker 1

00:50:43 - 00:50:51

I've reached a certain value, I've seen this type of log, and what this can actually do is in those cases, well it can raise an alert.

Speaker 2

00:50:55 - 00:51:00

You can see the board is slowing down, it's not catching up anymore so I'm going to start a new board.

Speaker 1

00:51:00 - 00:51:07

And then these can fire off something called action groups. So in response, either as part of the alert rule, I

Speaker 2

00:51:07 - 00:51:14

can specify call an action group, or I can actually separate them now with action rules and say, hey, if an

Speaker 1

00:51:14 - 00:51:16

alert fires at this scope, call this.

Speaker 2

00:51:16 - 00:51:30

And action groups can do a tonne of things. I can do things like an SMS, an email, I can call an API, I can call a function. There's a huge set of things I can do in response to that. So if

Speaker 1

00:51:30 - 00:51:50

we jump over again, and if we look at monitoring, we can see alerts, and we can see, well, we have alert rules, and there's all these different things we can trigger off of, oh, the activity log, service health as well, which feeds into activity log, app insights, which is coming from

Speaker 2

00:51:50 - 00:51:55

a log analytics. There's all these different types of rules I can use.

Speaker 1

00:51:56 - 00:52:07

But then what I can do beyond there is once I have that, I can also have action groups. So these are the things I want to do. So I can have

Speaker 2

00:52:07 - 00:52:10

notifications, but

Speaker 1

00:52:10 - 00:52:11

I also have a whole bunch

Speaker 2

00:52:11 - 00:52:15

of different actions I can perform. You can see there's a huge range of those things.

Speaker 1

00:52:16 - 00:52:23

And as I mentioned, I can separately now also do action rules. So action rules, rather than setting the

Speaker 2

00:52:23 - 00:52:26

action group as part of the alert rule,

Speaker 1

00:52:26 - 00:52:30

I can say hey, at a certain scope, if I see a certain type

Speaker 2

00:52:30 - 00:52:32

of alert, then call this action

Speaker 1

00:52:32 - 00:52:43

group, or I could even do suppression. I could be like, okay, well normally, this thing would happen, I wanna actually suppress it. Maybe it's Christmas, and I'm like, you know,

Speaker 2

00:52:43 - 00:52:58

I don't care if the system goes down, it's Christmas, I'm gonna set a suppression rule, So for these times, I don't actually want to get alerted. So I have all these great capabilities available to me actually as part of that.
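A hedged CLI sketch of the alert-rule-plus-action-group pairing just described, with made-up names and a placeholder VM resource ID:

```shell
# The action group: WHAT to do when an alert fires (here, email someone).
az monitor action-group create \
  --resource-group my-rg --name ops-ag --short-name opsag \
  --action email oncall oncall@contoso.com

# The alert rule: WHEN to fire (a metric condition on a scope), wired to
# that action group.
az monitor metrics alert create \
  --resource-group my-rg --name high-cpu \
  --scopes "<vm-resource-id>" \
  --condition "avg Percentage CPU > 80" \
  --window-size 5m --evaluation-frequency 1m \
  --action ops-ag
```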

Speaker 1

00:52:59 - 00:53:01

Now realize some things have their own solutions

Speaker 2

00:53:01 - 00:53:25

and their own notification methods. It will vary, but if I think about Azure AD Connect, it has something called Azure AD Connect Health. And Azure AD Connect Health has its own sets of notifications. I can specify users to be notified if there were sync errors. So there are things that have their own sets of solutions for there.

Speaker 1

00:53:27 - 00:53:28

But those are the key elements you

Speaker 2

00:53:28 - 00:53:46

need to kind of understand. Again, nothing rocket science here. That's really just about that identity, thinking about the governance, policy is a huge part of that, setting the right levels, getting the right structure. And then monitoring is all about, you have these things that can create signals, metrics, logs.

Speaker 1

00:53:47 - 00:53:51

Hey, where do I want to send them? Log Analytics; Azure Sentinel sits on top of that.

Speaker 2

00:53:52 - 00:54:16

I just want to send it to another SIEM? Event Hub. But then I can fire those various types of rules. So that's kind of that identity governance and that component. So now we'll start a new board to talk about design business continuity solutions. So when I think about business continuity, disaster recovery, this is actually something we'll actually come back to when we talk about the well-architected framework, because it goes into a lot of detail about this.

Speaker 1

00:54:17 - 00:54:24

A key point is when I'm thinking about any kind of business continuity, make sure you understand all of the components in your solution.

Speaker 2

00:54:24 - 00:54:33

My VMs, my load balancer, how is it using something on-premises, in which case, what is my connection to on-premises? Think of all of the different levels.

Speaker 1

00:54:33 - 00:54:43

And then I have to think about, well, where is it stored? Where am I running it? And what are the resiliency options I can enable for that?

Speaker 2

00:54:43 - 00:54:51

We always talk about region. Now remember, a region we always think about as this 2 millisecond latency envelope.

Speaker 1

00:54:52 - 00:54:57

But the reality is that region is comprised of physical data centers.

Speaker 2

00:54:58 - 00:55:04

So I might have, I'll just draw 3, multiple buildings.

Speaker 1

00:55:06 - 00:55:24

Now in those buildings, I have racks of servers. Now a failure could happen at an individual node in the rack, could happen at the rack level: a top-of-rack switch, a power supply unit. So the first unit of resiliency we

Speaker 2

00:55:24 - 00:55:28

can do is those racks can be thought of as fault domains, like fault domain

Speaker 1

00:55:28 - 00:55:29

0, 1, 2.

Speaker 2

00:55:30 - 00:56:03

And what we can leverage is something called availability sets. If I create an availability set, what it's going to do is distribute the workloads I add into that set over typically kind of 3 racks. So I create VM1, it puts it there, VM2 there, VM3 there, VM4 there, 5. It also separates them on nodes as well. That helps for updates.

Speaker 2

00:56:03 - 00:56:10

So you'll see fault domains, typically 3, but you'll also see update domains when it rolls out changes. This can be between, I think, 5 and

Speaker 1

00:56:10 - 00:56:10

20.

Speaker 2

00:56:12 - 00:56:38

Never mix workloads, because again, it's just randomly distributing them. If I mix domain controllers and IIS servers and SQL servers in the same availability set, through sheer bad luck, all of the DCs might be on this rack, all of the SQL on this rack, all of the IIS on that rack. So I'd create an availability set for each unique workload, for each unique website, for each unique database cluster. So availability sets, I survive a node or

Speaker 1

00:56:38 - 00:56:42

rack level failure, but if a data center failed,

Speaker 2

00:56:42 - 00:56:44

I still lose all of it.
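Sketching that with the CLI, one availability set per workload as advised, with hypothetical names throughout:

```shell
# One availability set per unique workload: 3 fault domains (racks),
# 5 update domains for rolling maintenance.
az vm availability-set create \
  --resource-group my-rg --name web-avset \
  --platform-fault-domain-count 3 \
  --platform-update-domain-count 5

# Each VM in the workload joins the set at create time, and Azure spreads
# them across the fault/update domains automatically.
az vm create --resource-group my-rg --name web1 \
  --image Ubuntu2204 --availability-set web-avset
```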

Speaker 1

00:56:45 - 00:56:55

Some regions, when I talk about these individual buildings, they ensure they have independent power, cooling,

Speaker 2

00:56:57 - 00:57:04

and communications, i.e. Networking. So these get exposed as availability zones.

Speaker 1

00:57:08 - 00:57:10

And you'll only ever see 3.

Speaker 2

00:57:10 - 00:57:27

So I'll see availability zone 1, 2, and 3. They are not buildings called 1, 2, and 3. They are logical per subscription. So what is my subscription's AZ1 could be another subscription's AZ3. So there's no consistency between subscriptions.

Speaker 1

00:57:28 - 00:57:34

So now, when I deploy my resources, so hey, I create a VM1 over here,

Speaker 2

00:57:35 - 00:57:53

VM2 over here, VM3 over here, I've now got resiliency at a data center level failure. So if you see questions, hey, I want to deploy my app, I want to make sure I can survive a data center failure, I'm going to use availability zones.

Speaker 1

00:57:55 - 00:57:58

It's not magical. I still have to deploy at least 3 instances.

Speaker 2

00:57:59 - 00:58:00

I need 1 in each AZ.

Speaker 1

00:58:01 - 00:58:04

If I just have 1 instance in 1 availability zone,

Speaker 2

00:58:04 - 00:58:12

it doesn't help me. So I'd have to have 1 in each of the different availability zones, so I can deploy those there. Now, some services have

Speaker 1

00:58:12 - 00:58:20

something called zone redundant. So if I think about a service as being zone

Speaker 2

00:58:22 - 00:58:23

redundant,

Speaker 1

00:58:25 - 00:58:28

that service, if I pick that option,

Speaker 2

00:58:29 - 00:58:47

automatically has its instances distributed over the 3 different AZs. So say like a standard load balancer, I can pick to be zone redundant. On a storage account, I can pick zone redundant storage. And then those 3 copies of the data are actually in the 3 availability zones.

Speaker 1

00:58:48 - 00:58:52

Also, I may see an option called zonal.

Speaker 2

00:58:53 - 00:59:12

Zonal, I pick which AZ I want it to be in. So I'm gonna deploy it to AZ3. Again, to be useful, I would want 3 instances of that zonal solution. 1 in AZ3, 1 in AZ2, 1 in AZ1. If it's regional, you have no clue where that is.

Speaker 2

00:59:12 - 00:59:22

You don't know which building it's in. It's not going to be resilient against any particular data center failure. So that's the other option is, hey, I just do regional. I have no clue.
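The zonal versus zone-redundant distinction can be sketched with the CLI (resource names are placeholders; zone-redundant behavior assumes a region that actually has availability zones):

```shell
# Zonal: I pick the zone, and I need an instance per zone myself.
az vm create --resource-group my-rg --name web-az1 \
  --image Ubuntu2204 --zone 1

# Zone redundant: the service spreads its instances across zones for me.
# A Standard load balancer frontend is zone redundant by default in
# zone-enabled regions...
az network lb create --resource-group my-rg --name web-lb --sku Standard

# ...and ZRS keeps the storage account's three copies in three zones.
az storage account create --resource-group my-rg --name mystgzrs123 \
  --sku Standard_ZRS
```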

Speaker 1

00:59:24 - 00:59:29

When I think about data resiliency options, always remember services

Speaker 2

00:59:29 - 01:00:01

sit on top of each other. I.e., Azure Data Lake Storage sits on top of Blob. So Blob has that zone redundant storage, so Azure Data Lake Storage Gen 2 has that same zone redundant option available to me. Make sure you have equal resiliency in all of the components of your architecture. It gives me no benefit if I deploy, for example, my virtual machine scale set, and I pick, I want to deploy it zone redundant across AZ1, 2, and 3.

Speaker 2

01:00:02 - 01:00:20

Fantastic. And then I stick it behind a basic load balancer, or I stick it behind a standard load balancer, but that's zonal. Well, now if that zone goes down, I still can't get to my service. So I want equal levels of resiliency for my entire solution from the ground up, or it's not going to get me anything.

Speaker 1

01:00:22 - 01:00:27

When I think about these solutions within a region, it's typically synchronous replication because I

Speaker 2

01:00:27 - 01:00:29

have super low latency, so

Speaker 1

01:00:29 - 01:00:36

I can do synchronous. The other resiliency option I have is obviously another region.

Speaker 2

01:00:41 - 01:01:05

Region 2, which again has its sets of buildings, etc, etc, etc. Now between regions, that's gonna be asynchronous, nearly always. Especially if I do good architecture, I want these hundreds of miles apart. If you use the Azure built-in pairings, they're hundreds of miles apart. So there's some latency, 10, 20 milliseconds of latency.

Speaker 2

01:01:05 - 01:01:42

I don't want synchronous replication. It would slow down the operations that are actually happening. So it would be an asynchronous replication, but that would be another way to survive a data center failure. If I can't do this zone redundant option, well, my other option would be a solution across multiple regions; that would let me have that. A lot of services have that kind of geo redundancy built into them: Azure SQL Database has options to have read replicas in other locations; storage accounts have GRS or GZRS, which combines those things.

Speaker 2

01:01:42 - 01:01:52

Azure Database for PostgreSQL and for MySQL single server have options for those kinds of replicas. So there's a ton of different things I can do there.

Speaker 1

01:01:53 - 01:01:55

If it's like a regular virtual machine,

Speaker 2

01:01:56 - 01:02:25

so let's just say it's just a VM. Inside there I have obviously the operating system and I have my application. We actually have different options for that. Yes, one option is Azure itself can do the replication using Azure Site Recovery. That actually uses a service, the mobility service, at kind of the OS level. It sits between the file system and the volume driver; as changes come down, it's gonna send them over.

Speaker 2

01:02:26 - 01:02:46

So ASR could do that replication for me. Or maybe the app could do it for me. So depending on the application, imagine it was a database, well then it could replicate at the app level. Now that would mean I'd have to have an OS running, so I'm paying. Remember, I pay for things that are running. That'd have to be up and running.

Speaker 2

01:02:46 - 01:02:48

But that would be another option.

Speaker 1

01:02:48 - 01:02:54

And typically that would give me a faster failover. Obviously, if there's an app running, getting the transactions as it's replicating,

Speaker 2

01:02:55 - 01:03:22

if the same app is running over here, that's gonna start up faster than, okay, I've replicated the storage to a disk, now I have to create a VM, start it, and it starts in some crash-consistent state. So that would generally give me a richer, nicer option, but it's gonna cost me more money. There's always this balance between what is my actual requirement. Remember, when I talk about the well-architected framework at the end, if I try and bring all this together, we'll talk about things like recovery point objectives, recovery time objectives, and how that would work.

Speaker 1

01:03:24 - 01:03:40

Now, as soon as I introduced this second region, it brings in a challenge. When I'm within a region, there are different solutions to balance between the multiple instances. So if I was, for example, at layer 4,

Speaker 2

01:03:40 - 01:03:48

like TCP UDP, I could have a standard load balancer here. If I was layer 7, I could use something like App Gateway.

Speaker 1

01:03:52 - 01:03:57

But they are running within a region. So that's no good to

Speaker 2

01:03:57 - 01:04:10

balance to another region. If this region went down, they're down as well. So the whole point is I typically have another set of solutions here as well, running my same workload. So they're regional.

Speaker 1

01:04:12 - 01:04:19

Now I need something to balance between them. So to balance between them, I can think about, well,

Speaker 2

01:04:20 - 01:05:01

if it's layer 7, I can use Azure Front Door. And I've got deep dive videos on all of these things. Essentially, Azure Front Door, if you think about the Azure backbone network, it has these kind of points of presence all over it, and it does multiple things. It does a split TCP, so when I'm talking to it, I actually establish my TCP and SSL sessions with this local 1, but then it can have multiple targets. So typically what it's going to do is point to multiple app gateways, because they're layer 7 as well, and it will send me to whichever 1 is closest.

Speaker 2

01:05:02 - 01:05:33

And it's gonna cache the content, it can do content caching as well, can do SSL offload, it has a whole bunch of rich capabilities. So that's a layer 7 solution that would let me balance between those. Now if it wasn't a layer 7, then I can't use the Azure Front Door, another option is an actual DNS solution. So a DNS solution is something like Traffic Manager. So a Traffic Manager has a certain name, a name.trafficmanager.net, which you could hide with an alias of your own name.

Speaker 2

01:05:34 - 01:05:52

And that basically just points to different DNS names. So that could be a standard load balancer exposed out to the Internet. And again, it's going to balance. Normally, there's different balancing options for Traffic Manager and Azure Front Door. Performance is a very common 1.

Speaker 2

01:05:52 - 01:06:14

Redirect people to the 1 that's closest to you, so I get the lowest latency and the best overall performance. So that's a very common solution there. But again, I would balance a global solution with a regional solution. I want to really be consistent. So if it's a layer 7, if I've got app gateway, I'm going to want to put Azure Front Door in front of those.

Speaker 2

01:06:15 - 01:06:42

If it's not, if it's just a layer 4 like a standard load balancer, well then Traffic Manager is probably going to be a good solution. So that's how I can make that a single entry point for the users, and then balance and redirect if 1 of them goes down. Remember that replication is never a replacement for backup. So I'm talking about kind of replication in here. There are also many different backup services available for different types of workloads.
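The global entry point rule of thumb above can be condensed into a tiny decision helper (a hypothetical sketch of my own, not an Azure API): layer 7 globally means Azure Front Door, layer 4 globally means the DNS-based Traffic Manager.

```python
# Hypothetical decision helper mirroring the pairing described:
# layer 7 (HTTP/HTTPS) across regions -> Azure Front Door, typically in
# front of regional App Gateways; layer 4 (TCP/UDP) across regions ->
# Traffic Manager (DNS-based) in front of regional standard load balancers.

def pick_global_entry_point(osi_layer):
    if osi_layer == 7:
        return "Azure Front Door"
    if osi_layer == 4:
        return "Traffic Manager"
    raise ValueError("the global balancing choice here is between layer 4 and layer 7")

print(pick_global_entry_point(7))  # Azure Front Door
print(pick_global_entry_point(4))  # Traffic Manager
```

Consistency is the design point: match a global layer 7 service with regional layer 7 services, and a global DNS solution with regional layer 4 services.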

Speaker 2

01:06:43 - 01:06:53

When I think about backup, don't actually think about the backup. That's kind of weird. Think about what you might want to restore. Do I want to restore everything? Do I want to be

Speaker 1

01:06:53 - 01:06:55

able to restore a database?

Speaker 2

01:06:55 - 01:06:58

Do I want to be able to restore a certain item?

Speaker 1

01:06:59 - 01:07:00

How much data can I lose?

Speaker 2

01:07:00 - 01:07:03

That might impact the frequency I'm doing those backups.

Speaker 1

01:07:04 - 01:07:13

So Azure Backup is a native solution. It can back up things in Azure, it can back up things from on-premises by using the Azure Recovery Services agent.

Speaker 2

01:07:14 - 01:07:35

It can integrate with Data Protection Manager, and there's also Azure Backup Server. So I can even back up things from on-prem into my Azure cloud. I can back up Azure VMs, I can back up file shares, I can back up SQL Server in IaaS VMs, SAP HANA in IaaS virtual machines. When I think about Azure Backup,

Speaker 1

01:07:37 - 01:07:42

it runs in 2 modes. So sometimes when I think about Azure Backup,

Speaker 2

01:07:47 - 01:07:52

It's actually different services. You'll see kind of backup vaults and recovery services vaults.

Speaker 1

01:07:53 - 01:07:55

Sometimes what it will actually

Speaker 2

01:07:55 - 01:08:02

do is it will copy the content into the vault. Hey, I'll maybe take

Speaker 1

01:08:02 - 01:08:08

a disk snapshot and I'll copy it into the backup vault. I might also keep disk snapshots locally so

Speaker 2

01:08:08 - 01:08:12

I could do a really quick restore for a limited amount of time.

Speaker 1

01:08:12 - 01:08:15

For other things, it doesn't actually copy it to the vault.

Speaker 2

01:08:16 - 01:08:32

What it really acts as is an orchestrator, because it's just not logical to copy it into a vault; it's in the same region anyway. For example, for a blob storage account, why would I copy the blob to a vault when I could just use blob snapshots?

Speaker 1

01:08:32 - 01:08:34

But what I do wanna do is say I

Speaker 2

01:08:34 - 01:08:46

only wanna keep this many or take them at this time. So Azure Backup can actually act as an orchestrator to take those snapshots of my blob storage account of my Azure file. So there are these different options available to me.

Speaker 1

01:08:47 - 01:08:55

We can go and see some of these things. So if I was to look super quick, we actually start looking

Speaker 2

01:08:55 - 01:08:56

at a storage account. So if

Speaker 1

01:08:56 - 01:09:37

I just go and look at my storage account for a second, what we can see is there's actually a really rich set of data protection options it has. Now notice I've turned on operational backup with Azure Backup. So it's not copying the data to the vault, but Azure Backup is the 1 that is actually going to go and create these. Now, you'd assume it would be snapshots, but it's not even doing that, because we have this whole capability here with blob: I can have a point-in-time restore. It has features like versioning, soft delete for the blob and the containers.

Speaker 1

01:09:37 - 01:09:51

I can have a whole change feed, which means if I've got those things turned on, I can actually go back to any previous point in time I want. But Azure Backup has gone and configured those settings for me based on my requirements.

Speaker 2

01:09:53 - 01:10:02

Azure Files can do exactly the same kind of thing. It can go and create snapshots. If I go and look at my files for a second.

Speaker 1

01:10:05 - 01:10:06

If I look at snapshots,

Speaker 2

01:10:09 - 01:10:35

notice there's all these snapshots that were created by Azure Backup. I created some manually a long time ago, but it is now keeping, is that 2 months? I can't do math. A couple of months of snapshots, and it's taking them at the same time every single day for me. So Azure Backup is not storing them, but it's orchestrating the actual solution.

Speaker 1

01:10:37 - 01:10:38

If I went and looked

Speaker 2

01:10:40 - 01:10:42

at my recovery services vaults, for example,

Speaker 1

01:10:45 - 01:11:08

We can see I have this whole concept of policies, and this is where I can configure. Look, this is an Azure File Share. The Azure File Share, you have a policy, and I can do things like, well, I want to retain a daily for a certain amount of time. I could create a weekly and keep it for a certain amount of time, keep it monthly, a yearly. I have a lot of granularity in what I can actually do with that.
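That daily/weekly/monthly/yearly retention granularity is a classic grandfather-father-son scheme; some toy arithmetic (an illustrative helper of mine, with made-up retention values) shows how many restore points such a policy keeps at steady state.

```python
# Illustrative only: total restore points retained by a vault policy that
# keeps N dailies, N weeklies, N monthlies and N yearlies simultaneously.

def restore_points(daily_days=30, weekly_weeks=12, monthly_months=12, yearly_years=7):
    return daily_days + weekly_weeks + monthly_months + yearly_years

# 30 dailies + 12 weeklies + 12 monthlies + 7 yearlies:
print(restore_points())  # 61 restore points at steady state
```

Longer retention means more stored recovery points, which is exactly the cost-versus-requirement balance the exam questions probe.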

Speaker 1

01:11:11 - 01:11:13

If I have like a virtual machine, actually, let's go back to that

Speaker 2

01:11:13 - 01:11:17

for a sec. Go back to my policies and look at virtual machines.

Speaker 1

01:11:19 - 01:11:52

Well, this can do some nice things. It's actually using disk snapshots to actually capture the state. It still integrates with the Volume Shadow Copy service running in Windows, or it can freeze the file system on Linux. But notice what I configured here is actually keep those snapshots 2 days locally with the disk in addition to copying it to the vault. So what that would let me do is if there was actually a problem, instead of having to copy the data from the vault back over, I can just restore the snapshot

Speaker 2

01:11:52 - 01:12:00

which will be super super fast. So I have those capabilities as well. Remember as well, In the

Speaker 1

01:12:00 - 01:12:06

same way we could replicate from the app, if I back up at the VM level, what do we understand?

Speaker 2

01:12:07 - 01:12:13

We understand the VM and maybe files and folders. I have 0 clue what a database is. So if I want granularity to

Speaker 1

01:12:13 - 01:12:22

be able to restore a database, then I probably need to do a backup within the guest. And then my restore granularity would be, hey, I can restore this database.

Speaker 2

01:12:22 - 01:12:58

So things like SQL in IaaS virtual machines, SAP HANA in IaaS virtual machines, there's a rich interaction with that. Also the backup vault, if it is stored in the vault, I can do GRS to make my backup data available cross region. And often I can also do a cross region restore to actually have that available should that region fail. So that's a high level quick view and I think about that business continuity but we're going to come back to some of this when we talk about that well architected framework. So let's jump to a new whiteboard.

Speaker 2

01:12:59 - 01:13:18

Okay So the next part is design data solutions. And I have a whole study cram for the DP 900 test, which is really a lot of this similar content. So I would recommend going and looking at that. It's in the playlist for this AZ305.

Speaker 1

01:13:20 - 01:13:29

When I think about data, there's really 3 buckets of data we ever think about. We have data that we consider is structured.

Speaker 2

01:13:31 - 01:13:53

So we have some kind of structure to our data. Think of databases. I have my data organized in kind of the idea where I have rows, columns. There's a schema that describes these are the attributes in this table. This is the format of them so structured

Speaker 1

01:13:56 - 01:14:02

The next type we have is semi-structured this could be documents it could be self-describing

Speaker 2

01:14:06 - 01:14:33

Commonly you might think of something like a JSON document or XML, even HTML. They're all self-describing, they do have a structure, but it's not predefined, there's not a schema. In this case it's just self-describing what that is. And then of course we just have unstructured. This could be documents, this could be media.

Speaker 2

01:14:34 - 01:14:41

It's just something I need to store. Blob is a very common type of solution around this.

Speaker 1

01:14:42 - 01:14:44

When I think about unstructured,

Speaker 2

01:14:44 - 01:15:13

we'll start with that and kind of build upwards. There are various services in Azure that facilitate this type of data, but the key 1 we're going to start with is really thinking about a storage account. A storage account is a key building block for many things in Azure. Many other richer services actually sit on top of a storage account. A managed disk, fundamentally, is using a storage account.

Speaker 1

01:15:14 - 01:15:19

And there are a number of different services we actually expose. Fundamentally,

Speaker 2

01:15:20 - 01:15:43

we can think about the idea of blob, some binary large object. Now, for blob, we have block blob: it's made up of blocks that get committed to the blob. We have page blob, where it's made up of pages, very good for random read/write anywhere in the file. And we have append.

Speaker 2

01:15:44 - 01:16:10

Append blob: I just need to add, commonly, keep appending to the end of it. So there's different types of blob. And also what we have is files. Predominantly Azure Files was built around SMB, but they do actually now give you the option of NFS as well, and then the option of queues. A very simple first in, first out solution.
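That first-in, first-out behavior of Queue storage can be sketched in miniature with a plain in-memory deque (the message names are made-up; this just models the semantics, not the Azure SDK):

```python
# Azure Queue storage semantics in miniature: messages come out in roughly
# the order they went in (FIFO), modeled here with a plain deque.
from collections import deque

queue = deque()
for msg in ("order-1", "order-2", "order-3"):
    queue.append(msg)          # enqueue (like Put Message)

first = queue.popleft()        # dequeue (like Get Message + Delete Message)
print(first)                   # order-1: the oldest message comes out first
```

A producer drops work items in; a consumer pulls them out in arrival order, which is what makes queues a simple decoupling mechanism between components.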

Speaker 1

01:16:12 - 01:16:20

Within the storage account, so those are the types of data I can have inside it, it has various attributes of its own.

Speaker 2

01:16:20 - 01:16:49

So the storage account has a certain type. So these are all the objects supported, and then we have a type of the storage account. Now the common 1 we're going to see most of the time is just a standard, and it's this General Purpose v2. That supports all of the different types of data we might want. It has things like tiering: hot, cool, archive for blob, or transaction optimized, hot, cool for files.

Speaker 1

01:16:51 - 01:17:01

There is a general purpose v1. I can't really think of a reason to use general purpose v1 today. So if ever you see general purpose v1 as the answer to a problem

Speaker 2

01:17:02 - 01:17:04

you can probably eliminate that right from the start.

Speaker 1

01:17:05 - 01:17:15

Then we also have premium. Now with premium, they are tied to a certain type of service, and the

Speaker 2

01:17:15 - 01:17:19

primary ones you're gonna deal with are block and files.

Speaker 1

01:17:20 - 01:17:52

What premium gives us is very high performance, generally lower latency. There is also page blob option. And what we'll see is when we look at these different types of accounts, they may not have all the same options available. So for example, if I go and look for a second, and if I think about, okay, I want to create a storage account, and I'll create a new 1. Now you can start to see some of the options that are available.

Speaker 1

01:17:52 - 01:18:09

Well I have to give it obviously a subscription, a resource group like any other resource, a name that has to be globally unique in all of Azure. It does get deployed to a region, but you have this idea of the performance. Standard or premium? If I pick standard, it's not even asking me the type,

Speaker 2

01:18:09 - 01:18:22

it doesn't even offer me general purpose v1. It's saying you probably want a general purpose v2. If I pick premium, now I pick the type, block blob, file shares, or page blobs.

Speaker 1

01:18:22 - 01:18:31

But what I want you to notice is, what is the redundancy options? As soon as I pick premium, well for that 1 it's just LRS.

Speaker 2

01:18:34 - 01:18:47

That one's LRS and ZRS. That one's LRS and ZRS. There is no GRS. So with premium, I can never have a globally redundant solution.

Speaker 1

01:18:47 - 01:18:56

So that's the important thing to remember. When I think about the options available, yes, premium is going to give me the best performance, but it's going

Speaker 2

01:18:56 - 01:19:02

to limit some of my other options. So as we saw, they have a type,

Speaker 1

01:19:02 - 01:19:05

they get deployed to a certain region,

Speaker 2

01:19:07 - 01:19:10

but then also we have those replication options.

Speaker 1

01:19:14 - 01:19:40

And those options are going to vary depending on what is the type of that storage account. Now before we go any further, how many storage accounts might you need in your architecture? Well, think about those attributes. If there are different sets of requirements, maybe 1 set of requirements is: I need the highest performance, lowest latency solution that's resilient to

Speaker 2

01:19:40 - 01:19:42

a data center failure in this region.

Speaker 1

01:19:42 - 01:19:44

Well, okay, I can use premium.

Speaker 2

01:19:45 - 01:19:48

If there was another requirement that was, we need

Speaker 1

01:19:48 - 01:19:54

the ability to tier data and I need to be geo-redundant, well, okay, then I know I'm looking at

Speaker 2

01:19:54 - 01:20:15

general purpose v2, because then for the replication, I can get the different options. Those GRS replications we talk about are always within the same geopolitical boundary. I'm not gonna replicate data outside of some data sovereignty line you may have, except for Brazil South. Today that replicates to South Central US.

Speaker 1

01:20:17 - 01:20:44

There's encryption at the storage account level, so maybe if I need different encryption, I might want different storage accounts. Although there are things called encryption scopes now, so for Blob I can actually use different keys for different sets of data, but different isolation requirements, different replication requirements, maybe I need certain features that are incompatible with each other. Those would all be reasons I might drive to have multiple storage accounts. And when I think of the resiliency,

Speaker 2

01:20:45 - 01:20:53

and I kind of drew replication here, there are different options. So the base level is LRS, locally redundant storage.

Speaker 1

01:20:54 - 01:20:57

With locally redundant storage, there are always 3 copies of the data,

Speaker 2

01:20:58 - 01:21:19

but it's within 1 storage cluster, i.e. Within 1 particular building. If I do ZRS, there's 3 copies, but those copies are now distributed over 3 different data centers. So I have resiliency from those. And then I could think about kind of a region

Speaker 1

01:21:19 - 01:21:22

2. So if I do GRS,

Speaker 2

01:21:23 - 01:21:29

well I have the 3 copies and then it replicates to have another 3 copies over here.

Speaker 1

01:21:30 - 01:21:31

And then there are combinations.

Speaker 2

01:21:31 - 01:21:40

So I can do like GZRS, where the 3 copies are distributed over AZs here, and then I have 3 copies in a particular data center

Speaker 1

01:21:40 - 01:21:43

there. Sometimes you'll see an RA

Speaker 2

01:21:45 - 01:22:06

variant, RAGRS, RAGZRS. That means for some of the services, for example, blob, I can read that copy in the paired region, but I can't write to it. It's a read-only copy, so I get read access to it. That does not work for things like Azure Files. So I can't do it for Azure Files.
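The read access in RA-GRS/RA-GZRS comes via a well-known secondary endpoint: the account name with a `-secondary` suffix. A tiny helper shows the naming convention (the account name is a made-up example):

```python
# With RA-GRS / RA-GZRS, blob data can be read (but not written) from the
# paired region via the account's secondary endpoint, which is simply the
# account name with "-secondary" appended.

def secondary_blob_endpoint(account_name):
    return f"https://{account_name}-secondary.blob.core.windows.net"

print(secondary_blob_endpoint("mystorageacct"))
# https://mystorageacct-secondary.blob.core.windows.net
```

An application can fall back to reading from this endpoint if the primary region is unavailable, which is why the RA variants matter for blob but, as noted, not for Azure Files.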

Speaker 2

01:22:08 - 01:22:17

So we have the different resiliency options. So once again, if the requirement is, hey, I need to survive a data center failure, hey, that means ZRS.

Speaker 1

01:22:18 - 01:22:21

Or, if ZRS is not an option, maybe it's GRS.

Speaker 2

01:22:21 - 01:22:37

Maybe they might say AZs are not available in this region. What is another way I could survive? Well, GRS, I'm still surviving a data center failure because I've got 3 copies going somewhere else as well. So that would be another way I could actually leverage and solve that problem.
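The decision just described can be condensed into a small picker function (my own sketch of the reasoning, not an Azure API): zone redundancy if zones exist, otherwise fall back to geo redundancy to still survive a data center failure.

```python
# Sketch of the redundancy decision: survive a region failure -> GRS/GZRS;
# survive a datacenter failure -> ZRS where availability zones exist,
# otherwise GRS (the paired region still protects against a DC loss).

def pick_redundancy(survive_datacenter, survive_region, zones_available):
    if survive_region:
        return "GZRS" if zones_available else "GRS"
    if survive_datacenter:
        return "ZRS" if zones_available else "GRS"
    return "LRS"

print(pick_redundancy(True, False, True))    # ZRS
print(pick_redundancy(True, False, False))   # GRS (no AZs in the region)
print(pick_redundancy(True, True, True))     # GZRS
```

This is exactly the exam-style elimination: match the stated failure domain to the cheapest option that covers it.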

Speaker 1

01:22:39 - 01:22:46

Now when I come back to the features, there are some key things we have. So blobs,

Speaker 2

01:22:46 - 01:22:59

we put them in containers. It is a flat structure; there are no folders. If I want folders, there's the hierarchical namespace I can turn on, which is then Azure Data Lake Storage Gen2. Then I have true folders.

Speaker 2

01:22:59 - 01:23:03

I can do true moves, I can do a rename of the file, etc.

Speaker 1

01:23:05 - 01:23:21

The account type does change some of the features I have available. So here, if the type is standard, and that's really a key point, for blob I have access tiers. So here,

Speaker 2

01:23:23 - 01:23:38

we have access tiers. So premium is a different type of account. So that's just premium. There's no tiering in premium. If I picked premium block blob, it's just premium.

Speaker 2

01:23:38 - 01:24:11

But if I pick standard, then I have hot, cool, and archive. So I actually have true tiering available. There's things like lifecycle management that's a native feature that can automatically move data between them. Maybe it's not been modified for a certain amount of time or accessed for a certain amount of time. A key point here is archive is actually offline.
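The lifecycle management feature mentioned here is configured as a JSON policy on the storage account. A sketch of what such a policy looks like (the rule name, prefix filter, and day thresholds are made-up examples; the field names follow the documented policy schema):

```json
{
  "rules": [
    {
      "enabled": true,
      "name": "age-out-logs",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "logs/" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool":    { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete":        { "daysAfterModificationGreaterThan": 2555 }
          }
        }
      }
    }
  ]
}
```

The idea is exactly what the speaker describes: data not modified for a while automatically flows hot, to cool, to archive, and is eventually deleted, without any manual tiering.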

Speaker 2

01:24:13 - 01:24:28

I have to move it back into cool or hot to actually be able to read and access that data. It's the cheapest option. So if I think about why would I have these things? Dollars. So what do we typically pay for?

Speaker 2

01:24:28 - 01:24:29

Well, we pay for capacity,

Speaker 1

01:24:32 - 01:24:34

but then we also pay for transactions.

Speaker 2

01:24:38 - 01:24:55

Now premium, it barely charges you for transactions at all, but I pay more for the capacity. With hot, I pay the most for capacity, but the least for transactions. For archive, I pay the least for capacity, but I pay the most for transactions; I actually

Speaker 1

01:24:55 - 01:24:57

have to move it back.

Speaker 2

01:24:57 - 01:24:59

So there's this balance. So if

Speaker 1

01:24:59 - 01:25:04

I had data I was constantly interacting with, hot here makes the most sense.

Speaker 2

01:25:06 - 01:25:19

If it's data I have to keep for 7 years, and hey, you'll listen for key points: you have to keep this data for 7 years, and you can wait up to a day to be able to access the data, that's going to be archive.

Speaker 1

01:25:20 - 01:25:22

If it's, hey, I have to keep this data for

Speaker 2

01:25:22 - 01:25:29

a prolonged period but need immediate access, what's the cheapest way to store it? That would be cool. Cool is still available instantly.

Speaker 1

01:25:31 - 01:25:32

You can see how those costs

Speaker 2

01:25:33 - 01:25:49

actually balance out. So let's look for a second at the costing page. So here we can see the idea of premium. So notice the pricing.

Speaker 1

01:25:49 - 01:25:58

Premium is 15 cents per gigabyte. Way more than hot, which is way more than cool, which is way more than archive. Archive is, what is that?

Speaker 2

01:25:58 - 01:26:10

I can't even do the math on that today. A hundredth of a penny. I guess a tenth of a penny basically, tiny, tiny amount. So that's the cost of actually storing it. So I pay a lot less

Speaker 1

01:26:10 - 01:26:27

money for the storage. But then if we actually go and look at the operations: for premium, some of them are free, you don't pay anything, and the actual interactions are really cheap. They get a bit more expensive for hot,

Speaker 2

01:26:28 - 01:26:52

a lot more expensive for cool. And archive, well, read operations have this big price because I have to bring it back. So there's a whole set of data retrieval charges, and it costs more to actually do things against it. So we have that balance of, okay, what's the right thing I actually need? SLAs vary as well based on these different services; you can go and check into those things.
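The capacity-versus-transactions balance can be made concrete with a toy cost model (all rates below are invented for illustration; check the real Azure pricing page): the same data can be cheaper in hot or in cool depending purely on how often it's touched.

```python
# Toy cost model, made-up rates: monthly cost = GB stored * capacity rate
# + operation batches * transaction rate. Hot charges more to store but
# less to touch; cool is the reverse.

def monthly_cost(gb, op_batches, capacity_rate, txn_rate):
    return gb * capacity_rate + op_batches * txn_rate

# Same 1 TB of data; rates are illustrative only (hot: 0.02/GB, 0.005/batch;
# cool: 0.01/GB, 0.01/batch). Light access vs heavy access flips the winner.
light_hot  = monthly_cost(1000, 500,  0.02, 0.005)
light_cool = monthly_cost(1000, 500,  0.01, 0.01)
heavy_hot  = monthly_cost(1000, 5000, 0.02, 0.005)
heavy_cool = monthly_cost(1000, 5000, 0.01, 0.01)
print(light_hot, light_cool)  # 22.5 15.0 -> cool is cheaper when rarely touched
print(heavy_hot, heavy_cool)  # 45.0 60.0 -> hot is cheaper when touched a lot
```

That crossover is why "how often do you access it?" is the first question for picking a tier.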

Speaker 2

01:26:53 - 01:27:10

But that's really the point of those options. And notice there's encryption scopes I mentioned there, if I wanna use different encryption keys at that blob level. So that's why we have different tiers. I have different requirements. Hey, I need to access it really frequently.

Speaker 2

01:27:11 - 01:27:12

Okay, hot would

Speaker 1

01:27:12 - 01:27:18

be good. Hey, I need the lowest possible latency, highest performance. I don't need geo-redundancy.

Speaker 2

01:27:19 - 01:27:21

Okay, I'll use premium. So you're

Speaker 1

01:27:21 - 01:27:25

gonna balance those things. Blob also has locking options.

Speaker 2

01:27:26 - 01:27:51

So you'll hear a lot about the idea of kind of immutable. Immutable is kind of that proof that, hey, I'm not changing this in any way. So on Blob, I can do legal holds. So you can't change, I can't delete this until I take off this legal hold, or there's time-based holds. Got to keep this for 60 days or a year or something like that.

Speaker 2

01:27:51 - 01:27:56

So that enables me to actually stop changing those types

Speaker 1

01:27:56 - 01:28:07

of things. If I think about getting data into Blob, obviously there are tools where I can copy it over the network. There's Azure Storage Explorer, there's AZ Copy.

Speaker 2

01:28:08 - 01:28:27

Offline there's things like Import/Export, with BitLocker-encrypted disks that we send and they receive. There's Azure Data Box, a big appliance. They ship us the appliance, we copy the data onto it, we ship it back, and they put it in the storage account. So there's different options on how we can actually get that data into there.

Speaker 1

01:28:29 - 01:28:32

Azure Files, once again, that has a premium option

Speaker 2

01:28:33 - 01:28:46

where, hey, lowest latency, highest performance. Azure Files is all about SMB, typically, although again, there is kind of that new NFS

Speaker 1

01:28:46 - 01:28:47

4.1

Speaker 2

01:28:48 - 01:29:14

option available for Azure Files as well, where that has to then integrate with a virtual network, where I lock it down with a service endpoint or private endpoint. We'll talk about those constructs in a second. For SMB, I can do ACLs based on, for example, integrating it with Active Directory Domain Services. That is regular Active Directory. That is not Azure AD, that is regular Active Directory.

Speaker 2

01:29:15 - 01:29:34

My storage account gets joined in a way, it gets a Kerberos object in my AD, so it can then validate tokens and I can get granular ACLs. Or it can integrate with Azure AD domain services, but it's a lot more work and it's really not that pleasant.

Speaker 1

01:29:35 - 01:29:52

There's a thing called Azure File Sync. Azure File Sync is really nice if you think about the idea that, well, I have that share, my Azure file share in the Cloud, but maybe what I also have, and

Speaker 2

01:29:52 - 01:29:54

this could actually be for migration purposes,

Speaker 1

01:29:54 - 01:30:02

I have existing file shares on-premises. Azure File Sync will replicate

Speaker 2

01:30:04 - 01:30:12

between them. There's always 1 cloud endpoint in a sync group, but I could use it to migrate data. Hey, I want to take this and move it to Azure file shares,

Speaker 1

01:30:12 - 01:30:25

so I could set up Azure File Sync. Or I want to keep these and use this as that kind of key synchronization point and failover point. And what's nice about this is Azure File Sync will keep the ACLs. So if I then did this option,

Speaker 2

01:30:25 - 01:30:32

the Active Directory Domain Services integration, those ACLs would be enforced even when I access the Azure File Share.

Speaker 1

01:30:32 - 01:30:45

Another nice feature is this has tiering. So what I can actually say is, hey, if I get to 80% capacity, take the least used content and just store it

Speaker 2

01:30:45 - 01:30:53

in the Azure File Share, but leave a thumbprint here so it looks like it's here, and I'll dynamically pull it down if someone actually tries to access it.
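That 80%-capacity tiering behavior is a free-space policy; a little arithmetic (an illustrative helper of mine, not the Azure File Sync agent's actual algorithm) shows how much data the agent would need to tier away:

```python
# Sketch of a cloud tiering free-space policy: with an 80% capacity
# threshold (i.e. keep 20% free), the coldest files are tiered to the
# Azure file share until the required free space is restored.

def bytes_to_tier(volume_size, used, free_space_percent=20):
    required_free = volume_size * free_space_percent // 100
    current_free = volume_size - used
    return max(0, required_free - current_free)

# 1000-unit volume, 900 used, 20% free-space policy -> tier 100 away.
print(bytes_to_tier(1000, 900))  # 100
print(bytes_to_tier(1000, 700))  # 0 (already enough free space)
```

The tiered files leave a pointer behind on the server, so they still appear local and are recalled on access.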

Speaker 1

01:30:55 - 01:31:04

There is tiering again. Once again, there's that premium option, and then it doesn't have archive. There's no offline, but it has something called transaction optimised.

Speaker 2

01:31:04 - 01:31:20

So there's transaction optimised, hot and cool. And it has that same flow of cost of capacity versus, we can see that super quick. So if we look at the pricing page again, says premium, transaction optimised, hot and cool.

Speaker 1

01:31:21 - 01:31:36

And here you can see premium, once again you pay more for the storage, transaction optimized you pay more than hot, which you pay more than cool. But for the actual transactions, well, premium, you don't pay anything for transactions.

Speaker 2

01:31:36 - 01:31:41

You pay less for transaction optimised, you pay more for hot and more for cool.

Speaker 1

01:31:41 - 01:31:48

So it's always that balance of what is my requirement? What do I need to actually have?

Speaker 2

01:31:49 - 01:31:52

And then I pick the most efficient option

Speaker 1

01:31:52 - 01:32:01

for me. So, we have these choices so that I can really pay the right amount for what I need. Azure is all about consumption, as

Speaker 2

01:32:01 - 01:32:04

is the cloud. Pick the right option

Speaker 1

01:32:04 - 01:32:09

so I'm only paying for what I actually need. That's the driver. When you're architecting, you're going

Speaker 2

01:32:09 - 01:32:17

to see that for everything we do, be it compute or storage or network, whatever that is, there's always options. And so what is the

Speaker 1

01:32:17 - 01:32:27

1 that makes the most sense in terms of getting the requirements met? And generally a requirement is also optimize my spend. So always keep that in mind. Whenever you see a question,

Speaker 2

01:32:27 - 01:32:38

if you see multiple solutions that all look good, Okay, which 1 costs less? As an architect, I want to do the right thing for my customer. That's typically the thing we're going to pick.

Speaker 1

01:32:40 - 01:32:53

There's also actually, if I think about files, SMB and NFS, Azure NetApp Files as well. So that's NetApp filers running in Azure data centers. It's provided as a native Azure service. Azure NetApp Files is generally

Speaker 2

01:32:53 - 01:33:20

a solution for when I need a higher level of performance. It goes to a higher level of performance even than the premium Azure Files. So that might be a solution. Maybe I'm used to NetApp today, I'm used to the NetApp management, maybe I want to replicate from a NetApp filer on-prem; Azure NetApp Files would be a good option for that. When I'm thinking about other services, I talked about blob, files and queues.

Speaker 2

01:33:20 - 01:33:27

There's actually something else that lives on top of a storage account, but you don't see the storage account.

Speaker 1

01:33:30 - 01:33:38

So think, if you ever created virtual machines or AKS clusters or even other things, you've probably seen the idea of a managed disk.

Speaker 2

01:33:43 - 01:33:45

Managed disk actually

Speaker 1

01:33:48 - 01:33:51

is a page blob. When I create a managed disk, which is

Speaker 2

01:33:51 - 01:33:55

a first-party Azure Resource Manager resource with RBAC and snapshots and

Speaker 1

01:33:55 - 01:34:20

all that wonderful stuff, what it's actually doing is creating a storage account and creating a page blob; it just hides it from me. In ye olde days, we had to manually manage the storage accounts and create the page blobs, and then we had to worry about all the limits of the page blob and the limits of the storage account. If we put too many page blobs in 1 storage account, we hit the storage account limits. It was horrible. So managed disk basically abstracts all of that away.

Speaker 2

01:34:21 - 01:34:44

Now there are different types of managed disk. You'll see there's things like a standard hard disk drive, a standard SSD, a premium SSD and then an ultra disk. I always joke the next 1 will be called the super-duper wow that's fast disk.

Speaker 1

01:34:44 - 01:35:00

Not funny. But we have these different options. And a core point of these, once again, they offer us different capabilities. As you would expect, they're getting higher performance as we go down. So the performance gets better, but the cost goes up.

Speaker 1

01:35:01 - 01:35:31

For most of these, what we typically have here is as the capacity goes up, so does the performance. It's kind of like that. So if I want the disk size I pick, could be based on the capacity I need, or it might actually be based on the performance I need. So I get a bigger disk than I need because I need more IOPS or throughput. UltraDisk is different.

Speaker 1

01:35:31 - 01:35:35

UltraDisk actually has 3 dials. It has capacity,

Speaker 2

01:35:37 - 01:35:43

but it also has IOPS and throughput, and these you can actually dynamically change.

Speaker 1

01:35:44 - 01:35:49

So while the disk is being used, I can increase the IOPS because I've got some batch job running, then

Speaker 2

01:35:49 - 01:35:52

I can decrease it again when I don't need it anymore.

Speaker 1

01:35:53 - 01:36:05

Premium SSD actually lets me change the performance of my disk separate from the actual capacity. I'd pay as if it were that bigger size, but

Speaker 2

01:36:05 - 01:36:12

it means I don't have to grow and shrink the disk. So if I jumped over here and look at disks for a second.

Speaker 1

01:36:13 - 01:36:17

So these are all, they're page blobs hidden away, I just can't see them.

Speaker 2

01:36:18 - 01:36:21

But if I actually look at premium disk only.

Speaker 1

01:36:22 - 01:36:32

So notice here I have the size, and notice the IOPS and the throughput go up as the disk gets bigger. But I can change the performance tier.

Speaker 2

01:36:32 - 01:36:35

So that won't make the disk bigger, but

Speaker 1

01:36:35 - 01:36:46

it will give me the performance as if it was a bigger disk. Now I will pay for this bigger number. So you might say, why on earth would I ever want to do that? Well, I can increase the size of disks in Azure,

Speaker 2

01:36:46 - 01:36:47

but I can't shrink them.

Speaker 1

01:36:47 - 01:36:52

So if I needed a higher performance for a certain duration of time, but I want

Speaker 2

01:36:52 - 01:37:08

to bring it back down again, I don't want to make the disk bigger because then I'm stuck. But with Premium SSD, I can raise the performance tier up, run at that high performance, then bring it back down. There's certain time limits around that, like I can't constantly do it, I think it's 12 hours I have to leave

Speaker 1

01:37:08 - 01:37:37

it or something. Also, standard SSDs and premium SSDs actually have bursting. So if we look at the pricing details of these disks, 1 of the nice things is for the smaller disks, like the P1 and P2, notice you have this "with bursting" in brackets for throughput and IOPS. So we can actually burst up for up to, I think it's 60 minutes,

Speaker 2

01:37:37 - 01:37:47

maybe it's 30 minutes, 1 of those things. For a certain duration of time, it can burst up to a bigger number. I'm sure it explains that bursting in here.

Speaker 1

01:37:48 - 01:37:51

And what it's doing is, this is the credit-based bursting.

Speaker 2

01:37:51 - 01:37:55

Oh, it's 30 minutes, it says it here, 30 minutes. For up to 30 minutes, we can do the burst.

Speaker 1

01:37:56 - 01:38:19

And I don't pay for that. It's the whole idea of accruing credit. So for the smaller disks, the P20 and smaller, I can get that bursting for free. For the bigger disks, it's called on-demand bursting. I have to turn that on and pay for it, but then I can go to these much higher numbers, see 30,000, but I pay for that.
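The credit-based bursting described here behaves like a token bucket: seconds spent under the baseline bank credits, seconds spent over the baseline spend them. A minimal sketch, with made-up numbers (a real disk's baseline, burst limit, and bucket size come from its tier):

```python
class BurstBucket:
    """Credit-based bursting sketch: unused baseline IOPS accrue as credits
    that can later be spent to run above baseline (numbers illustrative)."""
    def __init__(self, baseline_iops, burst_iops, max_credits):
        self.baseline = baseline_iops
        self.burst = burst_iops
        self.credits = max_credits  # assume we start with a full bucket
        self.max_credits = max_credits

    def tick(self, demanded_iops):
        """One-second tick: return the IOPS actually served."""
        if demanded_iops <= self.baseline:
            # Under baseline: bank the unused headroom as credits.
            self.credits = min(self.max_credits,
                               self.credits + (self.baseline - demanded_iops))
            return demanded_iops
        # Over baseline: spend credits, capped at the burst limit.
        extra = min(demanded_iops, self.burst) - self.baseline
        spend = min(extra, self.credits)
        self.credits -= spend
        return self.baseline + spend

bucket = BurstBucket(baseline_iops=120, burst_iops=3500, max_credits=500)
print(bucket.tick(3500))  # bursts while credits last
print(bucket.tick(3500))  # credits exhausted: back to baseline
```

Once the bucket empties, the disk falls back to its baseline until quiet periods refill it, which is why credit bursting suits spiky workloads, not sustained ones.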

Speaker 1

01:38:19 - 01:38:30

So it's different from that kind of free credit bucket; for the bigger disks I have to pay. Standard SSD also has bursting. You kind of see those options there as well for

Speaker 2

01:38:30 - 01:38:53

the smaller disks, but not for the bigger ones. Plain standard hard disk drive does not. And then UltraDisk, the key point here is notice you're paying separately for capacity, IOPS and throughput. So we have those different options depending on what it is we actually need to do. So that's a key point now.

Speaker 2

01:38:54 - 01:39:24

Storage accounts are encrypted. For the storage account, we can always pick: is it a Microsoft-managed key, or is it a customer bring-your-own key? If it's bring your own key, it gets stored in Azure Key Vault. Both storage accounts and managed disks now have the option that it will just point to a key, not a specific version of a key. And then if I create a new version of the key because I want to rotate it, it will automatically get detected and applied.

Speaker 2

01:39:26 - 01:40:00

For managed disks, for that encryption, right, I have the Microsoft managed. Or I can create a disk encryption set where I'm bringing my own key from Azure Key Vault. So I have those choices. So there's encryption at rest for the actual storage account, the disk. I can think about this host level encryption I can also turn on.

Speaker 2

01:40:00 - 01:40:26

So the temporary files it might create, the cache, I can turn on host-level encryption to encrypt that data. And even within the operating system as well, I can do Azure Disk Encryption. So Azure Disk Encryption for Windows would use BitLocker; for Linux, DM-Crypt. So that would actually now encrypt inside the OS as well. So I have all these different options available to me depending on what I actually need.

Speaker 1

01:40:28 - 01:40:45

So those are some of the key constructs we actually have in terms of storage. Obviously, the final thing we would kind of bring together on that is security. I mean, that's a huge part. And there's different levels when I think about security.

Speaker 2

01:40:46 - 01:41:04

So there's obviously security in terms of access to it. There's the firewalls on the services to restrict who can talk to it. There's integrations with networks like service endpoints to restrict access to certain subnets. There's private endpoints which can then be used from that VNet or connected networks, and we're going to talk more about that.

Speaker 1

01:41:05 - 01:41:11

But then in addition to the kind of access idea, so again access I

Speaker 2

01:41:11 - 01:41:34

can think about firewalls, IP and VNet rules. There's all the constructs we have around there, and on that VNet, again, service endpoints and private endpoints. We also have the rights. So at the data level, how can I access this? Now, there's an access key on a storage account. There's 2 of them.

Speaker 2

01:41:34 - 01:41:46

The whole idea of having 2 is that I could be using 1, and when I wanna rotate, I can switch to using the other 1: rotate 1 of the keys, switch to using that, and then rotate the other 1. So there's always 1 key I can be using.
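That rotate-and-switch dance is easy to get wrong in the heat of the moment, so here is the sequence as a toy state machine (the key names mirror Azure's key1/key2; the "regenerate" step stands in for the portal or CLI regenerate-key action):

```python
class StorageAccountKeys:
    """Sketch of the two-key rotation dance: there is always 1 valid,
    in-use key, so clients never see an outage during rotation."""
    def __init__(self):
        self.keys = {"key1": "secret-A", "key2": "secret-B"}
        self.in_use = "key1"

    def rotate(self):
        # Step 1: switch clients to the key we are NOT about to regenerate.
        idle = "key2" if self.in_use == "key1" else "key1"
        self.in_use = idle
        # Step 2: regenerate the now-idle key; nothing breaks because
        # clients have already moved off it.
        stale = "key2" if idle == "key1" else "key1"
        self.keys[stale] = f"new-{stale}-secret"

acct = StorageAccountKeys()
acct.rotate()
print(acct.in_use)        # key2: clients now sign with key2
print(acct.keys["key1"])  # key1 has been regenerated safely
```

Running `rotate()` again flips back the other way, so alternating rotations keep both keys fresh.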

Speaker 1

01:41:46 - 01:41:50

So we have the whole idea of those master access keys.

Speaker 2

01:41:54 - 01:42:08

Do not use them. There's even options now in storage accounts to disable them. So if I was to go and look for a second at my storage account. If I just pick 1, I don't know. I think it's in configuration.

Speaker 2

01:42:09 - 01:42:13

If I can find where configuration has gone, there we go.

Speaker 1

01:42:14 - 01:42:27

1 of the options you have now is allow storage account key access. So I can actually disable that. I can now say that those all-powerful account keys, again there's 2 of them,

Speaker 2

01:42:27 - 01:42:30

I don't want to let you use that, so I can disable that.

Speaker 1

01:42:30 - 01:42:37

Now you do have to be a little bit careful because yes, access keys are 1 of the things I can do. Another thing I

Speaker 2

01:42:37 - 01:43:00

can do is I can create a shared access signature. And there's 2 types. I can think about an account-level 1 where I can specify access to multiple types of service, like files and queues, or there's actually a service-level 1 which is specific to a certain type.

Speaker 1

01:43:01 - 01:43:03

And those can be time limited, they can

Speaker 2

01:43:03 - 01:43:04

be restricted to certain types of

Speaker 1

01:43:04 - 01:43:16

operations, they can be restricted to certain IPs. But they're signed by the access key. So if I disable the use of access keys, I

Speaker 2

01:43:16 - 01:43:20

can't use a shared access signature, because it's signed by that key. So you just

Speaker 1

01:43:20 - 01:43:24

have to be careful of that. As we saw earlier on, we looked

Speaker 2

01:43:24 - 01:43:50

at access control. There's also now role-based access control at the data plane. We have those data actions for many types of service, like the queues, like the blob we saw. So we have that as the other option to actually control that. In terms of data in transit, well, another option we have right here is I can turn on secure transfer.

Speaker 2

01:43:51 - 01:44:04

So I have to use HTTPS. If I was using Azure Files, it's going to require that we use SMB3 and make us use the encryption option. So we have those all available actually to us.
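The earlier point, that a SAS is signed by the account key and therefore dies with it, can be shown with a simplified sketch. This is not Azure's real string-to-sign format (the real one encodes permissions, expiry, resource, IP range and more in a service-defined layout); it just shows the HMAC relationship:

```python
import base64
import hashlib
import hmac

# Pretend account key, base64-encoded like Azure presents storage keys.
ACCOUNT_KEY = base64.b64encode(b"pretend-storage-account-key")

def sign_sas(string_to_sign: str, key: bytes) -> str:
    """Simplified sketch: a real SAS HMAC-SHA256-signs a service-defined
    string-to-sign (permissions, expiry, resource...) with the account key."""
    digest = hmac.new(base64.b64decode(key),
                      string_to_sign.encode(), hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

def verify_sas(string_to_sign, signature, key, key_access_enabled=True):
    # Disabling account-key access breaks SAS too: there is no valid key
    # left to recompute and check the signature with.
    if not key_access_enabled:
        return False
    return hmac.compare_digest(signature, sign_sas(string_to_sign, key))

sts = "sp=r&se=2030-01-01&sr=b"  # read-only, expiry date, blob resource
sig = sign_sas(sts, ACCOUNT_KEY)
print(verify_sas(sts, sig, ACCOUNT_KEY))                            # True
print(verify_sas(sts, sig, ACCOUNT_KEY, key_access_enabled=False))  # False
```

Tampering with any field in the string-to-sign also invalidates the signature, which is how the time limits and permission restrictions are enforced.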

Speaker 1

01:44:06 - 01:44:14

So that's a huge focus all about the unstructured data. If we now switch gears, again this

Speaker 2

01:44:14 - 01:44:15

is a review,

Speaker 1

01:44:17 - 01:44:33

we also have the idea of structured. Now there are multiple solutions in Azure for structured. A big 1 we always focus on, obviously, is SQL Server. So I can think SQL Server, a SQL-based database.

Speaker 2

01:44:33 - 01:44:46

And there are actually different types of this available. So if I think about SQL, so SQL database, there are different options available to us. The first 1 we'll think about is Azure

Speaker 1

01:44:48 - 01:45:04

SQL Database. So this is a pure PaaS, it's a managed solution for us. We do very, very little, it's fully managed, it supports very large databases up to like 100 terabytes. There's even a serverless option for auto scale.

Speaker 2

01:45:04 - 01:45:18

So I can have super large if I pick the right type. There is also this auto scale option. There's different service tiers available. So I'm gonna make service tiers.

Speaker 1

01:45:20 - 01:45:31

And there's really 3 key ones we think about. I can think there's general purpose. And this actually applies to some

Speaker 2

01:45:31 - 01:45:41

of the other types of Azure SQL database we're going to see as well. If I think about general purpose, this is really based around the

Speaker 1

01:45:41 - 01:45:42

idea that there's some node

Speaker 2

01:45:45 - 01:46:17

that's active and connecting to the storage that's got the database files. There's a bunch of spare nodes sitting around just in terms of capacity in Azure. If my node fails, 1 of the ones with spare capacity would say, okay, I'll connect to that storage now and then re-offer my database. There's a certain amount of downtime, there's a certain amount of failover as it attaches to the disk, et cetera.

Speaker 1

01:46:18 - 01:46:24

And there's the idea of business critical. So business critical, well, rather than having kind

Speaker 2

01:46:24 - 01:46:30

of those spares sitting around that will grab the storage account as needed,

Speaker 1

01:46:30 - 01:46:32

well now we have the idea of we have the node,

Speaker 2

01:46:32 - 01:46:35

but it's actually connecting to its own high-performance managed disk.

Speaker 1

01:46:36 - 01:46:40

But we have multiple nodes.

Speaker 2

01:46:40 - 01:46:42

We have this ring of the nodes.

Speaker 1

01:46:45 - 01:46:47

Now, there's still a primary.

Speaker 2

01:46:49 - 01:47:12

We have these secondaries. I can make, for example, this 1 read-access, and these are really forming an always-on availability group. So now I get a much faster failover if there was a problem. I can also get better kind of scale, because I can make those read-access. I have all these secondaries that can be made available and I can use those things.

Speaker 2

01:47:13 - 01:47:37

I can even have these zone redundant. Now there is actually a gateway layer above this as well that the initial request comes in through, and that has a similar option. So I could make this zone redundant, and then the gateway service above it would be zone redundant as well. And then there's hyperscale. So hyperscale is the idea that we actually shard the data.

Speaker 2

01:47:37 - 01:47:59

We want to separate the data out. SQL itself is not a multi-master type of service. It's a single primary. So even with hyperscale, we still have the idea of kind of this primary compute node, but then we can have, and we have configuration of this, secondaries and replicas, and we can make them read-access.

Speaker 1

01:48:00 - 01:48:02

But what this actually does is as data comes in,

Speaker 2

01:48:02 - 01:48:13

it kind of writes to a log service. And then what we have is multiple page servers. And these scale completely separately from this. So we have all these page servers.

Speaker 2

01:48:15 - 01:48:20

And these page servers have their own sets of storage. I get worse and worse at drawing.

Speaker 1

01:48:21 - 01:48:29

And that goes and writes. So now when it has a compute request and has to perform some operation, well, it can actually distribute

Speaker 2

01:48:29 - 01:48:56

the request over all these kind of shards of the data, so I get much higher performance, but I also get a much higher set of scale, because now I'm separating that out over all these different page servers that get queried through this primary. And again, I can do certain amounts of resiliency by having these secondaries at that compute layer as well. There's also things like private endpoint support so I can get an IP address within that actual cluster.
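The fan-out idea behind those page servers can be sketched with a toy router. This is not Hyperscale's real placement logic (internally it maps contiguous page ranges to page servers); it just illustrates data spread across servers so a scan touches all of them:

```python
import hashlib

PAGE_SIZE = 8 * 1024          # SQL Server database pages are 8 KB
PAGE_SERVERS = ["ps0", "ps1", "ps2", "ps3"]  # hypothetical page servers

def page_server_for(page_id: int) -> str:
    """Toy routing: spread page ids across page servers to show fan-out."""
    h = hashlib.sha256(str(page_id).encode()).digest()
    return PAGE_SERVERS[h[0] % len(PAGE_SERVERS)]

def read(offset_bytes: int) -> str:
    """Resolve a byte offset to a page, then to the server that owns it."""
    page_id = offset_bytes // PAGE_SIZE
    return f"fetch page {page_id} from {page_server_for(page_id)}"

# A scan over many pages fans out across every page server,
# which is where the extra throughput and scale come from.
servers_hit = {page_server_for(p) for p in range(1000)}
print(sorted(servers_hit))
```

Because each page server owns only a slice of the data, adding page servers grows both capacity and aggregate read throughput, while the single primary still serializes writes through the log service.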

Speaker 1

01:48:58 - 01:49:02

There are comparison documents. So this

Speaker 2

01:49:02 - 01:49:24

is quite nice. It talks about those 3: general purpose, Hyperscale and Business Critical. It talks about the types of service that are supported for them, compute sizes, how the storage works between them, maximum storage size, we get that 100 terabytes with Hyperscale. Talks about that multiple tiering.

Speaker 1

01:49:25 - 01:49:26

So there's some nice documents.

Speaker 2

01:49:27 - 01:49:56

And it talks about the geo-redundancy availability. Notice on that standard general purpose 1, hey, I just have the 1 copy, 1 replica, there's no read scale-out. There's zone redundancy in preview, those spares could be spread out, but that's really as good as it gets. Whereas with that business critical, there's 3 replicas, I can have a read scale-out instance, I can get full zone redundancy. Hyperscale can have up to 4 read scale-outs.

Speaker 2

01:49:57 - 01:50:14

So we have those different options available to us. And just like everything else we're going to do, we pick the 1 that meets the requirements. And we want to be cost optimized. Sure, I could use business critical all the time, but it's costing me a lot more money. So what is the requirement?

Speaker 2

01:50:14 - 01:50:24

What is the maybe speed of failover? What is the importance of this? There's different SLAs associated. Understand those things so you can pick the right 1 to meet the requirements.

Speaker 1

01:50:25 - 01:50:32

So that's kind of the Azure SQL database. Now the next 1 we get is Azure

Speaker 2

01:50:32 - 01:50:35

SQL Managed Instance.

Speaker 1

01:50:36 - 01:51:00

Now once again, this is a PaaS offering. So it's still PaaS, but it deploys into your virtual network. So it's PaaS, but in a VNet. It basically gives better compatibility. What's happening is it's deploying basically SQL into these virtual machines that you don't manage.

Speaker 1

01:51:01 - 01:51:20

It's managing SQL in the VMs, but it's a lot more regular type of SQL. So if I'm moving a workload from on-premises, for example, to Azure SQL, it's going to come down to what features do I need. If I'm using certain features today that I'm used to, like common

Speaker 2

01:51:20 - 01:51:34

language runtime, linked servers, the service broker, the SQL Server Agent, those things are not gonna work on Azure SQL Database. That's really built for a cloud-architected, optimized solution.

Speaker 1

01:51:34 - 01:51:36

If I'm moving something from on-prem and

Speaker 2

01:51:36 - 01:51:39

I'm using those types of features, well then Azure SQL MI is probably

Speaker 1

01:51:39 - 01:51:44

the solution. So it's still fully managed, I'm not patching SQL, but it's gonna have a much better compatibility.

Speaker 2

01:51:46 - 01:51:50

And I didn't really go into details, but with this, you have

Speaker 1

01:51:50 - 01:51:56

the option of kind of single database or elastic pool. So elastic pool is I have a set of resources

Speaker 2

01:51:56 - 01:52:08

that I can put multiple databases into, so they can kind of share and have a little bit of wiggle room in needing extra or less resource. If I do a single server, all of those resources are dedicated to just that 1 instance of a database.

Speaker 1

01:52:08 - 01:52:09

So it's guaranteed to be there for

Speaker 2

01:52:09 - 01:52:26

it, but there's no real movement on there. Managed instance has the same idea. I can have a single instance, or I can have an instance pool where multiple databases can share the same set of resources. So this is really gonna be all about that better compatibility.

Speaker 1

01:52:27 - 01:53:16

If you go and look at the feature comparison document, it's gonna walk through those. So if you see a question like, hey, we're moving a database, which solution, and you see Azure SQL Database and Azure SQL MI, look for some of these key features and look for the kind of noes for Azure SQL Database and yeses for Azure SQL Managed Instance. I think some of the biggest ones that I've seen are things like this common language runtime, cross-database transactions; notice you have things like the agent as well. So just take a look, and I'll put all these links in the description below this video, but understand some of the differences between them. But really it's gonna come down to compatibility. Hey, I'm using this thing, so this SQL Server Agent is a big 1.
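That yes/no comparison boils down to a simple decision rule, which can be sketched as a helper. The feature list is the 1 called out in this section; always verify it against the current comparison doc, since feature support changes:

```python
# Features this section calls out as "no" on Azure SQL Database but
# "yes" on Managed Instance (verify against the current comparison doc).
MI_ONLY_FEATURES = {
    "common language runtime",
    "cross-database transactions",
    "linked servers",
    "service broker",
    "sql server agent",
}

def choose_sql_service(required_features):
    """Return the recommended service plus any compatibility blockers."""
    needed = {f.lower() for f in required_features}
    blockers = needed & MI_ONLY_FEATURES
    if blockers:
        return "Azure SQL Managed Instance", sorted(blockers)
    # All things equal, prefer the plain PaaS offering.
    return "Azure SQL Database", []

print(choose_sql_service(["SQL Server Agent", "JSON support"]))
print(choose_sql_service(["JSON support"]))
```

The point of the tuple's second element is exam technique: the question usually names exactly 1 of these blocker features, and spotting it decides the answer.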

Speaker 2

01:53:17 - 01:53:31

What are you needing? And then that's probably why I would use Azure SQL MI. If all things are equal, hey, I'd rather use Azure SQL Database. But if I have some compat requirement... It

Speaker 1

01:53:31 - 01:53:32

used to be this might be because it was

Speaker 2

01:53:32 - 01:53:37

in the VNet as well, but with private endpoints, I can still get an injection directly into the virtual network.

Speaker 1

01:53:38 - 01:53:44

Of course, I still have the option for SQL running in an IaaS virtual machine,

Speaker 2

01:53:45 - 01:53:52

just a regular VM. And even with that, that's obviously the highest compatibility. It's SQL Server running in

Speaker 1

01:53:52 - 01:53:56

a VM. But now you're managing that whole thing.

Speaker 2

01:53:57 - 01:54:00

I'm patching, I'm backing up. But

Speaker 1

01:54:00 - 01:54:14

there are actually features to help you. There's a SQL Server IaaS agent extension, and it's free. What that actually does is help me track licensing, but it also gives me automated backup, automated patching, and

Speaker 2

01:54:14 - 01:54:28

a whole bunch more. So that is available for me. There's a SQL Data Migration Assistant, the DMA tool, that can help me migrate on-premises SQL up to those various Azure solutions.

Speaker 1

01:54:30 - 01:54:37

When I think about scaling of any of these, a lot of the times I can scale up,

Speaker 2

01:54:39 - 01:55:10

I get a bigger SKU. A lot of these don't really scale out very well because it's a single-master model. Now what I can do, depending on my requirement, if it's read access that I'm trying to scale out, hey, I can add read replicas. That might be a solution, and I can do that for the premium SKUs, for the business critical SKUs, and Azure SQL Database and Managed Instance both have these automatically provisioned read replicas. So that can help scale outwards.

Speaker 2

01:55:11 - 01:55:14

Because now I can go to different instances for my read purposes.

Speaker 1

01:55:16 - 01:55:27

If I have an elastic pool, remember, it's a group of resources shared by multiple databases. So then I do actually get some wiggle room in the resource I can consume, because I'm sharing that resource with other databases.

Speaker 2

01:55:28 - 01:55:54

Sometimes I can use more, sometimes I can use less. Hyperscale obviously shards the data out. So I have this huge scaling this way, and it's gonna automatically provision those page servers based on capacity and performance. And I can also get read scale-out if I provision at least 1 secondary replica. There are tools to manually shard as well, the Azure Elastic Database tools, so I could do the sharding myself.

Speaker 2

01:55:56 - 01:56:17

When I think about data security on the solutions, the database, so again we come to security, there's different aspects of security I actually have to think about. A big 1 is, well, what is the data? So I have to classify the data itself.

Speaker 1

01:56:17 - 01:56:20

Now there's different tools to do that. But

Speaker 2

01:56:20 - 01:56:32

is it public data? Is it confidential? Is it restricted? There's things like data discovery and classification solutions built into SQL Server. Things like Azure Purview that will actually go and look at your data.

Speaker 2

01:56:33 - 01:56:40

It can give me complete data lineage. There's things I can do to classify. And once I classify it, I can apply different things to it.

Speaker 1

01:56:40 - 01:56:43

I can think about security when it's at rest.

Speaker 2

01:56:47 - 01:56:51

For Azure SQL Database, there's transparent data encryption, TDE,

Speaker 1

01:56:52 - 01:57:14

so that's just always encrypted. If I think about in transit, well, once again, I have encryption on the connection. Azure SQL MI is running in the VNet. Regular Azure SQL Database can have private endpoints, so it's a restricted connection to them and I can restrict to only those. So I have that capability.

Speaker 1

01:57:15 - 01:57:35

And if I think about in use, there's 2 different ways of thinking about the security, the encryption in use. There's the idea of hiding data from the end user. So think of a social security number. So there's dynamic data masking. This allows me to basically write a function

Speaker 2

01:57:36 - 01:57:47

that, hey, if the data is this classification, hide all of it except the last 4 characters. That would be useful for a social security number. So it's not encrypting it differently,

Speaker 1

01:57:47 - 01:57:49

but when I try and view the data, if

Speaker 2

01:57:49 - 01:57:54

I don't have the right permissions, I see the masked version of the data. I can't see all of it.
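A minimal sketch of that masking behavior, using the social security number example (the real feature is a masking rule defined on the column; this just models the effect on a reader with and without unmask permission):

```python
def mask_ssn(value: str, can_unmask: bool) -> str:
    """Dynamic data masking sketch: the data is stored as-is, but readers
    without unmask rights see everything except the last 4 characters."""
    if can_unmask:
        return value
    return "X" * (len(value) - 4) + value[-4:]

print(mask_ssn("123-45-6789", can_unmask=False))  # XXXXXXX6789
print(mask_ssn("123-45-6789", can_unmask=True))   # 123-45-6789
```

The key contrast with Always Encrypted, covered next, is that here the server still holds the plaintext; only the query result is masked.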

Speaker 1

01:57:55 - 01:58:11

And then we also have the idea of always encrypted. So always encrypted, well, this is using client-side encryption. So even if I'm the DBA, that data is just encrypted. There's nothing I can do. It's completely encrypted

Speaker 2

01:58:11 - 01:58:16

away. As the DBA admin, I can't do anything about that.

Speaker 1

01:58:17 - 01:58:32

There are other types of SQL solutions. So there's Azure SQL Edge. So Azure SQL Edge is really optimized for Internet of Things. Internet of Things is

Speaker 2

01:58:32 - 01:58:41

all about this idea that, hey, there's chips in everything. And they tend to generate these huge streams of information, maybe it's a sensor, for example.

Speaker 1

01:58:42 - 01:58:51

And so the whole point of Azure SQL Edge is it's very lightweight. And by lightweight, I'm really talking about it's less

Speaker 2

01:58:51 - 01:58:53

than a 500 megabyte memory footprint.

Speaker 1

01:58:54 - 01:59:04

So I can think about these constant streams of data. So this could be used, so there's some streaming engine locally to stream into this thing, then that could be used by some separate business logic process

Speaker 2

01:59:05 - 01:59:34

to actually do analysis, maybe machine learning solutions. There's 2 versions of this. There's a developer option, which is like 4 cores and 32 gigabytes of memory, and the regular production, which is 8 cores and 64 gigabytes of memory. It can run in both a connected mode, where it will pull it down from the marketplace, or a disconnected mode, I can grab a Docker image, but this is a Linux containerized version of SQL, and that's where we are. Finally,

Speaker 1

01:59:36 - 01:59:38

we have the whole idea about the semi-structured

Speaker 2

01:59:39 - 01:59:48

data. This is when we talk about typically documents, JSON, XML, whatever that might be.

Speaker 1

01:59:49 - 02:00:08

The typical solution, well, there's a very easy basic version. If I think about Azure Storage accounts, there is a fourth type. So there's blobs, queues, files, and then there's tables. So tables gives me this very simple

Speaker 2

02:00:09 - 02:00:17

key value type store. Table is fantastic for that. It's very, very basic though.

Speaker 1

02:00:17 - 02:00:27

But if I just have a very simple requirement, that may meet that. The bigger, rich solution is Cosmos DB.

Speaker 2

02:00:28 - 02:00:49

This was a born in the cloud database. It has things like multi-region. So I can say, hey, I want this available in multiple regions. It has different consistency models. So to support the multiple regions

Speaker 1

02:00:49 - 02:01:00

what I can actually do is I can say well what is the consistency model I need? Is it a strong consistency? Is it session consistency? All the way through to

Speaker 2

02:01:01 - 02:01:01

eventual.
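The difference between the strong and eventual ends of that spectrum can be shown with a toy 2-region model. This is only an illustration of the trade-off, not Cosmos DB's actual replication protocol (which also has bounded staleness, session, and consistent prefix in between):

```python
class Replicas:
    """Toy model of strong vs eventual consistency across 2 regions."""
    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.lag = []  # writes not yet replicated to the remote region

    def write(self, key, value, consistency="eventual"):
        self.primary[key] = value
        if consistency == "strong":
            self.secondary[key] = value  # commit waits for the replica
        else:
            self.lag.append((key, value))  # replicate asynchronously later

    def replicate(self):
        """Background replication catching the remote region up."""
        for key, value in self.lag:
            self.secondary[key] = value
        self.lag.clear()

db = Replicas()
db.write("cart", ["book"], consistency="eventual")
print(db.secondary.get("cart"))  # None: the remote region is briefly stale
db.replicate()
print(db.secondary.get("cart"))  # ['book']: eventually consistent
```

Eventual gives lower write latency and easier multi-region active-active, at the price of briefly stale remote reads, which, as the shopping cart example suggests, is often perfectly acceptable.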

Speaker 1

02:01:03 - 02:01:16

So I can pick the consistency model so I could have maybe an active-active solution I design and it will eventually get consistent, which is good enough. I just need it consistent at a particular location,

Speaker 2

02:01:17 - 02:01:19

a session, maybe it's a shopping cart or something. And as long

Speaker 1

02:01:19 - 02:01:35

as it eventually gets consistent in the other regions, that's good enough. It does have a table API, so it is compatible with regular table storage. There's prod and non-prod account types. There's provision throughput, where I get a certain amount of request units.

Speaker 2

02:01:37 - 02:02:00

And if I go past that, I'll get throttled down. Or there are now autoscale provisioning capabilities. I pay a little bit more, but now I get the request units up to some max I specify for what I need. A key point about this is it does support various types of APIs and types of storage. So an obvious 1 is the document, those JSON documents I talked about.

Speaker 2

02:02:00 - 02:02:11

So here, hey, from an API perspective, I might be using SQL, I might be using MongoDB, when I'm working with the document type data.

Speaker 1

02:02:12 - 02:02:13

Then I can use things like Cassandra

Speaker 2

02:02:15 - 02:02:38

for the columnar. So I'm actually storing the data in the columns, which is very efficient if that's how I want to interact with the data. It does have kind of the table type API, the etcd, the key values. It has Gremlin if it's graph. So the relationship between nodes, etc.

Speaker 2

02:02:39 - 02:02:41

There's a Cosmos DB data migration tool

Speaker 1

02:02:41 - 02:02:51

to migrate data into Cosmos from other types of solutions. So I have all these different options available to me.

Speaker 2

02:02:51 - 02:03:06

And of course I kind of focused on SQL DB, but there are Azure Database for Postgres, Azure Database for MySQL, Azure Database for MariaDB. There's a managed Cassandra offering now. So there's all these different types of service that are available in the environment.

Speaker 1

02:03:08 - 02:03:15

And then when I have all of these different types of solutions, well, often we're not making data out of thin air. There's

Speaker 2

02:03:17 - 02:03:28

data somewhere already. I have the idea that, well, I have data sources. So I have some source for my data,

Speaker 1

02:03:29 - 02:03:31

and I need to do something to it

Speaker 2

02:03:32 - 02:03:41

and then get it to some kind of sink, some end storage service. Could be Azure SQL Database, it could be a SQL warehouse, i.e. Synapse, could be Cosmos DB.

Speaker 1

02:03:41 - 02:03:47

So I need something to drive that process through. So what we have is Azure

Speaker 2

02:03:49 - 02:04:00

Data Factory. You can see how we need to start a new whiteboard. They're rolling this whiteboard back, thank goodness, because really

Speaker 1

02:04:00 - 02:04:07

the performance of this is terrible. So Azure Data Factory does a number of different things. It's an extract,

Speaker 2

02:04:09 - 02:04:10

transform, load solution.

Speaker 1

02:04:11 - 02:04:19

I get the data out of somewhere, change it in some way, and then load it into something else. Sometimes you'll actually see an ELT, extract it,

Speaker 2

02:04:19 - 02:04:37

load it somewhere and then transform it. It's a data integration solution. But the big deal here is it's an orchestrator. It's doing that orchestration. That is the key power of this.

Speaker 2

02:04:38 - 02:04:48

So I can have all of these different sources. There's like 90-plus built-in connectors. And obviously, there's a whole number of sinks it

Speaker 1

02:04:48 - 02:04:51

can talk to. But what I'm going to do

Speaker 2

02:04:53 - 02:05:18

is create a pipeline. And what my pipeline is going to do is use these integration runtimes. There's the Azure-hosted integration runtime, where it's obviously hosted in Azure, and I can integrate with Azure-based services. There's basic things I can do, some simple data transformations, natively in Data Factory itself.

Speaker 1

02:05:18 - 02:05:25

But then I can also, for this integration runtime, yes, there's Azure, but there's also self-hosted.

Speaker 2

02:05:27 - 02:05:32

So imagine I had the idea where I have some data source on-prem,

Speaker 1

02:05:35 - 02:05:39

and I need to feed that into Azure Data Factory. I don't want to open up

Speaker 2

02:05:39 - 02:06:17

a firewall port so Azure can talk to my on-prem. So I would have a self-hosted integration runtime on-premises that would feed the data in and could execute the actions given to me by the orchestrator that could then go and feed into the other things. It might hook into Azure Databricks, it might hook into Hadoop, transform the data, and then send it to whatever that target service actually is. So a pipeline is just a set of activities that perform a task. An activity to ingest the data, to transform the data, again, using Databricks, which is Apache Spark, HDInsight, Azure Functions, whatever I wanna do.

Speaker 2

02:06:18 - 02:06:29

Again, little basic transformations that I can do natively here. But this brings that complete story together. I want to get the data in from somewhere. Could be a

Speaker 1

02:06:29 - 02:06:35

CRM system, for example, or multiple systems. I may then actually load it directly. I might put it in

Speaker 2

02:06:35 - 02:06:38

a data lake straight away. And we'll talk about that in a second.

Speaker 1

02:06:38 - 02:06:41

Then I want to transform it into a structure,

Speaker 2

02:06:41 - 02:07:05

a format that is then useful for analysis. So I'll use those data bricks. I'll use that HDInsight to convert it, map reduce functions, to then store it into maybe a SQL database or something else that I can actually then do something with. So that's the whole point. If I'm saying, hey, I want to be able to get data from somewhere, put it somewhere else and do this transformation, probably gonna be Azure Data Factory.
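The extract-transform-load flow just described can be sketched end to end in a few lines. The in-memory "source" and "sink" here are hypothetical stand-ins for a connector (say, that CRM system) and a SQL table; in Data Factory the pipeline orchestrates these same 3 steps across real services:

```python
# Minimal extract-transform-load sketch. The source rows are raw and
# messy on purpose; the sink expects clean, typed records.
source = [
    {"name": " Ada ", "spend": "120.50"},
    {"name": "Grace", "spend": "99.00"},
]
sink = []

def extract(rows):
    """Extract: read raw records from the source system."""
    yield from rows

def transform(row):
    """Transform: clean strings and cast types into the sink's shape."""
    return {"name": row["name"].strip(), "spend": float(row["spend"])}

def load(row):
    """Load: write the cleaned record to the target store."""
    sink.append(row)

for raw in extract(source):   # the pipeline orchestrates these activities
    load(transform(raw))

print(sink[0])  # {'name': 'Ada', 'spend': 120.5}
```

The ELT variant mentioned above just reorders the loop: land the raw rows in the sink (or a data lake) first, then transform them in place with whatever compute sits next to the data.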

Speaker 2

02:07:05 - 02:07:11

That's gonna be what we do. Now I talked about the idea of Azure Data Lake.

Speaker 1

02:07:12 - 02:07:25

So the key point here is, in the old days when storage was really expensive and scarce, we would extract the data and we'd have to transform it straight away to get it to a much smaller amount of data. So we only had what

Speaker 2

02:07:25 - 02:07:30

we needed to do the analysis because storing it was really expensive. It's really not the case anymore.

Speaker 1

02:07:31 - 02:07:38

Now, we might wanna just store the data almost straight away in its native raw format.

Speaker 2

02:07:41 - 02:07:43

We have the whole idea of a data lake.

Speaker 1

02:07:45 - 02:07:51

And the benefit of the data lake is now, if I'm not 100% sure of what I might want to do with

Speaker 2

02:07:51 - 02:07:52

the data in the future,

Speaker 1

02:07:53 - 02:08:04

as soon as it reads this in, I can kind of go and store it in the data lake. Then I can transform it, whatever I want to do. If in the future I have a new set of analysis I

Speaker 2

02:08:04 - 02:08:07

want to perform, requirements I need, well I

Speaker 1

02:08:07 - 02:08:15

can go back to the data lake that has the raw format and transform it in a different way to get some different insights into that data.

Speaker 2

02:08:16 - 02:08:20

So this is ADLS, Azure Data Lake Storage Gen

Speaker 1

02:08:20 - 02:08:20

2,

Speaker 2

02:08:21 - 02:08:34

and it is absolutely built on top of Blob. What it adds is kind of that hierarchical namespace. So I have true folders. It adds things like POSIX style ACLs. I'm drawing star and star because this

Speaker 1

02:08:34 - 02:08:36

is just falling apart now.

Speaker 2

02:08:37 - 02:08:43

But it supports things like Hadoop HDFS in terms of the interactions. So that's a key point around that.

Speaker 1

02:08:44 - 02:08:47

I mentioned Databricks. So Azure Databricks is a managed

Speaker 2

02:08:48 - 02:09:06

offering of Databricks, which is built on Apache Spark. I can use SQL, Java, Python. It's really for big data processing and analytics. So if I need to perform some job like that, Hey, Databricks, which I could call from here or directly, is fantastic for that.

Speaker 1

02:09:06 - 02:09:16

There's also things like Azure Synapse Analytics. So Azure Synapse Analytics is a solution. And really what it comprises is a set of other solutions,

Speaker 2

02:09:16 - 02:09:35

but it brings them all together. It brings them together from a UI perspective. It uses things like Azure Data Factory for data integration. It has its own Apache Spark capabilities for data processing, the Synapse Spark pool. It has a Synapse SQL pool with dedicated and serverless offerings.

Speaker 2

02:09:36 - 02:09:46

It has Synapse Link so I can go and talk to Cosmos DB, and it has its own IDE. I'm trying to think what else.

Speaker 1

02:09:46 - 02:09:47

Just in terms of regular

Speaker 2

02:09:48 - 02:09:53

data ingestion, you might hear the idea of a hot path, a warm path, a cold path.

Speaker 1

02:09:58 - 02:10:12

I have different types of data coming into my system. If I think about a warm path, there's some data coming in that I need to get information about pretty quickly, so I

Speaker 2

02:10:12 - 02:10:14

can get some insight out of it.

Speaker 1

02:10:14 - 02:10:22

So that would be a warm path: as data is flowing through near real time, I wanna do something and store that data.

Speaker 2

02:10:23 - 02:10:29

Maybe I wanna store it in Azure SQL Database or Cosmos DB. Azure Stream Analytics would be good for that.

Speaker 1

02:10:29 - 02:10:43

I can have a cold path. So a cold path is historical data. I want to analyze past data. Now, I might also need to merge it with a warm path, which I can do. Azure Data Factory would be fantastic for that.

Speaker 2

02:10:43 - 02:10:55

Then it might be a hot path. A hot path is real-time analysis. It could be very latency sensitive. Maybe it's looking at sensors and detecting a failure of some kind. I need to do that instantly.

Speaker 2

02:10:55 - 02:11:30

So a hot path would be very useful for doing that. I mentioned Azure Stream Analytics. So this is fully billed based on the CPU, the memory I'm consuming. It's just designed for, hey, if I have that idea that I have some maybe sensor, could be IoT, and I'm streaming in that data, well, that Azure Stream Analytics, I'm just gonna write stream because this board is failing miserably,

Speaker 1

02:11:32 - 02:11:40

that Stream Analytics will take that data in and it will do the event processing for that data coming in so I

Speaker 2

02:11:40 - 02:11:59

can actually get insights out of it. So that's all about the Azure Stream Analytics. It could then send it to Blob, to SQL, to Cosmos DB, to Azure Synapse Analytics for Data Warehouse. It could send it to Power BI for data visualization. I could trigger an Azure function for some serverless compute.

Speaker 2

02:11:59 - 02:12:11

But it's all about, hey, I need to perform a thing on this constant stream of data coming in. So that's when we think about the data solutions part. These different types of data. Again, what are the requirements?
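A core job of a streaming engine like Stream Analytics is windowed aggregation over that constant stream. Here's a tiny illustrative sketch of a tumbling window in plain Python — the real product uses a SQL-like query language, not code like this:

```python
# Group (timestamp, value) events into fixed, non-overlapping windows
# and average each window -- the essence of a tumbling-window aggregate.

def tumbling_window_avg(events, window_seconds):
    windows = {}
    for ts, value in events:
        key = ts - (ts % window_seconds)  # start time of the window
        windows.setdefault(key, []).append(value)
    return {k: sum(v) / len(v) for k, v in sorted(windows.items())}

# Sensor readings as (seconds, temperature) -- made-up sample data.
readings = [(0, 20.0), (3, 22.0), (7, 30.0), (9, 28.0)]
print(tumbling_window_avg(readings, 5))  # {0: 21.0, 5: 29.0}
```

Each window's result could then be written to Blob, SQL, Cosmos DB, or pushed to Power BI, as described above.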

Speaker 1

02:12:12 - 02:12:16

And then if there's multiple solutions, what's the right way to solve it in the most kind

Speaker 2

02:12:16 - 02:12:36

of cost optimal way that meets those requirements. So that's the key point for there. The last kind of main module before we go to the well-architected framework, which brings a lot of these things together, is the design infrastructure solutions. In Azure, there is a huge range of compute solutions available.

Speaker 1

02:12:37 - 02:12:40

And the best way to really think about the 1 we choose is

Speaker 2

02:12:40 - 02:13:09

a lot about the responsibility. Is it my responsibility as the customer, or is it Azure's responsibility as the provider of the service? And we always draw this idea of layers. So I can think about, well, there's the network, there's storage, there's the servers themselves, the compute. There's some type of virtualizations.

Speaker 2

02:13:09 - 02:13:40

There's a hypervisor. And then we get the operating system running inside, typically whatever that construct is. We might say a virtual machine, but even things like AKS and app services, they're built on virtual machines. There's an operating system, there might be a runtime like .NET or J2EE, some middleware solution. Then the actual app and the data, the thing we really care about at the end of the day, that's what brings business value that differentiates us from someone else.

Speaker 2

02:13:42 - 02:13:59

If I think about those components for an on-premises solution. So on-premises, who's responsible? I mean, it's me. I'm responsible for every single component of that.

Speaker 1

02:13:59 - 02:14:06

I might have different teams in my company, but it's me. As I start to move

Speaker 2

02:14:06 - 02:14:13

to the cloud, we often start with the idea of infrastructure as a service, i.e., a VM in the cloud.

Speaker 1

02:14:13 - 02:14:15

And we have this kind of delineation

Speaker 2

02:14:16 - 02:14:33

that starts there for IaaS. So now, the provider of the service, they're responsible for the physical fabric. They're responsible for the hypervisor. What I get is a VM in the cloud.

Speaker 1

02:14:33 - 02:14:43

So I'm responsible for the operating system, the app, the run times, all of those things. I don't have to worry about the physical fabric anymore. So I get the most flexibility

Speaker 2

02:14:45 - 02:14:49

up here, but I have the most responsibility. Now,

Speaker 1

02:14:49 - 02:14:52

even when we talk about things like virtual machines, as we'll see, there

Speaker 2

02:14:52 - 02:15:11

are things to help me. There are extensions, there are agents that help me do parts of that job. But I am responsible for turning those right things on, for doing that. Then we move into platform as a service. Now, platform as a service, that line now moves all the way up to there.

Speaker 2

02:15:12 - 02:15:24

The only part I'm responsible for now is my app and my data. Now the cloud provider is responsible for all of those other things.

Speaker 1

02:15:25 - 02:15:28

There are still VMs there. It's not, it doesn't magically run on

Speaker 2

02:15:28 - 02:15:42

thin air most of the time, but I'm not responsible for that. Someone else is managing the operating system, patching it, its security; that is not my problem. I just focus on my app and my data that drives that business value.

Speaker 1

02:15:42 - 02:15:44

Then there is also software as a service.

Speaker 2

02:15:46 - 02:15:55

Software as a service, I don't do anything. I'm not installing SharePoint or Exchange; like, Office 365 would be a good example of that.

Speaker 1

02:15:56 - 02:16:19

So those are the responsibility shifts we basically see. Now, there are a whole bunch of different solutions, and we're going to kind of draw those into here. But when I'm thinking about which 1 should I use, like most of the times, we want to do as little work as possible. So if there is something that can kind of do the job for me, I want to use that. So in the architecture side, there's a decision tree.

Speaker 2

02:16:20 - 02:16:24

And it really boils down, we kind of look into this,

Speaker 1

02:16:24 - 02:16:28

as well, okay, am I starting, am I migrating, or

Speaker 2

02:16:28 - 02:16:34

am I building new? If I'm migrating, is it lift and shift? Can it be containerized? Well, I'm probably going

Speaker 1

02:16:34 - 02:16:35

to end up with a VM,

Speaker 2

02:16:36 - 02:16:40

or maybe I can put it in Azure App Service, if it's like a basic website, for example.

Speaker 1

02:16:40 - 02:16:56

If I'm building new, well, if I require full control, it's saying use a VM. If not, is it a high-performance compute workload? Then I could use Azure Batch. Is it a microservice? Is it event-driven, i.e.

Speaker 1

02:16:56 - 02:17:16

Serverless with short-lived processes, Azure Functions? Do I need a full-fledged orchestration for my container environment? No, I can use ACI. Yes, well then there are different options for this rich containerized environment. It could be AKS, it could be Azure Service Fabric.

Speaker 1

02:17:16 - 02:17:22

There's all these different options and it goes through how to think about actually picking those.
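As a rough sketch only — the real decision tree in the Azure Architecture Center has more branches and nuance — the flow just described might be encoded like this:

```python
# Simplified version of the compute decision tree described above.
# The questions and ordering are a rough paraphrase, not the official tree.

def pick_compute(migrating=False, lift_and_shift=False, full_control=False,
                 hpc=False, event_driven=False, containerized=False,
                 needs_orchestration=False):
    if migrating and lift_and_shift:
        return "Virtual Machine"
    if full_control:
        return "Virtual Machine"          # most flexibility, most responsibility
    if hpc:
        return "Azure Batch"              # large-scale parallel / HPC jobs
    if event_driven:
        return "Azure Functions"          # serverless, short-lived processes
    if containerized:
        # Full orchestration needed? AKS (or Service Fabric). Otherwise ACI.
        return "AKS" if needs_orchestration else "Azure Container Instances"
    return "Azure App Service"            # e.g. a basic HTTP-based website

print(pick_compute(event_driven=True))                             # Azure Functions
print(pick_compute(containerized=True))                            # Azure Container Instances
print(pick_compute(containerized=True, needs_orchestration=True))  # AKS
```

The ordering mirrors the principle stated later in the video: prefer the option that does the most work for you, and fall back toward a VM only when you need the control.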

Speaker 2

02:17:23 - 02:17:50

So we have different options of compute available to us. Now I'm going to start with the most basic 1. So if we start from kind of the beginning, that most fundamental level, we think about a virtual machine. So a virtual machine really is kind of that building block of what is infrastructure as a service. Now even with virtual machines, and we're going to talk about this in the next module, I have to think about it's a consumption-based service.

Speaker 2

02:17:50 - 02:17:54

I pay for the seconds it's provisioned. It's deployed to a particular host.

Speaker 1

02:17:55 - 02:18:20

But there are different sizes available. There's a huge number of different sizes available based on the shape of my virtual machine. And by shape, what we're really going to talk about is the idea that, well, there's memory, there's CPU, there's storage performance, there's network performance. And so then there's a whole bunch of different VM sizes based around those different ratios of kind of CPU to memory.

Speaker 2

02:18:20 - 02:18:34

So I could see, for example, with compute optimized, a typical ratio here is kind of that 1 to 2. For every 1 CPU, I get 2 gigs of memory. So that's compute optimized.

Speaker 1

02:18:35 - 02:18:43

If I look at general purpose, well, a general purpose 1, it's a bit more balanced. It's 1 to 4.

Speaker 2

02:18:45 - 02:18:52

I see that ratio. If it's memory optimized, well, as you would expect now, oh, didn't

Speaker 1

02:18:52 - 02:19:15

mean to click that. If I click memory optimized, well, now it's 1 to 8. Also along with that, we see things like amount of temporary storage, we might see network performance changes as well. So the bigger the VM, typically the bigger the other constructs and the other attributes of that VM

Speaker 2

02:19:15 - 02:19:18

will grow as well. So we have those different options.
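Those family ratios can be summarized in a quick lookup. The exact numbers vary by VM series and size, so treat these as the rough rules of thumb mentioned above, not a catalog:

```python
# Approximate vCPU-to-memory ratios for the VM families discussed above.
# Real series differ; these are the rough family-level ratios only.

RATIOS_GIB_PER_VCPU = {
    "compute_optimized": 2,   # ~1:2  -- e.g. 1 vCPU : 2 GiB
    "general_purpose": 4,     # ~1:4  -- more balanced
    "memory_optimized": 8,    # ~1:8  -- memory-heavy workloads
}

def expected_memory_gib(family, vcpus):
    return RATIOS_GIB_PER_VCPU[family] * vcpus

print(expected_memory_gib("compute_optimized", 4))  # 8
print(expected_memory_gib("general_purpose", 4))    # 16
print(expected_memory_gib("memory_optimized", 4))   # 32
```
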

Speaker 1

02:19:18 - 02:19:20

So we need to understand load, so we

Speaker 2

02:19:20 - 02:19:26

can pick the right type of virtual machine. And again, we're gonna go into detail when we think about optimization.

Speaker 1

02:19:27 - 02:19:42

So we have basically virtual machines. Then there's things like Batch. Batch is just a pool of compute resources built on virtual machines. For these large-scale parallel workloads for high-performance computing, I can create a job.

Speaker 2

02:19:43 - 02:19:45

So I'm thinking about Batch over here.

Speaker 1

02:19:47 - 02:20:06

And that job, it might actually use thousands of virtual machines that it's going to scale to. That could be Windows or Linux. And I just configure the app that I want to be used as part of that. I run the job, which consists of those various tasks. There are things like virtual machine scale sets.

Speaker 1

02:20:06 - 02:20:14

So with virtual machine scale set, it's built on virtual machines, but now what I'm saying is, hey, I've

Speaker 2

02:20:14 - 02:20:33

just got n number of this particular resource, this website or this processing, whatever, and it will go and create the virtual machines. It can auto scale based on maybe a schedule, or based on some trigger, like a metric threshold being crossed,

Speaker 1

02:20:33 - 02:20:53

and it will delete them. So it can create, delete, create, delete as my demand varies. So the key point of the cloud, it's consumption-based. I want to make that whole allocation match the demand of the actual app. So virtual machine scale sets will create and remove, delete the VMs, including the storage.
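The kind of metric-threshold rule a scale set evaluates can be sketched like this — illustrative only, since real autoscale rules are configured on the scale set in Azure, not written as code, and the thresholds here are made up:

```python
# Toy model of a threshold-based autoscale decision: scale out when busy,
# scale in when idle, always staying between a minimum and maximum count.

def desired_instances(current, cpu_percent, scale_out_at=75, scale_in_at=25,
                      minimum=2, maximum=10):
    if cpu_percent > scale_out_at:
        return min(current + 1, maximum)   # add a VM, respect the cap
    if cpu_percent < scale_in_at:
        return max(current - 1, minimum)   # remove a VM (and its storage)
    return current                         # within the band: no change

print(desired_instances(3, 90))  # 4 -- busy, scale out
print(desired_instances(3, 10))  # 2 -- idle, scale in
print(desired_instances(3, 50))  # 3 -- steady state
```
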

Speaker 1

02:20:53 - 02:21:09

So I'm optimizing that cost as well as I'm going along. Then other things like app services. So I might draw App Service kind of up here. App Services are great for HTTP-based workloads. This could be a web application.

Speaker 1

02:21:09 - 02:21:17

It could be a RESTful API. It could be a mobile backend. I can use a variety of languages and runtimes. I can use Windows. I can use Linux.

Speaker 1

02:21:18 - 02:21:21

There are different SKUs that have different features, different scale capabilities.

Speaker 2

02:21:22 - 02:21:23

Some of them have auto scale. Some of

Speaker 1

02:21:23 - 02:21:40

them have deployment slots. So I could have a production deployment slot, a pre-production where I can warm up the code. They're gonna share the same set of resources in the same app service plan, but it allows me to switch over, failover quite nicely. They can have high availability. I can auto deploy them via pipelines.

Speaker 2

02:21:41 - 02:21:59

They have their own built in authentication capabilities. I can hook into like an open ID connect solution. So all these different options available. So there's web applications for basic ASP.NET, Java, Node.js type apps. I have web jobs, which are just running some background program or script.

Speaker 2

02:21:59 - 02:22:10

We have mobile apps. This could be a back end for an iOS or an Android app. It has built-in capabilities to help with things like push notifications. So web apps are fantastic for that.

Speaker 1

02:22:10 - 02:22:15

We have things like Azure Container Instances. So hey I have a container workload.

Speaker 2

02:22:16 - 02:22:21

I don't need a full orchestrator. I don't need auto scaling.

Speaker 1

02:22:21 - 02:22:46

I have a couple of containers I want to push out super simply. Maybe I have a couple of them need to talk to each other so I can actually create this concept of a container group. That container group contains multiple Azure Container Instances. They can then talk to each other, they can share a set of Azure files for persistent storage. It's a per second billing, there's various SKUs, again, Windows or Linux, different sizes.

Speaker 1

02:22:47 - 02:22:48

But it's a very

Speaker 2

02:22:48 - 02:23:17

simple, hey, there's this image, deploy this, I need that. But then you get things like Azure Kubernetes Service. This is a full Kubernetes managed environment, the full rich orchestration, deployment, YAMLs, different types of networking and policy. It has the control plane fully managed for me. Then I have the nodes that actually run the pods, and a pod contains a particular container instance.

Speaker 2

02:23:18 - 02:23:40

So if I need a full, rich Kubernetes environment, AKS is a great option because I don't really have to worry about the Kubernetes. With AKS, I get autoscale. We get the idea of scaling the pods, so the instances of some microservice, via the Horizontal Pod Autoscaler, based on maybe some metric of the pod.

Speaker 1

02:23:41 - 02:23:45

And I can even scale out the actual nodes running the pods, this

Speaker 2

02:23:45 - 02:23:53

thing called the Cluster Autoscaler. So based on the scheduler's ability to actually put pods there, I mean, it's trying to scan out, but the nodes are full.

Speaker 1

02:23:53 - 02:23:54

The scheduler says, hey, I'm trying

Speaker 2

02:23:54 - 02:24:08

to schedule this pod and I can't. Hey, we should go and add another node that I can run pods on, which is built on virtual machine scale sets. So you see this kind of commonality. And then we get into serverless.

Speaker 1

02:24:09 - 02:24:34

All of these, I'm still really paying for some back-end workload. I'm still paying for some VM size that's running for a certain duration. And what I might want to do is I don't want that at all. I just actually want to pay for the CPU cycles I'm actually using when it's triggered. It's some triggered event-driven solution.

Speaker 1

02:24:35 - 02:24:37

So 1 of the solutions here is Azure Functions.

Speaker 2

02:24:40 - 02:24:49

Generally they're short-lived, but I can have state with Durable Functions, so it can trigger certain things, it could fan out, it could wait for some kind of interaction.

Speaker 1

02:24:50 - 02:25:03

I can actually run that in a pure consumption plan, so I just pay for the resources I'm using. Or I can actually run functions inside an app service plan and use its resources or have a dedicated set of resources. But this is going

Speaker 2

02:25:03 - 02:25:20

to be triggered, it's event-driven. There's a REST API, there's a blob has been created, event grid, which we'll talk about, can call Azure Functions when it sees something else. But I have a trigger and I can bind to other inputs and bind to outputs. There's a whole number of different bindings

Speaker 1

02:25:20 - 02:25:31

it can do. There's also logic apps. So logic apps are built more around the idea that I have some, this is code, I'm writing code for functions. This is

Speaker 2

02:25:31 - 02:25:58

a nice graphical designer that, hey, based on this happening, we'll drag a little box and create this. Now call this. Now call this. And this might be easier if we just kind of go and look at 1 of these. So If I quickly look at my Logic Apps, if I go to my Logic Apps Designer, we notice it's this little graph.

Speaker 2

02:25:58 - 02:26:20

I'm not writing code. I'm just dragging the things I want to do, but there are templates. And here, hey look, when a message is received on a Service Bus queue, when a tweet is posted. So there's all these different things I can do and it's showing me the different connectors. A huge number of connectors are available to me.

Speaker 1

02:26:20 - 02:26:22

So in this nice little graphical view

Speaker 2

02:26:22 - 02:26:35

of things, I can trigger these things to actually run. So there are other services, there are other things available to me, but those are some of the kind of key ones we typically focus on.

Speaker 1

02:26:36 - 02:26:46

Now when I think about my application architecture, I might have a mixture of these things. The point is we try and go as far upwards as we can.

Speaker 2

02:26:46 - 02:26:52

If there's a SaaS solution available, I'm going to use that. If I can use serverless, I'm going to use that. Again, meet the requirement in

Speaker 1

02:26:52 - 02:26:55

the most cost optimal way. Serverless is generally going

Speaker 2

02:26:55 - 02:27:00

to be the most cost optimal way. And I move kind of down as I need to.

Speaker 1

02:27:00 - 02:27:02

A VM is the most flexibility, but

Speaker 2

02:27:02 - 02:27:18

again, there's the most work and responsibility involved. Ideally, I wanna focus on this data and the app that provides business value. I don't want to be managing virtual machines, if I can get away with it. So we try and get as far up there as we actually can. So most

Speaker 1

02:27:18 - 02:27:19

of the time, our applications,

Speaker 2

02:27:21 - 02:27:37

modern architecture is about, we've moved from this big monolithic thing, this 1 block of code that had very tight couplings between different functions, to we have the whole idea of these more distributed, loosely coupled microservices,

Speaker 1

02:27:37 - 02:27:57

things that have a certain, very specific function. But obviously, they still have to be able to communicate and call each other. Now, to do that very loose coupling, there's different ways we can do that. The ultimate decoupling is maybe I have an event or a message

Speaker 2

02:27:58 - 02:28:10

kind of to trigger between them. So I have the idea of an event or I have the idea of a message. And absolutely, I could use a combination of these.

Speaker 1

02:28:11 - 02:28:14

So when I think about an event, an event is a lightweight

Speaker 2

02:28:18 - 02:28:32

Notification. Something is generating an event. It's maybe a notification of a state change of something. Something has happened, i.e. a blob file got created.

Speaker 2

02:28:32 - 02:28:50

It doesn't contain the blob file, it's just letting you know, hey, a blob file got created. And there's maybe no expectation that something's gonna happen with this. There could be multiple subscribers. The publisher has no expectation of what's gonna really happen next. That's different from a message.

Speaker 2

02:28:50 - 02:28:57

So with a message, it's the actual data. I'm containing actual data.

Speaker 1

02:28:58 - 02:29:28

And when I think about this, the publisher of that message has an expectation that the consumer of that message is gonna do something. There's some expected next step that's actually gonna happen. Now, if I think about solutions for events, we have Event Hub. So Event Hub is about large-scale real-time data ingestion. Remember the monitoring, we could use Event Hub.

Speaker 1

02:29:28 - 02:29:52

So I could publish the events to the Event Hub and then multiple things could subscribe to it. This could be data streamed in. It's a pull model, so something is publishing to the event hub, and then something else subscribes and pulls those events off. It doesn't delete it, so it could be read by something else. So this is the whole idea of, hey, I have this event hub.
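That pull model — each consumer tracks its own position, and reading never deletes anything — can be sketched in a few lines. This is a toy model, nothing like the real Event Hubs SDK (which deals with partitions, checkpoints, and consumer groups):

```python
# Toy append-only event log with per-consumer offsets, illustrating why
# one reader consuming events does not remove them for other readers.

class TinyHub:
    def __init__(self):
        self.log = []       # append-only event log
        self.offsets = {}   # consumer name -> next index to read

    def publish(self, event):
        self.log.append(event)

    def pull(self, consumer):
        pos = self.offsets.get(consumer, 0)
        events = self.log[pos:]            # everything since this reader's last pull
        self.offsets[consumer] = len(self.log)
        return events

hub = TinyHub()
hub.publish("e1")
hub.publish("e2")
print(hub.pull("analytics"))  # ['e1', 'e2']
print(hub.pull("archiver"))   # ['e1', 'e2'] -- same events, independent offset
print(hub.pull("analytics"))  # []           -- nothing new for this reader
```
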

Speaker 1

02:29:52 - 02:30:05

Then there's also event grid. Now, event grid is really focused on the idea that I have something generating events.

Speaker 2

02:30:07 - 02:30:08

Lots of things can generate events.

Speaker 1

02:30:09 - 02:30:12

And so I can have the idea of there's some

Speaker 2

02:30:14 - 02:30:16

event generated and then I

Speaker 1

02:30:16 - 02:30:22

want something to actually respond to it. So there's this idea of a source

Speaker 2

02:30:22 - 02:30:23

of the event,

Speaker 1

02:30:24 - 02:30:28

and then I want something to handle it. So I have the whole idea of handlers.

Speaker 2

02:30:29 - 02:30:32

This could be like Azure Functions, something that is event driven.

Speaker 1

02:30:32 - 02:30:36

Now in the past, a lot of the way these had to work is it was polling.

Speaker 2

02:30:36 - 02:30:40

Hey have you done something? Have you done something? Have you done something? I don't want to do that.

Speaker 1

02:30:40 - 02:31:01

So event grid sits in the middle. The whole point of event grid is it understands a whole bunch of different types of things that can generate events, then I can register the handlers with event grid. So event grid does the work of seeing the event and then calling the handler. Now, there's a huge number of

Speaker 2

02:31:01 - 02:31:14

different things that this can actually work with. So if we look at the documentation for a second, notice this idea of the event sources. Blob, resource group, subscriptions. So hey, something gets created. IoT, Hub, Maps.

Speaker 2

02:31:14 - 02:31:15

I mean, there's

Speaker 1

02:31:15 - 02:31:50

a whole bunch of these, and then it can have event handlers, typically serverless: Azure Functions, Logic Apps, a Service Bus, Event Hub. So realize, I said they could work together: Event Grid might then trigger pushing it to Event Hub, because then maybe something else is going to do something with that. It could call Azure Automation. So you have this really great capability now where, regardless of what is generating that event, almost anything that is event-driven can actually be on the other side of that.
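Event Grid's middle-man role can be sketched as a simple registry that pushes to whatever handlers subscribed, so nothing has to poll. Purely illustrative — this is not the Event Grid SDK, and the event type and handlers are made up:

```python
# Toy publish/subscribe dispatch: sources raise typed events, registered
# handlers get pushed each one -- the polling loop disappears.

handlers = {}  # event type -> list of handler callables

def subscribe(event_type, handler):
    handlers.setdefault(event_type, []).append(handler)

def raise_event(event_type, payload):
    results = []
    for handler in handlers.get(event_type, []):  # push to each subscriber
        results.append(handler(payload))
    return results

subscribe("BlobCreated", lambda p: f"function processed {p}")
subscribe("BlobCreated", lambda p: f"logic app processed {p}")

print(raise_event("BlobCreated", "cat.jpg"))
# ['function processed cat.jpg', 'logic app processed cat.jpg']
```
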

Speaker 2

02:31:51 - 02:31:55

From a messaging solution, remember the storage account?

Speaker 1

02:31:56 - 02:32:03

We had the idea of those Azure queues. That is a very, very simple basic solution.

Speaker 2

02:32:03 - 02:32:13

It's a simple first in, first out. That's really what that is gonna give me. A richer solution is the Azure Service Bus.

Speaker 1

02:32:15 - 02:32:21

Now the Azure Service Bus can actually run in different ways. It's an enterprise solution. It has basic queues.

Speaker 2

02:32:22 - 02:32:27

If I do just want that regular first in, first out, which is a one-to-one expectation.

Speaker 1

02:32:28 - 02:32:36

But it also has the idea of topics. So with topics, I can have subscribers to that topic

Speaker 2

02:32:36 - 02:33:04

and what it will actually do is it will create a copy of the message for each of the subscribers. They can then go read it and perform some action on that. So that's a nice capability we can actually hook in and do something with. There's also caching solutions. It might be that the backend data store for whatever I'm doing is not as quick as what we need, so we have to cache it.

Speaker 2

02:33:04 - 02:33:33

There are in-memory caches; Redis is a very popular in-memory cache. There's Azure Cache for Redis, a managed offering of that, that could cache things for things like Azure SQL Database, Cosmos DB. I could use it as a content cache for static content. I could use it as a data cache, session storage, basic queuing. There's a whole bunch of different things I can do with that in-memory cache, but it can act as that buffer for some other type of storage.
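The cache-aside pattern that sits behind most of those uses looks roughly like this. It's a toy sketch with plain dicts: in practice the cache would be something like Azure Cache for Redis and the slow store something like Azure SQL Database:

```python
# Cache-aside: check the fast cache first, fall back to the slow store,
# then populate the cache so the next read is a hit.

slow_store = {"user:1": "Alice"}   # stands in for the backend database
cache = {}                          # stands in for the in-memory cache
stats = {"store_hits": 0}

def get(key):
    if key in cache:
        return cache[key]            # fast path: served from cache
    stats["store_hits"] += 1
    value = slow_store[key]          # slow path: go to the backend
    cache[key] = value               # populate for next time
    return value

print(get("user:1"))          # 'Alice' (from the store)
print(get("user:1"))          # 'Alice' (from the cache)
print(stats["store_hits"])    # 1 -- the backend was hit only once
```

A real implementation would also set an expiry on cached entries so stale data eventually falls out.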

Speaker 1

02:33:35 - 02:33:37

Some of the things we might

Speaker 2

02:33:37 - 02:34:03

be offering up here is an API. Now when I think about an API, I may not want to just directly offer that out to whatever that end consumer is. So I might have multiple APIs being offered from my service. So in my environment, I might have a whole bunch of different things that provide an API. I have a whole bunch of different consumers that want to consume those APIs.

Speaker 1

02:34:05 - 02:34:08

Well, what I can put in the middle of these

Speaker 2

02:34:09 - 02:34:12

is actually the Azure API Management.

Speaker 1

02:34:14 - 02:34:20

So the Azure API Management provides a point that I can offer out,

Speaker 2

02:34:20 - 02:34:44

and then it will redirect to whatever is actually providing that API. It has the ability to hook into its own kind of authentication schemes. So a good example here is, maybe I wanna bring something that doesn't integrate with Azure AD. It has its own authentication or authorization like OAuth2. I can actually integrate this with like a third-party OAuth2 system.

Speaker 2

02:34:45 - 02:34:59

It's secure. It has its own kind of encryption, it's providing that gateway, that security for me. So it gives me a lot of those nice capabilities to actually provide those services outside to other people.

Speaker 1

02:35:00 - 02:35:08

So I have my compute, I have how they talk to each other. Well, yeah, talk to each other

Speaker 2

02:35:08 - 02:35:12

in terms of maybe a REST API or something else, but what about from a networking perspective?

Speaker 1

02:35:13 - 02:35:14

So the next big area we

Speaker 2

02:35:14 - 02:35:24

have is the network itself. And networking is, if you know me, you know networking is probably my favourite thing. So, network and identity, they're my favourite things. So I have

Speaker 1

02:35:24 - 02:35:26

the idea of a virtual network, remember?

Speaker 2

02:35:26 - 02:35:38

A virtual network is 1 or more IPv4 CIDR ranges, and optionally I can have IPv6 as well. So I can have multiple of those. CIDR range would be like the

Speaker 1

02:35:38 - 02:35:38

10.0.0.0

Speaker 2

02:35:40 - 02:36:13

slash 16 or something. So I have a certain CIDR range for this virtual network. A key rule is you never overlap those CIDR ranges. So be it if it's another virtual network that I want to be able to connect it to, be it a network on-premises that I might want to be able to connect it to, they have to use different CIDR ranges. So this would be a different range that does not overlap in any way with each other.

Speaker 2

02:36:13 - 02:36:33

Because if they overlap, I can't route. It breaks things. So if you ever see questions about picking subnets, make sure they're not overlapping. So if this is a certain IP range, or if this is a certain IP range and you have to pick an IP range for this, make sure it doesn't overlap. Remember we break our virtual networks down into subnets, and a subnet is a portion of that IP range.
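You can sanity-check the no-overlap rule with Python's standard ipaddress module — nothing Azure-specific here, just the address math:

```python
# Checking whether two CIDR ranges overlap before peering or connecting them.
from ipaddress import ip_network

hub_vnet = ip_network("10.0.0.0/16")
spoke_bad = ip_network("10.0.1.0/24")    # sits inside 10.0.0.0/16 -- conflict
spoke_good = ip_network("10.1.0.0/16")   # disjoint -- safe to connect

print(hub_vnet.overlaps(spoke_bad))   # True  -- this pairing would break routing
print(hub_vnet.overlaps(spoke_good))  # False -- these can be peered
```
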

Speaker 2

02:36:34 - 02:37:01

Any subnet we create, we always lose 5 IP addresses. Remember, it's the all-zeros network address and the all-ones broadcast, then 1 for the gateway and 2 for DNS purposes. You always lose 5, so think about the sizing of them. When I think about sizing, it's very common to do a slash 24 because it's just easy for our brains. So if you have a certain number of VMs, you have the resources you want to run in a subnet, well, how many is it?

Speaker 2

02:37:01 - 02:37:37

And then you can work out what is the subnet mask. If it's a slash 24, it's roughly 250 resources I can run in that thing. If I'm doing like gateways, like an express route or a site to site VPN, the minimum size is a slash 29, but they generally recommend a 27 because you may want to run express route and site to site. So then it has to be a slash 27. So from a gateway subnet, generally you want a slash 27, but the minimum could be a 29.
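The arithmetic is easy to check: a subnet with prefix length n has 2^(32-n) addresses, and 5 of them are reserved as described above:

```python
# Usable addresses in a subnet once the 5 reserved addresses are removed.
from ipaddress import ip_network

def usable_addresses(cidr):
    return ip_network(cidr).num_addresses - 5  # 5 reserved per subnet

print(usable_addresses("10.0.0.0/24"))  # 251 -- the "roughly 250" above
print(usable_addresses("10.0.0.0/27"))  # 27  -- recommended gateway subnet
print(usable_addresses("10.0.0.0/29"))  # 3   -- minimum gateway subnet
```
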

Speaker 2

02:37:37 - 02:38:10

So you're gonna think about those. In terms of that connections, we have options, remember? So 1 option could be a kind of site to site VPN, so it's going over the internet. Or I can have a private connection where that's ExpressRoute. So ExpressRoute private peering is a private connection from your network to the Azure backbone, and then private peering lets me then map it to a particular virtual network via a gateway.

Speaker 2

02:38:11 - 02:38:31

I could have ExpressRoute and site-to-site VPN, where the site-to-site VPN would act as the failover for the ExpressRoute. ExpressRoute is not encrypted because it's a private connection. If you needed it encrypted end to end, I could actually run the site-to-site VPN over the ExpressRoute private peering. So that is an option if I have to do that.

Speaker 1

02:38:33 - 02:38:40

It's very common if I have multiple VNets, what I can do is, well, I can peer them. So if I peer,

Speaker 2

02:38:41 - 02:38:49

that can be in the same region. I can do cross-region peering. So now they can directly talk, using again that Azure backbone capability.

Speaker 1

02:38:51 - 02:39:12

I could also have features like, hey, use remote gateway and allow gateway transit. So this VNet, basically be spokes, can use the connectivity of that main hub virtual network and be able to travel across that site-site VPN or that express route private peering. If I want to

Speaker 2

02:39:12 - 02:39:50

segregate, limit the communications, well, to limit things, we can do network security groups. So, a network security group, remember, is a series of rules, generally based around source IP/port, destination IP/port, protocol, and do I allow it or not? And then I can link the NSG to particular subnets where it would then enforce the flow of communication. So I can use an NSG. I could also use Azure Firewall or third-party network virtual appliances to control traffic.
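The evaluation model — rules checked in priority order, lowest number first, first match decides — can be sketched like this. Illustrative only: real NSG rules also match direction, protocol, service tags, and more, and the rules below are made up:

```python
# Toy NSG evaluation: sort rules by priority, first matching rule wins.
from ipaddress import ip_address, ip_network

rules = [  # (priority, source_cidr, dest_port, action); port 0 means "any"
    (100, "10.0.2.0/24", 1433, "Allow"),   # subnet 2 may reach SQL
    (200, "0.0.0.0/0", 1433, "Deny"),      # everyone else to SQL: denied
    (4096, "0.0.0.0/0", 0, "Deny"),        # catch-all default deny
]

def evaluate(source_ip, dest_port):
    for _prio, cidr, port, action in sorted(rules):
        if ip_address(source_ip) in ip_network(cidr) and port in (0, dest_port):
            return action
    return "Deny"

print(evaluate("10.0.2.5", 1433))  # 'Allow' -- matched the priority-100 rule
print(evaluate("10.0.3.5", 1433))  # 'Deny'  -- fell through to priority 200
```
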

Speaker 2

02:39:51 - 02:40:25

If I did something like an Azure Firewall, obviously I have to make sure traffic flows through it. The way I can make traffic change from the default routes is I can create user-defined routes. A user-defined route says, hey, when I'm going to this particular IP range, this is now my next hop instead of the default. So my next hop would actually be, for example, Azure Firewall. So I can say, hey, when you want to go somewhere, we're actually now going to hop via the Azure Firewall that could control that next flow.

Speaker 2

02:40:25 - 02:40:32

So user-defined routes, let me alter the flow of traffic to go via something else.
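As a rough illustration of that route selection, here's a sketch assuming hypothetical prefixes and next hops: the longest matching prefix wins, and a user-defined route takes precedence over a system route for the same prefix.

```python
import ipaddress

# Hypothetical effective route table: a UDR (origin "user") sends all traffic
# to an Azure Firewall at 10.0.0.4 instead of the system default of Internet.
PRECEDENCE = {"user": 2, "bgp": 1, "system": 0}
ROUTES = [
    {"prefix": "0.0.0.0/0",   "next_hop": "Internet",    "origin": "system"},
    {"prefix": "10.1.0.0/16", "next_hop": "VNetPeering", "origin": "system"},
    {"prefix": "0.0.0.0/0",   "next_hop": "10.0.0.4",    "origin": "user"},
]

def next_hop(dst_ip):
    addr = ipaddress.ip_address(dst_ip)
    candidates = [r for r in ROUTES if addr in ipaddress.ip_network(r["prefix"])]
    # Most specific prefix wins; for equal prefixes, user routes beat system routes.
    best = max(candidates, key=lambda r: (ipaddress.ip_network(r["prefix"]).prefixlen,
                                          PRECEDENCE[r["origin"]]))
    return best["next_hop"]

print(next_hop("8.8.8.8"))   # 10.0.0.4: the UDR overrides the default route
print(next_hop("10.1.2.3"))  # VNetPeering: the /16 is more specific than 0.0.0.0/0
```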

Speaker 1

02:40:33 - 02:40:36

If I want to control access to

Speaker 2

02:40:36 - 02:40:42

services, we talked before about the idea of service endpoints and private endpoints.

Speaker 1

02:40:43 - 02:40:45

If I had a storage account, or it could

Speaker 2

02:40:45 - 02:40:48

be a SQL database, just giving an example.

Speaker 1

02:40:49 - 02:41:03

If I want to control which resources can talk to it, remember, storage accounts, SQL, they have their own kind of firewalls, but a virtual network is typically an RFC 1918, i.e. a non-routable, private IP space.

Speaker 2

02:41:04 - 02:41:17

What I can do is I can actually turn on something called a service endpoint at a subnet level. And I could say I want to turn that on for storage. And I could even do a particular region.

Speaker 1

02:41:17 - 02:41:36

What that lets me now do is, on that storage instance's firewall, actually say, hey, let's say I have subnets 1, 2, 3 and 4, I'm gonna let subnet 2 through. So a service endpoint does 2 things. It gives me an optimal route

Speaker 2

02:41:37 - 02:41:52

to the service, and it lets me restrict it so only that subnet, or things in that subnet, are able to talk to it. But it's only things in the subnet. It does not apply to other things on the VNet or on-premises; it doesn't work that way.

Speaker 1

02:41:53 - 02:41:59

My other option to control access to services would actually be, hey, let's say I've got a

Speaker 2

02:41:59 - 02:42:02

different storage account, storage account 2,

Speaker 1

02:42:03 - 02:42:11

This time we'll create a private endpoint. So private endpoint is just an IP address from that subnet. And it points to

Speaker 2

02:42:11 - 02:42:28

a particular instance of a service. And it bypasses any firewall. So I could completely block that from its public IP now, and only let it access that private endpoint IP. This is an IP address from this IP range. Well, I could now get

Speaker 1

02:42:28 - 02:42:36

to it from peered VNets. I could get to it even from on-premises. It's just an IP address. There are some special DNS requirements

Speaker 2

02:42:37 - 02:42:37

because it has to

Speaker 1

02:42:37 - 02:42:49

be able to resolve the public name. In Azure, I can use Azure private DNS zones. If I was on-premises, I'd have to maybe create the private link zones for the particular service or have it forward to a DNS forwarder that can talk

Speaker 2

02:42:49 - 02:42:52

to Azure DNS to manage that for me.
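A simplified picture of that private-endpoint DNS flow, using made-up names and IPs: the public name CNAMEs to a privatelink name, a private DNS zone linked to the VNet answers with the private IP, and everyone else still gets the public answer.

```python
# Hypothetical records; real zone names follow privatelink.<service>.core.windows.net.
cname_records = {
    "mystorage.blob.core.windows.net": "mystorage.privatelink.blob.core.windows.net",
}
private_zone = {  # Azure private DNS zone linked to the VNet
    "mystorage.privatelink.blob.core.windows.net": "10.0.2.5",
}
public_records = {  # what public DNS returns for the privatelink name
    "mystorage.privatelink.blob.core.windows.net": "20.60.1.1",
}

def resolve(name, inside_vnet):
    name = cname_records.get(name, name)       # follow the CNAME first
    if inside_vnet and name in private_zone:   # private zone answers inside the VNet
        return private_zone[name]
    return public_records.get(name)

print(resolve("mystorage.blob.core.windows.net", inside_vnet=True))   # 10.0.2.5
print(resolve("mystorage.blob.core.windows.net", inside_vnet=False))  # 20.60.1.1
```

That split answer is exactly why on-premises clients need either the privatelink zones replicated or a forwarder that can reach Azure DNS.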

Speaker 1

02:42:53 - 02:43:21

If I have virtual machines and I want to be able to RDP or SSH to them, I don't want to just open that up to the Internet. There's a service called Azure Bastion. Azure Bastion provides a managed jump box so that, through the portal, I can connect to my instances in the VNet of the Bastion, or peered VNets, or even now on-premises ones as well. If I want to control access, remember that conditional access we talked about? Well, Azure Bastion is via the portal.

Speaker 1

02:43:22 - 02:43:24

The portal is controlled by that Microsoft Azure Management.

Speaker 2

02:43:24 - 02:43:47

So I could lock it down through there. And there's Azure Virtual WAN, which is basically a managed hub. I don't manage gateways or peering things anymore. It provides a black box managed solution so I don't have to care about those things anymore. And I guess the final part of all of this would really be migration.

Speaker 1

02:43:53 - 02:44:30

Now when I think about migration, the key point here is I have to understand what we have and, as a key element, what we want. What is our corporate direction? So I need to know where I'm coming from, I need to understand where I want to go, because then I can drive what should be my desired architecture, because what we have to what we want, well, that's our migration. So I have to have quality knowledge about what we have. From what we have, what are the dependencies?

Speaker 1

02:44:34 - 02:44:43

What is the usage? I.e. the peaks, the troughs, seasonality. I need to know all of those things. What are kind

Speaker 2

02:44:43 - 02:44:47

of the SLAs required? What are the HA and DR requirements?

Speaker 1

02:44:47 - 02:44:59

I need to understand all of those things so I can architect the right solution. Now there's a whole cloud adoption framework that has multiple steps. I have a whole video,

Speaker 2

02:44:59 - 02:45:06

again it's linked in the playlist for this AZ305 study about this. But when I think about migration,

Speaker 1

02:45:08 - 02:45:12

there's kind of 4 models to this. Now there's steps,

Speaker 2

02:45:12 - 02:45:23

what is my strategy, my planning, getting ready, migrating, innovating, and then obviously governing and managing the actual resources. But I

Speaker 1

02:45:23 - 02:45:27

think there's 4 key types of actual migration. There's rehost.

Speaker 2

02:45:30 - 02:45:41

Rehost is the simplest. You can think lift and shift. Rehost is, hey, it's running in this; I'm going to stick it in IaaS virtual machines. There might be some minimal modifications, but really not much.

Speaker 2

02:45:41 - 02:46:03

I'm taking what it is now, and I'm just running it in the cloud. Then we have refactor. Refactor, again, is I'm not changing the app code, but maybe I can move it from a SQL server running in IaaS VM or Postgres running in a VM to Azure SQL Database or Azure SQL MI, or maybe Azure Database for MySQL, or

Speaker 1

02:46:03 - 02:46:22

MariaDB, or PostgreSQL. So I'm moving up some. Maybe I can even move it into an App Service, or I can run it in a container. I'm not changing the app code, I'm just refactoring how I'm running the thing. Then there's re-architect. I am doing some code changes.

Speaker 1

02:46:23 - 02:46:26

Maybe I'm now modifying the app and moving to microservices so

Speaker 2

02:46:26 - 02:46:49

I can use more cloud-native types of service for this. And then we actually have rebuild. I'm starting from scratch, I'm looking at my requirements, I'm designing it for cloud-native solutions, managed databases, Azure Functions, all of those great things. Now, this is gonna give us the best, most cloud-native solution, but

Speaker 1

02:46:49 - 02:46:59

it's the most work. When I talk about those, what we have and what we want, remember what we want will also include things like time scales.

Speaker 2

02:47:02 - 02:47:22

It will include how much I'm willing to spend on the migration. And then there's a balance. Well, there's the cost to migrate and then the cost of running it. Maybe I can spend more money in the migration, then I'll save more money when it's running. So over years, it's actually cheaper to spend a bit more money and maybe refactor or re-architect.

Speaker 2

02:47:23 - 02:47:26

And just, yes, it's the cheapest option to just re-host.

Speaker 1

02:47:27 - 02:47:32

But overall, it's going to cost me more money. There are lots of different tools.

Speaker 2

02:47:32 - 02:48:08

So there's the Azure Migrate solution. This is the overall solution that can do assessment, it can help you with the planning, it can help you with the migration across a whole bunch of different workloads: VMs, databases, web apps, putting things in containers. I talked about the Data Migration Assistant for SQL. And there's the Azure Database Migration Service that can migrate to different types of database like SQL and Cosmos DB and Azure Database for MySQL, Azure Database for Postgres, both online and offline. There's the Cosmos DB Data Migration Tool.

Speaker 1

02:48:08 - 02:48:16

When I'm trying to understand that dependency between components, there's things like Service Map. Service map goes and looks at what

Speaker 2

02:48:16 - 02:48:35

are the network communication calls between different components and then can work out, well, based on this port, this protocol, ah, it's talking to a SQL database. Oh, it's using Active Directory. It can go and work out those dependencies for you. So as I plan my migration, I don't forget something. Oh, I forgot about that bit.

Speaker 2

02:48:35 - 02:48:51

Well, now that's traveling over an express route link with 30 milliseconds latency. I wonder why my application is now so poor. That's a key point in my architecture. I'm going to talk about this in a second with the well-architected framework. Latencies are huge when I talk about migrating things.

Speaker 2

02:48:51 - 02:49:05

I can't generally leave 1 part on-prem and move the other part to the Cloud. Because suddenly I'm moving from sub-millisecond latency between components to 30 milliseconds or 40 milliseconds, and it's a sad day for everyone.

Speaker 1

02:49:06 - 02:49:26

So those are some of the key points. We identify what's the right type of service we want based on meeting the requirements with ideally the least amount of responsibility for us as the customer. Different types of messaging services: events. Hey, an event is a lightweight notification but doesn't contain the data. Maybe it's a reference to it. As an event, a blob is created.

Speaker 1

02:49:26 - 02:49:27

Here's the blob name.

Speaker 2

02:49:27 - 02:49:50

Then I could go and grab it. Whereas a message is the actual data. We think about securing from a networking perspective and the types of interactions we can have, and obviously that migration as well. Now for this final part, we're going to talk about the Well-Architected Framework.

Speaker 2

02:49:51 - 02:50:06

This is not 1 of the components in the set of skills measured in the outline, but it is a huge part of the learning plan that's actually part of the AZ-305 page.

Speaker 1

02:50:06 - 02:50:51

Because what we've talked about so far are really different components of our architecture, what the technologies are and what the capabilities are. But from an architecture perspective, there are really some key pillars that I always have to think about that drive which components I might pick and how I put them together. And those really apply across the board. So the whole point of the Well-Architected Framework is it's not rocket science, it's just a way to think about what those considerations are that I'm bearing in mind as I architect my solution, that gets me the best overall architecture. So there's 5 pillars to this.

Speaker 1

02:50:52 - 02:50:53

Now the first pillar

Speaker 2

02:50:57 - 02:50:58

is all about, I'll actually

Speaker 1

02:50:58 - 02:51:34

do this in green, because this first pillar is all about cost optimization. Now when I think of cost optimization I want to ensure, well, Azure is consumption based. I want to make sure the resources I have are the right ones and the right number based on the load I have coming in. That's really my key consideration for this. So we can always think about, well, we have a certain amount of resource.

Speaker 1

02:51:37 - 02:51:44

I want to make it match whatever my demand is, whatever the load

Speaker 2

02:51:45 - 02:51:47

on the system is. I don't want a whole bunch of

Speaker 1

02:51:47 - 02:52:13

spare capacity available. I'm just wasting money. So a key tenet is I want to eliminate waste. Now remember that elimination of waste could be in terms of number of instances I have, the size, maybe the tier. So again, whenever I'm seeing these questions about which solution or which number or which size, think about what's the 1 that actually meets the requirement.

Speaker 1

02:52:14 - 02:52:20

Now, ordinarily, on premises, we think a lot about, well, we buy some asset

Speaker 2

02:52:21 - 02:52:27

and we depreciate it over a certain amount of time. So there's this whole idea of capital expenditure, CAPEX.

Speaker 1

02:52:29 - 02:52:51

In the cloud, well, it's operational expense. We're not buying some piece of equipment, we pay for the services as we use them. Now a huge benefit of this consumption nature of the cloud is I can respond to many different types of scenario. I can have these unexpected peaks. I can have this fast growth.

Speaker 1

02:52:52 - 02:53:23

I can have this on-off type pattern. There's a whole number of scenarios I can react to because I only pay for what I'm using. Now, I obviously need to make sure to make these things work to architect in the right way, I always need insight. Now that insight into my resource comes from many different places. We talked about monitoring, all that monitoring, those metrics, that gives us a key insight.

Speaker 1

02:53:23 - 02:53:42

It could be logs as well. Certain types of events say, hey, we're running out of something. But maybe I feed that into that Log Analytics workspace, and then I can run insights on top of that to get ideas about what I actually need. So when I think about insights to architect my right solution, remember those insights in terms of

Speaker 2

02:53:42 - 02:53:43

what are my business

Speaker 1

02:53:43 - 02:54:08

requirements, what are my technical requirements; I need to understand those things, and I need to understand, once it's up and running, well, how is it running? I need to always be able to answer those questions to help me architect the right solution and then make sure it's as efficient as it can possibly be.

Speaker 2

02:54:10 - 02:54:23

I talked about virtual machines and I showed you the page where there's like compute optimized and memory optimized and general purpose, there's special ones with great storage and GPUs. What it really boils down to though, if

Speaker 1

02:54:23 - 02:54:31

I'm trying to eliminate waste and make that resource match the actual demand, I do think about, well, there's a certain shape.

Speaker 2

02:54:31 - 02:54:33

Actually, I'm gonna try and stick to green.

Speaker 1

02:54:34 - 02:54:36

There's a shape, and I've got a

Speaker 2

02:54:36 - 02:54:37

whole video, again, that's in

Speaker 1

02:54:37 - 02:54:43

the playlist that talks about this. But there's a shape of my work, and I

Speaker 2

02:54:43 - 02:55:08

wanna make the shape of my resource match the shape of my work, in terms of dimensions of CPU, dimensions of memory, dimensions of storage IOPS and throughput, all of those dimensions. But obviously network could be 1 of these as well. Maybe I have special-purpose requirements like GPUs, like really high-performance networking adapters.

Speaker 1

02:55:09 - 02:55:48

But I want to make sure I pick the right SKU. So the right SKU equals the right shape. So it has the right ratios of CPU to memory to storage to match the load coming in. So if I monitored my workload, those insights, and I saw, well, the CPU is running at 80% but the memory is running at 20%, there's probably a better SKU that has a better ratio of CPU to memory that matches my actual need. So the right SKU is the right shape, and I wanna make sure I have the right size, because again, they kind of scale up linearly.
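As a sketch of that right-shape idea, here's a toy SKU picker. The SKU specs and utilization numbers are illustrative, not official Azure sizes:

```python
# Hypothetical SKU families with a CPU:memory shape.
skus = {
    "General (D-series)": {"vcpu": 4, "mem_gib": 16},
    "Compute opt (F)":    {"vcpu": 4, "mem_gib": 8},
    "Memory opt (E)":     {"vcpu": 4, "mem_gib": 32},
}

# Observed load: CPU at 80% of 4 vCPUs, memory at 20% of 16 GiB.
used_vcpu, used_mem = 4 * 0.80, 16 * 0.20   # 3.2 vCPU, 3.2 GiB

def shape_gap(spec):
    # Compare the workload's CPU:memory ratio with the SKU's; smaller gap = better fit.
    return abs((used_vcpu / used_mem) - (spec["vcpu"] / spec["mem_gib"]))

best = min(skus, key=lambda name: shape_gap(skus[name]))
print(best)  # a CPU-heavy workload fits the compute-optimized shape best
```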

Speaker 1

02:55:48 - 02:56:15

So once I've got the shape right, I want to make sure I have the right size of it. Bearing in mind, I probably want N instances. I generally don't want 1 really big instance. I want multiple instances so I can create and delete as that fluctuation may happen in the actual demand. So I want to be able to auto scale the number of instances I have to match what's actually happening.

Speaker 1

02:56:15 - 02:56:33

So that auto scale was a key point. When I think about optimization and eliminating waste, in addition to getting the right SKU, the right size, I wanna make sure I'm stopping, I'm deallocating, maybe even I'm deleting

Speaker 2

02:56:35 - 02:56:41

when it's not required. If I have these N instances, we'll make sure I'm actually deallocating them.

Speaker 1

02:56:41 - 02:57:05

So I'm not paying that compute charge anymore when I don't actually need it. That's a key point. Now, if I just deallocate a VM, remember that VM also has a disk hanging off of it. Unless I'm using ephemeral storage, which is where it's using that temporary or cache area of the host, that's a managed disk that's costing me money. Whereas if I actually delete them, which things like VM scale sets let me do, it deletes all of that.

Speaker 1

02:57:05 - 02:57:36

So I'm not even paying for that anymore. So this is where great features like virtual machine scale sets can help me do that. And remember what we saw in the list of compute is that AKS sits on top of VMSS to give me those capabilities. So that deallocate is great, but I'd still be paying for the disk, whereas with delete, the disk goes away as well. So right sizing is really important, but it's more than just the VM, remember.

Speaker 1

02:57:36 - 02:57:44

Remember as we talked about on the storage side. Okay the right SKU, the right size, maybe it's the right

Speaker 2

02:57:45 - 02:57:52

tier if I'm thinking about storage. Hot, cool, archive, premium.

Speaker 1

02:57:53 - 02:58:03

Make sure you're picking the right things. Make sure the users understand these things so they can pick the right service. Now, when I'm trying to think about actually

Speaker 2

02:58:04 - 02:58:06

what is something going to cost me,

Speaker 1

02:58:07 - 02:58:35

remember the whole point of this is I'm going to end up with some architecture. So I've got this insight. What this insight is going to lead me to ultimately is an architecture of my solution. So I'm going to come up with, hey, this is what it's going to look like in the cloud. Once I have the architecture and I understand the requirements, I understand the load, well, I can also then understand what it's going to cost me.

Speaker 1

02:58:36 - 02:58:38

So to work out those dollars,

Speaker 2

02:58:38 - 02:58:49

well we have the pricing calculator. The pricing calculator is going to let me say these are the resources I'm gonna use.

Speaker 1

02:58:50 - 02:58:53

And then based on that, it will show me the cost.

Speaker 2

02:58:53 - 02:58:58

So I can go and put in different types of resource. So I've got things like virtual machines in here.

Speaker 1

02:58:59 - 02:59:12

Remember, it doesn't have to be running 24 7. So when you're trying to work out the cost, don't just assume, oh, I've got 6 running at 730 hours. If you have scaling, then maybe some of them are running for 730 hours, some of them

Speaker 2

02:59:12 - 02:59:24

are running for 20 hours. You would build that in as part of that overall total solution. So that really is kind of a key point to that.
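To illustrate, a rough monthly cost with autoscale factored in, using a made-up hourly rate:

```python
# Hypothetical rate; 2 instances always on, plus 4 that only run ~20 peak hours.
rate_per_hour = 0.20
total_hours = 2 * 730 + 4 * 20         # 1540 instance-hours for the month

scaled_cost = round(total_hours * rate_per_hour, 2)
always_on_cost = round(6 * 730 * rate_per_hour, 2)  # if all 6 ran 24x7
print(scaled_cost)     # 308.0
print(always_on_cost)  # 876.0, so assuming 24x7 nearly triples the estimate
```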

Speaker 1

02:59:27 - 02:59:42

So that helps me work out what the price is going to be. Now how do I control that cost? So how do I control those dollars? Well, obviously we have things like Azure Policy. We talked about governance.

Speaker 1

02:59:43 - 02:59:44

That controls what

Speaker 2

02:59:45 - 03:00:00

I can actually create. Hey, I'm not gonna let you create premium storage accounts in development. Hey, you're not gonna be able to use an M-series or these big VMs in development. So that's kind of what and where. And then I

Speaker 1

03:00:00 - 03:00:20

can use things like budgets. Budgets let me again control how much. And those budgets I can apply at those same constructs we saw, the management group, the subscription, the resource groups. A budget can be based on how much you've spent so far. A budget can be based on the forecast.

Speaker 1

03:00:20 - 03:00:30

So based on where it's seeing you trending, hey, if it looks like the budget's gonna hit 110%, well, let's do something now to try and fix

Speaker 2

03:00:30 - 03:00:32

that so I don't end up there.
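A sketch of that forecast idea, with hypothetical numbers: project month-end spend from the current run rate and alert when the projection crosses 110% of the budget.

```python
# Hypothetical numbers: 400 spent by day 10 of a 30-day month, 1000 budget.
budget = 1000.0
spent_so_far = 400.0
day, days_in_month = 10, 30

forecast = spent_so_far / day * days_in_month  # simple linear projection
print(forecast)  # 1200.0
if forecast > budget * 1.10:
    print("alert: forecast exceeds 110% of budget")  # act now, before month end
```

Azure's actual forecasting is more sophisticated than a straight line, but the principle is the same: the alert fires on where you're trending, not only on what you've already spent.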

Speaker 1

03:00:32 - 03:00:42

So insight is a key component. I need to make sure I understand what is being used and to get that insight, there's things like cost analysis.

Speaker 2

03:00:46 - 03:01:01

Cost analysis lets me go in and look at different levels and work out where am I spending it. Or maybe I forgot to turn something off, or, look, Log Analytics is costing more than I realized. Maybe I'm keeping data longer than I actually need it. Remember though,

Speaker 1

03:01:02 - 03:01:22

there's also the total cost of ownership. You may just look at a component and say, oh, it's costing me X. But maybe you're saving a bunch of money somewhere else. Like if it's a managed database solution, like Azure SQL Database, You might say, oh, it's costing me more than a regular VM, maybe. But remember, what are you now responsible for?

Speaker 1

03:01:22 - 03:01:45

What is my total responsibility? So you have to consider what am I doing? Because it may cost a couple of dollars more, but maybe what's now happening is I have a lot less responsibility, so I'm saving a lot of money in other ways. Now, another way to be very optimal is, yes, we pick the right

Speaker 2

03:01:45 - 03:01:55

SKU, we pick the right shape, the right size, the right number of instances. And again, we always think ideally here about that VMSS, because that gives me things like auto scale.

Speaker 1

03:01:58 - 03:02:12

That's where I have some discrete resources dedicated to me. But another option I can try is things like serverless. If I can, I would like something event-driven. Hey, I

Speaker 2

03:02:12 - 03:02:28

get charged for the resource I use, be it a function, be it a logic app. That helps me be super, super efficient. And sometimes I can't use those. I just need to use virtual machines. But even here,

Speaker 1

03:02:29 - 03:02:39

there are things I can do to optimize my cost. So yes, the right SKU, the right size, the right number of instances. But what about if I have some workload where

Speaker 2

03:02:39 - 03:03:02

I need a bunch of resources but it's not time critical? Maybe it can survive being stopped because someone else needs it, but I'm willing to pay a lot less money. Well, we have things like spot VMs. And there's again a deep dive video on this. Spot VMs, I pay a lot less for the cost, but if someone comes along who's willing to just do regular on-demand capacity, they're gonna boot me off.

Speaker 2

03:03:02 - 03:03:16

But I might pay a tiny fraction of what that normal resource would cost. So this is a way to really save money. Another way is, if I think about this auto scale, across my entire environment, I

Speaker 1

03:03:16 - 03:03:30

might think about, well, my actual resource consumption kind of does this, whatever that might be. But there is this base floor

Speaker 2

03:03:32 - 03:03:37

that I've always got, always, that amount of resource running.

Speaker 1

03:03:37 - 03:03:38

I know I'm going to need that for

Speaker 2

03:03:38 - 03:03:43

the foreseeable future. So that's where we have things like reserved instances, Azure reservations.

Speaker 1

03:03:44 - 03:04:02

And we saw that on the pricing calculator. So 1 of the things that lets us do is, hey look, I can actually go and get a great big discount if I wanna go and purchase a reservation. There's certain flexibility in the exact types of resource, like there's families, so I

Speaker 2

03:04:02 - 03:04:04

can have different sizes within that particular SKU.

Speaker 1

03:04:05 - 03:04:25

But if I know for the next 3 years I need this family of resource, well, I can get a huge 61% discount. There's additional savings if I use things like Azure Hybrid Benefit. That's where I'm bringing my existing license to be used as part of the solution.
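A rough comparison of pay-as-you-go versus a 3-year reservation for that always-on floor, with an illustrative rate and discount (not official pricing):

```python
# Illustrative pricing, not real Azure rates: the always-on floor over 3 years.
hourly_payg = 0.20
hours_3yr = 3 * 365 * 24                # 26280 hours
discount = 0.61                         # the kind of discount a 3-year RI can offer

payg_cost = round(hourly_payg * hours_3yr, 2)
ri_cost = round(payg_cost * (1 - discount), 2)
print(payg_cost)  # 5256.0
print(ri_cost)    # 2049.84
```

Azure Hybrid Benefit would stack on top of this by removing the Windows or SQL license portion of the rate, which is why the two together can be so significant.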

Speaker 2

03:04:26 - 03:04:30

So that's something else I can do to really save money on there.

Speaker 1

03:04:30 - 03:04:57

So there's lots of things I can do to help optimise my costs. So from cost optimisation, if I know that floor, hey, an RI would save me a bunch of money on that. So that, all up, is cost optimisation; that's a huge component of what we do. So the next pillar is all about operational excellence.

Speaker 2

03:04:57 - 03:05:17

So now we'll draw another pillar. And for operational excellence we'll use blue, there we go. Operational excellence. I.e. I don't wanna be manually clicking a button or manually managing things where I really don't need to be.

Speaker 2

03:05:17 - 03:05:39

I want to try and optimise, I'll do a line, separate those sections out. So operational excellence. Modern practices, things like DevOps, they're all about enabling faster development, constantly delivering these small incremental units of business value. I have insight into the constant stream of what's going on.

Speaker 1

03:05:39 - 03:05:42

So when I think operational excellence, we can start

Speaker 2

03:05:42 - 03:06:04

with things like, well, DevOps. Now I have a whole masterclass on DevOps and I recommend you go through that before this. I wouldn't expect a huge number of questions on this. This is more about practices of what you're gonna do. Because when I think about DevOps, DevOps is all about this pipeline of that continuous integration.

Speaker 2

03:06:04 - 03:06:21

People bringing their code together from some Git-type version control system. And I'll draw this out in a second. It's about continually building it, finding errors early and often, and then maybe even continuously deploying it. Now, as part of that, if I

Speaker 1

03:06:21 - 03:06:34

want to automatically deploy things, I don't want to be clicking things in the portal. So, 1 of the huge things DevOps is this idea of infrastructure as code. I'm describing my

Speaker 2

03:06:34 - 03:06:36

infrastructure in a declarative way.

Speaker 1

03:06:39 - 03:06:47

Now, this differs from imperative. So, we also have imperative. Imperative is where I say how to do something.

Speaker 2

03:06:48 - 03:06:58

So imperative would be, hey, I'm using PowerShell, I'm using the Azure CLI. I'm saying, create this storage account, create it with these options.

Speaker 1

03:06:58 - 03:07:08

And they work. But the challenge is, if I've run a script and now I want to change the configuration, can I just change a value in the script

Speaker 2

03:07:08 - 03:07:22

so the storage account's replication goes from LRS to GRS? No, error. It says the resource exists already. I'd have to use a very, very different command to modify it. I can't detect the drift easily.

Speaker 2

03:07:22 - 03:07:32

Whereas declarative, I'm saying what I want the end state to be. I'm not telling it how to do it. I want a storage account that's LRS. I've changed my mind. I change it to now say I want

Speaker 1

03:07:32 - 03:07:44

a storage account that's GRS, and I just run it again. I want to validate it still matches that description, I just run it again. It doesn't matter. So we have things like the ARM, Azure Resource Manager JSON templates,

Speaker 2

03:07:45 - 03:08:24

as an example of a declarative solution. Azure Bicep, that's a lot more human-friendly. Third parties like Terraform, which have providers for different types of cloud and even on-premises, they are all infrastructure as code, they're declarative solutions. And the benefit of these things is I can go and store those in some kind of repo. So I put them in a repo, that gives me things like version control, and that could be GitHub, it could be Azure DevOps Repos, I can easily track those things.

Speaker 2

03:08:24 - 03:08:37

Now, because it's declarative, I can take whatever that maybe JSON or Bicep and I can just apply it to my subscription. It's idempotent, so

Speaker 1

03:08:37 - 03:08:52

I can run it as many times as I want. It's not going to damage anything, and it will create my resources. Whatever they are, I can detect drift. So this ensures consistency because it's this resource that's version controlled.

Speaker 2

03:08:52 - 03:09:07

I could deploy this to dev, then prod, and I know there's not gonna be any differences. I could have different parameter values because the names might be slightly different, but I know it's gonna be consistent within there. So that's kind of a really powerful option.
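That declarative, idempotent behavior can be sketched like this; the "cloud" here is just a dict standing in for what ARM, Bicep or Terraform actually manage:

```python
# The "cloud" is just a dict of resource name -> properties.
cloud = {}

def apply(desired):
    """Converge actual state to the desired state; safe to run any number of times."""
    for name, props in desired.items():
        if cloud.get(name) != props:
            cloud[name] = dict(props)   # create if missing, update if drifted

template = {"stor1": {"kind": "StorageV2", "replication": "LRS"}}
apply(template)
apply(template)                          # re-running is a no-op (idempotent)

template["stor1"]["replication"] = "GRS" # I changed my mind in the template
apply(template)                          # the engine converges the resource
print(cloud["stor1"]["replication"])     # GRS
```

Contrast that with the imperative script earlier: there, rerunning "create" fails because the resource exists; here, the same description can be applied to dev and prod repeatedly and the environments stay consistent.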

Speaker 1

03:09:08 - 03:09:19

Now, I might need images. Maybe I can't do everything declaratively, but then I can still have the ability to actually build an image. So I could still have things like custom images.

Speaker 2

03:09:21 - 03:09:31

There's things like Azure VM Image Builder. That VM Image Builder is actually using Packer behind the scenes. It'll take some configuration

Speaker 1

03:09:32 - 03:09:38

from me, take in a marketplace image, and spit out my own custom image that

Speaker 2

03:09:38 - 03:09:44

I store in the Azure compute gallery, used to be called the shared image gallery, but

Speaker 1

03:09:44 - 03:09:56

I can put apps in it now as well. And I could deploy those images, maybe reference those from some deployment. So that gives me a lot of powerful flexibility. And I can still then run various extensions.

Speaker 2

03:09:57 - 03:10:05

We talk about responsibilities. There's backup extensions and agents. There's configuration, there's run commands, there's custom script extensions.

Speaker 1

03:10:05 - 03:10:17

I can add all of those things to do other stuff as part of that actual deployment. So I get a lot of flexibility there. Now we talked about this DevOps,

Speaker 2

03:10:17 - 03:10:31

and obviously a big part of DevOps is that whole continuous integration. The idea that, hey, I have a bunch of developers working on their own copy of the repo. I want to constantly bring the code together to integrate it.

Speaker 1

03:10:31 - 03:10:43

I want to find if there's any kind of clash. And then, I want to constantly be building it, continuous delivery, so it's ready to deploy, and maybe even continuous deployment

Speaker 2

03:10:46 - 03:10:49

to actually push it out to something.

Speaker 1

03:10:49 - 03:11:07

So I have these different options that I can build on as part of my pipeline. Now with that, when I think about, hey, I'm constantly bringing these things. That could be GitHub Actions. It could be Azure DevOps pipelines. They have all those abilities.

Speaker 1

03:11:08 - 03:11:28

When I'm deploying that out, there's different ways to deploy. So if I'm continually integrating, I'm continually delivering, and then maybe I'm continuously deploying. I'm actually pushing this out. There's different ways I can continuously push things. I might have kind of blue-green.

Speaker 1

03:11:29 - 03:11:31

Blue-green is different environments,

Speaker 2

03:11:32 - 03:11:36

and I push out the new version to the other environment.

Speaker 1

03:11:36 - 03:11:53

I warm it up and get it ready and I switch them over. It might be I do things like canary. I make it available to a small population and then expand it out as they're okay. Rings is another variation on this. It could be A-B testing.

Speaker 1

03:11:54 - 03:12:10

A-B testing is where, maybe with feature flags, some populations get 1 version, others get a different version, and I can get insight back. Remember that these all fit together. I get the insight back. Well, how are people using these? Which do people like more?
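A sketch of how stable A/B bucketing typically works, hashing the user into a consistent bucket; the experiment name and percentages are hypothetical:

```python
import hashlib

def variant(user_id, experiment="checkout-v2", b_percent=10):
    """Stable assignment: the same user always lands in the same bucket."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = digest[0] * 100 // 256      # deterministic bucket in 0..99
    return "B" if bucket < b_percent else "A"

# A given user sees a consistent variant across sessions, so the
# telemetry I collect per variant is comparing stable populations.
print(variant("alice") == variant("alice"))  # True
```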

Speaker 1

03:12:10 - 03:12:52

So I make a decision on what is the best thing. And then the whole point of these kind of pipelines is I am testing all the way across. We have things like unit testing, testing some very quick isolated component, do I get the right result? There might be things like smoke testing, a little bit more exhaustive to verify maybe certain interoperability between the components, make sure it doesn't start smoking, kind of the idea of that. There's integration testing, making sure now we are actually getting the full extensive interactions and output from the components.

Speaker 2

03:12:54 - 03:12:58

There may be manual testing. That's obviously the most expensive type of testing

Speaker 1

03:12:58 - 03:13:09

to do, but now there's humans involved. But then as we get to these other points, well then there's stress testing. Can it handle the actual load I'm throwing at it? Can I do fault injection?

Speaker 2

03:13:11 - 03:13:14

Things like Azure Chaos Studio lets

Speaker 1

03:13:14 - 03:13:24

me actually simulate certain types of faults. How does it handle that? And then of course I'm doing security testing. Am I secure? Have I introduced some problem in the environment?

Speaker 1

03:13:24 - 03:13:36

And a key point is often we'll move between environments as we do this, especially the continuous deployment. It would maybe go to a dev, QA, production. I need to make sure they're consistent. They don't have to be

Speaker 2

03:13:36 - 03:13:44

the same in terms of size, but they should definitely be the same in terms of configuration and type. So that's why we want

Speaker 1

03:13:44 - 03:14:05

to use the same template between the environments to ensure I've got consistency. Otherwise, my tests really may not be that good if there's differences between them. I want to make sure I automate as much as possible. So I think about automation. Now, automation can be done in different ways.

Speaker 2

03:14:05 - 03:14:22

There are things like Logic Apps. Logic Apps are phenomenal for doing something maybe on a schedule. It replaces the old Azure Scheduler. So through a Logic App, I could run something at a certain time, automatically shut down VMs. You'll actually see for a lot of types of resource, you have a task option.

Speaker 2

03:14:22 - 03:14:28

There's a task for virtual machines to shut them down automatically; for storage accounts, tasks that move blobs between tiers.

Speaker 1

03:14:28 - 03:14:32

These are using logic apps behind the scenes.

Speaker 2

03:14:33 - 03:14:42

A serverless compute option that just does something as cheaply as possible when I need it to. So you might see these exposed actually as tasks. I might write

Speaker 1

03:14:42 - 03:14:47

a function. Again, something serverless. There's things like Azure Automation.

Speaker 2

03:14:49 - 03:14:55

I have runbooks where I can run PowerShell, I can run Python to do certain things.
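Whatever the vehicle (a Logic App task, a Function, or an Automation runbook), a scheduled VM shutdown ultimately encodes a decision like the following. The business hours here are purely illustrative assumptions:

```python
from datetime import time

BUSINESS_START = time(8, 0)   # assumed business hours, adjust as needed
BUSINESS_END = time(19, 0)

def should_deallocate(now: time) -> bool:
    # Outside business hours the VM can be deallocated so we stop
    # paying for compute.
    return not (BUSINESS_START <= now <= BUSINESS_END)

print(should_deallocate(time(22, 30)))  # True: shut it down overnight
```

The automation service's job is then just to run this decision on a schedule and call the platform to act on it.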

Speaker 1

03:14:57 - 03:15:14

But again, there's that key point that I need to have monitoring all of these different components. I need to understand what's happening at the storage, the compute, the network, that insight component to make sure I've got all of the right things in place

Speaker 2

03:15:14 - 03:15:23

to make sure I can react, so I can have those action groups, those alerts if there are problems. So yes, I wanna automate, but I'm making sure I've got that

Speaker 1

03:15:23 - 03:15:26

monitoring and automate

Speaker 2

03:15:31 - 03:15:47

responses if I see things of a certain type, to keep the environment healthy, keep it secure, et cetera. We should have drawn that over a bit. Okay, so now we think about another pillar. I might tidy this up a little bit later on.

Speaker 1

03:15:48 - 03:15:50

So my next pillar here,

Speaker 2

03:15:50 - 03:15:59

we'll go for orange, would be all about performance efficiency.

Speaker 1

03:16:06 - 03:16:25

I want to make sure that what I'm using is the most efficient way of doing it. Now there's a lot of commonality here, if you think about this, with cost optimisation; there's a lot of things that say, hey, if I'm cost optimising, that's also gonna optimise my efficiency. But now

Speaker 2

03:16:25 - 03:16:31

I'm thinking about the performance side of it. My pivot is a little bit different. I actually gotta draw upwards.

Speaker 1

03:16:31 - 03:16:49

So for my performance efficiency side, yes, there is commonality, but again, this really is about making sure that consumption is matching my load. The work coming in, the requirements, I

Speaker 2

03:16:49 - 03:16:53

want to make sure the resources I have really match that.

Speaker 1

03:16:53 - 03:16:57

And so the focus here is all about auto-scale.

Speaker 2

03:17:00 - 03:17:08

And specifically, I'm auto-scaling in and out. So it's horizontal auto scale. I'm focusing about adding

Speaker 1

03:17:09 - 03:17:27

and removing instances as the workload fluctuates on my system. So that's really the key point. I want to do this horizontal over, say, vertical. So yes, I can make things bigger

Speaker 2

03:17:30 - 03:17:36

But there are very few things where I can do that live, dynamically, while they're running. So

Speaker 1

03:17:37 - 03:17:51

I try to stay away from that. I try and work out the right size for a unit of work, and then auto-scale the number of instances of that work as I think about the load changes over time. So I

Speaker 2

03:17:51 - 03:17:52

want to be using that auto scale.
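Whether it's a VMSS autoscale rule or Kubernetes' horizontal pod autoscaler, the scale-out decision reduces to roughly the same math. This is a conceptual sketch with illustrative numbers, not any service's actual implementation:

```python
import math

def desired_instances(current: int, avg_load: float, target_load: float,
                      min_n: int = 2, max_n: int = 10) -> int:
    # Scale the instance count so the average load per instance
    # converges on the target, clamped to the allowed range.
    n = math.ceil(current * avg_load / target_load)
    return max(min_n, min(max_n, n))

print(desired_instances(4, avg_load=90, target_load=60))  # 6: scale out
print(desired_instances(4, avg_load=30, target_load=60))  # 2: scale in
```

The min/max clamp is why you always define instance bounds on an autoscale rule: it stops a metric spike from scaling you into a huge bill.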

Speaker 1

03:17:54 - 03:18:10

Now remember, our key base unit for this is virtual machine scale sets. Remember things like AKS sit on top of this and use that for the nodes to actually accomplish this. So use virtual machine scale sets as much as I can

Speaker 2

03:18:10 - 03:18:13

that will actually delete and create these as required.

Speaker 1

03:18:15 - 03:18:37

That includes the storage. So that means I'm also not wasting money on the disks themselves. From an AKS perspective, remember the whole point of AKS is the management plane is just done for me. What I end up with are the nodes where my pods run. Well, the pods can auto-scale.

Speaker 1

03:18:37 - 03:18:49

There's a horizontal pod auto-scaler based on the work of the pod. And then if my nodes are all full up, well, then there's also the cluster

Speaker 2

03:18:51 - 03:18:51

auto-scaling,

Speaker 1

03:18:55 - 03:19:15

which can actually go and add and remove nodes if the scheduler part of AKS can't schedule the pods because they're full. So that's where I can bring in that cluster auto-scaling. So that is another way to really make that efficiency work together. Now, we talked about Azure Container Instances.

Speaker 2

03:19:17 - 03:19:20

AKS can use this for burst scenarios.

Speaker 1

03:19:20 - 03:19:30

So there's a virtual kubelet that enables ACI to actually be used by AKS. And this really does follow through,

Speaker 2

03:19:30 - 03:19:42

like app service plans: I pick a certain SKU, which is a certain size, and again it can auto-scale between instances of that. So from an app services perspective,

Speaker 1

03:19:44 - 03:19:48

that idea of understanding the shape of my workload

Speaker 2

03:19:49 - 03:19:56

applies to VMSS, to AKS, to app services. It has those same constructs. Ideally, maybe I can use serverless.

Speaker 1

03:19:57 - 03:20:00

Serverless, when we think about efficiency: if

Speaker 2

03:20:00 - 03:20:13

I can be triggered by some event, if I can use a function, if I can use a logic app, that's gonna generally be my most efficient use as long as it hits the requirement of what I have.

Speaker 1

03:20:15 - 03:20:27

Remember, when we talk about PaaS solutions, don't forget there are the database solutions as well. So when I think about the sizing and the different options available, well, SQL, there are database

Speaker 2

03:20:28 - 03:20:44

considerations. Azure SQL Database, I can have things like vCores or DTUs, where a DTU is a blended measure of compute, memory, and IO. If I'm using Cosmos DB, I have these request units.

Speaker 1

03:20:46 - 03:20:48

Other database services, hey, I

Speaker 2

03:20:48 - 03:20:55

pick a SKU. Azure database for Postgres, whatever that might be, I pick a certain SKU size.

Speaker 1

03:20:55 - 03:21:12

So there's always this concept generally of a size. I have a size, a shape, and then I have certain numbers of instance of that. So make sure you consider all of the aspects around those types of things. Storage is exactly the same. Now we covered that in detail.

Speaker 1

03:21:12 - 03:21:30

Storage, managed disks: standard hard disk drive, standard SSD, premium SSD, ultra disk. So don't forget about that. So while yes we're focusing on this, don't forget about the idea of the storage. Is it the type? Is it the tier?

Speaker 1

03:21:31 - 03:21:42

All of those things come into play. Do they have bursting? We talked about the bursting ability. Maybe I can pick a smaller disk because it's a small window of burst. Well, I can use that.

Speaker 1

03:21:42 - 03:21:50

Redis Cache, if I need that kind of in-memory caching capability. I might have a combination of storage.

Speaker 2

03:21:51 - 03:21:57

I might use Blob, I might use Azure SQL Database, I might use Cosmos DB. They call this polyglot persistence,

Speaker 1

03:21:57 - 03:22:20

where I have this combination of different solutions. Don't forget about the network. So we talked about, yes, great, I have this whole idea of performance efficiency for my compute services. But remember, as we said, networking. One of the biggest things you need to consider is latency.

Speaker 1

03:22:22 - 03:22:41

Also, generally we pay for data egress. Am I unnecessarily sending data out, which is gonna cost me money? So I have to think about what is the latency between the different components I have. How can I optimize that? Maybe I use buffering.

Speaker 1

03:22:41 - 03:22:55

I use some messaging layer if there is a latency aspect to that. Within a region, so intra region, I can reduce latency by using things like a proximity placement group.

Speaker 2

03:22:55 - 03:23:10

That's going to put things as close together as possible. And then there's obviously the inter-region between regions. Well that's generally just the speed of light. So I want to be super careful of my architecture to make sure I'm not trying to do things synchronous across them. That's going to hurt me.

Speaker 1

03:23:10 - 03:23:39

If I'm going from on-premises to Azure, well remember, the site-to-site VPN is going over the internet, so the latency, who knows. Whereas if I use ExpressRoute, that's a private dedicated connection. It's still gonna take time, but hey, it's gonna be as efficient as it possibly could be. And again, in terms of really optimising my path, think about things like content delivery networks.

Speaker 1

03:23:39 - 03:23:50

If I'm offering something out to the internet, a content delivery network can cache content all around these points of presence around the world to make it more easily available. So hey, is

Speaker 2

03:23:50 - 03:24:00

there caching options? Azure Front Door, remember, can integrate with that. So CDN, Front Door, et cetera. Okay.

Speaker 1

03:24:02 - 03:24:09

So we've solved cost optimization, operational excellence, performance efficiency. The next big pillar,

Speaker 2

03:24:12 - 03:24:22

what should we use for this? We'll use gray, is reliability. And we talked in detail about this.

Speaker 1

03:24:23 - 03:24:42

So from a reliability perspective, I have to understand what is my requirement. So often we want to survive a failure at some level. Is it a node failing, a rack failing, a data center failing, a whole region failing? I have to understand what are my requirements. Often we'll hear about the idea of an SLA.

Speaker 1

03:24:43 - 03:25:10

And we have some number of nines. 99.99, 99.9. If we say 3 nines, well, that's about 10.1 minutes of downtime a week. If we say 4 nines, that's basically 1 minute per week. Now that includes unplanned downtime and it includes planned maintenance, so I have to consider that.
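Those downtime figures are easy to verify yourself: the allowed downtime is just the period multiplied by the fraction of time the SLA permits you to be down.

```python
WEEK_MINUTES = 7 * 24 * 60  # 10,080 minutes in a week

def max_downtime_minutes(sla_percent: float, period: int = WEEK_MINUTES) -> float:
    # The SLA permits (100 - sla_percent)% downtime over the period.
    return round(period * (1 - sla_percent / 100), 2)

print(max_downtime_minutes(99.9))   # 10.08 -- "three nines", ~10.1 min/week
print(max_downtime_minutes(99.99))  # 1.01  -- "four nines", ~1 min/week
```

Swap the period for a month or a year to see why each extra nine gets dramatically harder to deliver.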

Speaker 1

03:25:12 - 03:25:27

Within a region, from that reliability perspective, for this pillar, so I could say, well, in region, what did we have? We had the idea of availability

Speaker 2

03:25:30 - 03:25:43

sets. This whiteboard is about to die, you can tell. I really didn't do very much testing on this. Availability sets. So that's kind of the idea of a rack level fault domain or a node level protection.

Speaker 2

03:25:44 - 03:25:56

Then we have availability zones. That's an entire data-centre-level survivability.

Speaker 1

03:25:58 - 03:26:03

And then of course, across regions, that gives me resiliency from

Speaker 2

03:26:03 - 03:26:09

an entire regional level problem. Now obviously between regions, so region 2,

Speaker 1

03:26:11 - 03:26:41

then there's some kind of replication, or it could be some kind of data job that copies the data. It could be a backup, and that backup vault is replicated, but there's something taking the data from one to the other. So again, it could be a backup restore. It could be there's different layers. So remember, I can replicate at the fabric level, and we talked about that Azure Site Recovery, remember?

Speaker 1

03:26:42 - 03:27:04

It could be saying at the application level. So there's different levels of things we can do. What's going to drive this is what is my recovery point objective, what is my recovery time objective, how long do I have to start back up, how much

Speaker 2

03:27:06 - 03:27:08

can I lose?

Speaker 1

03:27:09 - 03:27:11

So recovery point objective might

Speaker 2

03:27:11 - 03:27:16

be 5 minutes. I can lose 5 minutes of data in some unplanned disaster.

Speaker 1

03:27:16 - 03:27:18

Recovery time objective might be

Speaker 2

03:27:18 - 03:27:20

you have to be up and running in an hour.

Speaker 1

03:27:21 - 03:27:24

Now if my recovery time objective was 3 days

Speaker 2

03:27:24 - 03:27:28

and my recovery point objective was 12 hours, I can probably, as

Speaker 1

03:27:28 - 03:27:55

long as I'm backing up twice a day, my plan could be restore from backup. If my recovery point objective is 5 minutes, my recovery time objective is an hour, I'm looking at some replication. As they get smaller and smaller, maybe I'm actually moving to an active-active type of configuration. Remember, active-active can be super difficult from an architecture perspective because where's the state? If the state is in a database, how

Speaker 2

03:27:55 - 03:28:19

do I handle that? Cosmos DB has great capabilities for that. From a database perspective, maybe I have a read replica and my application has to be smart enough to read from the replica, but do writes to the primary. So I have to be able to architect all around that to actually make that work. But for that reliability, I have to understand what I'm trying to solve.
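That backup-versus-replication reasoning can be stated as a simple check: worst case, you lose everything since the last backup, so restore-from-backup only satisfies an RPO if the backup interval is no larger. The numbers below mirror the examples from the transcript:

```python
def restore_from_backup_meets_rpo(backup_interval_h: float, rpo_h: float) -> bool:
    # Worst-case data loss equals the time since the last backup,
    # so the interval between backups must not exceed the RPO.
    return backup_interval_h <= rpo_h

print(restore_from_backup_meets_rpo(12, rpo_h=12))      # True: twice-daily backups, 12h RPO
print(restore_from_backup_meets_rpo(12, rpo_h=5 / 60))  # False: a 5-minute RPO needs replication
```

When the check fails, the design has to move up a tier: continuous replication, and eventually active-active.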

Speaker 1

03:28:20 - 03:28:21

So see what the question is saying.

Speaker 2

03:28:22 - 03:28:40

Hey, I wanna survive a rack level failure. Okay, I could probably use availability sets. Hey, I wanna survive a data centre failure. Ding, ding, it's gonna be availability zones or I'm using multiple regions. Remember the types of service, LRS, ZRS, GRS for storage accounts, databases let you have replicas.

Speaker 2

03:28:40 - 03:28:55

Nearly all compute services are regional. So if I'm trying to make a compute service available in multiple regions, I'm going to have different instances in those regions, an AKS in region 1 and an AKS in region 2. And then remember, I can balance between them

Speaker 1

03:28:55 - 03:29:00

with things like Azure Front Door, and we talked about Azure Traffic Manager, that would be that distribution

Speaker 2

03:29:01 - 03:29:02

for those solutions.

Speaker 1

03:29:04 - 03:29:12

So we've had cost optimization, operational excellence, performance efficiency, reliability. The last pillar,

Speaker 2

03:29:13 - 03:29:20

by no means least, is security. So, when I'm architecting my solution, I

Speaker 1

03:29:20 - 03:29:23

need to make sure I'm thinking of security.

Speaker 2

03:29:27 - 03:29:31

I'm going to try and write as little as possible now because this is not going to handle it at all.

Speaker 1

03:29:32 - 03:29:38

So when I think security, there might be regulatory standards. So is this something regulatory?

Speaker 2

03:29:43 - 03:29:48

This could be HIPAA, it could be credit cards like PCI DSS,

Speaker 1

03:29:49 - 03:30:19

It could be GDPR. But I have some kind of requirement that drives a certain configuration. There are just basic things built into Azure, what was Azure Security Center, now Microsoft Defender for Cloud, that has different standards I can apply. Some things I'm responsible for, some things Azure is responsible for, and it can show me those things. But it uses Azure Policy behind the scenes, and initiatives that have the various configurations, and I can go and see that status.

Speaker 1

03:30:20 - 03:30:42

There's a huge focus on zero trust. Zero trust is about never assuming trust. We constantly revalidate that trust at every single step we have. We're always validating it. So what that does is it helps protect us against that lateral movement.

Speaker 1

03:30:42 - 03:30:47

If something has got into your network, well,

Speaker 2

03:30:47 - 03:30:56

we don't just naturally assume it can do anything it wants. Even inside our networks, we constantly re-evaluate all of the different things that we actually might want to do.

Speaker 1

03:30:58 - 03:31:01

We have the huge focus on defence in

Speaker 2

03:31:04 - 03:31:17

depth. It looks terrible at this point, I apologise. And so defence in depth means there are different layers. So we think of it like an onion. And yes, this can make

Speaker 1

03:31:17 - 03:31:33

us cry as well. But we want to think about defending at every single layer of what we have. Now at the top, we have things like our data. So how is our data encrypted? Are we using Azure Key Vault?

Speaker 1

03:31:33 - 03:31:52

Are we bringing our own keys? Are we having encryption? Are we rotating them? Are we encrypting inside the virtual machine like Azure Disk Encryption? Making sure we have the right things within our data, we're encrypting it the right way.

Speaker 1

03:31:53 - 03:32:02

Within our application, are we introducing vulnerabilities? Now there are various Azure Defender solutions that can help us with this.

Speaker 2

03:32:02 - 03:32:25

There are things like the web application firewall that can look for common types of attacks, and we're gonna talk more about this in a second, that can help protect me. But for my app, am I doing good things? I'm not storing secrets in my app. I'm not making it vulnerable to code injection. I want security to be part of the entire process, that entire security development lifecycle.

Speaker 2

03:32:26 - 03:32:50

I want that part of it. For any compute services I have, again I want security in them, anti-malware. I'm not opening up RDP or SSH to the internet. I want maybe just-in-time access. I want to be using private connections via a site-to-site VPN, a point-to-site VPN, the private peering of ExpressRoute, or Azure Bastion, that managed jump box solution.

Speaker 1

03:32:51 - 03:33:06

But I want to control those things. I do think about the network. So I have the security in the network, again, public access, limit connectivity, don't put VMs directly on the internet.

Speaker 2

03:33:06 - 03:33:17

When we talk about things like load balancing on all of those different solutions, when I do auto-scale, wherever I put that, somewhere over here, there's

Speaker 1

03:33:17 - 03:33:27

things like a load balancer sitting in front of those virtual machine scale sets to give me that single entry point. That also now slightly abstracts away, I'm

Speaker 2

03:33:27 - 03:33:29

not putting my VM directly on the internet.

Speaker 1

03:33:29 - 03:34:03

And if I use a layer 7 solution like App Gateway, then I can use things like the web application firewall to give me additional protections for that network. So again, as many layers as I can, I want to add. I think about the perimeter. So things like distributed denial of service: there's a basic protection just built into Azure, and there's a standard tier I can leverage for more granular controls, something that's more machine-learning tuned to what my regular kind of traffic is.

Speaker 1

03:34:04 - 03:34:09

We have policy. So policy can apply to things like, well, what is the authorization?

Speaker 2

03:34:10 - 03:34:16

Do I do MFA? Am I using conditional access? Do I have identity protection? What are the types of resources? What are the agents required?

Speaker 2

03:34:16 - 03:34:18

What's the auditing I have going on?

Speaker 1

03:34:18 - 03:34:20

Then there's obviously the physical facility.

Speaker 2

03:34:20 - 03:34:25

Now in Azure, you're not responsible for that. That's part of Azure's job.

Speaker 1

03:34:26 - 03:34:43

But they go through certain audits to meet certain requirements you may have. So there's different levels of responsibility, but I think about all of those things for my all up solution. Now with that, identity is huge. Identity is kind

Speaker 2

03:34:43 - 03:34:54

of the security perimeter in the cloud. It's not really the network. The network's part of it, but really identity is a bigger part. So I wanna make sure I've got really rich identity controls. We talked about conditional access and MFA.

Speaker 2

03:34:55 - 03:35:08

Partners can be through things like Azure AD B2B, enable seamless sign-on. As much as possible, get rid of passwords altogether. That's becoming more and more of a reality today. For my customers, I can use Azure AD B2C as we saw.

Speaker 1

03:35:10 - 03:35:19

PIM for just in time access to resources and different permissions. But as much as possible, I want this single identity.

Speaker 2

03:35:19 - 03:35:26

If it's an Azure resource, use managed identities. If I have lots of resources that need the same permissions, use a user-assigned managed identity.

Speaker 1

03:35:27 - 03:35:44

So there's all these things that can bring this together. But the whole point of this, these pillars, these are not particular individual Azure solutions. These pillars are, as I'm architecting, keep these in mind because

Speaker 2

03:35:45 - 03:36:03

I have to build this into my architecture to make sure it's the best architecture for my customer. And during the exam, you're not gonna get tested, I don't think, on the well-architected framework. That's not the point. The point is, these considerations, when I'm looking at what are

Speaker 1

03:36:03 - 03:36:08

the possible answers, will help me formulate in my brain, oh okay, well which is

Speaker 2

03:36:08 - 03:36:20

the most efficient one? And what would give me the right reliability based on those requirements that have been given to me? How would I optimise my cost? Okay, what's the best way? Okay, I need to deploy these resources.

Speaker 2

03:36:20 - 03:36:40

What are options I could use for infrastructure as code? Well, okay, oh yeah, an ARM template, Terraform. What would be an imperative option? Oh, okay, well PowerShell or Azure CLI, a script essentially. So I'm understanding these are all up services, so I can architect the best possible solution.
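The declarative/imperative distinction can be illustrated abstractly: a declarative tool diffs desired state against current state and converges, which makes re-running it safe. This toy reconciler is illustrative only, not how ARM or Terraform are implemented internally:

```python
def reconcile(current: set, desired: set) -> dict:
    # Declarative model: compute what must change to reach the desired
    # state. Applying the plan twice is a no-op the second time
    # (idempotent), unlike an imperative script that re-runs every command.
    return {"create": desired - current, "delete": current - desired}

plan = reconcile(current={"vm1"}, desired={"vm1", "vm2"})
print(plan)  # {'create': {'vm2'}, 'delete': set()}
```

That idempotency is why the same template can be deployed to dev, QA, and production and keep them consistent.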

Speaker 2

03:36:42 - 03:36:45

So that's it, and this whiteboard won't take any more anyway.

Speaker 1

03:36:45 - 03:36:56

I mean, Azure is constantly changing, stay up to date. We wanna make sure teams are educated, architectures will evolve over time. We wanna automate as much as possible.

Speaker 2

03:36:57 - 03:37:04

For the exam, I already talked about what's in the exam. I would just relax. Worst case, if you don't pass the first time, at

Speaker 1

03:37:04 - 03:37:33

the end it will show you the different sections where you were strong and where you were slightly weaker. The areas where you were weaker, go back and refocus. Go through that breakdown of what's actually in the exam and look at each of the individual skills. So ideally you want to be able to look at that Word document, if I jump back to that for a second. So this page here, this skills measured, you should be able to go to each of these things

Speaker 2

03:37:34 - 03:37:39

and say okay yeah I know I can answer all these different things

Speaker 1

03:37:40 - 03:37:59

and again, in the exam it might be there's multiple answers, so then we apply, okay, well, what's the right one based on the requirements. But seriously, do not panic. It's an exam, you can retake it. If you don't pass, it's going to show you where you're weaker. You can go back and redouble your efforts.

Speaker 2

03:38:01 - 03:38:15

Don't stress out. I really hope this was useful. This was a ridiculous amount of work, so I really would appreciate a subscribe and a like. But really, just good luck, and hope to see

Speaker 1

03:38:15 - 03:38:30

you