Ask the experts: 10 cloud migration challenges (and how to overcome them)
Don't roll the dice on your cloud migration. Our panel of cloud experts breaks down 10 migration challenges and the strategies they've used to overcome them.
Jun 08, 2023 • 20 Minute Read
Migrating to the cloud can be a challenge. But don't roll the dice on your migration. If you're starting or planning a cloud migration, there are countless common cloud migration challenges that can derail your efforts.
How can you avoid migration missteps on your way to realizing the benefits of cloud?
In our free, on-demand webinar Cloud Migration: Ask the Experts, we asked a panel of cloud experts with more than 80 years of combined IT experience to play along as we break down 10 cloud migration challenges and share the winning strategies they've used to overcome them.
The panel includes:
The quotes below have been lightly edited for clarity and brevity. Check out the webinar for the whole conversation.
Table of contents
- What cloud migration challenges are faced in planning?
- How can organizations without operational experience in running a safe cloud environment ensure they're protected?
- Can you migrate to the cloud and shift to a DevOps mindset simultaneously?
- How do you balance cost and speed when considering architecture?
- What if there’s a lack of understanding or documentation around how the app you’re migrating works?
- Are there nontechnical roles that also need a grasp on cloud to make a migration successful?
- Is it faster to hire cloud talent and teach them your environment or to upskill on-prem engineers to be cloud engineers?
- How do you sell cloud skeptics on cloud?
- How would you influence a tech architect who has a legacy mindset but is also accountable for writing the new hosting strategy?
- What’s the biggest cloud migration mistake you’ve ever seen?
What cloud migration challenges are faced in planning?
Kacy Clarke: One of the biggest challenges when planning a migration is how much you're going to change.
Are you going to move as quickly as possible and clean up the mess when you get on the other side? Do you want to refactor your database to run on RDS? How much automation do we want to put in place so we’ve got automated CI/CD pipelines? Are we going to move the application over to containers? There are so many different questions.
But it has to come back to the business case of why you’re moving.
Is it just that I want to get out of the data center? Or are there constraints on the application right now that you want to address? Start out with what is the scope of what you want to do.
Andy Warzon: Adding on to how much do you want to change, a classic challenge is: you have a cloud migration and you potentially have a DevOps transformation. And you want to do both at the same time. And there’s no easy answer. That’s a lot to change at once, and it gives you more potential for failure and a lot of cultural pushback in trying to change too much at once.
But if you don’t address that DevOps mindset, you’re not going to be able to take advantage of all the benefits of cloud as you migrate. And you might end up with an environment that is very static and looks very similar to your on-premises environment — and the business might not see many of those expected benefits. Figuring out what to tackle up front is big.
How can organizations without operational experience in running a safe cloud environment ensure they're protected?
Andy Warzon: You have to start with account layout and authentication and an authorization model — a permission model for the organization. That’s a key first step. Production, non-production accounts, potentially shared services accounts, maybe if there’s a microservices architecture there you might want to think about even individual accounts for individual sets of workloads. So laying out that account structure, and then the permission model on top of that.
Is there going to be single-sign-on using Active Directory or some other SSO service? And then, what are you giving developers access to? Who has access to production? All those really key questions should be laid out up front.
Then, a closely related thing to tackle upfront is the network layout. Especially if you’re migrating from on-premises and you have an existing network environment, presumably during some transition period there’s going to be connectivity between that on-prem and cloud environment. So what is the layout of IP spaces and subnets and LANs and connectivity back to on-premises, so you don’t have conflict and you have room for growth?
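To make the point about conflicting IP spaces concrete, here is a minimal Python sketch that checks a proposed address plan for overlaps using the standard library's `ipaddress` module. The network names and CIDR ranges are hypothetical and are not taken from the webinar; a real plan would come from your own on-prem and cloud inventories.

```python
import ipaddress
from itertools import combinations


def find_overlaps(plan):
    """Return pairs of network names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan for a hybrid transition period.
plan = {
    "onprem-datacenter": "10.0.0.0/16",
    "cloud-prod": "10.1.0.0/16",
    "cloud-nonprod": "10.2.0.0/16",
    "cloud-shared-services": "10.0.128.0/20",  # deliberately sits inside the on-prem range
}

conflicts = find_overlaps(plan)
print(conflicts)

# Room for growth: how many /20 subnets fit in the production range?
prod_subnets = list(ipaddress.ip_network(plan["cloud-prod"]).subnets(new_prefix=20))
print(len(prod_subnets))
```

Running a check like this before any connectivity is set up is one way to catch a conflict while it is still cheap to renumber.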
John Wright: When dealing with hybrid networking setups, for me, the biggest thing is getting security and networking in the room on day one and saying, "Let’s have a real conversation. This has to happen." Because it’s hybrid, they're going to want protections that they may not need long term when they’re in AWS.
Ask security about the rules for a lift and shift migration, or what are the guardrails? I’ve found that a lot of times those organizations can be blockers in migrations — not in a negative way, because they have a building and things to protect — but getting them in the room and giving them time and empowering them to make decisions about the migration is a big key to success.
Kacy Clarke: We usually take the client through an exercise of defining their minimum viable cloud, which is security framework, operations framework, automation framework, core account structure, and networking architecture.
And in the security framework, frankly, at least half of the work I do has been around security — identity and access management. What are the access patterns? Are these global clients? Are they coming in through the internet? Are they coming in through an app? You have to understand those access patterns and what threat vectors are associated with that.
John’s point about bringing in security at the beginning is absolutely essential. They’ll have to change controls and policies. And this is expense information. Depending on the level of PII or other data that’s in there, we’ll need to up our game. It could be data loss prevention, threat vulnerability, what are we doing to the end clients. It ends up being an examination of what is that minimum viable cloud out there from the security point of view.
Because that’s one thing we usually have to add more to in the cloud than in the data center. We’re used to the perimeter within the data center. But it’s not good enough to just protect the perimeter in the cloud. We have to protect the workload. And, frankly, we have to protect each tier from each other — so encrypting all data in flight, encrypting it at rest.
How do you manage and rotate the keys for that? What are you doing to the containers or the .NET platform out there to protect the end points? There’s a ton of work to do there. As well as on the operational side: incident management, change management, how are these processes going to work, how are we going to add automation to what we’re doing for availability or failover?
Can you migrate to the cloud and shift to a DevOps mindset simultaneously?
Forrest Brazeal: I notice how even most technical considerations are actually business considerations when you scratch a little deeper. Which I think is something a lot of us miss. But incident and change management takes us to CI/CD. How can an organization deploy quickly, safely, and reliably — whatever our deployment looks like upfront? Do we have the ability to transition to a DevOps mindframe in a short timeframe?
John Wright: To me, you can totally get a client into a DevOps mindset even with a lift and shift. It’s more a matter of do they have the resources to do that and do the lift and shift or replatform. Let’s assume we do. In parallel, we can be doing training on the side, even if it’s not how we’re moving the workloads. Because maybe the right way to move this app is to deploy it into AWS with a pipeline or whatever you want to use.
To me, if the client has the resources to support a fast migration and learning DevOps, then I see no reason you can’t do both. It comes down to time and budget, but I try to make sure we’re training people on cloud-first deployments on the side while we’re doing the main migration, that way it’s fun and exciting and we’re learning new things.
Kacy Clarke: The DevOps mindset is also how the developers and the operations folks are collaborating — infrastructure as code. To me, the big advantage of the cloud is not that it’s all virtualized and you pay as you go. One of the biggest advantages is that everything has an API, so now everything is software.
That DevOps mindset is how do you bring that automation mindset to everything you’re doing? So you can shut down things if you’re not using them, and autoscaling is part of the nature of what you do so you can match the workload deployment to the actual workload you’re seeing.
That collaboration so that dev is not just throwing stuff over the wall and ops is not letting stuff run at way too high a provisioning rate so that you’re managing cost, you’re managing effectiveness — that’s core to actually taking advantage of the innovation platform of the cloud.
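As an illustration of matching deployed capacity to the workload you are actually seeing, the sketch below applies a simple proportional scaling rule. It is a simplified assumption for discussion, not any cloud provider's exact autoscaling algorithm, and the function name, the CPU metric, and the limits are all hypothetical.

```python
import math


def desired_capacity(current_instances, observed_cpu_pct, target_cpu_pct,
                     min_instances=1, max_instances=10):
    """Scale the instance count in proportion to observed vs. target utilization."""
    if current_instances <= 0 or target_cpu_pct <= 0:
        raise ValueError("current_instances and target_cpu_pct must be positive")
    raw = math.ceil(current_instances * observed_cpu_pct / target_cpu_pct)
    # Clamp to the configured floor and ceiling so cost and availability stay bounded.
    return max(min_instances, min(max_instances, raw))


busy = desired_capacity(4, observed_cpu_pct=90, target_cpu_pct=60)   # scale out
quiet = desired_capacity(4, observed_cpu_pct=15, target_cpu_pct=60)  # scale in
capped = desired_capacity(8, observed_cpu_pct=95, target_cpu_pct=50)  # hits the ceiling
print(busy, quiet, capped)
```

The floor and ceiling are where dev and ops have to agree: the floor protects availability, and the ceiling protects the bill.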
How do you balance cost and speed when considering architecture?
Forrest Brazeal: Moving to architecture and strategy around cost and speed, let’s say you’re dealing with a workload with large SQL Servers — a traditional relational database that has a substantial amount of data in it — and there is some data gravity there with where it’s currently located. Given a short turnaround time, do you think it makes sense to replatform this?
Andy Warzon: My default approach with any apps that are core to the business is to try to look at replatforming opportunities upfront. Sometimes you have to skip that because you’re weeks away from a deadline or have hundreds of applications, and so be it. But if you have a big focus on the application, you want to look for opportunities to replatform — like RDS or there may be others — managed load balancers or different state storage services.
The most important thing is to investigate some of these replatforming opportunities and find the ones that are the best tradeoff of effort and cost. Finding a few replatforming wins is key for a core application in that it helps start building buy-in. The team starts to see the benefits of cloud when they see those wins.
John Wright: In most migrations I see that have a timeline associated with them, things like databases might get broken up into maybe 20 databases. Or there’s some type of decoupling or replatforming on the database layer.
But load balancers and things like that almost always get shifted to a more native solution. And even with apps and web servers that don’t have high availability, you can get quick wins by adding another one to another AZ. Let’s do an auto-scaling proof. Let’s build some AMIs with something like Packer and some of that DevOps work that you can even do on a lift and shift. So now you’ve created this highly available app.
But looking at databases, the technical stuff usually isn’t what makes the decision. It’s the timeline and what licensing agreements you have in place. The fact is, if the client only has licensing — and I can’t stress that word “licensing” enough — if they have a licensing agreement and they want to do BYOL, there’s your decision right there. That’s a lot of the driver. It’s rough because the technically right thing isn't done because of licensing agreements.
Kacy Clarke: The data migration service from AWS is certainly helpful, particularly if we want to replatform. But John’s point about licensing is important and walking the client through this is important. But if you’ve got say 300 TB — even if you break up these databases, you’ll have to do a bulk load and an incremental synchronization until you’re ready to actually do the switch over.
And you have to consider how that’s going to work and how it will affect your clients and how will we make sure as we go from dev to test to UAT to prod that we’re actually coordinating with them — we’re validating that the data is being synchronized appropriately to however many target databases we’re going to. And then testing the changeover in a UAT environment and then the final change over as we get ready to go. So there’s a lot of planning that needs to be done on the data migration side.
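To give a feel for why the bulk load and the incremental synchronization window need planning, here is a back-of-the-envelope Python sketch that estimates how long a bulk transfer might take. The link speeds and the 80% usable-bandwidth figure are assumptions chosen for illustration, and the estimate ignores database export and import overhead entirely.

```python
SECONDS_PER_DAY = 86_400


def bulk_load_days(size_tb, link_gbps, utilization=0.8):
    """Estimate days needed to move size_tb (decimal terabytes) over a network link."""
    if size_tb < 0 or link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("size must be >= 0, link speed > 0, utilization in (0, 1]")
    total_bits = size_tb * 1e12 * 8
    usable_bits_per_second = link_gbps * 1e9 * utilization
    return total_bits / usable_bits_per_second / SECONDS_PER_DAY


# The 300 TB figure comes from the discussion above; the link speeds are hypothetical.
for gbps in (1, 10):
    print(f"{gbps} Gbps: about {bulk_load_days(300, gbps):.1f} days")
```

Under these assumptions, 300 TB takes roughly 35 days at 1 Gbps and about 3.5 days at 10 Gbps. Incremental synchronization has to keep running for at least that long before a switchover is possible.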
What if there’s a lack of understanding or documentation around how the app you’re migrating works?
Forrest Brazeal: I want to bring in an unexpected challenge. What would you do if no one on the team knows how the app works and many critical features aren’t documented? Maybe it was built by people who are no longer with the company. Maybe there’s a six-year-old document and some pictures of whiteboards. What do you do there?
John Wright: Budget time for discovery tooling. If you don’t want to pay for it, negotiate. There are tools like RISC Networks, Cloudamize and things like that that are agent-based or not and connect to your existing servers. They collect data for 15-30 days, paint architecture diagrams showing traffic, they’ll size the servers for cloud, they’ll tell you about IOPS requirements, they’ll show you the executables running, and exactly which ones are reaching out and touching other things.
Without that, you have to pay a partner or your own people to do discovery manually for six months the old-fashioned way. You can spend less and run discovery tools for 30 days and get 90% of what you really need.
But you also have to accept you will not have all your features documented. That’s one of the reasons you're probably moving to cloud — to modernize your app and make it better. You’re probably going to learn more after you migrate. There’s no way to learn everything.
Andy Warzon: One of the key things you can get with discovery tooling is the dependency mapping, especially across the network. You may not understand what every piece of code does, but you have to understand how the data is moving across the nodes and the network.
I hate seeing when folks don’t tackle that and then they take that fortress mindset to their network design. They just open everything up because they don’t know what’s talking to what. You want to wrap your head around that part of the problem so you can lock down security groups and network access as much as possible. And just make sure that that part is understood.
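As a rough illustration of turning dependency mapping into locked-down network access, the Python sketch below groups observed traffic flows into per-destination allow rules. The tier names, ports, and flow records are hypothetical; real input would come from whatever your discovery tooling exports.

```python
from collections import defaultdict


def allow_rules(flows):
    """Group observed (source, destination, port) flows into least-privilege rules.

    Returns a dict mapping (destination, port) to the sorted list of sources
    that were actually seen talking to it.
    """
    rules = defaultdict(set)
    for source, destination, port in flows:
        rules[(destination, port)].add(source)
    return {key: sorted(sources) for key, sources in sorted(rules.items())}


# Hypothetical flows captured during a discovery window.
observed_flows = [
    ("web", "app", 8080),
    ("app", "db", 1433),
    ("batch", "db", 1433),
    ("web", "app", 8080),  # repeated flows collapse into one rule
]

rules = allow_rules(observed_flows)
for (destination, port), sources in rules.items():
    print(f"allow {', '.join(sources)} -> {destination}:{port}")
```

Anything not in the resulting rule set stays closed by default, which is the opposite of opening everything because nobody knows what talks to what.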
Another thing is, instead of trying to understand what every bit of code is doing, establish your performance requirements for the migration. What is the key outcome you’re looking for? That will guide what sort of discovery you need to do in the application as you go.
Kacy Clarke: Discovery tools are very important and useful, but I would go beyond that and call out code-scanning tools — going in and understanding what’s going on.
How many stored procedures are associated with this? That’s a real bog down area for some. Oftentimes, I’ll bring in application performance management tools, like SignalFx or AppDynamics or something else in there. Discover all the different dependencies out there and what is it going to mean to latency and connectivity once you move your application. It could be data synchronization that’s going on that you didn’t expect. It could be reports that are going across databases. Or DB Links. (God help us.)
It’s not just looking at the application. Every application is a part of an ecosystem. And being able to trace all the connectivity so you understand what is really part of the ecosystem, because you may need to move a number of different applications together to make this work. So, discovery tools are essential, but don’t stop there. There are more questions to be asked.
John Wright: And don’t forget that you’re still running on hardware. There’s a server with a disk somewhere. These discovery tools will look at things like IOPS. Maybe you have a workload today that is just too latency sensitive to be moved the way it sits. Look at the app side, but also the hardware. You also have to look at those things from these discovery tools, and a lot of on-prem monitoring tools don’t capture enough of that type of information with regard to what’s in cloud.
Are there nontechnical roles that also need a grasp on cloud to make a migration successful?
Andy Warzon: When talking about non-technical roles that need to speak cloud, I would start with the CFO and their team. Thinking about the financial side is really important. Often, you can see this be a blocker when you have a group that thinks in a mindset of capitalization and shifting from CapEx to OpEx. Work with that team to understand the cost of the migration and the benefits around total cost of ownership, where the savings are coming from, what does the elasticity look like in terms of cost, and how we’re going to think about that in terms of the business model.
They also need to understand the things to look out for. They need to be a partner in the business and maybe get more advanced in things like chargebacks or thinking about cost per customer.
Is it faster to hire cloud talent and teach them your environment or to upskill on-prem engineers to be cloud engineers?
Kacy Clarke: You’ve got to do both. One person will help move the needle, but eventually you’re going to have to bring everyone along with you. And get them to understand how designing and architecting for cloud is different — how are operations different, why do we do things like automated deployments and provisioning and self-healing and auto-scaling. It’s just a very different environment from the data center for developers and operations folks.
You need people who understand the applications who are coming up to speed on cloud and you need people coming in from the outside with skills who can understand your application. And you need to have them work together to figure out what you’re going to do and how much you’re going to refactor as you move to cloud.
Be very proactive about your education program. Build cloud institutes, reward people for getting certifications, and make it exciting.
Frankly, a cloud migration can scare a lot of your staff. We deal with fear all the time — people thinking they’re going to lose their jobs. It changes everything, including how we deal with licensing and agreements. So there are a lot of people, not just one cloud person, who are going to make a difference here.
John Wright: It’s also hard to find affordable cloud people. Let’s be honest. So it’s better to take your time and hire the right people, like a chief architect or principal architect for your organization.
But let’s say you have storage, compute and network back in your data center. If you just hire 10 people and say they’re the cloud team, those other people are going to quit. And now you’ll have no one to support your migration. You don’t want to scare anybody. To me, you hire a 1:10 ratio. If you have 10 storage, network, and compute people, you tell them, "You’re now the cloud team, you each have your speciality, learn the cloud." And away we go.
I think you should tell people in advance that you're going to hire some people to help with the migration, but we’re going to transition you to be cloud people with your own expertise. Because if you scare them off, the migration ends. And then it gets really expensive because you have to hire contractors and consultants to do everything.
How do you sell cloud skeptics on cloud?
Forrest Brazeal: Let’s throw another wrench into things here. What if you have a CISO who has raised concerns around safety in the cloud — maybe he’s heard some things about S3 breaches.
Kacy Clarke: You have to bring them on the journey. You have to talk to the CISO about best practices in the shared responsibility model. How do they control things? How do they encrypt and protect the data? How do they do replication and make it secure?
It’s a ton of work for them to learn about the cloud and not get nervous and understand web application firewalls and the frequency of key rotations. But you have to start early so that they can turn on more and more services in AWS and make them available to developers and take advantage of the innovation possible in the cloud — and not get scared and shoot things down from the very beginning.
Andy Warzon: Absolutely. I don't think we’ve ever had a deployment that didn’t deeply involve S3. I think the phrase “S3 breach” is often a key opportunity. Often people have the mindset where they imagine a zero-day, some kind of fundamental flaw. Some education can really open their eyes.
How would you influence a tech architect who has a legacy mindset but is also accountable for writing the new hosting strategy?
John Wright: If you’re a big Fortune 50 company moving to cloud, I’m going to tell you that you need a cloud-first chief architect. I’m not saying you have to fire your tech architect. But you almost always have to hire a principal architect that doesn’t come with all that baggage. They can help push that mindset over time.
If you take a technical architect who has been in the organization for 20 years, they’re often not the right person to drive that architecture going forward into the cloud by themselves. You need some additional guidance, it could be a partner or AWS. But I wouldn't just take them and tell them, “Go do this by yourself.” I would train them and get them some help.
Andy Warzon: Yes, you can’t leave the architect on their own. But if you pair them up with a partner or some other expertise, we tend to find that if you stay with them and work to empower them that they will come along. Those people will start to appreciate the opportunities and get excited about that. If you’re in that role and you can’t get excited about new technologies, then maybe it’s not the right field for you. I think given time, they’ll get there and become a huge advocate.
Kacy Clarke: Getting that enthusiasm about the opportunities in the cloud and the career opportunities about getting more informed around cloud and Big Data and AI and machine learning is huge. There’s a techie in there somewhere that you can get excited. You want to take advantage of all the knowledge they have about your company.
What’s the biggest cloud migration mistake you’ve ever seen?
Andy Warzon: Forgetting to really think through the cutover planning until it’s almost time to cut over. And realizing the challenges there with downtime or the extra timeline it adds.
John Wright: Not putting the energy into discovery and planning. You can’t have a migration unless you discover and plan correctly.
Kacy Clarke: Treating it like a data center. One of my clients turned on all of the services they needed for production as we got into dev and left them running. $1.2 million showed up on next month’s bill. Be cost aware. Don’t leave things running unless you need them.
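A quick arithmetic sketch shows how much always-on non-production environments can cost compared with running them only when needed. The hourly rate, instance count, and schedule below are hypothetical numbers chosen for illustration and have no connection to the client bill mentioned above.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)


def monthly_cost(hourly_rate, instance_count, hours_running):
    """Cost of running instance_count instances for hours_running hours each."""
    return hourly_rate * instance_count * hours_running


# Hypothetical dev environment: 40 instances at $0.50 per hour.
always_on = monthly_cost(0.50, 40, HOURS_PER_MONTH)
business_hours = monthly_cost(0.50, 40, 10 * 22)  # 10 hours a day, 22 working days
savings_pct = 100 * (1 - business_hours / always_on)

print(f"always on: ${always_on:,.0f}")
print(f"business hours only: ${business_hours:,.0f}")
print(f"savings: {savings_pct:.0f}%")
```

With these assumed numbers, shutting the environment down outside working hours cuts the bill by about 70%.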