Live Weather & Live Traffic Server Monitoring - Auto Reset For Outages or Degradation

I work for a company that develops a large, fairly complex SaaS application suite hosted in the Microsoft cloud, so I’ll add my $0.02. It is similar to what others here have mentioned, but possibly with some additional insight.

All of the tools necessary to do the kind of service level monitoring you refer to are indeed part of the MS Azure offering, but the big caveat is that some planning and coding is required to take advantage of them. How much or how little telemetry you get really depends on how much work went into setting this stuff up in the first place.

And, of course, there is the question of what gets done with the telemetry. Who is monitoring for alerts, and what is the expected response, communication, etc.? All this needs to be planned out by the developer. MS provides the tools and monitors the underlying infrastructure, but that’s it.

One might think that, given this is a Microsoft application, using the tools and handling host infrastructure issues was planned and baked into the project from day one, but I can say from my own experience that this isn’t necessarily a given. Without strict discipline it is easy for this stuff to be overlooked, or at least to take a back seat priority-wise. And it isn’t always obvious to the coder working on a given feature which online services could impact it and therefore need telemetry hooks added.
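To make "telemetry hooks" a bit more concrete, here is a minimal sketch of what one could look like. It is plain Python rather than any actual Azure or MSFS API: the `TELEMETRY` store, the `track_dependency` decorator and the `fetch_live_weather` function are all hypothetical names made up for illustration. The point is only that someone has to wrap each online dependency deliberately, otherwise nothing gets recorded.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store; a real system would ship these records
# to a monitoring backend instead of keeping them in a dict.
TELEMETRY = defaultdict(list)


def track_dependency(service_name):
    """Record the duration and outcome of every call to an online service."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Log the failure, then let the caller see the original error.
                TELEMETRY[service_name].append(
                    {"ok": False, "seconds": time.monotonic() - start})
                raise
            TELEMETRY[service_name].append(
                {"ok": True, "seconds": time.monotonic() - start})
            return result
        return wrapper
    return decorator


@track_dependency("live_weather")
def fetch_live_weather(fail=False):
    # Stand-in for a real network call; `fail` simulates an outage.
    if fail:
        raise ConnectionError("weather service unreachable")
    return {"wind_kts": 12}
```

Every feature that touches an online service would need a hook like this added by hand, which is exactly the kind of thing that slips when it is not a priority.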

I’m not trying to make any excuses for MS/Asobo here, just to illustrate some of the devils in the details. My guess is that they probably have folks who work on monitoring and telemetry, but it is probably a small team and considered low(ish) priority, and it is likely an evolving, iterative process. When you launch something this complex you try to identify and monitor risks up front, but you don’t always know what you don’t know, and you end up discovering new things that need monitoring on a regular basis, post release.

4 Likes

I totally agree with the post about this being a $100 piece of entertainment software, not a mission-critical application. Yes, it is a nuisance when something like live weather or traffic isn’t working, but it’s somewhat unreasonable to assume that MS should monitor this 24x7 and have folks around on the weekend to fix it immediately. And honestly, the amount of upgrades and feature adds in the last year is fantastic, given that they were all free.

Yes, this thing is so complex that it breaks often (like every month when they do a release), but usually those breaks are things we can live without for a short while or work around. Again, this is entertainment software, albeit software that all of us have poured time and money into well beyond the $100 we gave MS. I think that’s why folks are so passionate: they’ve spent thousands on PCs, controllers, add-ons, etc. I have probably dropped $5k in the last year alone on a new PC and peripherals, but I don’t feel that MS needs a team on call 24x7 to immediately fix anything that goes wrong.

I do think they need to be more responsive in communicating the status of bugs, errors where online stuff fails, help with the download/update mess, etc. More transparency on the status of their backlog, with more detail on the expected timing to address things, would really help. The current development update is not all that helpful in terms of detail, timing, complexity, or tradeoffs. It’s just a long list with generic explanations and no real sense of the complex project management involved in this sim, especially now that we have both Xbox and PC users who are all passionate and online every day submitting feedback. Fly On.

I won’t judge whether they should monitor 24/7 or not; I’m not in any position to do so. However, describing it as a $100 piece of entertainment is not representative, in my opinion:

  • It is a flagship technological demonstrator for MSFT to sell Azure and Bing services.
  • It also represents $200M to $400M in gross revenue from PC licenses alone.
  • It brings in a 30% cut on every DLC sold on the Marketplace.
  • It is also a flagship Xbox title to drive Xbox consoles and Xbox peripherals sales.
4 Likes

That’s fair, all great points. I think most in this community would agree that better communication from MS/Asobo is needed on bugs, network service issues and their complexity, and the tradeoffs that have to be made when fixing things or building new features. To your point, they have the money to invest in better communication about all of the work to maintain and enhance this sim, and the current development updates, Twitch streams, etc. are not commensurate with the large revenue stream or visibility of the sim. I’m totally with you on that point.

I was pushing back on the idea that they should have weekend staff monitoring, sending out alerts to users, and fixing things like live traffic if it breaks on a weekend. That seems like too much to ask, but it’s entirely subjective, I suppose. You could theoretically have a few junior staff on call monitoring the forums and watching their own dashboards on the performance of the network features. They could at least post in the forums to let users know they are aware of a problem: not the fix yet, but the general “hey, we see the problem and are on it.” That would not be expensive for them to build or hire for.
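As a rough illustration of how cheap the "we see the problem" part could be, here is a small sketch that turns recent success/failure results for a network feature into a status and a ready-to-post acknowledgement. The function names and the 20% / 80% thresholds are assumptions picked for the example, not anything MS/Asobo is known to use.

```python
def classify_service(results, degraded_at=0.2, outage_at=0.8):
    """Classify a list of recent call results (True = success) by failure rate."""
    if not results:
        return "unknown"
    failure_rate = results.count(False) / len(results)
    if failure_rate >= outage_at:
        return "outage"
    if failure_rate >= degraded_at:
        return "degraded"
    return "ok"


def status_message(service, state):
    """Build the short acknowledgement someone on call could post in the forums."""
    if state == "ok":
        return None  # nothing to announce
    return f"Hey, we see the problem with {service} (status: {state}) and are on it."
```

For example, `classify_service([True] * 7 + [False] * 3)` gives `"degraded"`, and passing that state to `status_message("live traffic", ...)` produces the kind of one-line post described above.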

1 Like

If you happen to follow Xbox support on Twitter, you will see that there are people in place who monitor, respond to, and fix game issues 24/7. You will see acknowledgement of problems, resolution time frames (when known), and alerts of fixes on many, many games that cost the same as MSFS. There is a working infrastructure in place. If you care to read some of the tweets about what some of you would call ‘trivial’ click here

I have seen maybe one mention of MSFS from Xbox support on Twitter. Either it is not an important title or the proper telemetry isn’t in place.

I would think that a game that depends so heavily on cloud services would have had redundancy and failovers in place before it was ever released, and that these would be continually tweaked.
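For anyone wondering what a failover amounts to in practice, the basic idea can be sketched in a few lines: try the primary source, and if it errors out, fall back to the next one. This is a generic illustration with made-up provider names, not a description of how MSFS actually fetches traffic data.

```python
def fetch_with_failover(providers):
    """Try each (name, fetch) pair in order; return (name, data) from the first success."""
    errors = {}
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:
            # Remember why this provider failed and move on to the next one.
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {sorted(errors)}")


def primary_traffic():
    # Hypothetical primary feed, simulated here as being down.
    raise ConnectionError("primary traffic feed is down")


def backup_traffic():
    # Hypothetical secondary feed that still answers.
    return ["flight A", "flight B"]
```

Calling `fetch_with_failover([("primary", primary_traffic), ("backup", backup_traffic)])` returns the backup data instead of leaving the feature switched off. Real redundancy involves much more (health checks, data freshness, cost), but this is the core of it.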

And for all you “It is just traffic. You can still fly” people: I play my game the way I want to play my game, and I have an expectation that ALL the features will work as advertised. You play your game with your expectations and don’t tell me what my expectations should be, because it will not change my mind.

Remember that this is not the first time that Traffic has been switched off.

5 Likes

The answer is: because they don’t have any monitoring for it.

As stated before, the underlying services are one of the biggest stability problems in MSFS, and this is a real problem.

If you read the EULA carefully, they do not guarantee the services, so you, as a customer, don’t have any right to return it or get your money back.

3 Likes