Showing posts with label Site Reliability Engineering. Show all posts

Saturday, February 14, 2026

Poor user experience - or DevOps and definitely no SRE.

This continues my search for examples of software systems that have not moved beyond the SDLC of the 1980s and 90s, or that have started doing DevOps but haven't realised that we're now into SRE. And if you think SRE is DevOps, then there's a problem.

DevOps = Automation, implement the new features, make sure it's reliable and stable, make sure "we" can fix an issue when it occurs.


Right, you probably think that includes SRE because you're thinking about the client/stakeholder, but how do they (the client) know if the problem is them or you?


So many organisations don't provide enough information to a client when they have an issue, whether it is caused by an update or by an outage of some kind.


SRE requires us to understand the user, not just in how they use the application, but in making sure they don't have to be technical. It's not down to the user to uninstall and re-install the application and then have to log back in - to a non-technical user, that is the biggest "bugbear" you could ask of them. We're now in the 21st century, and if you're a large organisation with lots of clients, you should be able to think more about your customers' experience. Yes, SRE has some nice principles for determining whether the user is experiencing a degradation in performance, or whether developers should be focusing on reliability rather than new features. BUT we should still be asking: are we putting too much on the end user and assuming that they are comfortable re-installing, or want to keep putting in their password every week?
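One of those SRE principles, the error budget, can be sketched in a few lines of Python. This is only an illustration: the function names, the 99% SLO target and the 25% threshold are my own assumptions, not figures from any particular organisation.

```python
def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent.

    1.0 means nothing has been spent; a negative value means the
    budget is overspent. slo_target is a ratio such as 0.99.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.99")
    if total_requests <= 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


def should_prioritise_reliability(remaining: float,
                                  threshold: float = 0.25) -> bool:
    """Hypothetical policy: pause new features once less than
    `threshold` of the budget is left."""
    return remaining < threshold


# Illustrative numbers only: 1000 requests, 5 failures, 99% SLO.
remaining = error_budget_remaining(0.99, 1000, 5)
print(f"budget remaining: {remaining:.0%}")
print("focus on reliability:", should_prioritise_reliability(remaining))
```

Note that a number like this tells you when the service is letting users down, but it says nothing about how much work you are pushing onto them to recover, which is the point above.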


These are areas that we should be thinking about if we want to make our software more exceptional than the competition. If you can provide the experience to users that saves them logging in every 5 minutes, or having to re-install manually, then you're moving into a whole new area of customer satisfaction.


Today's poor practice:


Southern Water's application, which 1000s of sea and river swimmers rely on at this time of year because of the number of sewage discharges the company makes, even in non-extreme weather conditions, is out for the entire day - see the picture attached.


https://riversandseaswatch.southernwater.co.uk/release-history?BathingSite=FOLKESTONE&Status=Genuine



Yesterday Tennis TV performed an update without checking the client applications, causing the app to just quit with no warning messages - which in turn left users frustrated and panicking.


Don't slip back to the 1970s and 1980s method of customer support, hiding behind the Internet and ChatBots (in the 70s and 80s it was the phone) and not providing adequate information to the users. Today there are so many practices and tools available that, as users, we should not be seeing this type of service.


Step into the 21st century and start using DevOps environments to test your updates and upgrades, for both application and infrastructure, before deploying to LIVE. Ensure that your pre-Live environments have at least 1 of every resource so that there is NOTHING NEW when you deploy to LIVE.
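The "at least 1 of every resource" rule is easy to check automatically. Here is a minimal sketch, assuming you can export the resource types in each environment as a simple list; the inventories and type names below are made up for illustration.

```python
def missing_resource_types(live: list[str], pre_live: list[str]) -> set[str]:
    """Return the resource types that exist in LIVE but not in pre-Live."""
    return set(live) - set(pre_live)


# Hypothetical inventories: one entry per deployed resource.
live_inventory = ["load_balancer", "web_server", "web_server",
                  "database", "message_queue", "cdn"]
pre_live_inventory = ["load_balancer", "web_server", "database"]

gaps = missing_resource_types(live_inventory, pre_live_inventory)
if gaps:
    # Anything listed here will be seen for the first time in LIVE.
    print("Not covered in pre-Live:", ", ".join(sorted(gaps)))
else:
    print("Every LIVE resource type has at least 1 instance in pre-Live.")
```

Running a check like this as a gate in the deployment pipeline means a missing resource type blocks the release instead of surprising your users.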


TESTING is not a last-minute option; it starts before you code, to ensure you meet requirements, and that you don't write code to suit the tests (1980s and 90s).


Provide meaningful error messages to your users, don't fail SILENTLY!
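As a rough sketch of what that can look like in code: keep the technical detail in the logs for "us", and give the user a plain-language message that tells them whether the problem is them or you. The exception classes and message wording here are hypothetical, not taken from any real application.

```python
import logging

logger = logging.getLogger("app")


class UpdateRequiredError(Exception):
    """Hypothetical: the installed client is too old for the service."""


class ServiceUnavailableError(Exception):
    """Hypothetical: the outage is on the provider's side."""


# Plain-language messages that say whose problem it is and what to do next.
USER_MESSAGES = {
    UpdateRequiredError: "This version of the app is no longer supported. "
                         "Please update it from your app store.",
    ServiceUnavailableError: "The problem is at our end, not yours. "
                             "There is no need to re-install or log in again.",
}
FALLBACK_MESSAGE = ("Something went wrong at our end. "
                    "Please try again in a few minutes.")


def user_message_for(exc: Exception) -> str:
    """Log the technical detail for engineers, return text fit for users."""
    logger.error("Operation failed", exc_info=exc)
    for exc_type, message in USER_MESSAGES.items():
        if isinstance(exc, exc_type):
            return message
    return FALLBACK_MESSAGE


try:
    raise ServiceUnavailableError("upstream timeout")
except Exception as exc:
    print(user_message_for(exc))  # show this instead of quitting silently
```

Even the fallback message is better than an app that just quits: the user knows it isn't their fault and doesn't start uninstalling things.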