Simple FOSS versus Complex Enterprise Software

As is often the case, real IT operators in large organisations find themselves having to deal with "enterprise" software which has been imposed upon them. The decision to implement such software is usually determined by perceived business requirements (which is reasonable enough), but with little consideration of operations or of the flexibility required for new, or even assumed, needs.

For example, a certain large university with which I have some familiarity has decided to introduce an absolutely $awful_ticketing_system campus-wide. It serves as an excellent example because, every step of the way, there are terrible things about it which illustrate how IT management decisions cause unnecessary problems for IT operators.

The beginning of this story comes with what initially seemed to be a simple problem. Emails that were sent to the ticketing system were being automatically assigned as an Incident, which had a tight Service Level Agreement period, as one would expect. The tickets that were being received, however, were not Incidents, but rather feature Requests. Changing this would be simple and easy (more on this in a moment), right?

Apparently not. I feel for the poor person who had to respond to me with this explanation.

Technically, an incoming mail to a mailbox can be converted into any type of ServiceNow ticket.

Well, this is good news.

But, out of all ticket types, this conversion is the most complicated for "request" type of tickets due to its three layered structure (REQ, RITM, CTASK). It has to have all three layers created for each request type of ticket. There is no out of the box or simple way of converting incoming mails to request tickets. Hence, it has not been implemented so far.

Wait, what? In well-known FOSS ticketing systems (e.g., OTRS, Request Tracker, Trac), the operator can set the queue or equivalent. Here, with this proprietary enterprise software, not even the administrators can do it in a simple manner. The solution being developed? Get rid of email tickets and use a web portal instead.

... the option of mailbox that creates tickets in Servicenow, was made available for use cases where users (such a externals, guests, students etc.) could not access forms and hence could not request relevant services. But the idea was always to use $awful_ticketing_system Portal as the access point to report any issues or request services, thereby avoiding the need to use mailboxes.

I'll leave it to others to think about how a web portal won't be associated with a mailbox or how this will allow operators to assign the ticket type. But don't spend too much time on it. Because this specific software and this specific example isn't the real issue at hand.
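For comparison, the kind of mail-to-ticket routing that the FOSS systems mentioned above make straightforward can be sketched in a few lines of Python. This is only a minimal illustration, not the API of OTRS, Request Tracker, Trac, or ServiceNow; the addresses, the queue names, and the functions ticket_from_email and retype are all hypothetical.

```python
from email import message_from_string
from email.message import Message
from email.utils import parseaddr

# Hypothetical ticket types and routing table (assumptions for illustration):
# the recipient address decides the queue and the initial ticket type.
TICKET_TYPES = {"Incident", "Request"}
ROUTES = {
    "help@example.edu": ("helpdesk", "Incident"),
    "hpc-requests@example.edu": ("hpc", "Request"),
}
DEFAULT_ROUTE = ("helpdesk", "Incident")


def ticket_from_email(msg: Message) -> dict:
    """Create a ticket record from a parsed email, routed by recipient."""
    _, recipient = parseaddr(msg.get("To", ""))
    _, sender = parseaddr(msg.get("From", ""))
    queue, ticket_type = ROUTES.get(recipient.lower(), DEFAULT_ROUTE)
    return {
        "queue": queue,
        "type": ticket_type,
        "requestor": sender,
        "subject": msg.get("Subject", ""),
    }


def retype(ticket: dict, new_type: str) -> dict:
    """Let an operator change the ticket type after the fact."""
    if new_type not in TICKET_TYPES:
        raise ValueError(f"unknown ticket type: {new_type}")
    ticket["type"] = new_type
    return ticket


raw = (
    "From: Alice <alice@example.edu>\n"
    "To: hpc-requests@example.edu\n"
    "Subject: Please install a module\n"
    "\n"
    "Thanks."
)
ticket = ticket_from_email(message_from_string(raw))
```

The design choice is that the ticket type is just a field on the ticket, set by a routing rule and changeable by an operator, rather than a structure that has to be built in several layers before a ticket can exist.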

The first problem is that IT Managers don't listen carefully enough to IT Operators. An article by Jeff Ello, ten years old now, explains the many problems involved. The entire thing is worth reading, multiple times even, but to give a summary of a single paragraph from the article:

While everyone would like to work for a nice person who is always right, IT pros will prefer a jerk who is always right over a nice person who is always wrong. Wrong creates unnecessary work, impossible situations and major failures. Wrong is evil, and it must be defeated. Capacity for technical reasoning trumps all other professional factors, period.

There is one additional statement that needs to be added to this: This is not optional. In many other human-to-human roles, such as management, it is possible to reconstruct questions, find a compromise between competing agendas, and so forth. This is not possible with IT Operators (assuming they are honest and knowledgeable), not because they don't want to do it (whatever "it" is), but because it simply isn't possible. To repeat: technical reasoning trumps all other professional factors. When it comes to technical matters, IT Managers need to do more than just take advice from the Operators; they need to do what the Operators tell them. Otherwise, the wrong technical decisions will be made and that will cost time and money in the future.

The second problem, and yes, it cascades from the first, is that IT Managers have an erroneous propensity to choose enterprise software which is complex and easy rather than free and open source software which is simple and hard. The difference between the two has been well stated by Rich Hickey. Managers tend to choose software which is easy to use because they are typically not Operators themselves. They become especially enticed by software which is easy to use and feature-rich because they incorrectly perceive that it will satisfy the requirements of (often dogmatic) business logic. When it is inevitably discovered that there are new business requirements, the enterprise software needs to be somehow re-developed (at additional cost and time, as determined by the external body) or worked around, or a new enterprise software product has to be found (and then the horror of vendor lock-in is experienced).

All of this is something that would make an experienced FOSS IT Operator's remaining grey hairs stand on end. In our world, if something isn't working right, we fix it there and then, and if we can't, we ask around because somebody else will know how. Collectively we have more knowledge than any of us individually can have. In the FOSS world, a product can be hard, in both senses of the word. It does have a steep learning curve, but it also is durable. It is, however, also simple; that is, it isn't interleaved. "Simplicity is a prerequisite for reliability", said Edsger W. Dijkstra in 1975. Most FOSS (and yes, there are exceptions) is built on the UNIX philosophy: Design programs to do only a single thing, but to do it well, and to work together well with other programs.

The third problem, and yes, it cascades from the second, is that IT systems are often too heavily orientated towards singular solutions for the entire environment. Now initially I (erroneously) agreed with this approach. There was a time when I would have argued that a single ticketing system throughout an entire campus was a good idea, as it lowered administrative and training costs and allowed for a single point from which a variety of metrics could be derived. My concern was that, in selecting a single system, the wrong system might be chosen (and remember "wrong is evil and it must be defeated") or, one is terrified to say, the worst system might be selected (and usually it is, for the reasons previously stated).

The problem is with the conceptualisation. A singular system as a monolith will almost inevitably suffer the aforementioned issues of complexity, and if it's a closed-source enterprise product, there's nothing you can do about it. A single system, especially a cloud-based system, will suffer massive performance issues simply due to the physics of distance, and physics is not optional. So rather than a single system, it really is worth using multiple systems according to what is most contextually appropriate and which can interact with other systems when needed. This gets back to the core computer science concepts of modularity, specialisation, and connectivity. The supposed gains in reducing administrative overhead are actually losses if the operator is unable to adapt to new systems, and that adaptation is achieved through knowing how generic systems work together.

By way of conclusion, it is worth thinking in terms of stripped-down, minimalist, free and open-source programs which satisfy all functional requirements and can be modularly extended in a manner that is simple but durable. The philosophy of the team behind suckless.org is an excellent approach. As for a practical example of a ticketing system, Joe Armstrong's example of the Erlang Ticketing System as a "minimal viable program" satisfies the criteria.

"Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand."
-- Archibald Putt, "Putt's Law and the Successful Technocrat", 2006

Comments

The problem is that large project management is entirely political and people's careers are made by managing large projects. And senior project managers always have an exit strategy if it looks like it is going to be unsuccessful, so none of the mud sticks on them (even more so in government projects). Your jerk will almost certainly be filtered out of the important parts of the decision process, even if technically correct. And later be blamed for not pointing out these failings in the forensic analysis of what went wrong. After all, he was the technical expert and should have said something...

Also at this level, your legal department will almost always insist on non-FOSS approaches, simply because it gives them somebody to sue if the project fails (and potential future employment arguing over whether the terms of the contract have been correctly enacted by the supplier). Large organisations and governments will always prioritise the opinions of lawyers. Which leads to the amusing situation of Crown Law often taking 27+ days to evaluate a 30-day tender to see if it is legally compliant, even before it is submitted to the people that actually have to make the decision. And suing people is very important to them.

And most supply companies will do the minimum amount of work to satisfy the stipulated contract requirements. And just that. If they have something they can kitbash into an approximate form, they will do that to save costs at their end. Usually outsourced to Indian code sweatshops that produce really really terrible code. And I really do mean really. Your system really was built by the lowest tender all too often.

And one size fits all solutions do tend to end up as one size fits no one solutions. Especially in IT in a University environment, where different departments have vastly different needs wrt their computing requirements. And helpdesks, especially helpdesks manned by external contractors, are notoriously bad in this aspect. And if it isn't easy to use people will find ways to bypass it (this is also the biggest problem with enterprise level security arrangements as well - your people trying to do their jobs will always be the biggest problems [and they will hide their "solutions" from you in order to not get in trouble]).

Sorry. You just awakened lots of bad memories...

From: Ian Borchardt

"Most FOSS (and yes, there are exceptions) is built on the UNIX philosophy: Design programs to do only a single thing, but to do it well, and to work together well with other programs"

Notable modern exception: systemd. Which coincidentally today had 3 root level exploits against it published, stemming entirely from its intermingled complex codependent behaviour. Yay!

In other news, I've never been a fan of centralisation since the first time I ever saw it implemented, in 1999 when CSIRO wanted to run ATNF's mail server. That's when I finally discovered Exchange. My view was solidified when Swinburne wanted to run Astrophysics' supercomputer.

From: Tim Connors

The rubbish continues.

A user has cancelled a ticket by accident rather than letting us mark it as resolved. There does not seem to be a way to uncancel a ticket, and there should be.

If you're using something to generate metrics of work done, and the "customer" cancels the ticket thinking, erroneously, that this is how a ticket is resolved, then your metrics are going to be inaccurate.

This should be trivial right? Apparently not with $awful_ticketing_system.

> These changes that you are proposing would effect all teams that work off
> tickets such as INC, REQ, RITM, TASK etc.
> When a change this large is preposed it is a policy requirement that it
> needs to go via the change management process.
> As the impact is larger than Just your team.

Other people would benefit and gain a level of control that's normal
and expected in other ticketing systems!

> If you're keen to change it you'd have to submit a change and get it
> approved in CABB. I don't understand why we can't do this ourselves?

It's pretty incredible that this is something that administrators can't change.

This shouldn't be a feature request that requires a change management process, additional code to be written, or anything like that. At worst it should be a group permissions issue.
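To make the "group permissions" point concrete, here is a minimal sketch of ticket states as a small transition table, in which uncancelling is just one more transition restricted to a role. The state names, role names, and the Ticket class are assumptions made for illustration; this is not how $awful_ticketing_system or any particular product models its tickets.

```python
# Hypothetical transition table: (from_state, to_state) -> roles allowed.
# Allowing operators to uncancel is a single extra entry, not new code.
ALLOWED = {
    ("open", "resolved"): {"operator"},
    ("open", "cancelled"): {"operator", "requestor"},
    ("cancelled", "open"): {"operator"},
    ("resolved", "open"): {"operator", "requestor"},
}


class Ticket:
    def __init__(self, ticket_id: str) -> None:
        self.ticket_id = ticket_id
        self.state = "open"
        self.history = []  # (old_state, new_state, role) tuples for metrics

    def transition(self, new_state: str, role: str) -> None:
        """Move the ticket to new_state if the role is permitted to do so."""
        roles = ALLOWED.get((self.state, new_state))
        if roles is None:
            raise ValueError(f"no transition {self.state} -> {new_state}")
        if role not in roles:
            raise PermissionError(f"{role} may not {self.state} -> {new_state}")
        self.history.append((self.state, new_state, role))
        self.state = new_state


ticket = Ticket("T-1")
ticket.transition("cancelled", "requestor")  # cancelled by accident
ticket.transition("open", "operator")        # operator uncancels
ticket.transition("resolved", "operator")    # and resolves it properly
```

Because every transition is recorded in history, the metrics can distinguish a ticket that was genuinely cancelled from one that was cancelled by mistake and later resolved.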

In other words, it's more of the same. ServiceNow (there, I said it) is possibly the worst ticketing system ever produced. It is "complex and easy" rather than "simple and hard".

To reiterate the difference:

* "Simplicity is a prerequisite for reliability". - Edsger W. Dijkstra
* We need to build simple systems if we want to build good systems (i.e., reliable, flexible, extendable)

If you select an easy "feature-rich" (i.e., complex) system, you'll end up in the situation described here: a request that should be trivial to implement requires an inordinate expenditure of time, effort, and expense.

Sometimes you have to wonder if the only reason some people have a job is due to unnecessary complexity.

Responses to another worker's experience:

1. API out of the box is woefully inadequate. From their description, the API is designed to be integrated to the web UI. Batch processing seems to out of scope.

Why on earth would batch-processing, one of the basic low-level systems in almost any software in the world, be unavailable? Because the system was built with high-level features in mind and ignored decades of computer science that says start with the low-level functionality.

2. We discussed about our use-cases. Their take is: Creating tickets are easy. Viewing demands are easy. Updating tickets are HARD as they have to custom make the functions.

This is the same as the updating issues described above. The system is designed so that basic tasks are complex and therefore fragile.

3. The API and web UI runs on the same application, so they will need to build an API proxy to handle the workload. Works on this proxy is still on the drawing board and hasn't actually been implemented. The current App is run on 4 nodes with 16 semaphores. Pressing if that means it can only handle 16 connections at one time got no straight answer. They said it can handle up to 1000 requests at one time without crapping out. Asking them if throwing more hardware at it will help but they avoid answering that question.

And this is what happens with such a design.

IT managers must listen to computer scientists. High level features are not a replacement for low-level functionality. One can build features from functionality, but not the other way around.
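The claim that features can be built from functionality, but not the reverse, can be shown with a small sketch. Given one low-level operation that updates a single ticket, batch processing is a few lines on top of it. The in-memory store and the names update_ticket and batch_update are hypothetical, chosen only to illustrate the layering, and say nothing about how any real ticketing API is implemented.

```python
from typing import Dict, Iterable


def update_ticket(store: Dict[str, dict], ticket_id: str, **fields) -> None:
    """Low-level functionality: update the fields of one ticket."""
    if ticket_id not in store:
        raise KeyError(ticket_id)
    store[ticket_id].update(fields)


def batch_update(
    store: Dict[str, dict], ticket_ids: Iterable[str], **fields
) -> Dict[str, str]:
    """Feature built on top: apply the same update to many tickets.

    Returns a per-ticket result so one bad id does not abort the batch.
    """
    results = {}
    for ticket_id in ticket_ids:
        try:
            update_ticket(store, ticket_id, **fields)
            results[ticket_id] = "ok"
        except KeyError:
            results[ticket_id] = "not found"
    return results


store = {
    "T-1": {"state": "open", "assignee": None},
    "T-2": {"state": "open", "assignee": None},
}
results = batch_update(store, ["T-1", "T-2", "T-9"], assignee="ops-team")
```

Going the other way, from a web form that updates one ticket per click to a batch interface, means writing the low-level operation after the fact, which is the situation described in the responses above.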

An interesting rant on Reddit's sysadmin group asks the question "why is enterprise software so insecure", and makes similar observations to this post.