How deep does machine learning architecture go?

I'm reading Hands-On Machine Learning with Scikit-Learn and TensorFlow (hereafter, HOML).

At the introductory level where I'm studying machine learning, we're looking at individual types of algorithms. Yet one point from Chapter 1 of HOML has stood out: components of the same system may observe different groups of rules to produce interconnected outputs.

Here's how the book introduces a class of problems where the training data is partly labeled and partly unlabeled (Chapter 1, "Semisupervised learning"):

Most semisupervised learning algorithms are combinations of unsupervised and supervised algorithms. For example, deep belief networks (DBNs) are based on unsupervised components called restricted Boltzmann machines (RBMs) stacked on top of one another. RBMs are trained sequentially in an unsupervised manner, and then the whole system is fine-tuned using supervised learning techniques.
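
To make the "unsupervised plus supervised" combination concrete for myself, here's a minimal sketch in plain Python. It is not a DBN or an RBM. It's a much simpler stand-in (cluster first, then name the clusters) that I'm assuming captures the same two-stage shape. The function names and the data are made up for illustration.

```python
from collections import Counter

def nearest(point, centroids):
    """Index of the centroid closest to a 1-D point."""
    return min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))

def kmeans_1d(points, k, iterations=20):
    """Unsupervised step: group 1-D points around k centroids. Labels are never used."""
    ordered = sorted(points)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to its cluster mean; leave it alone if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def label_clusters(centroids, labeled):
    """Supervised step: name each cluster after the majority label found inside it."""
    votes = [Counter() for _ in centroids]
    for value, label in labeled:
        votes[nearest(value, centroids)][label] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]

def predict(point, centroids, cluster_labels):
    """Classify a new point by the label of its nearest cluster."""
    return cluster_labels[nearest(point, centroids)]

# Made-up data: plenty of unlabeled measurements, only two labeled ones.
unlabeled = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
labeled = [(1.05, "small"), (4.95, "large")]

centroids = kmeans_1d(unlabeled + [value for value, _ in labeled], k=2)
cluster_labels = label_clusters(centroids, labeled)

print(predict(1.3, centroids, cluster_labels))
print(predict(4.5, centroids, cluster_labels))
```

The unlabeled points do most of the work of finding the structure, and the two labeled points only have to name it. As I read the quote, that's the same division of labor, at a far smaller scale.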

Naïvely, I'm used to thinking of a problem statement as being best solved with a single approach. Most of the work lies in breaking the problem down until a solution to every subproblem becomes easy to discover in code. (Of late, I'd add that the other 90% of the work is in figuring out how to use the framework. I hope that this is a different kind of problem.)

There's a suggestion here that some problem domains are big enough that an algorithm that seems strongly predictive for one part won't produce meaningful findings for other, related parts.

Implementing a machine learning software package involves deciding which kinds of analytical tools will be included. That means breaking up a gnarly problem (perhaps a problem domain) into pieces (component problems) and addressing each one by implementing software components that emit separate streams of results.

Those streams generate data about data, so you end up structuring the system to corral their outcomes into, yes, two more kinds of output:

  1. the findings you wanted to learn in the first place
  2. fitness data on the system itself, evaluating the real-world accuracy of what you learned
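
Here's a rough sketch of what I mean by those two outputs, in plain Python. Everything in it is hypothetical: the toy threshold "model", the `run_system` name, and the numbers. Accuracy on held-out data is only an estimate of real-world accuracy, but it's the kind of fitness data I have in mind.

```python
def fit_threshold(train):
    """Learn a cutoff halfway between the mean of class 0 and the mean of class 1."""
    lows = [x for x, label in train if label == 0]
    highs = [x for x, label in train if label == 1]
    return (sum(lows) / len(lows) + sum(highs) / len(highs)) / 2

def run_system(train, held_out, new_inputs):
    """Return both outputs: the findings, and fitness data about the system itself."""
    threshold = fit_threshold(train)

    def classify(x):
        return 1 if x >= threshold else 0

    # Output 1: the findings we wanted in the first place.
    findings = [classify(x) for x in new_inputs]

    # Output 2: how well the system does on labeled data it never trained on.
    correct = sum(classify(x) == label for x, label in held_out)
    fitness = {
        "held_out_accuracy": correct / len(held_out),
        "held_out_size": len(held_out),
    }
    return findings, fitness

# Made-up (value, label) pairs.
train = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
held_out = [(1.5, 0), (7.5, 1), (4.0, 1), (9.5, 1)]

findings, fitness = run_system(train, held_out, new_inputs=[0.5, 6.0])
print(findings)
print(fitness)
```

Returning the two together means nobody can consume the findings without the fitness data sitting right next to them.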

I find it thrilling, and a bit overwhelming! Since I don't know very much about real-world applications of ML at this point, I'd love to learn whether there are problem domains where combining multiple ML components gets better results than picking one algorithm and running with it.

My contract with the goat requires me to ask this last question: we're unit-testing all of it, right? Yes? Excellent.