Well, in your example, that `os` string column is a categorical, which `SimpleImputer` can handle (for instance with `strategy="most_frequent"` or `strategy="constant"`).
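As a minimal sketch of that, assuming a toy `os` column held in an object-dtype NumPy array (the values below are made up for illustration), the `most_frequent` strategy fills the gap with the most common category:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data: a single `os` column with one missing entry.
os_column = np.array([["linux"], ["windows"], [np.nan], ["linux"]], dtype=object)

# "most_frequent" works on string data; it replaces NaN with the column's mode.
mode_imputer = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
filled = mode_imputer.fit_transform(os_column)

print(filled.ravel().tolist())
```

This is the 'best shot' approach: the missing row becomes indistinguishable from a genuine `linux` row afterwards.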
Imputers are actually lightweight predictors, so the choice also depends on whether the absence of a value is itself significant for the rest of your processing, or whether a 'guessed'/'best shot' value is good enough.
For example, in the case of `os`, you may want to handle it differently depending on the use case.
If you are in an IT support kind of use case, you will probably want to retain a missing `os` value as a category of its own, because it is likely to be correlated with the diagnostics. An unknown OS can be an indication of a totally computer-illiterate user, which usually comes with benign non-issues reported to support. In a corporate IT context, it can be an indication of a non-instrumented machine (i.e. no agent on it), which is a category that could have issues of its own. Trying to fill in non-existent values may actually have an adverse effect on the predictions you'll derive from the dataset.
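One way to keep "missing" as its own category, sketched here under the assumption of a toy object-dtype `os` column and an arbitrary placeholder label `"unknown"` (both are illustrative choices, not something from your dataset), is `SimpleImputer` with the `constant` strategy:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data: a single `os` column with one missing entry.
os_values = np.array([["linux"], ["windows"], [np.nan], ["linux"]], dtype=object)

# "constant" replaces NaN with a fixed label instead of guessing a value,
# so downstream encoders see missingness as a distinct category.
# The label "unknown" is an arbitrary choice for this sketch.
constant_imputer = SimpleImputer(
    missing_values=np.nan, strategy="constant", fill_value="unknown"
)
kept = constant_imputer.fit_transform(os_values)

print(kept.ravel().tolist())
```

A categorical encoder applied afterwards would then treat `"unknown"` like any other level, so the model can learn whatever signal the missingness carries.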