I think the context here is pure functional programming: not just replacing loops with map and the like, but actually making effect handling explicit. The IO type (or some other deferred-effect type) exists essentially for that purpose, so saying it doesn't fit the FP paradigm doesn't make sense. Without IO, pure FP would be useless, because you simply could not "do" anything at all.
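
To make "deferred" concrete, here is a rough sketch of what such a type can look like. This is not any particular library's API: the names `IO`, `flat_map`, `unsafe_run`, `put_line`, and the `log` list standing in for a console are all made up for illustration. The point is only that building the program performs no effect; the effect happens when the description is finally run.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class IO(Generic[A]):
    """A description of an effectful computation (hypothetical, for illustration).

    Constructing or composing an IO runs nothing; only unsafe_run() does.
    """
    thunk: Callable[[], A]

    def map(self, f: Callable[[A], B]) -> "IO[B]":
        # Transform the eventual result without running the effect now.
        return IO(lambda: f(self.thunk()))

    def flat_map(self, f: Callable[[A], "IO[B]"]) -> "IO[B]":
        # Sequence two effects: run this one, then the one f builds from its result.
        return IO(lambda: f(self.thunk()).thunk())

    def unsafe_run(self) -> A:
        # The single place where effects actually happen.
        return self.thunk()


# Stand-in for the outside world (assumption: a list instead of a real console).
log = []


def put_line(text: str) -> IO[None]:
    # Returns a description of "append text to log"; it does not append anything yet.
    return IO(lambda: log.append(text))


# Pure composition: at this point `log` is still empty.
program = (
    put_line("hello")
    .flat_map(lambda _: put_line("world"))
    .map(lambda _: len(log))
)
```

Nothing is written to `log` until `program.unsafe_run()` is called, which is the sense in which the effect is explicit: the program is an ordinary value you can pass around and compose, and running it is a separate, visible step at the edge.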