Findings so far are that removing one or two layers has a relatively moderate impact on quality. Perplexity (PPL) and KL divergence (KLD) suffer quite a lot, as expected given that pruning changes the logit distribution, but the drop in inference quality, as reflected by benchmark scores, is less pronounced.
Another interesting side effect, at least with Qwen3-30B-A3B, is that pruning three or more layers makes the model forget English and reply in Chinese, though the answers are still reasonable.
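For reference, here is a minimal sketch of how decoder layers can be dropped from a Hugging Face transformers model before re-running PPL or benchmark evaluations. The model id, the layer indices to remove, and the sample text are placeholders for illustration, and this is not necessarily the exact setup used for the results above.

```python
# Sketch: remove selected decoder layers from a causal LM and do a quick PPL check.
# Assumes a Llama/Qwen-style architecture where layers live at model.model.layers.
import torch
from torch import nn
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-30B-A3B"   # assumed model id
layers_to_drop = {20, 21}          # hypothetical choice of layers to prune

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Keep every decoder layer except the ones selected for pruning.
kept = [layer for i, layer in enumerate(model.model.layers) if i not in layers_to_drop]
model.model.layers = nn.ModuleList(kept)
model.config.num_hidden_layers = len(kept)

# Re-index the surviving attention modules so KV-cache bookkeeping stays consistent.
for new_idx, layer in enumerate(model.model.layers):
    layer.self_attn.layer_idx = new_idx

# Quick perplexity check on a small text sample (not a full benchmark run).
text = "The quick brown fox jumps over the lazy dog. " * 50
enc = tok(text, return_tensors="pt")
with torch.no_grad():
    out = model(**enc, labels=enc["input_ids"])
print(f"Pruned-model perplexity on sample: {torch.exp(out.loss).item():.2f}")
```

A proper comparison would compute PPL and KLD against the unpruned model on the same evaluation set, rather than on a toy string as above.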