Eh, fine-tuning seems to work well enough that it can be added back in after.
Previous fine-tunes and textual inversions won’t carry over, though, since the CLIP encoder has been replaced too. I’d be interested to know whether the encoder also needs to be retrained in this case.