Make a better layout detector, so that every character is assigned to its correct line.
Separate (more) merged characters.
Deal better with non-text elements: frames, ruled lines, pictures, etc.
