The three bytes at the end of this paragraph are \xED\xA0\x80. They are invalid in UTF-8 because they encode U+D800, a code point reserved for UTF-16 surrogate pairs; surrogate code points are not Unicode scalar values and must not be encoded in UTF-8. í €
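
The claim about these bytes can be checked with a short sketch. It is only an illustration, not part of the original text: the helper name `decode_three_byte_layout` is hypothetical, and it applies the 3-byte UTF-8 bit layout by hand without any validity checks, to show which code point the bytes would produce if surrogates were allowed. The strict decode uses Python's built-in `utf-8` codec, and `surrogatepass` is the standard error handler that lets surrogates through.

```python
# The three bytes discussed above, written out explicitly.
data = b"\xed\xa0\x80"


def decode_three_byte_layout(b: bytes) -> int:
    """Hypothetical helper: apply the 3-byte UTF-8 bit layout
    (1110xxxx 10xxxxxx 10xxxxxx) with no validity checks."""
    if len(b) != 3 or b[0] >> 4 != 0b1110:
        raise ValueError("expected a 3-byte sequence with a 1110xxxx lead byte")
    return ((b[0] & 0x0F) << 12) | ((b[1] & 0x3F) << 6) | (b[2] & 0x3F)


# The bit pattern works out to U+D800, the first high surrogate.
code_point = decode_three_byte_layout(data)
is_surrogate = 0xD800 <= code_point <= 0xDFFF

# A strict UTF-8 decoder rejects the sequence.
try:
    data.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# The "surrogatepass" error handler accepts it and yields a lone surrogate.
lenient = data.decode("utf-8", errors="surrogatepass")

print(f"U+{code_point:04X}", is_surrogate, strict_ok, len(lenient))
```

The strict decode is the behavior a conforming UTF-8 consumer should have; `surrogatepass` is shown only to confirm that the rejection is due to the surrogate rule and not to a malformed bit pattern.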