Hello,
I am working on migrating a large dataset from an archive to a new format. Part of this work is finding small differences between the original and the transformed data. For this I implemented a small library for PostgreSQL. It contains two functions: diff_string and lc_substring. These functions should support multibyte encodings. I hope this can be useful for someone.
The library is available on pgFoundry: http://pgfoundry.org/frs/?group_id=1000457
postgres=# select lc_substring('Hello World','ello');
lc_substring
──────────────
ello
(1 row)
postgres=# select diff_string('Hello World','ello');
diff_string
─────────────────────────────
<del>H</>ello<del> World</>
(1 row)
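
For readers who want to see what the two functions compute without installing the library, here is a rough sketch in Python. It is not the library's implementation, only an illustration built on the standard difflib module. The names lc_substring_sketch and diff_string_sketch are made up for this example, and the <ins> markup for inserted text is an assumption, since the output above only shows <del>. Python strings are sequences of characters rather than bytes, so the sketch compares multibyte text character by character.

```python
from difflib import SequenceMatcher


def lc_substring_sketch(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b."""
    # autojunk=False turns off difflib's "popular element" heuristic,
    # so every character takes part in the comparison.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def diff_string_sketch(a: str, b: str) -> str:
    """Mark up what has to be deleted from or inserted into a to get b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Text common to both strings is copied unchanged.
            parts.append(a[i1:i2])
            continue
        if tag in ("delete", "replace"):
            parts.append(f"<del>{a[i1:i2]}</>")
        if tag in ("insert", "replace"):
            # Assumed markup: the original post does not show inserted text.
            parts.append(f"<ins>{b[j1:j2]}</>")
    return "".join(parts)


print(lc_substring_sketch("Hello World", "ello"))
print(diff_string_sketch("Hello World", "ello"))
```

With the same inputs as in the psql session above, the sketch prints ello and <del>H</>ello<del> World</>. On other inputs the real functions may behave differently, for example when several common substrings have the same length.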