After discovering the excellent Gwene service, which allows you to subscribe
to newsgroups to read RSS content (blogs, planets, commits, etc), I came
across this nice article about Happy Numbers. That's a little problem that
fits an interview-style question well, so I first solved it yesterday
evening in Emacs Lisp, as that's the language I use the most these days.
A happy number is defined by the following process. Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. Those numbers for which this process ends in 1 are happy numbers, while those that do not end in 1 are unhappy numbers (or sad numbers).
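To make the definition concrete, here is the process traced by hand for 19, written as plain SQL arithmetic so that it can be pasted into psql. The choice of 19 and the column names are only for illustration.

```sql
-- Each column is one step of the process, starting from 19:
-- square every digit of the previous result and add them up.
select 1*1 + 9*9       as step_1,  -- 82
       8*8 + 2*2       as step_2,  -- 68
       6*6 + 8*8       as step_3,  -- 100
       1*1 + 0*0 + 0*0 as step_4;  -- 1, so 19 is a happy number
```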
Now, what about implementing the same in pure SQL, for more fun? Now that's
interesting! After all, we didn't get WITH RECURSIVE for tree traversal
only, did we?
Unfortunately, we need a little helper function first, if only to ease the reading of the recursive query. I didn't try to inline it, but here it goes:
    create or replace function digits(x bigint)
      returns setof int
      language sql
    as $$
      select substring($1::text from i for 1)::int
        from generate_series(1, length($1::text)) as t(i)
    $$;
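As a quick check of the helper, the sketch below calls digits() on its own and then uses it for a single step of the happy number process. It assumes the digits() function above has already been created; the aliases t(d) and next_n are only illustrative.

```sql
-- Three rows, one per digit of 130.
select d from digits(130) as t(d);

-- One step of the process for 97: 9*9 + 7*7 = 130.
select sum(d * d) as next_n from digits(97) as t(d);
```

The second statement is the same sum(d*d) expression that the recursive query relies on.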
That was easy: the function outputs one row per digit of the input number,
and rather than resorting to powers of ten, divisions and remainders, we
use the plain old text representation and substring. Now, to the real
problem. If you've read what a happy number is and have already read the
fine manual about Recursive Query Evaluation, the following should be quite
easy to read:
    with recursive happy(n, seen) as (
        select 7::bigint, '{}'::bigint[]
      union all
        select sum(d*d), h.seen || sum(d*d)
          from (select n, digits(n) as d, seen
                  from happy
               ) as h
      group by h.n, h.seen
        having not seen @> array[sum(d*d)]
    )
      select * from happy;
      n  |       seen
    -----+------------------
       7 | {}
      49 | {49}
      97 | {49,97}
     130 | {49,97,130}
      10 | {49,97,130,10}
       1 | {49,97,130,10,1}
    (6 rows)

    Time: 1.238 ms
That shows how it works for a happy number, and it's easy to test a
non-happy one, for example 17. The query won't cycle thanks to the seen
array and the having filter, so the only difference between a happy and a
sad number is that in the former case the last line output by the
recursive query has n = 1. Let's expand this knowledge into a proper
function (because we want to be able to pass the number we test for
happiness as an argument):
    create or replace function happy(x bigint)
      returns boolean
      language sql
    as $$
    with recursive happy(n, seen) as (
        select $1, '{}'::bigint[]
      union all
        select sum(d*d), h.seen || sum(d*d)
          from (select n, digits(n) as d, seen
                  from happy
               ) as h
      group by h.n, h.seen
        having not seen @> array[sum(d*d)]
    )
      select n = 1 as happy
        from happy
    order by array_length(seen, 1) desc nulls last
       limit 1
    $$;
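A small sanity check of the function, as a sketch that assumes both digits() and happy() have been created as above. The second statement shows how array_length() behaves on an empty array, which is what the order by clause has to cope with for the starting line. The column aliases are only illustrative.

```sql
-- 7 reaches 1 (see the trace above); 17 falls into a cycle without 1.
select happy(7)  as seven_is_happy,      -- t
       happy(17) as seventeen_is_happy;  -- f

-- array_length() yields NULL, not 0, for an empty array.
select array_length('{}'::bigint[], 1) is null as empty_gives_null;  -- t
```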
We need the desc nulls last trick in the order by because the
array_length() of any dimension of an empty array is NULL, and desc sorts
NULLs first by default. Without it, limit 1 would pick the starting line
(input, {}), and we certainly don't want to report every number other than
1 as unhappy on those grounds. Let's now play the same tricks as in the
puzzle article:
    =# select array_agg(x) as happy
         from generate_series(1, 50) as t(x)
        where happy(x);
                  happy
    ----------------------------------
     {1,7,10,13,19,23,28,31,32,44,49}
    (1 row)

    Time: 24.527 ms

    =# explain analyze select x
         from generate_series(1, 10000) as t(x)
        where happy(x);
                                   QUERY PLAN
    ----------------------------------------------------------------------------------------
     Function Scan on generate_series t  (cost=0.00..265.00 rows=333 width=4)
                                         (actual time=2.938..3651.019 rows=1442 loops=1)
       Filter: happy((x)::bigint)
     Total runtime: 3651.534 ms
    (3 rows)

    Time: 3652.178 ms
(Yes, I tweaked the EXPLAIN ANALYZE output so that it fits the page width
here.) For what it's worth, finding the first 10000 happy numbers in Emacs
Lisp on the same laptop takes 2830 ms, also running a recursive version of
the code.