Mysql – Distinct XML values thesql

MySQLxml

I have a problem parsing XML input. I get a string of all Authors while I only want the distinct author or first value.

+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| doc_id   | xml_doc   
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                                                                                                                                                                                                                           |
| 1        | <catalog><book id="11"><author>Chuck Palahniuk</author><title>Fight Club</title></book><book id="12"><author>Chuck Palahniuk</author><title>Survivor</title></book></catalog>                                                                                        |
| 2        | <catalog><book id="21"><author>Bernard Werber</author><title>Les Thanatonautes</title></book></catalog>                                                                                                                                                              |
| 3        | <catalog><book id="31"><author>Boris Vian</author><title>The Big Sleep</title></book><book id="32"><author>Boris Vian</author><title>The Lady in the Lake</title></book><book id="33"><author>Boris Vian</author><title>The World of Null-A</title></book></catalog> |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


SELECT ExtractValue(xml_doc, '//author[1]') AS author FROM catalogs ORDER BY author;

OUTPUT:

+----------------------------------+
|              author              |
+----------------------------------+
| Bernard Werber                   |
| Boris Vian Boris Vian Boris Vian |
| Chuck Palahniuk Chuck Palahniuk  |
+----------------------------------+

While I only would like distinct xml. Can this be achieved somehow? Or am I missing something?

My expected result would be

+----------------------------------+
|              author              |
+----------------------------------+
| Bernard Werber                   |
| Boris Vian                       |
| Chuck Palahniuk                  |
+----------------------------------+

Best Answer

To get the first author of the first book, you can use

select ExtractValue(xml_doc, '//book[1]//author[1]');

To get a real distinct list you would have to write a stored function that circles through all books and all authors.