The MySQL SET datatype is a disaster - as are any type of array datatypes (supported unfortunately by many RDBMSs). Check out my answer to another question here. As I note, MySQLs SET is a breach of Codd's second rule - no repeating group datatypes. It is also completely non-portable should you wish to change RDBMS.
If you wish to store that data this way is to have a BOOK table with book_id, name, author, country_of_publication, subject.... Then have a CHAPTER table with book_id, chapter_id, chapter_name, chapter_sub_subject.... and then a PAGE chapter with book_id, chapter_id, page_id, pager_number, ... images... other stuff... then LINE book_id, chapter_id, page_id, line_id, line_text...
This conforms to the relational model. I'm not clear on why you'd want to do this. What's wrong with TEXT datatype? Also, you might want to look into FTS (Full Text Search) - MySQL's native FTS (available on both MyISAM and InnoDB now). There's also Lucene, Sphinx and Spider (there may well be others - Google is your friend).
[EDIT in response to OP's comment]
I had the feeling it was wrong to make identically-structured tables
for each book
The whole point is that you should have identically structured tables for all books - you make the structure general enough to cope with (virtually) every possibility.
but I also though it would be slower to search through larger table
(containing ALL sentences not just 1book's worth) to (re)construct a
page, as well as doing an extra WHERE book_id = 'N' match...
Use indexes on book_id, chapter_id and maybe even page_id. Also, if you're going to be searching in your entire corpus for a particular word, then condsider Full Text Indexing.
Presumably this extra overhead is near-identical to me first sending
mysql to the single-book table. This leaves me with the query of how
to store a small array for use in PHP --- I'm leaning towards VARCHAR
with (un)serialize.
Be aware that the maximum size of VARHCHAR is 65535 bytes - some of James Joyce's sentences might go beyond that :-), but it's unlikely that even Joyce went beyond the 4GB of a TEXT datatype :-)
I Googled "storing books in MySQL" and came up with this which you may find of interest, and interestingly up popped this on the first page of the same search.
Probably not interesting for the OP but I'll give it a shot anyhow. Given the current tables:
create table Businesses
( id int not null primary key
, name varchar(20) not null) engine=innodb;
create table Employees
( id int not null primary key
, name varchar(20) not null
, business int not null) engine=innodb;
insert into Businesses (id, name)
values (3,'Planet Express'),(7,'Planet Express'),(11,'Whatever');
insert into Employees (id, name, business)
values (999,'Fry',3),(666,'Bender',7),(333,'Bob',11);
To fix the immediate problem we can update a join against a derived table like:
update Employees e
join (
select min(id) as min_id, max(id) as max_id, name
from Businesses
group by name having count(1) = 2
) as b on e.business = b.max_id
set e.business = b.min_id;
select * from Employees;
+-----+--------+----------+
| id | name | business |
+-----+--------+----------+
| 333 | Bob | 11 |
| 666 | Bender | 3 |
| 999 | Fry | 3 |
+-----+--------+----------+
and then delete the duplicate business. Note that I assume that there are exactly two rows involved in each duplicate.
For the long run I would suggest several things. Names like id, name, etc are to vague, name them something like business_id and use that name throughout the model.
If the name of a Businesses is supposed to be unique, add a constraint that ensures this. Something like:
create table Businesses
( business_id int not null primary key
, business_name varchar(20) not null unique
) engine=innodb;
create table Employees
( employee_id int not null primary key
, employee_name varchar(20) not null
, business_id int not null
, constraint fk_business foreign key (business_id)
references Businesses (business_id)
) engine=innodb;
Best Answer
If you have foreign keys to this table, add the option ON DELETE CASCADE. Then when you delete a row in a parent table all the rows in the child tables that refference that row will be deleted as well.